scpr is a simple and straightforward web-scraping CLI tool that scrapes pages as markdown content. It is designed to be used both by humans and by coding agents (either as an MCP server or as a skill).
scpr is written in Go and based on colly for web scraping and html-to-markdown for converting HTML pages to markdown.
Install with Go (v1.24+ required):
    go install github.com/AstraBert/scpr
Install with NPM:
    npm install @cle-does-things/scpr
If you are on Windows, scpr might not be available right after global installation with npm. In that case, you might need to take extra steps:
- Find where the node executable is stored on your machine:
    Get-Command node
  This will print the directory where node.exe is stored: scpr will be installed at .\bin\scpr.exe in that folder.
  Note: If you are using nvm for Windows, node.exe will be at C:\Users\nvm4w\nodejs
- Add {NODE_FOLDER}\bin (in the case of nvm: C:\Users\nvm4w\nodejs\bin) to the PATH environment variable. Follow this guide for instructions on how to set PATH env variables.
- Restart your computer.
- Test scpr --help from your terminal. The execution might be challenged by your antivirus, but, since the executable does not contain any harmful code, the antivirus will eventually allow it.
Basic usage (scrape a single page):
    scpr --url https://example.com --output ./scraped
This will scrape the page and save it as a markdown file in the ./scraped folder.
Recursive scraping
To scrape a page and all linked pages within the same domain:
    scpr --url https://example.com --output ./scraped --recursive --allowed example.com --max 3
Parallel scraping
Speed up recursive scraping with multiple threads:
    scpr --url https://example.com --output ./scraped --recursive --allowed example.com --max 2 --parallel 5
Additional options
- --log: Set logging level (info, debug, warn, error)
- --max: Maximum depth of pages to follow (default: 1)
- --parallel: Number of concurrent threads (default: 1)
- --allowed: Allowed domains for recursive scraping (can be specified multiple times)
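To scrape several unrelated sites in one go, the single-page command can be wrapped in a small shell function. The sketch below is only an illustration: `scrape_all` and the `SCPR_BIN` variable are hypothetical names that are not part of scpr, and the per-host folder layout is an assumption. It passes only the `--url` and `--output` flags documented above.

```shell
#!/bin/sh
# Hypothetical helper (not part of scpr): scrape each URL into its own
# subfolder of the given output root, named after the URL's host.
scrape_all() {
    out_root="$1"
    shift
    for url in "$@"; do
        # Strip the scheme and any path to get the host (illustrative only).
        host=$(printf '%s\n' "$url" | sed -e 's|^[a-zA-Z]*://||' -e 's|/.*$||')
        # SCPR_BIN is an assumed override so another command (e.g. echo)
        # can stand in for scpr during a dry run.
        "${SCPR_BIN:-scpr}" --url "$url" --output "$out_root/$host"
    done
}
```

For example, `scrape_all ./scraped https://example.com https://example.org` runs scpr once per URL, while `SCPR_BIN=echo scrape_all ./scraped https://example.com` only prints the arguments that would be passed.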
For more details, run:
    scpr --help
Start the MCP server with:
    scpr mcp
And configure it in agents using:
    {
      "mcpServers": {
        "web-scraping": {
          "type": "stdio",
          "command": "scpr",
          "args": [
            "mcp"
          ],
          "env": {}
        }
      }
    }
The above JSON snippet is in the format used by Claude Code; adapt it to your agent before using it.
Contributions are welcome! Please read the Contributing Guide to get started.
This project is licensed under the MIT License