A robust, serverless Model Context Protocol (MCP) server for web scraping, designed to give LLMs (like Claude, Gemini, or custom bots) access to live web data.
- Serverless: Runs on AWS Lambda (or generically via Docker). Scale-to-zero cost.
- Stateless Architecture: Uses HTTP JSON-RPC (
POST /mcp) for robust Lambda compatibility. - Security: Protected by API Key Authentication and SSRF validation (TODO).
- Tools:
scrape_url: Fetches page, strips clutter (ads/nav/scripts), and returns clean Markdown.get_page_metadata: Extracts title, description, and author.
-
Install Dependencies
python -m venv venv source venv/bin/activate # or venv\Scripts\activate on Windows pip install -r requirements.txt
-
Run Server
uvicorn app.main:app --port 8000 --reload
-
Test It
Linux / Mac:
curl -X POST http://localhost:8000/mcp \ -H "Content-Type: application/json" \ -H "X-API-Key: dev-key" \ -d '{"jsonrpc": "2.0", "method": "tools/call", "params": {"name": "scrape_url", "arguments": {"url": "https://example.com"}}, "id": 1}'
Windows PowerShell:
Invoke-RestMethod -Uri "http://localhost:8000/mcp" ` -Method Post ` -Headers @{ "X-API-Key" = "dev-key" } ` -ContentType "application/json" ` -Body '{"jsonrpc": "2.0", "method": "tools/call", "params": {"name": "scrape_url", "arguments": {"url": "https://example.com"}}, "id": 1}' ` | Select-Object -ExpandProperty result | Select-Object -ExpandProperty content | Select-Object -ExpandProperty text
Since Claude Desktop runs locally, you can point it to your local server or the deployed URL.
Modify your claude_desktop_config.json:
{
"mcpServers": {
"scraper": {
"command": "uv",
"args": ["run", "mcp-scraper"],
"env": {"MCP_API_KEY": "dev-key"},
"url": "http://localhost:8000/mcp"
}
}
}See our Chatbot Tutorial.
Your bot needs to perform a rudimentary "Tool Call" flow:
- LLM outputs JSON:
{"tool": "scrape_url", "url": "..."}. - Bot makes POST request to
https://<lambda-url>/mcp. - Bot feeds response back to LLM.
This project is configured for AWS SAM.
-
Build
sam build
-
Deploy
sam deploy --guided
Enter your desired API Key when prompted.
-
Get URL The output will provide your public Function URL. Use this in your
mcp_server_config.json.