A standalone MCP (Model Context Protocol) server providing web research tools. Three battle-tested tools for AI assistants: search the web, fetch & convert pages to markdown, and run compound multi-source research — all via the MCP stdio protocol.
| Tool | Description |
|---|---|
| `web_search` | 3-tier search cascade: Brave API → DuckDuckGo → HTML scraper |
| `fetch_url` | Fetch any URL → clean markdown, with SSRF protection and 24h cache |
| `research` | Compound pipeline: query rewrite → search → parallel fetch → summarize → synthesize |
All tools are read-only — they fetch and transform public web content, never modify anything.
```shell
pip install mcp-research
```

Or run directly with uvx (zero-install):

```shell
uvx mcp-research
```

All configuration is via environment variables — no config files needed.
| Variable | Default | Description |
|---|---|---|
| `BRAVE_API_KEY` | (empty) | Brave Search API key. Falls back to DuckDuckGo if unset. |
| `OLLAMA_URL` | `http://localhost:11434` | Ollama endpoint for summarization/synthesis. Set empty to disable. |
| `OLLAMA_MODEL` | `qwen2.5:14b` | Model to use for summarization and synthesis. |
| `MCP_RESEARCH_CACHE_DIR` | `~/.mcp-research/cache/` | URL fetch cache directory. |
| `MCP_RESEARCH_CACHE_TTL` | `24` | Cache TTL in hours. |
| `MCP_RESEARCH_LOG_DIR` | `~/.mcp-research/logs/` | Search log directory (NDJSON). |
| `MCP_RESEARCH_MAX_RESULTS` | `10` | Default max search results. |
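As a sketch of how these variables could be consumed, the loader below reads each one with its documented default. The function name and returned dict shape are illustrative, not the package's actual internals; only the variable names and defaults come from the table above.

```python
import os
from pathlib import Path

def load_config() -> dict:
    """Read the documented environment variables, applying the table's defaults.

    Hypothetical helper for illustration; the real package may structure
    its configuration differently.
    """
    return {
        "brave_api_key": os.environ.get("BRAVE_API_KEY", ""),
        "ollama_url": os.environ.get("OLLAMA_URL", "http://localhost:11434"),
        "ollama_model": os.environ.get("OLLAMA_MODEL", "qwen2.5:14b"),
        "cache_dir": Path(
            os.environ.get("MCP_RESEARCH_CACHE_DIR", "~/.mcp-research/cache/")
        ).expanduser(),
        "cache_ttl_hours": int(os.environ.get("MCP_RESEARCH_CACHE_TTL", "24")),
        "max_results": int(os.environ.get("MCP_RESEARCH_MAX_RESULTS", "10")),
    }
```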
Add to your Claude Code MCP config (`~/.claude/settings.json` or project `.mcp.json`):
```json
{
  "mcpServers": {
    "research": {
      "command": "uvx",
      "args": ["mcp-research"],
      "env": {
        "BRAVE_API_KEY": "BSA...",
        "OLLAMA_URL": "http://localhost:11434"
      }
    }
  }
}
```

Add to `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "research": {
      "command": "uvx",
      "args": ["mcp-research"],
      "env": {
        "BRAVE_API_KEY": "BSA..."
      }
    }
  }
}
```

`web_search(query, max_results=5, summarize=False, auto_fetch_top=False)`
Searches the web using a 3-tier cascade for maximum reliability:
- Brave Search API — fast, high quality (requires `BRAVE_API_KEY`)
- DuckDuckGo library — no API key needed, retries on rate limit
- DuckDuckGo HTML scraper — last-resort fallback
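The cascade above can be sketched as a loop over ordered backends that falls through on errors or empty results. This is an illustration of the pattern, not the package's actual code; the backend callables are assumptions.

```python
def search_with_cascade(query: str, backends) -> list:
    """Try each (name, backend) pair in order; return the first non-empty result.

    A backend is any callable taking a query string and returning a list of
    results. Exceptions and empty result lists both trigger the next tier.
    """
    errors = []
    for name, backend in backends:
        try:
            results = backend(query)
            if results:
                return results
        except Exception as exc:
            errors.append((name, repr(exc)))
    raise RuntimeError(f"all search backends failed: {errors}")
```

With three hypothetical backends this would be called as `search_with_cascade(q, [("brave", brave_search), ("ddg", ddg_search), ("ddg-html", ddg_html_scrape)])`, mirroring the Brave → DuckDuckGo → scraper order.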
Options:
- `summarize`: Use Ollama to summarize results (requires running Ollama)
- `auto_fetch_top`: Also fetch and return the full content of the top result
`fetch_url(url, summarize=False, max_chars=50000)`
Fetches a URL and converts it to clean markdown:
- SSRF protection: Blocks localhost, private IPs, non-HTTP schemes
- Smart retry: Exponential backoff on 429/5xx, per-hop redirect validation
- 24h cache: SHA-256 keyed, configurable TTL
- Content support: HTML → markdown, JSON → code block, binary → rejected
- Smart truncation: Breaks at heading/paragraph boundaries, not mid-text
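A minimal sketch of the kind of SSRF guard described above: reject non-HTTP(S) schemes and any hostname that resolves to a loopback, private, or link-local address. The function name is hypothetical and the real validator likely covers more cases (e.g. re-checking each redirect hop).

```python
import ipaddress
import socket
from urllib.parse import urlparse

def is_url_allowed(url: str) -> bool:
    """Return False for URLs that could reach internal services."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        return False  # blocks file://, ftp://, gopher://, etc.
    try:
        infos = socket.getaddrinfo(parsed.hostname, None)
    except socket.gaierror:
        return False  # unresolvable host: fail closed
    for info in infos:
        addr = ipaddress.ip_address(info[4][0])
        if addr.is_loopback or addr.is_private or addr.is_link_local:
            return False
    return True
```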
`research(query, depth="standard", context="")`
Compound research pipeline:
1. Query rewrite — Ollama optimizes your question into search keywords
2. Web search — finds relevant pages (with zero-result retry expansion)
3. Parallel fetch — fetches top N pages concurrently
4. Summarize — Ollama summarizes each page
5. Synthesize — Ollama produces a final cited answer
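The pipeline steps above can be sketched as one function. All names here are illustrative (the real tool fetches concurrently and calls Ollama over HTTP); the point is the degradation rule: every LLM-backed step becomes a pass-through when `llm` is unavailable.

```python
def research(query, search, fetch, llm=None, top_n=5):
    """Schematic research pipeline: rewrite -> search -> fetch -> summarize -> synthesize.

    `search`, `fetch`, and `llm` are injected callables; pass llm=None to see
    the no-Ollama degradation path (raw page content, no synthesis).
    """
    # 1. Query rewrite (skipped without an LLM)
    rewritten = llm(f"Rewrite as search keywords: {query}") if llm else query
    # 2-3. Search, then fetch the top N pages (sequential here for clarity)
    pages = [fetch(r["url"]) for r in search(rewritten)[:top_n]]
    # 4. Per-page summaries degrade to the raw page text
    summaries = [llm(f"Summarize: {p}") if llm else p for p in pages]
    # 5. Final cited synthesis, or plain concatenation without an LLM
    if llm:
        return llm("Synthesize a cited answer from:\n" + "\n".join(summaries))
    return "\n\n".join(summaries)
```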
Depth levels:
| Depth | Pages | Synthesis |
|---|---|---|
| `quick` | 2 | No |
| `standard` | 5 | Yes |
| `deep` | 10 | Yes |
All steps gracefully degrade without Ollama — you still get search results and raw page content.
```shell
git clone https://github.com/MABAAM/Maibaamcrawler.git
cd Maibaamcrawler
pip install -e .
python -m mcp_research
```

License: MIT