The thing about information on the web is that it doesn't want to be found. It wants to hide behind cookie banners, keep itself to itself, and generally behave like a cat that knows it's time for the vet. Web Forager is the sort of dogged, slightly grubby assistant who goes out there anyway — accompanied by a duck of questionable temperament — rummages through DuckDuckGo, grabs pages directly when it can, and calls in Jina Reader when things get complicated. The results come back neatly converted for LLM consumption, which is to say, in a format that would make a librarian weep with either joy or despair, depending on the librarian.
A search-and-fetch toolkit for AI agents, available as an MCP server and as standalone Agent Skills:
- Search the web via DuckDuckGo
- Search news via DuckDuckGo News
- Fetch and convert web pages (direct HTTP + trafilatura, Jina Reader fallback)
Also ships five Agent Skills that work independently — no MCP required — for research, fact-checking, news monitoring, competitive analysis, and technology evaluation.
- DuckDuckGo web search with safe search controls
- DuckDuckGo news search with date-sorted results and source attribution
- Fetch and convert URLs to markdown or JSON (direct HTTP + trafilatura, Jina Reader fallback)
- LLM-friendly output format option for search results
- CLI for search, news, fetch, serve, and version commands
- MCP tools for LLM integration
- Five standalone Agent Skills for specialized research workflows
- Docker support for containerized deployment
- Python 3.10 to 3.13 (3.14 is not yet supported)
- uv (recommended) or pip
# Using uv (recommended)
uv pip install web-forager
# Or using pip
pip install web-forager

# Install uv if you haven't already (the uvx command ships with it)
pip install uv
# Install the Web Forager package as a uv tool
uv tool install web-forager

For development or to get the latest changes:
# Clone the repository
git clone https://github.com/CyranoB/web-forager.git
cd web-forager
# Install with uv (recommended)
uv pip install -e .
# Or with pip
pip install -e .

Build and run with Docker:
# Build the image (uses version from latest git tag)
docker build --build-arg VERSION=$(git describe --tags --abbrev=0 | sed 's/^v//') -t web-forager .
# Or specify a version manually
docker build --build-arg VERSION=2.0.2 -t web-forager .
# Run the server (MCP servers use STDIO, so typically run within an MCP client)
docker run -i web-forager

# Start the server in STDIO mode (for use with MCP clients like Claude)
web-forager serve
# Enable debug logging
web-forager serve --debug

# Search DuckDuckGo (JSON output, default)
web-forager search "your search query" --max-results 5 --safesearch moderate
# Search with LLM-friendly text output
web-forager search "your search query" --output-format text

# Search DuckDuckGo news (JSON output, default)
web-forager news "your search query" --max-results 10 --safesearch moderate
# Search with LLM-friendly text output
web-forager news "your search query" --output-format text

# Fetch a URL and return markdown
web-forager fetch "https://example.com" --format markdown
# Fetch a URL and return JSON
web-forager fetch "https://example.com" --format json
# Limit output length
web-forager fetch "https://example.com" --max-length 2000
# Include generated image alt text
web-forager fetch "https://example.com" --with-images

# Show version
web-forager version
# Show detailed version info
web-forager version --debug

This MCP server works with any MCP-compatible client. Use one of the setups below.
Python 3.10-3.13 is supported (3.14 not yet). Use --python ">=3.10,<3.14" with uvx to enforce. Verified with Python 3.12 and 3.13.
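The supported range can also be checked from Python before installing. This is a minimal sketch, not part of Web Forager: the is_supported helper and the two constants are hypothetical names, and the bounds are copied from the note above.

```python
import sys

# Supported range from the note above: >=3.10,<3.14.
MIN_VERSION = (3, 10)
MAX_EXCLUSIVE = (3, 14)


def is_supported(version: tuple[int, int]) -> bool:
    """Return True if (major, minor) falls inside the supported range."""
    return MIN_VERSION <= version < MAX_EXCLUSIVE


if __name__ == "__main__":
    current = sys.version_info[:2]
    status = "supported" if is_supported(current) else "not supported"
    print(f"Python {current[0]}.{current[1]} is {status}")
```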
- Open Claude Desktop > Settings > Developer > Edit Config.
- Edit the config file:
  - macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  - Windows: %APPDATA%\Claude\claude_desktop_config.json
- Add the server config under mcpServers:

{
  "mcpServers": {
    "web-forager": {
      "command": "uvx",
      "args": ["--python", ">=3.10,<3.14", "web-forager", "serve"]
    }
  }
}

- Restart Claude Desktop.
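If you would rather script the config edit than do it by hand, something like the following could work. It is a sketch under assumptions: add_server and update_config_file are hypothetical helpers, not part of Web Forager or Claude Desktop, and the code assumes the config file is either absent or already contains valid JSON.

```python
import json
from pathlib import Path

# Server entry copied from the Claude Desktop configuration above.
SERVER_ENTRY = {
    "command": "uvx",
    "args": ["--python", ">=3.10,<3.14", "web-forager", "serve"],
}


def add_server(config: dict, name: str = "web-forager") -> dict:
    """Return a copy of config with the server entry added under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep any existing servers
    servers[name] = SERVER_ENTRY
    merged["mcpServers"] = servers
    return merged


def update_config_file(path: Path) -> None:
    """Read the config at path (if present), add the entry, and write it back."""
    config = json.loads(path.read_text()) if path.exists() else {}
    path.write_text(json.dumps(add_server(config), indent=2))
```

Merging into the existing mcpServers mapping, rather than overwriting the file, keeps other configured servers and unrelated settings intact.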
For Claude Code, add a local stdio server:
claude mcp add --transport stdio web-forager -- uvx --python ">=3.10,<3.14" web-forager serve

Optional: run claude mcp list to verify, or claude mcp add-from-claude-desktop to import the servers already configured in Claude Desktop.
For Codex CLI, add the server from the command line:
codex mcp add web-forager -- uvx --python ">=3.10,<3.14" web-forager serve

Or configure ~/.codex/config.toml:
[mcp_servers.web-forager]
command = "uvx"
args = ["--python", ">=3.10,<3.14", "web-forager", "serve"]

Add to your OpenCode config (~/.config/opencode/opencode.json or project opencode.json):
{
  "$schema": "https://opencode.ai/config.json",
  "mcp": {
    "web-forager": {
      "type": "local",
      "command": ["uvx", "--python", ">=3.10,<3.14", "web-forager", "serve"],
      "enabled": true
    }
  }
}

Or run opencode mcp add and follow the prompts.
Add to ~/.cursor/mcp.json (global) or .cursor/mcp.json (project):
{
  "mcpServers": {
    "web-forager": {
      "command": "uvx",
      "args": ["--python", ">=3.10,<3.14", "web-forager", "serve"]
    }
  }
}

Verify with:
cursor-agent mcp list

Add to your Gemini CLI settings file:
- Global: ~/.gemini/settings.json
- Project: .gemini/settings.json

{
  "mcpServers": {
    "web-forager": {
      "command": "uvx",
      "args": ["--python", ">=3.10,<3.14", "web-forager", "serve"],
      "timeout": 30000
    }
  }
}

Verify the server is configured:
gemini tools list

The server exposes these tools to MCP clients:
@mcp.tool()
def duckduckgo_search(
    query: str,
    max_results: int = 5,
    safesearch: str = "moderate",
    output_format: str = "json"
) -> list | str:
    """Search DuckDuckGo for the given query."""

@mcp.tool()
def duckduckgo_news_search(
    query: str,
    max_results: int = 10,
    safesearch: str = "moderate",
    output_format: str = "json"
) -> list | str:
    """Search DuckDuckGo for recent news articles."""

@mcp.tool()
def web_fetch(
    url: str,
    format: str = "markdown",
    max_length: int | None = None,
    with_images: bool = False
) -> str | dict:
    """Fetch a URL and convert it to markdown or JSON.

    Tries direct HTTP fetch first, falls back to Jina Reader."""

Example usage in an MCP client:
# This is handled automatically by the MCP client
results = duckduckgo_search("Python programming", max_results=3)
news = duckduckgo_news_search("AI regulation 2026", max_results=5)
content = web_fetch("https://example.com", format="markdown")
# Get LLM-friendly text output
text_results = duckduckgo_search("Python programming", output_format="text")

- Tool Name: duckduckgo_search
- Description: Search the web using DuckDuckGo (powered by the ddgs library)

Parameters:
- query (string, required): The search query
- max_results (integer, optional, default: 5): Maximum number of search results to return
- safesearch (string, optional, default: "moderate"): Safe search setting ("on", "moderate", or "off")
- output_format (string, optional, default: "json"): Output format: "json" for structured data, "text" for an LLM-friendly formatted string
JSON format (default): A list of dictionaries:
[
  {
    "title": "Result title",
    "url": "https://example.com",
    "snippet": "Text snippet from the search result"
  }
]

Text format: An LLM-friendly formatted string:
Found 3 search results:
1. Result title
URL: https://example.com
Summary: Text snippet from the search result
2. Another result
URL: https://example2.com
Summary: Another snippet
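The relationship between the two output formats can be sketched as a small conversion function. This is an illustration only: format_results is a hypothetical name, the field names come from the JSON example above, and the exact whitespace the real tool emits may differ.

```python
def format_results(results: list[dict]) -> str:
    """Render JSON-style search results as a numbered plain-text listing."""
    lines = [f"Found {len(results)} search results:"]
    for index, item in enumerate(results, start=1):
        lines.append(f"{index}. {item['title']}")
        lines.append(f"URL: {item['url']}")
        lines.append(f"Summary: {item['snippet']}")
    return "\n".join(lines)


if __name__ == "__main__":
    sample = [
        {
            "title": "Result title",
            "url": "https://example.com",
            "snippet": "Text snippet from the search result",
        }
    ]
    print(format_results(sample))
```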
- Tool Name: duckduckgo_news_search
- Description: Search for recent news articles using DuckDuckGo (powered by the ddgs library)

Parameters:
- query (string, required): The news search query
- max_results (integer, optional, default: 10): Maximum number of news results to return
- safesearch (string, optional, default: "moderate"): Safe search setting ("on", "moderate", or "off")
- output_format (string, optional, default: "json"): Output format: "json" for structured data, "text" for an LLM-friendly formatted string
JSON format (default): A list of dictionaries:
[
  {
    "title": "News headline",
    "url": "https://example.com/article",
    "snippet": "Article summary text",
    "date": "2026-03-01T12:00:00+00:00",
    "source": "News Outlet"
  }
]

Text format: An LLM-friendly formatted string:
Found 3 news results:
1. News headline
URL: https://example.com/article
Date: 2026-03-01T12:00:00+00:00
Source: News Outlet
Summary: Article summary text
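Because the date field is an ISO 8601 timestamp, a client that merges results from several news queries can re-sort them itself. The sketch below assumes every entry carries a date with a UTC offset, as in the example above; sort_newest_first is a hypothetical helper, not part of the server.

```python
from datetime import datetime


def sort_newest_first(articles: list[dict]) -> list[dict]:
    """Return articles ordered by their ISO 8601 date, most recent first."""
    return sorted(
        articles,
        key=lambda article: datetime.fromisoformat(article["date"]),
        reverse=True,
    )


if __name__ == "__main__":
    merged = [
        {"title": "Older story", "date": "2026-03-01T12:00:00+00:00"},
        {"title": "Newer story", "date": "2026-03-02T08:30:00+00:00"},
    ]
    for article in sort_newest_first(merged):
        print(article["date"], article["title"])
```

Parsing into datetime objects, rather than comparing the raw strings, keeps the ordering correct when results carry different UTC offsets.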
- Tool Name: web_fetch
- Description: Fetch a URL and convert it to markdown or JSON. Tries direct HTTP fetch with trafilatura for fast content extraction, falls back to Jina Reader for JavaScript-heavy or bot-protected pages.

Parameters:
- url (string, required): The URL to fetch and convert
- format (string, optional, default: "markdown"): Output format ("markdown" or "json")
- max_length (integer, optional): Maximum content length to return (None for no limit)
- with_images (boolean, optional, default: false): Whether to include images in the output
For markdown format: a string containing markdown content
For JSON format: a dictionary with the structure:
{
  "url": "https://example.com",
  "title": "Page title",
  "content": "Markdown content"
}

This repo includes five Agent Skills that orchestrate the MCP's search and fetch tools into specialized workflows. Each skill follows the open Agent Skills specification and works with Claude Code, Codex CLI, and other compatible agents.
All skills work without the MCP configured — they use the ddgs Python library and the Jina Reader HTTP API directly. If MCP tools are available in the session, they prefer those automatically.
Register this repo as a plugin marketplace, then install all five skills at once:
# Add the marketplace
/plugin marketplace add CyranoB/web-forager
# Install all 5 skills
/plugin install forager-skills@web-forager

Claude Code:
# Install a specific skill
claude install-skill ./skills/web-research
# Or from GitHub
claude install-skill github:CyranoB/web-forager/skills/web-research

Manual (any agent):
Copy a skill folder from skills/ into your agent's skills directory (e.g., ~/.claude/skills/ or .claude/skills/).
| Skill | Triggers on | Output |
|---|---|---|
| web-research | "research X", "look up X", "deep dive into X" | Adaptive report (quick answer / standard / deep dive) with citations |
| fact-check | "is it true that X", "verify this claim", "fact check this" | Verdict (Confirmed -> False) with evidence for and against |
| news-monitor | "what's new with X", "recent news about X", "catch me up on X" | Chronological news briefing with headlines and details |
| competitive-intel | "competitive landscape for X", "market study", "how do we compare to competitors" | Market landscape map or competitive positioning analysis with pricing, gaps, and recommendations |
| tech-advisor | "should we adopt X", "is X production ready", "X vs Y for my needs" | Maturity scorecard (Adopt/Trial/Assess/Hold) or product comparison with evidence |
- Search and news search use the ddgs package (renamed from duckduckgo-search).
- Fetch tries direct HTTP + trafilatura first for speed, then falls back to Jina Reader for JavaScript-heavy or bot-protected pages.
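The try-direct-then-fall-back behavior described above follows a common pattern, sketched here with the two fetchers passed in as plain callables. This is not Web Forager's actual implementation: fetch_with_fallback, primary, and fallback are hypothetical names standing in for the direct HTTP + trafilatura path and the Jina Reader path, and the truncation shown is only one plausible reading of max_length.

```python
from typing import Callable, Optional

# A fetcher takes a URL and returns extracted text, or None/"" on failure.
Fetcher = Callable[[str], Optional[str]]


def fetch_with_fallback(
    url: str,
    primary: Fetcher,
    fallback: Fetcher,
    max_length: Optional[int] = None,
) -> str:
    """Try the primary fetcher; use the fallback if it fails or returns nothing."""
    try:
        content = primary(url)
    except Exception:
        content = None  # treat any primary failure as "no content"
    if not content:
        content = fallback(url)
    if not content:
        raise RuntimeError(f"Could not fetch {url}")
    if max_length is not None:
        content = content[:max_length]  # assumed: simple character truncation
    return content
```

Passing the fetchers in as arguments keeps the fallback logic testable without network access, since stub functions can stand in for either path.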
Contributions are welcome! Here's how you can contribute:
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add some amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
If you encounter any issues or have questions, please open an issue.
This project is licensed under the MIT License - see the LICENSE file for details.
