This project provides a minimal, async MCP (Model Context Protocol) server that exposes a tool for retrieving and cleaning official documentation content for popular AI / Python ecosystem libraries. It uses:

- `fastmcp` to define and run the MCP server over stdio.
- `httpx` for async HTTP calls.
- `serper.dev` for Google-like search (via API).
- `groq` API (LLM) to clean raw HTML into readable text chunks.
- `python-dotenv` for environment variable management.
- `uv` as the package manager & runner (fast, lockfile-based, Python 3.11+).
Features:

- Search restricted to official docs domains (`uv`, `langchain`, `openai`, `llama-index`).
- Tool: `get_docs(query, library)` returns concatenated cleaned sections with `SOURCE:` labels.
- Streaming-safe async design (chunking large HTML pages before LLM cleaning).
- Separate `client.py` demonstrating how to connect as an MCP client, call the tool, then post-process with an LLM.
Prerequisites:

- Python 3.11+
- `uv` installed (https://docs.astral.sh/uv/)
- API keys for `SERPER_API_KEY` and `GROQ_API_KEY`
Clone the repo and sync dependencies:

```bash
git clone <your-repo-url> mcp-server-python
cd mcp-server-python
uv sync
```
This will create/refresh a `.venv` based on `pyproject.toml` + `uv.lock`.
Create a `.env` file in the project root:

```
SERPER_API_KEY=your_serper_api_key_here
GROQ_API_KEY=your_groq_api_key_here
```
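These keys are picked up at startup via `python-dotenv`. A minimal sketch of the loading pattern (the exact location and variable names in the code may differ):

```python
import os

from dotenv import load_dotenv

load_dotenv()  # reads key=value pairs from .env into os.environ

SERPER_API_KEY = os.getenv("SERPER_API_KEY")
GROQ_API_KEY = os.getenv("GROQ_API_KEY")
```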
Optional: add other model settings if you later extend functionality.
Run the server:

```bash
uv run mcp_server.py
```

The server will start and wait on stdio (no extra output unless you add logging). It registers the tool `get_docs`.
Run the client:

```bash
uv run client.py
```

You should see something like:

```
Available tools: ['get_docs']
ANSWER: <model-produced answer referencing SOURCE lines>
```

If the list is empty, ensure the server started correctly and no exceptions were raised (add logging; see the logging snippet below).
Signature:

```python
get_docs(query: str, library: str) -> str
```

Supported libraries (keys): `uv`, `langchain`, `openai`, `llama-index`.
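Internally, each library key maps to its official docs domain, which is used to build the site-restricted query described under Flow below. A sketch of that mapping (the domains shown here are plausible assumptions; the authoritative values live in `mcp_server.py`):

```python
# Hypothetical key -> docs-domain mapping; check mcp_server.py for the real one.
docs_urls = {
    "uv": "docs.astral.sh/uv",
    "langchain": "python.langchain.com/docs",
    "openai": "platform.openai.com/docs",
    "llama-index": "docs.llamaindex.ai",
}

def build_query(query: str, library: str) -> str:
    # e.g. build_query("install", "uv") -> "site:docs.astral.sh/uv install"
    return f"site:{docs_urls[library]} {query}"
```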
Flow:

1. Build a site-restricted query: `site:<docs-domain> <query>`.
2. Call the Serper API for organic results.
3. Fetch each result URL (async) via `httpx`.
4. Split HTML into ~4000-char chunks (memory safety & LLM limits; see the pipeline sketch below).
5. Clean each chunk using the Groq LLM (`openai/gpt-oss-20b`) with a system prompt.
6. Concatenate and label each block with `SOURCE: <url>` for traceability.
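A condensed sketch of steps 3–6 (helper names such as `clean_chunk` are illustrative stand-ins for the real code in `mcp_server.py` / `utils.py`):

```python
import httpx

CHUNK_SIZE = 4000  # ~4000 chars keeps each LLM cleaning call within limits

async def fetch_url(url: str) -> str:
    async with httpx.AsyncClient(timeout=30.0) as client:
        resp = await client.get(url)
        return resp.text

def chunk_html(html: str, size: int = CHUNK_SIZE) -> list[str]:
    # Simple fixed-size slicing; memory-safe for large pages.
    return [html[i : i + size] for i in range(0, len(html), size)]

async def clean_chunk(chunk: str) -> str:
    # Placeholder for the Groq LLM call (openai/gpt-oss-20b) in utils.py.
    return chunk.strip()

async def clean_page(url: str) -> str:
    html = await fetch_url(url)
    cleaned = [await clean_chunk(c) for c in chunk_html(html)]
    return f"SOURCE: {url}\n" + "\n".join(cleaned)
```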
Returned value: A large text blob suitable for retrieval-augmented prompting, preserving source attribution lines.
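Illustratively (URLs and text will vary with the query):

```
SOURCE: https://docs.astral.sh/uv/getting-started/installation/
uv can be installed with the standalone installer or from PyPI...

SOURCE: https://docs.astral.sh/uv/guides/projects/
uv manages project dependencies through pyproject.toml and uv.lock...
```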
File overview:

| File | Purpose |
|---|---|
| `mcp_server.py` | Defines the FastMCP instance and implements `search_web`, `fetch_url`, and the `get_docs` tool. |
| `client.py` | Launches the server via stdio, lists tools, calls `get_docs`, then feeds the result to an LLM for a user-friendly answer. |
| `utils.py` | HTML cleaning helper (currently uses `trafilatura` for extraction and the Groq LLM for chunk transformation). |
| `.env` | Environment variables (excluded from VCS). |
| `pyproject.toml` | Declares dependencies and metadata. |
| `uv.lock` | Reproducible lockfile generated by `uv`. |
Core runtime deps (from `pyproject.toml`):

- `fastmcp` – MCP server helper.
- `httpx` – async HTTP client.
- `groq` – Groq API client.
- `python-dotenv` – load variables from `.env`.
- `trafilatura` – heuristic content extraction (currently partially used / can be extended).

Tip: If you add more scraping tools, reuse a single `httpx.AsyncClient` for performance, as sketched below.
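A sketch of the shared-client pattern (names are illustrative):

```python
import httpx

# One module-level client reuses TCP/TLS connections across all requests.
client = httpx.AsyncClient(timeout=30.0)

async def fetch(url: str) -> str:
    resp = await client.get(url)
    resp.raise_for_status()
    return resp.text

# Remember to close it once at shutdown:
#     await client.aclose()
```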
To see what the server is doing, you can temporarily add:

```python
import logging
import sys

logging.basicConfig(level=logging.INFO, stream=sys.stderr)
```

Place this near the top of `mcp_server.py`, after the imports. Since the protocol uses stdout for JSON-RPC, send logs to stderr only.
Common issues:

- Empty tool list: the server exited early or crashed; add logging.
- `SERPER_API_KEY` missing → 401 or empty search results.
- `GROQ_API_KEY` missing → LLM cleaning fails (exception in `get_response_from_llm`).
- Network timeouts: adjust `timeout` in `httpx.AsyncClient` calls (see the example below).
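For the timeout case, `httpx` accepts either a plain float or a granular `httpx.Timeout`; for example (values are illustrative):

```python
import httpx

# 10 s to connect, 60 s to read slow documentation pages.
timeout = httpx.Timeout(connect=10.0, read=60.0, write=10.0, pool=10.0)

async def fetch(url: str) -> str:
    async with httpx.AsyncClient(timeout=timeout) as client:
        return (await client.get(url)).text
```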
Ideas:

- Add a caching layer (e.g., `sqlite` or an in-memory dict) to avoid re-fetching the same URLs.
- Parallelize URL fetch + clean with `asyncio.gather()` (mind rate limits / LLM cost); a sketch combining both ideas follows this list.
- Add another tool (e.g., `summarize_diff`, `list_endpoints`).
- Provide structured JSON output (list of sources + cleaned text) instead of a concatenated string.
- Add tests using `pytest` + `pytest-asyncio` (mock the Serper + LLM APIs).
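A sketch combining the first two ideas: a naive in-memory cache plus bounded concurrency via `asyncio.gather()` (the semaphore limit is an arbitrary example):

```python
import asyncio

import httpx

_cache: dict[str, str] = {}  # naive in-memory cache keyed by URL

async def fetch_url(url: str) -> str:
    async with httpx.AsyncClient(timeout=30.0) as client:
        return (await client.get(url)).text

async def fetch_cached(url: str, sem: asyncio.Semaphore) -> str:
    if url not in _cache:
        async with sem:  # bound concurrency to respect rate limits / LLM cost
            _cache[url] = await fetch_url(url)
    return _cache[url]

async def gather_pages(urls: list[str]) -> list[str]:
    sem = asyncio.Semaphore(4)  # tune to your rate limits
    return await asyncio.gather(*(fetch_cached(u, sem) for u in urls))
```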
If you want to call the tool directly in a Python script using the client-side MCP library:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def demo():
    params = StdioServerParameters(command="uv", args=["run", "mcp_server.py"])
    async with stdio_client(params) as (r, w):
        async with ClientSession(r, w) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([t.name for t in tools.tools])
            result = await session.call_tool("get_docs", {"query": "install", "library": "uv"})
            # result.content is a list of content blocks; print the first text block.
            print(result.content[0].text[:500])

asyncio.run(demo())
```
If you have an already-activated virtual environment and want to use that instead of the project's pinned environment, you can force uv to target it:

```bash
uv run --active client.py
```

Otherwise, uv will warn that your active `$VIRTUAL_ENV` differs from the project `.venv` but continue using the project environment.
Add a license section here (e.g., MIT) if you intend to distribute.
Quick troubleshooting reference:

| Symptom | Cause | Fix |
|---|---|---|
| No tools listed | Server not running / crashed | Add stderr logging; run `uv run mcp_server.py` manually |
| `AttributeError` on `.text` | Cleaner returned `None` | Ensure you return an actual string from `fetch_url` / the LLM call |
| 401 from Serper | Bad/missing API key | Check `.env` and reload the shell |
| Empty search results | Overly narrow query | Simplify the query or verify the domain key |
| High latency | Many sequential LLM chunk calls | Batch or reduce chunk size |
Contributing:

- Fork & branch.
- Run `uv sync`.
- Add tests for new tools (if added).
- Open a PR with a clear description.
Roadmap:

- [ ] Add JSON schema metadata for tool params.
- [ ] Structured response format (list of `{source, text}`).
- [ ] Add caching layer.
- [ ] Add rate limiting/backoff.
- [ ] Add CI workflow (lint + tests).
Acknowledgments:

- Serper.dev for the search API
- Groq for fast OSS model serving
- Astral for `uv`
- The MCP ecosystem for the protocol foundation