A Model Context Protocol (MCP) server providing local-first access to Pydantic and Pydantic AI documentation with BM25-powered full-text search.
- Local-first architecture with offline-only mode (configurable)
- BM25 full-text search across all documentation
- Pre-processed JSONL data included for fast setup
- Intelligent path resolution (no hardcoded paths)
- Complete coverage of Pydantic v2 and Pydantic AI documentation
- Path validation and security controls
- Python 3.12+
- uv package manager
- ~15MB disk space (with indices)
# Clone repository
git clone <repository-url>
cd mcp_pydantic_docs
# Create virtual environment and install dependencies
uv sync
# The server will auto-build indices on first run
# Or manually build them:
uv run python -m mcp_pydantic_docs.indexer
# Verify installation
uv run mcp-pydantic-docs
Note: The server automatically builds search indices from the included JSONL files on first startup if they don't exist. This typically takes 5-10 seconds.
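To make the index-building step more concrete, here is a minimal, self-contained sketch of Okapi BM25 scoring over JSONL records. It is not the project's indexer: the record fields (title, text), the tokenizer, the class name TinyBM25, and the parameters k1=1.5 and b=0.75 are all assumptions chosen for illustration.

```python
import json
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase and keep alphanumeric runs (an assumed, simplistic tokenizer)."""
    return re.findall(r"[a-z0-9_]+", text.lower())


class TinyBM25:
    """Okapi BM25 over pre-tokenized documents (illustrative, not the real index)."""

    def __init__(self, docs: list[list[str]], k1: float = 1.5, b: float = 0.75):
        self.k1, self.b = k1, b
        self.term_counts = [Counter(doc) for doc in docs]
        self.lengths = [len(doc) for doc in docs]
        self.avg_len = sum(self.lengths) / len(docs) if docs else 0.0
        # Document frequency: the number of documents containing each term.
        doc_freq = Counter(term for doc in docs for term in set(doc))
        n = len(docs)
        self.idf = {
            term: math.log((n - df + 0.5) / (df + 0.5) + 1.0)
            for term, df in doc_freq.items()
        }

    def score(self, query: list[str], doc_id: int) -> float:
        counts = self.term_counts[doc_id]
        ratio = self.lengths[doc_id] / self.avg_len if self.avg_len else 0.0
        # Length normalization penalizes documents longer than average.
        norm = self.k1 * (1 - self.b + self.b * ratio)
        return sum(
            self.idf[t] * counts[t] * (self.k1 + 1) / (counts[t] + norm)
            for t in query
            if t in counts
        )

    def top_k(self, query: list[str], k: int = 10) -> list[int]:
        scored = [(self.score(query, i), i) for i in range(len(self.lengths))]
        ranked = sorted((p for p in scored if p[0] > 0), key=lambda p: (-p[0], p[1]))
        return [doc_id for _, doc_id in ranked[:k]]


# Hypothetical JSONL content; the real files are data/pydantic.jsonl and data/pydantic_ai.jsonl.
SAMPLE_JSONL = "\n".join([
    json.dumps({"title": "BaseModel", "text": "BaseModel validates fields on model creation"}),
    json.dumps({"title": "TypeAdapter", "text": "TypeAdapter validates arbitrary types"}),
    json.dumps({"title": "Agents", "text": "An agent runs tools and returns structured output"}),
])

records = [json.loads(line) for line in SAMPLE_JSONL.splitlines() if line.strip()]
index = TinyBM25([tokenize(r["title"] + " " + r["text"]) for r in records])
hits = index.top_k(tokenize("typeadapter types"))
titles = [records[i]["title"] for i in hits]
print(titles)
```

The k argument of top_k plays the same role as the k parameter of the pydantic.search tool: it caps the number of ranked results returned.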
- Create virtual environment and install dependencies:
cd mcp_pydantic_docs
uv sync
This creates .venv/ and installs all dependencies from pyproject.toml.
- Build search indices:
uv run python -m mcp_pydantic_docs.indexer
Generates:
  - data/pydantic_all_bm25.pkl (~3.2MB)
  - data/pydantic_all_records.pkl (~6MB)
- Test the server:
uv run mcp-pydantic-docs
Option 1: Direct execution with uv
uv --directory /path/to/mcp_pydantic_docs run mcp-pydantic-docs
Option 2: Build and install wheel
# Build distribution
uv build
# Install wheel
uv pip install dist/mcp_pydantic_docs-0.1.0-py3-none-any.whl
# Run
mcp-pydantic-docs
Option 3: Install in editable mode
uv pip install -e /path/to/mcp_pydantic_docs
Add to your MCP settings (e.g., cline_mcp_settings.json):
{
  "mcpServers": {
    "pydantic-docs": {
      "disabled": false,
      "timeout": 60,
      "type": "stdio",
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/mcp_pydantic_docs",
        "run",
        "mcp-pydantic-docs"
      ]
    }
  }
}
Replace /absolute/path/to/mcp_pydantic_docs with your installation path.
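Because the path passed to --directory must be absolute, it can be convenient to generate the entry instead of typing it by hand. The sketch below builds the same structure as the settings shown above; the helper name build_server_entry is hypothetical and not part of the package.

```python
import json
from pathlib import Path


def build_server_entry(install_dir: str) -> dict:
    """Return an MCP settings dict with an absolute --directory argument."""
    project = Path(install_dir).resolve()  # resolves relative paths against the cwd
    return {
        "mcpServers": {
            "pydantic-docs": {
                "disabled": False,
                "timeout": 60,
                "type": "stdio",
                "command": "uv",
                "args": ["--directory", str(project), "run", "mcp-pydantic-docs"],
            }
        }
    }


# Run from the directory that contains your checkout, then paste the output
# into your MCP settings file.
config = build_server_entry("mcp_pydantic_docs")
print(json.dumps(config, indent=2))
```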
mcp_pydantic_docs/
├── pyproject.toml # Package configuration
├── uv.lock # Locked dependencies
├── mcp_pydantic_docs/ # Source code
│ ├── __init__.py
│ ├── mcp.py # MCP server implementation
│ ├── indexer.py # BM25 index builder
│ ├── normalize.py # HTML to JSONL converter
│ └── setup.py # Setup utilities
├── data/ # Search data (in git: JSONL only)
│ ├── pydantic.jsonl # Pydantic docs (2.9MB)
│ ├── pydantic_ai.jsonl # Pydantic AI docs (3.3MB)
│ ├── pydantic_all_bm25.pkl # BM25 index (generated)
│ └── pydantic_all_records.pkl # Document records (generated)
├── docs_raw/ # Raw HTML (not in git)
│ ├── pydantic/
│ └── pydantic_ai/
└── docs_md/ # Markdown cache (not in git)
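The two generated .pkl files are the first thing to check when search returns nothing. As a rough sketch, the following reports the same kind of presence and size fields that health.validate lists; the helper name index_status is hypothetical, and the server's own check may differ.

```python
import tempfile
from pathlib import Path

# File names taken from the project layout above.
INDEX_FILES = {
    "bm25": "pydantic_all_bm25.pkl",
    "records": "pydantic_all_records.pkl",
}


def index_status(data_dir: Path) -> dict:
    """Report whether each generated index exists and how large it is in MB."""
    status = {}
    for key, filename in INDEX_FILES.items():
        path = data_dir / filename
        present = path.is_file()
        status[f"{key}_present"] = present
        status[f"{key}_size_mb"] = (
            round(path.stat().st_size / (1024 * 1024), 2) if present else 0.0
        )
    return status


# Demonstration against a throwaway directory holding a 1 MiB placeholder index.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "pydantic_all_bm25.pkl").write_bytes(b"\0" * (1024 * 1024))
    status = index_status(demo_dir)

print(status)
```

In a real checkout, calling index_status(Path("data")) from the repository root would inspect the actual indices.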
The server automatically locates data directories:
- Searches up from mcp.py for data/ or docs_raw/
- Falls back to relative paths from the package directory
- Can be overridden with environment variables:
  - PDA_DOC_ROOT - Path to Pydantic v2 HTML docs
  - PDA_DOC_ROOT_AI - Path to Pydantic AI HTML docs
  - PDA_DATA_DIR - Path to data directory
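The lookup order described above can be sketched as follows. This is an illustration only: the function names are hypothetical, only the PDA_DATA_DIR variable is handled, and the exact fallback location (here, a data/ directory next to the package directory) is an assumption.

```python
import os
import tempfile
from pathlib import Path
from typing import Mapping, Optional


def find_upwards(start: Path, name: str) -> Optional[Path]:
    """Walk from start towards the filesystem root looking for a directory called name."""
    for folder in [start, *start.parents]:
        candidate = folder / name
        if candidate.is_dir():
            return candidate
    return None


def resolve_data_dir(module_file: Path, env: Optional[Mapping[str, str]] = None) -> Path:
    env = os.environ if env is None else env
    # 1. An explicit environment override wins.
    override = env.get("PDA_DATA_DIR")
    if override:
        return Path(override).resolve()
    # 2. Otherwise search upwards from the module's directory.
    package_dir = module_file.resolve().parent
    found = find_upwards(package_dir, "data")
    if found is not None:
        return found
    # 3. Assumed fallback: data/ beside the package directory.
    return package_dir.parent / "data"


# Demonstration with a temporary layout: <root>/data and <root>/pkg/mcp.py.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "data").mkdir()
    (root / "pkg").mkdir()
    module_file = root / "pkg" / "mcp.py"
    module_file.touch()
    expected = root / "data"
    found_dir = resolve_data_dir(module_file, env={})
    override_expected = root / "elsewhere"
    overridden = resolve_data_dir(module_file, env={"PDA_DATA_DIR": str(override_expected)})

print(found_dir == expected, overridden == override_expected)
```

Checking the override first keeps the behavior predictable when the package is installed somewhere that has no data/ directory above it.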
health.ping
Returns: "pong"
health.validate
Returns: {
  "valid": bool,
  "message": str,
  "bm25_present": bool,
  "records_present": bool,
  "bm25_size_mb": float,
  "records_size_mb": float
}
pydantic.search
Parameters:
- query: str (search query)
- k: int = 10 (number of results)
Returns: SearchResponse {
  "results": [
    {
      "title": str,
      "url": str,
      "anchor": str | null,
      "snippet": str
    }
  ]
}
pydantic.get
Parameters:
- path_or_url: str (relative path or full URL)
Returns: GetResponse {
  "url": str,
  "path": str,
  "text": str,
  "html": str
}
pydantic.section
Parameters:
- path_or_url: str
- anchor: str (section ID)
Returns: SectionResponse {
  "url": str,
  "path": str,
  "anchor": str,
  "section": str,
  "truncated": bool
}
pydantic.api
Parameters:
- symbol: str (e.g., "BaseModel", "TypeAdapter")
- anchor: str | null (optional section)
Returns: dict {
  "symbol": str,
  "url": str,
  "section": str | "text": str
}
pydantic.mode
Returns: {
  "offline_only": bool,
  "doc_root": str,
  "doc_root_ai": str,
  "data_dir": str,
  "bm25_present": bool,
  "counts": {
    "pydantic_html": int,
    "pydantic_ai_html": int
  },
  "display_bases": dict
}
admin.cache_status
Returns: {
  "paths": dict,
  "documentation": dict,
  "jsonl_data": dict,
  "search_indices": dict,
  "offline_mode": bool
}
admin.rebuild_indices
Returns: {
  "success": bool,
  "message": str,
  "bm25_size_mb": float,
  "records_size_mb": float
}
uv run python -m mcp_pydantic_docs.indexer
# Check current status
uv run python -m mcp_pydantic_docs.setup --status
# Download and build indices
uv run python -m mcp_pydantic_docs.setup --download --build-index
# Force re-download
uv run python -m mcp_pydantic_docs.setup --download --force
# Clean cache
uv run python -m mcp_pydantic_docs.setup --clean
- OFFLINE_ONLY = True in mcp.py
- Blocks all HTTP/HTTPS requests; known base URLs are accepted only as identifiers for cached pages
- File path validation prevents directory traversal
- All content served from local cache
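One common way to implement the directory-traversal protection mentioned above is to resolve the requested path and confirm it still lies under the documentation root. The sketch below shows that technique; the function name safe_join is hypothetical and the server's actual validation may work differently.

```python
from pathlib import Path


def safe_join(root: Path, relative: str) -> Path:
    """Join relative onto root, rejecting anything that escapes root."""
    root = root.resolve()
    # resolve() collapses ".." segments; an absolute `relative` replaces root entirely.
    candidate = (root / relative).resolve()
    if not candidate.is_relative_to(root):  # Path.is_relative_to needs Python 3.9+
        raise ValueError(f"path escapes documentation root: {relative!r}")
    return candidate


doc_root = Path("docs_raw/pydantic")
print(safe_join(doc_root, "concepts/models/index.html"))

for attempt in ("../../etc/passwd", "/etc/passwd"):
    try:
        safe_join(doc_root, attempt)
    except ValueError as exc:
        print("rejected:", exc)
```

Resolving before comparing matters: a plain string prefix check would accept inputs such as concepts/../../secret that only leave the root after normalization.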
Edit mcp_pydantic_docs/mcp.py:
OFFLINE_ONLY = False  # Allow remote fetching
Note: Online mode is not recommended for production use.
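In offline mode, a documentation URL can be treated purely as a key into the local cache rather than something to fetch. The sketch below illustrates that idea under stated assumptions: the base URLs in KNOWN_BASES and the function name url_to_cache_key are illustrative and may not match what the server actually configures.

```python
from urllib.parse import urlsplit

# Assumed base URLs mapped to the docs_raw/ subdirectory they correspond to.
KNOWN_BASES = {
    "https://docs.pydantic.dev/latest/": "pydantic",
    "https://ai.pydantic.dev/": "pydantic_ai",
}


def url_to_cache_key(url: str) -> tuple[str, str]:
    """Map a known documentation URL to (doc_set, relative_path) without any network access."""
    for base, doc_set in KNOWN_BASES.items():
        if url.startswith(base):
            # Keep only the path portion; query strings and #fragments are dropped.
            relative = urlsplit(url).path[len(urlsplit(base).path):]
            return doc_set, relative
    raise ValueError(f"unknown base URL, not fetched in offline mode: {url}")


print(url_to_cache_key("https://docs.pydantic.dev/latest/concepts/models/#basic-model-usage"))
print(url_to_cache_key("https://ai.pydantic.dev/agents/"))
```

Any URL outside the known bases raises an error instead of triggering a request, which is the behavior offline mode is meant to guarantee.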
uv run pytest
# Format code
uv run black mcp_pydantic_docs/
# Lint
uv run ruff check mcp_pydantic_docs/
# Type check
uv run mypy mcp_pydantic_docs/
# Build wheel and sdist
uv build
# Output: dist/mcp_pydantic_docs-0.1.0-py3-none-any.whl
#         dist/mcp_pydantic_docs-0.1.0.tar.gz
Tracked in git:
- Source code
- uv.lock (reproducible builds)
- data/*.jsonl (~6MB, pre-processed data)
- Documentation and configuration
Not tracked in git:
- .venv/ (virtual environment)
- data/*.pkl (binary indices, rebuilt from JSONL)
- docs_raw/ (45MB HTML, downloadable)
- docs_md/ (derived data)
uv run python -m mcp_pydantic_docs.indexer
Ensure Python 3.12+ is active:
uv python list
uv python install 3.12
Set explicit paths:
export PDA_DATA_DIR=/path/to/mcp_pydantic_docs/data
export PDA_DOC_ROOT=/path/to/mcp_pydantic_docs/docs_raw/pydantic
export PDA_DOC_ROOT_AI=/path/to/mcp_pydantic_docs/docs_raw/pydantic_ai
Verify that the server runs standalone:
uv run mcp-pydantic-docs
# Should start and listen on stdio
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Please see CONTRIBUTING.md for guidelines on:
- Development setup
- Code style and standards
- Testing requirements
- Pull request process
- Commit message conventions
For bugs and feature requests, please open an issue on GitHub.