Two Python-based MCP servers for documentation scraping and reading. Heavily inspired by docs-mcp-server.
- Scraper Server: Crawls documentation sites and stores content in SQLite
- Reader Server: Searches and retrieves stored documentation
Scraper server features:

- 🕷️ Site crawling with configurable depth and page limits
- 📄 HTML content extraction (title, content, metadata)
- 💾 SQLite storage with FTS5 full-text search
- 🔧 MCP tools for Claude integration
- 💻 CLI for direct usage

Reader server features:

- 🔍 Keyword search across all documentation
- 📚 List available libraries and versions
- 📖 Browse documentation by library
- 🎯 Retrieve specific documents by URL
- 🔧 5 MCP tools for Claude integration
```bash
# Clone the repository
cd docs-mcp-server-python
# Install dependencies
uv sync --extra dev
```

Deploy both servers using Docker for production or containerized development.

```bash
# Build images
cd docker && ./build.sh
# Deploy services
cd ../deploy && docker-compose up -d
```

Both servers are now running:
- Scraper MCP Server: http://localhost:6281
- Reader MCP Server: http://localhost:6282
- 🐳 Containerized: Isolated, reproducible environments
- 🔒 Network Isolation: The reader server cannot access the internet (security)
- 💾 Shared Database: A single SQLite database via a Docker volume
- 🔄 Auto-restart: Services restart on failure
- ⚙️ Configurable: Customize via the `.env` file
```
Scraper (6281) ──┬──> Shared SQLite Database
                 │
Reader (6282) ───┘  (Network Isolated)
```
- Building Images: See docker/README.md
- Deployment Guide: See deploy/README.md
- Full Details: Complete instructions, troubleshooting, and configuration options in the READMEs above
```bash
# Service management
docker-compose up -d # Start all
docker-compose logs -f reader # View logs
docker-compose restart scraper # Restart service
docker-compose down # Stop all
# Individual services
docker-compose up -d scraper # Start scraper only
docker-compose stop reader    # Stop reader only
```

Scrape documentation from the CLI:

```bash
# Scrape React documentation
uv run python scraper/cli.py scrape \
--url https://react.dev \
--library react \
--version 19.0 \
--max-depth 2 \
--max-pages 50
# View all options
uv run python scraper/cli.py scrape --help
```

Add the scraper to your MCP client configuration (e.g., Claude Desktop) for local stdio-based execution:

```json
{
"mcpServers": {
"docs-scraper": {
"command": "uv",
"args": [
"--directory",
"/path/to/docs-mcp-server-python",
"run",
"python",
"scraper/server.py"
],
"env": {
"MCP_TRANSPORT": "stdio"
}
}
}
}
```

Then use the `scrape_documentation` tool in Claude.
Note: For Docker deployments, see MCP Configuration for Docker Containers below.
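Outside of Claude, the same server can be exercised with the official MCP Python SDK. A minimal sketch; the `scrape_documentation` argument names mirror the CLI flags above and are assumptions:

```python
import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client


async def main() -> None:
    # Same launch settings as the JSON config above.
    params = StdioServerParameters(
        command="uv",
        args=[
            "--directory", "/path/to/docs-mcp-server-python",
            "run", "python", "scraper/server.py",
        ],
        env={"MCP_TRANSPORT": "stdio"},
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Argument names mirror the CLI flags; assumed, not verified.
            result = await session.call_tool(
                "scrape_documentation",
                {"url": "https://react.dev", "library": "react", "version": "19.0"},
            )
            print(result.content)


asyncio.run(main())
```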
Add the reader to your MCP client configuration for local stdio-based execution:

```json
{
"mcpServers": {
"docs-reader": {
"command": "uv",
"args": [
"--directory",
"/path/to/docs-mcp-server-python",
"run",
"python",
"reader/server.py"
],
"env": {
"MCP_TRANSPORT": "stdio"
}
}
}
}
```

Available MCP tools:
- `search_documentation` - Search with keywords
- `list_libraries` - List all available libraries
- `list_versions` - List versions for a library
- `get_document` - Retrieve a document by URL
- `browse_library` - Browse all documents for a library
The servers run with HTTP transport in Docker, making them accessible as network services.
After starting the Docker containers (`docker-compose up -d`), add the servers:

```bash
# Add scraper server
claude mcp add --transport http docs-scraper http://localhost:6281/mcp
# Add reader server
claude mcp add --transport http docs-reader http://localhost:6282/mcp
```

Alternatively, add to your MCP client configuration:
Scraper Server (Docker):

```json
{
"mcpServers": {
"docs-scraper": {
"transport": "http",
"url": "http://localhost:6281/mcp"
}
}
}
```

Reader Server (Docker):

```json
{
"mcpServers": {
"docs-reader": {
"transport": "http",
"url": "http://localhost:6282/mcp"
}
}
}
```

Requirements:
- Docker containers must be running (`docker-compose up -d`)
- Servers use HTTP transport for remote access
- The database is shared between containers via a bind mount at `deploy/data/`
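For reference, a plain Python client can reach the reader container over HTTP via the official MCP SDK's streamable-HTTP client. A sketch; the `search_documentation` parameter name (`query`) is an assumption:

```python
import asyncio

from mcp import ClientSession
from mcp.client.streamable_http import streamablehttp_client


async def main() -> None:
    # Port 6282 is the reader container from docker-compose.
    async with streamablehttp_client("http://localhost:6282/mcp") as (read, write, _):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print("tools:", [tool.name for tool in tools.tools])
            # "query" as the parameter name is an assumption.
            result = await session.call_tool("search_documentation", {"query": "hooks"})
            print(result.content)


asyncio.run(main())
```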
Note: The servers auto-detect the transport mode. They default to HTTP (for Docker), but you can force stdio mode by setting the `MCP_TRANSPORT=stdio` environment variable.
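A sketch of that detection logic, assuming the servers are built on the SDK's FastMCP (this mirrors the described behavior, not the project's literal code):

```python
import os

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("docs-reader")

if __name__ == "__main__":
    # Default to HTTP for Docker; honor MCP_TRANSPORT=stdio when set.
    if os.environ.get("MCP_TRANSPORT") == "stdio":
        mcp.run(transport="stdio")
    else:
        mcp.run(transport="streamable-http")
```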
Environment variables (optional):

```bash
# Database location (default: ~/.docs-mcp/documentation.db)
export DOCS_MCP_DB_PATH=/path/to/database.db
# Scraping settings
export DOCS_MCP_USER_AGENT="MyBot/1.0"
export DOCS_MCP_TIMEOUT=30
export DOCS_MCP_DELAY=0.5
export DOCS_MCP_MAX_DEPTH=3
export DOCS_MCP_MAX_PAGES=100
```
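For illustration, a hypothetical sketch of how `shared/config.py` might consume these variables; field names and all defaults other than the documented database path are assumptions:

```python
import os
from dataclasses import dataclass, field
from pathlib import Path


def _default_db() -> Path:
    # Documented default: ~/.docs-mcp/documentation.db
    return Path(os.environ.get(
        "DOCS_MCP_DB_PATH",
        str(Path.home() / ".docs-mcp" / "documentation.db"),
    ))


@dataclass(frozen=True)
class Config:
    db_path: Path = field(default_factory=_default_db)
    # Defaults below are placeholders, not the project's actual values.
    user_agent: str = os.environ.get("DOCS_MCP_USER_AGENT", "docs-mcp/1.0")
    timeout: int = int(os.environ.get("DOCS_MCP_TIMEOUT", "30"))
    delay: float = float(os.environ.get("DOCS_MCP_DELAY", "0.5"))
    max_depth: int = int(os.environ.get("DOCS_MCP_MAX_DEPTH", "3"))
    max_pages: int = int(os.environ.get("DOCS_MCP_MAX_PAGES", "100"))


config = Config()
```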
Project structure:

```
docs-mcp-server-python/
├── scraper/                 # Scraper server
│   ├── server.py            # MCP server
│   ├── cli.py               # CLI commands
│   ├── crawler.py           # Site crawler
│   ├── extractors/          # Content extractors
│   │   ├── base.py          # Base interface
│   │   └── html.py          # HTML extractor
│   └── strategies/          # Scraping strategies
│       ├── base.py          # Base interface
│       └── site_crawler.py  # Site crawler strategy
├── reader/                  # Reader server
│   ├── server.py            # MCP server
│   ├── search.py            # Search functionality
│   └── query.py             # Query builder
├── shared/                  # Shared components
│   ├── database.py          # Database manager
│   ├── schema.py            # Database schema
│   ├── models.py            # Data models
│   └── config.py            # Configuration
├── docker/                  # Docker image building
│   ├── Dockerfile.scraper   # Scraper image
│   ├── Dockerfile.reader    # Reader image
│   ├── build.sh             # Build script
│   ├── push.sh              # Push to registry
│   └── README.md            # Build documentation
├── deploy/                  # Deployment configuration
│   ├── docker-compose.yml   # Service orchestration
│   ├── .env.example         # Configuration template
│   └── README.md            # Deployment guide
└── tests/                   # Tests
```
Simple schema with FTS5 full-text search:

```
documents (
id, library, version, url, title, content,
raw_html, metadata, scraped_at, updated_at
)

documents_fts (FTS5 virtual table for search)
```
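To make the search mechanics concrete, here is a self-contained sketch of an FTS5 external-content setup with a ranked MATCH query; the exact SQL used by `reader/search.py` is an assumption:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (
        id INTEGER PRIMARY KEY, library TEXT, version TEXT,
        url TEXT, title TEXT, content TEXT
    );
    -- External-content FTS5 index over the documents table.
    CREATE VIRTUAL TABLE documents_fts USING fts5(
        title, content, content='documents', content_rowid='id'
    );
""")
conn.execute(
    "INSERT INTO documents (library, version, url, title, content) VALUES (?, ?, ?, ?, ?)",
    ("react", "19.0", "https://react.dev/learn", "Quick Start", "Hooks let you use state."),
)
# Keep the FTS index in sync with the content table.
conn.execute(
    "INSERT INTO documents_fts(rowid, title, content) SELECT id, title, content FROM documents"
)

# Rank matches with BM25 (lower bm25() scores are better matches).
rows = conn.execute("""
    SELECT d.url, d.title
    FROM documents_fts f JOIN documents d ON d.id = f.rowid
    WHERE documents_fts MATCH ?
    ORDER BY bm25(documents_fts)
""", ("hooks",)).fetchall()
print(rows)
```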
Run the tests:

```bash
# Run all tests
uv run pytest -v
# Run specific test file
uv run pytest tests/shared/test_database.py -v
```

Type checking:

```bash
uv run mypy shared/ scraper/ reader/ --strict
```

Linting and formatting:

```bash
uv run ruff check .
uv run ruff format .
```

Example workflow:

```bash
# 1. Scrape Python documentation
uv run python scraper/cli.py scrape \
--url https://docs.python.org/3/ \
--library python \
--version 3.13
# 2. Start reader server and search
# (In Claude with reader MCP configured)
# Use: search_documentation("async generators")
```

Browsing example:

```bash
# Start reader MCP server
# (In Claude)
# Use: list_libraries()
# Use: browse_library("python", "3.13")
```

See the Implementation Plan for:
- Future enhancements (Playwright, semantic search, etc.)
- Testing strategy
- Detailed implementation notes
Status: ✅ Complete - All core functionality implemented:
- Phase 1: Foundation & Shared Components
- Phase 2: Scraper Server (MCP + CLI)
- Phase 3: Reader Server (MCP tools)
License: MIT