Docs MCP Server - Secure Local Setup

Run the Docs MCP Server (https://github.com/arabold/docs-mcp-server) completely isolated from the internet with local data storage.

Quick Start

1. Index Documentation (needs internet)

Note: Server does NOT need to be running for scraping. The scraper writes directly to ./data/docs.db.

./scrape-docs.sh react https://react.dev/reference/react 18.3.1
./scrape-docs.sh typescript https://www.typescriptlang.org/docs/

Data stored in: ./data/docs.db
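
When several libraries need indexing, the two commands above can be driven from a small list. The following is a minimal sketch, not part of the repository: the scrape_all function and the SCRAPER variable are hypothetical names, and it assumes scrape-docs.sh takes its arguments exactly as shown above (library, URL, optional version).

```shell
# Sketch: index several libraries in one pass.
# Input on stdin, one entry per line: "<library> <url> [version]".
# SCRAPER is a hypothetical override so the command can be swapped for a dry run.
SCRAPER="${SCRAPER:-./scrape-docs.sh}"

scrape_all() {
  local name url version
  # The "|| [ -n ... ]" part keeps a final line without a trailing newline.
  while read -r name url version || [ -n "$name" ]; do
    # Skip blank lines and lines starting with '#'.
    [ -z "$name" ] && continue
    case "$name" in \#*) continue ;; esac
    # </dev/null stops the scraper from consuming the rest of the list.
    if [ -n "$version" ]; then
      "$SCRAPER" "$name" "$url" "$version" </dev/null || return 1
    else
      "$SCRAPER" "$name" "$url" </dev/null || return 1
    fi
  done
}

# Usage (uncomment to run):
# scrape_all <<'EOF'
# react https://react.dev/reference/react 18.3.1
# typescript https://www.typescriptlang.org/docs/
# EOF
```

Setting SCRAPER=echo prints the commands instead of running them, which is a cheap way to check the list before anything touches the network.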

2. Start Server (no internet, reads from database)

./start-docs-mcp-server.sh

Web interface: http://localhost:6280
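
The container may need a moment before the port answers. A rough sketch of a readiness check follows; wait_for_server is a hypothetical helper, and it assumes curl is installed and that the web interface answers a plain GET with a success status.

```shell
# Sketch: poll the web interface until it responds or the attempts run out.
# Arguments: URL (default http://localhost:6280), attempts (default 30).
wait_for_server() {
  local url="${1:-http://localhost:6280}"
  local attempts="${2:-30}"
  local i=1
  while :; do
    # -f: treat HTTP errors as failure, -sS: quiet but keep error messages,
    # --max-time: never hang longer than 2 seconds on a single try.
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      return 0
    fi
    [ "$i" -ge "$attempts" ] && return 1
    i=$((i + 1))
    sleep 1
  done
}

# Usage (uncomment to run):
# ./start-docs-mcp-server.sh && wait_for_server && echo "server is up"
```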

3. Configure MCP Client

Add to your MCP client config (Claude Desktop, VS Code, Cline, etc.):

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json

{
  "mcpServers": {
    "docs-mcp-server": {
      "type": "sse",
      "url": "http://localhost:6280/sse"
    }
  }
}

Restart your MCP client after updating the config.

4. Use It

Ask your AI assistant:

Search the React documentation for useState hooks

Security Features

✅ Complete network isolation - Cannot reach external internet
✅ Read-only mode - Cannot modify or delete indexed data
✅ Manual scraping only - You control what gets indexed (no AI-initiated scraping)
✅ SHA-pinned images - Verified checksums prevent tampering
✅ Local data storage - Everything in ./data/ directory
✅ No telemetry - No analytics or tracking

Available Commands

Scraping

# Web URL
./scrape-docs.sh library https://example.com/docs [version]

# Local files
./scrape-docs.sh mylib file:///path/to/docs

Server Management

./start-docs-mcp-server.sh   # Start server
./stop-docs-mcp-server.sh    # Stop server
docker logs -f docs-mcp-server  # View logs

Image Updates

./update-image.sh  # Update to latest image (with SHA verification)
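
To illustrate what SHA pinning means in practice, here is a small sketch that checks whether an image reference is pinned by digest. It is not taken from update-image.sh; is_pinned_image is a hypothetical helper, and it assumes the reference uses Docker's name@sha256:<digest> form.

```shell
# Sketch: succeed only for image references pinned to a sha256 digest.
is_pinned_image() {
  # Expected shape: <name>@sha256:<64 lowercase hex characters>.
  printf '%s\n' "$1" | grep -Eq '^[^@[:space:]]+@sha256:[0-9a-f]{64}$'
}

# Usage (uncomment to run; the reference is whatever your IMAGE setting holds):
# is_pinned_image "$IMAGE" || echo "IMAGE is not pinned to a digest" >&2
```

A tag such as :latest can point at different content over time, while a digest reference always resolves to the same image bytes.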

MCP Tools Available

Once connected, your AI can use:

search_docs - Search indexed documentation
list_libraries - Show all indexed libraries
find_version - Find matching versions

Important Security Note:

The server runs in read-only mode with no internet access
scrape_docs and fetch_url tools are disabled/non-functional
You are responsible for manually indexing documentation via ./scrape-docs.sh
This prevents the AI from making unauthorized network requests

Directory Structure

docs-mcp/
├── data/                    # SQLite database (auto-created)
├── start-docs-mcp-server.sh # Start isolated server
├── stop-docs-mcp-server.sh  # Stop server
├── scrape-docs.sh          # Index documentation
├── update-image.sh         # Update Docker image
└── README.md               # This file

How It Works

┌────────────────────────────────────────────┐
│ Step 1: Scraping (separate container)     │
│  ./scrape-docs.sh                          │
│  • Temporary Docker container              │
│  • Has internet access                     │
│  • Fetches docs from web                   │
│  • Writes to ./data/docs.db                │
│  • Container exits when done               │
└────────────────────────────────────────────┘
                    ↓ (writes to)
              ./data/docs.db
                    ↓ (reads from)
┌────────────────────────────────────────────┐
│ Step 2: Server (persistent container)     │
│  ./start-docs-mcp-server.sh                │
│  • Isolated network (no internet)          │
│  • Read-only mode                          │
│  • Reads from ./data/docs.db               │
│  • Serves MCP tools on port 6280           │
└────────────────────────────────────────────┘
                    ↓
┌────────────────────────────────────────────┐
│ Step 3: MCP Client                         │
│  Connects via http://localhost:6280/sse    │
└────────────────────────────────────────────┘

Key Points:

Scraping and serving are separate containers
Server does NOT need to run during scraping
Both use the same ./data/docs.db file (the scraper writes to it, the server reads from it)
Scrape anytime (even with server running)

Advanced

Backup Data

cp -r ./data/ ./data-backup/
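
The single copy above overwrites nothing but also keeps only one backup. A sketch of a timestamped variant follows; backup_data and the ./backups directory are hypothetical names. Copying while a scrape is still writing may capture a partial database, so it is probably safest to run this when no scrape is active.

```shell
# Sketch: copy the data directory into a timestamped backup folder.
# Arguments: source directory (default ./data), backup root (default ./backups).
# Prints the path of the new backup on success.
backup_data() {
  local src="${1:-./data}"
  local dest_root="${2:-./backups}"
  local dest
  dest="$dest_root/data-$(date +%Y%m%d-%H%M%S)"
  if [ ! -d "$src" ]; then
    echo "no such directory: $src" >&2
    return 1
  fi
  mkdir -p "$dest_root" && cp -r "$src" "$dest" && echo "$dest"
}

# Usage (uncomment to run):
# backup_data ./data ./backups
```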

View Database

sqlite3 ./data/docs.db "SELECT name FROM libraries;"

Change Network Settings

Edit start-docs-mcp-server.sh to modify:

PORT - Server port
NETWORK_NAME - Docker network name
IMAGE - Docker image SHA (update via ./update-image.sh)

Security Notes

During scraping: Container has internet access to fetch documentation

During serving: Container is completely isolated:

Custom bridge network with no masquerading
DNS disabled (resolver pointed at 127.0.0.1)
Network capabilities dropped (NET_RAW, NET_ADMIN)
SHA-pinned Docker images with checksum verification
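
For readers who want to see how the measures listed above map onto Docker options, the sketch below prints (but does not run) one possible pair of commands. It is not the content of start-docs-mcp-server.sh: the network name, the <digest> placeholder, the /data mount path, and the loopback-only port binding are all assumptions, and print_isolated_run is a hypothetical helper.

```shell
# Hypothetical defaults; the real values live in start-docs-mcp-server.sh.
PORT="${PORT:-6280}"
NETWORK_NAME="${NETWORK_NAME:-docs-mcp-isolated}"
IMAGE="${IMAGE:-ghcr.io/arabold/docs-mcp-server@sha256:<digest>}"

# Sketch: print the docker commands instead of executing them (dry run).
print_isolated_run() {
  # Bridge network with IP masquerading switched off.
  echo "docker network create --driver bridge" \
       "-o com.docker.network.bridge.enable_ip_masquerade=false $NETWORK_NAME"
  # DNS pointed at loopback, raw-socket and network-admin capabilities dropped,
  # port published on 127.0.0.1 only (assumption), image referenced by digest.
  echo "docker run -d --name docs-mcp-server --network $NETWORK_NAME" \
       "--dns 127.0.0.1 --cap-drop NET_RAW --cap-drop NET_ADMIN" \
       "-p 127.0.0.1:$PORT:$PORT -v $PWD/data:/data $IMAGE"
}

# Usage (uncomment to run):
# print_isolated_run
```

Printing the commands keeps the example safe to try; compare the output with the actual script before relying on any of these options.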

Your documentation stays private and local!

sandeshghanta/documentation-scraping-mcp
