Vectorize GitHub tool documentation and provide an MCP (Model Context Protocol) interface for AI agents.
- 🔄 Auto-fetch from GitHub - Automatically crawls and extracts documentation from GitHub repositories
- 🧠 Vector Embeddings - Uses OpenAI embeddings to store documentation in vector database
- 🔍 Semantic Search - Find relevant documentation using natural language queries
- 🔌 MCP Protocol - Standard Model Context Protocol interface for AI Agents
- 🎨 Modern Web UI - Built with Next.js 15 + TailwindCSS
```
┌─────────────┐     ┌──────────────┐     ┌──────────────┐     ┌────────────┐
│ GitHub Repo │ ──→ │  Crawl Docs  │ ──→ │ Split Chunks │ ──→ │ Embedding  │
└─────────────┘     └──────────────┘     └──────────────┘     └────────────┘
                                                                     ↓
                    ┌──────────────┐   ← Query    ┌──────────┐
                    │  Vector DB   │              │ AI Agent │
                    │  (Upstash)   │   Result →   │          │
                    └──────────────┘              └──────────┘
                            ↑
                     ┌───────────┐
                     │  MCP API  │
                     └───────────┘
```
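The stages in the diagram map onto small, composable interfaces. The TypeScript sketch below is illustrative only; the type and function names are hypothetical, not the project's actual exports:

```typescript
// Illustrative type sketch of the pipeline stages; names are hypothetical.
interface Chunk {
  id: string;
  text: string;
  source: string; // file path within the repository
}

interface SearchResult {
  chunk: Chunk;
  score: number; // similarity score returned by the vector store
}

// Each stage consumes the previous stage's output.
type CrawlDocs = (owner: string, repo: string, branch?: string) => Promise<string[]>;
type SplitChunks = (doc: string, source: string) => Chunk[];
type Embed = (texts: string[]) => Promise<number[][]>;
type Search = (query: string, limit?: number) => Promise<SearchResult[]>;
```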
- Framework: Next.js 15 + TypeScript + TailwindCSS
- Vector Database: Upstash Vector (serverless, perfect for Cloudflare deployment)
- Embeddings: OpenAI text-embedding-3-small
- GitHub API: Octokit
- MCP: @modelcontextprotocol/sdk
Create a `.env.local` file:

```bash
# GitHub (optional but recommended for higher rate limits)
GITHUB_TOKEN=your_github_token

# OpenAI
OPENAI_API_KEY=your_openai_api_key

# Upstash Vector
UPSTASH_VECTOR_REST_URL=your_upstash_vector_url
UPSTASH_VECTOR_REST_TOKEN=your_upstash_vector_token
```

Then install dependencies and start the dev server:

```bash
npm install
npm run dev
```

Open http://localhost:3000 in your browser.
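It can help to fail fast at startup when configuration is missing. A minimal sketch, assuming the standard Upstash REST variable names; `missingEnv` is a hypothetical helper, not part of the project:

```typescript
// Sketch: collect required variables that are absent, so startup can fail fast.
// Variable names assume the standard Upstash REST naming convention.
const REQUIRED = ["OPENAI_API_KEY", "UPSTASH_VECTOR_REST_URL", "UPSTASH_VECTOR_REST_TOKEN"];

function missingEnv(env: Record<string, string | undefined>): string[] {
  return REQUIRED.filter((name) => !env[name]);
}

// At startup, for example:
// const missing = missingEnv(process.env);
// if (missing.length > 0) throw new Error(`Missing env vars: ${missing.join(", ")}`);
```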
Index a repository:

```bash
npx tsx cli/index.ts index <owner> <repo> [branch]
```

Example:

```bash
npx tsx cli/index.ts index openai openai-python main
```

Search the indexed documentation:

```bash
npx tsx cli/index.ts search "how to use embeddings"
```

Show statistics:

```bash
npx tsx cli/index.ts stats
```

Clear the vector store:

```bash
npx tsx cli/index.ts clear
```

Run the MCP server:

```bash
npx tsx cli/index.ts mcp
```

Add this configuration to any AI agent that supports MCP:
```json
{
  "mcpServers": {
    "docs-vector": {
      "command": "node",
      "args": [
        "path/to/docs-vector-mcp/dist/cli/index.js",
        "mcp"
      ],
      "env": {
        "OPENAI_API_KEY": "<your-openai-api-key>",
        "UPSTASH_VECTOR_REST_URL": "<your-upstash-url>",
        "UPSTASH_VECTOR_REST_TOKEN": "<your-upstash-token>"
      }
    }
  }
}
```
- `search_docs` - Search documentation semantically
  - Parameters:
    - `query` (string): The search query
    - `limit` (number, optional): Maximum number of results (1-20, default 5)
- `get_stats` - Get statistics about stored documentation
  - No parameters
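For reference, a `tools/call` request for `search_docs` looks like this on the wire (JSON-RPC 2.0, per the MCP specification); the `query` and `limit` values are just examples:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_docs",
    "arguments": {
      "query": "how to use embeddings",
      "limit": 5
    }
  }
}
```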
This project is optimized for Cloudflare Pages deployment:
- Push your code to GitHub
- Connect your repository to Cloudflare Pages
- Set the build command: `npm install && npx next build`
- Set the output directory: `.next`
- Add all environment variables in the Cloudflare dashboard
- Deploy!
A sample workflow is included in `.github/workflows/deploy.yml` that automatically deploys to Cloudflare Pages on every push to the `main` branch.
```
docs-vector-mcp/
├── app/                    # Next.js app router
│   ├── api/                # API routes
│   │   ├── index/          # Indexing endpoint
│   │   ├── search/         # Search endpoint
│   │   └── stats/          # Stats endpoint
│   ├── globals.css         # Global styles
│   ├── layout.tsx          # Root layout
│   └── page.tsx            # Home page
├── components/             # React components
│   ├── IndexForm.tsx       # Repository indexing form
│   └── SearchForm.tsx      # Search form
├── lib/                    # Core libraries
│   ├── github.ts           # GitHub fetcher
│   ├── text-processor.ts   # Text chunking
│   ├── embedding.ts        # Embedding generator
│   ├── vector-store.ts     # Vector storage
│   ├── mcp-server.ts       # MCP server
│   └── docs-service.ts     # Service orchestrator
├── cli/                    # CLI entry
│   └── index.ts            # CLI main
├── .github/
│   └── workflows/          # GitHub Actions
├── next.config.ts          # Next.js config
├── tailwind.config.ts      # Tailwind config
└── package.json            # Dependencies
```
- Add Repository: You input a GitHub repository that contains tool documentation
- Crawling: The system fetches all documentation files (`.md`, `.mdx`, `.rst`, `.txt`, etc.) from the repository
- Processing: Text is cleaned and split into overlapping chunks
- Embedding: OpenAI generates vector embeddings for each chunk
- Storage: Vectors are stored in Upstash Vector database
- Search: When an AI Agent asks a question, the query is embedded and similar documents are retrieved
- Response: Relevant documentation snippets are returned to the AI Agent for answering
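The processing step above (splitting text into overlapping chunks) can be sketched as a simple character-window splitter. The chunk size and overlap below are illustrative, not the project's actual settings:

```typescript
// Minimal sketch of overlapping chunking; sizes are illustrative defaults.
function splitIntoChunks(text: string, chunkSize = 800, overlap = 100): string[] {
  if (chunkSize <= overlap) throw new Error("chunkSize must exceed overlap");
  const chunks: string[] = [];
  let start = 0;
  while (start < text.length) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break;
    // Step forward by (chunkSize - overlap) so consecutive chunks share context.
    start += chunkSize - overlap;
  }
  return chunks;
}
```

The overlap preserves context across chunk boundaries, so a sentence split mid-chunk still appears intact in the neighboring chunk.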
MIT