codebase-rag-mcp

An MCP (Model Context Protocol) server that indexes any codebase using vector embeddings and enables semantic code search and grounded Q&A for AI assistants.

Works with Claude Desktop, Cursor, Copilot Chat, and any MCP-compatible client.

How It Works

Your Codebase → Chunker → OpenAI Embeddings → LanceDB (local disk)
                                                       ↑
AI Client → MCP Tool Call → Embed Query → Similarity Search → Cited Answer

Index — files are chunked at function/class boundaries, embedded with text-embedding-3-small, and stored locally in LanceDB (no server needed)
Search — queries are embedded and matched via cosine similarity
Ask — relevant chunks are injected into a gpt-4o-mini prompt; the answer is returned with source citations
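
To make the Search step concrete, here is a minimal sketch of cosine-similarity ranking in plain TypeScript. It is illustrative only: in this project the search itself is delegated to LanceDB, and the `IndexedChunk` shape and the `cosineSimilarity` and `topK` helpers are hypothetical names, not the project's actual code.

```typescript
// Hypothetical shape of a stored chunk; the project's real schema may differ.
interface IndexedChunk {
  file: string;
  startLine: number;
  text: string;
  vector: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Brute-force top-k: score every chunk, sort by score descending, keep the first k.
function topK(
  query: number[],
  chunks: IndexedChunk[],
  k = 5
): Array<{ chunk: IndexedChunk; score: number }> {
  return chunks
    .map((chunk) => ({ chunk, score: cosineSimilarity(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}
```

A brute-force scan like this is fine for a few thousand chunks; a vector store such as LanceDB handles the same ranking without loading every vector into memory.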

Tools

Tool	Description
`index_repository`	Crawl + chunk + embed a local codebase
`search_code`	Semantic similarity search over indexed code
`ask_codebase`	Natural language Q&A with cited sources
`list_repositories`	View all indexed repos and stats
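
The chunking that `index_repository` performs could be approximated as follows. This is a rough sketch under stated assumptions: it splits only at unindented `function` and `class` declarations using a regular expression, whereas the real chunker may use a parser and support more languages. `CodeChunk`, `BOUNDARY`, and `chunkByDeclarations` are hypothetical names.

```typescript
// Hypothetical chunk record; line numbers are 1-based and inclusive.
interface CodeChunk {
  startLine: number;
  endLine: number; // includes any blank lines that trail the declaration
  text: string;
}

// Heuristic boundary: an unindented function or class declaration,
// optionally prefixed by export / default / async.
const BOUNDARY = /^(export\s+)?(default\s+)?(async\s+)?(function|class)\b/;

function chunkByDeclarations(source: string): CodeChunk[] {
  const lines = source.split("\n");
  const starts: number[] = [];
  lines.forEach((line, i) => {
    if (BOUNDARY.test(line)) starts.push(i);
  });
  // Anything before the first declaration (imports, constants) forms its own chunk.
  if (starts.length === 0 || starts[0] !== 0) starts.unshift(0);

  const chunks: CodeChunk[] = [];
  for (let j = 0; j < starts.length; j++) {
    const begin = starts[j];
    const end = j + 1 < starts.length ? starts[j + 1] : lines.length;
    const text = lines.slice(begin, end).join("\n").trimEnd();
    // Skip chunks that contain only whitespace.
    if (text.trim().length > 0) {
      chunks.push({ startLine: begin + 1, endLine: end, text });
    }
  }
  return chunks;
}
```

Keeping `startLine` with each chunk is what later allows an answer to cite a location such as `auth/middleware.ts:23`.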

Setup

Prerequisites: Node.js 18+, OpenAI API key

git clone https://github.com/yourusername/codebase-rag-mcp
cd codebase-rag-mcp
npm install
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY
npm run build

Add to Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json (macOS) or %APPDATA%\Claude\claude_desktop_config.json (Windows):

{
  "mcpServers": {
    "codebase-rag": {
      "command": "node",
      "args": ["/absolute/path/to/codebase-rag-mcp/dist/server.js"],
      "env": {
        "OPENAI_API_KEY": "sk-..."
      }
    }
  }
}

Restart Claude Desktop. The four tools listed under Tools will appear automatically.

Example Usage

You: index_repository({ "path": "/Users/you/projects/my-app" })
→ { chunksIndexed: 247, tokensUsed: 4891, timeTakenMs: 12300, repoId: "my-app" }

You: search_code({ "query": "how does authentication work", "repo": "my-app" })
→ Returns top 5 code chunks ranked by semantic similarity

You: ask_codebase({ "question": "How does the JWT middleware validate tokens?" })
→ "The JWT middleware validates tokens by... [1][2] — auth/middleware.ts:23"
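
The bracketed citations in the `ask_codebase` answer suggest that retrieved chunks are numbered before being sent to the model. A minimal sketch of that prompt assembly, assuming a hypothetical `RetrievedChunk` shape and `buildGroundedPrompt` helper (the project's actual prompt wording is not shown in this README):

```typescript
// Hypothetical shape of a chunk returned by the search step.
interface RetrievedChunk {
  file: string;
  startLine: number;
  text: string;
}

// Number each chunk as [n] with its file:line label so the model can cite it inline.
function buildGroundedPrompt(question: string, chunks: RetrievedChunk[]): string {
  const sources = chunks
    .map((c, i) => `[${i + 1}] ${c.file}:${c.startLine}\n${c.text}`)
    .join("\n\n");
  return [
    "Answer the question using only the numbered sources below.",
    "Cite sources inline as [n]. If the sources do not contain the answer, say so.",
    "",
    "Sources:",
    sources,
    "",
    `Question: ${question}`,
  ].join("\n");
}
```

The resulting string would be sent as the user message of a chat completion request; mapping each `[n]` in the reply back to `chunks[n - 1]` recovers the file and line for the citation.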

Running Tests

npm test                          # unit tests (no API key needed)
OPENAI_API_KEY=sk-... npm test    # includes integration tests

Tech Stack

TypeScript — end-to-end type safety
@modelcontextprotocol/sdk — MCP server protocol
LanceDB — embedded vector DB, zero config, persists to disk
OpenAI — text-embedding-3-small for embeddings, gpt-4o-mini for Q&A
Vitest — testing
