MCP Server for codebase indexing, semantic search, codebase graph creator and visualizer, and local persistent codebase knowledge

candrle20/codebrain

🧠 CodeBrain MCP Server

Semantic code search powered by AI embeddings and vector similarity.

Integrate intelligent code search directly into Claude Desktop and Cursor through the Model Context Protocol (MCP).

🎯 What It Does

CodeBrain indexes your codebase using AST-based splitting and AI embeddings, enabling:

  • Semantic search - Find code by meaning, not just keywords
  • Smart chunking - AST-aware code splitting (respects functions, classes, etc.)
  • Fast retrieval - Vector similarity search with pgvector
  • Multi-project - Index and search across multiple codebases

🚀 Quick Start

1. Prerequisites

# Docker running (for PostgreSQL + pgvector)
docker ps | grep codebrain

# Node.js 20+
node --version

# Dependencies installed (the path below is the author's; substitute your own clone location)
cd /Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain
pnpm install

2. Setup Database

# Start PostgreSQL with pgvector (if not running)
docker run -d \
  --name codebrain \
  -p 5484:5432 \
  -e POSTGRES_PASSWORD=postgres \
  -e POSTGRES_DB=codebrain \
  pgvector/pgvector:pg15

# Setup database schema
pnpm db:setup
pnpm db:migrate

3. Configure Environment

Edit .env:

GEMINI_API_KEY=your_api_key_here
DATABASE_URL=postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp
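
A startup check along these lines can surface a missing variable before the MCP client reports an opaque connection failure. This is a minimal sketch, not CodeBrain's actual code: `loadConfig` and `CodeBrainConfig` are hypothetical names, and only the two variable names come from the .env shown above.

```typescript
// Hypothetical typed view of the two settings defined in .env.
interface CodeBrainConfig {
  geminiApiKey: string;
  databaseUrl: string;
}

// Validate an environment-like object and return typed settings.
// Throws one error that names every missing or blank variable.
function loadConfig(env: Record<string, string | undefined>): CodeBrainConfig {
  const required = ["GEMINI_API_KEY", "DATABASE_URL"];
  const missing = required.filter((name) => !(env[name] ?? "").trim());
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    geminiApiKey: env["GEMINI_API_KEY"] as string,
    databaseUrl: env["DATABASE_URL"] as string,
  };
}
```

In a Node.js entry point this would typically be called once as `loadConfig(process.env)`.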

4. Test the Server

# Run tests
pnpm test

# Should show: ✅ 28 tests passed

# Test MCP server starts
npx tsx src/index.ts
# Should output: 🚀 CodeBrain MCP Server started (stdio mode)
# Press Ctrl+C to stop

🔌 Connect to Cursor/Claude

For Cursor

  1. Open Cursor Settings → MCP Servers
  2. Add server named codebrain
  3. Copy this config:
{
  "command": "npx",
  "args": [
    "-y",
    "tsx",
    "/Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain/src/index.ts"
  ],
  "env": {
    "GEMINI_API_KEY": "your_key_here",
    "DATABASE_URL": "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp"
  }
}
  4. Restart Cursor
  5. Verify - Check MCP panel shows "codebrain" connected

📖 Detailed guide: See CURSOR_SETUP.md

For Claude Desktop

Edit ~/Library/Application Support/Claude/claude_desktop_config.json:

{
  "mcpServers": {
    "codebrain": {
      "command": "npx",
      "args": [
        "-y",
        "tsx",
        "/Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain/src/index.ts"
      ],
      "env": {
        "GEMINI_API_KEY": "your_key_here",
        "DATABASE_URL": "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp"
      }
    }
  }
}

Restart Claude Desktop.
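
Because both configs embed an absolute path to src/index.ts, the entry has to be adapted to wherever the repository was cloned. The sketch below generates the same JSON shape from a repository root. `buildServerEntry` and the `/path/to/CodeBrain` placeholder are hypothetical, introduced only for illustration.

```typescript
// Shape of one server entry, mirroring the JSON shown above.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Build the "codebrain" entry for a given clone location (hypothetical helper).
function buildServerEntry(
  repoRoot: string,
  geminiApiKey: string,
  databaseUrl: string,
): McpServerEntry {
  // Strip trailing slashes so the joined path has exactly one separator.
  const root = repoRoot.replace(/\/+$/, "");
  return {
    command: "npx",
    args: ["-y", "tsx", `${root}/src/index.ts`],
    env: { GEMINI_API_KEY: geminiApiKey, DATABASE_URL: databaseUrl },
  };
}

// Wrap the entry the way claude_desktop_config.json expects it.
const claudeConfig = {
  mcpServers: {
    codebrain: buildServerEntry(
      "/path/to/CodeBrain/", // placeholder: replace with your clone location
      "your_key_here",
      "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp",
    ),
  },
};

console.log(JSON.stringify(claudeConfig, null, 2));
```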

πŸ› οΈ Available MCP Tools

1. index_codebase

Index a codebase for semantic search.

Parameters:

{
  projectName: string;   // Unique project identifier
  rootPath: string;      // Absolute path to code
  force?: boolean;       // Re-index existing files
}
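
MCP clients invoke tools through a JSON-RPC 2.0 `tools/call` request, so a natural-language prompt ends up as a message roughly like the one built below. This is a sketch of the wire format only; `indexCodebaseRequest` is a hypothetical helper, and the client (Cursor or Claude Desktop) normally constructs this message for you.

```typescript
// Arguments accepted by index_codebase, as listed above.
interface IndexCodebaseArgs {
  projectName: string;
  rootPath: string;
  force?: boolean;
}

// Generic JSON-RPC 2.0 envelope for an MCP tool invocation.
interface ToolCallRequest<A> {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: A };
}

// Build the request an MCP client would send for index_codebase (hypothetical helper).
function indexCodebaseRequest(
  id: number,
  args: IndexCodebaseArgs,
): ToolCallRequest<IndexCodebaseArgs> {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "index_codebase", arguments: args },
  };
}

const request = indexCodebaseRequest(1, {
  projectName: "my-app",
  rootPath: "/Users/me/projects/my-app",
});
console.log(JSON.stringify(request, null, 2));
```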

Example:

"Index my React project at /Users/me/projects/my-app with name 'my-app'"

2. semantic_search

Search code semantically across indexed projects.

Parameters:

{
  query: string;         // What to search for
  projectName?: string;  // Filter by project
  topK?: number;        // Number of results (default: 5)
  threshold?: number;   // Similarity threshold (default: 0.5)
}
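
The interaction between `topK` and `threshold` can be sketched as a filter followed by a ranked cut. This is an illustration under assumptions, not the project's search code: `ScoredChunk` and `selectResults` are hypothetical, and whether the real threshold comparison is inclusive is not stated in this README.

```typescript
// A candidate chunk with its similarity to the query (hypothetical shape).
interface ScoredChunk {
  filePath: string;
  similarity: number;
}

// Keep chunks at or above the threshold, best first, and return at most topK.
// Defaults mirror the documented ones: topK = 5, threshold = 0.5.
function selectResults(
  scored: readonly ScoredChunk[],
  topK: number = 5,
  threshold: number = 0.5,
): ScoredChunk[] {
  return scored
    .filter((chunk) => chunk.similarity >= threshold) // filter() copies, so the input is not mutated
    .sort((a, b) => b.similarity - a.similarity)
    .slice(0, topK);
}
```

Raising `threshold` trades recall for precision, while `topK` only caps how many of the surviving matches are returned.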

Example:

"Find authentication logic in my-app"

3. list_projects

List all indexed projects.

Parameters: None

Example:

"Show me all indexed projects"

4. get_project_stats

Get statistics for a project.

Parameters:

{
  projectName: string;   // Project to query
}

Example:

"Show me stats for the my-app project"

📊 Architecture

┌─────────────────────────────────────────────┐
│         Cursor / Claude Desktop             │
│              (MCP Client)                   │
└─────────────────┬───────────────────────────┘
                  │ MCP Protocol (stdio)
                  │
┌─────────────────▼───────────────────────────┐
│         CodeBrain MCP Server                │
│  ┌─────────────────────────────────────┐    │
│  │  AST Code Splitter                  │    │
│  │  - JavaScript/TypeScript            │    │
│  │  - Python, Go, Rust, Java, C++      │    │
│  └──────────────┬──────────────────────┘    │
│                 │                           │
│  ┌──────────────▼──────────────────────┐    │
│  │  Gemini Embeddings                  │    │
│  │  - 768-dimensional vectors          │    │
│  │  - Semantic descriptions            │    │
│  └──────────────┬──────────────────────┘    │
│                 │                           │
│  ┌──────────────▼──────────────────────┐    │
│  │  Vector Search                      │    │
│  │  - Cosine similarity                │    │
│  │  - Threshold filtering              │    │
│  └──────────────┬──────────────────────┘    │
└─────────────────┼───────────────────────────┘
                  │
┌─────────────────▼───────────────────────────┐
│      PostgreSQL + pgvector                  │
│  ┌──────────────────────────────────────┐   │
│  │  Projects → Files → Chunks → Embeds  │   │
│  │  Normalized relational schema        │   │
│  └──────────────────────────────────────┘   │
└─────────────────────────────────────────────┘
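
The "Cosine similarity" step in the diagram compares the query embedding with each stored embedding. In CodeBrain that comparison runs inside PostgreSQL (pgvector exposes cosine distance through its `<=>` operator, and similarity is one minus that distance). The plain TypeScript below only illustrates the arithmetic; `cosineSimilarity` is a hypothetical helper, not a function from this repository.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|), ranging from -1 to 1.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0; // lengths are equal, so the fallback is never used
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  // A zero vector has no direction; treat it as unrelated to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}
```

Vectors pointing the same way score 1, orthogonal vectors score 0, and the default `threshold` of 0.5 sits between the two.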

🧪 Testing

# Run all tests
pnpm test

# Watch mode
pnpm test:watch

# Individual test suites
pnpm test:splitter     # AST code splitter
pnpm test:indexing     # Indexing workflow
pnpm test:search       # Semantic search
pnpm test:embedding    # Embedding generation

# Integration test (end-to-end)
pnpm test:integration

🌐 Graph Viewer (React)

Visualise the code graph in the browser with the React/Vite viewer.

# Start the Graph API server (serves graph JSON on http://localhost:4000)
pnpm graph:server

# In a separate terminal, install and run the viewer UI
cd apps/graph-viewer
pnpm install
pnpm dev

# Open the browser UI → http://localhost:5173

Override the API target with VITE_GRAPH_API_URL (inside apps/graph-viewer/.env) if the server runs elsewhere.
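
Vite exposes variables prefixed with `VITE_` to client code through `import.meta.env`, which is how such an override would reach the viewer. The fallback logic can be sketched as follows; `resolveGraphApiUrl` is a hypothetical helper, and only the variable name and the default port come from this README.

```typescript
// Default taken from the graph server URL documented above.
const DEFAULT_GRAPH_API_URL = "http://localhost:4000";

// Pick the override when it is set and non-blank, otherwise the default.
// Trailing slashes are removed so callers can append "/path" safely.
function resolveGraphApiUrl(env: Record<string, string | undefined>): string {
  const override = (env["VITE_GRAPH_API_URL"] ?? "").trim();
  const base = override.length > 0 ? override : DEFAULT_GRAPH_API_URL;
  return base.replace(/\/+$/, "");
}
```

Inside the viewer this would be called with `import.meta.env`. It takes a plain object here so it can also run outside Vite.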

πŸ“ Project Structure

CodeBrain/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ index.ts              # MCP server entry point
β”‚   β”œβ”€β”€ core/
β”‚   β”‚   β”œβ”€β”€ indexing.ts       # Indexing orchestration
β”‚   β”‚   β”œβ”€β”€ search.ts         # Semantic search
β”‚   β”‚   β”œβ”€β”€ splitter.ts       # AST-based code splitting
β”‚   β”‚   └── embedding/
β”‚   β”‚       β”œβ”€β”€ base-embedding.ts      # Embedding interface
β”‚   β”‚       └── gemini-embedding.ts    # Gemini implementation
β”‚   └── test/
β”‚       β”œβ”€β”€ *.test.ts         # Unit tests
β”‚       └── utils.ts          # Test utilities
β”œβ”€β”€ db/
β”‚   β”œβ”€β”€ index.ts              # Prisma client
β”‚   β”œβ”€β”€ setup.ts              # Database setup script
β”‚   └── vector-indexes.ts     # Vector index management
β”œβ”€β”€ prisma/
β”‚   └── schema.prisma         # Database schema
β”œβ”€β”€ .env                      # Environment variables
β”œβ”€β”€ mcp-config.json          # MCP configuration template
β”œβ”€β”€ CURSOR_SETUP.md          # Cursor integration guide
└── README.md                # This file

πŸ—ƒοΈ Database Schema

Project (1) ─┐
             β”œβ”€> File (N) ─┐
                           β”œβ”€> Chunk (N) ─┐
                                          β”œβ”€> Embedding (N)
  • Project: Root container (name, rootPath)
  • File: Individual source files (path, language, hash)
  • Chunk: Code segments (text, lines, AST metadata)
  • Embedding: Vector representations (768-dim, model, similarity search)
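
The one-to-many chain above can be pictured as the following TypeScript types. This is only an illustrative sketch: the authoritative definitions live in prisma/schema.prisma, and every field name here beyond those listed in the bullets (ids, foreign keys, `startLine`, `endLine`, `astNodeType`, `vector`) is an assumption.

```typescript
interface Project {
  id: string;
  name: string;
  rootPath: string;
}

interface SourceFile {
  id: string;
  projectId: string; // assumed foreign key to Project
  path: string;
  language: string;
  hash: string; // content hash used to detect unchanged files
}

interface Chunk {
  id: string;
  fileId: string; // assumed foreign key to SourceFile
  text: string;
  startLine: number;
  endLine: number;
  astNodeType?: string; // assumed form of the "AST metadata"
}

interface Embedding {
  id: string;
  chunkId: string; // assumed foreign key to Chunk
  model: string;
  vector: number[];
}

// Dimension documented for the Gemini embeddings used here.
const EMBEDDING_DIMENSIONS = 768;

// Guard against storing a vector of the wrong size.
function hasExpectedDimensions(embedding: Embedding): boolean {
  return embedding.vector.length === EMBEDDING_DIMENSIONS;
}
```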

🔧 Development

Scripts

pnpm dev              # Start with auto-reload
pnpm start            # Start server
pnpm build            # Compile TypeScript

pnpm db:setup         # Setup database + pgvector
pnpm db:migrate       # Run migrations
pnpm db:generate      # Generate Prisma client
pnpm db:studio        # Open Prisma Studio

Environment Variables

# Required
GEMINI_API_KEY=your_gemini_api_key
DATABASE_URL=postgresql://user:pass@host:port/db?schema=cbmcp

# Optional
NODE_ENV=development

πŸ› Troubleshooting

MCP Connection Issues

Problem: Server won't connect in Cursor

Solutions:

  1. Test manually: npx tsx src/index.ts (should output startup message)
  2. Check absolute path in config matches your directory
  3. Verify environment variables in MCP config
  4. Restart Cursor completely (Cmd+Q, then reopen)
  5. Check MCP output panel for error logs

Database Issues

Problem: type "vector" does not exist

Solution:

pnpm db:setup  # This installs pgvector in cbmcp schema

Problem: Connection refused

Solution:

docker ps | grep codebrain  # Verify container running
docker start codebrain      # Start if stopped

Embedding Issues

Problem: GEMINI_API_KEY is required

Solution: Add API key to .env and MCP config

Performance Issues

Problem: Indexing is slow

Solutions:

  • Embeddings are cached - subsequent runs are faster
  • Adjust batch size in indexing.ts if needed
  • Consider excluding large directories (node_modules, etc.)
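
Two of the suggestions above, skipping bulky directories and processing files in batches, can be sketched as below. The helper names, the excluded-directory list, and the batch size are hypothetical. The actual batching logic lives in indexing.ts and may look different.

```typescript
// Directories that rarely contain code worth indexing (illustrative list).
const EXCLUDED_DIRS = new Set(["node_modules", ".git", "dist"]);

// True when any segment of a forward-slash relative path is an excluded directory.
function isExcluded(relativePath: string): boolean {
  return relativePath.split("/").some((segment) => EXCLUDED_DIRS.has(segment));
}

// Split items into consecutive batches of at most batchSize elements.
function toBatches<T>(items: readonly T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}

const files = ["src/index.ts", "node_modules/react/index.js", "src/core/search.ts"];
const batches = toBatches(files.filter((f) => !isExcluded(f)), 2);
console.log(batches);
```

Larger batches mean fewer round trips to the embedding API but more work lost if a single request fails.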

📈 Performance

  • Indexing: ~2-5 seconds per file (first time, includes embedding generation)
  • Re-indexing: ~100ms per file (if unchanged, uses hash comparison)
  • Search: ~500ms per query (includes embedding + vector search)
  • Storage: ~10KB per code chunk (text + embedding + metadata)
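
The gap between first-time indexing and re-indexing comes from the hash comparison: a file whose content hash matches the stored one can skip embedding generation. A minimal sketch of that decision follows, assuming SHA-256 over the file text. The README does not say which hash CodeBrain uses, and `contentHash` and `needsReindex` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hex digest of the file contents (SHA-256 is an assumption, see above).
function contentHash(source: string): string {
  return createHash("sha256").update(source, "utf8").digest("hex");
}

// Decide whether a file must be chunked and embedded again.
// `force` mirrors the index_codebase parameter of the same name.
function needsReindex(
  source: string,
  storedHash: string | undefined,
  force: boolean = false,
): boolean {
  if (force || storedHash === undefined) return true; // forced run or never indexed
  return contentHash(source) !== storedHash;
}
```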

πŸ” Security

  • API keys stored in environment variables (not in code)
  • Database credentials configurable
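  • The MCP server runs locally; the only external service it calls is the Gemini API
  • Text submitted for embedding is sent to the Gemini API; the resulting vectors are stored only in your local PostgreSQL database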

πŸ“ License

MIT

🤝 Contributing

This is a personal project, but feel free to fork and adapt for your needs!

🎓 Learn More

✅ Status

  • ✅ Database setup and migrations
  • ✅ AST-based code splitting
  • ✅ Gemini embedding integration
  • ✅ Vector similarity search
  • ✅ MCP server implementation
  • ✅ Comprehensive test suite (28 tests)
  • ✅ Multi-project support
  • ✅ Cursor/Claude integration ready

Ready for production use! 🚀
