Semantic code search powered by AI embeddings and vector similarity.
Integrate intelligent code search directly into Claude Desktop and Cursor through the Model Context Protocol (MCP).
CodeBrain indexes your codebase using AST-based splitting and AI embeddings, enabling:
- Semantic search - Find code by meaning, not just keywords
- Smart chunking - AST-aware code splitting (respects functions, classes, etc.)
- Fast retrieval - Vector similarity search with pgvector
- Multi-project - Index and search across multiple codebases
# Docker running (for PostgreSQL + pgvector)
docker ps | grep codebrain
# Node.js 20+
node --version
# Dependencies installed
cd /Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain
pnpm install

# Start PostgreSQL with pgvector (if not running)
docker run -d \
--name codebrain \
-p 5484:5432 \
-e POSTGRES_PASSWORD=postgres \
-e POSTGRES_DB=codebrain \
pgvector/pgvector:pg15
# Setup database schema
pnpm db:setup
pnpm db:migrate

Edit .env:
GEMINI_API_KEY=your_api_key_here
DATABASE_URL=postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp

# Run tests
pnpm test
# Should show: 28 tests passed
# Test MCP server starts
npx tsx src/index.ts
# Should output: CodeBrain MCP Server started (stdio mode)
# Press Ctrl+C to stop

- Open Cursor Settings → MCP Servers
- Add server named codebrain
- Copy this config:
{
  "command": "npx",
  "args": [
    "-y",
    "tsx",
    "/Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain/src/index.ts"
  ],
  "env": {
    "GEMINI_API_KEY": "your_key_here",
    "DATABASE_URL": "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp"
  }
}

- Restart Cursor
- Verify - Check MCP panel shows "codebrain" connected

Detailed guide: See CURSOR_SETUP.md
Edit ~/Library/Application Support/Claude/claude_desktop_config.json:
{
  "mcpServers": {
    "codebrain": {
      "command": "npx",
      "args": [
        "-y",
        "tsx",
        "/Users/conorandrle/Documents/Coding/CodeBrainMCP/CodeBrain/src/index.ts"
      ],
      "env": {
        "GEMINI_API_KEY": "your_key_here",
        "DATABASE_URL": "postgresql://postgres:postgres@localhost:5484/codebrain?schema=cbmcp"
      }
    }
  }
}

Restart Claude Desktop.
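The Cursor and Claude Desktop configurations differ only in the wrapper object, so both can be produced from one helper rather than edited by hand. The following is a minimal sketch under stated assumptions: `buildServerEntry` and `buildClaudeDesktopConfig` are hypothetical names that are not part of CodeBrain, and the absolute-path check assumes POSIX-style paths like the ones used in this README.

```typescript
// Hypothetical helpers (not part of CodeBrain) that build the MCP config objects.
interface McpServerEntry {
  command: string;
  args: string[];
  env: Record<string, string>;
}

// Builds the entry used directly by Cursor.
function buildServerEntry(
  entryPath: string,
  geminiApiKey: string,
  databaseUrl: string
): McpServerEntry {
  // Assumption: POSIX paths only. The README's configs use an absolute path,
  // so relative paths are rejected here.
  if (!entryPath.startsWith("/")) {
    throw new Error(`entryPath must be absolute, got: ${entryPath}`);
  }
  return {
    command: "npx",
    args: ["-y", "tsx", entryPath],
    env: { GEMINI_API_KEY: geminiApiKey, DATABASE_URL: databaseUrl },
  };
}

// Wraps the same entry in the "mcpServers" object that Claude Desktop expects.
function buildClaudeDesktopConfig(
  entry: McpServerEntry
): { mcpServers: Record<string, McpServerEntry> } {
  return { mcpServers: { codebrain: entry } };
}
```

Passing the result to `JSON.stringify(config, null, 2)` yields text that can be pasted into the relevant settings file.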
Index a codebase for semantic search.
Parameters:
{
  projectName: string;  // Unique project identifier
  rootPath: string;     // Absolute path to code
  force?: boolean;      // Re-index existing files
}

Example:
"Index my React project at /Users/me/projects/my-app with name 'my-app'"
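The parameter shape for indexing can be checked before any work starts. This is a minimal sketch, assuming POSIX-style absolute paths; `IndexParams` and `validateIndexParams` are hypothetical names, and the real server may validate its input differently.

```typescript
// Hypothetical validator mirroring the documented indexing parameters.
interface IndexParams {
  projectName: string;
  rootPath: string;
  force?: boolean;
}

function validateIndexParams(input: unknown): IndexParams {
  if (typeof input !== "object" || input === null) {
    throw new Error("params must be an object");
  }
  const { projectName, rootPath, force } = input as Record<string, unknown>;

  if (typeof projectName !== "string" || projectName.trim() === "") {
    throw new Error("projectName must be a non-empty string");
  }
  // Assumption: POSIX paths, so "absolute" means a leading slash.
  if (typeof rootPath !== "string" || !rootPath.startsWith("/")) {
    throw new Error("rootPath must be an absolute path");
  }
  if (force !== undefined && typeof force !== "boolean") {
    throw new Error("force must be a boolean when provided");
  }

  const params: IndexParams = { projectName, rootPath };
  if (typeof force === "boolean") {
    params.force = force;
  }
  return params;
}
```

For the example request above, the validated input would be `{ projectName: "my-app", rootPath: "/Users/me/projects/my-app" }`.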
Search code semantically across indexed projects.
Parameters:
{
  query: string;         // What to search for
  projectName?: string;  // Filter by project
  topK?: number;         // Number of results (default: 5)
  threshold?: number;    // Similarity threshold (default: 0.5)
}

Example:
"Find authentication logic in my-app"
List all indexed projects.
Parameters: None
Example:
"Show me all indexed projects"
Get statistics for a project.
Parameters:
{
  projectName: string;  // Project to query
}

Example:
"Show me stats for the my-app project"
┌─────────────────────────────────────────────┐
│           Cursor / Claude Desktop           │
│                (MCP Client)                 │
└──────────────────────┬──────────────────────┘
                       │ MCP Protocol (stdio)
                       │
┌──────────────────────▼──────────────────────┐
│            CodeBrain MCP Server             │
│  ┌───────────────────────────────────────┐  │
│  │  AST Code Splitter                    │  │
│  │  - JavaScript/TypeScript              │  │
│  │  - Python, Go, Rust, Java, C++        │  │
│  └───────────────────┬───────────────────┘  │
│                      │                      │
│  ┌───────────────────▼───────────────────┐  │
│  │  Gemini Embeddings                    │  │
│  │  - 768-dimensional vectors            │  │
│  │  - Semantic descriptions              │  │
│  └───────────────────┬───────────────────┘  │
│                      │                      │
│  ┌───────────────────▼───────────────────┐  │
│  │  Vector Search                        │  │
│  │  - Cosine similarity                  │  │
│  │  - Threshold filtering                │  │
│  └───────────────────┬───────────────────┘  │
└──────────────────────┼──────────────────────┘
                       │
┌──────────────────────▼──────────────────────┐
│            PostgreSQL + pgvector            │
│  ┌───────────────────────────────────────┐  │
│  │  Projects → Files → Chunks → Embeds   │  │
│  │  Normalized relational schema         │  │
│  └───────────────────────────────────────┘  │
└─────────────────────────────────────────────┘
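The Vector Search stage in the diagram (cosine similarity followed by threshold filtering) can be illustrated in memory. This is a sketch of the technique, not CodeBrain's implementation: retrieval in CodeBrain runs inside PostgreSQL through pgvector, and the names `Candidate`, `cosineSimilarity`, and `rankChunks` are hypothetical. The default arguments mirror the documented `topK` (5) and `threshold` (0.5).

```typescript
// In-memory illustration of cosine-similarity ranking with a threshold.
interface Candidate {
  id: string;
  vector: number[];
}

interface ScoredChunk {
  id: string;
  score: number;
}

// Cosine similarity: dot(a, b) / (|a| * |b|), in the range [-1, 1].
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Scores every candidate, drops those below the threshold,
// and returns the topK best matches in descending score order.
function rankChunks(
  query: readonly number[],
  candidates: readonly Candidate[],
  topK = 5,
  threshold = 0.5
): ScoredChunk[] {
  return candidates
    .map((c) => ({ id: c.id, score: cosineSimilarity(query, c.vector) }))
    .filter((r) => r.score >= threshold)
    .sort((x, y) => y.score - x.score)
    .slice(0, topK);
}
```

pgvector exposes cosine distance rather than similarity (its `<=>` operator), so a similarity threshold of 0.5 corresponds to a distance of at most 0.5 when the comparison is done in SQL.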
# Run all tests
pnpm test
# Watch mode
pnpm test:watch
# Individual test suites
pnpm test:splitter # AST code splitter
pnpm test:indexing # Indexing workflow
pnpm test:search # Semantic search
pnpm test:embedding # Embedding generation
# Integration test (end-to-end)
pnpm test:integration

Visualise the code graph in the browser with the React/Vite viewer.
# Start the Graph API server (serves graph JSON on http://localhost:4000)
pnpm graph:server
# In a separate terminal, install and run the viewer UI
cd apps/graph-viewer
pnpm install
pnpm dev
# Open the browser UI → http://localhost:5173

Override the API target with VITE_GRAPH_API_URL (inside apps/graph-viewer/.env) if the server runs elsewhere.
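The override can be pictured as a lookup with a fallback to the default Graph API address. The sketch below is an assumption about how a viewer might resolve the URL, not the viewer's actual code; `resolveGraphApiUrl` is a hypothetical name. In a Vite app, variables prefixed with `VITE_` are exposed on `import.meta.env`, so that object could be passed in.

```typescript
// Hypothetical resolver for the Graph API base URL.
const DEFAULT_GRAPH_API_URL = "http://localhost:4000";

function resolveGraphApiUrl(env: Record<string, string | undefined>): string {
  const raw = env["VITE_GRAPH_API_URL"]?.trim();
  // Fall back to the default when the variable is unset or blank.
  const url = raw !== undefined && raw.length > 0 ? raw : DEFAULT_GRAPH_API_URL;
  // Strip trailing slashes so callers can append paths such as "/graph".
  return url.replace(/\/+$/, "");
}
```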
CodeBrain/
├── src/
│   ├── index.ts                     # MCP server entry point
│   ├── core/
│   │   ├── indexing.ts              # Indexing orchestration
│   │   ├── search.ts                # Semantic search
│   │   ├── splitter.ts              # AST-based code splitting
│   │   └── embedding/
│   │       ├── base-embedding.ts    # Embedding interface
│   │       └── gemini-embedding.ts  # Gemini implementation
│   └── test/
│       ├── *.test.ts                # Unit tests
│       └── utils.ts                 # Test utilities
├── db/
│   ├── index.ts                     # Prisma client
│   ├── setup.ts                     # Database setup script
│   └── vector-indexes.ts            # Vector index management
├── prisma/
│   └── schema.prisma                # Database schema
├── .env                             # Environment variables
├── mcp-config.json                  # MCP configuration template
├── CURSOR_SETUP.md                  # Cursor integration guide
└── README.md                        # This file
Project (1) ─┐
             └─> File (N) ─┐
                           └─> Chunk (N) ─┐
                                          └─> Embedding (N)

- Project: Root container (name, rootPath)
- File: Individual source files (path, language, hash)
- Chunk: Code segments (text, lines, AST metadata)
- Embedding: Vector representations (768-dim, model, similarity search)
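The hierarchy above maps naturally onto a set of TypeScript types. This is a sketch only: the authoritative definitions live in prisma/schema.prisma, and every field name beyond those listed above (the `id` and foreign-key fields, `startLine`, `endLine`, `astNodeType`) is an assumption made for illustration.

```typescript
// Illustrative types for the Project -> File -> Chunk -> Embedding hierarchy.
const EMBEDDING_DIMENSIONS = 768;

interface Project {
  id: string;
  name: string;
  rootPath: string;
}

interface SourceFile {
  id: string;
  projectId: string; // assumed foreign key to Project
  path: string;
  language: string;
  hash: string; // content hash, used to skip unchanged files
}

interface Chunk {
  id: string;
  fileId: string; // assumed foreign key to SourceFile
  text: string;
  startLine: number;
  endLine: number;
  astNodeType: string; // assumed AST metadata, e.g. "function" or "class"
}

interface Embedding {
  id: string;
  chunkId: string; // assumed foreign key to Chunk
  model: string;
  vector: number[];
}

// Rejects vectors whose length does not match the documented 768 dimensions.
function makeEmbedding(
  id: string,
  chunkId: string,
  model: string,
  vector: number[]
): Embedding {
  if (vector.length !== EMBEDDING_DIMENSIONS) {
    throw new Error(
      `expected ${EMBEDDING_DIMENSIONS} dimensions, got ${vector.length}`
    );
  }
  return { id, chunkId, model, vector };
}
```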
pnpm dev # Start with auto-reload
pnpm start # Start server
pnpm build # Compile TypeScript
pnpm db:setup # Setup database + pgvector
pnpm db:migrate # Run migrations
pnpm db:generate # Generate Prisma client
pnpm db:studio # Open Prisma Studio

# Required
GEMINI_API_KEY=your_gemini_api_key
DATABASE_URL=postgresql://user:pass@host:port/db?schema=cbmcp
# Optional
NODE_ENV=development

Problem: Server won't connect in Cursor
Solutions:
- Test manually: npx tsx src/index.ts (should output startup message)
- Check absolute path in config matches your directory
- Verify environment variables in MCP config
- Restart Cursor completely (Cmd+Q, then reopen)
- Check MCP output panel for error logs
Problem: type "vector" does not exist
Solution:
pnpm db:setup # This installs pgvector in cbmcp schema

Problem: Connection refused
Solution:
docker ps | grep codebrain # Verify container running
docker start codebrain # Start if stopped

Problem: GEMINI_API_KEY is required
Solution: Add API key to .env and MCP config
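A startup check makes this class of error easier to diagnose, because both required variables are validated in one place. The sketch below is an assumption about how such a check could look, not CodeBrain's actual startup code; `CodeBrainEnv` and `loadEnv` are hypothetical names.

```typescript
// Hypothetical startup validation for the required environment variables.
interface CodeBrainEnv {
  geminiApiKey: string;
  databaseUrl: string;
  schema: string | null; // value of the ?schema= query parameter, if any
}

function loadEnv(env: Record<string, string | undefined>): CodeBrainEnv {
  const geminiApiKey = env["GEMINI_API_KEY"]?.trim();
  if (!geminiApiKey) {
    throw new Error("GEMINI_API_KEY is required");
  }
  const databaseUrl = env["DATABASE_URL"]?.trim();
  if (!databaseUrl) {
    throw new Error("DATABASE_URL is required");
  }

  // The WHATWG URL parser accepts non-HTTP schemes such as postgresql://.
  const parsed = new URL(databaseUrl);
  if (parsed.protocol !== "postgresql:" && parsed.protocol !== "postgres:") {
    throw new Error(`unexpected DATABASE_URL scheme: ${parsed.protocol}`);
  }
  return {
    geminiApiKey,
    databaseUrl,
    schema: parsed.searchParams.get("schema"),
  };
}
```

In a Node.js process this would be called as `loadEnv(process.env)`. When the server is launched by Cursor or Claude Desktop, the variables come from the `env` block of the MCP config rather than from .env, which is why the key needs to be present in both places.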
Problem: Indexing is slow
Solutions:
- Embeddings are cached - subsequent runs are faster
- Adjust batch size in indexing.ts if needed
- Consider excluding large directories (node_modules, etc.)
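The last two suggestions, batching and directory exclusion, can be sketched as two small helpers. These are illustrations only: `isExcluded`, `toBatches`, and the default exclusion list are hypothetical and do not describe what indexing.ts actually does.

```typescript
// Hypothetical helpers for skipping large directories and batching work.
const DEFAULT_EXCLUDED_DIRS: readonly string[] = [
  "node_modules",
  ".git",
  "dist",
  "build",
];

// True when any path segment exactly matches an excluded directory name.
function isExcluded(
  relativePath: string,
  excluded: readonly string[] = DEFAULT_EXCLUDED_DIRS
): boolean {
  return relativePath.split("/").some((segment) => excluded.includes(segment));
}

// Splits items into consecutive batches of at most batchSize elements.
function toBatches<T>(items: readonly T[], batchSize: number): T[][] {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new Error("batchSize must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += batchSize) {
    batches.push(items.slice(i, i + batchSize));
  }
  return batches;
}
```

Filtering paths before batching keeps excluded files from consuming embedding requests at all, which is where most of the first-run indexing time goes.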
- Indexing: ~2-5 seconds per file (first time, includes embedding generation)
- Re-indexing: ~100ms per file (if unchanged, uses hash comparison)
- Search: ~500ms per query (includes embedding + vector search)
- Storage: ~10KB per code chunk (text + embedding + metadata)
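The gap between first-time indexing and re-indexing comes from the hash comparison: a file whose content hash matches the stored one needs no new embeddings. A minimal sketch of that check follows, assuming SHA-256 as the hash (the README does not say which algorithm is used); `contentHash` and `needsReindex` are hypothetical names.

```typescript
import { createHash } from "node:crypto";

// Hex-encoded SHA-256 of the file content (the algorithm is an assumption).
function contentHash(text: string): string {
  return createHash("sha256").update(text, "utf8").digest("hex");
}

// A file is re-indexed when forced, when it has never been indexed,
// or when its current content no longer matches the stored hash.
function needsReindex(
  currentText: string,
  storedHash: string | undefined,
  force = false
): boolean {
  if (force) return true;
  if (storedHash === undefined) return true;
  return storedHash !== contentHash(currentText);
}
```

The `force` argument mirrors the `force` parameter of the indexing tool, which re-indexes existing files regardless of their hash.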
- API keys stored in environment variables (not in code)
- Database credentials configurable
- MCP runs locally (no external API calls except Gemini)
- Vector embeddings are stored in your local PostgreSQL database (content sent to Gemini for embedding does leave your machine)
MIT
This is a personal project, but feel free to fork and adapt for your needs!
- ✅ Database setup and migrations
- ✅ AST-based code splitting
- ✅ Gemini embedding integration
- ✅ Vector similarity search
- ✅ MCP server implementation
- ✅ Comprehensive test suite (28 tests)
- ✅ Multi-project support
- ✅ Cursor/Claude integration ready

Ready for production use!