AI-powered semantic search for your codebase in GitHub Copilot
A Model Context Protocol (MCP) server that enables GitHub Copilot to search and understand your codebase using Google's Gemini embeddings and Qdrant vector storage.
- 🔍 Semantic Search: Find code by meaning, not just keywords
- 🎯 Smart Chunking: Automatically splits code into logical functions/classes
- 🔄 Incremental Indexing: Only re-indexes changed files, saving 90%+ of indexing time
- 💾 Auto-save Checkpoints: Saves progress every 10 files, resume anytime
- 📊 Real-time Progress: Track indexing status with ETA and performance metrics
- ⚡ Parallel Processing: 25x faster indexing with batch parallel execution
- 👀 Real-time Watch: Monitors file changes and updates the index automatically
- 🌐 Multi-language: Supports 15+ programming languages
- ☁️ Vector Storage: Uses Qdrant for persistent vector storage
- 📦 Simple Setup: Just 4 environment variables to get started
- Gemini API Key: Get free at Google AI Studio
- Qdrant Cloud Account: Sign up free at cloud.qdrant.io
Step 1: Open MCP Configuration
- Open GitHub Copilot Chat (click the Copilot icon in the sidebar or press Ctrl+Alt+I / Cmd+Alt+I)
- Click the Settings icon (gear at the top-right of the chat panel)
- Select MCP Servers
- Click MCP Configuration (JSON) button
This will open ~/Library/Application Support/Code/User/mcp.json (macOS) or equivalent on your OS.
Step 2: Add Configuration
Add this to your `mcp.json`:

```json
{
  "servers": {
    "codebase": {
      "command": "npx",
      "args": ["-y", "@ngotaico/mcp-codebase-index"],
      "env": {
        "REPO_PATH": "/absolute/path/to/your/project",
        "GEMINI_API_KEY": "AIzaSyC...",
        "QDRANT_URL": "https://your-cluster.gcp.cloud.qdrant.io:6333",
        "QDRANT_API_KEY": "eyJhbGci..."
      },
      "type": "stdio"
    }
  }
}
```

Note: If you already have other servers in `mcp.json`, just add the `"codebase"` entry inside the existing `"servers"` object.
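For example, if your `mcp.json` already defines another server (the `other-server` entry below is a hypothetical placeholder for whatever you already have), the merged file would look like this:

```json
{
  "servers": {
    "other-server": {
      "command": "npx",
      "args": ["-y", "some-other-mcp-server"],
      "type": "stdio"
    },
    "codebase": {
      "command": "npx",
      "args": ["-y", "@ngotaico/mcp-codebase-index"],
      "env": {
        "REPO_PATH": "/absolute/path/to/your/project",
        "GEMINI_API_KEY": "AIzaSyC...",
        "QDRANT_URL": "https://your-cluster.gcp.cloud.qdrant.io:6333",
        "QDRANT_API_KEY": "eyJhbGci..."
      },
      "type": "stdio"
    }
  }
}
```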
All 4 variables are required:
| Variable | Where to Get | Example |
|---|---|---|
| `REPO_PATH` | Absolute path to your project | `/Users/you/Projects/myapp` |
| `GEMINI_API_KEY` | Google AI Studio | `AIzaSyC...` |
| `QDRANT_URL` | Qdrant Cloud cluster URL | `https://xxx.gcp.cloud.qdrant.io:6333` |
| `QDRANT_API_KEY` | Qdrant Cloud API key | `eyJhbGci...` |
You can customize the embedding model and output dimension:
```json
{
  "env": {
    "REPO_PATH": "/Users/you/Projects/myapp",
    "GEMINI_API_KEY": "AIzaSyC...",
    "QDRANT_URL": "https://xxx.gcp.cloud.qdrant.io:6333",
    "QDRANT_API_KEY": "eyJhbGci...",
    "EMBEDDING_MODEL": "text-embedding-004",
    "EMBEDDING_DIMENSION": "768"
  }
}
```

Supported embedding models:
- `text-embedding-004` (✅ RECOMMENDED, default) - best for all users, especially free tier
  - Dimension: 768 (fixed)
  - Excellent for code search and documentation
  - Works reliably with the free tier Gemini API
  - Optimized performance and accuracy
- `gemini-embedding-001` (⚠️ NOT RECOMMENDED for free tier)
  - Flexible dimensions: 768-3072
  - ❌ May not work with free tier accounts due to quota/rate limits
  - Only use if you have paid Gemini API access
Environment Variables:

- `EMBEDDING_MODEL`: embedding model to use (default: `text-embedding-004`)
- `EMBEDDING_DIMENSION`: output dimension size (optional, auto-detected from the model)
  - `text-embedding-004`: 768 (fixed)
  - `gemini-embedding-001`: 768-3072 (configurable, but not recommended for free tier)
💡 Recommendation:

- All users (especially free tier): use `text-embedding-004` with 768 dimensions (default)
- Paid API users only: consider `gemini-embedding-001` for multilingual projects
- Large codebases (>10k files): stick with 768 dimensions to save storage
⚡ Performance & Rate Limiting:

Optimized for `text-embedding-004` (1,500 RPM):
- ✅ Parallel batch processing: 25 chunks/second
- ✅ Maximum API utilization: 1,500 requests/minute
- ✅ Automatic retry with exponential backoff
- ✅ No daily quota limits (unlimited indexing)
⏱️ Indexing Speed:
- ~25 files/minute (2-2.5 seconds per file average)
- Small project (50-100 files): 2-4 minutes
- Medium project (200-400 files): 8-16 minutes
- Large project (500+ files): 20-25 minutes
- Speed varies based on file size, complexity, and API latency
Incremental Indexing:
- ✅ First run: indexes the entire codebase (~20 mins for 500 files)
- ✅ Subsequent runs: only changed files (90%+ time savings)
- ✅ Auto-save checkpoint: every 10 files (safe to interrupt)
- ✅ Resume on restart: continues from the last checkpoint
- Automatic queue management for large codebases
- Persistent state tracking with MD5 hashing
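The change-detection idea described above can be sketched in a few lines. This is a simplified illustration, not the server's actual implementation: it hashes file contents with MD5, compares against a stored metadata file (named `index-metadata.json` here to mirror the doc), and reports which files need re-indexing. Scanning only `.py` files is an assumption for brevity.

```python
import hashlib
import json
from pathlib import Path

def file_md5(path: Path) -> str:
    """Hash file contents so unchanged files can be skipped."""
    return hashlib.md5(path.read_bytes()).hexdigest()

def changed_files(repo: Path, metadata_file: Path) -> list[Path]:
    """Return files whose MD5 differs from the stored metadata."""
    old = json.loads(metadata_file.read_text()) if metadata_file.exists() else {}
    return [f for f in sorted(repo.rglob("*.py")) if old.get(str(f)) != file_md5(f)]

def save_metadata(repo: Path, metadata_file: Path) -> None:
    """Persist current hashes for the next incremental run."""
    hashes = {str(f): file_md5(f) for f in sorted(repo.rglob("*.py"))}
    metadata_file.write_text(json.dumps(hashes, indent=2))
```

On a first run every file is "changed"; after `save_metadata`, only files whose contents actually differ are returned, which is what makes subsequent runs cheap.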
Real-time Status Tracking:
- Progress percentage and ETA
- Performance metrics (files/sec, avg time)
- Error tracking with timestamps
- Queue visibility for pending files
- Checkpoint progress indicators
The server will automatically:
- Connect to your Qdrant Cloud cluster
- Create a collection (if needed)
- Index your entire codebase
- Watch for file changes
Ask GitHub Copilot to search your codebase:
"Find the authentication logic"
"Show me how database connections are handled"
"Where is error logging implemented?"
"Find all API endpoint definitions"
Use the indexing_status tool to monitor progress:
"Check indexing status"
"Show me detailed indexing progress"
Status includes:
- Progress percentage and current file
- ETA (estimated time remaining)
- Performance metrics (speed, avg time)
- Quota usage and rate limits
- Recent errors with timestamps
- Files queued for next run
Minimal configuration (required variables only):

```json
{
  "env": {
    "REPO_PATH": "/Users/you/Projects/myapp",
    "GEMINI_API_KEY": "AIzaSyC...",
    "QDRANT_URL": "https://xxx.gcp.cloud.qdrant.io:6333",
    "QDRANT_API_KEY": "eyJhbGci..."
  }
}
```

Optional settings:

```json
{
  "env": {
    "QDRANT_COLLECTION": "my_project",
    "WATCH_MODE": "true",
    "BATCH_SIZE": "50",
    "EMBEDDING_MODEL": "text-embedding-004"
  }
}
```

| Variable | Default | Description |
|---|---|---|
| `QDRANT_COLLECTION` | `codebase` | Collection name in Qdrant |
| `WATCH_MODE` | `true` | Auto-update on file changes |
| `BATCH_SIZE` | `50` | Embedding batch size |
| `EMBEDDING_MODEL` | `text-embedding-004` | Gemini embedding model (`text-embedding-004` recommended, `gemini-embedding-001` not recommended for free tier) |
- SETUP.md - Detailed setup walkthrough
- QDRANT_CLOUD_SETUP.md - Get Qdrant credentials
- QUICK_REF.md - Quick reference card
Python • TypeScript • JavaScript • Dart • Go • Rust • Java • Kotlin • Swift • Ruby • PHP • C • C++ • C# • Shell • SQL • HTML • CSS
```
┌─────────────┐
│  Your Code  │
└──────┬──────┘
       │
       ▼
┌─────────────────┐
│  File Watcher   │  Monitors changes (MD5 hashing)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Code Parser    │  Splits into chunks (functions/classes)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Gemini API     │  Creates embeddings (768-dim vectors)
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Qdrant Cloud   │  Stores vectors + metadata
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Checkpoint     │  Auto-saves every 10 files
└──────┬──────────┘
       │
       ▼
┌─────────────────┐
│  Copilot Chat   │  Semantic search queries
└─────────────────┘
```
Smart Change Detection:
- Tracks file hashes (MD5) to detect changes
- Only indexes new/modified files on subsequent runs
- Automatically deletes vectors for removed files
Auto-save Checkpoints:
- Saves progress every 10 files during indexing
- Safe to stop VS Code anytime (Ctrl+C, close window)
- Resumes from last checkpoint on restart
- Memory stored in `{repo}/memory/`:
  - `incremental_state.json` - indexed files list, quota tracking
  - `index-metadata.json` - MD5 hashes for change detection
Sync Recovery:
- Auto-detects if Qdrant collection was deleted
- Clears stale memory and re-indexes from scratch
- Validates checkpoint integrity on startup
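The checkpoint-and-resume behavior can be illustrated with a small sketch (again simplified, not the server's actual code; the state file is named `incremental_state.json` to mirror the doc, and `index_one` stands in for the real embed-and-store step):

```python
import json
from pathlib import Path

CHECKPOINT_EVERY = 10  # mirrors the "saves every 10 files" behavior described above

def index_with_checkpoints(files: list[str], state_file: Path, index_one) -> int:
    """Index files, writing a checkpoint every CHECKPOINT_EVERY files so an
    interrupted run can resume from the last saved state."""
    done = set(json.loads(state_file.read_text())["indexed"]) if state_file.exists() else set()
    since_save = 0
    for f in files:
        if f in done:
            continue  # already indexed in a previous run; skip it
        index_one(f)
        done.add(f)
        since_save += 1
        if since_save >= CHECKPOINT_EVERY:
            state_file.write_text(json.dumps({"indexed": sorted(done)}))
            since_save = 0
    state_file.write_text(json.dumps({"indexed": sorted(done)}))  # final save
    return len(done)
```

If the process dies mid-run, at most the files since the last checkpoint are re-indexed on restart; everything before the checkpoint is skipped.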
Check server status:
- Open Copilot Chat
- Click Settings (gear icon) → MCP Servers
- Find your `codebase` server
- Click More (...) → Show Output
- Check the logs for errors
Common issues:
- ✅ `REPO_PATH` must be an absolute path
- ✅ All 4 env variables must be set
- ✅ Qdrant URL must include the `:6333` port
- ✅ Gemini API key must be valid
Test connection:
```shell
curl -H "api-key: YOUR_KEY" \
  https://YOUR_CLUSTER.gcp.cloud.qdrant.io:6333/collections
```

This should return JSON with the collections list.
- Large repos (1000+ files) take 5-10 minutes initially
- Reduce `BATCH_SIZE` if hitting rate limits
- Check Gemini API quota: aistudio.google.com
If you see errors like "quota exceeded" or "model not available":
- ⚠️ `gemini-embedding-001` often doesn't work with free tier accounts
- ✅ Solution: switch to `text-embedding-004` (recommended for all users)
- Update your config: `"EMBEDDING_MODEL": "text-embedding-004"`
- Reload VS Code and re-index
Indexing Speed (text-embedding-004):
- Parallel processing: 25 chunks/second = 1,500 chunks/minute
- Sequential fallback: 1 chunk/second (for gemini-embedding-001)
- First-time indexing: ~3-7 minutes for 5,000 chunks
- Incremental updates: Only changed files (typically <1 minute)
Real-world Examples:
- Small project (1,000 chunks): ~40 seconds
- Medium project (5,000 chunks): ~3.3 minutes
- Large project (10,000 chunks): ~6.7 minutes
Search Performance:
- Search latency: <100ms (Qdrant Cloud)
- Storage: ~3.5KB per code chunk (768-dim vectors)
- Recommended: <10K chunks per collection
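The numbers above can be sanity-checked with some quick arithmetic. A 768-dimension float32 vector is 768 × 4 = 3,072 bytes; the ~3.5 KB/chunk figure implies roughly 500 bytes of payload metadata on top (an assumption here, not a documented value), and throughput at 25 chunks/second reproduces the project timings listed earlier:

```python
DIM = 768
BYTES_PER_FLOAT32 = 4
vector_bytes = DIM * BYTES_PER_FLOAT32       # 3,072 bytes of raw vector data
chunk_bytes = vector_bytes + 500             # + ~500 B metadata (assumed) ≈ 3.5 KB
collection_mb = 10_000 * chunk_bytes / 1e6   # ~36 MB for the recommended 10K chunks

CHUNKS_PER_SEC = 25
def minutes(chunks: int) -> float:
    """Indexing time at the parallel rate of 25 chunks/second."""
    return chunks / CHUNKS_PER_SEC / 60
# minutes(1_000) ≈ 0.67 (≈40 s), minutes(5_000) ≈ 3.3, minutes(10_000) ≈ 6.7
```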
Quota Savings with Incremental Indexing:
- Initial index: Uses daily quota
- Daily updates: Only 20-40 chunks (changed files)
- Savings: 90%+ reduction in API calls
MIT © NgoTaiCo
Issues and PRs welcome at github.com/NgoTaiCo/mcp-codebase-index