Self-hosted MCP server that adds semantic vector search over your local codebases to any MCP-capable client (GitHub Copilot, Cline, Claude Desktop, etc.).
Goal: Robust RAG for Copilot (or any MCP client) without paying for Cursor/Windsurf.
Zero cost. Zero limits. Full control.
- GitHub Copilot Pro has an excellent model but limited codebase RAG
- Cursor/Windsurf have good RAG but cost $15–20/month
- Continue.dev has RAG but doesn't integrate natively with MCP-aware agents
This MCP server provides:
- Indexing of local codebases using vector embeddings
- Semantic search via the `search_codebase` tool, with keyword reranking (hybrid search)
- Multi-project support with isolated ChromaDB collections
- Universal integration with any MCP client
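The hybrid-search idea (vector similarity blended with keyword reranking) can be sketched in a few lines. This is illustrative only: the `keyword_score`/`rerank` helpers and the 0.7/0.3 weighting are assumptions, not the server's actual code.

```python
def keyword_score(query: str, text: str) -> float:
    """Fraction of query terms that appear in the chunk text."""
    terms = query.lower().split()
    hits = sum(1 for t in terms if t in text.lower())
    return hits / len(terms) if terms else 0.0

def rerank(query: str, results: list[dict], alpha: float = 0.7) -> list[dict]:
    """Blend the vector score with a keyword-overlap score and re-sort."""
    for r in results:
        r["score"] = alpha * r["vector_score"] + (1 - alpha) * keyword_score(query, r["content"])
    return sorted(results, key=lambda r: r["score"], reverse=True)

# Toy candidates: db.py wins on vector score alone, auth.py wins after reranking
results = [
    {"path": "src/auth.py", "content": "def authenticate_user(user, password): ...", "vector_score": 0.80},
    {"path": "src/db.py", "content": "def connect(): ...", "vector_score": 0.82},
]
ranked = rerank("authenticate user password", results)
```

The keyword pass is what keeps exact identifier matches near the top even when the embedding similarity is only middling.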
| Component | Technology | Why |
|---|---|---|
| Embeddings | sentence-transformers (all-MiniLM-L6-v2) | Fast, lightweight, 384-dim |
| Code embeddings | microsoft/unixcoder-base (optional) | Code-specific model, activated via EMBEDDING_MODEL |
| Vector DB | ChromaDB | Simple, persistent, zero config |
| Code parsing | Python AST + character chunking | Works across all supported languages |
| MCP SDK | modelcontextprotocol/python-sdk | Official standard |
| Runtime | Python 3.11+ | — |
- Python 3.11+
- pip or uv
```bash
git clone https://github.com/di5rupt0r/codebase-rag.git
cd codebase-rag

# Install as a package (adds the `codebase-rag` command to ~/.local/bin)
pip install -e .
```

Verify the installation:

```bash
python scripts/health_check.py
```

Expected output:
```text
🔍 MCP Codebase RAG Server Health Check
==================================================
Checking Embedding Provider... ✓ OK (3.03s)
Checking ChromaDB Connection... ✓ OK (0.14s)
Checking Search Functionality... ✓ OK (2.36s)
Checking Data Directory... ✓ OK (0.00s)
==================================================
Health Check Summary: 4/4 checks passed
🎉 All systems operational!
```
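The checks above follow a simple timed-probe pattern. A minimal sketch of that pattern (the check names and `run_check` helper here are illustrative stand-ins, not the script's actual code):

```python
import time
from typing import Callable

def run_check(name: str, check: Callable[[], object]) -> tuple[bool, float]:
    """Run one probe, timing it and converting exceptions into failures."""
    start = time.perf_counter()
    try:
        check()
        ok = True
    except Exception:
        ok = False
    elapsed = time.perf_counter() - start
    print(f"Checking {name}... {'✓ OK' if ok else '✗ FAIL'} ({elapsed:.2f}s)")
    return ok, elapsed

# Stand-in probes; the real script checks embeddings, ChromaDB, search, and disk
checks = {
    "Data Directory": lambda: None,
    "Search Functionality": lambda: None,
}
passed = sum(run_check(name, fn)[0] for name, fn in checks.items())
print(f"Health Check Summary: {passed}/{len(checks)} checks passed")
```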
```bash
# Index the current directory
python scripts/index_project.py . --name my-project

# Index a specific path
python scripts/index_project.py ~/projects/api --name api-backend

# Force full reindex
python scripts/index_project.py . --name my-project --force

# Dry run to preview what will be indexed
python scripts/index_project.py . --name my-project --dry-run
```

Run the server over stdio:

```bash
codebase-rag
```

Or over HTTP:

```bash
MCP_TRANSPORT=streamable-http MCP_PORT=8080 codebase-rag
```

Add to your VS Code `mcp.json`:
```json
{
  "servers": {
    "codebase-rag": {
      "type": "stdio",
      "command": "codebase-rag"
    }
  }
}
```

For the HTTP transport:

```json
{
  "servers": {
    "codebase-rag": {
      "type": "http",
      "url": "http://127.0.0.1:8080/mcp"
    }
  }
}
```

For clients that use the `mcpServers` key (e.g. Claude Desktop):

```json
{
  "mcpServers": {
    "codebase-rag": {
      "command": "codebase-rag"
    }
  }
}
```

`search_codebase`: semantic search over an indexed project using vector embeddings + keyword reranking.
Input:
```json
{
  "query": "where is the authentication logic?",
  "top_k": 5,
  "project": "my-project",
  "file_types": [".py", ".js"]
}
```

Output:
```json
{
  "results": [
    {
      "path": "src/auth.py",
      "content": "def authenticate_user(user, password):\n ...",
      "score": 0.89
    }
  ],
  "total_indexed_chunks": 1247,
  "query_time_ms": 23
}
```

Re-index a project after large changes.
Input:
```json
{
  "project_path": "/path/to/your/project",
  "project_name": "my-project",
  "force": false
}
```

List all indexed projects.
List indexed files in a project.
Input: `{ "project": "my-project" }`
Return the full content of an indexed file.
Input: `{ "path": "src/main.py" }`
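Conceptually, the server maps each tool name to a handler and dispatches incoming calls by name. A toy dispatcher showing the shape of that mapping (the handler names, the `INDEX` dict, and their bodies are hypothetical stand-ins, not the server's code):

```python
# Toy in-memory "index" standing in for ChromaDB collections
INDEX = {"my-project": {"src/main.py": "print('hello')"}}

def list_projects() -> list[str]:
    """All indexed project names."""
    return sorted(INDEX)

def list_files(project: str) -> list[str]:
    """Indexed file paths for one project."""
    return sorted(INDEX.get(project, {}))

def get_file_content(project: str, path: str) -> str:
    """Full content of one indexed file."""
    return INDEX[project][path]

TOOLS = {
    "list_projects": list_projects,
    "list_files": list_files,
    "get_file_content": get_file_content,
}

def call_tool(name: str, **kwargs):
    """Dispatch a tool call by name, as an MCP server does internally."""
    return TOOLS[name](**kwargs)
```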
```bash
# ChromaDB path (default: ./data/chroma_db relative to install dir)
export CHROMA_DB_PATH="/custom/path/to/chroma"

# Embedding model (default: all-MiniLM-L6-v2)
# Use microsoft/unixcoder-base for better code-specific embeddings (~2GB, requires torch)
export EMBEDDING_MODEL="microsoft/unixcoder-base"

# HTTP transport settings (only needed in HTTP/service mode)
export MCP_TRANSPORT="streamable-http"
export MCP_HOST="127.0.0.1"
export MCP_PORT="8080"

# Set this when exposing via reverse proxy or Tailscale Funnel
export MCP_ALLOWED_HOST="your-hostname.example.com"

# Log level (default: INFO)
export LOG_LEVEL="DEBUG"
```

Edit `src/codebase_rag/config.py`:
```python
CHUNK_SIZE = 500     # characters per chunk
CHUNK_OVERLAP = 50   # overlap between chunks
DEFAULT_TOP_K = 5    # default results per search
```

Supported languages: Python, JavaScript, TypeScript, JSX, TSX, Java, C, C++, Go, Rust, Ruby, PHP, C#, Shell, YAML, JSON.
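A minimal sketch of overlapping character chunking with those defaults, assuming simple fixed-size windows (the real chunker also uses the Python AST, so its boundaries will differ):

```python
def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap  # advance by 450 chars per chunk with the defaults
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# 1200 chars -> windows starting at 0, 450, 900
chunks = chunk_text("x" * 1200)
```

The overlap means each chunk repeats the tail of the previous one, so a match near a chunk boundary is still retrievable.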
Ignored by default: `*.pyc`, `__pycache__`, `.git`, `node_modules`, `.venv`, `venv`, `*.egg-info`, `.pytest_cache`.
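Filtering like this can be done with the standard library's `fnmatch`. A sketch, assuming the indexer matches patterns against every path segment (an assumption, not the project's exact logic):

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

IGNORES = ["*.pyc", "__pycache__", ".git", "node_modules",
           ".venv", "venv", "*.egg-info", ".pytest_cache"]

def is_ignored(path: str) -> bool:
    """True if any path segment matches an ignore pattern."""
    parts = PurePosixPath(path).parts
    return any(fnmatch(part, pat) for part in parts for pat in IGNORES)
```

Matching per segment means a pattern like `__pycache__` excludes the whole directory subtree, not just a file with that exact name.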
| Operation | Expected Time | Notes |
|---|---|---|
| Index 20 .py files (~5k LOC) | ~5–8s | First run; incremental is much faster |
| Vector search (top_k=5) | ~20–50ms | ChromaDB in-process |
| Query embedding | ~10–20ms | sentence-transformers, CPU |
| Server cold start | ~2–3s | Model loaded into memory |
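Incremental runs are fast because unchanged files can be skipped. A sketch of content-hash change detection, one common mechanism for this (illustrative; the project's actual strategy may differ):

```python
import hashlib

def file_digest(content: bytes) -> str:
    """Stable fingerprint of a file's content."""
    return hashlib.sha256(content).hexdigest()

def files_to_reindex(files: dict[str, bytes], seen: dict[str, str]) -> list[str]:
    """Return paths whose content hash differs from the stored hash."""
    changed = []
    for path, content in files.items():
        digest = file_digest(content)
        if seen.get(path) != digest:
            changed.append(path)
            seen[path] = digest  # remember the new fingerprint
    return changed

seen: dict[str, str] = {}
first = files_to_reindex({"a.py": b"x = 1", "b.py": b"y = 2"}, seen)   # both are new
second = files_to_reindex({"a.py": b"x = 1", "b.py": b"y = 3"}, seen)  # only b.py changed
```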
Scan a directory for Git repositories and index them all automatically:

```bash
python scripts/auto_index.py ~/projects
```

Watch a project for file changes and reindex incrementally (debounced, 5s):

```bash
python scripts/watch.py /path/to/project --name my-project
```

Install a post-commit hook so changed files are reindexed automatically after every commit:

```bash
python scripts/setup_git_hook.py /path/to/your/repo my-project
```

Run the test suite:

```bash
# All tests (116 passing)
pytest -v

# Specific modules
pytest tests/test_config.py -v
pytest tests/test_embeddings.py -v
pytest tests/test_indexer.py -v
pytest tests/test_server.py -v

# With coverage
pytest --cov=codebase_rag --cov-report=html
```

A template service file is provided at `systemd/codebase-rag-server.service`.
Replace YOUR_USERNAME with your actual Linux username before installing:
```bash
# Substitute your username in-place
sed -i "s/YOUR_USERNAME/$USER/g" systemd/codebase-rag-server.service

# Install and start
sudo cp systemd/codebase-rag-server.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable codebase-rag-server
sudo systemctl start codebase-rag-server

# Check
sudo systemctl status codebase-rag-server
sudo journalctl -u codebase-rag-server -f
```

To use the server from a remote machine (Codespaces, company laptop, etc.):
```bash
# Expose port 8080 via Tailscale Funnel
tailscale funnel 8080

# Add to your service file:
# Environment="MCP_ALLOWED_HOST=your-machine.your-tailnet.ts.net"

# Then in your remote mcp.json:
# "url": "https://your-machine.your-tailnet.ts.net/mcp"
```

**Slow first start:** the embedding model (~100MB) is downloaded on first use. Run `health_check.py` to pre-load it.
**High memory usage:** the default model uses ~500MB RAM. If needed, switch to an even smaller model via `EMBEDDING_MODEL`.

**Permission errors:** ensure the running user has write access to `data/chroma_db/`.
Debug mode:

```bash
LOG_LEVEL=DEBUG codebase-rag
```

- Fork the project
- Create a feature branch: `git checkout -b feature/your-feature`
- Follow strict TDD: RED → GREEN → REFACTOR
- Atomic, descriptive commits
- Open a pull request with tests
```bash
# Dev setup
pip install -e ".[dev]"
pytest -v --cov=codebase_rag
```

MIT License. See LICENSE.