RAG-based Q&A system for code repositories that provides grounded answers with verifiable citations.
```bash
# Install
pip install code-rag-me

# Configure (get a free API key from https://console.groq.com/keys)
coderag setup

# Start web interface
coderag serve
```

Open http://localhost:8000 to use the web interface.
```bash
# Auto-configure Claude Desktop
coderag mcp-install

# Restart Claude Desktop
```

Now you can use CodeRAG directly in Claude Desktop!
- Grounded Responses: Every answer includes citations to source code in the form `[file:start-end]`
- Cloud or Local LLM: Use Groq (free), OpenAI, Anthropic, or run locally with a GPU
- GitHub Integration: Index any public GitHub repository
- MCP Support: Integrate directly with Claude Desktop
- Semantic Chunking: Tree-sitter for Python, text fallback for other languages
- Web Interface: Gradio UI for easy interaction
- REST API: Programmatic access for integration
- CLI: Full command-line interface
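The text-fallback chunking mentioned above can be sketched as overlapping line-based windows that remember their line spans, so answers can later cite `[file:start-end]` ranges. This is a hypothetical simplification; the real chunker's parameters and field names may differ.

```python
def chunk_text(source: str, max_lines: int = 40, overlap: int = 5) -> list[dict]:
    """Split a file into overlapping line-based chunks (illustrative sketch).

    Each chunk records 1-based start/end line numbers so a retrieved
    chunk can be cited as [file:start-end].
    """
    lines = source.splitlines()
    chunks = []
    step = max_lines - overlap
    for start in range(0, len(lines), step):
        end = min(start + max_lines, len(lines))
        chunks.append({
            "start": start + 1,          # 1-based first line of the chunk
            "end": end,                  # 1-based last line of the chunk
            "text": "\n".join(lines[start:end]),
        })
        if end == len(lines):            # last window reached end of file
            break
    return chunks
```

The overlap keeps context that straddles a chunk boundary retrievable from both neighboring chunks.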
```bash
coderag setup             # Configure LLM provider and API key
coderag serve             # Start web server
coderag mcp-install       # Configure Claude Desktop for MCP
coderag mcp-run           # Run MCP server (used by Claude Desktop)
coderag index <url>       # Index a GitHub repository
coderag query <repo> "?"  # Ask a question about code
coderag repos             # List indexed repositories
coderag doctor            # Diagnose setup issues
```

### Arch Linux

Arch Linux uses PEP 668 to protect the system Python. Use one of these methods:
Option A: pipx (Recommended for CLI tools)

```bash
sudo pacman -S python-pipx
pipx install code-rag-me
```

Option B: Virtual environment

```bash
python -m venv ~/.local/share/coderag-venv
source ~/.local/share/coderag-venv/bin/activate
pip install code-rag-me
```

To always have `coderag` available, add this to your `~/.bashrc` or `~/.zshrc`:

```bash
alias coderag="~/.local/share/coderag-venv/bin/coderag"
```

### Debian / Ubuntu

```bash
# Install Python and pip
sudo apt update
sudo apt install python3 python3-pip python3-venv
```
```bash
# Option A: pipx (Recommended)
sudo apt install pipx
pipx install code-rag-me

# Option B: Virtual environment
python3 -m venv ~/.local/share/coderag-venv
source ~/.local/share/coderag-venv/bin/activate
pip install code-rag-me
```

### Fedora

```bash
# Install Python and pip
sudo dnf install python3 python3-pip
```
```bash
# Option A: pipx (Recommended)
sudo dnf install pipx
pipx install code-rag-me

# Option B: Virtual environment
python3 -m venv ~/.local/share/coderag-venv
source ~/.local/share/coderag-venv/bin/activate
pip install code-rag-me
```

### Other Linux

```bash
# Create virtual environment
python3 -m venv ~/.local/share/coderag-venv
source ~/.local/share/coderag-venv/bin/activate
pip install code-rag-me
```

### macOS

Option A: pipx (Recommended)
```bash
# Install pipx via Homebrew
brew install pipx
pipx ensurepath
pipx install code-rag-me
```

Option B: Virtual environment

```bash
python3 -m venv ~/.local/share/coderag-venv
source ~/.local/share/coderag-venv/bin/activate
pip install code-rag-me
```

Option C: Homebrew Python

```bash
brew install python@3.11
pip3 install code-rag-me
```

### Windows

Option A: pipx (Recommended)

```bash
# Install pipx
pip install pipx
pipx ensurepath

# Install CodeRAG
pipx install code-rag-me
```

Option B: Virtual environment

```bash
# Create virtual environment
python -m venv %USERPROFILE%\coderag-venv

# Activate (Command Prompt)
%USERPROFILE%\coderag-venv\Scripts\activate.bat

# Activate (PowerShell)
& $env:USERPROFILE\coderag-venv\Scripts\Activate.ps1

# Install
pip install code-rag-me
```

Note: On Windows, you may need to run PowerShell as Administrator or enable script execution with:

```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```
### From Source

```bash
git clone https://github.com/Sebastiangmz/CodeRAG.git
cd CodeRAG
pip install -e .
coderag setup
```

### Docker

```bash
git clone https://github.com/Sebastiangmz/CodeRAG.git
cd CodeRAG
docker compose up
```

After installing, configure your LLM provider:
```bash
coderag setup
```

This will prompt you to:
- Choose an LLM provider (Groq recommended - free tier available)
- Enter your API key (get one at https://console.groq.com/keys)
- Configure optional settings
- Run `coderag serve`
- Open http://localhost:8000
- Go to "Index Repository" → Enter GitHub URL → Click "Index"
- Go to "Ask Questions" → Select repo → Ask questions
```bash
# Index a repository
coderag index https://github.com/owner/repo

# Ask questions
coderag query abc12345 "How does authentication work?"

# List repositories
coderag repos
```

```bash
# Index repository
curl -X POST http://localhost:8000/api/v1/repos/index \
  -H "Content-Type: application/json" \
  -d '{"url": "https://github.com/owner/repo"}'

# Query
curl -X POST http://localhost:8000/api/v1/query \
  -H "Content-Type: application/json" \
  -d '{"question": "How does X work?", "repo_id": "abc12345"}'
```

After running `coderag mcp-install` and restarting Claude Desktop:
```
You: Use coderag to index https://github.com/owner/repo

Claude: I'll index that repository for you...
✅ Indexed! 150 files, 1,234 chunks.

You: How does the authentication system work?

Claude: Based on the code, authentication is handled in...
[src/auth/handler.py:45-78]
```
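The REST endpoints shown in the curl examples can also be called from Python. Below is a minimal client sketch using only the standard library; the endpoint paths come from the examples above, but the shape of the JSON responses is an assumption.

```python
import json
from urllib import request

BASE = "http://localhost:8000/api/v1"  # default SERVER_HOST/SERVER_PORT

def build_request(path: str, payload: dict):
    # Assemble (url, body, headers) for a JSON POST to the CodeRAG API.
    return (
        BASE + path,
        json.dumps(payload).encode("utf-8"),
        {"Content-Type": "application/json"},
    )

def _post(path: str, payload: dict) -> dict:
    url, body, headers = build_request(path, payload)
    req = request.Request(url, data=body, headers=headers)
    with request.urlopen(req) as resp:
        return json.load(resp)

def index_repo(repo_url: str) -> dict:
    # POST /repos/index, mirroring the first curl example.
    return _post("/repos/index", {"url": repo_url})

def query(repo_id: str, question: str) -> dict:
    # POST /query with a question and a repo id, mirroring the second example.
    return _post("/query", {"question": question, "repo_id": repo_id})
```

Usage would look like `query("abc12345", "How does X work?")` once the server from `coderag serve` is running.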
```bash
# LLM Provider (groq, openai, anthropic, openrouter, together, local)
MODEL_LLM_PROVIDER=groq
MODEL_LLM_API_KEY=your-api-key

# Embeddings (runs locally on CPU by default)
MODEL_EMBEDDING_DEVICE=auto  # auto, cuda, or cpu

# Server
SERVER_HOST=0.0.0.0
SERVER_PORT=8000
```

Configuration is stored in `~/.config/coderag/config.json` after running `coderag setup`.
```
┌─────────────────────────────────────────────────────────────┐
│                       User Interface                        │
│             (Gradio UI / REST API / MCP / CLI)              │
└──────────────────────┬──────────────────────────────────────┘
                       │
┌──────────────────────┴──────────────────────────────────────┐
│                     Ingestion Pipeline                      │
│   GitHub Clone → File Filter → Chunker (Tree-sitter/Text)   │
└──────────────────────┬──────────────────────────────────────┘
                       │
┌──────────────────────┴──────────────────────────────────────┐
│                     Indexing & Storage                      │
│        Embeddings (nomic-embed) → ChromaDB (Cosine)         │
└──────────────────────┬──────────────────────────────────────┘
                       │
┌──────────────────────┴──────────────────────────────────────┐
│                   Retrieval & Generation                    │
│     Query → Top-K Search → LLM (Cloud/Local) → Response     │
└─────────────────────────────────────────────────────────────┘
```
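The retrieval stage in the diagram (embed the query, then cosine top-K search over chunk embeddings) can be illustrated with a toy in-memory version. This is purely didactic: the real system uses the nomic-embed model and ChromaDB, not the bag-of-words stand-in below.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    # Toy "embedding": a bag-of-words count vector. Stand-in for
    # the real nomic-embed model; it only illustrates the data flow.
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def top_k(query: str, chunks: list[str], k: int = 2) -> list[str]:
    # Rank all chunks by similarity to the query and keep the best k,
    # which then become the grounding context passed to the LLM.
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]
```

In the real pipeline the retrieved chunks carry file paths and line spans, which is what makes the `[file:start-end]` citations in answers possible.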
```
src/coderag/
├── cli.py        # Unified CLI
├── ingestion/    # Repository loading and chunking
├── indexing/     # Embeddings and vector storage
├── retrieval/    # Semantic search
├── generation/   # LLM inference and citations
├── mcp/          # Model Context Protocol server
├── ui/           # Gradio web interface
├── api/          # REST API endpoints
└── models/       # Data models
```
```bash
# Install dev dependencies
pip install -e ".[dev]"

# Run tests
pytest tests/

# Format code
black src/ tests/

# Lint
ruff check src/ tests/

# Type check
mypy src/
```

- Indexing: ~1000 files in < 5 minutes
- Query: Response in < 10 seconds
- Embeddings: Runs on CPU (~275MB model)
- LLM: Cloud (instant) or Local (requires 8GB+ VRAM)
All responses include citations in the form:

```
[file_path:start_line-end_line]
```

Example:

> The authentication logic is in the `login()` function [src/auth.py:45-78].
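If you want to post-process answers, the citation format above is easy to extract with a regular expression. A small hypothetical helper (not part of the CodeRAG API):

```python
import re

# Matches the [file_path:start_line-end_line] citation format.
CITATION_RE = re.compile(r"\[([^\[\]:]+):(\d+)-(\d+)\]")

def extract_citations(answer: str) -> list[tuple[str, int, int]]:
    """Return (path, start_line, end_line) tuples found in an answer."""
    return [(path, int(start), int(end))
            for path, start, end in CITATION_RE.findall(answer)]
```

For example, feeding in the sentence above yields `[("src/auth.py", 45, 78)]`, which you could use to jump to or verify the cited span.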
Run diagnostics:

```bash
coderag doctor
```

If you see:

```
error: externally-managed-environment
× This environment is externally managed
```

This happens on modern Linux distributions (Arch, Fedora 38+, Ubuntu 23.04+) that implement PEP 668. Solution: use pipx or a virtual environment; see the Installation section for your distribution.
```bash
coderag setup   # Run interactive setup
```

If you don't have a GPU or encounter CUDA errors:

```bash
export MODEL_EMBEDDING_DEVICE=cpu
coderag serve
```

Or add to your `.env` file:

```
MODEL_EMBEDDING_DEVICE=cpu
```
- Run `coderag mcp-install`
- Completely quit Claude Desktop (not just close the window)
- Restart Claude Desktop
- Check the MCP icon in Claude Desktop settings
```bash
# If using pipx
pipx ensurepath
source ~/.bashrc  # or ~/.zshrc

# If using venv, make sure it's activated
source ~/.local/share/coderag-venv/bin/activate
```

On Windows, enable script execution if needed:

```powershell
Set-ExecutionPolicy -ExecutionPolicy RemoteSigned -Scope CurrentUser
```

MIT License - see LICENSE file
- Fork the repository
- Create a feature branch
- Make your changes
- Add tests
- Submit a pull request
- Groq for fast, free LLM inference
- nomic-embed-text by Nomic AI
- ChromaDB for vector storage
- Tree-sitter for code parsing
- MCP by Anthropic