
Codebase RAG API

Project Reference: code-graph-rag

A Retrieval-Augmented Generation (RAG) system that analyzes multi-language codebases with Tree-sitter, builds a comprehensive knowledge graph, and supports both natural-language querying of codebase structure and relationships and code-editing capabilities.

Prerequisites

  • Python 3.12+
  • Docker & Docker Compose (for Memgraph, Redis)
  • cmake (required for building pymgclient dependency)
  • For cloud models: Google Gemini API key
  • For local models: Ollama installed and running
  • uv package manager

Installing cmake

On macOS:

brew install cmake

On Linux (Ubuntu/Debian):

sudo apt-get update
sudo apt-get install cmake

On Linux (CentOS/RHEL):

sudo yum install cmake
# or on newer versions:
sudo dnf install cmake
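
After installing, you can confirm cmake is on your PATH before building (the fallback message is just a convenience, not part of the project):

```shell
# Confirm cmake is installed and visible; it is needed to build pymgclient.
cmake --version || echo "cmake not found; install it as shown above"
```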
  1. Clone the repository:
git clone https://github.com/YEEthanCC/codebase-rag-api.git
  2. Install dependencies:

For basic Python support:

uv sync

For full multi-language support (this installs Tree-sitter grammars for all supported languages; see the Multi-Language Support section):

uv sync --extra treesitter-full

For development (including tests and pre-commit hooks):

make dev

This installs all dependencies and sets up pre-commit hooks automatically.

  3. Set up environment variables:
cp .env.example .env
# Edit .env with your configuration (see options below)

Configuration Options

The provider-explicit configuration supports mixing different providers for the orchestrator and Cypher models.

Option 1: All Ollama (Local Models)

# .env file
ORCHESTRATOR_PROVIDER=ollama
ORCHESTRATOR_MODEL=llama3.2
ORCHESTRATOR_ENDPOINT=http://localhost:11434/v1

CYPHER_PROVIDER=ollama
CYPHER_MODEL=codellama
CYPHER_ENDPOINT=http://localhost:11434/v1

Option 2: All OpenAI Models

# .env file
ORCHESTRATOR_PROVIDER=openai
ORCHESTRATOR_MODEL=gpt-4o
ORCHESTRATOR_API_KEY=sk-your-openai-key

CYPHER_PROVIDER=openai
CYPHER_MODEL=gpt-4o-mini
CYPHER_API_KEY=sk-your-openai-key

Option 3: All Google Models

# .env file
ORCHESTRATOR_PROVIDER=google
ORCHESTRATOR_MODEL=gemini-2.5-pro
ORCHESTRATOR_API_KEY=your-google-api-key

CYPHER_PROVIDER=google
CYPHER_MODEL=gemini-2.5-flash
CYPHER_API_KEY=your-google-api-key

Option 4: Mixed Providers

# .env file - Google orchestrator + Ollama cypher
ORCHESTRATOR_PROVIDER=google
ORCHESTRATOR_MODEL=gemini-2.5-pro
ORCHESTRATOR_API_KEY=your-google-api-key

CYPHER_PROVIDER=ollama
CYPHER_MODEL=codellama
CYPHER_ENDPOINT=http://localhost:11434/v1

Get your Google API key from Google AI Studio.
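
Whichever option you choose, it helps to sanity-check the .env before starting the server. A minimal sketch (the key names mirror the options above; the script and sample file path are illustrative, not part of the repo):

```shell
# Sketch: check that a provider-explicit .env defines the core keys.
# Writes a sample file so the check is self-contained; point ENV_FILE
# at your real .env in practice.
ENV_FILE=/tmp/sample.env
cat > "$ENV_FILE" <<'EOF'
ORCHESTRATOR_PROVIDER=ollama
ORCHESTRATOR_MODEL=llama3.2
ORCHESTRATOR_ENDPOINT=http://localhost:11434/v1
CYPHER_PROVIDER=ollama
CYPHER_MODEL=codellama
CYPHER_ENDPOINT=http://localhost:11434/v1
EOF

for key in ORCHESTRATOR_PROVIDER ORCHESTRATOR_MODEL CYPHER_PROVIDER CYPHER_MODEL; do
  grep -q "^${key}=" "$ENV_FILE" || { echo "missing: $key"; exit 1; }
done
echo "env check passed"
```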

Install and run Ollama:

# Install Ollama (macOS/Linux)
curl -fsSL https://ollama.ai/install.sh | sh

# Pull required models
ollama pull llama3.2
# Or try other models like:
# ollama pull llama3
# ollama pull mistral
# ollama pull codellama

# Ollama will automatically start serving on localhost:11434

Note: Local models provide privacy and no API costs, but may have lower accuracy compared to cloud models like Gemini.
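
Once Ollama is running, you can confirm it is reachable; /api/tags is Ollama's endpoint for listing locally pulled models:

```shell
# Check that the Ollama server is up and list pulled models.
# Falls back to a message if nothing is listening on 11434.
curl -s http://localhost:11434/api/tags || echo "Ollama is not reachable on localhost:11434"
```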

  4. Start the Memgraph database and Redis:
docker-compose up -d
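To confirm both services came up, list the compose services and their state (service names and ports depend on this repo's docker-compose.yml; Memgraph typically listens on 7687 and Redis on 6379):

```shell
# List compose services and their current state.
# Guarded so the command degrades gracefully if docker-compose is absent.
docker-compose ps || echo "docker-compose is not available"
```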
  5. Run the server

Activate Environment

source .venv/bin/activate

Start The Server

make
  6. Connect to the frontend:
git clone https://github.com/YEEthanCC/codebase-rag-client.git
