Local code assistant powered by Retrieval-Augmented Generation.
Code-RAG indexes your codebase into a vector database and uses retrieval-augmented generation to answer questions about your code — all running locally.
- Index — Your code is chunked, embedded, and stored in a local vector database
- Retrieve — When you ask a question, the most relevant code chunks are fetched via similarity search
- Generate — Retrieved context is fed to the LLM alongside your query for grounded, accurate responses
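The three steps above can be sketched end-to-end. This is a minimal toy illustration, not the project's actual implementation: it uses a bag-of-words embedding and an in-memory list in place of a real embedding model and vector database, and stops at building the prompt rather than calling an LLM:

```python
import math
import re
from collections import Counter

def chunk(text, size=3):
    """Index step: split source into fixed-size line chunks."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]

def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a real model)."""
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    """Similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, chunks, top_k=2):
    """Retrieve step: rank stored chunks by similarity to the query."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:top_k]

source = "def add(a, b):\n    return a + b\n\ndef sub(a, b):\n    return a - b"
index = chunk(source)
context = retrieve("how does add work", index, top_k=1)
# Generate step: the retrieved chunks are prepended to the LLM prompt
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: how does add work?"
```

In the real pipeline, the embedding model and vector store are swapped in for `embed` and the plain list, but the control flow is the same.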
All parameters are tunable via config/config.yaml:
| Parameter | What it does |
|---|---|
| MODEL_NAME | LLM model for code generation |
| TOP_K | Number of retrieved chunks to include as context |
| TOP_P | Nucleus sampling threshold |
| MAX_LENGTH | Maximum response token length |
| TEMPERATURE | Creativity vs. precision control |
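For reference, a config using these parameters might look like the following; the values shown here are illustrative defaults, not copied from the repository:

```yaml
MODEL_NAME: gpt-4o-mini   # hypothetical model choice
TOP_K: 5                  # retrieved chunks per query
TOP_P: 0.9                # nucleus sampling threshold
MAX_LENGTH: 512           # max response tokens
TEMPERATURE: 0.2          # low = precise, high = creative
```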
code-RAG/
src/ # Core RAG pipeline
config/ # Configuration files
models/ # Model artifacts
data/ # Source data for indexing
vectordb/ # Vector database storage
tests/ # Test suite
.github/workflows/ # CI/CD
git clone https://github.com/brettleehari/code-RAG.git
cd code-RAG
pip install -r requirements.txt
export OPENAI_API_KEY=your_key
python src/main.py

RAG is the most common pattern in production AI applications today. I wanted hands-on experience with the full pipeline — chunking strategies, embedding models, vector storage, retrieval tuning, and grounded generation. This project taught me the tradeoffs that matter when building RAG products: chunk size vs. context quality, retrieval precision vs. recall, and the cost of re-indexing.
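The chunk-size tradeoff called out above can be made concrete with a sliding-window chunker (a hypothetical sketch, not the project's actual chunking code): a larger `size` packs more context into each chunk but makes retrieval coarser, while `overlap` reduces the chance that a relevant definition is split across a chunk boundary.

```python
def sliding_chunks(text, size=200, overlap=50):
    """Split text into overlapping character windows.

    size:    characters per chunk (more context, coarser retrieval)
    overlap: characters shared between adjacent chunks, so content
             straddling a boundary still appears whole in one chunk
    """
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

# 500 characters split into 200-char windows stepping by 150
text = "".join(chr(97 + i % 26) for i in range(500))
chunks = sliding_chunks(text, size=200, overlap=50)
```

Re-indexing cost follows directly: smaller chunks and bigger overlap mean more chunks to embed and store every time the codebase changes.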
Hariprasad Sudharshan - GitHub
MIT