An intelligent AI assistant built in Python that uses Retrieval-Augmented Generation (RAG) to understand and answer natural language questions about any Python codebase.
You can point it at a local repository (for example, a FastAPI project) — it will analyze and "learn" the code, allowing you to ask questions like:
“How is user authentication handled?”
“Show me the Pydantic model for an Article.”
- Google Gemini → For powerful Large Language Model (LLM) and embedding generation.
- LangChain → Framework to chain RAG components together.
- FAISS → Fast, local, in-memory vector database for similarity search.
The assistant follows a classic Retrieval-Augmented Generation pipeline:
- Load → Scans your repository and loads all .py files.
- Split → Intelligently splits code into smaller, context-aware chunks.
- Embed → Converts each chunk into a vector using Google’s embedding-001 model.
- Store → Saves all vectors into a local FAISS vector database.
- Retrieve → When you ask a question, it finds the top 5 most relevant code chunks.
- Augment → Adds these retrieved chunks to your original question as context.
- Generate → Sends the augmented prompt to Gemini, which produces an accurate answer based on the provided context.
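The Load and Split steps can be sketched in plain Python. This is an illustrative simplification: the actual project presumably uses LangChain's document loaders and a context-aware splitter (such as `RecursiveCharacterTextSplitter`), and the chunk size and overlap values here are assumptions, not the project's settings.

```python
from pathlib import Path

def load_py_files(repo_path):
    """Load the text of every .py file under repo_path."""
    return {
        str(p): p.read_text(encoding="utf-8")
        for p in Path(repo_path).rglob("*.py")
    }

def split_into_chunks(text, chunk_size=500, overlap=50):
    """Split source text into overlapping fixed-size chunks.
    (A LangChain splitter does this more intelligently, preferring
    natural boundaries such as blank lines and function definitions.)"""
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap
    return chunks
```

The overlap keeps a little shared context between adjacent chunks, so a function split across a boundary is still retrievable from either side.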
- You ask: “How is database dependency injection handled?”
- The assistant retrieves the top 5 related code snippets from your repository.
- It combines your question + retrieved code snippets into a single prompt.
- Gemini generates a precise, context-aware answer — grounded only in your codebase.
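The "combine" step above can be sketched as a simple prompt builder. The exact prompt wording and separator are assumptions for illustration; the project presumably uses a LangChain prompt template for this.

```python
def build_augmented_prompt(question, retrieved_chunks):
    """Combine the user's question with the retrieved code snippets
    into a single grounded prompt for the LLM."""
    context = "\n\n---\n\n".join(retrieved_chunks)
    return (
        "Answer the question using ONLY the code context below.\n\n"
        f"Code context:\n{context}\n\n"
        f"Question: {question}\n"
    )
```

Instructing the model to answer only from the supplied context is what keeps the answer grounded in your codebase rather than the model's general training data.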
Make sure you have:
- Python 3.9+
- Git
- A Google Gemini API key
git clone https://github.com/supreetbhat/codebase_assistant.git
cd codebase_assistant
It’s strongly recommended to use a virtual environment.
For macOS/Linux:
python3 -m venv venv
source venv/bin/activate
For Windows:
python -m venv venv
.\venv\Scripts\activate
Install all required libraries from requirements.txt:
pip install -r requirements.txt
You’ll need a Google Gemini API key.
Get it from Google AI Studio.
Then create a .env file:
cp .env.example .env
Open .env and add your key:
GOOGLE_API_KEY="YOUR_API_KEY_HERE"
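The script presumably reads this key from the environment (typically via python-dotenv). A minimal fail-fast check looks like the sketch below; the error message is illustrative:

```python
import os

def get_api_key():
    """Read the Gemini API key from the environment; fail fast if missing."""
    key = os.environ.get("GOOGLE_API_KEY")
    if not key:
        raise RuntimeError(
            "GOOGLE_API_KEY is not set - create a .env file or export it."
        )
    return key
```

Failing at startup with a clear message beats a cryptic authentication error from the API halfway through indexing.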
💡 Note: You must enable billing on your Google Cloud project to use the embedding API.
Run the assistant and pass your target codebase path as an argument:
python code_assistant.py /path/to/your/codebase
Example:
python code_assistant.py ../my-fastapi-project
The script will first index the codebase (this may take a minute). Once you see:
✅ Codebase Assistant is ready!
You can start asking natural language questions.
Type exit to quit.
“What does the /api/users/login endpoint do?”
“How do I create a new article?”
“Show me the Pydantic model for a User.”
“How is database dependency injection handled?”
“What fields are required to create a new user?”
“Explain the logic for following another user.”