A powerful, privacy-first RAG agent that helps you understand, navigate, and query your codebases with precision and ease.
Features • Installation • Usage • Use Cases • Architecture
Cortex is a sophisticated AI coding assistant that combines the power of Retrieval-Augmented Generation (RAG) with multi-agent orchestration to provide intelligent, context-aware answers about your codebase. Built with privacy in mind, Cortex runs entirely locally using Ollama or can be configured to use OpenAI's models.
- 🔒 Privacy-First: All data stays local. Your code never leaves your machine (unless using OpenAI).
- 🎯 Precise & Intelligent: Uses symbolic analysis (LSP) + semantic search for accurate answers.
- ⚡ Lightning Fast: Incremental indexing with SHA-256 hashing means only changed files are re-indexed.
- 🤖 Multi-Agent System: Specialized sub-agents for planning, exploration, building, and general queries.
- 🔄 Real-Time Sync: Automatic file watching keeps your index up-to-date as you code.
- 🌍 Multi-Project Support: Work with unlimited projects independently, each with its own `.cortex` directory.
- Each project maintains its own `.cortex` directory with metadata, indices, and state
- Switch between projects seamlessly
- No cross-contamination of project data
- SHA-256 hashing detects file changes
- Only modified files are re-indexed
- Blazing-fast updates even for large codebases
- Powered by Jedi for Python static analysis
- Find exact symbol definitions and references
- No guessing—Cortex knows precisely where your code is
- LangGraph-based ReAct agents for multi-step reasoning
- Specialized sub-agents:
- 🔍 Explorer: Code search, understanding, and navigation
- 🛠️ Builder: Code modification and file creation (read-only mode currently)
- 📋 Planner: Multi-step task breakdown
- 💬 General: Quick answers and simple queries
- Intelligent tool selection with fallback strategies
- Semantic Search: Find code by meaning and context
- Exact Pattern Matching: Regex-based grep for precise searches
- Symbol Lookup: Find definitions and references
- File Discovery: Search by name, pattern, or directory
- Code Outline: Get class/function structure without full content
- Reranking: LLM-powered result reranking for better relevance
- Real-time file monitoring with `watchdog`
- Automatic re-indexing on file changes
- Always in sync with your latest code
- Beautiful terminal UI with the `Rich` library
- Interactive chat mode with conversation memory
- Single-shot query mode for quick answers
- Comprehensive repository management commands
- Local Models: Optimized for `ministral-3:3b` + `qwen3-embedding:0.6b` via Ollama
- Cloud Models: Full support for OpenAI (GPT-4, GPT-4o, etc.)
- Easy configuration for custom models
- Clone and index GitHub repositories directly
- Manage multiple cloned repos
- Full support for remote codebases
# Clone the repository
git clone https://github.com/your-username/cortex.git
cd cortex
# Install dependencies
uv sync
# Install required models (Ollama)
./scripts/install_models.sh

Create a `.env` file for OpenAI integration:
cp .env.example .env
# Edit .env and add your OPENAI_API_KEY

Index your current project or any directory:
# Index current directory
uv run python main.py index .
# Index a specific project
uv run python main.py index /path/to/your/project

What happens:
- Creates a `.cortex` directory in the project root
- Analyzes and chunks all code files
- Builds a vector index for semantic search
- Stores file hashes for incremental updates
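The change-detection step above can be sketched as follows. This is a minimal illustration, not Cortex's actual implementation: the `state` dict stands in for the SQLite state database, and only `.py` files are scanned.

```python
import hashlib
from pathlib import Path

def file_sha256(path: Path) -> str:
    """Hash a file's contents so unchanged files can be skipped."""
    return hashlib.sha256(path.read_bytes()).hexdigest()

def files_to_reindex(root: Path, state: dict[str, str]) -> list[Path]:
    """Return only the files whose hash differs from the stored state."""
    changed = []
    for path in root.rglob("*.py"):
        digest = file_sha256(path)
        if state.get(str(path)) != digest:
            changed.append(path)
            state[str(path)] = digest  # record the new hash for next time
    return changed
```

Because unchanged files hash to the same digest, a second pass over an unmodified tree re-indexes nothing.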
Clone and index a GitHub repository directly:
# Index from GitHub URL
uv run python main.py index https://github.com/user/repo.git --type github
# Or use the shorthand
uv run python main.py index https://github.com/user/repo.git

What happens:
- Clones the repository to `~/.cortex/repos/`
- Indexes the entire codebase
- Provides a path for future queries
Query your codebase with a single question:
# Ask about the current project
uv run python main.py ask "How does the StateManager handle file hashing?"
# Ask about a specific project
uv run python main.py ask "What tools are available?" -p /path/to/project
# Use OpenAI for better responses
uv run python main.py ask "Explain the agent architecture" --provider openai --model gpt-4o

Features:
- Automatic background file watching during query
- Multi-tool execution for comprehensive answers
- Supports both local (Ollama) and cloud (OpenAI) models
Start a persistent conversation with your codebase:
# Chat with current project (using OpenAI by default)
uv run python main.py chat
# Chat with a specific project
uv run python main.py chat --project /path/to/project
# Use local Ollama model
uv run python main.py chat --provider ollama --model ministral-3:3b

Features:
- Conversation memory across messages
- Real-time file watching
- Type `exit` or `quit` to end the session
Example Session:
You : What files are in the agents directory?
Cortex: The agents directory contains:
- orchestrator.py (main orchestrator)
- tools.py (tool definitions)
- reranker.py (result reranking)
- subagents/ (specialized agents)
You : Show me the tools available
Cortex: [Lists all tools with descriptions...]
Manually start the file watcher (usually automatic in ask and chat):
# Watch current directory
uv run python main.py watch .
# Watch a specific project
uv run python main.py watch /path/to/project

What it does:
- Monitors file system for changes
- Automatically re-indexes modified files
- Keeps your vector store in sync
List and manage cloned repositories:
# List all cloned repos
uv run python main.py repo list
# Delete a cloned repo
uv run python main.py repo delete repo-name

"I just joined a new team. How do I understand this massive codebase?"
uv run python main.py chat --project /path/to/new/codebase

Ask questions like:
- "What is the overall architecture?"
- "Where is the authentication logic?"
- "How does the database connection work?"
- "Show me all API endpoints"
"There's a bug in the payment processing. Where should I look?"
uv run python main.py ask "Find all code related to payment processing"

Cortex will:
- Search semantically for payment-related code
- Find exact function/class definitions
- Show you references across the codebase
"I need to document how our indexing system works."
uv run python main.py ask "Explain how the incremental indexing system works"

Get detailed explanations with:
- Code snippets from relevant files
- Architecture diagrams (via agent reasoning)
- Step-by-step breakdowns
"I want to refactor the StateManager class. What depends on it?"
uv run python main.py ask "Find all references to StateManager"

Cortex provides:
- All files using StateManager
- Exact line numbers and context
- Related classes and functions
"I want to understand how LangChain implements agents."
# Index the LangChain repository
uv run python main.py index https://github.com/langchain-ai/langchain.git --type github
# Ask questions
uv run python main.py ask "How are agents implemented?" -p ~/.cortex/repos/langchain

"Are there any tests for the ingestion pipeline?"
uv run python main.py ask "Find all test files related to ingestion"

"I need to understand the data flow in this system."
uv run python main.py chat
# Then ask: "Trace the data flow from user input to database storage"

┌─────────────────────────────────────────────────────────────┐
│ Cortex CLI │
│ (Typer + Rich UI) │
└────────────────────┬────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Deep Agent Orchestrator │
│ (LangGraph Multi-Agent System) │
├─────────────────────────────────────────────────────────────┤
│ Sub-Agents: │
│ • Explorer (Code Search & Understanding) │
│ • Builder (Code Modification - Read-Only) │
│ • Planner (Multi-Step Task Breakdown) │
│ • General (Quick Answers) │
└────────────────────┬────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────┐
│ Tool Layer │
├─────────────────────────────────────────────────────────────┤
│ • search_code (Semantic Search + Reranking) │
│ • grep_code (Regex Pattern Matching) │
│ • read_file (File Content Retrieval) │
│ • get_symbol_info (LSP Symbol Definitions) │
│ • find_references (LSP Reference Lookup) │
│ • list_files (Directory Exploration) │
│ • search_files_by_name (Pattern-Based File Search) │
│ • get_file_outline (AST-Based Structure) │
└────────────────────┬────────────────────────────────────────┘
│
┌────────────┴────────────┐
▼ ▼
┌──────────────────┐ ┌──────────────────┐
│ Vector Store │ │ LSP Engine │
│ (ChromaDB) │ │ (Jedi) │
├──────────────────┤ ├──────────────────┤
│ • Embeddings │ │ • Static │
│ • Semantic │ │ Analysis │
│ Search │ │ • Symbol │
│ • Reranking │ │ Resolution │
└──────────────────┘ └──────────────────┘
- File Loading: Recursive directory traversal with smart filtering
- Chunking: Specialized chunkers for Python (AST-based) and text
- Hashing: SHA-256 for change detection
- State Management: SQLite database tracks indexed files
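The AST-based chunking step can be illustrated with the standard library alone. This is a simplified sketch, not the actual chunker: it splits a module into one chunk per top-level function or class, whereas the real pipeline also handles nested definitions and plain-text files.

```python
import ast

def chunk_python(source: str) -> list[dict]:
    """Split a Python module into one chunk per top-level def/class."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append({
                "name": node.name,
                "kind": type(node).__name__,
                # Recover the exact source text covered by this node
                "text": ast.get_source_segment(source, node),
            })
    return chunks
```

Chunking on AST boundaries keeps each embedded chunk a syntactically complete unit, which makes semantic search results far more readable than fixed-size text windows.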
- Engine: ChromaDB for efficient vector storage
- Embeddings: `qwen3-embedding:0.6b` (local) or OpenAI embeddings
- Storage: `.cortex/chroma/` directory per project
- Retrieval: Similarity search with configurable k-value
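ChromaDB performs the similarity search internally; the underlying idea of "top-k by cosine similarity over embeddings" looks roughly like this hand-rolled sketch (for illustration only — the `index` dict of id-to-embedding stands in for the vector store):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def top_k(query: list[float], index: dict[str, list[float]], k: int = 3) -> list[str]:
    """Return the ids of the k chunks most similar to the query embedding."""
    ranked = sorted(index, key=lambda cid: cosine(query, index[cid]), reverse=True)
    return ranked[:k]
```

The configurable k-value is the `k` parameter here: larger k trades latency for recall before the reranking stage narrows results back down.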
- Framework: LangGraph for agent orchestration
- Pattern: ReAct (Reasoning + Acting)
- Memory: In-memory conversation state with thread IDs
- Sub-Agents: Specialized agents for different task types
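The ReAct pattern alternates model reasoning with tool calls until the model is ready to answer. The sketch below is a bare-bones conceptual loop, not LangGraph's API: it assumes the model replies with a `"TOOL name arg"` or `"ANSWER text"` convention, where real frameworks use structured tool-calling.

```python
from typing import Callable

def react_loop(llm: Callable[[str], str],
               tools: dict[str, Callable[[str], str]],
               question: str, max_steps: int = 5) -> str:
    """Minimal ReAct loop: at each step the model acts or answers."""
    transcript = f"Question: {question}"
    for _ in range(max_steps):
        reply = llm(transcript)
        if reply.startswith("ANSWER "):
            return reply[len("ANSWER "):]
        if reply.startswith("TOOL "):
            name, _, arg = reply[len("TOOL "):].partition(" ")
            # Run the tool and feed the observation back into the context
            observation = tools.get(name, lambda a: f"unknown tool: {name}")(arg)
            transcript += f"\nAction: {reply}\nObservation: {observation}"
    return "No answer within max_steps."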
- Engine: Jedi for Python static analysis
- Capabilities: Symbol definitions, references, type inference
- Scope: Project-wide symbol resolution
- Library: `watchdog` for file system monitoring
- Triggers: File creation, modification, deletion
- Action: Automatic re-indexing of changed files
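The trigger-and-re-index cycle can be approximated with a polling sketch. Note this is an illustration only: the real implementation uses `watchdog`'s event-driven observers rather than polling, and re-indexing is hash-based rather than mtime-based.

```python
from pathlib import Path

def poll_for_changes(root: Path, mtimes: dict[str, float]) -> list[Path]:
    """Return files whose modification time changed since the last poll."""
    changed = []
    for path in root.rglob("*.py"):
        mtime = path.stat().st_mtime
        if mtimes.get(str(path)) != mtime:
            changed.append(path)
            mtimes[str(path)] = mtime  # remember for the next poll
    return changed
```

Each file returned would then be fed back through the hashing and chunking pipeline so the vector store never drifts out of sync with the working tree.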
| Tool | Description | Use Case |
|---|---|---|
| `search_code` | Semantic search with reranking | "How does authentication work?" |
| `grep_code` | Exact regex pattern matching | "Find all TODO comments" |
| `read_file` | Read full file content | "Show me the config file" |
| `get_symbol_info` | Find symbol definitions | "Where is UserManager defined?" |
| `find_references` | Find symbol usages | "Where is UserManager used?" |
| `list_files` | List directory contents | "What's in the agents folder?" |
| `search_files_by_name` | Pattern-based file search | "Find all test files" |
| `get_file_outline` | Get class/function structure | "Show me the structure of main.py" |
Using Different Ollama Models:
uv run python main.py chat --provider ollama --model llama3:8b

Using OpenAI Models:
export OPENAI_API_KEY="your-key-here"
uv run python main.py chat --provider openai --model gpt-4o

your-project/
├── .cortex/ # Cortex metadata (auto-created)
│ ├── chroma/ # Vector store
│ └── indexing/
│ └── state.db # File hash state
├── your-code/
└── ...
- Indexing Speed: ~100-500 files/second (depends on file size)
- Query Latency:
- Local (Ollama): 2-5 seconds
- OpenAI: 1-3 seconds
- Memory Usage: ~200-500 MB (depends on project size)
- Incremental Updates: 10-100x faster than full re-indexing
Contributions are welcome! Please feel free to submit a Pull Request.
- LangChain for the agent framework
- Ollama for local LLM inference
- ChromaDB for vector storage
- Jedi for Python static analysis
- Rich for beautiful terminal UI
Built with ❤️ by developers, for developers
⭐ Star this repo if you find it useful!