# AI-Code-Brain: MCP Semantic Indexer
**AI-Code-Brain** is a high-performance Model Context Protocol (MCP) server that gives AI agents (like Claude and Cursor) "long-term memory" of your codebase. It uses Abstract Syntax Trees (ASTs) and vector embeddings to navigate and understand complex local projects.
Instead of simple keyword matching, this tool enables your AI to understand the **concepts** and **logic** within your source code.
---
## Features
* **Semantic Search:** Find code by *intent* (e.g., "how is the transport layer implemented?") rather than literal strings.
* **Symbol Resolution:** Instantly map class and function definitions across thousands of files using a persistent "Shadow Graph."
* **AST Parsing:** Uses `tree-sitter` for deep structural analysis, ensuring code is indexed by logical blocks rather than arbitrary lines.
* **Local-First & Private:** Embeddings are generated locally using `all-MiniLM-L6-v2`. No code is sent to third-party APIs for indexing.
* **Concurrency Safe:** Implements `asyncio.Lock` to prevent database corruption during parallel AI tool calls.
* **Crash-Resilient:** Structured JSON-RPC error handling ensures the AI client receives a valid error message instead of hanging.
---
## How it Works
The server acts as a bridge between your AI assistant and your local filesystem:
1. **Parser:** Traverses the directory to extract functions and classes using Tree-sitter.
2. **Embedder:** Converts code snippets into 384-dimensional vectors.
3. **Vector Store:** Manages storage and similarity search via **LanceDB**.
4. **Shadow Graph:** A lightweight JSON-based map for instant symbol-to-file lookups.
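The four stages can be sketched end to end. This toy version swaps in the stdlib `ast` module for tree-sitter and a hash-based pseudo-embedding for `all-MiniLM-L6-v2`, so only the data flow is representative, not the quality of the results:

```python
import ast
import hashlib
import math

SOURCE = '''
def connect_transport(host, port):
    """Open the network transport layer."""
    return (host, port)

class ShadowGraph:
    def lookup(self, name):
        return name
'''

def parse(source: str) -> list[tuple[str, int, str]]:
    # Stage 1: extract top-level functions/classes with name, line, and text.
    tree = ast.parse(source)
    return [
        (node.name, node.lineno, ast.get_source_segment(source, node))
        for node in tree.body
        if isinstance(node, (ast.FunctionDef, ast.ClassDef))
    ]

def embed(text: str, dim: int = 8) -> list[float]:
    # Stage 2 (toy): deterministic pseudo-embedding derived from a hash.
    digest = hashlib.sha256(text.encode()).digest()
    return [b / 255 for b in digest[:dim]]

def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Stages 3 and 4: a vector store plus a symbol -> line "Shadow Graph".
snippets = parse(SOURCE)
store = [(name, embed(code)) for name, _, code in snippets]
shadow_graph = {name: line for name, line, _ in snippets}

query_vec = embed("def connect_transport")
best = max(store, key=lambda item: cosine(query_vec, item[1]))
print(best[0], shadow_graph["connect_transport"])
```

In the real pipeline the embedder produces 384-dimensional vectors and LanceDB performs the similarity search; the Shadow Graph lookup stays a plain dictionary read, which is why symbol resolution is instant.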
---
## 🛠️ Tech Stack
* **Runtime:** Python 3.11+
* **Vector DB:** [LanceDB](https://lancedb.com/)
* **Embeddings:** `sentence-transformers` (all-MiniLM-L6-v2)
* **Code Analysis:** `tree-sitter` / `tree-sitter-python`
* **Protocol:** MCP (Model Context Protocol)
---
## 🚦 Getting Started
### 1. Prerequisites
* Python 3.11 or higher
* Conda (Miniconda/Anaconda)
### 2. Installation
```bash
# Clone the repository
git clone https://github.com/devu729/mcp.git
cd mcp
# Create the environment
conda env create -f environment.yml
conda activate mcp
```

### 3. Run the Server

To verify that the server starts and the embedder warms up:

```bash
python -m src.server
```

### 4. Connect Your MCP Client

Add the following to your MCP settings:

* **Name:** `Code-Brain`
* **Type:** `command`
* **Command:** `python -m src.server`
* **CWD:** `C:\path\to\your\mcp-code-analyzer` (use your absolute path)
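If your client takes JSON configuration (in the style of Claude Desktop's `mcpServers` block), the equivalent entry might look like the sketch below. The `code-brain` key and the path are illustrative, and the exact field names vary by client:

```json
{
  "mcpServers": {
    "code-brain": {
      "command": "python",
      "args": ["-m", "src.server"],
      "cwd": "C:\\path\\to\\your\\mcp-code-analyzer"
    }
  }
}
```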
---
## Available Tools

| Tool | Input | Description |
|---|---|---|
| `index_codebase` | `path` | Scans and indexes a directory to enable semantic search. |
| `semantic_search` | `query` | Finds code snippets based on meaning/intent. |
| `get_symbol_location` | `name` | Returns the file and line number for a specific class/function. |
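Under the hood, MCP clients invoke these tools via JSON-RPC `tools/call` requests. A `semantic_search` call would look roughly like this (the `id` value is arbitrary; the argument name follows the table above):

```json
{
  "jsonrpc": "2.0",
  "id": 7,
  "method": "tools/call",
  "params": {
    "name": "semantic_search",
    "arguments": { "query": "how is the transport layer implemented?" }
  }
}
```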
---
## Project Structure

```
mcp/
├── .lancedb/          # Local vector database (Git-ignored)
├── src/
│   ├── indexer/       # Core logic (Parser, Embedder, Store)
│   ├── protocol/      # MCP/JSON-RPC implementation
│   └── server.py      # Entry point
├── environment.yml    # Dependency list
└── README.md          # You are here!
```
---
## License

MIT