A Model Context Protocol (MCP) server that provides code context management and semantic search for software development. It indexes codebases and enables natural language queries to find relevant code snippets, functions, and classes across Python, JavaScript, TypeScript, and SQL projects.
Disclaimer: This project was developed with assistance from vibe-coding/agents.
For AI Agents: Provides rich contextual information about codebases to enable more accurate and relevant code generation, debugging, and feature development.
For Developers: Offers powerful semantic search and code discovery capabilities that go beyond simple text search or file browsing.
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   MCP Clients   │    │   MCP Server    │    │  Vector Store   │
│  (Claude, etc)  │◄──►│   (This Tool)   │◄──►│     (Redis)     │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │    SQLite DB    │
                       │   (Metadata)    │
                       └─────────────────┘
- Code Parser: Parses Python with the built-in ast module, JavaScript/TypeScript with esprima, and SQL with sqlparse
- Vector Store: Stores semantic embeddings in Redis with RediSearch for similarity search
- Context Manager: Orchestrates indexing and retrieval operations
- MCP Interface: Provides tools for integration with MCP clients
- Supported Languages: Python (AST-based), JavaScript, TypeScript (esprima-based), SQL (sqlparse-based)
- Entity Extraction: Functions, classes, imports, exports with precise location data
- SQL Support: Tables, views, functions, procedures, and complex queries
- Semantic Embeddings: Uses sentence-transformers for understanding code semantics
- Natural Language Queries: Find code by describing what you need
- Similarity Scoring: Cosine similarity ranking with HNSW indexing
- Multi-level Results: Returns both files and code entities
- Dependency Analysis: Tracks imports and module relationships
- File Metadata: Size, modification time, entity counts
- Incremental Updates: Re-index individual changed files on demand with index_file (no automatic change detection)
- Stdio Protocol: Full MCP server implementation
- Tool-based Interface: 8 MCP tools for comprehensive code operations
- JSON Responses: Structured data for easy consumption
Purpose: Index all supported files in a directory for semantic search
Parameters:
- directory (required): Root directory path (e.g., ".")
- patterns (optional): File patterns to include (e.g., ["*.py", "*.js"])
- ignore_patterns (optional): Patterns to ignore (e.g., ["venv/*", "__pycache__/*"])
Example:
{
"directory": ".",
"ignore_patterns": ["venv/*", "node_modules/*"]
}
Output: JSON with status and list of indexed files with entity counts
Purpose: Index a single file for semantic search
Parameters:
- file_path (required): Path to the file to index
Example:
{
"file_path": "src/main.py"
}
Output: JSON with indexing status and entity count
Purpose: Perform semantic search across indexed code using natural language
Parameters:
- query (required): Natural language description of needed code
- max_files (optional): Maximum files to return (default: 5)
- max_entities (optional): Maximum code entities to return (default: 10)
Example:
{
"query": "function to parse JSON data",
"max_entities": 15
}
Output: Ranked list of relevant files and entities with similarity scores
Purpose: Read the complete content of a file
Parameters:
- file_path (required): Path to the file
Output: JSON with file path and content
Purpose: List files and subdirectories in a path
Parameters:
- path (optional): Directory path (default: ".")
- recursive (optional): Include subdirectories (default: false)
Output: JSON with directory contents
Purpose: Get import dependencies for an indexed file
Parameters:
- file_path (required): Path to the file
Output: JSON with dependencies and metadata
Purpose: Remove indexed data for a specific file
Parameters:
- file_path (required): Path to the file
Output: JSON with removal status
Purpose: Clear all indexed data from Redis and SQLite
Output: JSON with operation status
- Find functions handling specific tasks (e.g., "user authentication")
- Locate class definitions and their relationships
- Discover similar code patterns across the codebase
- Get context for implementing new features
- Find examples of error handling or data processing
- Understand existing API integrations and patterns
- Identify code duplication and inconsistencies
- Find related functions that might need updates
- Understand the impact of code changes
- Explore unfamiliar codebases with natural language queries
- Find relevant code examples for learning
- Understand project architecture and dependencies
- Before development: Index relevant directories
- During coding: Search for similar implementations
- After changes: Re-index modified files for updated context
- Regular maintenance: Update indices as codebase evolves
- Keys: file:{hash} for file embeddings, entity:{file_hash}:{name}:{line} for code entities
- Index: code_index with HNSW algorithm for cosine similarity search
- Embedding Model: all-MiniLM-L6-v2 (384-dimensional vectors)
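The key layout above can be sketched as two small helpers. This is only an illustration: the function names are hypothetical, and hashing the file path with SHA-256 is an assumption, since the README does not say how the hash is computed.

```python
import hashlib


def file_key(file_path: str) -> str:
    """Build a key of the form file:{hash}."""
    # Assumption: SHA-256 of the path; the server may hash content or use another digest.
    file_hash = hashlib.sha256(file_path.encode("utf-8")).hexdigest()
    return f"file:{file_hash}"


def entity_key(file_hash: str, name: str, line: int) -> str:
    """Build a key of the form entity:{file_hash}:{name}:{line}."""
    return f"entity:{file_hash}:{name}:{line}"


print(file_key("src/main.py"))
print(entity_key("abc123", "parse_config", 42))
```

Including the line number in the entity key keeps two entities with the same name in one file (for example, methods on different classes) from colliding.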
CREATE TABLE indexed_files (
file_path TEXT PRIMARY KEY,
file_hash TEXT,
last_indexed TIMESTAMP,
last_modified REAL,
entity_count INTEGER,
imports TEXT
);
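As a rough sketch of how a row in this table might be written and read back, the snippet below uses an in-memory database. The helper name record_indexed_file is hypothetical, and storing imports as a JSON-encoded list is an assumption; the schema only says the column is TEXT.

```python
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS indexed_files (
    file_path TEXT PRIMARY KEY,
    file_hash TEXT,
    last_indexed TIMESTAMP,
    last_modified REAL,
    entity_count INTEGER,
    imports TEXT
)
"""


def record_indexed_file(conn, file_path, file_hash, last_modified, entity_count, imports):
    """Insert or overwrite the metadata row for one file (hypothetical helper)."""
    conn.execute(
        "INSERT OR REPLACE INTO indexed_files VALUES (?, ?, CURRENT_TIMESTAMP, ?, ?, ?)",
        # Assumption: imports are serialized as a JSON list.
        (file_path, file_hash, last_modified, entity_count, json.dumps(imports)),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
record_indexed_file(conn, "src/main.py", "abc123", time.time(), 4, ["os", "json"])
row = conn.execute(
    "SELECT file_path, entity_count, imports FROM indexed_files"
).fetchone()
print(row)
```

Because file_path is the primary key, re-indexing the same file replaces its row instead of adding a duplicate.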
- Uses ast module for syntax tree analysis
- Extracts: functions, classes, imports, docstrings
- Handles nested functions and complex expressions
- Uses esprima library for AST parsing
- Extracts: functions, classes, imports, exports
- Supports modern JS features (arrow functions, async/await)
- Uses sqlparse library for SQL statement parsing
- Extracts: tables, views, functions, procedures, and complex queries
- Supports CREATE, SELECT, and other SQL statement types
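For the Python side, entity extraction with the standard ast module might look roughly like the sketch below. The extract_entities function and the dictionary shape it returns are illustrative assumptions, not the server's actual code.

```python
import ast


def extract_entities(source: str):
    """Collect functions, classes, and imports with their line numbers."""
    entities = []
    # ast.walk visits every node, so nested functions are found too.
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            entities.append({
                "kind": kind,
                "name": node.name,
                "line": node.lineno,
                "docstring": ast.get_docstring(node),
            })
        elif isinstance(node, ast.Import):
            for alias in node.names:
                entities.append({"kind": "import", "name": alias.name,
                                 "line": node.lineno, "docstring": None})
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x".
            entities.append({"kind": "import", "name": node.module or ".",
                             "line": node.lineno, "docstring": None})
    return sorted(entities, key=lambda entity: entity["line"])


sample = '''import os
from json import loads

class Parser:
    """Parse things."""
    def run(self):
        def helper():
            return 1
        return helper()
'''

entities = extract_entities(sample)
for entity in entities:
    print(entity["line"], entity["kind"], entity["name"])
```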
- Query embedding generation using sentence-transformers
- KNN search with configurable top-k results
- Cosine similarity scoring (0-1 scale)
- Combined file and entity ranking
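To make the ranking step concrete, here is a brute-force version of cosine-similarity top-k search in plain Python. It is a stand-in for illustration only: the server uses RediSearch with an HNSW index, which is approximate and far faster, and real embeddings have 384 dimensions instead of the toy 3-dimensional vectors used here. The function names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, candidates, k=10):
    """Score every candidate against the query and keep the k best."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in candidates.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "embeddings" keyed by entity name.
candidates = {
    "parse_json": [1.0, 0.0, 0.0],
    "load_yaml": [0.6, 0.8, 0.0],
    "send_email": [0.0, 0.0, 1.0],
}
results = top_k([1.0, 0.0, 0.0], candidates, k=2)
for name, score in results:
    print(f"{name}: {score:.2f}")
```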
- Python 3.9+
- Redis server with RediSearch module
- MCP client (e.g., Claude Desktop with MCP support)
- Clone and setup:
git clone <repository-url>
cd code-context-manager-mcp
python -m venv venv
source venv/bin/activate # Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
# Optional: for JS/TS support
pip install esprima
# Optional: for SQL support
pip install sqlparse
- Start Redis:
# Using Docker
docker run -d --name redis-stack -p 6379:6379 redis/redis-stack:latest
# Or using local Redis with RediSearch
redis-server --loadmodule /path/to/redisearch.so
- Configure MCP client (e.g., Claude Desktop):
{
"mcpServers": {
"code-context-manager": {
"command": "python",
"args": ["/full/path/to/code_context_mcp.py"],
"env": {
"REDIS_URL": "redis://localhost:6379"
}
}
}
}
The Code Context Manager does not automatically index or update your codebase. All indexing operations require explicit user commands through MCP tools.
- Purpose: Index all supported files in a directory tree
- Process:
- Scans directory recursively
- Applies include/exclude patterns
- Parses each file using appropriate language parser
- Generates embeddings for files and code entities
- Stores data in Redis (vectors) and SQLite (metadata)
- Purpose: Index individual files
- Process: Same as directory indexing but for one file
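The scan-and-filter part of the process above could be approximated as follows. This is a sketch under assumptions: the README does not specify the matching rules, so shell-style matching with fnmatch, the default pattern list, and the select_files name are all illustrative choices.

```python
import fnmatch
import os


def select_files(root, patterns=("*.py", "*.js", "*.ts", "*.sql"), ignore_patterns=()):
    """Walk root recursively and return relative paths that pass both filters."""
    selected = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            rel_path = os.path.relpath(os.path.join(dirpath, filename), root)
            rel_path = rel_path.replace(os.sep, "/")
            # Ignore patterns are checked against the path relative to root.
            if any(fnmatch.fnmatch(rel_path, pattern) for pattern in ignore_patterns):
                continue
            # Include patterns are checked against the bare file name.
            if any(fnmatch.fnmatch(filename, pattern) for pattern in patterns):
                selected.append(rel_path)
    return sorted(selected)


if __name__ == "__main__":
    print(select_files(".", ignore_patterns=["venv/*", "node_modules/*", "__pycache__/*"]))
```

With fnmatch, `*` also matches path separators, so a pattern like venv/* excludes everything under venv at any depth.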
Manual Re-indexing Required When:
- Adding new files to the project
- Making significant changes to existing code
- Adding new dependencies or imports
- Changing file structure or organization
The system does NOT:
- Automatically detect file changes
- Re-index modified files
- Monitor the filesystem for updates
- Update indices when code is edited
While the system stores file modification times and hashes, it does not use them for automatic updates. Users must explicitly re-index files after changes.
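Since the stored modification times are not acted on automatically, a small script of your own can use them to decide which files to pass to index_file. The sketch below assumes the metadata lives in code_context.db (the file name used in the troubleshooting commands) and that last_modified holds the value of os.path.getmtime at indexing time; both points should be checked against your installation. The find_stale_files name is hypothetical.

```python
import os
import sqlite3


def find_stale_files(db_path="code_context.db"):
    """Return (file_path, reason) pairs for indexed files that changed or vanished."""
    conn = sqlite3.connect(db_path)
    try:
        rows = conn.execute(
            "SELECT file_path, last_modified FROM indexed_files"
        ).fetchall()
    finally:
        conn.close()

    stale = []
    for file_path, last_modified in rows:
        if not os.path.exists(file_path):
            stale.append((file_path, "missing"))
        # Assumption: last_modified is an mtime in seconds since the epoch.
        elif os.path.getmtime(file_path) > (last_modified or 0):
            stale.append((file_path, "modified"))
    return stale


if __name__ == "__main__" and os.path.exists("code_context.db"):
    for path, reason in find_stale_files():
        print(f"{reason}: {path}")
```

Files reported as modified are candidates for index_file; files reported as missing are candidates for removal from the index.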
// Index entire project
{
"tool": "index_directory",
"directory": ".",
"ignore_patterns": ["venv/*", "node_modules/*", "__pycache__/*"]
}
// Re-index specific changed file
{
"tool": "index_file",
"file_path": "src/new_feature.py"
}
// Or re-index entire directory
{
"tool": "index_directory",
"directory": "src"
}
- Re-index after major refactoring
- Update indices before complex development tasks
- Clean up with clear_all_indexed_data if needed
- Initial indexing: May take time for large codebases
- Incremental updates: Use index_file for single changes
- Memory usage: Scales with codebase size
- Search quality: Improves with comprehensive indexing
1. Start MCP client with server configured
2. Call index_directory with project path
3. Wait for indexing completion
4. Ready for semantic code search
1. Developer: "I need to add user authentication"
2. Call search_code_context("user login authentication system")
3. Receive relevant functions, classes, and files
4. Implement using discovered patterns
1. Need to understand payment processing
2. Call search_code_context("payment processing integration")
3. Get relevant code entities and files
4. Study existing implementation patterns
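In the workflows above, each "Call ..." step is a tool invocation that the MCP client sends to the server as a JSON-RPC 2.0 tools/call request. Clients such as Claude Desktop build these messages for you; the sketch below only shows their shape. The tool_call_request helper is hypothetical, while the tool and parameter names come from the tool descriptions earlier in this README.

```python
import itertools
import json

_request_ids = itertools.count(1)


def tool_call_request(name: str, arguments: dict) -> str:
    """Serialize one MCP tool invocation as a JSON-RPC 2.0 request."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })


# Workflow: index the project, then search it.
print(tool_call_request("index_directory", {"directory": "."}))
print(tool_call_request(
    "search_code_context",
    {"query": "user login authentication system", "max_entities": 15},
))
```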
- Initial Indexing: Large projects may take 5-15 minutes
- Incremental Updates: Changed files reindex in seconds
- Memory Usage: ~50-100MB per 1000 files (varies by code complexity)
- Semantic Search: Sub-second response for most queries
- Context Retrieval: Optimized ranking algorithms
- Redis Memory: Monitor usage with large codebases
- Use specific file patterns to avoid unnecessary indexing
- Regular cleanup of old project indices
- Monitor Redis memory usage and configure appropriately
- 0.8-1.0: Highly relevant, direct matches
- 0.6-0.8: Good relevance, related concepts
- 0.4-0.6: Moderate relevance, may be useful
- <0.4: Low relevance, consider refining query
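When post-processing search results in a script, the score bands above can be mapped to labels with a small helper like this one. The function name is hypothetical, and treating each boundary value (0.8, 0.6, 0.4) as part of the higher band is an assumption, since the listed ranges overlap at their edges.

```python
def relevance_label(score: float) -> str:
    """Map a similarity score to the relevance bands listed above."""
    if score >= 0.8:
        return "highly relevant"
    if score >= 0.6:
        return "good relevance"
    if score >= 0.4:
        return "moderate relevance"
    return "low relevance"


for score in (0.91, 0.72, 0.45, 0.2):
    print(f"{score:.2f}: {relevance_label(score)}")
```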
- High Entity Count: Rich codebase with many components
- Good Import/Export Mapping: Well-structured project
- Recent Index Timestamps: Up-to-date information
- Server only reads code files, never executes them
- Redis should be secured if containing sensitive code
- Consider network isolation in production environments
- File system access limited to specified directories
- Regular reindexing of active development areas
- Monitor disk space for SQLite and Redis storage
- Use appropriate ignore patterns for large files/directories
- Test search queries to validate context quality
This server complements your existing MCP infrastructure:
- GitHub MCP: Provides repository history and PR context
- MySQL MCP: Offers database schema context for data-related development
- Web Search MCP: Finds external documentation for discovered libraries
- Redis MCP: Enables direct Redis operations if needed
MCP Tool Errors:
- Ensure correct tool names and parameters
- Check that files are indexed before searching
- Verify Redis connection is working
Redis Connection Failed:
# Check Redis status (use the host and port from your REDIS_URL)
redis-cli -h localhost -p 6379 ping
# Start Redis if needed
redis-server --port 6379
Missing Dependencies:
# Install core dependencies
pip install numpy sentence-transformers redis
# Install optional JS/TS support
pip install esprima
# Install optional SQL support
pip install sqlparse
Poor Search Results:
- Use more descriptive queries
- Ensure codebase is fully indexed (re-index after changes)
- Check ignore patterns aren't excluding important files
- Try different query formulations
- Verify files were indexed with get_file_dependencies
Memory/Performance Issues:
- Monitor Redis memory usage
- Use selective indexing patterns
- Clear old indices periodically
Check indexed files:
python -c "
import sqlite3
conn = sqlite3.connect('code_context.db')
cursor = conn.cursor()
cursor.execute('SELECT file_path, entity_count FROM indexed_files')
for row in cursor.fetchall():
    print(f'{row[0]}: {row[1]} entities')
conn.close()
"
Test Redis connection:
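python -c "
import redis
# Use the same URL as REDIS_URL in your MCP client configuration
r = redis.from_url('redis://localhost:6379/0')
print('Redis ping:', r.ping())
# A RediSearch index is not a regular key, so query it directly instead of using EXISTS
try:
    print('Index info:', r.ft('code_index').info())
except redis.exceptions.ResponseError:
    print('No index')
"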
Test search functionality:
python test_search.py
- Support for additional languages (Go, Rust, Java)
- Git integration for change tracking
- Advanced dependency graph analysis
- Custom embedding models for specialized domains
- Real-time incremental indexing
- Plugin system for custom language parsers
- Alternative vector databases
- Framework-specific context extractors
- IDE and editor integrations
For developers and AI agents:
- Index first: Always call index_directory or index_file before searching
- Manual updates: Re-index files after making changes (no auto-updating)
- Use natural language: Write descriptive queries for search_code_context
- Check similarity scores: Higher scores (0.7+) indicate better matches
- Combine tools: Use read_file and get_file_dependencies for detailed analysis
- Regular maintenance: Update indices as your codebase evolves
Important: This system requires explicit indexing commands. It does not automatically detect or index file changes.
This MCP server provides semantic understanding of codebases, enabling context-aware development and intelligent code discovery.
We welcome contributions! Please see our Contributing Guide for details on how to get started.
This project is licensed under the MIT License - see the LICENSE file for details.