Code Context Manager MCP Server

A Model Context Protocol (MCP) server that provides code context management and semantic search for software development. It indexes codebases and answers natural language queries with relevant code snippets, functions, and classes across Python, JavaScript, TypeScript, and SQL files.

Disclaimer: This project was developed with assistance from vibe-coding/agents.

🎯 Purpose

For AI Agents: Provides rich contextual information about codebases to enable more accurate and relevant code generation, debugging, and feature development.

For Developers: Offers powerful semantic search and code discovery capabilities that go beyond simple text search or file browsing.

🏗️ Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   MCP Clients   │    │  MCP Server     │    │  Vector Store   │
│   (Claude, etc) │◄──►│  (This Tool)    │◄──►│    (Redis)      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │   SQLite DB     │
                       │   (Metadata)    │
                       └─────────────────┘

Core Components

Code Parser: Parses source code into syntax trees, using the ast module for Python, esprima for JavaScript/TypeScript, and sqlparse for SQL
Vector Store: Stores semantic embeddings in Redis with RediSearch for similarity search
Context Manager: Orchestrates indexing and retrieval operations
MCP Interface: Provides tools for integration with MCP clients

🚀 Key Features

Multi-Language Code Indexing

Supported Languages: Python (AST-based), JavaScript, TypeScript (esprima-based), SQL (sqlparse-based)
Entity Extraction: Functions, classes, imports, exports with precise location data
SQL Support: Tables, views, functions, procedures, and complex queries
Semantic Embeddings: Uses sentence-transformers for understanding code semantics

Intelligent Semantic Search

Natural Language Queries: Find code by describing what you need
Similarity Scoring: Cosine similarity ranking with HNSW indexing
Multi-level Results: Returns both files and code entities

Code Context Management

Dependency Analysis: Tracks imports and module relationships
File Metadata: Size, modification time, entity counts
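Incremental Updates: Re-index individual changed files on demand with index_file instead of re-indexing the whole project (re-indexing is never triggered automatically)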

MCP Integration

Stdio Protocol: Full MCP server implementation
Tool-based Interface: 8 MCP tools for comprehensive code operations
JSON Responses: Structured data for easy consumption
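
As a rough illustration of what a client sends over stdio, the sketch below builds a JSON-RPC `tools/call` request and reads the text payload from a response. The `tools/call` method and the newline-delimited framing come from the MCP specification; the helper names `build_tool_call` and `parse_tool_result` are hypothetical, and the initialization handshake that a real session performs first is omitted.

```python
import json
from itertools import count

_request_ids = count(1)  # monotonically increasing JSON-RPC request ids


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as one line of JSON."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    # json.dumps emits no newlines by default, which suits line-delimited stdio
    return json.dumps(message)


def parse_tool_result(raw_line: str) -> list:
    """Return the text payloads carried by a `tools/call` response line."""
    response = json.loads(raw_line)
    if "error" in response:
        raise RuntimeError(response["error"].get("message", "unknown error"))
    return [
        item["text"]
        for item in response["result"]["content"]
        if item.get("type") == "text"
    ]
```

In practice an MCP client library handles this framing; the sketch only shows where the tool name and its parameters end up in the message.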

🛠️ Available MCP Tools

1. `index_directory`

Purpose: Index all supported files in a directory for semantic search

Parameters:

directory (required): Root directory path (e.g., ".")
patterns (optional): File patterns to include (e.g., ["*.py", "*.js"])
ignore_patterns (optional): Patterns to ignore (e.g., ["venv/*", "__pycache__/*"])

Example:

{
  "directory": ".",
  "ignore_patterns": ["venv/*", "node_modules/*"]
}

Output: JSON with status and list of indexed files with entity counts

2. `index_file`

Purpose: Index a single file for semantic search

Parameters:

file_path (required): Path to the file to index

Example:

{
  "file_path": "src/main.py"
}

Output: JSON with indexing status and entity count

3. `search_code_context`

Purpose: Perform semantic search across indexed code using natural language

Parameters:

query (required): Natural language description of needed code
max_files (optional): Maximum files to return (default: 5)
max_entities (optional): Maximum code entities to return (default: 10)

Example:

{
  "query": "function to parse JSON data",
  "max_entities": 15
}

Output: Ranked list of relevant files and entities with similarity scores

4. `read_file`

Purpose: Read the complete content of a file

Parameters:

file_path (required): Path to the file

Output: JSON with file path and content

5. `list_directory_contents`

Purpose: List files and subdirectories in a path

Parameters:

path (optional): Directory path (default: ".")
recursive (optional): Include subdirectories (default: false)

Output: JSON with directory contents

6. `get_file_dependencies`

Purpose: Get import dependencies for an indexed file

Parameters:

file_path (required): Path to the file

Output: JSON with dependencies and metadata

7. `remove_indexed_file`

Purpose: Remove indexed data for a specific file

Parameters:

file_path (required): Path to the file

Output: JSON with removal status

8. `clear_all_indexed_data`

Purpose: Clear all indexed data from Redis and SQLite

Output: JSON with operation status

🎯 Use Cases

1. Code Discovery & Exploration

Find functions handling specific tasks (e.g., "user authentication")
Locate class definitions and their relationships
Discover similar code patterns across the codebase

2. Development Assistance

Get context for implementing new features
Find examples of error handling or data processing
Understand existing API integrations and patterns

3. Code Review & Maintenance

Identify code duplication and inconsistencies
Find related functions that might need updates
Understand the impact of code changes

4. Onboarding & Learning

Explore unfamiliar codebases with natural language queries
Find relevant code examples for learning
Understand project architecture and dependencies

5. Workflow Integration

Before development: Index relevant directories
During coding: Search for similar implementations
After changes: Re-index modified files for updated context
Regular maintenance: Update indices as codebase evolves

📊 Technical Details

Data Storage

Redis (Vector Database)

Keys: file:{hash} for file embeddings, entity:{file_hash}:{name}:{line} for code entities
Index: code_index with HNSW algorithm for cosine similarity search
Embedding Model: all-MiniLM-L6-v2 (384-dimensional vectors)
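
The key layout above can be sketched as follows. The hash function is an assumption made for illustration (a truncated SHA-256 of the file path); this README does not say how the server derives `{hash}`, and storing vectors as float32 is likewise assumed.

```python
import hashlib

EMBEDDING_DIM = 384  # output size of all-MiniLM-L6-v2
BYTES_PER_VECTOR = EMBEDDING_DIM * 4  # assumes float32 storage


def file_hash(file_path: str) -> str:
    """Hypothetical hash: SHA-256 of the path, truncated to 16 hex characters."""
    return hashlib.sha256(file_path.encode("utf-8")).hexdigest()[:16]


def file_key(file_path: str) -> str:
    """Build a key in the file:{hash} layout."""
    return f"file:{file_hash(file_path)}"


def entity_key(file_path: str, name: str, line: int) -> str:
    """Build a key in the entity:{file_hash}:{name}:{line} layout."""
    return f"entity:{file_hash(file_path)}:{name}:{line}"
```

Under these assumptions each embedding occupies 1536 bytes before any Redis or index overhead, which gives a lower bound when estimating memory for a large codebase.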

SQLite (Metadata Database)

CREATE TABLE indexed_files (
    file_path TEXT PRIMARY KEY,
    file_hash TEXT,
    last_indexed TIMESTAMP,
    last_modified REAL,
    entity_count INTEGER,
    imports TEXT
);
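
A minimal sketch of how a row in this table could be written and read back with the standard sqlite3 module. The helper names are hypothetical, and serializing `imports` as a JSON array is an assumption; the schema only says the column is TEXT.

```python
import json
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS indexed_files (
    file_path TEXT PRIMARY KEY,
    file_hash TEXT,
    last_indexed TIMESTAMP,
    last_modified REAL,
    entity_count INTEGER,
    imports TEXT
)
"""


def record_indexed_file(conn, file_path, file_hash, last_modified, entity_count, imports):
    """Insert or overwrite the metadata row for one file."""
    conn.execute(
        "INSERT OR REPLACE INTO indexed_files "
        "(file_path, file_hash, last_indexed, last_modified, entity_count, imports) "
        "VALUES (?, ?, CURRENT_TIMESTAMP, ?, ?, ?)",
        (file_path, file_hash, last_modified, entity_count, json.dumps(imports)),
    )
    conn.commit()


def get_imports(conn, file_path):
    """Return the stored import list for a file, or an empty list if not indexed."""
    row = conn.execute(
        "SELECT imports FROM indexed_files WHERE file_path = ?", (file_path,)
    ).fetchone()
    return json.loads(row[0]) if row else []
```

Because `file_path` is the primary key, re-indexing the same file replaces its row rather than adding a second one.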

Parsing Details

Python Files

Uses ast module for syntax tree analysis
Extracts: functions, classes, imports, docstrings
Handles nested functions and complex expressions
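
The extraction described above can be approximated with a short walk over the syntax tree. This is a sketch using only the standard ast module, not the server's actual parser; the function name `extract_entities` and the shape of the returned dictionary are illustrative assumptions.

```python
import ast


def extract_entities(source: str) -> dict:
    """Collect functions, classes, and imports from Python source text."""
    tree = ast.parse(source)
    entities = {"functions": [], "classes": [], "imports": []}
    # ast.walk visits every node, so nested functions are found as well
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entities["functions"].append(
                {"name": node.name, "line": node.lineno, "docstring": ast.get_docstring(node)}
            )
        elif isinstance(node, ast.ClassDef):
            entities["classes"].append(
                {"name": node.name, "line": node.lineno, "docstring": ast.get_docstring(node)}
            )
        elif isinstance(node, ast.Import):
            entities["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports have a level > 0 and possibly no module name
            entities["imports"].append("." * node.level + (node.module or ""))
    return entities
```

`ast.walk` does not guarantee a visiting order, so callers that need entities in source order should sort them by `line`.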

JavaScript/TypeScript Files

Uses esprima library for AST parsing
Extracts: functions, classes, imports, exports
Supports modern JS features (arrow functions, async/await)

SQL Files

Uses sqlparse library for SQL statement parsing
Extracts: tables, views, functions, procedures, and complex queries
Supports CREATE, SELECT, and other SQL statement types

Search Algorithm

Query embedding generation using sentence-transformers
KNN search with configurable top-k results
Cosine similarity scoring (0-1 scale)
Combined file and entity ranking
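
To make the ranking step concrete, here is an exact brute-force version of top-k cosine search in plain Python. It is a stand-in for illustration only: the server delegates this step to RediSearch's approximate HNSW index, and the names `cosine_similarity` and `top_k` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat zero vectors as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, indexed, k=5):
    """Return the k (key, score) pairs most similar to query_vec, best first."""
    scored = [(key, cosine_similarity(query_vec, vec)) for key, vec in indexed.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]
```

Brute force compares the query against every stored vector, so its cost grows linearly with the index size; HNSW trades a small amount of recall for much faster lookups. Raw cosine similarity can in general fall anywhere between -1 and 1.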

🔧 Installation & Setup

Prerequisites

Python 3.9+
Redis server with RediSearch module
MCP client (e.g., Claude Desktop with MCP support)

Installation Steps

Clone and setup:

git clone <repository-url>
cd code-context-manager-mcp
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate

Install dependencies:

pip install -r requirements.txt
# Optional: for JS/TS support
pip install esprima
# Optional: for SQL support
pip install sqlparse

Start Redis:

# Using Docker
docker run -d --name redis-stack -p 6379:6379 redis/redis-stack:latest

# Or using local Redis with RediSearch
redis-server --loadmodule /path/to/redisearch.so

Configure MCP client (e.g., Claude Desktop):

{
  "mcpServers": {
    "code-context-manager": {
      "command": "python",
      "args": ["/full/path/to/code_context_mcp.py"],
      "env": {
        "REDIS_URL": "redis://localhost:6379"
      }
    }
  }
}

📁 Codebase Indexing Process

How Indexing Works

The Code Context Manager does not automatically index or update your codebase. All indexing operations require explicit user commands through MCP tools.

Indexing Methods

1. Directory Indexing (`index_directory`)

Purpose: Index all supported files in a directory tree
Process:
1. Scans directory recursively
2. Applies include/exclude patterns
3. Parses each file using appropriate language parser
4. Generates embeddings for files and code entities
5. Stores data in Redis (vectors) and SQLite (metadata)
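
Steps 1 and 2 can be sketched with os.walk and fnmatch. The matching rules here are assumptions: include patterns are tested against the file name and ignore patterns against the path relative to the root, which may differ from how the server interprets `patterns` and `ignore_patterns`. The default pattern list is also illustrative.

```python
import fnmatch
import os


def find_files(directory, patterns=("*.py", "*.js", "*.ts", "*.sql"), ignore_patterns=()):
    """Return relative paths of files under directory that pass both filters."""
    matches = []
    for root, _dirs, files in os.walk(directory):
        for name in files:
            rel_path = os.path.relpath(os.path.join(root, name), directory)
            rel_path = rel_path.replace(os.sep, "/")  # normalize for pattern matching
            # Skip anything matched by an ignore pattern, e.g. "venv/*"
            if any(fnmatch.fnmatch(rel_path, pat) for pat in ignore_patterns):
                continue
            # Keep the file only if its name matches an include pattern
            if any(fnmatch.fnmatch(name, pat) for pat in patterns):
                matches.append(rel_path)
    return sorted(matches)
```

With fnmatch, `*` also matches path separators, so `venv/*` excludes everything below `venv`. A pattern such as `__pycache__/*` only matches at the root in this sketch; `*/__pycache__/*` would be needed for nested directories.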

2. Single File Indexing (`index_file`)

Purpose: Index individual files
Process: Same as directory indexing but for one file

When to Re-index

Manual Re-indexing Required When:

Adding new files to the project
Making significant changes to existing code
Adding new dependencies or imports
Changing file structure or organization

The system does NOT:

Automatically detect file changes
Re-index modified files
Monitor the filesystem for updates
Update indices when code is edited

Change Detection

While the system stores file modification times and hashes, it does not use them for automatic updates. Users must explicitly re-index files after changes.
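
Since the stored values are not acted on, a client-side check can decide which files to pass to index_file. The sketch below is not part of the server; it assumes `last_modified` holds the file's modification time in epoch seconds at indexing time, and `find_stale_files` is a hypothetical helper.

```python
import os
import sqlite3


def find_stale_files(conn):
    """List (file_path, reason) pairs for indexed files that need attention."""
    stale = []
    rows = conn.execute("SELECT file_path, last_modified FROM indexed_files").fetchall()
    for file_path, indexed_mtime in rows:
        if not os.path.exists(file_path):
            # The file was deleted or moved after it was indexed
            stale.append((file_path, "missing"))
        elif os.path.getmtime(file_path) > (indexed_mtime or 0):
            # The file changed on disk after the recorded modification time
            stale.append((file_path, "modified"))
    return stale
```

Files reported as "modified" are candidates for index_file, and files reported as "missing" are candidates for remove_indexed_file. Newly created files never appear in the table, so they still need to be indexed explicitly.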

Best Practices

Initial Setup

Index the entire project (the "tool" field names the MCP tool; the other fields are its parameters):
{
  "tool": "index_directory",
  "directory": ".",
  "ignore_patterns": ["venv/*", "node_modules/*", "__pycache__/*"]
}

After Code Changes

Re-index a specific changed file:
{
  "tool": "index_file",
  "file_path": "src/new_feature.py"
}

Or re-index an entire directory:
{
  "tool": "index_directory",
  "directory": "src"
}

Regular Maintenance

Re-index after major refactoring
Update indices before complex development tasks
Clean up with clear_all_indexed_data if needed

Performance Notes

Initial indexing: May take time for large codebases
Incremental updates: Use index_file for single changes
Memory usage: Scales with codebase size
Search quality: Improves with comprehensive indexing

🎬 Example Workflows

Initial Setup

1. Start MCP client with server configured
2. Call index_directory with project path
3. Wait for indexing completion
4. Ready for semantic code search

Development Workflow

1. Developer: "I need to add user authentication"
2. Call search_code_context("user login authentication system")
3. Receive relevant functions, classes, and files
4. Implement using discovered patterns

Code Exploration

1. Need to understand payment processing
2. Call search_code_context("payment processing integration")
3. Get relevant code entities and files
4. Study existing implementation patterns

⚡ Performance Considerations

Indexing Performance

Initial Indexing: Large projects may take 5-15 minutes
Incremental Updates: Changed files reindex in seconds
Memory Usage: ~50-100MB per 1000 files (varies by code complexity)

Search Performance

Semantic Search: Sub-second response for most queries
Context Retrieval: Optimized ranking algorithms
Redis Memory: Monitor usage with large codebases

Optimization Tips

Use specific file patterns to avoid unnecessary indexing
Regular cleanup of old project indices
Monitor Redis memory usage and configure appropriately

🔍 Understanding Results

Similarity Scores

0.8-1.0: Highly relevant, direct matches
0.6-0.8: Good relevance, related concepts
0.4-0.6: Moderate relevance, may be useful
<0.4: Low relevance, consider refining query
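
A client that post-processes results could map scores onto these bands with a small helper. The function name and the labels are illustrative, and treating each lower bound as inclusive is an assumption, since the ranges above share their endpoints.

```python
def relevance_label(score: float) -> str:
    """Map a similarity score to the relevance bands listed above."""
    if score >= 0.8:
        return "high"      # highly relevant, direct match
    if score >= 0.6:
        return "good"      # related concepts
    if score >= 0.4:
        return "moderate"  # may be useful
    return "low"           # consider refining the query
```

For example, a result scored 0.72 falls in the "good" band, while 0.35 suggests rephrasing the query.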

Context Quality Indicators

High Entity Count: Rich codebase with many components
Good Import/Export Mapping: Well-structured project
Recent Index Timestamps: Up-to-date information

🛡️ Security & Best Practices

Data Security

Server only reads code files, never executes them
Redis should be secured if containing sensitive code
Consider network isolation in production environments
File system access limited to specified directories

Development Best Practices

Regular reindexing of active development areas
Monitor disk space for SQLite and Redis storage
Use appropriate ignore patterns for large files/directories
Test search queries to validate context quality

🔧 Integration with Other MCP Servers

This server complements your existing MCP infrastructure:

GitHub MCP: Provides repository history and PR context
MySQL MCP: Offers database schema context for data-related development
Web Search MCP: Finds external documentation for discovered libraries
Redis MCP: Enables direct Redis operations if needed

🐛 Troubleshooting

Common Issues

MCP Tool Errors:

Ensure correct tool names and parameters
Check that files are indexed before searching
Verify Redis connection is working

Redis Connection Failed:

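# Check Redis status (adjust host and port to match your REDIS_URL)
redis-cli -h localhost -p 6379 ping

# Start Redis if needed (the RediSearch module must be loaded; see Installation Steps)
redis-server --port 6379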

Missing Dependencies:

# Install core dependencies
pip install numpy sentence-transformers redis

# Install optional JS/TS support
pip install esprima

# Install optional SQL support
pip install sqlparse

Poor Search Results:

Use more descriptive queries
Ensure codebase is fully indexed (re-index after changes)
Check ignore patterns aren't excluding important files
Try different query formulations
Verify files were indexed with get_file_dependencies

Memory/Performance Issues:

Monitor Redis memory usage
Use selective indexing patterns
Clear old indices periodically

Debugging Commands

Check indexed files:

python -c "
import sqlite3
conn = sqlite3.connect('code_context.db')
cursor = conn.cursor()
cursor.execute('SELECT file_path, entity_count FROM indexed_files')
for row in cursor.fetchall():
    print(f'{row[0]}: {row[1]} entities')
conn.close()
"

Test Redis connection:

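python -c "
import redis
# Adjust the URL to match the REDIS_URL configured for the server
r = redis.from_url('redis://localhost:6379/0')
print('Redis ping:', r.ping())
try:
    # A RediSearch index is not a regular key, so r.exists() cannot detect it
    print('Index info:', r.ft('code_index').info())
except redis.exceptions.ResponseError:
    print('No index')
"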

Test search functionality:

python test_search.py

🚀 Future Enhancements

Planned Features

Support for additional languages (Go, Rust, Java)
Git integration for change tracking
Advanced dependency graph analysis
Custom embedding models for specialized domains
Real-time incremental indexing

Extension Points

Plugin system for custom language parsers
Alternative vector databases
Framework-specific context extractors
IDE and editor integrations

📝 Usage Guidelines

For developers and AI agents:

Index first: Always call index_directory or index_file before searching
Manual updates: Re-index files after making changes (no auto-updating)
Use natural language: Write descriptive queries for search_code_context
Check similarity scores: Higher scores (0.7+) indicate better matches
Combine tools: Use read_file and get_file_dependencies for detailed analysis
Regular maintenance: Update indices as your codebase evolves

Important: This system requires explicit indexing commands. It does not automatically detect or index file changes.

This MCP server provides semantic understanding of codebases, enabling context-aware development and intelligent code discovery.

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details on how to get started.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.

Repository: theraaz/code-context-manager-mcp
