theraaz/code-context-manager-mcp

Code Context Manager MCP Server

License: MIT Python 3.9+

A Model Context Protocol (MCP) server that provides code context management and semantic search for software development. It indexes codebases and answers natural language queries with relevant code snippets, functions, and classes across Python, JavaScript, TypeScript, and SQL projects.

Disclaimer: This project was developed with assistance from vibe-coding/agents.

🎯 Purpose

For AI Agents: Provides rich contextual information about codebases to enable more accurate and relevant code generation, debugging, and feature development.

For Developers: Offers powerful semantic search and code discovery capabilities that go beyond simple text search or file browsing.

🏗️ Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│   MCP Clients   │    │  MCP Server     │    │  Vector Store   │
│   (Claude, etc) │◄──►│  (This Tool)    │◄──►│    (Redis)      │
└─────────────────┘    └─────────────────┘    └─────────────────┘
                                │
                                ▼
                       ┌─────────────────┐
                       │   SQLite DB     │
                       │   (Metadata)    │
                       └─────────────────┘

Core Components

  1. Code Parser: Analyzes source code using the ast module for Python, esprima for JavaScript/TypeScript, and sqlparse for SQL
  2. Vector Store: Stores semantic embeddings in Redis with RediSearch for similarity search
  3. Context Manager: Orchestrates indexing and retrieval operations
  4. MCP Interface: Provides tools for integration with MCP clients

🚀 Key Features

Multi-Language Code Indexing

  • Supported Languages: Python (AST-based), JavaScript, TypeScript (esprima-based), SQL (sqlparse-based)
  • Entity Extraction: Functions, classes, imports, exports with precise location data
  • SQL Support: Tables, views, functions, procedures, and complex queries
  • Semantic Embeddings: Uses sentence-transformers for understanding code semantics

Intelligent Semantic Search

  • Natural Language Queries: Find code by describing what you need
  • Similarity Scoring: Cosine similarity ranking with HNSW indexing
  • Multi-level Results: Returns both files and code entities

Code Context Management

  • Dependency Analysis: Tracks imports and module relationships
  • File Metadata: Size, modification time, entity counts
  • Incremental Updates: Re-index individual changed files on demand with index_file (changes are not detected automatically)

MCP Integration

  • Stdio Protocol: Full MCP server implementation
  • Tool-based Interface: 8 MCP tools for comprehensive code operations
  • JSON Responses: Structured data for easy consumption

🛠️ Available MCP Tools

1. index_directory

Purpose: Index all supported files in a directory for semantic search

Parameters:

  • directory (required): Root directory path (e.g., ".")
  • patterns (optional): File patterns to include (e.g., ["*.py", "*.js"])
  • ignore_patterns (optional): Patterns to ignore (e.g., ["venv/*", "__pycache__/*"])

Example:

{
  "directory": ".",
  "ignore_patterns": ["venv/*", "node_modules/*"]
}

Output: JSON with status and list of indexed files with entity counts

2. index_file

Purpose: Index a single file for semantic search

Parameters:

  • file_path (required): Path to the file to index

Example:

{
  "file_path": "src/main.py"
}

Output: JSON with indexing status and entity count

3. search_code_context

Purpose: Perform semantic search across indexed code using natural language

Parameters:

  • query (required): Natural language description of needed code
  • max_files (optional): Maximum files to return (default: 5)
  • max_entities (optional): Maximum code entities to return (default: 10)

Example:

{
  "query": "function to parse JSON data",
  "max_entities": 15
}

Output: Ranked list of relevant files and entities with similarity scores

4. read_file

Purpose: Read the complete content of a file

Parameters:

  • file_path (required): Path to the file

Output: JSON with file path and content

5. list_directory_contents

Purpose: List files and subdirectories in a path

Parameters:

  • path (optional): Directory path (default: ".")
  • recursive (optional): Include subdirectories (default: false)

Output: JSON with directory contents

6. get_file_dependencies

Purpose: Get import dependencies for an indexed file

Parameters:

  • file_path (required): Path to the file

Output: JSON with dependencies and metadata

7. remove_indexed_file

Purpose: Remove indexed data for a specific file

Parameters:

  • file_path (required): Path to the file

Output: JSON with removal status

8. clear_all_indexed_data

Purpose: Clear all indexed data from Redis and SQLite

Output: JSON with operation status

🎯 Use Cases

1. Code Discovery & Exploration

  • Find functions handling specific tasks (e.g., "user authentication")
  • Locate class definitions and their relationships
  • Discover similar code patterns across the codebase

2. Development Assistance

  • Get context for implementing new features
  • Find examples of error handling or data processing
  • Understand existing API integrations and patterns

3. Code Review & Maintenance

  • Identify code duplication and inconsistencies
  • Find related functions that might need updates
  • Understand the impact of code changes

4. Onboarding & Learning

  • Explore unfamiliar codebases with natural language queries
  • Find relevant code examples for learning
  • Understand project architecture and dependencies

5. Workflow Integration

  • Before development: Index relevant directories
  • During coding: Search for similar implementations
  • After changes: Re-index modified files for updated context
  • Regular maintenance: Update indices as codebase evolves

📊 Technical Details

Data Storage

Redis (Vector Database)

  • Keys: file:{hash} for file embeddings, entity:{file_hash}:{name}:{line} for code entities
  • Index: code_index with HNSW algorithm for cosine similarity search
  • Embedding Model: all-MiniLM-L6-v2 (384-dimensional vectors)
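
To make the key layout above concrete, here is a minimal sketch of how such keys could be built and how much raw vector data they imply. The helper names, the choice of SHA-256 over the file path, the 16-character truncation, and float32 storage are all assumptions for illustration; the README does not say how the server actually computes its hashes or encodes vectors.

```python
import hashlib

EMBEDDING_DIM = 384       # all-MiniLM-L6-v2 output size, as listed above
BYTES_PER_FLOAT32 = 4     # assumption: vectors are stored as float32


def short_hash(text: str) -> str:
    """Hypothetical hash helper: truncated SHA-256 of the given text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()[:16]


def file_key(file_path: str) -> str:
    """Build a key in the documented file:{hash} shape."""
    return f"file:{short_hash(file_path)}"


def entity_key(file_path: str, name: str, line: int) -> str:
    """Build a key in the documented entity:{file_hash}:{name}:{line} shape."""
    return f"entity:{short_hash(file_path)}:{name}:{line}"


def raw_vector_bytes(vector_count: int) -> int:
    """Bytes needed for the raw vectors alone, excluding HNSW and Redis overhead."""
    return vector_count * EMBEDDING_DIM * BYTES_PER_FLOAT32


example_file_key = file_key("src/main.py")
example_entity_key = entity_key("src/main.py", "parse", 12)
```

Under these assumptions a single embedding occupies 384 × 4 = 1536 bytes before any index overhead, which gives a lower bound when estimating Redis memory.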

SQLite (Metadata Database)

CREATE TABLE indexed_files (
    file_path TEXT PRIMARY KEY,
    file_hash TEXT,
    last_indexed TIMESTAMP,
    last_modified REAL,
    entity_count INTEGER,
    imports TEXT
);
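
As a rough illustration of how a row in this table could be written and refreshed on re-indexing, the sketch below uses Python's built-in sqlite3 module against an in-memory database. The `record_indexed_file` helper and the decision to serialize `imports` as a JSON string are assumptions; the schema only says the column is TEXT.

```python
import json
import sqlite3
import time

SCHEMA = """
CREATE TABLE IF NOT EXISTS indexed_files (
    file_path TEXT PRIMARY KEY,
    file_hash TEXT,
    last_indexed TIMESTAMP,
    last_modified REAL,
    entity_count INTEGER,
    imports TEXT
)
"""


def record_indexed_file(conn, file_path, file_hash, last_modified, entity_count, imports):
    """Hypothetical helper: insert or overwrite the metadata row for one file."""
    # INSERT OR REPLACE keeps exactly one row per file_path (the primary key).
    conn.execute(
        "INSERT OR REPLACE INTO indexed_files "
        "(file_path, file_hash, last_indexed, last_modified, entity_count, imports) "
        "VALUES (?, ?, CURRENT_TIMESTAMP, ?, ?, ?)",
        (file_path, file_hash, last_modified, entity_count, json.dumps(imports)),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")  # in-memory database, for demonstration only
conn.execute(SCHEMA)
record_indexed_file(conn, "src/main.py", "abc123", time.time(), 4, ["os", "json"])
# Indexing the same path again replaces the earlier row instead of adding a second one.
record_indexed_file(conn, "src/main.py", "def456", time.time(), 5, ["os", "json", "re"])
rows = conn.execute(
    "SELECT file_path, file_hash, entity_count, imports FROM indexed_files"
).fetchall()
```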

Parsing Details

Python Files

  • Uses ast module for syntax tree analysis
  • Extracts: functions, classes, imports, docstrings
  • Handles nested functions and complex expressions
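
The extraction described above can be approximated with the standard library alone. This is a simplified sketch, not the server's parser: the `extract_entities` function and the shape of the returned dictionary are assumptions chosen for illustration.

```python
import ast


def extract_entities(source: str) -> dict:
    """Collect functions, classes, and imports with line numbers (illustrative only)."""
    tree = ast.parse(source)
    entities = {"functions": [], "classes": [], "imports": []}
    # ast.walk visits every node, so nested functions are found as well.
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            entities["functions"].append(
                {"name": node.name, "line": node.lineno, "doc": ast.get_docstring(node)}
            )
        elif isinstance(node, ast.ClassDef):
            entities["classes"].append(
                {"name": node.name, "line": node.lineno, "doc": ast.get_docstring(node)}
            )
        elif isinstance(node, ast.Import):
            entities["imports"].extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom):
            # Relative imports such as "from . import x" have no module name.
            entities["imports"].append(node.module or ".")
    return entities


sample = '''
import os
from json import loads

class Parser:
    """Parses things."""
    def parse(self, text):
        def helper():
            return loads(text)
        return helper()
'''
result = extract_entities(sample)
```

Here `result["functions"]` contains both `parse` and the nested `helper`, each with the line number where it is defined.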

JavaScript/TypeScript Files

  • Uses esprima library for AST parsing
  • Extracts: functions, classes, imports, exports
  • Supports modern JS features (arrow functions, async/await)

SQL Files

  • Uses sqlparse library for SQL statement parsing
  • Extracts: tables, views, functions, procedures, and complex queries
  • Supports CREATE, SELECT, and other SQL statement types

Search Algorithm

  • Query embedding generation using sentence-transformers
  • KNN search with configurable top-k results
  • Cosine similarity scoring (0-1 scale)
  • Combined file and entity ranking
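
The ranking idea can be shown with a brute-force version in plain Python. This is only a sketch of exact cosine top-k: the server delegates the search to RediSearch with an HNSW index, which is approximate and far faster on large indices. The three-dimensional toy vectors stand in for the 384-dimensional embeddings, and the function names are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as unrelated to everything
    return dot / (norm_a * norm_b)


def top_k(query_vec, candidates, k=3):
    """Score every candidate against the query and keep the k best."""
    scored = [(name, cosine_similarity(query_vec, vec)) for name, vec in candidates.items()]
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


# Toy "embeddings"; real ones would come from a sentence-transformers model.
toy_index = {
    "parse_json": [0.9, 0.1, 0.0],
    "send_email": [0.0, 0.2, 0.9],
    "load_config": [0.7, 0.6, 0.1],
}
ranked = top_k([1.0, 0.0, 0.0], toy_index, k=2)
```

With this toy data, `parse_json` ranks first and `load_config` second, because their vectors point most nearly in the query's direction.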

🔧 Installation & Setup

Prerequisites

  • Python 3.9+
  • Redis server with RediSearch module
  • MCP client (e.g., Claude Desktop with MCP support)

Installation Steps

  1. Clone and set up:
git clone <repository-url>
cd code-context-manager-mcp
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
  2. Install dependencies:
pip install -r requirements.txt
# Optional: for JS/TS support
pip install esprima
# Optional: for SQL support
pip install sqlparse
  3. Start Redis:
# Using Docker
docker run -d --name redis-stack -p 6379:6379 redis/redis-stack:latest

# Or using local Redis with RediSearch
redis-server --loadmodule /path/to/redisearch.so
  4. Configure MCP client (e.g., Claude Desktop):
{
  "mcpServers": {
    "code-context-manager": {
      "command": "python",
      "args": ["/full/path/to/code_context_mcp.py"],
      "env": {
        "REDIS_URL": "redis://localhost:6379"
      }
    }
  }
}
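
The `env` block above passes `REDIS_URL` to the server process. How the server reads it is not documented here; the following sketch shows one plausible way a Python process could pick it up, with `redis_settings` and the fallback values being assumptions.

```python
import os
from urllib.parse import urlparse


def redis_settings(env=None):
    """Hypothetical helper: derive host, port, and db from REDIS_URL."""
    env = os.environ if env is None else env
    url = env.get("REDIS_URL", "redis://localhost:6379")
    parsed = urlparse(url)
    return {
        "host": parsed.hostname or "localhost",
        "port": parsed.port or 6379,
        # A path such as "/2" selects database 2; an empty path means database 0.
        "db": int(parsed.path.lstrip("/") or 0),
    }


default_settings = redis_settings({})
custom_settings = redis_settings({"REDIS_URL": "redis://cache.internal:6380/2"})
```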

📁 Codebase Indexing Process

How Indexing Works

The Code Context Manager does not automatically index or update your codebase. All indexing operations require explicit user commands through MCP tools.

Indexing Methods

1. Directory Indexing (index_directory)

  • Purpose: Index all supported files in a directory tree
  • Process:
    1. Scans directory recursively
    2. Applies include/exclude patterns
    3. Parses each file using appropriate language parser
    4. Generates embeddings for files and code entities
    5. Stores data in Redis (vectors) and SQLite (metadata)
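
Steps 1 and 2 of this process (recursive scan plus include/exclude filtering) might look roughly like the sketch below. It is not the server's code: `collect_files` is a hypothetical helper, and the exact matching rules the server applies to `patterns` and `ignore_patterns` are not specified in this README.

```python
import fnmatch
import os
import tempfile


def collect_files(root, patterns=("*.py",), ignore_patterns=()):
    """Return root-relative paths matching an include pattern and no ignore pattern."""
    matched = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for filename in filenames:
            rel_path = os.path.relpath(os.path.join(dirpath, filename), root)
            rel_path = rel_path.replace(os.sep, "/")
            # Ignore patterns are tested against the path relative to root;
            # fnmatch's "*" also matches "/" so "venv/*" covers nested files.
            if any(fnmatch.fnmatch(rel_path, pat) for pat in ignore_patterns):
                continue
            # Include patterns are tested against the bare file name.
            if any(fnmatch.fnmatch(filename, pat) for pat in patterns):
                matched.append(rel_path)
    return sorted(matched)


# Build a throwaway directory tree to exercise the helper.
with tempfile.TemporaryDirectory() as root:
    for rel in ("main.py", "src/app.js", "venv/lib/site.py", "notes.txt"):
        full = os.path.join(root, *rel.split("/"))
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "w"):
            pass
    found = collect_files(root, patterns=["*.py", "*.js"], ignore_patterns=["venv/*"])
```

In this sketch a pattern like `__pycache__/*` only excludes a top-level `__pycache__` directory, since it is matched from the start of the relative path; nested caches would need a pattern such as `*/__pycache__/*`.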

2. Single File Indexing (index_file)

  • Purpose: Index individual files
  • Process: Same as directory indexing but for one file

When to Re-index

Manual Re-indexing Required When:

  • Adding new files to the project
  • Making significant changes to existing code
  • Adding new dependencies or imports
  • Changing file structure or organization

The system does NOT:

  • Automatically detect file changes
  • Re-index modified files
  • Monitor the filesystem for updates
  • Update indices when code is edited

Change Detection

While the system stores file modification times and hashes, it does not use them for automatic updates. Users must explicitly re-index files after changes.
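
Because the hash and modification time are stored, a client-side script could use them to decide which files to pass to index_file. The sketch below shows that comparison under stated assumptions: SHA-256 of the file contents is assumed as the hash, and `needs_reindex` is a hypothetical helper that is not part of the server.

```python
import hashlib
import os
import tempfile


def content_hash(path):
    """SHA-256 of the file contents (the server's real hash function may differ)."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def needs_reindex(path, stored_hash, stored_mtime):
    """Return True when the file on disk differs from what was recorded at index time."""
    if not os.path.exists(path):
        return True  # the file is gone, so its index entry is stale
    if os.path.getmtime(path) == stored_mtime:
        return False  # cheap check: untouched since indexing
    return content_hash(path) != stored_hash


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "module.py")
    with open(path, "w") as handle:
        handle.write("x = 1\n")
    stored_hash, stored_mtime = content_hash(path), os.path.getmtime(path)
    unchanged = needs_reindex(path, stored_hash, stored_mtime)

    with open(path, "w") as handle:
        handle.write("x = 2\n")
    # Force a distinct modification time so the demo does not depend on clock resolution.
    os.utime(path, (stored_mtime + 10, stored_mtime + 10))
    changed = needs_reindex(path, stored_hash, stored_mtime)
```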

Best Practices

Initial Setup

Index the entire project:

{
  "tool": "index_directory",
  "directory": ".",
  "ignore_patterns": ["venv/*", "node_modules/*", "__pycache__/*"]
}

After Code Changes

Re-index a specific changed file:

{
  "tool": "index_file",
  "file_path": "src/new_feature.py"
}

Or re-index an entire directory:

{
  "tool": "index_directory",
  "directory": "src"
}

Regular Maintenance

  • Re-index after major refactoring
  • Update indices before complex development tasks
  • Clean up with clear_all_indexed_data if needed

Performance Notes

  • Initial indexing: May take time for large codebases
  • Incremental updates: Use index_file for single changes
  • Memory usage: Scales with codebase size
  • Search quality: Improves with comprehensive indexing

🎬 Example Workflows

Initial Setup

1. Start MCP client with server configured
2. Call index_directory with project path
3. Wait for indexing completion
4. Ready for semantic code search

Development Workflow

1. Developer: "I need to add user authentication"
2. Call search_code_context("user login authentication system")
3. Receive relevant functions, classes, and files
4. Implement using discovered patterns

Code Exploration

1. Need to understand payment processing
2. Call search_code_context("payment processing integration")
3. Get relevant code entities and files
4. Study existing implementation patterns

⚡ Performance Considerations

Indexing Performance

  • Initial Indexing: Large projects may take 5-15 minutes
  • Incremental Updates: Changed files reindex in seconds
  • Memory Usage: ~50-100MB per 1000 files (varies by code complexity)

Search Performance

  • Semantic Search: Sub-second response for most queries
  • Context Retrieval: Optimized ranking algorithms
  • Redis Memory: Monitor usage with large codebases

Optimization Tips

  • Use specific file patterns to avoid unnecessary indexing
  • Regular cleanup of old project indices
  • Monitor Redis memory usage and configure appropriately

🔍 Understanding Results

Similarity Scores

  • 0.8-1.0: Highly relevant, direct matches
  • 0.6-0.8: Good relevance, related concepts
  • 0.4-0.6: Moderate relevance, may be useful
  • <0.4: Low relevance, consider refining query
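
A client that post-processes search results could encode these bands directly. The function below is a small sketch; its name, the short labels, and the choice to place a boundary value such as 0.8 in the higher band are assumptions, since the ranges above overlap at their endpoints.

```python
def relevance_label(score: float) -> str:
    """Map a similarity score to the relevance bands listed above."""
    if score >= 0.8:
        return "high"
    if score >= 0.6:
        return "good"
    if score >= 0.4:
        return "moderate"
    return "low"


labels = [relevance_label(s) for s in (0.93, 0.8, 0.65, 0.41, 0.2)]
```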

Context Quality Indicators

  • High Entity Count: Rich codebase with many components
  • Good Import/Export Mapping: Well-structured project
  • Recent Index Timestamps: Up-to-date information

🛡️ Security & Best Practices

Data Security

  • Server only reads code files, never executes them
  • Redis should be secured if containing sensitive code
  • Consider network isolation in production environments
  • File system access limited to specified directories

Development Best Practices

  • Regular reindexing of active development areas
  • Monitor disk space for SQLite and Redis storage
  • Use appropriate ignore patterns for large files/directories
  • Test search queries to validate context quality

🔧 Integration with Other MCP Servers

This server complements your existing MCP infrastructure:

  • GitHub MCP: Provides repository history and PR context
  • MySQL MCP: Offers database schema context for data-related development
  • Web Search MCP: Finds external documentation for discovered libraries
  • Redis MCP: Enables direct Redis operations if needed

🐛 Troubleshooting

Common Issues

MCP Tool Errors:

  • Ensure correct tool names and parameters
  • Check that files are indexed before searching
  • Verify Redis connection is working

Redis Connection Failed:

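# Check Redis status (host and port should match your REDIS_URL)
redis-cli -h localhost -p 6379 ping

# Start Redis with the RediSearch module if needed
redis-server --port 6379 --loadmodule /path/to/redisearch.so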

Missing Dependencies:

# Install core dependencies
pip install numpy sentence-transformers redis

# Install optional JS/TS support
pip install esprima

# Install optional SQL support
pip install sqlparse

Poor Search Results:

  • Use more descriptive queries
  • Ensure codebase is fully indexed (re-index after changes)
  • Check ignore patterns aren't excluding important files
  • Try different query formulations
  • Verify files were indexed with get_file_dependencies

Memory/Performance Issues:

  • Monitor Redis memory usage
  • Use selective indexing patterns
  • Clear old indices periodically

Debugging Commands

Check indexed files:

python -c "
import sqlite3
conn = sqlite3.connect('code_context.db')
cursor = conn.cursor()
cursor.execute('SELECT file_path, entity_count FROM indexed_files')
for row in cursor.fetchall():
    print(f'{row[0]}: {row[1]} entities')
conn.close()
"

Test Redis connection:

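python -c "
import redis
# Adjust the URL to match your REDIS_URL setting
r = redis.from_url('redis://localhost:6379/0')
print('Redis ping:', r.ping())
# A RediSearch index is not a regular key, so query it and catch the error if it is missing
try:
    print('Index info:', r.ft('code_index').info())
except redis.ResponseError:
    print('Index info: No index')
"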

Test search functionality:

python test_search.py

🚀 Future Enhancements

Planned Features

  • Support for additional languages (Go, Rust, Java)
  • Git integration for change tracking
  • Advanced dependency graph analysis
  • Custom embedding models for specialized domains
  • Real-time incremental indexing

Extension Points

  • Plugin system for custom language parsers
  • Alternative vector databases
  • Framework-specific context extractors
  • IDE and editor integrations

📝 Usage Guidelines

For developers and AI agents:

  1. Index first: Always call index_directory or index_file before searching
  2. Manual updates: Re-index files after making changes (no auto-updating)
  3. Use natural language: Write descriptive queries for search_code_context
  4. Check similarity scores: Higher scores (0.7+) indicate better matches
  5. Combine tools: Use read_file and get_file_dependencies for detailed analysis
  6. Regular maintenance: Update indices as your codebase evolves

Important: This system requires explicit indexing commands. It does not automatically detect or index file changes.

This MCP server provides semantic understanding of codebases, enabling context-aware development and intelligent code discovery.

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details on how to get started.

📄 License

This project is licensed under the MIT License - see the LICENSE file for details.
