…oding

This major update transforms CodeContext from a simple file dumper into an intelligent context server optimized for agentic coding workflows.

New Features:
- MCP (Model Context Protocol) server implementation
  * Native integration with Claude Code, Cline, and other MCP clients
  * Four powerful MCP tools: GetCodeContext, GetProjectStructure, ListProjectFiles, GetFileContent
  * Stdio transport for seamless subprocess integration
- Token Budget Optimization System
  * TokenCounter service for accurate token estimation
  * FileRelevanceScorer with multi-factor relevance scoring (filename, path, content, importance)
  * TokenBudgetOptimizer with three strategies: GreedyByScore, ValueOptimized, Balanced
  * Intelligent file selection to maximize relevance within token constraints
- Enhanced Program Architecture
  * Dual-mode support: CLI mode (original) and MCP server mode (new)
  * Command-line flag --mcp/--server to enable MCP mode
  * Refactored CLI code into ProgramCli.cs for separation of concerns
- Comprehensive Documentation
  * Updated README with MCP setup instructions
  * Detailed explanations of token optimization strategies
  * Example workflows and usage patterns
  * MCP configuration example file

Technical Implementation:
- Added ModelContextProtocol and Microsoft.Extensions.Hosting NuGet packages
- Implemented MCP tools using attribute-based discovery pattern
- Relevance scoring algorithm with configurable weights
- Multiple selection strategies for different use cases
- Task-specific context generation vs. whole-codebase dumps

Benefits for Agentic Coding:
- Token efficiency: Only send relevant files, not entire codebases
- Task-specific context: Intelligent file selection based on task description
- Scalable: Works with large codebases through smart sampling
- Flexible: Multiple optimization strategies for different scenarios
- Integration-ready: Native MCP support for modern AI coding tools

This update positions CodeContext as essential infrastructure for agentic coding, similar to how LSP became fundamental for modern IDEs.
Pull request overview
This PR transforms CodeContext from a simple CLI tool into a comprehensive agentic coding platform by adding MCP (Model Context Protocol) server capabilities and intelligent token budget optimization. The changes enable native integration with AI coding assistants like Claude Code while maintaining backward compatibility with the original CLI mode.
Key Changes:
- Added MCP server mode with four powerful tools (GetCodeContext, GetProjectStructure, ListProjectFiles, GetFileContent) for AI agent integration
- Implemented token budget optimization system with multiple strategies (GreedyByScore, ValueOptimized, Balanced) and intelligent relevance scoring
- Refactored architecture to support dual-mode operation (CLI and MCP server) with proper separation of concerns
Reviewed changes
Copilot reviewed 9 out of 9 changed files in this pull request and generated 21 comments.
| File | Description |
|---|---|
| Program.cs | Refactored entry point to support dual-mode operation (CLI vs MCP server) with command-line flag detection |
| ProgramCli.cs | Extracted CLI-specific logic from original Program.cs for clean separation of concerns |
| Mcp/CodeContextTools.cs | Implements four MCP server tools with file scanning, relevance scoring, and context generation |
| Services/TokenCounter.cs | Provides token estimation for code and natural language using character-based approximations |
| Services/FileRelevanceScorer.cs | Multi-factor relevance scoring algorithm weighing filename, path, content, and importance |
| Services/TokenBudgetOptimizer.cs | Implements three optimization strategies for file selection within token constraints |
| CodeContext.csproj | Adds ModelContextProtocol and Microsoft.Extensions.Hosting package dependencies |
| mcp-config.example.json | Example MCP server configuration for integration with AI coding tools |
| README.md | Comprehensive documentation updates covering MCP setup, token optimization, and usage examples |
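The table names three selection strategies, but this conversation does not show how any of them is implemented. As a rough illustration only, a greedy-by-score pass under a token budget might look like the sketch below. `ScoredFile` and `GreedySelector` are hypothetical names invented for this example, not types from the PR.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical record for illustration; the PR's real types are not shown here.
public sealed record ScoredFile(string Path, double Score, int Tokens);

public static class GreedySelector
{
    // Visit files from highest to lowest score and keep each one that
    // still fits in the remaining token budget.
    public static List<ScoredFile> SelectByScore(IEnumerable<ScoredFile> files, int tokenBudget)
    {
        var selected = new List<ScoredFile>();
        var remaining = tokenBudget;

        foreach (var file in files.OrderByDescending(f => f.Score))
        {
            if (file.Tokens <= remaining)
            {
                selected.Add(file);
                remaining -= file.Tokens;
            }
        }

        return selected;
    }
}
```

A greedy pass like this is simple and predictable, but one large high-scoring file can crowd out several smaller ones, which is presumably why the PR offers more than one strategy.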
Mcp/CodeContextTools.cs
Outdated
    var fullPath = Path.Combine(projectPath, relativePath);

    if (!File.Exists(fullPath))
    {
        output.AppendLine($"## {relativePath}");
        output.AppendLine("❌ File not found");
        output.AppendLine();
        continue;
    }

    var content = File.ReadAllText(fullPath);
Potential path traversal vulnerability. A malicious user could provide file paths like "../../../etc/passwd" to read files outside the project directory. Consider validating that the resolved fullPath is still within the projectPath:
    var fullPath = Path.GetFullPath(Path.Combine(projectPath, relativePath));
    if (!fullPath.StartsWith(Path.GetFullPath(projectPath) + Path.DirectorySeparatorChar))
    {
        output.AppendLine($"## {relativePath}");
        output.AppendLine("❌ Invalid path - outside project directory");
        continue;
    }
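The same containment check can be pulled into a small helper so it is testable on its own. This is a minimal sketch, not the PR's code: `PathGuard` and `IsWithinRoot` are hypothetical names, and the check does not resolve symbolic links. It uses an ordinal comparison, which can reject paths that differ only by letter case on case-insensitive file systems; that errs on the side of refusing access.

```csharp
using System;
using System.IO;

public static class PathGuard
{
    // Hypothetical helper: returns true when relativePath, resolved against
    // projectPath, still lies inside projectPath. Symlinks are not resolved.
    public static bool IsWithinRoot(string projectPath, string relativePath)
    {
        var root = Path.GetFullPath(projectPath);

        // Ensure a trailing separator so "/proj" does not match "/proj-evil".
        if (!Path.EndsInDirectorySeparator(root))
            root += Path.DirectorySeparatorChar;

        // GetFullPath collapses ".." segments; a rooted relativePath replaces root entirely.
        var full = Path.GetFullPath(Path.Combine(root, relativePath));

        return full.StartsWith(root, StringComparison.Ordinal);
    }
}
```

A caller would check `PathGuard.IsWithinRoot(projectPath, relativePath)` before touching the file system and report an invalid path otherwise.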
Services/FileRelevanceScorer.cs
Outdated
    // Penalize very long files (might be generated/verbose)
    // This would need actual file size, using path length as proxy
    if (filePath.Length > 100)
        score -= 0.1;
The file path length is used as a proxy for file size to penalize long files, but this is misleading. A file with a very long path (e.g., deep nested directories) but small content would be incorrectly penalized. Consider either:
- Passing actual file size/content length as a parameter
- Removing this heuristic entirely since it doesn't correlate with file verbosity
- Documenting this clearly as a limitation
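To illustrate the first option, the penalty can be keyed on the real content length instead of the path length. The sketch below is an assumption-laden example: `SizePenalty` is a hypothetical name, the 0.1 penalty mirrors the snippet under review, and the 50 KB threshold is taken from the follow-up commit message in this thread rather than from code shown here.

```csharp
using System;

public static class SizePenalty
{
    // Threshold from the follow-up commit message (50KB+); the exact
    // constant used in the PR is not shown in this conversation.
    public const long LargeFileThresholdBytes = 50 * 1024;
    public const double LargeFilePenalty = 0.1;

    // Lowers the score for large files; never returns a negative score.
    public static double Apply(double score, long contentLength)
    {
        return contentLength > LargeFileThresholdBytes
            ? Math.Max(0.0, score - LargeFilePenalty)
            : score;
    }
}
```

Passing the length in as a parameter keeps the scorer free of file system calls, so it can be exercised with in-memory content.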
        "--",
        "--mcp"
    ],
    "description": "CodeContext MCP Server - Intelligent code context generation with token budget optimization"
[nitpick] The example configuration uses dotnet run which is inefficient for production use as it rebuilds on every invocation. Consider showing the use of a compiled executable instead:

    "command": "/path/to/CodeContext/bin/Release/net9.0/CodeContext",
    "args": ["--mcp"]

Or keep both examples showing the development approach (dotnet run) and production approach (compiled executable).
    - "description": "CodeContext MCP Server - Intelligent code context generation with token budget optimization"
    + "description": "CodeContext MCP Server (Development) - Uses 'dotnet run' for rapid iteration and debugging"
    + },
    + "codecontext-prod": {
    +   "command": "/path/to/CodeContext/bin/Release/net9.0/CodeContext",
    +   "args": [
    +     "--mcp"
    +   ],
    +   "description": "CodeContext MCP Server (Production) - Uses compiled executable for efficient deployment"
Mcp/CodeContextTools.cs
Outdated
    catch
    {
        // Skip directories that can't be accessed
Empty catch blocks swallow all exceptions silently. Consider at least logging the error or being more specific about which exceptions to catch (e.g., UnauthorizedAccessException, IOException).
    - catch
    - {
    -     // Skip directories that can't be accessed
    + catch (Exception ex)
    + {
    +     // Skip directories that can't be accessed
    +     System.Diagnostics.Debug.WriteLine($"[CollectFiles] Could not access directory '{currentPath}': {ex.Message}");
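The comment also raises the narrower alternative of catching only the exceptions a directory walk is expected to hit. A minimal sketch of that variant follows; `SafeWalker` and its `log` callback are hypothetical names for illustration, and the PR's actual `CollectFiles` signature is not shown in this conversation. Note that `DirectoryNotFoundException` derives from `IOException`, so it is covered by the filter.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class SafeWalker
{
    // Walks rootPath iteratively and returns every file it can list.
    // Directories that cannot be read are reported via log and skipped.
    public static List<string> CollectFiles(string rootPath, Action<string> log)
    {
        var results = new List<string>();
        var pending = new Stack<string>();
        pending.Push(rootPath);

        while (pending.Count > 0)
        {
            var current = pending.Pop();
            try
            {
                results.AddRange(Directory.GetFiles(current));
                foreach (var dir in Directory.GetDirectories(current))
                    pending.Push(dir);
            }
            catch (Exception ex) when (ex is UnauthorizedAccessException or IOException)
            {
                // Only expected access and I/O failures are swallowed;
                // anything else propagates to the caller.
                log($"Could not access directory '{current}': {ex.Message}");
            }
        }

        return results;
    }
}
```

Compared with `catch (Exception ex)`, the filter lets unexpected failures such as `ArgumentException` surface instead of being logged and ignored.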
Services/TokenBudgetOptimizer.cs
Outdated
    }

    // Reserve tokens for project structure if requested
    var reservedTokens = includeStructure ? 2000 : 100; // Structure + overhead
[nitpick] The comment says "Structure + overhead" but the code uses a fixed 2000 tokens for structure, which seems quite high. Consider making this configurable or calculating it dynamically based on actual project structure size using EstimateTokensNaturalLanguage from TokenCounter. A small project might not need 2000 tokens for structure, wasting budget.
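One way to read this suggestion is to size the reservation from the structure text itself and cap it at the old fixed value. The sketch below assumes a stand-in estimator of roughly four characters per token, because the real `EstimateTokensNaturalLanguage` signature and ratio are not shown here; `StructureReservation` and its constants are hypothetical names.

```csharp
using System;

public static class StructureReservation
{
    public const int BaseOverheadTokens = 100;  // headers, formatting, metadata
    public const int MaxStructureTokens = 2000; // cap at the previous fixed reservation

    // Stand-in estimator (assumption): about four characters per token, rounded up.
    public static int EstimateTokens(string text)
    {
        return string.IsNullOrEmpty(text) ? 0 : (text.Length + 3) / 4;
    }

    // Reserves only what the structure is estimated to need, plus base overhead.
    public static int ReservedTokens(bool includeStructure, string projectStructure)
    {
        if (!includeStructure)
            return BaseOverheadTokens;

        var structureTokens = Math.Min(EstimateTokens(projectStructure), MaxStructureTokens);
        return structureTokens + BaseOverheadTokens;
    }
}
```

With this shape a small project gives back most of the budget to file content, while a very large tree is still bounded by the cap.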
    var structureTokens = EstimateTokensNaturalLanguage(projectStructure);
    var fileTokens = files.Sum(f => EstimateTokensForFile(f.path, f.content));

    const int overhead = 100; // Headers, formatting, metadata
[nitpick] Magic number 100 is used without explanation. Consider defining this as a named constant (e.g., private const int BaseOverheadTokens = 100;) to improve maintainability and make it easier to adjust if needed.
    {
        // File paths are typically short, use exact word counting
        var parts = filePath.Split('/', '\\', '.');
        return parts.Length + 2; // Add overhead for formatting
[nitpick] Magic number 2 is used for overhead. Consider defining this as a named constant to improve code clarity and maintainability.
Services/FileRelevanceScorer.cs
Outdated
        var pathLower = filePath.ToLowerInvariant();
        if (keywords.Count == 0)
            return 0.5;

        var matchCount = keywords.Count(keyword => pathLower.Contains(keyword));
        return Math.Min(1.0, matchCount / (double)keywords.Count);
    }

    /// <summary>
    /// Scores based on content relevance.
    /// </summary>
    private static double ScoreContent(string content, List<string> keywords)
    {
        if (string.IsNullOrWhiteSpace(content) || keywords.Count == 0)
            return 0.3; // Low default score

        var contentLower = content.ToLowerInvariant();
Performance concern: ToLowerInvariant() is called repeatedly on potentially large content strings during scoring (lines 112, 128). For large files, this creates unnecessary string allocations. Consider caching the lowercased content or using case-insensitive string operations like IndexOf(keyword, StringComparison.OrdinalIgnoreCase) to avoid creating new strings.
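The case-insensitive alternative mentioned here can be sketched as a counting loop over `IndexOf` with `StringComparison.OrdinalIgnoreCase`, which avoids allocating a lowercased copy of the content. `KeywordCounter` is a hypothetical name for this example; how the PR's scorer actually counts matches is only partly visible in the hunk.

```csharp
using System;

public static class KeywordCounter
{
    // Counts non-overlapping, case-insensitive occurrences of keyword in content
    // without creating a lowercased copy of either string.
    public static int CountOccurrences(string content, string keyword)
    {
        if (string.IsNullOrEmpty(content) || string.IsNullOrEmpty(keyword))
            return 0;

        var count = 0;
        var index = 0;

        while ((index = content.IndexOf(keyword, index, StringComparison.OrdinalIgnoreCase)) >= 0)
        {
            count++;
            index += keyword.Length; // continue searching after this match
        }

        return count;
    }
}
```

Caching a single lowercased copy per scoring call is the other option the comment names; it trades one allocation for simpler `Contains` calls.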
Mcp/CodeContextTools.cs
Outdated
    var fileChecker = new FileFilterService(filterConfig, gitIgnoreParser);
    var scanner = new ProjectScanner(fileChecker, _console);

    var files = await Task.Run(() => GetAllProjectFiles(scanner, projectPath));
Using Task.Run here is unnecessary for synchronous I/O work. Simply call GetAllProjectFiles(scanner, projectPath) directly without wrapping it in Task.Run.
    - var files = await Task.Run(() => GetAllProjectFiles(scanner, projectPath));
    + var files = GetAllProjectFiles(scanner, projectPath);
Services/FileRelevanceScorer.cs
Outdated
    });

    // Normalize by content length and keyword count
    var density = totalMatches / (double)(content.Length / 100 + 1);
Possible loss of precision: content.Length / 100 is integer division, so any fractional part is discarded before the cast to double.
    - var density = totalMatches / (double)(content.Length / 100 + 1);
    + var density = totalMatches / (content.Length / 100.0 + 1);
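A small side-by-side makes the difference concrete. `DensityDemo` is a hypothetical wrapper written for this example; the two expressions are the original and suggested forms from the hunk above.

```csharp
public static class DensityDemo
{
    // Original form: content length is integer-divided by 100 before the cast,
    // so 150 characters count as 1 block, not 1.5.
    public static double Truncating(int totalMatches, int contentLength)
    {
        return totalMatches / (double)(contentLength / 100 + 1);
    }

    // Suggested form: dividing by 100.0 keeps the fractional part.
    public static double Precise(int totalMatches, int contentLength)
    {
        return totalMatches / (contentLength / 100.0 + 1);
    }
}
```

For 3 matches in 150 characters, the truncating form divides by 2 and the precise form divides by 2.5, so short files are scored as denser than they are under the original expression.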
…uality

This commit addresses all feedback from PR #4's Copilot AI review:

Security Fixes:
- Add PathSecurity utility to prevent path traversal attacks
- Validate all file paths in GetFileContent to prevent directory traversal
- Replace empty catch blocks with specific exception types

Performance Improvements:
- Remove unnecessary Task.Run wrappers (synchronous I/O doesn't benefit)
- Optimize ToLowerInvariant() calls to avoid repeated allocations on large strings
- Cache lowercase conversions once per scoring operation

Code Quality Improvements:
- Replace magic numbers with named constants throughout
  * Scoring weights (FileNameWeight, FilePathWeight, etc.)
  * Scoring parameters (NeutralScore, MaxMatchesPerKeyword, etc.)
  * File importance boost values (ReadmeBoost, ConfigBoost, etc.)
  * Token reservation constants (StructureTokenReservation)
- Fix misleading file path length heuristic
  * Now uses actual file size instead of path length
  * Properly penalizes large files (50KB+ threshold)

Build Fixes:
- Fix .NET 10.0 target framework error in test project (downgrade to .NET 9.0)

Technical Details:
- PathSecurity.ValidatePathWithinRoot ensures resolved paths stay within project
- SecurityException thrown for path traversal attempts
- Specific exception handling for UnauthorizedAccessException, IOException, DirectoryNotFoundException
- Reduced token reservation from 2000 to 1000 for more reasonable small project handling
- All magic numbers extracted to const fields with descriptive names