Skip to content

andreahaku/code-analysis-context-python-mcp

Repository files navigation

Code Analysis & Context Engineering MCP - Python Edition

A sophisticated Model Context Protocol (MCP) server that provides deep codebase understanding for Python projects, with a focus on data analysis frameworks and scientific computing.

🎯 Overview

This MCP server is specifically designed for Python projects, especially those using data analysis and scientific computing libraries. It provides architectural analysis, pattern detection, dependency mapping, test coverage analysis, and AI-optimized context generation.

✨ Features

  • πŸ—οΈ Architecture Analysis: Comprehensive architectural overview with module relationships, class hierarchies, and data flow patterns
  • πŸ” Pattern Detection: Identify data analysis patterns, best practices, and antipatterns
  • πŸ“Š Dependency Mapping: Visualize module dependencies, detect circular dependencies, and analyze coupling
  • πŸ§ͺ Coverage Analysis: Find untested code with actionable test suggestions based on complexity
  • βœ… Convention Validation: Validate adherence to PEP 8 and project-specific coding conventions
  • πŸ€– Context Generation: Build optimal AI context packs respecting token limits and maximizing relevance

🐍 Supported Frameworks & Libraries

Data Analysis & Scientific Computing

  • βœ… Pandas - DataFrame operations, data transformations, aggregations
  • βœ… NumPy - Array manipulations, mathematical operations, broadcasting
  • βœ… Scikit-learn - ML pipelines, model training, feature engineering
  • βœ… Matplotlib/Seaborn - Visualization patterns, plot types
  • βœ… Jupyter Notebooks - Notebook structure, cell analysis

Web Frameworks

  • βœ… FastAPI - Async endpoints, dependency injection, Pydantic models
  • βœ… Django - Models, views, ORM patterns, middleware
  • βœ… Flask - Routes, blueprints, decorators

Testing Frameworks

  • βœ… Pytest - Test discovery, fixtures, parametrization
  • βœ… Unittest - Test cases, mocking, assertions

πŸ“¦ Installation

Prerequisites

  • Python 3.10 or higher
  • pip or poetry for package management

Install from source

git clone https://github.com/andreahaku/code-analysis-context-python-mcp.git
cd code-analysis-context-python-mcp
pip install -e .

Development installation

pip install -e ".[dev]"

πŸš€ Usage

As an MCP Server

Add to your MCP client configuration:

{
  "mcpServers": {
    "code-analysis-python": {
      "command": "python",
      "args": ["-m", "src.server"]
    }
  }
}

Or using the installed script:

{
  "mcpServers": {
    "code-analysis-python": {
      "command": "code-analysis-python-mcp"
    }
  }
}

πŸ’¬ Example Usage

Once configured as an MCP server in your LLM client, you can use natural language prompts:

Analyzing a Data Science Project

"Analyze the architecture of my pandas project and show me the complexity metrics"

This will invoke the arch tool with appropriate parameters to:

  • Detect pandas framework usage
  • Calculate complexity for all Python files
  • Show module structure and class hierarchies
  • Identify high-complexity functions needing refactoring

"Find all DataFrame operations in my project and check if they follow best practices"

This will use the patterns tool to:

  • Detect DataFrame chaining, groupby, merge operations
  • Compare against pandas best practices
  • Suggest optimizations (e.g., vectorization instead of loops)

Finding Code Issues

"Check my project for circular dependencies and show me the dependency graph"

Uses the deps tool to:

  • Build import graph using NetworkX
  • Detect any circular import cycles
  • Calculate coupling/cohesion metrics
  • Generate Mermaid diagram of dependencies

"Which files in my project have no test coverage and are most critical to test?"

The coverage tool will:

  • Parse coverage.xml or .coverage file
  • Prioritize gaps based on complexity and file location
  • Generate pytest test scaffolds for critical files
  • Show untested functions with complexity scores

Code Quality

"Validate my project follows PEP 8 conventions and show naming violations"

The conventions tool will:

  • Check snake_case, PascalCase, UPPER_SNAKE_CASE naming
  • Validate import grouping and ordering
  • Check for missing docstrings and type hints
  • Report consistency score by category

AI-Assisted Development

"Generate a context pack for adding data validation to my pandas DataFrame processing pipeline"

The context tool will:

  • Extract keywords ("data validation", "pandas", "DataFrame")
  • Score files by relevance to the task
  • Select most relevant files within token budget
  • Format as markdown with code snippets and suggestions

πŸ› οΈ Available Tools

1. arch - Architecture Analysis

Analyze Python project architecture and structure.

{
  "path": "/path/to/project",
  "depth": "d",
  "types": ["mod", "class", "func", "api"],
  "diagrams": true,
  "metrics": true,
  "details": true,
  "minCx": 10,
  "maxFiles": 50
}

Parameters:

  • path: Project root directory
  • depth: Analysis depth - "o" (overview), "d" (detailed), "x" (deep)
  • types: Analysis types - mod, class, func, api, model, view, notebook, pipeline, dataflow
  • diagrams: Generate Mermaid diagrams
  • metrics: Include code metrics
  • details: Per-file detailed metrics
  • minCx: Minimum complexity threshold for filtering
  • maxFiles: Maximum number of files in detailed output
  • memSuggest: Generate memory suggestions for llm-memory MCP
  • fw: Force framework detection - pandas, numpy, sklearn, fastapi, django, flask, jupyter

Detects:

  • Module structure and organization
  • Class hierarchies and inheritance patterns
  • Function definitions and decorators
  • API endpoints (FastAPI/Django/Flask)
  • Data models and Pydantic schemas
  • Jupyter notebook structure
  • Data processing pipelines

2. deps - Dependency Analysis

Analyze module dependencies and imports.

{
  "path": "/path/to/project",
  "circular": true,
  "metrics": true,
  "diagram": true,
  "focus": "src/models",
  "depth": 3
}

Parameters:

  • path: Project root directory
  • circular: Detect circular dependencies
  • metrics: Calculate coupling/cohesion metrics
  • diagram: Generate Mermaid dependency graph
  • focus: Focus on specific module
  • depth: Maximum dependency depth to traverse
  • external: Include site-packages dependencies

Features:

  • Import graph construction
  • Circular dependency detection with cycle paths
  • Coupling and cohesion metrics
  • Dependency hotspots (hubs and bottlenecks)
  • Module classification (utility, service, model, etc.)

3. patterns - Pattern Detection

Detect data analysis and coding patterns.

{
  "path": "/path/to/project",
  "types": ["dataframe", "array", "pipeline", "model", "viz"],
  "custom": true,
  "best": true,
  "suggest": true
}

Parameters:

  • path: Project root directory
  • types: Pattern types to detect
    • dataframe: Pandas DataFrame operations
    • array: NumPy array manipulations
    • pipeline: Scikit-learn pipelines
    • model: ML model training patterns
    • viz: Matplotlib/Seaborn visualizations
    • api: FastAPI/Django/Flask endpoints
    • orm: Django ORM patterns
    • async: Async/await patterns
    • decorator: Custom decorators
    • context: Context managers
  • custom: Detect custom patterns
  • best: Compare with best practices
  • suggest: Generate improvement suggestions

Detected Patterns:

  • Pandas: DataFrame chaining, groupby operations, merge strategies
  • NumPy: Broadcasting, vectorization, array creation patterns
  • Scikit-learn: Pipelines, transformers, model training
  • Visualization: Plot types, figure management, style patterns
  • API: Endpoint patterns, request/response models, middleware
  • Async: Async functions, coroutines, event loops
  • Testing: Fixtures, mocking, parametrization

4. coverage - Test Coverage Analysis

Analyze test coverage and generate test suggestions.

{
  "path": "/path/to/project",
  "report": ".coverage",
  "fw": "pytest",
  "threshold": {
    "lines": 80,
    "functions": 80,
    "branches": 75
  },
  "priority": "high",
  "tests": true,
  "cx": true
}

Parameters:

  • path: Project root directory
  • report: Coverage report path (.coverage, coverage.xml)
  • fw: Test framework - pytest, unittest, nose
  • threshold: Coverage thresholds
  • priority: Filter priority - crit, high, med, low, all
  • tests: Generate test scaffolds
  • cx: Analyze complexity for prioritization

Features:

  • Parse coverage.py reports
  • Identify untested modules and functions
  • Complexity-based prioritization
  • Test scaffold generation (pytest/unittest)
  • Framework-specific test patterns

5. conventions - Convention Validation

Validate PEP 8 and coding conventions.

{
  "path": "/path/to/project",
  "auto": true,
  "severity": "warn",
  "rules": {
    "naming": {
      "functions": "snake_case",
      "classes": "PascalCase"
    }
  }
}

Parameters:

  • path: Project root directory
  • auto: Auto-detect project conventions
  • severity: Minimum severity - err, warn, info
  • rules: Custom convention rules

Checks:

  • PEP 8 compliance
  • Naming conventions (snake_case, PascalCase, UPPER_CASE)
  • Import ordering and grouping
  • Docstring presence
  • Type hint usage
  • Line length and formatting

6. context - Context Pack Generation

Generate AI-optimized context packs.

{
  "task": "Add data preprocessing pipeline with pandas",
  "path": "/path/to/project",
  "tokens": 50000,
  "include": ["files", "arch", "patterns"],
  "focus": ["src/preprocessing", "src/models"],
  "format": "md",
  "lineNums": true,
  "strategy": "rel"
}

Parameters:

  • task: Task description (required)
  • path: Project root directory
  • tokens: Token budget (default: 50000)
  • include: Content types - files, deps, tests, types, arch, conv, notebooks
  • focus: Priority files/directories
  • history: Include git history
  • format: Output format - md, json, xml
  • lineNums: Include line numbers
  • strategy: Optimization strategy - rel (relevance), wide (breadth), deep (depth)

Features:

  • Task-based file relevance scoring
  • Token budget management
  • Multiple output formats
  • Architectural context inclusion
  • Dependency traversal

πŸ“ Configuration

Create a .code-analysis.json file in your project root:

{
  "project": {
    "name": "MyDataProject",
    "type": "pandas"
  },
  "analysis": {
    "includeGlobs": ["src/**/*.py", "notebooks/**/*.ipynb"],
    "excludeGlobs": ["**/test_*.py", "**/__pycache__/**", "**/venv/**"]
  },
  "conventions": {
    "naming": {
      "functions": "snake_case",
      "classes": "PascalCase",
      "constants": "UPPER_SNAKE_CASE"
    },
    "imports": {
      "order": ["stdlib", "third_party", "local"],
      "grouping": true
    }
  },
  "coverage": {
    "threshold": {
      "lines": 80,
      "functions": 80,
      "branches": 75
    }
  }
}

πŸ”§ Development

Project Structure

code-analysis-context-python-mcp/
β”œβ”€β”€ src/
β”‚   β”œβ”€β”€ __init__.py
β”‚   β”œβ”€β”€ server.py              # MCP server entry point
β”‚   β”œβ”€β”€ tools/                 # Tool implementations
β”‚   β”‚   β”œβ”€β”€ architecture_analyzer.py
β”‚   β”‚   β”œβ”€β”€ pattern_detector.py
β”‚   β”‚   β”œβ”€β”€ dependency_mapper.py
β”‚   β”‚   β”œβ”€β”€ coverage_analyzer.py
β”‚   β”‚   β”œβ”€β”€ convention_validator.py
β”‚   β”‚   └── context_pack_generator.py
β”‚   β”œβ”€β”€ analyzers/             # Framework-specific analyzers
β”‚   β”œβ”€β”€ utils/                 # Utilities (AST, complexity, etc.)
β”‚   └── types/                 # Type definitions
β”œβ”€β”€ tests/                     # Test suite
β”œβ”€β”€ pyproject.toml
└── README.md

Running Tests

pytest
pytest --cov=src --cov-report=html

Code Quality

# Format code
black src tests
isort src tests

# Lint
flake8 src tests

# Type checking
mypy src

🎯 Implementation Status

βœ… All Features Complete!

Core Tools (100% Complete)

  • βœ… Architecture Analyzer - AST parsing, complexity metrics, framework detection, Mermaid diagrams
  • βœ… Pattern Detector - DataFrame/array/ML patterns, async/decorators, antipatterns, best practices
  • βœ… Dependency Mapper - Import graphs, circular detection, coupling metrics, hotspots
  • βœ… Coverage Analyzer - Coverage.py integration, test scaffolds, complexity-based prioritization
  • βœ… Convention Validator - PEP 8 checking, naming conventions, docstrings, auto-detection
  • βœ… Context Pack Generator - Task-based relevance, token budgets, multiple formats, AI optimization

Utilities (100% Complete)

  • βœ… AST Parser - Classes, functions, imports, complexity
  • βœ… Complexity Analyzer - Radon integration, maintainability index
  • βœ… File Scanner - Glob patterns, intelligent filtering
  • βœ… Framework Detector - 14+ frameworks, pattern matching
  • βœ… Diagram Generator - Mermaid architecture & dependency graphs

Features

  • βœ… Circular dependency detection with cycle paths
  • βœ… LLM memory integration for persistent context
  • βœ… Test scaffold generation (pytest/unittest)
  • βœ… Multi-format output (JSON, Markdown, XML)
  • βœ… Token budget management for AI tools
  • βœ… Complexity-based prioritization
  • βœ… Mermaid diagram generation

🀝 Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

πŸ“„ License

MIT

πŸ‘¨β€πŸ’» Author

Andrea Salvatore (@andreahaku) with Claude (Anthropic)

πŸ”— Related Projects

πŸ§ͺ Testing

Test all tools on this project itself:

python3 test_tools.py

This will run all 6 tools and display:

  • Project architecture and complexity metrics
  • Pattern detection (async, decorators, context managers)
  • Dependency graph and coupling metrics
  • Coverage gaps with priorities
  • Convention violations and consistency scores
  • AI context pack generation

πŸ“Š Example Output

Running the tools on this project shows:

  • 18 modules, 9 classes, 61 functions, 3,787 lines of code
  • 38 patterns detected (17 async functions, 21 decorators)
  • 0 circular dependencies - clean architecture!
  • 0% test coverage - needs tests (demonstrates coverage tool)
  • High complexity (avg 36.8) - identifies refactoring targets
  • 100% naming consistency - follows PEP 8

Status: βœ… Production Ready - All 6 tools fully implemented and tested

Python Version: 3.10+

Lines of Code: 3,787 Test Coverage: Functional (integration tests via test_tools.py) Code Quality: High consistency, follows PEP 8

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 2

  •  
  •  

Languages