gagip/git-log-analysis

Git Log Analysis

⚠️ Note: This project was originally built for personal use and was recently made public. Some features may be unstable or subject to change.

A tool for collecting and analyzing Git repository logs on a monthly basis. It parses Git commit history, stores the results in a searchable JSON format, and exposes the data through a CLI and an MCP server for Claude Desktop integration.

Motivation

When reviewing past work or searching for a specific commit, browsing Git history directly is inefficient. A project with years of history can have thousands of commits, and finding what you need with git log alone is tedious.

Asking an AI agent to search through raw Git history makes this worse. The agent reads commits one by one, consuming large amounts of tokens in the process. In practice, this means high API costs and slow responses, with no guarantee of finding the right result.

This tool takes a different approach: parse and structure Git logs locally first, then let the AI query only what it needs.

Because commit history is pre-processed into a searchable JSON format and exposed through an MCP server, Claude can retrieve relevant commits quickly without reading through thousands of raw log entries. This reduces token usage significantly and makes AI-assisted Git history analysis practical for everyday use.

Features

  • Monthly Git log generation: Extracts Git logs from a specified period on a monthly basis
  • Git log parsing: Parses Git log files to extract commit information and diff details
  • Statistics: Commit count, files changed, lines added/deleted, etc.
  • JSON output: Saves parsed results in JSON format

Project Structure

git-log-analysis/
├── cli.py                           # CLI entry point
├── mcp_server.py                    # MCP server
├── git_log_analysis/                # Core modules
│   ├── git_monthly_log_generator.py # Monthly Git log generator
│   ├── git_diff_parser.py           # Git log and diff parser
│   ├── processors/                  # Log processing modules
│   │   ├── batch_processor.py       # Batch processing
│   │   ├── file_processor.py        # File processing
│   │   └── encoding_handler.py      # Encoding handling
│   ├── cli/                         # CLI subcommands
│   │   ├── generate_command.py      # generate subcommand
│   │   ├── parse_command.py         # parse subcommand
│   │   ├── search_command.py        # search subcommand
│   │   └── projects_command.py      # projects subcommand
│   └── mcp/                         # MCP-related modules
│       ├── data_loader.py           # JSON data loader
│       ├── query_engine.py          # Search engine
│       ├── formatters.py            # Output formatters
│       ├── service.py               # Service layer
│       └── utils.py                 # Utilities
├── tests/                           # Test code
└── data/                            # Git log data directory

Requirements

  • Python 3.12+
  • uv
  • Git must be installed on the system

Installation

uv sync

Usage

1. Generate Monthly Git Logs

Extracts logs from a Git repository on a monthly basis:

uv run cli.py generate -r <repo_path> -y <start_year> [-m <start_month>]

Example:

# Generate from January 2024 (default month)
uv run cli.py generate -r /path/to/repo -y 2024

# Generate from August 2024
uv run cli.py generate -r /path/to/repo -y 2024 -m 8

This command writes monthly Git logs, covering the specified month (default: January) through the present, to the <repo_path>/git_logs_by_month/ directory.
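As a rough illustration of the month-by-month extraction described above, the sketch below computes the month boundaries and builds one `git log` invocation per month. It is not the project's actual generator: `month_ranges` and `git_log_command` are hypothetical names, and the choice of the `-p`, `--since`, and `--until` flags is an assumption about how such a log could be produced.

```python
from datetime import date


def month_ranges(start_year, start_month=1, today=None):
    """Yield (first_day, first_day_of_next_month) pairs up to the current month."""
    today = today or date.today()
    year, month = start_year, start_month
    while (year, month) <= (today.year, today.month):
        if month == 12:
            next_year, next_month = year + 1, 1
        else:
            next_year, next_month = year, month + 1
        yield date(year, month, 1), date(next_year, next_month, 1)
        year, month = next_year, next_month


def git_log_command(repo_path, since, until):
    """Build a `git log` command limited to one month, including patches.

    Hypothetical helper: an explicit midnight time is passed so the boundary
    does not depend on how git fills in a missing time of day.
    """
    return [
        "git", "-C", str(repo_path), "log", "-p",
        f"--since={since.isoformat()} 00:00:00",
        f"--until={until.isoformat()} 00:00:00",
    ]


if __name__ == "__main__":
    # Print the commands that would cover August 2024 through today.
    for since, until in month_ranges(2024, 8):
        print(" ".join(git_log_command("/path/to/repo", since, until)))
```

Each command could then be run with `subprocess.run(..., capture_output=True)` and its output saved to a per-month file; the file naming is left out here because the README does not specify it.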

2. Analyze Git Logs (CLI)

Parse generated Git log files and save results as JSON:

uv run cli.py parse -l <log_directory> -o <output_directory>

Example:

uv run cli.py parse -l data/project_git_logs_by_month -o results/project
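To give a feel for what the parse step does, here is a minimal sketch that turns default-format `git log -p` text into JSON-ready records with simple per-commit statistics. It is an assumption-laden illustration, not the code in git_diff_parser.py: the function name `parse_git_log` and the record fields (`hash`, `author`, `date`, `message`, `files`, `added`, `deleted`) are hypothetical.

```python
import json
import re

# A commit header in default `git log` output: "commit <hash>".
COMMIT_RE = re.compile(r"^commit ([0-9a-f]{7,40})")

SAMPLE_LOG = """\
commit 1a2b3c4d5e6f7a8b9c0d1e2f3a4b5c6d7e8f9a0b
Author: Jane Doe <jane@example.com>
Date:   Mon Jan 15 10:00:00 2024 +0900

    Add parser

diff --git a/parser.py b/parser.py
index 0000000..1111111 100644
--- a/parser.py
+++ b/parser.py
@@ -1,2 +1,3 @@
 import re
-old = 1
+new = 1
+extra = 2
"""


def parse_git_log(text):
    """Parse `git log -p` text into a list of dicts (hypothetical schema)."""
    commits = []
    current = None
    in_diff = False
    for line in text.splitlines():
        match = COMMIT_RE.match(line)
        if match:
            current = {"hash": match.group(1), "author": "", "date": "",
                       "message": [], "files": [], "added": 0, "deleted": 0}
            commits.append(current)
            in_diff = False
            continue
        if current is None:
            continue
        if line.startswith("diff --git "):
            in_diff = True
            current["files"].append(line.split(" b/", 1)[-1])
        elif in_diff:
            # Simplification: skips the "+++"/"---" file headers, so a changed
            # line whose own content starts with "++" or "--" is not counted.
            if line.startswith("+") and not line.startswith("+++"):
                current["added"] += 1
            elif line.startswith("-") and not line.startswith("---"):
                current["deleted"] += 1
        elif line.startswith("Author:"):
            current["author"] = line[len("Author:"):].strip()
        elif line.startswith("Date:"):
            current["date"] = line[len("Date:"):].strip()
        elif line.startswith("    "):
            current["message"].append(line.strip())
    for commit in commits:
        commit["message"] = "\n".join(commit["message"])
    return commits


if __name__ == "__main__":
    print(json.dumps(parse_git_log(SAMPLE_LOG), indent=2))
```

Tracking whether the parser is inside a diff keeps indented context lines from being mistaken for commit message lines, which share the same four-space prefix.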

3. Search Commits (CLI)

Search parsed commits:

uv run cli.py search -r <results_directory> -k <keyword>

Options:

  • -r, --results-dir: Results directory path (required)
  • -k, --keyword: Search keyword (commit messages)
  • -s, --start-date: Start date (YYYY-MM-DD)
  • -e, --end-date: End date (YYYY-MM-DD)
  • -a, --author: Author filter (partial match)
  • -p, --project: Project filter (exact match)
  • --limit: Maximum number of results (default: 50)
  • --format: Output format (text, json, csv)
  • -o, --output: Save results to file (stdout if not specified)
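The filter semantics listed above can be sketched as a single pass over parsed records. This is only an illustration under stated assumptions: the record fields (`date`, `author`, `message`, `project`) and the function name `search_commits` are hypothetical here, and case-insensitive matching and an inclusive end date are guesses rather than documented behavior.

```python
from datetime import date

# Hypothetical parsed records; the real JSON schema may differ.
COMMITS = [
    {"hash": "1a2b3c4d", "author": "Jane Doe <jane@example.com>",
     "date": "2024-01-15T10:00:00+09:00", "message": "Add parser", "project": "alpha"},
    {"hash": "5e6f7a8b", "author": "John Roe <john@example.com>",
     "date": "2024-02-03T09:30:00+09:00", "message": "Fix encoding bug", "project": "beta"},
    {"hash": "9c0d1e2f", "author": "Jane Doe <jane@example.com>",
     "date": "2024-03-20T18:45:00+09:00", "message": "Add search command", "project": "alpha"},
]


def search_commits(commits, keyword=None, start_date=None, end_date=None,
                   author=None, project=None, limit=50):
    """Return up to `limit` commits that satisfy every supplied filter."""
    results = []
    for commit in commits:
        day = date.fromisoformat(commit["date"][:10])
        if keyword and keyword.lower() not in commit["message"].lower():
            continue  # keyword: substring of the commit message (assumed case-insensitive)
        if start_date and day < date.fromisoformat(start_date):
            continue
        if end_date and day > date.fromisoformat(end_date):
            continue  # end date assumed inclusive
        if author and author.lower() not in commit["author"].lower():
            continue  # author: partial match
        if project and commit["project"] != project:
            continue  # project: exact match
        results.append(commit)
        if len(results) >= limit:
            break
    return results


if __name__ == "__main__":
    for commit in search_commits(COMMITS, keyword="add", project="alpha"):
        print(commit["hash"], commit["message"])
```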

4. List Projects (CLI)

View available projects and commit counts:

uv run cli.py projects -r <results_directory>
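Conceptually, the project listing is a count of commits grouped by project. A minimal sketch, assuming each parsed record carries a hypothetical `project` field:

```python
from collections import Counter


def count_commits_by_project(commits):
    """Count commits per project (the `project` field name is an assumption)."""
    return Counter(commit["project"] for commit in commits)


if __name__ == "__main__":
    sample = [{"project": "alpha"}, {"project": "beta"}, {"project": "alpha"}]
    for name, total in count_commits_by_project(sample).most_common():
        print(f"{name}: {total} commits")
```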

5. Integrate with Claude Desktop via MCP Server

Add the following server entry to the Claude Desktop config file for your OS:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json

{
  "mcpServers": {
    "git-log-analyzer": {
      "command": "uv",
      "args": ["run", "--directory", "<project absolute directory path>", "python", "mcp_server.py"],
      "env": {
        "GIT_LOG_RESULTS_DIR": "<project absolute directory path>/results"
      }
    }
  }
}
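Because both placeholders must be absolute paths, it can be convenient to generate the entry instead of typing it by hand. The helper below is a hypothetical convenience script, not part of the project; it assumes the results live in a `results` directory under the project root, as in the config above.

```python
import json
from pathlib import Path


def build_server_entry(project_dir):
    """Return the mcpServers entry with absolute paths filled in."""
    root = Path(project_dir).resolve()
    return {
        "mcpServers": {
            "git-log-analyzer": {
                "command": "uv",
                "args": ["run", "--directory", str(root), "python", "mcp_server.py"],
                "env": {"GIT_LOG_RESULTS_DIR": str(root / "results")},
            }
        }
    }


if __name__ == "__main__":
    # Run from the project root, then paste the output into the config file.
    print(json.dumps(build_server_entry("."), indent=2))
```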

Available MCP Tools:

  1. search_commits: Search commits

    • keyword: Search keyword (commit messages)
    • start_date, end_date: Date range (YYYY-MM-DD)
    • author: Author filter
    • project: Project filter
    • limit: Result limit (default: 50)

  2. list_projects: List projects and commit counts

  3. get_commit_changes: Retrieve detailed changes for a specific commit

    • commit_hash: Commit hash (full hash, or a prefix of at least 7 characters)
    • project: Project name (optional, improves search speed)
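The commit_hash rule for get_commit_changes (full hash or a prefix of at least 7 characters, optionally narrowed by project) could be implemented along these lines. This is a sketch under assumptions: `find_commit` and the `hash` and `project` field names are hypothetical, and raising on an ambiguous prefix is one possible design, not documented behavior.

```python
def find_commit(commits, commit_hash, project=None):
    """Look up one commit by full hash or by a prefix of 7+ characters."""
    if len(commit_hash) < 7:
        raise ValueError("commit_hash must be at least 7 characters")
    prefix = commit_hash.lower()
    matches = [
        commit for commit in commits
        if commit["hash"].startswith(prefix)
        and (project is None or commit["project"] == project)
    ]
    if len(matches) > 1:
        raise LookupError(f"ambiguous hash prefix: {commit_hash}")
    return matches[0] if matches else None


if __name__ == "__main__":
    sample = [
        {"hash": "abc1234def5678", "project": "alpha"},
        {"hash": "abc9999def0000", "project": "beta"},
    ]
    print(find_commit(sample, "abc1234"))
```

Passing the project narrows the candidate set before matching, which is one way the optional project parameter could speed up lookups across many repositories.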

Example prompts in Claude Desktop:

  • "Search commits with keyword 'Add'"
  • "Show me the work history from January 2024"
  • "Show the changes for commit abc1234"
  • "Pick one of the recent commits and show the detailed changes"

Tests

uv run pytest

Linting

# Check code style
uv run ruff check .

# Auto-fix issues
uv run ruff check --fix .
