PuvaanRaaj/ghostwrite

ghostwrite

Know what your AI wrote. Track AI-generated code provenance in git repos.

ghostwrite analyzes your git history and reports how much of your codebase was written by AI tools: which tool, which author, which files, and whether that code survived. It works with any git repo, requires zero configuration, and runs entirely offline.

$ ghostwrite scan && ghostwrite report

 ghostwrite — Know what your AI wrote.

╭────────────────────────────────────────────────────────────╮
│ Repository: my-project                                     │
│ Period:     2025-01-01 → 2026-03-22                        │
│ Commits:    847 total │ 312 AI-assisted (36.8%)            │
╰────────────────────────────────────────────────────────────╯

 📊 AI Code Breakdown

 Tool      Commits +Added   Confidence
 ───────── ─────── ───────  ──────────────
 claude    198     +41,203  ● 94% confirmed
 cursor    89      +18,552  ● 72% confirmed
 copilot   25      +4,109   ◐ 60% heuristic

 🧬 AI Code Survival    312 / 312 AI commits reachable from HEAD (100%)
 ⚡ Code Churn          AI: 8.3%  Human: 12.1%

Features

  • 8 detection strategies — co-author emails, git trailers, [ai:tool] tags, cursor-style commit messages, structured commit bodies, file-count spikes, labels, and session markers
  • 9 tools tracked — Claude, Cursor, GitHub Copilot, Codex, Gemini, Aider, Windsurf, Devin, Augment
  • Survival analysis — how much AI code is still reachable from HEAD vs churned
  • Churn analysis — compare rewrite rates between AI and human code
  • CI/CD mode — fail builds when AI% exceeds a threshold; outputs JSON or SARIF
  • Git hooks — auto-tag AI commits at commit time via commit-msg hook
  • Zero network calls — all analysis is local; no telemetry, no API keys
  • Fast — parallel scanning with SQLite cache; incremental rescans skip already-analyzed commits
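
The survival figure in the sample report ("AI commits reachable from HEAD") can be approximated with plain git. The following is a minimal sketch, not ghostwrite's implementation: the function names are hypothetical, and it assumes survival is measured per commit rather than per line.

```python
import subprocess


def reachable_from_head(repo="."):
    """Return the set of commit hashes reachable from HEAD, via `git rev-list HEAD`."""
    out = subprocess.run(
        ["git", "-C", repo, "rev-list", "HEAD"],
        check=True, capture_output=True, text=True,
    ).stdout
    return set(out.split())


def survival_rate(ai_commits, reachable):
    """Fraction of AI-tagged commits that are still reachable from HEAD."""
    ai_commits = set(ai_commits)
    if not ai_commits:
        return 0.0
    return len(ai_commits & reachable) / len(ai_commits)
```

With the numbers from the sample report, 312 reachable commits out of 312 AI commits gives a rate of 1.0, shown as 100%.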

Installation

Homebrew

brew install puvaan/tap/ghostwrite

Go

go install github.com/puvaan/ghostwrite@latest

Download binary

Grab the latest release for your platform from Releases.

Build from source

git clone https://github.com/puvaan/ghostwrite
cd ghostwrite
make build   # output: bin/ghostwrite

Quick Start

# 1. Initialize (creates .ghostwrite.yml, updates .gitignore)
ghostwrite init

# 2. Scan git history (last 6 months by default)
ghostwrite scan

# 3. View report
ghostwrite report

# 4. (Optional) Install git hooks to auto-tag future commits
ghostwrite hook install
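
To illustrate what a commit-msg hook can do, here is a hedged sketch that appends an AI-Tool trailer to the commit message. It is not the hook ghostwrite installs: the function names are hypothetical, and how the hook decides which tool wrote the commit is left to the caller. The only git behavior relied on is that a commit-msg hook receives the path of the message file as its first argument.

```python
def add_ai_trailer(message, tool):
    """Append an `AI-Tool: <tool>` trailer unless the message already has one."""
    if any(line.lower().startswith("ai-tool:") for line in message.splitlines()):
        return message
    return message.rstrip("\n") + f"\n\nAI-Tool: {tool}\n"


def run_hook(argv, tool):
    """Rewrite the commit message file in place; argv[1] is the path git passes in."""
    path = argv[1]
    with open(path, encoding="utf-8") as f:
        message = f.read()
    with open(path, "w", encoding="utf-8") as f:
        f.write(add_ai_trailer(message, tool))
```

This version always starts a new final paragraph, so it would separate the new trailer from any existing trailers such as Signed-off-by. A more robust hook could delegate to `git interpret-trailers --in-place --trailer "AI-Tool: <tool>" <file>` instead.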

Commands

Command                    Description
ghostwrite init            Initialize config and directories
ghostwrite scan            Analyze git history and cache results
ghostwrite report          Display AI provenance report
ghostwrite ci              CI mode: threshold checks, JSON/SARIF output
ghostwrite diff            Compare AI vs human code metrics side-by-side
ghostwrite hook install    Install commit-msg and pre-push git hooks
ghostwrite hook uninstall  Remove git hooks
ghostwrite hook status     Show installed hook status
ghostwrite config show     Print current configuration
ghostwrite config reset    Reset config to defaults
ghostwrite version         Print version info

Scan options

ghostwrite scan --since 6m          # last 6 months (default)
ghostwrite scan --since 2024-01-01  # since a specific date
ghostwrite scan --since 1y          # last year
ghostwrite scan --force             # re-analyze all commits (ignore cache)
ghostwrite scan --branch main       # specific branch
ghostwrite scan --all               # all branches
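
The --since values above mix relative windows (30d, 6m, 1y) with absolute dates. A minimal sketch of how such a value could be resolved to a cutoff date follows. The function name is hypothetical, and approximating a month as 30 days and a year as 365 days is an assumption; ghostwrite may resolve these differently.

```python
import re
from datetime import date, timedelta

# Assumption: months and years are approximated as fixed day counts.
_UNIT_DAYS = {"d": 1, "m": 30, "y": 365}


def parse_since(value, today):
    """Turn '30d', '6m', '1y' or 'YYYY-MM-DD' into a cutoff date."""
    match = re.fullmatch(r"(\d+)([dmy])", value)
    if match:
        days = int(match.group(1)) * _UNIT_DAYS[match.group(2)]
        return today - timedelta(days=days)
    # Anything else must be an ISO date; raises ValueError otherwise.
    return date.fromisoformat(value)
```

For example, `parse_since("30d", date(2026, 3, 22))` resolves to 2026-02-20, while `parse_since("2024-01-01", ...)` returns that date unchanged.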

Report options

ghostwrite report                          # terminal output (default)
ghostwrite report --format json            # JSON output
ghostwrite report --format markdown        # Markdown output
ghostwrite report --output report.md       # write to file
ghostwrite report --section summary        # single section only
ghostwrite report --since 30d              # filter by date range
ghostwrite report --author alice@acme.com  # filter by author

CI mode

# Warn if AI% exceeds 80% (exit 0)
ghostwrite ci --threshold 80

# Fail if AI% exceeds 80% (exit 1)
ghostwrite ci --threshold 80 --fail

# Output SARIF for GitHub Code Scanning
ghostwrite ci --format sarif --output results.sarif

# Compare against a baseline report
ghostwrite ci --threshold 80 --baseline previous-report.json
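
The warn-versus-fail behavior described in the comments above can be summarized as a small decision function. This is a sketch under stated assumptions: the function names are hypothetical, and AI% is taken to be the share of AI-assisted commits, as in the sample report header; the tool may weight by lines instead.

```python
def ai_percentage(ai_commits, total_commits):
    """Share of AI-assisted commits, in percent."""
    if total_commits == 0:
        return 0.0
    return 100.0 * ai_commits / total_commits


def ci_exit_code(ai_pct, threshold, fail=False):
    """Exit 1 only when the threshold is exceeded and fail is set; otherwise exit 0."""
    exceeded = ai_pct > threshold
    return 1 if (exceeded and fail) else 0
```

With the sample report's 312 AI-assisted commits out of 847, `ai_percentage` gives 36.8 after rounding to one decimal, which stays under a threshold of 80 and yields exit code 0 with or without the fail option.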

Diff

ghostwrite diff                # last 30 days
ghostwrite diff --since 3m     # last 3 months

Configuration

Running ghostwrite init creates .ghostwrite.yml:

detection:
  tools:
    - name: claude
      enabled: true
    - name: cursor
      enabled: true
    # ... more tools

report:
  top_n: 10       # max items per section
  since: 6m       # default scan window

cache:
  path: .ghostwrite/cache.db

Detection Strategies

ghostwrite uses multiple strategies to identify AI-generated commits, each producing either a confirmed or heuristic confidence level:

Strategy          How it works                                                      Confidence
CoAuthor          Matches Co-Authored-By: lines against known AI tool emails        confirmed
Trailer           Matches AI-Tool:, AI-Agent:, Generated-By: git trailers           confirmed
Tag               Matches [ai:toolname] in commit subject                           confirmed
CursorStyle       Lowercase, no conventional prefix, imperative verb, 30–100 chars  confirmed*
ConventionalRich  Conventional prefix + 4+ bullet body OR structured metadata       heuristic
FilesMultiplier   Files changed > 1.8× author's personal average                    heuristic

* confirmed when ~/.cursor/ exists locally, otherwise heuristic.
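
The three always-confirmed strategies and the FilesMultiplier heuristic from the table are simple enough to sketch. The code below is an illustration, not ghostwrite's detector: the email table is a hypothetical single-entry stand-in for the real list of known AI tool emails, the function names are invented, and the order in which the strategies are tried is an assumption.

```python
import re

# Hypothetical stand-in; the actual list of known AI tool emails is not given here.
KNOWN_EMAILS = {"noreply@anthropic.com": "claude"}

TAG_RE = re.compile(r"\[ai:([\w-]+)\]", re.I)
TRAILER_RE = re.compile(r"^(?:AI-Tool|AI-Agent|Generated-By):\s*(\S+)", re.I | re.M)
COAUTHOR_RE = re.compile(r"^Co-Authored-By:.*<([^>]+)>", re.I | re.M)


def detect_confirmed(message):
    """Return the tool name found by Tag, Trailer, or CoAuthor matching, else None."""
    subject = message.split("\n", 1)[0]
    tag = TAG_RE.search(subject)          # Tag: [ai:toolname] in the subject line
    if tag:
        return tag.group(1).lower()
    trailer = TRAILER_RE.search(message)  # Trailer: AI-Tool / AI-Agent / Generated-By
    if trailer:
        return trailer.group(1).lower()
    for email in COAUTHOR_RE.findall(message):  # CoAuthor: known AI tool emails
        tool = KNOWN_EMAILS.get(email.lower())
        if tool:
            return tool
    return None


def files_spike(files_changed, author_average, factor=1.8):
    """FilesMultiplier: True when a commit touches more than factor x the author's average."""
    return author_average > 0 and files_changed > factor * author_average
```

For instance, `detect_confirmed("[ai:cursor] fix login")` returns "cursor", and a commit touching 19 files from an author who averages 10 counts as a spike, while one touching exactly 18 does not.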

CI/CD Integration

GitHub Actions

- name: Check AI provenance
  run: |
    ghostwrite scan --since 1y --force
    ghostwrite ci --threshold 90 --fail --format sarif --output ghostwrite.sarif

- uses: github/codeql-action/upload-sarif@v3
  if: always()   # upload the SARIF file even when the ci step fails the build
  with:
    sarif_file: ghostwrite.sarif

GitLab CI

ghostwrite:
  script:
    - ghostwrite scan --since 1y --force
    - ghostwrite ci --threshold 90 --fail --format sarif --output ghostwrite.sarif
  artifacts:
    reports:
      sast: ghostwrite.sarif

Dogfooding

ghostwrite tracks itself. The CI pipeline runs ghostwrite on its own git history on every push to main and every Monday morning. The JSON report is uploaded as a build artifact.

License

MIT — see LICENSE.
