diffctx — smart diff context for LLM code review

diffctx selects the minimum code an LLM needs to review a git diff. Instead of pasting whole files, it walks the dependency graph from the changed lines outward and stops as soon as additional context stops paying for itself.

Why not just use `tree` or repomix?

	`tree`	repomix	Claude Code Review	diffctx
Primary use case	directory listing	full repo export	automated PR review	diff context for code review
Smart diff context	✗	✗	✓	✓
Works with any LLM	✓	✓	Claude only	✓
Free / local / offline	✓	✓	$15–25/review	✓
GitHub required	✗	✗	✓	✗
Multiple output formats	✗	limited	—	YAML/JSON/MD/txt
Python API	✗	✗	✗	✓
MCP server	✗	✗	✗	✓

Install (30 seconds)

pip install diffctx                     # canonical
pipx install diffctx                    # or: isolated, no venv needed
pip install 'diffctx[tree-sitter]'      # + AST parsing for smarter diff context
pip install 'diffctx[mcp]'              # + MCP server for AI assistants

diffctx . --diff HEAD~1       # smart context for last commit → paste into Claude/ChatGPT
diffctx . -f md -c            # full export → clipboard in Markdown

Demo: diffctx . --diff HEAD~1 selects only the fragments — functions, imports, type definitions — that an LLM actually needs to review the last commit, instead of dumping every changed file in full.

Standalone binary (no Python required): download from the releases page.

Diff context mode works out of the box. Installing the [tree-sitter] extra enables AST-level parsing, which gives more accurate context selection across 12 languages.

Diff Context Mode

Automatically finds the minimal set of code fragments needed to understand a change — imports, callers, type definitions, config dependencies — without dumping entire files. Understands 50+ file types.

name: myproject
type: diff_context
fragment_count: 5
fragments:
  - path: src/main.py
    lines: "10-25"
    kind: function
    symbol: process_data
    content: |
      def process_data(items):
          ...
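As a rough illustration of how a script might consume this output, the sketch below groups fragments by file. It assumes the YAML has already been parsed into a plain dict with the keys shown above (for example with a YAML library). The helper name `fragments_by_file` is hypothetical and not part of diffctx.

```python
from collections import defaultdict

# Stand-in for the parsed diff-context document shown above.
ctx = {
    "name": "myproject",
    "type": "diff_context",
    "fragment_count": 1,
    "fragments": [
        {
            "path": "src/main.py",
            "lines": "10-25",
            "kind": "function",
            "symbol": "process_data",
            "content": "def process_data(items):\n    ...\n",
        },
    ],
}


def fragments_by_file(doc):
    """Group (symbol, first_line, last_line) tuples by file path."""
    grouped = defaultdict(list)
    for frag in doc.get("fragments", []):
        first, _, last = frag["lines"].partition("-")
        # A single-line range such as "42" has no "-"; reuse the first line.
        grouped[frag["path"]].append(
            (frag.get("symbol"), int(first), int(last or first))
        )
    return dict(grouped)


print(fragments_by_file(ctx))
```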

How it works

diffctx builds a code graph (imports, co-changes, type references) and propagates relevance outward from the changed lines across it. Three scoring modes are available; pick one with --scoring:

`--scoring`	What it does
`ego` (default)	Bounded ego-network expansion around changed nodes: fast, with a predictable radius
`ppr`	Personalized PageRank with damping `--alpha` — global, smoother decay, slower
`bm25`	Lexical fragment retrieval against the diff hunks — useful as a baseline / fallback when the graph is sparse
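To make the `ego` idea concrete, here is a minimal sketch of a bounded ego-network expansion: a breadth-first search that stops a fixed number of hops from the changed nodes. The graph, the node names, and the `radius` parameter are illustrative assumptions; this is not diffctx's actual implementation.

```python
from collections import deque


def ego_network(graph, seeds, radius):
    """Return {node: hop distance} for nodes within `radius` hops of any seed."""
    dist = {s: 0 for s in seeds}
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        if dist[node] == radius:
            continue  # do not expand past the radius
        for nbr in graph.get(node, ()):
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


# Hypothetical dependency graph as an undirected adjacency map.
graph = {
    "process_data": ["validate", "Item"],
    "validate": ["process_data", "Schema"],
    "Item": ["process_data"],
    "Schema": ["validate", "load_config"],
    "load_config": ["Schema"],
}

# Treat process_data as the changed node and expand two hops outward.
print(ego_network(graph, ["process_data"], radius=2))
```

Because the search never expands beyond the radius, the cost depends on the neighborhood size rather than the size of the whole repository, which is one plausible reading of "fast, predictable radius".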

Selection stops when relevance drops below --tau (the minimum score a fragment must beat to be kept), or once --budget tokens have been emitted, whichever comes first.
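The stopping rule can be sketched in a few lines. This is a simplified model under stated assumptions: fragments arrive as hypothetical `(name, score, token_count)` tuples, and `budget=None` simply means "no cap", which is cruder than the convergence behavior of `auto`.

```python
def select_fragments(scored, tau=0.08, budget=None):
    """Keep fragments in descending score order until tau or the budget stops us.

    `scored` is a list of (name, score, token_count) tuples (hypothetical shape).
    """
    kept, used = [], 0
    for name, score, tokens in sorted(scored, key=lambda f: f[1], reverse=True):
        if score <= tau:
            break  # a fragment must beat tau; everything after it scores lower
        if budget is not None and used + tokens > budget:
            break  # the next fragment would exceed the token cap
        kept.append(name)
        used += tokens
    return kept, used


scored = [("callers", 0.30, 300), ("changed_fn", 0.90, 400), ("readme", 0.05, 100)]
print(select_fragments(scored))              # tau is the binding limit
print(select_fragments(scored, budget=500))  # the budget is the binding limit
```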

Flag	Default	Description
`--scoring`	`ego`	Scoring mode: `ego`, `ppr`, or `bm25`
`--budget`	auto	Token cap. `auto` lets selection converge; `-1` disables the cap; `N` enforces a fixed cap
`--alpha`	0.60	How tightly context clusters around changes (PPR damping; 0–1, higher = more focused)
`--tau`	0.08	Minimum relevance required to include a fragment (lower = more context)
`--full`	false	Include every changed fragment; skip the smart-selection step entirely
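For intuition about `--alpha` in `ppr` mode, the sketch below runs Personalized PageRank by power iteration on a tiny graph. It assumes `alpha` is the probability of jumping back to a changed node at each step, which matches the "higher = more focused" description; diffctx may define the parameter differently, and the graph and function name here are hypothetical.

```python
def personalized_pagerank(graph, seeds, alpha=0.6, iterations=50):
    """Power iteration for PPR on an adjacency map {node: [neighbors]}.

    Assumption: `alpha` is the restart probability (jump back to a seed).
    Every neighbor must also appear as a key of `graph`.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        nxt = {n: alpha * restart[n] for n in nodes}
        for node in nodes:
            out = graph[node]
            if not out:
                # Dangling node: hand its mass back to the seeds.
                for s in seeds:
                    nxt[s] += (1 - alpha) * rank[node] / len(seeds)
                continue
            share = (1 - alpha) * rank[node] / len(out)
            for nbr in out:
                nxt[nbr] += share
        rank = nxt
    return rank


# Hypothetical chain: a (changed) <-> b <-> c
chain = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
ranks = personalized_pagerank(chain, ["a"])
print({n: round(r, 3) for n, r in ranks.items()})
```

In this model, raising `alpha` keeps more of the score on the changed node and its immediate neighbors, so distant nodes fall below the inclusion threshold sooner.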

Calibration of --alpha, --tau, and the edge-weight priors is documented in docs/parameter-strategy.md.

Theory: Context-Selection for Git Diff (Zenodo, 2026).

`graph` subcommand

For exploring the underlying dependency graph directly (without a diff), use the graph subcommand:

diffctx graph .                                  # Mermaid graph of directory deps (default)
diffctx graph . --summary                        # cycles, hotspots, coupling metrics
diffctx graph . --level fragment -f json         # fragment-level graph as JSON
diffctx graph . --level file -f graphml -o g.xml # file-level graph as GraphML

Flag	Default	Description
`-f/--format`	`mermaid`	Output format: `mermaid`, `json`, or `graphml`
`--level`	`directory`	Granularity: `fragment`, `file`, or `directory`
`--summary`	false	Print graph statistics (cycles, hotspots, coupling)

Usage

# full codebase export:
diffctx .                                # YAML to stdout + token count
diffctx . -f md -c                       # Markdown → clipboard
diffctx . -f json -o tree.json           # JSON → file
diffctx . --no-content                   # structure only, no file contents
diffctx . --max-depth 3                  # limit depth
diffctx . -i custom.ignore               # custom ignore patterns

# diff context mode (requires git repo):
diffctx . --diff HEAD~1                  # context for last commit
diffctx . --diff main..feature           # context for feature branch
diffctx . --diff HEAD~1 --budget 30000   # limit to ~30k tokens
diffctx . --diff HEAD~1 -c               # diff context to clipboard
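When calling the CLI from a script, a thin wrapper along these lines may be convenient. It only uses the `--diff` and `--budget` flags shown above; the function names are hypothetical, and `run_diffctx` assumes `diffctx` is on `PATH` and is run inside a git repository.

```python
import subprocess


def diffctx_command(diff_range, budget=None, root="."):
    """Build the argument list for a diff-context run."""
    cmd = ["diffctx", root, "--diff", diff_range]
    if budget is not None:
        cmd += ["--budget", str(budget)]
    return cmd


def run_diffctx(diff_range, **kwargs):
    """Run diffctx and return (stdout, stderr); stderr carries the token summary."""
    result = subprocess.run(
        diffctx_command(diff_range, **kwargs),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return result.stdout, result.stderr


print(diffctx_command("HEAD~1", budget=30000))
```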

Full codebase export output format:

name: myproject
type: directory
children:
  - name: main.py
    type: file
    content: |
      def hello():
          print("Hello, World!")
  - name: utils/
    type: directory
    children:
      - name: helpers.py
        type: file
        content: |
          def add(a, b):
              return a + b
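The nested structure is straightforward to walk once parsed. The sketch below assumes the export has been loaded into a dict with the keys shown above; `iter_files` is a hypothetical helper, not part of the diffctx API.

```python
def iter_files(node, prefix=""):
    """Yield (path, content) for every file node in a parsed export tree."""
    name = node["name"].rstrip("/")  # directory names carry a trailing slash
    path = f"{prefix}/{name}" if prefix else name
    if node["type"] == "file":
        # Content is absent when the export was made with --no-content.
        yield path, node.get("content", "")
    else:
        for child in node.get("children", []):
            yield from iter_files(child, path)


# Stand-in for the parsed export shown above.
tree = {
    "name": "myproject",
    "type": "directory",
    "children": [
        {"name": "main.py", "type": "file",
         "content": 'def hello():\n    print("Hello, World!")\n'},
        {"name": "utils/", "type": "directory", "children": [
            {"name": "helpers.py", "type": "file",
             "content": "def add(a, b):\n    return a + b\n"},
        ]},
    ],
}

for path, _content in iter_files(tree):
    print(path)
```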

Token Counting

Token count and size are always displayed on stderr:

12,847 tokens (o200k_base), 52.3 KB

For large outputs (>1 MB), the token count is approximate and marked with a ~ prefix:

~125,000 tokens (o200k_base), 5.2 MB

Uses tiktoken with o200k_base encoding (GPT-4o tokenizer).
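Scripts that capture stderr may want the summary as structured data. The regex below is a sketch based only on the two example lines above; the set of size units (`B`, `KB`, `MB`, `GB`) and the helper name `parse_summary` are assumptions.

```python
import re

SUMMARY_RE = re.compile(
    r"^(?P<approx>~)?(?P<tokens>[\d,]+) tokens \((?P<encoding>[^)]+)\), "
    r"(?P<size>[\d.]+) (?P<unit>[KMG]?B)$"
)


def parse_summary(line):
    """Parse a summary line into a dict, or return None if it does not match."""
    m = SUMMARY_RE.match(line.strip())
    if m is None:
        return None
    return {
        "tokens": int(m["tokens"].replace(",", "")),
        "approximate": m["approx"] is not None,  # True when the ~ prefix is present
        "encoding": m["encoding"],
        "size": float(m["size"]),
        "unit": m["unit"],
    }


print(parse_summary("12,847 tokens (o200k_base), 52.3 KB"))
print(parse_summary("~125,000 tokens (o200k_base), 5.2 MB"))
```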

Clipboard Support

Copy output directly to clipboard with -c or --copy:

diffctx . -c                       # copy (stdout suppressed, stderr: token count)
diffctx . -c -o tree.yaml          # copy + save to file

System Requirements:

macOS: pbcopy (pre-installed)
Windows: clip (pre-installed)
Linux (Wayland): wl-copy
Linux (X11): xclip or xsel
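The platform table above maps naturally onto a small lookup. The following is a sketch of how such a choice could be made, not a description of what diffctx does internally; detecting Wayland through the `WAYLAND_DISPLAY` variable and the exact `xclip`/`xsel` arguments are assumptions.

```python
import os
import shutil
import sys


def clipboard_command(platform=None, env=None, which=shutil.which):
    """Return the argv of the first available clipboard tool, or None."""
    platform = platform or sys.platform
    env = os.environ if env is None else env
    if platform == "darwin":
        candidates = [["pbcopy"]]
    elif platform.startswith("win"):
        candidates = [["clip"]]
    elif env.get("WAYLAND_DISPLAY"):
        candidates = [["wl-copy"]]
    else:
        candidates = [
            ["xclip", "-selection", "clipboard"],
            ["xsel", "--clipboard", "--input"],
        ]
    for cmd in candidates:
        if which(cmd[0]):  # skip tools that are not installed
            return cmd
    return None


print(clipboard_command())
```

The returned command can then be fed the text on stdin, for example with `subprocess.run(cmd, input=text, text=True, check=True)`.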

Python API

from pathlib import Path

from diffctx import map_directory
from diffctx import to_yaml, to_json, to_text, to_markdown

tree = map_directory(
    Path("."),                # directory to map (here: the current directory)
    max_depth=None,           # limit traversal depth
    no_content=False,         # exclude file contents
    max_file_bytes=None,      # skip large files
    ignore_file=None,         # custom ignore file
    no_default_ignores=False, # disable default ignores
    whitelist_file=None,      # include-only filter
)

yaml_str = to_yaml(tree)
json_str = to_json(tree)
text_str = to_text(tree)
md_str = to_markdown(tree)

# Diff context mode
from pathlib import Path
from diffctx import build_diff_context, to_yaml

ctx = build_diff_context(
    Path("."),                # repository root
    "HEAD~1..HEAD",           # diff range; also accepts "main..feature"
    budget_tokens=None,       # None = convergence-based (default)
                              #   0  = diff only, no expansion (recall floor)
                              #  <0  = unlimited (10M-token soft ceiling)
                              #  >0  = explicit token cap
    alpha=0.6,                # PPR damping factor
    tau=0.08,                 # stopping threshold
    full=False,               # skip smart selection
)
yaml_str = to_yaml(ctx)

MCP Server

diffctx includes an MCP server that lets AI assistants (Claude Code, Cursor, Windsurf, etc.) call diff context analysis automatically during code review.

pip install 'diffctx[mcp]'

Add to your MCP client config (e.g. ~/.claude/mcp.json for Claude Code):

{
  "mcpServers": {
    "diffctx": {
      "command": "diffctx-mcp"
    }
  }
}
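If the config file already lists other servers, the entry has to be merged rather than pasted over them. A minimal sketch using only the standard library follows; the helper name `add_diffctx_server` is hypothetical, and the demo writes to a temporary directory rather than a real client config.

```python
import json
import tempfile
from pathlib import Path


def add_diffctx_server(config_path):
    """Add the diffctx entry to an MCP config file, keeping existing servers."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    config.setdefault("mcpServers", {})["diffctx"] = {"command": "diffctx-mcp"}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


# Demo against a throwaway file that already contains another (made-up) server.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "mcp.json"
    demo_path.write_text(
        json.dumps({"mcpServers": {"other": {"command": "other-mcp"}}}),
        encoding="utf-8",
    )
    merged = add_diffctx_server(demo_path)

print(sorted(merged["mcpServers"]))
```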

The server exposes a get_diff_context tool. Your AI assistant will automatically call it when reviewing PRs, explaining changes, or investigating broken tests — no manual invocation needed.

See src/diffctx/mcp/README.md for configs for Cursor, Continue, Windsurf, and Zed.

Ignore Patterns

Respects .gitignore and .diffctx/ignore automatically. Use --no-default-ignores to disable built-in patterns (.gitignore and .diffctx/ignore still apply).

Hierarchical: nested ignore files at each directory level
Negation patterns: !important.log un-ignores a file
Anchored patterns: /root_only.txt matches only in root
Output file is always auto-ignored
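The negation and anchoring rules can be illustrated with a deliberately simplified matcher. It assumes the last matching pattern wins, as in .gitignore, and it ignores directory-only patterns, `**`, and nested ignore files, so it should be read as a sketch of the semantics rather than diffctx's matcher.

```python
from fnmatch import fnmatchcase


def is_ignored(rel_path, patterns):
    """Return True if `rel_path` (POSIX-style, relative to the root) is ignored."""
    ignored = False
    name = rel_path.rsplit("/", 1)[-1]
    for raw in patterns:
        pattern = raw.strip()
        if not pattern or pattern.startswith("#"):
            continue  # skip blank lines and comments
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        if pattern.startswith("/"):
            # Anchored: compare against the full path from the root only.
            matched = fnmatchcase(rel_path, pattern[1:])
        else:
            # Unanchored: may match the file name at any depth.
            matched = fnmatchcase(name, pattern) or fnmatchcase(rel_path, pattern)
        if matched:
            ignored = not negated  # a later match overrides an earlier one
    return ignored


patterns = ["*.log", "!important.log", "/root_only.txt"]
for candidate in ["debug.log", "important.log", "root_only.txt", "sub/root_only.txt"]:
    print(candidate, is_ignored(candidate, patterns))
```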

Auto-discovered files:

.diffctx/ignore — diffctx-specific ignore patterns
.diffctx/whitelist — Include-only filter (only matched files included)

Content Placeholders

<file too large: N bytes> — exceeds --max-file-bytes
<binary file: N bytes> — binary file detected
<unreadable content: not utf-8> — not valid UTF-8
<unreadable content> — permission denied or I/O error
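The order in which these placeholders could be chosen is sketched below. The placeholder strings come from the list above; the function name, the check order, and the rule that a NUL byte marks a file as binary are assumptions made for illustration.

```python
import tempfile
from pathlib import Path


def read_content(path, max_file_bytes=None):
    """Return a file's text, or one of the placeholders listed above."""
    path = Path(path)
    try:
        size = path.stat().st_size
        if max_file_bytes is not None and size > max_file_bytes:
            return f"<file too large: {size} bytes>"
        data = path.read_bytes()
    except OSError:
        return "<unreadable content>"  # missing file, permission denied, I/O error
    if b"\x00" in data:
        return f"<binary file: {len(data)} bytes>"  # assumption: NUL means binary
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return "<unreadable content: not utf-8>"


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "ok.txt").write_bytes(b"hello\n")
    (root / "blob.bin").write_bytes(b"\x00\x01\x02")
    (root / "latin1.txt").write_bytes(b"caf\xe9")  # Latin-1, invalid as UTF-8
    results = {
        "ok": read_content(root / "ok.txt"),
        "large": read_content(root / "ok.txt", max_file_bytes=3),
        "binary": read_content(root / "blob.bin"),
        "latin1": read_content(root / "latin1.txt"),
        "missing": read_content(root / "missing.txt"),
    }

for label, value in results.items():
    print(label, repr(value))
```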

License

Apache 2.0

Changelog
Security policy — threat model and vulnerability reporting
Parameter strategy — how --alpha, --tau, and edge weights are calibrated
