Skip to content

CLAUDE.md hot-path cost attribution with section-level recommendations #10

@willwashburn

Description

@willwashburn

Context

Prism (jakeefr/prism) validates this as the single highest-leverage waste-finder in a Claude Code project: CLAUDE.md rides in every turn's cached context, so a bloated CLAUDE.md silently taxes every session in the repo forever. Their real-data examples (from the README):

  • "6738% CLAUDE.md re-read cost in one session: a 237-line file being re-read on every tool call"
  • "CLAUDE.md re-reads consumed 480% of total session tokens"

Prism's mechanism is a proxy: reread_cost = tool_call_count × claude_md_size_tokens (analyzer.py:173). It never touches real cacheRead values. The shock ratios are (count × size) / session_incremental_tokens, meant to alarm, not to be precise.

Burn can do this correctly because the ledger already carries real per-turn cacheRead values.

Scope

One file (plus nested variants): the project's CLAUDE.md at repo root. Also detect nested .claude/CLAUDE.md or scoped src/**/CLAUDE.md files Claude Code loads hierarchically.

Mechanism

Cost attribution

At session start, resolve the active CLAUDE.md set from cwd / project root.
Measure byte length of each file; estimate tokens (bytes/4 heuristic, or tiktoken cl100k if the dep cost is acceptable).

For each turn T in the session:
  share_T = claude_md_tokens / (claude_md_tokens + tool_defs_tokens + conversation_tokens_T)
  attributed_tokens_T = usage.cacheRead_T × share_T
  attributed_cost_T   = attributed_tokens_T × cache_read_price(model)

session_claude_md_cost = Σ attributed_cost_T
  • tool_defs_tokens — sum of the registered tools' schema tokens. Approximate once per session from the Claude Code client version or derive from the session JSONL if exposed; otherwise treat as a fixed constant per Claude Code minor version and measure once.
  • conversation_tokens_T — back-compute from usage.input - cacheCreate - cacheRead plus the running prefix. Rough is fine; we're computing a share, not a precise boundary.
  • Accounts for cache-read pricing being ~10% of fresh-input price (prism doesn't — they multiply tokens by tool-call count as if every read was fresh).

Section ranking

Unlike prism's zone heuristic (top 20%, mid 20-75%, bottom 25% at advisor.py:234-235), parse CLAUDE.md by markdown heading structure:

  • # Top-level / ## Section / ### Subsection — each heading begins a section.
  • Capture each section's line range, heading text, and token count.
  • Compute per-section cost as (section_tokens / total_claude_md_tokens) × session_claude_md_cost.

No keyword classification (prism does tone|style|personality regex — too fragile). Rank purely by cost. The section content is what the user judges, not a classifier-assigned category.

CLI surface

burn claude-md [--project <path>] [--since 7d]
# Prints:
#   CLAUDE.md at /Users/will/Projects/foo (412 lines, ~3.1k tokens)
#   Cost per session:   avg $0.18, p95 $0.34
#   Cost last 7 days:   $4.12 across 23 sessions
#
#   Sections ranked by cost:
#     lines  12- 47  ## Architecture              1,820 tok  $0.11/session
#     lines  48- 89  ## Testing conventions         890 tok  $0.05/session
#     lines  90-148  ## Tone and personality notes  410 tok  $0.02/session   <- consider trimming
#     …

burn claude-md advise [--project <path>]
# Emits a unified diff of suggested TRIM / RESTRUCTURE hunks with
# per-recommendation projected-savings-per-session and savings-over-last-N-sessions.
# Does NOT write to the file. User applies manually.

burn claude-md --json   # programmatic

Explicitly no --apply flag. Unlike prism's advise --apply, burn never mutates files.

Acceptance

  • Attribution math: on a fixture session with a known CLAUDE.md and known cacheRead values, burn claude-md reports a per-turn attributed cost within ±10% of a hand-computed baseline.
  • Section parsing correctly handles: top-level text before the first heading (treated as a "preamble" section), nested headings (rolled up to their parent level for ranking), and files with no headings (reported as a single section with a note).
  • Cross-session view: burn claude-md --since 7d aggregates across sessions for the same projectKey (depends on Reader infrastructure: incremental cursors and git-canonical project keys #4's git canonicalization).
  • advise output is applyable by hand: line ranges match the user's current file (run against HEAD, not a stale snapshot).
  • If CLAUDE.md has changed during the reporting window, attribution uses the CLAUDE.md at the time of each session, not the current version. (Read via git log if the file is tracked; fall back to current file with a warning otherwise.)

Depends on

Unblocks

The biggest concrete "switch a choice, save spend" answer for the meta-goal. Higher leverage than model-switching because CLAUDE.md savings compound across every future session in the repo.

Out of scope

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions