Context
Prism (jakeefr/prism) validates this as the single highest-leverage waste-finder in a Claude Code project: CLAUDE.md rides in every turn's cached context, so a bloated CLAUDE.md silently taxes every session in the repo forever. Their real-data examples (from the README):
"6738% CLAUDE.md re-read cost in one session: a 237-line file being re-read on every tool call"
"CLAUDE.md re-reads consumed 480% of total session tokens"
Prism's mechanism is a proxy: reread_cost = tool_call_count × claude_md_size_tokens (analyzer.py:173). It never touches real cacheRead values. The shock ratios are (count × size) / session_incremental_tokens, meant to alarm, not to be precise.
Burn can do this correctly because the ledger already carries real per-turn cacheRead values.
Scope
One file (plus nested variants): the project's CLAUDE.md at repo root. Also detect nested .claude/CLAUDE.md or scoped src/**/CLAUDE.md files Claude Code loads hierarchically.
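A minimal sketch of the hierarchical discovery, assuming Claude Code loads outermost files first (the load order and the exact candidate paths are assumptions; `resolve_claude_md_set` is a hypothetical helper name):

```python
from pathlib import Path

def resolve_claude_md_set(cwd: Path, repo_root: Path) -> list[Path]:
    # Walk upward from cwd to the repo root, collecting CLAUDE.md and
    # .claude/CLAUDE.md at each level; return outermost-first to mirror
    # hierarchical load order (assumption).
    found: list[Path] = []
    d, root = cwd.resolve(), repo_root.resolve()
    while True:
        for candidate in (d / "CLAUDE.md", d / ".claude" / "CLAUDE.md"):
            if candidate.is_file():
                found.append(candidate)
        if d == root or d == d.parent:  # stop at repo root (or filesystem root)
            break
        d = d.parent
    return list(reversed(found))
```

Scoped `src/**/CLAUDE.md` files outside the cwd-to-root path would need a separate glob pass; the walk above covers the common case.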
Mechanism
Cost attribution
At session start, resolve the active CLAUDE.md set from cwd / project root.
Measure byte length of each file; estimate tokens (bytes/4 heuristic, or tiktoken cl100k if the dep cost is acceptable).
For each turn T in the session:
share_T = claude_md_tokens / (claude_md_tokens + tool_defs_tokens + conversation_tokens_T)
attributed_tokens_T = usage.cacheRead_T × share_T
attributed_cost_T = attributed_tokens_T × cache_read_price(model)
session_claude_md_cost = Σ attributed_cost_T
tool_defs_tokens — sum of the registered tools' schema tokens. Approximate once per session from the Claude Code client version or derive from the session JSONL if exposed; otherwise treat as a fixed constant per Claude Code minor version and measure once.
conversation_tokens_T — back-compute from usage.input - cacheCreate - cacheRead plus the running prefix. Rough is fine; we're computing a share, not a precise boundary.
Accounts for cache-read pricing being ~10% of fresh-input price (prism doesn't — they multiply tokens by tool-call count as if every read was fresh).
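The per-turn attribution above can be sketched as follows (a minimal sketch: `estimate_tokens` is the bytes/4 heuristic, the function names are hypothetical, and the price constant in the usage note is illustrative, not Anthropic's actual rate):

```python
def estimate_tokens(text: str) -> int:
    # bytes/4 heuristic from the design above; swap in a real tokenizer
    # if the dependency cost is acceptable
    return max(1, len(text.encode("utf-8")) // 4)

def attribute_turn(cache_read_tokens: int, claude_md_tokens: int,
                   tool_defs_tokens: int, conversation_tokens: int,
                   cache_read_price_per_mtok: float) -> float:
    # share_T = claude_md_tokens / (claude_md + tool_defs + conversation_T)
    share = claude_md_tokens / (
        claude_md_tokens + tool_defs_tokens + conversation_tokens)
    attributed_tokens = cache_read_tokens * share
    # priced at the cache-read rate, not the fresh-input rate
    return attributed_tokens * cache_read_price_per_mtok / 1_000_000

def session_claude_md_cost(turns, claude_md_tokens: int,
                           tool_defs_tokens: int,
                           cache_read_price_per_mtok: float) -> float:
    # turns: iterable of (cacheRead_T, conversation_tokens_T) per turn
    return sum(attribute_turn(cr, claude_md_tokens, tool_defs_tokens,
                              conv, cache_read_price_per_mtok)
               for cr, conv in turns)
```

For example, with a 1,000-token CLAUDE.md, 1,000 tokens of tool defs, and a 2,000-token conversation prefix, a 4,000-token cacheRead attributes 1,000 tokens (share 0.25) to CLAUDE.md for that turn.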
Section ranking
Unlike prism's zone heuristic (top 20%, mid 20-75%, bottom 25% at advisor.py:234-235), parse CLAUDE.md by markdown heading structure:
# Top-level / ## Section / ### Subsection — each heading begins a section.
Capture each section's line range, heading text, and token count.
Compute per-section cost as (section_tokens / total_claude_md_tokens) × session_claude_md_cost.
No keyword classification (prism does tone|style|personality regex — too fragile). Rank purely by cost. The section content is what the user judges, not a classifier-assigned category.
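The heading-based split and cost ranking can be sketched as (a minimal sketch under the bytes/4 heuristic; `rank_sections` is a hypothetical helper, and nested-heading roll-up is omitted for brevity):

```python
import re

HEADING = re.compile(r"^(#{1,3})\s+\S")  # matches #, ##, ### headings

def rank_sections(claude_md: str, session_claude_md_cost: float):
    # Split on heading lines; text before the first heading is a "preamble".
    est = lambda text: max(1, len(text.encode("utf-8")) // 4)  # bytes/4
    sections = []                      # [heading, first_line, last_line, body]
    current = ["(preamble)", 1, 1, []]
    for i, line in enumerate(claude_md.splitlines(), start=1):
        if HEADING.match(line):
            sections.append(current)
            current = [line.lstrip("# ").rstrip(), i, i, []]
        current[3].append(line)
        current[2] = i
    sections.append(current)
    sections = [s for s in sections if any(l.strip() for l in s[3])]
    total = sum(est("\n".join(s[3])) for s in sections) or 1
    ranked = [{"heading": s[0], "lines": (s[1], s[2]),
               "tokens": est("\n".join(s[3])),
               # per-section cost = (section_tokens / total) × session cost
               "cost": est("\n".join(s[3])) / total * session_claude_md_cost}
              for s in sections]
    return sorted(ranked, key=lambda r: r["cost"], reverse=True)
```

A file with no headings falls out of this naturally as a single "(preamble)" section.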
CLI surface
```
burn claude-md [--project <path>] [--since 7d]
# Prints:
#   CLAUDE.md at /Users/will/Projects/foo (412 lines, ~3.1k tokens)
#   Cost per session: avg $0.18, p95 $0.34
#   Cost last 7 days: $4.12 across 23 sessions
#
#   Sections ranked by cost:
#     lines 12- 47  ## Architecture               1,820 tok   $0.11/session
#     lines 48- 89  ## Testing conventions          890 tok   $0.05/session
#     lines 90-148  ## Tone and personality notes   410 tok   $0.02/session  <- consider trimming
#     …

burn claude-md advise [--project <path>]
# Emits a unified diff of suggested TRIM / RESTRUCTURE hunks with
# per-recommendation projected-savings-per-session and savings-over-last-N-sessions.
# Does NOT write to the file. User applies manually.

burn claude-md --json   # programmatic
```
Explicitly no --apply flag. Unlike prism's advise --apply, burn never mutates files.
Acceptance
Attribution math: on a fixture session with a known CLAUDE.md and known cacheRead values, burn claude-md reports a per-turn attributed cost within ±10% of a hand-computed baseline.
Section parsing correctly handles: top-level text before the first heading (treated as a "preamble" section), nested headings (rolled up to their parent level for ranking), and files with no headings (reported as a single section with a note).
burn claude-md --since 7d aggregates across sessions for the same projectKey (depends on the git canonicalization from #4, Reader infrastructure: incremental cursors and git-canonical project keys).
advise output is applyable by hand: line ranges match the user's current file (run against HEAD, not a stale snapshot).
If CLAUDE.md has changed during the reporting window, attribution uses the CLAUDE.md at the time of each session, not the current version. (Read via git log if the file is tracked; fall back to the current file with a warning otherwise.)
Depends on
The waste finder (burn waste) — this is the most user-visible waste finder.
Unblocks
The biggest concrete "switch a choice, save spend" answer for the meta-goal. Higher leverage than model-switching because CLAUDE.md savings compound across every future session in the repo.
Out of scope
No advise --apply equivalent — deliberate; burn recommends, doesn't mutate.
Prism-style keyword classification (analyzer.py:548+) — narrow and doesn't scale. See #6, Design: outcome / quality signal for "same output, less spend" comparisons.