Add burn waste: per-tool-call & per-file cost attribution #50

willwashburn wants to merge 1 commit into `main`
Conversation
Computes initial cost (the turn after a tool call, where the `tool_result` enters context) and persistence cost (subsequent turns where the result rides along in `cacheRead`) for every `tool_use` in a session. Aggregates into three rankings: top files (`Read`/`Edit`/`Write`/`NotebookEdit` by target path), top Bash commands (grouped by `argsHash` so repeated invocations collapse), and top subagent calls (`Agent`/`Task` by `subagent_type`).

Sized attribution uses the content sidecar to estimate each `tool_result`'s token count from its text length (chars/4); when the sidecar is unavailable, the command falls back to an even split across the prior turn's tool calls (initial cost only) and emits a note. Tool results are treated as evicted from cache once a turn's `cacheRead` drops below their estimated size, capping persistence reasonably without modeling the 5m/1h tier boundary explicitly.

Closes #3

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
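The sizing heuristic described above (sidecar chars/4 estimate with an even-split fallback) could look roughly like the sketch below. All names here (`ToolResult`, `estimateTokens`, `CHARS_PER_TOKEN`) are illustrative, not the actual `waste.ts` API.

```typescript
// Illustrative sketch only; the real waste.ts types and names may differ.
interface ToolResult {
  id: string;
  text?: string; // present only when the content sidecar is available
}

const CHARS_PER_TOKEN = 4;

// Estimate each tool_result's token count from its text length (chars / 4).
// When any sidecar text is missing, fall back to an even split of the prior
// turn's input+cacheCreate tokens across that turn's tool calls (initial
// cost only); callers would emit a note in that case.
function estimateTokens(
  results: ToolResult[],
  priorTurnTokens: number,
): { id: string; tokens: number; estimated: boolean }[] {
  const haveText = results.every((r) => typeof r.text === "string");
  if (haveText) {
    return results.map((r) => ({
      id: r.id,
      tokens: Math.ceil((r.text as string).length / CHARS_PER_TOKEN),
      estimated: false,
    }));
  }
  // Even-split fallback: no sidecar, so attribute equal shares.
  const share = Math.floor(priorTurnTokens / results.length);
  return results.map((r) => ({ id: r.id, tokens: share, estimated: true }));
}
```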
```ts
const turnRate = t.model === emitModel ? rate : null;
const cacheReadPrice = turnRate ? turnRate.cacheRead : rate.cacheRead;
```
🟡 Cross-model persistence pricing is a no-op — always uses emit model's rate
In computePersistenceCostSized, lines 271-272 attempt to differentiate pricing when a tail turn uses a different model than the emit turn, but the logic is a no-op. When t.model === emitModel, turnRate = rate so cacheReadPrice = rate.cacheRead. When t.model !== emitModel, turnRate = null so the fallback is still rate.cacheRead. Both branches produce the identical result. The pricing table (available in the caller at packages/analyze/src/waste.ts:130) is never passed through to this function, so it cannot look up the tail turn's actual model rate. In sessions that switch models (e.g. Sonnet → Haiku, where cacheRead rates differ ~10×), persistence costs will be priced at the wrong model's rate.
Prompt for agents
The function computePersistenceCostSized at packages/analyze/src/waste.ts:254 receives only the emit model's ModelCost (rate), not the full PricingTable. Lines 271-272 attempt cross-model rate differentiation but are a no-op because the fallback always uses rate.cacheRead.
To fix: pass the PricingTable through the call chain from attributeSession (line 130) → attributeToolCallsSized (line 164) → computePersistenceCostSized (line 198). Then inside the loop at line 271, call lookupRate(t.model, pricing) to get the tail turn's actual rate and use its cacheRead price. The emitModel parameter and the existing conditional can then be replaced with a proper pricing table lookup.
Affected files: packages/analyze/src/waste.ts (attributeSession, attributeToolCallsSized, computePersistenceCostSized function signatures and bodies).
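One way the suggested fix could look, sketched below. The `ModelCost`/`PricingTable` shapes and the `lookupRate` signature are assumed from the review text, not copied from `waste.ts`:

```typescript
// Sketch of the suggested fix: thread the pricing table through so each
// tail turn is priced at its own model's cacheRead rate, falling back to
// the emit model's rate for unknown models. Types are assumptions.
type ModelCost = { input: number; cacheWrite: number; cacheRead: number };
type PricingTable = Record<string, ModelCost>;

function lookupRate(model: string, pricing: PricingTable): ModelCost | undefined {
  return pricing[model];
}

function computePersistenceCostSized(
  S: number, // estimated tool_result size in tokens
  tail: { model: string; cacheRead: number }[],
  emitRate: ModelCost,
  pricing: PricingTable, // newly threaded through from attributeSession
): number {
  let total = 0;
  for (const t of tail) {
    if (t.cacheRead < S) break; // approximate eviction: result no longer fits
    const rate = lookupRate(t.model, pricing) ?? emitRate;
    total += (S * rate.cacheRead) / 1_000_000;
  }
  return total;
}
```

With this shape, a Sonnet → Haiku session prices Haiku tail turns at Haiku's `cacheRead` rate instead of silently reusing the emit model's rate.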
Summary
- `burn waste` command: ranks tool calls, files, Bash commands, and subagent calls by their attributed cost
- Cost is split into initial (the tool_result entering context as `input`/`cacheCreate`) and persistence (the same tokens riding along in `cacheRead` for every subsequent turn until eviction)
- Closes #3.
Output shape
- `--all` shows the full lists; `--json` emits the raw aggregations for downstream tooling.

Attribution math
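The attribution math that follows can be made concrete with a small TypeScript sketch. `Turn`, `Rate`, and the function names here are illustrative, not the actual `waste.ts` API, and both costs are normalized to per-1M-token prices:

```typescript
// Hypothetical shapes; the real waste.ts types may differ.
interface Turn {
  input: number;       // fresh input tokens billed this turn
  cacheCreate: number; // tokens written to cache this turn
  cacheRead: number;   // tokens read from cache this turn
}

interface Rate {
  input: number;       // $ per 1M fresh input tokens
  cacheWrite: number;  // $ per 1M cache-write tokens
  cacheRead: number;   // $ per 1M cache-read tokens
}

// Initial cost: S tokens priced at the blended input/cacheCreate rate
// that turn N+1 actually paid.
function initialCost(S: number, next: Turn, rate: Rate): number {
  const denom = next.input + next.cacheCreate;
  if (denom === 0) return 0;
  const blended =
    (next.input * rate.input + next.cacheCreate * rate.cacheWrite) / denom;
  return (S * blended) / 1_000_000;
}

// Persistence cost: S × cacheReadPrice per tail turn, stopping at the
// approximate eviction point (cacheRead < S).
function persistenceCost(S: number, tail: Turn[], rate: Rate): number {
  let total = 0;
  for (const t of tail) {
    if (t.cacheRead < S) break; // result no longer fits in cache → evicted
    total += (S * rate.cacheRead) / 1_000_000;
  }
  return total;
}
```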
- Initial: a tool_result of `S` tokens contributes `S × ((next.input × inputPrice + next.cacheCreate × cacheWritePrice) / (next.input + next.cacheCreate))` — i.e. the model-weighted blend of fresh-input and cacheCreate rates that turn N+1 actually paid.
- Persistence: each subsequent turn adds `S × cacheReadPrice / 1M`. Eviction is approximated by stopping when `cacheRead < S` (the tool_result no longer fits in cache, so it must have aged out or been compacted away).
- Grouping: `Read`/`Edit`/`Write`/`NotebookEdit` group by `target` path; `Bash` groups by `argsHash`; `Agent`/`Task` group by `subagent_type` (or target as a fallback label).

Test plan
- `pnpm run build` — clean
- `pnpm run test` — 188/188 passing (six new attribution tests)
- `attributedTotal + unattributedTotal == grandTotal` invariant test
- `burn help` lists `waste` and example invocation

Notes / follow-ups
- The eviction cutoff (`cacheRead < S`) handles the practical case where content has aged out
- `--suggest` heuristics (issue #3, stretch) are out of scope here — easy follow-up once we have real-session output to tune against

🤖 Generated with Claude Code