Releases: noah-thing/receipt
Release list
AI token receipt
Receipt posts an itemized AI-cost comment, tokens and dollars, broken down by model, on every pull request, scoped to that branch's commits. It puts the cost of an agent's work right next to the diff, instead of waiting for the monthly bill.
Works with Claude Code, Cursor, Copilot, Aider, Codex, and any OpenAI or Anthropic-compatible API. Every figure comes from the provider's own token counts times a price book you can read and override. It never stores prompts or completions.
Receipt 0.5.0 — session-health automation
Builds on 0.4.0's session health with automation, history, and a reviewer-facing note.
receipt guard— a Claude Code hook entrypoint. Stays silent until a session crosses your gate, then prints the single most important move where the agent sees it. With--notifyit exits 2 so Claude Code feeds the nudge back to the model — strongest on thePreCompacthook, right before lossy auto-compaction.receipt health --json/--quiet --gate— machine-readable output and severity exit codes (0 / 10 watch / 20 degrading / 30 critical) for hooks and CI.receipt health --all— scores every past session and learns your personal pattern ("you tend to drift around turn ~12; X% of sessions compacted too late").- Context tax — shows how much of a session is just re-sending itself (the quadratic cost behind both rising spend and fading quality).
- PR-comment health note — a collapsed, reviewer-facing
<details>block when the work ran under degrading conditions; silent otherwise; opt out with"health": false. It never claims the code is wrong — only points to where to look.
Honest constraint: the token-only ledger cannot detect redundant file reads, identical-command loops, or semantic issues (hallucinations, drift) — those need data Receipt deliberately never stores. Features only ever flag "conditions correlated with drift," documented in docs/SESSION-HEALTH.md.
Still deterministic and local. 104 tests, typecheck clean. MIT.
Receipt 0.4.0 — session health
Receipt now keeps your agent sharp, not just cheap. 🧠
receipt health # is the current session still sharp?Long AI-coding sessions get worse before they hit any limit — context rot, lost-in-the-middle, instruction drift, looping, and lossy auto-compaction set in while the window still has room. Built from a six-agent research sweep (RULER, lost-in-the-middle, Chroma's context-rot study, the multi-turn "lost in conversation" finding, and Anthropic's context-engineering guidance), receipt health reads the same local ledger and scores the latest session for:
- context fill — by % and absolute tokens (big windows degrade by absolute size)
- session length & age — multi-turn drift past ~12 turns; accumulation slowdown past ~2h
- cache reuse — low reuse means the context is churning
- auto-compaction cascades — each summary loses precise detail
- looping — retries that re-send everything for nothing
…then tells you the move to make before quality drops: /compact (proactively at 50–60%), /clear between tasks, a fresh session with a handoff, or /rewind out of a loop.
It never tells the agent to think or write less — every suggestion refreshes context to preserve full capability. The live gauge also rides in receipt statusline and receipt fuel. Research, signals, and thresholds: docs/SESSION-HEALTH.md.
Deterministic and local — no API key, nothing leaves your machine. 89 tests, typecheck clean. MIT.
Receipt 0.3.0 — advice
Receipt now tells you how to spend fewer tokens — without making the work worse. 💡
receipt advice # this branch
receipt advice --all # the whole ledgerIt reads your usage, names what's driving the cost, and recommends fixes that only target waste — never "think less" or "explain less." Output and reasoning are the value; the advisor leaves them alone. What it catches:
- context re-sent at full price instead of cached,
- the cache rebuilt faster than it's reused,
- retries that re-send everything for no new output,
- premium prices paid for mechanical reads a cheaper model handles at the same quality.
Each tip is ranked by impact with the dollar saving where it's computable, and the single biggest win rides along on every PR receipt and receipt show.
Deterministic and local — no API key, no network. 75 tests, typecheck clean. MIT.
Receipt 0.2.1 — hardening
Hardening and polish for the usage-awareness feature shipped in 0.2.0. No breaking changes.
- Provider-aware lever. The what-if suggestion now picks the cheapest model from the same provider — an OpenAI-heavy branch is told about
gpt-4.1-nano, not a Claude model. - Foolproof plan lookup. A malformed or hand-edited plan id (e.g.
toString) can no longer resolve to an inherited object member and poison the budget.budget planvalidates strictly. - Cleaner distribution. Task sizes and records ignore unscoped proxy-only usage, so "≈ 2 PRs" reflects your real branches.
- Quieter proxy. Rate-limit capture skips no-op disk writes.
- Opt-out. Set
"usage": falsein.receipt/config.jsonto drop the block from the PR comment andreceipt show.
67 tests, typecheck clean. Verified end-to-end across empty / no-plan / full scenarios. MIT.
Receipt 0.2.0 — usage awareness
See what a task costs you, not just dollars. 🔋
Receipt now measures every task as a share of your Claude plan's windows — the 5-hour and weekly limits — and turns that into things that change how you work.
New commands
receipt fuel— how much of your plan you're using right now, what's left, and your pacereceipt records— your heaviest, leanest, and most recent tasks, rankedreceipt forecast— a typical task's window impact and your weekly runwayreceipt statusline— a one-line gauge for the Claude Code statuslinereceipt calibrate— set your real budget from a limit you actually hitreceipt budget plan <pro|max5x|max20x>— pick your tier
On the PR comment and receipt show: a usage block with the % of your 5h and weekly windows the branch ate, the same number in your own work-units, an efficiency grade, and the one lever worth pulling (what the heavy model's reads would cost on a cheaper one). Add --fun for honest playful comparisons.
Real numbers, honestly. The proxy now reads your true ceiling from the provider's anthropic-ratelimit-* headers; or receipt calibrate learns it from a wall you hit. Presets are clearly labeled estimates. How it all works: docs/LIMITS.md.
Everything reads the same local, append-only ledger — no prompts, nothing leaves your machine. 60 tests, typecheck clean. MIT.
Receipt 0.1.0
First public version. Itemized AI cost on every pull request.
- Logging proxy for streaming and non-streaming usage
- Importers for agent session logs and any JSON/JSONL export
- Append-only token and cost ledger (metadata only, never content)
- Markdown PR receipt: model table, spend sparkline, budget bar, median-PR compare
receipt postand a composite GitHub Action that upsert one sticky comment- Local dashboard: spend over time, by model, by branch, top calls
Use it on a PR with uses: noah-thing/receipt@v1. 40 tests, no production dependencies.