Context
PR #76 (#41 first cut) shipped summarizeFidelity and hasMinimumFidelity in @relayburn/analyze, but stopped short of wiring those helpers into the commands that consume the ledger. Quoting PR #76's deferred-work paragraph:
burn compare, burn waste, burn limits, burn plans behavior gating on fidelity class — the helpers (hasMinimumFidelity, summarizeFidelity) are now in place; wiring them into each command is the natural follow-up.
burn limits reads ledger turns via loadForecastFromLedger (packages/cli/src/commands/limits.ts:46) to project tokensSoFar against the active 5-hour window. Today it sums tokens regardless of whether each contributing turn has reliable token data: a Codex turn missing token_count is recorded with usage.input === 0 / usage.output === 0, so it adds nothing to tokensSoFar and silently skews the forecast toward "lots of headroom left." Quoting #41:
burn limits / burn plans (#5, #39) … should permit partial usage data where enough exists for spend totals … should mark projections as low-confidence when the underlying fidelity is partial.
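To make the undercount concrete, here is a minimal sketch of the summing behavior described above. TurnUsage, sumTokens, and countZeroUsageTurns are hypothetical stand-ins for illustration; they are not the actual loadForecastFromLedger code, which this issue does not show.

```typescript
// Hypothetical stand-in for the usage fields on a ledger turn; the real
// TurnRecord type may carry more fields or different names.
interface TurnUsage {
  input: number;
  output: number;
}

// Naive sum over the windowed slice: a turn whose token data was never
// captured contributes 0, exactly like a turn that used no tokens.
function sumTokens(turns: readonly TurnUsage[]): number {
  return turns.reduce((total, u) => total + u.input + u.output, 0);
}

// Counts turns that are indistinguishable from "no usage" in the sum.
function countZeroUsageTurns(turns: readonly TurnUsage[]): number {
  return turns.filter((u) => u.input === 0 && u.output === 0).length;
}

const windowTurns: TurnUsage[] = [
  { input: 1200, output: 300 },
  { input: 0, output: 0 }, // e.g. a Codex turn missing token_count
  { input: 800, output: 200 },
];

const tokensSoFar = sumTokens(windowTurns); // 2500
const zeroUsageTurns = countZeroUsageTurns(windowTurns); // 1
console.log({ tokensSoFar, zeroUsageTurns });
```

The middle turn almost certainly consumed tokens, yet the total gives no hint that anything is missing. That gap is what the confidence flag proposed below is meant to surface.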
Proposal
In packages/cli/src/commands/limits.ts (and the loadForecastFromLedger helper it calls):
- Permissive filter. limits is allowed to consume partial and aggregate-only data; token / cost totals still mean something even when per-turn detail is fuzzy. Do not default-exclude turns the way compare does. Use the entire slice the active 5-hour window covers.
- Track contributing fidelity. Run summarizeFidelity over the windowed slice. Compute a confidence flag: high when every contributing turn has class === 'full' or 'usage-only' (with hasInputTokens + hasOutputTokens true); low when any contributing turn is partial / aggregate-only / cost-only / unknown.
- Surface confidence in the rendered output. When confidence === 'low', append a notice to the human-readable line: forecast: low-confidence (N of M contributing turns lack per-turn token data). The forecast number itself is unchanged: we are not refusing, we are flagging.
- JSON contract. Add a fidelity block to the --json payload: { confidence: 'high' | 'low', summary: FidelitySummary }. Reuse the same shape as summary --json so programmatic consumers don't have to learn a second schema.
- --watch mode. Recompute on each tick; the confidence flag may flip mid-window as new full-fidelity turns arrive.
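The confidence rule and notice can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation: TurnFidelity, forecastConfidence, confidenceNotice, and fidelityBlock are hypothetical names, the real fidelity types in @relayburn/analyze may differ, and the FidelitySummary shape is left generic because it is not shown here. The sketch reads the parenthetical in the rule as applying only to 'usage-only' turns, and treats an empty window as high confidence; both are assumptions.

```typescript
// Hypothetical stand-ins; the real TurnRecord.fidelity shape may differ.
type FidelityClass =
  | "full"
  | "usage-only"
  | "partial"
  | "aggregate-only"
  | "cost-only"
  | "unknown";

interface TurnFidelity {
  class: FidelityClass;
  hasInputTokens: boolean;
  hasOutputTokens: boolean;
}

type Confidence = "high" | "low";

interface ConfidenceResult {
  confidence: Confidence;
  lacking: number; // N: turns without per-turn token data
  total: number; // M: all contributing turns in the window
}

// Assumption: 'full' always qualifies; 'usage-only' qualifies only when both
// input and output token coverage flags are true.
function hasPerTurnTokens(f: TurnFidelity): boolean {
  if (f.class === "full") return true;
  return f.class === "usage-only" && f.hasInputTokens && f.hasOutputTokens;
}

// Any single non-qualifying turn makes the whole window low-confidence.
function forecastConfidence(turns: readonly TurnFidelity[]): ConfidenceResult {
  const lacking = turns.filter((f) => !hasPerTurnTokens(f)).length;
  return {
    confidence: lacking === 0 ? "high" : "low",
    lacking,
    total: turns.length,
  };
}

// Returns null in the high-confidence case so the renderer prints nothing extra.
function confidenceNotice(r: ConfidenceResult): string | null {
  if (r.confidence === "high") return null;
  return `forecast: low-confidence (${r.lacking} of ${r.total} contributing turns lack per-turn token data)`;
}

// Builds the proposed --json fidelity block; S stands in for FidelitySummary.
function fidelityBlock<S>(r: ConfidenceResult, summary: S): { confidence: Confidence; summary: S } {
  return { confidence: r.confidence, summary };
}
```

Because forecastConfidence is a pure function of the windowed slice, calling it again on each --watch tick is enough to let the flag flip as new turns arrive.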
Acceptance criteria
- burn limits continues to render a forecast even when the windowed slice contains partial or aggregate-only turns (no refusal).
- The rendered output shows a low-confidence notice when any contributing turn lacks per-turn token coverage; full-fidelity windows show no notice (suppressed in the all-full common case to avoid noise, matching summary's behavior).
- --json emits a fidelity block with confidence and the underlying FidelitySummary.
- --watch re-evaluates confidence on each tick.
- New tests in packages/cli/src/commands/limits.test.ts cover: high-confidence (all full), low-confidence (one partial turn), and the JSON shape.
Out of scope
- The OAuth usage endpoint side of limits (Anthropic-side window data). That is independent of TurnRecord.fidelity and stays unchanged.
- Confidence intervals / probabilistic forecasts. The flag is binary high / low for the first cut.
Refs