
Honor fidelity in burn plans (#108) #134

Open
willwashburn wants to merge 1 commit into main from feat/plans-honor-fidelity-108

Conversation

@willwashburn
Member

@willwashburn willwashburn commented Apr 26, 2026

Summary

  • computePlanUsage now annotates each cycle with a fidelity: { confidence, summary } block. confidence === 'high' only when every contributing turn is full or usage-only with both per-turn input and output token coverage; otherwise it is low. Records without a fidelity field stay best-effort high, matching the codebase's existing pre-#41 backward-compat policy (#41: "Coverage and fidelity metadata: distinguish missing, zero, aggregate-only, and partial usage"). Spend totals continue to include partial / aggregate-only / cost-only contributions, because silently under-counting is worse than annotating low confidence, so spentUsd becomes a lower bound that the consumer renders against the new flag.
  • burn plans (list view) gains a confidence column when at least one plan has any low-confidence cycle, and a footer note naming the affected plan + lower-bound caveat (note: claude-pro: 3 of 412 turns this cycle lack per-turn token data — totals are a lower bound.). Full-fidelity cycles render exactly as before — no extra column, no footer.
  • --json emits a per-plan usage.fidelity: { confidence, summary } block carrying the same FidelitySummary shape summarizeFidelity produces elsewhere.
  • PlanUsageFidelity is exported from @relayburn/analyze.
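
For reviewers who want the rule in one place, here is a minimal sketch of the confidence derivation described above. It is not the code in packages/analyze/src/plan-usage.ts: the `Turn` and `TurnFidelity` shapes, the `hasInputTokens` / `hasOutputTokens` flags, and the `summarizeCycle` name are assumptions made for illustration. It also reads the rule as "full always qualifies; usage-only qualifies only with both token axes", which is one interpretation of the summary.

```typescript
// Hypothetical shapes for illustration; the real types live in @relayburn/analyze.
type FidelityClass = 'full' | 'usage-only' | 'partial' | 'aggregate-only' | 'cost-only';

interface TurnFidelity {
  class: FidelityClass;
  // Assumed coverage flags: are per-turn input / output token counts present?
  hasInputTokens: boolean;
  hasOutputTokens: boolean;
}

interface Turn {
  costUsd: number;
  fidelity?: TurnFidelity; // absent on records written by older ledger writers
}

type Confidence = 'high' | 'low';

interface CycleSummary {
  spentUsd: number;
  confidence: Confidence;
  lowTurns: number;
  totalTurns: number;
}

// A single turn supports a high-confidence total when it is full-fidelity,
// usage-only with both token axes covered, or carries no fidelity field at all
// (best-effort high, per the backward-compat policy).
function turnIsHighConfidence(turn: Turn): boolean {
  const f = turn.fidelity;
  if (f === undefined) return true;
  if (f.class === 'full') return true;
  if (f.class === 'usage-only') return f.hasInputTokens && f.hasOutputTokens;
  return false; // partial, aggregate-only, cost-only
}

// Every turn contributes to spend; low-fidelity turns only downgrade the flag.
// An empty cycle comes out as 'high' here, which is an assumption of this sketch.
function summarizeCycle(turns: Turn[]): CycleSummary {
  let spentUsd = 0;
  let lowTurns = 0;
  for (const turn of turns) {
    spentUsd += turn.costUsd;
    if (!turnIsHighConfidence(turn)) lowTurns += 1;
  }
  return {
    spentUsd,
    confidence: lowTurns === 0 ? 'high' : 'low',
    lowTurns,
    totalTurns: turns.length,
  };
}
```

The point of keeping spend accumulation and the confidence check separate is that a cost-only turn still adds to `spentUsd` while flipping the cycle to low, which is what makes the total a lower bound rather than an omission.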

Rebase consideration for PR #131

PR #131 (issue #91) is migrating burn plans to read from archive.sqlite. This PR targets the current queryAll-based path on main. When #131 lands, the rebaser needs to apply the same fidelity-collection + low-confidence rule to the archive-backed path so the new annotations don't disappear. The annotation logic itself lives entirely in computePlanUsage (analyze) — it walks whatever turns the caller hands it — so the fidelity data only needs to keep flowing through to the renderer. Concretely, the conflicts will be in packages/cli/src/commands/plans.ts runList and the helper that loads turns; everything in packages/analyze/src/plan-usage.ts should rebase cleanly.
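
To make the rebase hazard concrete, here is a small sketch of what happens if an archive-backed loader forgets to carry the fidelity field through. Everything in it is hypothetical: the `Turn` shape, the two loader functions, and the simplified `cycleConfidence` stand-in (which ignores per-axis token coverage) are assumptions, not code from this PR or from #131.

```typescript
// Simplified, hypothetical turn shape; only the fields this sketch needs.
interface Turn {
  costUsd: number;
  fidelity?: { class: 'full' | 'usage-only' | 'partial' | 'aggregate-only' | 'cost-only' };
}

// Simplified stand-in for the confidence rule (ignores per-axis token coverage).
function cycleConfidence(turns: Turn[]): 'high' | 'low' {
  const anyLow = turns.some(
    (t) =>
      t.fidelity !== undefined &&
      t.fidelity.class !== 'full' &&
      t.fidelity.class !== 'usage-only',
  );
  return anyLow ? 'low' : 'high';
}

// Hypothetical loader that keeps every stored field, including fidelity.
function loadKeepingFidelity(rows: Turn[]): Turn[] {
  return rows.map((r) => ({ ...r }));
}

// Hypothetical loader that only selects the cost column.
function loadDroppingFidelity(rows: Turn[]): Turn[] {
  return rows.map((r) => ({ costUsd: r.costUsd }));
}

const stored: Turn[] = [
  { costUsd: 1, fidelity: { class: 'full' } },
  { costUsd: 2, fidelity: { class: 'partial' } },
];

// Same stored data, different answers depending on the loader.
const kept = cycleConfidence(loadKeepingFidelity(stored));
const dropped = cycleConfidence(loadDroppingFidelity(stored));
```

Because records without a fidelity field are treated as best-effort high, a loader that drops the field would not fail loudly; the partial turn would simply stop being flagged.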

Test plan

  • pnpm run build clean.
  • pnpm run test:ts — 523 passing (10 new).
  • New analyze tests cover: high-confidence cycle (all full), high-confidence cycle (usage-only with both axes), high-confidence cycle for unknown-fidelity (older ledger writers), low-confidence cycle (partial turn), cost-only contributions counted toward spend + flagged low-confidence, empty cycle, and out-of-cycle turns ignored.
  • New CLI tests cover: text table omits the column + footer when every cycle is full-fidelity, text table renders the confidence column + footer note when any cycle is low-confidence, and --json emits the usage.fidelity block with the right confidence / summary shape.
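
The text-table behavior those CLI tests exercise can be sketched roughly as follows. This is an illustration, not the runList implementation in packages/cli/src/commands/plans.ts: the `PlanRow` view model, the `renderPlanList` name, and the tab-separated layout are all assumptions. Only the note wording is taken from the example in the summary (the dash in it is written as the `\u2014` escape).

```typescript
// Hypothetical per-plan view model; field names are assumptions for illustration.
interface PlanRow {
  name: string;
  spentUsd: number;
  confidence: 'high' | 'low';
  lowTurns: number;
  totalTurns: number;
}

// Renders a tab-separated list. The confidence column and footer notes appear
// only when at least one plan is low-confidence; otherwise output is unchanged.
function renderPlanList(rows: PlanRow[]): string[] {
  const anyLow = rows.some((r) => r.confidence === 'low');
  const lines: string[] = [];

  lines.push(anyLow ? 'plan\tspent\tconfidence' : 'plan\tspent');
  for (const r of rows) {
    const base = `${r.name}\t$${r.spentUsd.toFixed(2)}`;
    lines.push(anyLow ? `${base}\t${r.confidence}` : base);
  }

  // One footer note per affected plan, naming it and stating the caveat.
  for (const r of rows) {
    if (r.confidence === 'low') {
      lines.push(
        `note: ${r.name}: ${r.lowTurns} of ${r.totalTurns} turns this cycle lack per-turn token data \u2014 totals are a lower bound.`,
      );
    }
  }
  return lines;
}
```

Deciding on the extra column once, from the whole list, is what keeps full-fidelity output byte-for-byte identical to the previous rendering in this sketch.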

Refs

Closes #108 — refs #41, #76 (which shipped summarizeFidelity / hasMinimumFidelity).


`computePlanUsage` now annotates each cycle with a `fidelity:
{ confidence, summary }` block computed over its contributing turns.
`confidence === 'high'` only when every turn is `full` or `usage-only`
with both per-turn input and output token coverage; otherwise `low`.
Records without a `fidelity` field stay best-effort high (matches the
codebase's existing backward-compat policy). Spend totals continue to
include `partial` / `aggregate-only` / `cost-only` contributions —
under-counting silently is worse than annotating low-confidence — so
the cycle's `spentUsd` is the lower bound the consumer renders against
the new flag.

`burn plans` (list view) renders a `confidence` column and a footer
note (e.g. `note: claude-pro: 3 of 412 turns this cycle lack per-turn
token data — totals are a lower bound.`) when at least one plan has
any low-confidence cycle. Full-fidelity cycles render exactly as
before. `--json` gains a per-plan `usage.fidelity` block.

`PlanUsageFidelity` is exported from `@relayburn/analyze`. The
`limits.test.ts` mocks now include `fidelity` because `PlanUsage`
gained a required field.

Tests cover the high/low/cost-only/partial cycle paths in analyze,
and the rendered-note + JSON shape in cli.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment

Devin Review found 1 potential issue.

⚠️ 1 issue in files not directly in the diff

⚠️ Missing root CHANGELOG entry for cross-package change (AGENTS.md violation) (CHANGELOG.md:7-11)

This PR touches both packages/analyze (new PlanUsageFidelity type + deriveFidelity logic) and packages/cli (fidelity column + footer in burn plans list view + JSON). AGENTS.md states: "Update [Unreleased] only when the work spans packages or warrants a top-level summary; single-package work belongs only in that package's CHANGELOG." Since the work spans two packages, the root CHANGELOG.md's [Unreleased] section should have an entry for #108, but it does not (CHANGELOG.md:7-11).

View 3 additional findings in Devin Review.

Development

Successfully merging this pull request may close these issues.

burn plans: honor fidelity (mark monthly spend totals low-confidence on partial usage)

1 participant