Honor fidelity in burn limits (#105) #132

Open
willwashburn wants to merge 1 commit into main from feat/limits-honor-fidelity-105


@willwashburn
Member

@willwashburn willwashburn commented Apr 26, 2026

Summary

  • `burn limits` still consumes the entire windowed slice (it does not refuse partial, aggregate-only, or cost-only data) and now classifies the contributing turns via `summarizeFidelity` to derive a binary high / low confidence flag for the 5-hour forecast.
  • Text mode appends `forecast: low-confidence (N of M contributing turns lack per-turn token data)` to the forecast block when any contributing turn lacks per-turn token coverage; full-fidelity windows print no notice.
  • `--json` gains a `forecast.fidelity` block: `{ confidence: 'high' | 'low', summary: FidelitySummary }`. `--watch` re-evaluates the flag on each tick.
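
The behaviour described in the bullets above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `SketchFidelitySummary`, `deriveFidelity`, and `renderNotice` are hypothetical names, and the real `FidelitySummary` returned by `summarizeFidelity` very likely has a different shape. Only the `{ confidence, summary }` block and the notice wording are taken from the description.

```typescript
// Hypothetical, simplified stand-in for the repo's FidelitySummary.
interface SketchFidelitySummary {
  total: number;   // contributing turns in the 5-hour window
  full: number;    // turns with per-turn token data
  lacking: number; // turns without per-turn token coverage
}

// Mirrors the shape described for the `forecast.fidelity` JSON block.
interface SketchForecastFidelity {
  confidence: 'high' | 'low';
  summary: SketchFidelitySummary;
}

// Binary flag: any turn lacking per-turn token data makes the forecast low-confidence.
// Assumption: an empty window is treated as high-confidence here.
function deriveFidelity(summary: SketchFidelitySummary): SketchForecastFidelity {
  return { confidence: summary.lacking > 0 ? 'low' : 'high', summary };
}

// Text mode: return the notice line for low-confidence forecasts, or null for no notice.
function renderNotice(fidelity: SketchForecastFidelity): string | null {
  if (fidelity.confidence === 'high') return null;
  const { lacking, total } = fidelity.summary;
  return `forecast: low-confidence (${lacking} of ${total} contributing turns lack per-turn token data)`;
}
```

Under these assumptions the projection itself is never touched; the flag and the notice are purely additive output.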

Test plan

  • pnpm run build
  • pnpm run test:ts (518 tests pass)
  • New packages/cli/src/commands/limits.test.ts cases:
    • high-confidence (all full turns) renders no notice
    • low-confidence (one partial turn) appends the notice but still renders the burn rate + projection
    • --json carries forecast.fidelity.{confidence, summary} with the right counts
    • --json reports confidence: 'high' when every turn is full
    • --watch-style re-invocation flips low → high as new full-fidelity turns arrive
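
One plausible way the last case could play out is sketched below: each tick re-filters the turns to the trailing 5-hour window, so an older partial turn eventually ages out while newer full-fidelity turns remain. This mechanism is an assumption made for illustration; `SketchTurn`, `confidenceAt`, and the timestamps are hypothetical and are not taken from `limits.test.ts`.

```typescript
// Hypothetical turn shape for illustration only; not the repo's real record type.
interface SketchTurn {
  at: number;                // timestamp in milliseconds
  hasPerTurnTokens: boolean; // false for partial, aggregate-only, cost-only, or unknown data
}

const HOUR_MS = 60 * 60 * 1000;
const WINDOW_MS = 5 * HOUR_MS; // the 5-hour forecast window

// Re-evaluated on every tick: only turns inside (now - 5h, now] contribute.
function confidenceAt(turns: SketchTurn[], now: number): 'high' | 'low' {
  const windowed = turns.filter((t) => t.at > now - WINDOW_MS && t.at <= now);
  return windowed.every((t) => t.hasPerTurnTokens) ? 'high' : 'low';
}

const watchedTurns: SketchTurn[] = [
  { at: 1 * HOUR_MS, hasPerTurnTokens: false }, // one partial turn
  { at: 4 * HOUR_MS, hasPerTurnTokens: true },
];

// First tick at t = 5h: the window is (0h, 5h], so the partial turn still contributes.
const firstTick = confidenceAt(watchedTurns, 5 * HOUR_MS);

// A fresh full-fidelity turn arrives; by t = 7h the window is (2h, 7h]
// and the partial turn has aged out.
watchedTurns.push({ at: 6.5 * HOUR_MS, hasPerTurnTokens: true });
const laterTick = confidenceAt(watchedTurns, 7 * HOUR_MS);

console.log(firstTick, laterTick); // low high
```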

Refs

🤖 Generated with Claude Code



`burn limits` now classifies its 5-hour windowed slice via
`summarizeFidelity` and surfaces a binary high/low confidence flag
without changing the projection itself. Text mode appends a
"forecast: low-confidence (N of M contributing turns lack per-turn
token data)" notice when at least one contributing turn lacks per-turn
token coverage; full-fidelity windows print no notice. `--json` gains
a `forecast.fidelity` block carrying `confidence` and the underlying
`FidelitySummary`. `--watch` re-evaluates confidence on each tick so
the flag flips as fresher full-fidelity turns arrive.

Refs #41, #76. Closes #105.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Contributor

@devin-ai-integration devin-ai-integration Bot left a comment


Devin Review found 1 potential issue.

View 3 additional findings in Devin Review.


Comment on lines +55 to +58
// Count of turns whose per-turn token data is unreliable for forecasting.
// Equivalent to `total - (full + qualified usage-only) - (any unknowns)`;
// surfaced separately so the rendered notice can read "N of M".
lowConfidenceTurns: number;

🟡 Interface comment formula for lowConfidenceTurns incorrectly excludes unknowns, contradicting the implementation

The doc comment on lowConfidenceTurns at line 56 claims the value is total - (full + qualified usage-only) - (any unknowns), which says unknown/no-fidelity turns are subtracted from the count. However, the implementation at limits.ts:579-581 does the opposite: turns with !f (no fidelity) are added to lowConfidenceTurns. The function-level doc at limits.ts:569-571 correctly states unknowns "get counted toward lowConfidenceTurns". The actual formula is total - full - qualified_usage_only (unknowns are included, not excluded). This is an exported interface (ForecastFidelity), so downstream consumers of the type definition will get an incorrect mental model of what the field contains — e.g. for a window with 5 turns (2 full, 1 partial, 2 unknown), the comment says lowConfidenceTurns = 5 - 2 - 0 - 2 = 1 but the actual value is 3.

Suggested change
- // Count of turns whose per-turn token data is unreliable for forecasting.
- // Equivalent to `total - (full + qualified usage-only) - (any unknowns)`;
- // surfaced separately so the rendered notice can read "N of M".
- lowConfidenceTurns: number;
+ // Count of turns whose per-turn token data is unreliable for forecasting.
+ // Equivalent to `total - (full + qualified usage-only)`; unknowns (records
+ // with no `fidelity` field) are counted here too, not excluded.
+ // Surfaced separately so the rendered notice can read "N of M".
+ lowConfidenceTurns: number;
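
The discrepancy in the comment above can be checked with a small worked example. This is a sketch under stated assumptions: `SketchRecord`, its `cls` and `qualified` fields, and `countLowConfidence` are hypothetical stand-ins, since the PR's actual record type and the rule for what makes a usage-only turn "qualified" are not shown here. Only the "unknowns are counted" behaviour and the 5-turn example come from the review.

```typescript
// Hypothetical record shape; only the optional `fidelity` field comes from the review comment.
type SketchClass = 'full' | 'usage-only' | 'partial';
interface SketchRecord {
  fidelity?: { cls: SketchClass; qualified?: boolean };
}

// Counts turns that are neither full nor qualified usage-only.
// Records with no `fidelity` at all (unknowns) are counted, not excluded.
function countLowConfidence(records: SketchRecord[]): number {
  let low = 0;
  for (const record of records) {
    const f = record.fidelity;
    if (!f) {
      low += 1; // unknown fidelity counts toward low confidence
      continue;
    }
    if (f.cls === 'full') continue;
    if (f.cls === 'usage-only' && f.qualified) continue;
    low += 1;
  }
  return low;
}

// The window from the review: 2 full, 1 partial, 2 unknown.
const windowRecords: SketchRecord[] = [
  { fidelity: { cls: 'full' } },
  { fidelity: { cls: 'full' } },
  { fidelity: { cls: 'partial' } },
  {},
  {},
];

const actualLow = countLowConfidence(windowRecords);    // 3
const perOldComment = windowRecords.length - 2 - 0 - 2; // 1, what the old doc comment implies
const perFixedComment = windowRecords.length - (2 + 0); // 3, total - (full + qualified usage-only)
console.log(actualLow, perOldComment, perFixedComment); // 3 1 3
```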


Development

Successfully merging this pull request may close these issues.

burn limits: honor fidelity (mark forecasts low-confidence on partial usage)
