Add coverage and fidelity metadata to TurnRecord (#41) by willwashburn · Pull Request #76 · AgentWorkforce/burn

willwashburn · 2026-04-25T01:55:09Z

Refs #41 — first cut at coverage and fidelity metadata.

Summary

Introduce explicit machine-readable metadata so commands can distinguish "missing", "zero", "aggregate-only", and "partial" usage data instead of silently treating all four as 0.

What landed

TurnRecord.fidelity (optional) with three pieces:
- granularity: per-turn | per-message | per-session-aggregate | cost-only
- coverage: per-field availability flags (hasInputTokens, hasOutputTokens, hasReasoningTokens, hasCacheReadTokens, hasCacheCreateTokens, hasToolCalls, hasToolResultEvents, hasSessionRelationships, hasRawContent)
- class: derived full | usage-only | aggregate-only | cost-only | partial
Coverage is strictly about availability: hasOutputTokens: false means "we don't know," not "0 output tokens." Numeric fields on Usage carry 0 when the source actually reports zero or doesn't report the field — the matching coverage flag is the only honest way to tell those apart.
EMPTY_COVERAGE, classifyFidelity, and makeFidelity helpers in @relayburn/reader.
The Claude parser populates fidelity on every turn; usage-coverage flags reflect which fields the upstream usage block actually carried (so a turn with no cache_creation reports hasCacheCreateTokens: false).
summarizeFidelity(turns) and hasMinimumFidelity(...) in @relayburn/analyze for command-level aggregation and gating.
burn summary prints a one-line fidelity notice whenever any turn is below full or unsupported.
burn summary --json now emits a structured payload (ingest, turns, totalCost, byModel, fidelity) with per-class / per-granularity counts and per-field missingCoverage totals — programmatic consumers can finally distinguish numeric zero from "we don't know."

Deferred to follow-up PRs

Codex and OpenCode parsers (they still emit no fidelity; consumers treat absence as best-effort full for backward compat).
burn compare, burn waste, burn limits, burn plans behavior gating on fidelity class — the helpers (hasMinimumFidelity, summarizeFidelity) are now in place; wiring them into each command is the natural follow-up.
Mux / Crush collector schemas adopting the shared shape.

Schema

TurnRecord.fidelity is optional. Older ledger entries without the field round-trip unchanged. No v bump needed.

Test plan

pnpm run test:ts — 356 tests pass (15 new).
New fidelity.test.ts in both reader and analyze covers classifyFidelity, makeFidelity, the unknown-bucket counting, and hasMinimumFidelity ordering.
New Claude parser tests cover the full-fidelity case, the tool-call case, and the partial case (missing-output-tokens.jsonl fixture) — verifying usage.output === 0 while coverage.hasOutputTokens === false and class === 'partial'.

🤖 Generated with Claude Code

First cut at issue #41. Introduces explicit machine-readable metadata so commands can distinguish "missing", "zero", "aggregate-only", and "partial" usage data instead of silently treating all four as "0". What landed: - New `Fidelity` shape on `TurnRecord` with `granularity` (per-turn / per-message / per-session-aggregate / cost-only), per-field `coverage` flags (input/output/reasoning/cacheRead/ cacheCreate/toolCalls/toolResultEvents/sessionRelationships/ rawContent), and a derived `class` (full / usage-only / aggregate-only / cost-only / partial). Coverage is strictly about *availability*: `hasOutputTokens: false` means "we don't know," not "0 output tokens." - `EMPTY_COVERAGE`, `classifyFidelity`, and `makeFidelity` helpers in `@relayburn/reader`. - The Claude parser now populates fidelity on every turn; usage coverage flags reflect which fields the upstream `usage` block actually carried. - `summarizeFidelity(turns)` and `hasMinimumFidelity(...)` in `@relayburn/analyze` for command-level aggregation and gating. - `burn summary` prints a fidelity notice when any turn is below full fidelity, and `burn summary --json` now emits a structured payload including a `fidelity` block with per-class / per-granularity counts and per-field `missingCoverage` totals. What's deferred to follow-up PRs: - Codex and OpenCode parsers (they still emit no fidelity field; consumers treat absence as best-effort full). - `burn compare`, `burn waste`, `burn limits`, `burn plans` behavior gating on fidelity class. - Mux / Crush collector schemas adopting the shared shape. Schema: `TurnRecord.fidelity` is optional; older ledgers without the field continue to round-trip unchanged. No on-disk `v` bump needed. Refs #41 Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

devin-ai-integration

✅ Devin Review: No Issues Found

Devin Review analyzed this PR and found no potential bugs to report.

View in Devin Review to see 5 additional findings.

…a-41 # Conflicts: # packages/analyze/CHANGELOG.md # packages/analyze/src/index.ts # packages/reader/CHANGELOG.md

devin-ai-integration

Devin Review found 1 new potential issue.

View 6 additional findings in Devin Review.

devin-ai-integration · 2026-04-25T03:57:10Z

 ### Added

 - **`burn mcp-server`** — stdio MCP (Model Context Protocol) server that lets a running agent self-query its own cost and quota state mid-session. Registers `burn__sessionCost` and `burn__currentBlock`. Read-only. Pair with `buildMcpConfig({sessionId})` from `@relayburn/mcp` to inject the server into a spawned `claude --mcp-config <…>` session. (#26)
+- **`burn summary --json` and fidelity counts** ([#41](https://github.com/AgentWorkforce/burn/issues/41) — first cut). The default summary now prints a one-line `fidelity:` notice whenever any turn is below full or unsupported, with counts by class. `--json` emits a structured payload with `ingest`, `turns`, `totalCost`, `byModel`, and a `fidelity` block (totals by class + granularity, plus per-field `missingCoverage` counts) so programmatic consumers can distinguish numeric zero from "we don't know." Suppressed in the all-full common case to avoid noise.


🟡 CLI changelog entry added to already-released [0.13.1] instead of [Unreleased]

The new burn summary --json fidelity-counts entry is added under ## [0.13.1] - 2026-04-25, which is an already-published version. Per AGENTS.md: "Curate [Unreleased] in the relevant per-package packages/*/CHANGELOG.md as you land PRs." The reader and analyze packages correctly add their entries under [Unreleased], but the CLI package does not. This will cause the publish workflow to miss this entry during the next release promotion (it promotes the [Unreleased] block, which is empty for CLI), and falsely attributes unreleased work to v0.13.1.

Prompt for agents

The new changelog bullet for 'burn summary --json and fidelity counts' is placed under the [0.13.1] section (line 15), which is an already-released version. It should be moved under the [Unreleased] section (after line 8) to match the pattern used by packages/reader/CHANGELOG.md and packages/analyze/CHANGELOG.md. Specifically, move line 15 content to between line 8 and the blank line before [0.13.1], adding the appropriate ### Added subsection header under [Unreleased].

Was this helpful? React with 👍 or 👎 to provide feedback.

devin-ai-integration Bot reviewed Apr 25, 2026

View reviewed changes

Merge remote-tracking branch 'origin/main' into feat/fidelity-metadat…

3b0f9fd

…a-41 # Conflicts: # packages/analyze/CHANGELOG.md # packages/analyze/src/index.ts # packages/reader/CHANGELOG.md

willwashburn merged commit 642391c into main Apr 25, 2026

willwashburn deleted the feat/fidelity-metadata-41 branch April 25, 2026 03:55

devin-ai-integration Bot reviewed Apr 25, 2026

View reviewed changes

This was referenced Apr 26, 2026

Populate TurnRecord.fidelity from Codex parser (#84) #115

Merged

Populate TurnRecord.fidelity from OpenCode parser (#89) #116

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add coverage and fidelity metadata to TurnRecord (#41)#76

Add coverage and fidelity metadata to TurnRecord (#41)#76
willwashburn merged 2 commits intomainfrom
feat/fidelity-metadata-41

willwashburn commented Apr 25, 2026 •

edited by devin-ai-integration Bot

Loading

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

devin-ai-integration Bot left a comment

Uh oh!

devin-ai-integration Bot Apr 25, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

willwashburn commented Apr 25, 2026 • edited by devin-ai-integration Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

What landed

Deferred to follow-up PRs

Schema

Test plan

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

✅ Devin Review: No Issues Found

Uh oh!

devin-ai-integration Bot left a comment

Choose a reason for hiding this comment

Uh oh!

devin-ai-integration Bot Apr 25, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

willwashburn commented Apr 25, 2026 •

edited by devin-ai-integration Bot

Loading