Migrate burn summary from ledger walk to archive query #82

@willwashburn

Context

Today, burn summary calls queryAll() from @relayburn/ledger, which streams the entire ledger.jsonl from disk and re-folds every stamp in memory on every invocation (packages/cli/src/commands/summary.ts:35, packages/ledger/src/reader.ts). PR #78 landed the derived archive at ~/.relayburn/archive.sqlite. Its materialized turns rows already carry the stamp enrichment columns (workflow_id, agent_id, persona, tier) and are indexed on ts, model, activity, project_key, and workflow_id, which are exactly the dimensions summary filters and aggregates on.

PR #78 explicitly defers this rewire:

Rewiring burn summary / compare / plans / @relayburn/mcp to read from the archive (each command is a self-contained migration that keeps the in-memory fallback intact).

Proposal

Replace the queryAll() call in runSummary with an archive-backed query:

  1. Call buildArchive() after ingestAll() so the archive is current before we read.
  2. Add a small summarizeFromArchive(query) helper to @relayburn/ledger that issues a single SQL SELECT ... FROM turns WHERE ... GROUP BY model (or whatever per-row projection summary needs), using the existing indexes.
  3. Keep the queryAll() path behind a fallback flag (--no-archive or env RELAYBURN_ARCHIVE=0) so users can validate parity and we have an escape hatch if the archive is missing/corrupt.
  4. Subagent-tree and by-subagent-type modes still need per-turn rows; for those, either pull from the archive or fall back to queryAll. The archive already materializes subagent_id / parent_subagent_id / subagent_type so the tree mode should work natively.
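
As a rough sketch of step 2, the SQL that summarizeFromArchive(query) issues could be assembled from the indexed columns named above. Everything here beyond the turns table and its column names is an assumption: the SummaryQuery shape, the buildSummarySql name, storing ts as a comparable string, and the COUNT(*) aggregate (the real helper would sum whatever token/cost columns the archive has, which this issue does not list).

```typescript
// Hypothetical filter shape; the issue only names the indexed columns.
interface SummaryQuery {
  since?: string; // inclusive lower bound on ts (assumes ts sorts as a string)
  model?: string;
  activity?: string;
  projectKey?: string;
  workflowId?: string;
}

interface SqlStatement {
  sql: string;
  params: string[];
}

// Builds one parameterized SELECT over the turns table. Only filters that
// are actually set contribute a WHERE clause, so the unfiltered summary is
// a plain GROUP BY.
function buildSummarySql(query: SummaryQuery): SqlStatement {
  const clauses: string[] = [];
  const params: string[] = [];

  const add = (clause: string, value: string | undefined): void => {
    if (value !== undefined) {
      clauses.push(clause);
      params.push(value);
    }
  };

  add("ts >= ?", query.since);
  add("model = ?", query.model);
  add("activity = ?", query.activity);
  add("project_key = ?", query.projectKey);
  add("workflow_id = ?", query.workflowId);

  const where = clauses.length > 0 ? ` WHERE ${clauses.join(" AND ")}` : "";

  // COUNT(*) is a placeholder aggregate; real summary columns are not
  // specified in this issue.
  return {
    sql: `SELECT model, COUNT(*) AS turns FROM turns${where} GROUP BY model ORDER BY model`,
    params,
  };
}
```

Keeping the statement builder separate from the SQLite call means the parity test can assert on the generated SQL without opening a database.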

Tests:

  • A summary fixture that asserts the archive-backed and ledger-backed code paths produce the same output (text + --json) for a non-trivial mixed-stamp ledger.
  • A test that the archive is auto-built if missing before the first summary invocation.
  • A test that RELAYBURN_ARCHIVE=0 (or the equivalent flag) routes through queryAll unchanged.
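
The routing that the last test exercises might look like the sketch below. The function name resolveSummarySource, the SourceOptions shape, and passing the environment in as a plain object are illustrative assumptions; only the --no-archive flag and the RELAYBURN_ARCHIVE=0 value come from the proposal.

```typescript
type SummarySource = "archive" | "ledger";

// Hypothetical options bag; the real CLI parser may expose these differently.
interface SourceOptions {
  noArchive?: boolean; // true when --no-archive was passed
  env?: Record<string, string | undefined>; // e.g. process.env
}

// Decides which backend runSummary should read from. The archive is the
// default; either opt-out routes to the existing queryAll() ledger walk.
function resolveSummarySource(opts: SourceOptions): SummarySource {
  if (opts.noArchive === true) {
    return "ledger";
  }
  if (opts.env?.RELAYBURN_ARCHIVE === "0") {
    return "ledger";
  }
  return "archive";
}
```

Taking env as a parameter rather than reading process.env directly keeps the test hermetic: it can pass { RELAYBURN_ARCHIVE: "0" } without mutating global state.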

Acceptance criteria

  • burn summary no longer scans ledger.jsonl line-by-line on the hot path; it issues SQL against archive.sqlite.
  • Output is byte-identical to the pre-migration implementation for the parity fixture (text and --json).
  • Subagent-tree (--subagent-tree) and --by-subagent-type modes work against the archive (or transparently fall back).
  • Fallback path (--no-archive flag or env) preserves the old behavior.
  • Performance: on a ledger with >=100k turns, archive-backed burn summary --since 7d is at least 5x faster than the current ledger walk on the same machine. (Report the measured numbers in the PR description.)
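
One way to check the 5x criterion from repeated timings is sketched below. Comparing medians over several runs is an assumption made here to dampen outliers, not something the criterion prescribes, and the median and meetsSpeedup names are hypothetical.

```typescript
// Median of a non-empty list of wall-clock samples in milliseconds.
function median(samples: number[]): number {
  if (samples.length === 0) {
    throw new Error("median requires at least one sample");
  }
  const sorted = [...samples].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1
    ? sorted[mid]
    : (sorted[mid - 1] + sorted[mid]) / 2;
}

// True when the ledger walk's median time is at least `factor` times the
// archive-backed median, i.e. the archive path is at least `factor`x faster.
function meetsSpeedup(
  ledgerMs: number[],
  archiveMs: number[],
  factor: number = 5,
): boolean {
  return median(ledgerMs) >= factor * median(archiveMs);
}
```

For example, ledger timings of 500, 520, and 510 ms against archive timings of 100, 90, and 95 ms give medians of 510 ms and 95 ms, which clears the 5x bar (510 >= 475).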

Out of scope

  • Schema additions to the archive (this is a read-side migration).
  • Coverage / fidelity columns (separate follow-up).
  • Rewiring compare, plans, or MCP tools (separate follow-ups).
