Derived analytics archive foundation: archive.sqlite + burn archive (#40) #78
Merged
willwashburn merged 3 commits into main from … on Apr 26, 2026
…ive` (#40)

Lands the rebuildable SQLite read model alongside the canonical ledger.jsonl so future commands no longer have to scan the whole ledger and refold all stamps on every query.

This is the foundation PR: schema, build pipeline, and CLI surface only. Rewiring `burn summary` / `compare` / `plans` and the MCP server onto SQL queries lands in follow-ups so each rewire stays small and reviewable.

Lands:
- `@relayburn/ledger`: `buildArchive()`, `rebuildArchive()`, `getArchiveStatus()`, `openArchive()`, `archivePath()`, `ARCHIVE_VERSION`. Schema covers `sessions`, `turns`, `tool_calls`, `compactions`, plus a reserved `tool_result_events` table for the future #33 content-sidecar bridge. Stamps are folded into materialized columns (`workflow_id`, `agent_id`, `persona`, `tier`) plus a JSON blob. Build is incremental, keyed off `archive_state.ledger_offset_bytes`; rebuild-from-zero is deterministic.
- `@relayburn/cli`: `burn archive build | rebuild | status` with `--json`.
- Backed by `node:sqlite`, so there is no native build step. The experimental warning is suppressed at the archive boundary so it doesn't pollute every CLI invocation.

Acceptance against the issue:
- ledger.jsonl remains canonical; deleting `archive.sqlite` and running `burn archive rebuild` recreates it.
- Materialized enrichment columns mean stamps are folded once at build time, not on every query.
- Rebuild is deterministic: same ledger -> same row counts and primary keys.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
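To make "stamps are folded into materialized columns plus a JSON blob" concrete, here is a minimal sketch of one way such a fold could work. It is not the code from this PR: the `Stamp` shape, the `foldStamps` helper, the `stamps_json` column name, the last-write-wins merge order, and the choice to keep only non-materialized keys in the blob are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration; the real types in @relayburn/ledger may differ.
type Stamp = Record<string, string>;

interface EnrichmentColumns {
  workflow_id: string | null;
  agent_id: string | null;
  persona: string | null;
  tier: string | null;
  stamps_json: string; // assumed name for the JSON blob column
}

// The four keys the PR description says are materialized as columns.
const MATERIALIZED_KEYS = new Set<string>(["workflow_id", "agent_id", "persona", "tier"]);

// Fold stamps in ledger order. Assumption: a later stamp overrides an earlier one.
function foldStamps(stamps: Stamp[]): EnrichmentColumns {
  const merged = new Map<string, string>();
  for (const stamp of stamps) {
    for (const [key, value] of Object.entries(stamp)) merged.set(key, value);
  }

  // Everything that is not a materialized column goes into the blob.
  // Keys are sorted so the same ledger always yields the same JSON text.
  const extras: Record<string, string> = {};
  for (const key of Array.from(merged.keys()).sort()) {
    const value = merged.get(key);
    if (value !== undefined && !MATERIALIZED_KEYS.has(key)) extras[key] = value;
  }

  return {
    workflow_id: merged.get("workflow_id") ?? null,
    agent_id: merged.get("agent_id") ?? null,
    persona: merged.get("persona") ?? null,
    tier: merged.get("tier") ?? null,
    stamps_json: JSON.stringify(extras),
  };
}
```

Doing this once at build time is what lets later queries filter on `workflow_id` or `persona` directly instead of replaying every stamp.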
…der (Devin review on #78)

Three Devin findings on the foundation archive PR:

1. (BUG_0001) `buildArchiveLocked` was stamping `ledgerStat.size` as the ledger cursor outside the transaction, overwriting the safe newline boundary committed by `applyLedgerRange`. If the ledger had a partial trailing line (writer interrupted mid-write), the cursor would advance past the incomplete bytes; the next build would read from mid-line and silently skip the completed turn once it landed. Plumb `safeOffset` through `ApplyResult` so the caller writes the parser's actual newline boundary as the cursor; remove the redundant in-transaction UPDATE.
2. (BUG_0002) `last_rebuild_at` was schema'd but never written. Add an internal `{isRebuild}` option to `buildArchiveLocked` so `rebuildArchive` stamps both `last_built_at` and `last_rebuild_at`; `buildArchive` keeps updating only `last_built_at`. `burn archive status` now shows the "last rebuild" line after `burn archive rebuild`.
3. (BUG_0003) The `## [0.11.0] - 2026-04-25` header was deleted from `packages/ledger/CHANGELOG.md`, so the published Plans entry was bleeding back into `[Unreleased]` (and we had two `### Added` subsections in a row). Restore the version header; the new archive entry stays under `[Unreleased]` where it belongs until release.

Tests: three new cases in `archive.test.ts`:
- `rebuildArchive populates both last_built_at and last_rebuild_at`
- `buildArchive only updates lastBuiltAt, not lastRebuildAt`
- `partial trailing line: ledger cursor advances only past complete lines`

All 358 tests pass (`pnpm run test:ts`).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
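The BUG_0001 fix comes down to using the last newline boundary, not the file size, as the cursor. A rough sketch of that rule follows. The `safeOffsetAfter` and `asciiBytes` helpers are hypothetical names for this example and are not taken from `applyLedgerRange`; the real parser may track the boundary differently.

```typescript
// Hypothetical helper: given the bytes read starting at `startOffset`, return the
// ledger offset just past the last complete (newline-terminated) line.
// A partial trailing line is left for the next build to pick up.
function safeOffsetAfter(startOffset: number, chunk: Uint8Array): number {
  const NEWLINE = 0x0a;
  for (let i = chunk.length - 1; i >= 0; i--) {
    if (chunk[i] === NEWLINE) return startOffset + i + 1;
  }
  return startOffset; // no complete line in this chunk: do not advance
}

// ASCII-only encoder, just enough to build demo input without extra dependencies.
function asciiBytes(text: string): Uint8Array {
  const out = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) out[i] = text.charCodeAt(i);
  return out;
}

// A writer was interrupted mid-line: the second record has no trailing newline yet.
const interrupted = asciiBytes('{"a":1}\n{"b":');
const cursor = safeOffsetAfter(100, interrupted);
// `cursor` stops after the first record (offset 108), whereas stamping the file
// size would have jumped to 113 and started the next read in the middle of a line.
```

Storing this value as `ledger_offset_bytes` means the interrupted record is re-read in full once its newline lands.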
…e-40

# Conflicts:
#   packages/cli/src/cli.ts
#   packages/ledger/CHANGELOG.md
Refs #40.
Summary
Lands the foundation for the derived analytics archive: a rebuildable SQLite read model at `~/.relayburn/archive.sqlite`, materialized from the canonical `ledger.jsonl`. This is the architectural gap between burn as an event collector and burn as a usable local analytics system: every non-trivial query today scans the full ledger and re-folds all stamps in memory.

Scoped intentionally to the foundation: schema + build pipeline + CLI surface. Rewiring read commands onto SQL queries lands in follow-up PRs so each rewire stays small and reviewable.
What landed
- `@relayburn/ledger`: new `archive.ts` module exporting `buildArchive()`, `rebuildArchive()`, `getArchiveStatus()`, `openArchive()`, `archivePath()`, `ARCHIVE_VERSION`. Schema:
  - `sessions`: one row per `(source, session_id)`, derived from `turns`.
  - `turns`: one row per ingested `TurnRecord`, with stamps folded into materialized columns (`workflow_id`, `agent_id`, `persona`, `tier`) plus a JSON blob for arbitrary keys.
  - `tool_calls`: one row per `ToolCall` attached to a turn.
  - `compactions`: one row per ingested `CompactionEvent`.
  - `tool_result_events`: table reserved (created, not populated) for the future content-sidecar bridge (#33, Design: content sidecar store with retention and opt-out) and execution-graph work.
  - `archive_state`: incremental cursor (`ledger_offset_bytes`), schema version, last-built timestamps.
- … `burn rebuild --reclassify` rewrites the file) by falling back to a clean rebuild.
- `@relayburn/cli`: `burn archive build | rebuild | status`, all with `--json`.
- `node:sqlite` (Node 22 built-in): no native build step, no extra runtime dep. The `ExperimentalWarning` is suppressed at the archive boundary so it doesn't pollute every CLI invocation.
What's deferred (separate PRs)
- … `burn summary` / `compare` / `plans` / `@relayburn/mcp` to read from the archive (each command is a self-contained migration that keeps the in-memory fallback intact).
- … `tool_result_events` from the content sidecar (depends on #33, Design: content sidecar store with retention and opt-out, landing the richer write path).
- `burn archive vacuum` (one-liner; trivial to add when needed).
Acceptance against the issue body
- `ledger.jsonl` remains canonical; deleting `archive.sqlite` and running `burn archive rebuild` recreates it. (covered by tests)
- `burn summary` executes against the archive (deferred to follow-up).
- `burn compare` / `burn plans` as SQL-style grouped queries (deferred to follow-ups).
Broader plan
This PR is step 1 of ~5 to fully resolve #40:
- … `burn summary` to read from the archive (with fallback flag).
- … `burn compare` and `burn plans`.
- … `@relayburn/mcp` tools to the archive for low-latency self-query.
- … `tool_result_events` once the content-sidecar bridge (#33, Design: content sidecar store with retention and opt-out) lands.
Test plan
- `pnpm install && pnpm run test:ts`: 355 tests pass (350 baseline + 11 archive unit + 5 CLI -1 dup count). All green.
- … `burn archive build` / `status` / `--json` against an empty home directory, plus an end-to-end run that appends turns + a stamp and verifies the SQL contents.
- … `archive_state.archive_version` triggers a clean rebuild on next open.

🤖 Generated with Claude Code
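The description mentions three triggers for a build: the incremental cursor in `archive_state`, a clean rebuild when `archive_state.archive_version` does not match, and a fallback to a clean rebuild when the ledger file has been rewritten. As a rough sketch of how those rules could fit together, the function below decides what a build should do. It is an illustration only: `planBuild`, `ArchiveState`, `BuildPlan`, and `ASSUMED_ARCHIVE_VERSION` are hypothetical names, and detecting a rewrite by "ledger is shorter than the stored cursor" is an assumption, not something the PR text states.

```typescript
// Placeholder value; the real ARCHIVE_VERSION constant lives in @relayburn/ledger.
const ASSUMED_ARCHIVE_VERSION = 1;

// Hypothetical in-memory view of the archive_state row.
interface ArchiveState {
  archiveVersion: number;
  ledgerOffsetBytes: number;
}

type BuildPlan =
  | { mode: "rebuild" }
  | { mode: "incremental"; fromOffset: number }
  | { mode: "noop" };

function planBuild(state: ArchiveState | null, ledgerSizeBytes: number): BuildPlan {
  // No archive yet (for example, archive.sqlite was deleted): build from zero.
  if (state === null) return { mode: "rebuild" };
  // Schema version changed: existing rows cannot be trusted, so rebuild.
  if (state.archiveVersion !== ASSUMED_ARCHIVE_VERSION) return { mode: "rebuild" };
  // Assumption: a ledger shorter than the cursor means the file was rewritten.
  if (ledgerSizeBytes < state.ledgerOffsetBytes) return { mode: "rebuild" };
  // Nothing new has been appended since the last build.
  if (ledgerSizeBytes === state.ledgerOffsetBytes) return { mode: "noop" };
  // Otherwise ingest only the bytes appended after the cursor.
  return { mode: "incremental", fromOffset: state.ledgerOffsetBytes };
}
```

A size check alone would miss a rewrite that leaves the ledger the same length or longer, so a real implementation would likely need a stronger signal than this sketch uses.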