cli-golden: fixture corpus + TS-CLI snapshots + diff runner (#248 part c) #310
willwashburn merged 5 commits into main
…t c)
Wave 1 D5 of the Rust CLI port (#248). Captures the current TS CLI's output across a synthetic fixture ledger so the eight Wave 2 fan-out PRs have a deterministic target to assert against.
- tests/fixtures/cli-golden/: hand-built ledger covering claude / codex / opencode sources, plus the activity buckets compare and hotspots aggregate over (coding, testing, review, delegation)
- tests/fixtures/cli-golden/scripts/{build-ledger,capture-snapshots}.mjs: node scripts that regenerate the ledger and re-run every TS-CLI invocation against it under a sealed env (RELAYBURN_HOME at the fixture, HOME at a tmp dir so ingestAll finds no real sessions, RELAYBURN_ARCHIVE=0 to force the streaming path)
- tests/fixtures/cli-golden/snapshots/: 16 captured stdout files covering summary / hotspots / overhead / overhead-trim / compare / state-status (TTY + --json) plus help text for ingest / run / mcp-server / top-level
- tests/fixtures/cli-golden/invocations.json: args + sealed env per snapshot, with enabled: bool toggles Wave 2 PRs flip on per command as the Rust implementation lands
- crates/relayburn-cli/tests/golden.rs: Rust integration test that iterates invocations.json, spawns the Rust burn binary against the fixture, and prints a unified diff on mismatch. Gated on BURN_GOLDEN=1 so cargo test --workspace stays green while the Rust CLI is a stub; per-invocation enabled: false keeps even BURN_GOLDEN=1 runs silent until Wave 2 wires up each command.
- tests/fixtures/cli-golden/README.md documents what's snapshotted, how to refresh, the per-command Wave 2 ownership table, and how each Wave 2 PR should flip its enabled flags.
Refresh: pnpm run golden:capture (or run the script directly).
Walkthrough
Adds a gated Rust golden-output integration test for the
Changes
CLI Golden-Output Test Infrastructure
Sequence Diagram(s)
sequenceDiagram
autonumber
participant Runner as Golden Test Runner (Node/Rust)
participant CLI as burn CLI (binary)
participant Ledger as Deterministic Ledger (RELAYBURN_HOME)
participant Snap as Golden Snapshots
Runner->>Ledger: build or ensure deterministic ledger (build-ledger.mjs)
Runner->>Runner: create sealed HOME env
Runner->>CLI: spawn with args/env per invocation
CLI->>Runner: stdout/stderr + exit status
Runner->>Runner: normalize output (paths, timestamps, numeric fields)
Runner->>Snap: compare normalized output vs snapshot
alt mismatch
Runner->>Runner: emit unified diff
else match
Runner->>Runner: record success
end
Runner->>Runner: aggregate failures and fail test if any
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6cb0838fd4
out.push_str(&rest[..idx]);
out.push_str(&needle);
let after_key = &rest[idx + needle.len()..];
let trimmed_start = after_key.trim_start_matches(|c: char| c == ' ' || c == '\t');
Match all JSON whitespace when squashing volatile fields
Make squash_numeric_field consume all whitespace after "<key>": rather than only spaces/tabs. The snapshot capture path uses JS regexes like /"lastBuiltAt":\s*\d+/, so it will normalize values even if formatting places the number on the next line, but this Rust path currently skips only ' ' and '\t'. If output formatting ever inserts a newline (or other whitespace) before these numeric timestamps, the Rust runner will stop normalizing those fields and report false golden mismatches for otherwise equivalent output.
Fixed in 1ef101c. Switched the byte-level skip loop to match the full ASCII whitespace set (space, \t, \n, \r, \x0b, \x0c) so the Rust runner mirrors the JS capture path's \s*\d+ exactly, including pretty-printed numerics across line breaks. Worth flagging: char::is_ascii_whitespace is not equivalent (it excludes U+000B vertical tab, which JS \s matches), so I went with an explicit matches! over the byte set. Added four unit tests covering tab, newline+indent, CR/VT/FF, and the non-numeric bail; cargo test --workspace is green.
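A minimal sketch of the whitespace handling described in this fix, not the actual code in golden.rs. The function signature, the `placeholder` parameter, and the choice to keep the original whitespace in the output are assumptions made for illustration; only the explicit byte set and the bail on non-numeric values come from the comment above.

```rust
/// Replace the digits of every `"<key>": <digits>` occurrence with
/// `placeholder`. Signature and placeholder handling are hypothetical.
fn squash_numeric_field(input: &str, key: &str, placeholder: &str) -> String {
    let needle = format!("\"{key}\":");
    let mut out = String::with_capacity(input.len());
    let mut rest = input;
    while let Some(idx) = rest.find(&needle) {
        out.push_str(&rest[..idx]);
        out.push_str(&needle);
        let after_key = &rest[idx + needle.len()..];
        let bytes = after_key.as_bytes();
        // Skip the ASCII subset of JS `\s`: space, TAB, LF, VT, FF, CR.
        // `u8::is_ascii_whitespace` would miss VT (0x0b), hence the explicit set.
        let mut i = 0;
        while i < bytes.len()
            && matches!(bytes[i], b' ' | b'\t' | b'\n' | 0x0b | 0x0c | b'\r')
        {
            i += 1;
        }
        let digits_start = i;
        while i < bytes.len() && bytes[i].is_ascii_digit() {
            i += 1;
        }
        if i == digits_start {
            // Non-numeric value: leave this occurrence untouched and move on.
            rest = after_key;
            continue;
        }
        // Whitespace and digits are ASCII, so these byte indices are valid
        // UTF-8 boundaries.
        out.push_str(&after_key[..digits_start]);
        out.push_str(placeholder);
        rest = &after_key[i..];
    }
    out.push_str(rest);
    out
}
```

Under these assumptions, `squash_numeric_field("{\"lastBuiltAt\":\n  123}", "lastBuiltAt", "0")` keeps the newline and indent and only swaps `123` for `0`, while a value such as `null` is passed through unchanged.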
Actionable comments posted: 7
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/relayburn-cli/tests/golden.rs`:
- Around line 63-75: The test prints a skip message when BURN_GOLDEN is not "1"
but continues running; modify the conditional that checks golden_gate (the
variable set by std::env::var("BURN_GOLDEN").ok()) to return immediately after
the eprintln so the test exits inertly when golden_gate.as_deref() != Some("1")
— i.e., after the eprintln add an early return from the test function (or a
short-circuit) so no further fixture-dependent logic in this test executes.
In `@tests/fixtures/cli-golden/README.md`:
- Line 11: Update the fenced code block in the README.md so the opening
backticks include a language tag (for example change the opening "```" to
"```text") to satisfy markdownlint MD040; locate the layout tree fenced block
and add the language identifier to the opening fence so the block becomes
"```text" while leaving the closing "```" unchanged.
In `@tests/fixtures/cli-golden/scripts/build-ledger.mjs`:
- Around line 39-62: The script currently computes HOME via ledgerHome() and
immediately deletes known files (FILES) and the content directory; add a hard
safety guard before the deletion loop that verifies HOME is the expected test
fixture path (or that an explicit force flag is present), and aborts with an
error if not; implement this by checking ledgerHome() result (HOME) against the
fixture directory pattern or an environment/CLI flag (e.g., RELAYBURN_HOME or a
--force boolean) and only proceed to run the rm loop and the
rm(`${HOME}/content`, ...) when the check passes, otherwise throw/exit early
with a clear message.
- Around line 389-395: The current replacement silently does nothing when the
regex /("kind":"stamp","ts":")[^"]+(")/g doesn't match, reintroducing
non-determinism; modify the block around rewritten/raw/STAMP_FIXED_TS so the
test fails explicitly when no match is found (e.g., if rewritten === raw or
regex.test(raw) is false) by throwing an Error or failing the test with a clear
message including the ledgerFile and the stamp regex, instead of just skipping
writeFile.
In `@tests/fixtures/cli-golden/snapshots/hotspots.stdout.txt`:
- Around line 17-18: The snapshot contains absolute fixture paths (e.g.
"/tmp/golden-project/...") which must be replaced with placeholder-normalized
paths before writing hotspots.stdout.txt; update the code that produces the
hotspots snapshot for tests/fixtures/cli-golden/snapshots/hotspots.stdout.txt to
run the output through the same path-normalization used by the test harness
(replace project-root paths with "${PROJECT}" and any home paths with
"${RELAYBURN_HOME}" or equivalent) so the lines like
"/tmp/golden-project/src/foo.ts" and "/tmp/golden-project/src/bar.ts" are
emitted with placeholder paths instead of absolute paths prior to snapshot
capture.
In `@tests/fixtures/cli-golden/snapshots/top-level-help.stdout.txt`:
- Around line 16-46: The help output for the subcommand "burn state rebuild
archive" is inconsistent: the synopsis shows "--vacuum" as a flag while the
examples show "vacuum" as a positional argument; pick one and make all usages
consistent. Update the CLI help generation and examples so the command
identifier "state rebuild archive" uses the same form everywhere (either change
the synopsis to "vacuum" positional or change the examples to "--vacuum"), and
adjust any code that constructs the usage string for the "state rebuild" /
"archive" command and the examples fixture "top-level-help.stdout.txt" so both
the synopsis and examples match the chosen syntax. Ensure option parsing and
help text (e.g. the code that defines the "archive" subcommand and the fixture
generator) reflect the single chosen form.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 4b853eb3-6fa6-43d8-8530-008cd40b7121
⛔ Files ignored due to path filters (1)
Cargo.lock is excluded by !**/*.lock
📒 Files selected for processing (29)
- CHANGELOG.md
- crates/relayburn-cli/Cargo.toml
- crates/relayburn-cli/tests/golden.rs
- package.json
- tests/fixtures/cli-golden/README.md
- tests/fixtures/cli-golden/invocations.json
- tests/fixtures/cli-golden/ledger/.gitignore
- tests/fixtures/cli-golden/ledger/ledger.content.idx
- tests/fixtures/cli-golden/ledger/ledger.idx
- tests/fixtures/cli-golden/ledger/ledger.jsonl
- tests/fixtures/cli-golden/project/CLAUDE.md
- tests/fixtures/cli-golden/scripts/build-ledger.mjs
- tests/fixtures/cli-golden/scripts/capture-snapshots.mjs
- tests/fixtures/cli-golden/snapshots/compare-json.stdout.txt
- tests/fixtures/cli-golden/snapshots/compare.stdout.txt
- tests/fixtures/cli-golden/snapshots/hotspots-json.stdout.txt
- tests/fixtures/cli-golden/snapshots/hotspots.stdout.txt
- tests/fixtures/cli-golden/snapshots/ingest-help.stdout.txt
- tests/fixtures/cli-golden/snapshots/mcp-server-help.stdout.txt
- tests/fixtures/cli-golden/snapshots/overhead-json.stdout.txt
- tests/fixtures/cli-golden/snapshots/overhead-trim-json.stdout.txt
- tests/fixtures/cli-golden/snapshots/overhead-trim.stdout.txt
- tests/fixtures/cli-golden/snapshots/overhead.stdout.txt
- tests/fixtures/cli-golden/snapshots/run-help.stdout.txt
- tests/fixtures/cli-golden/snapshots/state-status-json.stdout.txt
- tests/fixtures/cli-golden/snapshots/state-status.stdout.txt
- tests/fixtures/cli-golden/snapshots/summary-json.stdout.txt
- tests/fixtures/cli-golden/snapshots/summary.stdout.txt
- tests/fixtures/cli-golden/snapshots/top-level-help.stdout.txt
burn overhead [trim] [--project <path>] [--since 7d] [--kind <k>] [--top <n>] [--json]
burn compare <model_a,model_b[,...]> [--since 7d] [--project <path>] [--session <id>] [--workflow <id>] [--agent <id>] [--min-sample <n>] [--json|--csv]
burn run <claude|codex|opencode> [--tag k=v ...] [-- <harness args>]
burn ingest [--watch|--hook <name>] [--interval <ms>] [--quiet]
Top-level ingest usage is missing discoverable flags.
Line [13] omits --opencode-stream and --opencode-url, but ingest command help includes them. Aligning top-level help prevents feature discoverability gaps.
Pushing back here — the snapshot is faithfully capturing intended TS-CLI behavior, not a CLI bug.
Top-level burn --help (packages/cli/src/cli.ts HELP block) renders the Usage section as a one-line summary per command — short, scannable, surface-level. The full per-command flag inventory lives in burn <cmd> --help. So burn ingest [--watch|--hook <name>] [--interval <ms>] [--quiet] deliberately omits --opencode-stream / --opencode-url (and --daemon for that matter); burn ingest --help (snapshot ingest-help.stdout.txt) renders the full grammar:
burn ingest [--quiet]
burn ingest --watch [--interval <ms>] [--quiet] [--opencode-stream] [--opencode-url <url>]
burn ingest --hook claude [--quiet]
The top-level Examples block also already references burn ingest --watch --opencode-stream, so the discoverability isn't lost. Keeping the top-level summary terse is a deliberate design choice across all commands (compare summary and hotspots summary lines vs their own --help outputs) — adding flags here would set a precedent for cluttering every line.
@willwashburn, that's a clear and well-reasoned explanation — understood. The terse one-liner-per-command convention for the top-level burn --help is intentional and consistent across all commands, with full flag grammars deferred to per-command --help. The Examples block also already surfaces --opencode-stream, so discoverability is intact. I'll withdraw this concern.
( ´・ω・)?
✏️ Learnings added
Learnt from: willwashburn
Repo: AgentWorkforce/burn PR: 310
File: tests/fixtures/cli-golden/snapshots/top-level-help.stdout.txt:13-13
Timestamp: 2026-05-06T12:48:38.455Z
Learning: In the burn CLI (`packages/cli/src/cli.ts`), the top-level `burn --help` Usage section intentionally shows a short, scannable one-line summary per command and deliberately omits many flags (e.g., `--opencode-stream`, `--opencode-url`, `--daemon` for `burn ingest`). Full per-command flag grammars are only shown in `burn <cmd> --help`. The Examples block in top-level help surfaces key flags for discoverability. Do not flag missing flags in the top-level help snapshot (`top-level-help.stdout.txt`) as a bug.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
burn state rebuild index | classify | content | archive [--full|--vacuum] | all
burn state prune [--days <n>] [--force]
burn state reset [--force] [--reingest] [--json]

Examples:
burn summary --since 24h
burn summary --by-provider --provider synthetic
burn summary --subagent-tree <session-id>
burn summary --by-subagent-type --since 7d
burn summary --by-relationship --since 7d
burn summary --by-tool --since 7d
burn hotspots --since 7d
burn hotspots --patterns --since 7d
burn hotspots --session --explain-drift
burn hotspots --session <session-id>
burn overhead --since 30d
burn overhead --kind claude-md
burn overhead trim --top 3
burn overhead trim --json
burn compare claude-sonnet-4-6,claude-haiku-4-5 --since 30d
burn run claude --tag workflow=refactor -- --resume
burn run codex --tag workflow=refactor
burn run opencode --tag workflow=refactor
burn ingest
burn ingest --watch
burn ingest --watch --opencode-stream
burn state
burn state prune --days 30
burn state rebuild archive
burn state rebuild archive --full
burn state rebuild archive vacuum
state rebuild archive syntax is internally inconsistent.
Line [16] documents --vacuum as a flag, while Line [46] shows vacuum as positional. Please pick one syntax and keep usage/examples consistent.
Pushing back here too — both forms are real, intentionally supported, and documented in the canonical place.
packages/cli/src/commands/state.ts::runArchiveTarget accepts both:
const action = args.positional[2];
const vacuum = args.flags['vacuum'] === true || action === 'vacuum';
So burn state rebuild archive --vacuum and burn state rebuild archive vacuum are equivalent (and --full likewise has flag-only form since there's no build positional sibling worth typing). The per-command help text in REBUILD_HELP documents both grammars:
burn state rebuild archive [--full|--vacuum] [--json]
burn state rebuild archive vacuum [--json]
The top-level help's one-line Usage summary picks the flag form (it's the more compact representation and consistent with how the summary describes --full), while the Examples block deliberately demonstrates both invocations so users can see they work. So line 16 (flag-form summary) and line 46 (positional example) are not contradictory — they're showing two equally-valid spellings the CLI accepts. Same precedent as the ingest case above: the top-level summary is short by design; the per-command help is the source of truth for full grammar.
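For the Rust port, the dual-form acceptance quoted from runArchiveTarget could be mirrored along these lines. This is a hedged sketch only: the `wants_vacuum` name and the pre-parsed `positional` / `flags` inputs are hypothetical, and positional index 2 assumes the positionals after `burn state` are `rebuild`, `archive`, and the optional action, as in the TS snippet above.

```rust
use std::collections::HashSet;

/// Returns true when either spelling of the vacuum request is present.
/// `positional` holds the positionals after `burn state` (assumption);
/// `flags` holds boolean flag names without the leading `--` (assumption).
fn wants_vacuum(positional: &[&str], flags: &HashSet<&str>) -> bool {
    // Flag form: `burn state rebuild archive --vacuum`
    // Positional form: `burn state rebuild archive vacuum`
    flags.contains("vacuum") || positional.get(2).copied() == Some("vacuum")
}
```

Whatever argument parser the Rust CLI ends up using, the golden snapshot for top-level help only stays byte-identical if both spellings remain documented as they are in the TS output.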
@willwashburn, understood — the dual-form design is intentional and backed by the implementation in runArchiveTarget. Having the synopsis use the compact flag form while the examples demonstrate the positional alias is a deliberate affordance for users, not a contradiction. I'll withdraw the concern.
(´・ω・`)?
✏️ Learnings added
Learnt from: willwashburn
Repo: AgentWorkforce/burn PR: 310
File: tests/fixtures/cli-golden/snapshots/top-level-help.stdout.txt:16-46
Timestamp: 2026-05-06T12:48:52.688Z
Learning: In `packages/cli/src/commands/state.ts`, `runArchiveTarget` intentionally accepts both `--vacuum` (flag) and `vacuum` (positional arg at index 2) as equivalent spellings. The top-level help synopsis uses the compact flag form (`[--full|--vacuum]`) while the Examples block deliberately shows the positional form (`burn state rebuild archive vacuum`) to demonstrate both valid grammars. This dual-form pattern is by design, not an inconsistency.
…w fix)
The byte-level skip loop only ate ' ' and '\t', so a newline or other ASCII whitespace before a numeric value would have stopped normalization and produced a false golden mismatch. Match the JS capture path's `\s*\d+` semantics by consuming the full ASCII whitespace set (space, tab, LF, CR, VT, FF). `char::is_ascii_whitespace` is *not* equivalent (it excludes U+000B vertical tab), so list the bytes explicitly. Adds unit tests covering tab, newline+indent, CR/VT/FF, and the non-numeric bail.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
🧹 Nitpick comments (2)
crates/relayburn-cli/tests/golden.rs (2)
154-155: 💤 Low value
Consider distinguishing signal termination from normal exit.
When the process is killed by a signal (e.g., SIGSEGV), `code()` returns `None` and this defaults to `-1`. The failure message would say "expected 0, got -1" without indicating it was a signal. For better diagnostics, consider checking `output.status.signal()` on Unix platforms to report signal termination explicitly.
16-26: 💤 Low value
Documentation references `#[ignore]` but the test uses `BURN_GOLDEN` gating instead.
The doc header says "Why this is `#[ignore]`d on `main`" and suggests running with `--include-ignored`, but the actual test function at line 61 is not marked `#[ignore]`. The gating is implemented via a `BURN_GOLDEN` env var check. Consider updating the doc to reflect the actual mechanism, or remove `--include-ignored` from the example command since it has no effect.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/relayburn-cli/tests/golden.rs`:
- Around line 154-155: The test currently maps a missing exit code to -1 which
hides signal-based termination; update the assertion logic in the golden test
where expected_status and actual_status are computed to detect signal
termination by checking output.status.code() and, on Unix,
output.status.signal() (or use StatusExt::signal()) and produce a clearer
actual_status/result message (e.g., "terminated by signal SIGSEGV" or similar)
instead of -1; keep expected_status logic unchanged but change the
reporting/assertion to include signal info when code() is None so failures show
signal termination explicitly.
- Around line 16-26: The doc header incorrectly states the test is #[ignore] and
suggests using --include-ignored; update the comment in
crates/relayburn-cli/tests/golden.rs to reflect the actual gating via the
BURN_GOLDEN environment variable (or remove the --include-ignored example).
Specifically, edit the top comment that begins "Why this is `#[ignore]`d on
`main`" to explain that the golden test (the test function in this file) is
enabled/disabled by BURN_GOLDEN rather than #[ignore], and adjust the example
command to show setting BURN_GOLDEN=1 instead of using --include-ignored.
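The signal-termination nitpick above could be addressed roughly as follows. This is a sketch under assumptions, not the code in golden.rs: `describe_exit` and `describe_status` are hypothetical helper names, and the message wording is illustrative. The `ExitStatusExt::signal` call is the standard-library API on Unix.

```rust
use std::process::ExitStatus;

/// Render an exit status for a failure message, naming the signal when the
/// process did not exit normally. Helper name and wording are hypothetical.
fn describe_exit(code: Option<i32>, signal: Option<i32>) -> String {
    match (code, signal) {
        (Some(code), _) => format!("exit code {code}"),
        (None, Some(sig)) => format!("terminated by signal {sig}"),
        (None, None) => "terminated without an exit code".to_string(),
    }
}

/// On Unix, `code()` is `None` when a signal killed the process, and
/// `signal()` then reports which one.
#[cfg(unix)]
fn describe_status(status: ExitStatus) -> String {
    use std::os::unix::process::ExitStatusExt;
    describe_exit(status.code(), status.signal())
}

/// Other platforms have no signal concept in `ExitStatus`.
#[cfg(not(unix))]
fn describe_status(status: ExitStatus) -> String {
    describe_exit(status.code(), None)
}
```

Splitting the pure formatting from the platform-specific lookup keeps the message logic unit-testable without spawning a process that crashes on purpose.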
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 779cfe86-a19b-4ee1-8818-09b164fd5f19
📒 Files selected for processing (1)
crates/relayburn-cli/tests/golden.rs
…apshot paths (review fixes round 2)
- build-ledger.mjs: refuse to run unless the resolved ledger home is
inside an allowlisted prefix (the in-repo fixture dir, $RUNNER_TEMP,
or $TMPDIR). Aborts before any rm() so a missing RELAYBURN_HOME can
no longer wipe a developer's real ledger.
- build-ledger.mjs: hard-fail if the stamp ts normalization regex misses,
instead of silently shipping a non-deterministic ledger.
- golden.rs: BURN_GOLDEN!=1 now returns immediately so the skip path is
truly fixture-free. Per-iteration gate check is dead code and gone.
- capture-snapshots.mjs + golden.rs: normalize the synthetic
/tmp/golden-project path to \${FIXTURE_PROJECT} on both sides so
absolute-looking paths stay out of golden snapshots.
- README.md: tag the layout fence as text for MD040.
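Two of the fixes listed above, the allowlist guard and the fixture-path normalization, can be sketched in Rust as below. The real guard lives in build-ledger.mjs (JavaScript), so this is an illustrative translation rather than the shipped code; the function and constant names are hypothetical, while the `/tmp/golden-project` path and the `${FIXTURE_PROJECT}` placeholder come from the commit message.

```rust
use std::path::Path;

// Synthetic project path and its placeholder, as named in the commit message.
const FIXTURE_PROJECT_PATH: &str = "/tmp/golden-project";
const FIXTURE_PROJECT_PLACEHOLDER: &str = "${FIXTURE_PROJECT}";

/// True when `home` sits inside one of the allowlisted prefixes.
/// `Path::starts_with` compares whole components, so `/a/fixture-evil`
/// does not pass for prefix `/a/fixture`. It does not resolve `..`, so a
/// real guard would likely canonicalize both sides first (assumption).
fn is_inside_allowlist(home: &Path, allowed: &[&Path]) -> bool {
    allowed.iter().any(|prefix| home.starts_with(prefix))
}

/// Swap the synthetic project path for its placeholder. Applying the same
/// rewrite on the capture side and the runner side keeps snapshots free of
/// absolute-looking paths.
fn normalize_fixture_paths(output: &str) -> String {
    output.replace(FIXTURE_PROJECT_PATH, FIXTURE_PROJECT_PLACEHOLDER)
}
```

The component-wise prefix check is the design point worth noting: a plain string `starts_with` would accept sibling directories that merely share a name prefix with the fixture directory.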
🧹 Nitpick comments (1)
crates/relayburn-cli/tests/golden.rs (1)
101-101: 💤 Low value
Consider using system temp directory for sealed HOME.
The sealed HOME is created under `fixture_dir` (inside the repo at `tests/fixtures/cli-golden/`), whereas the JS capture script uses `os.tmpdir()`. If cleanup fails (crash, signal kill), temp directories accumulate in the committed fixture directory. Using `std::env::temp_dir()` would match the JS behavior and keep the repo clean.
♻️ Suggested change
- let sealed_home = tempdir_under(&fixture_dir);
+ let sealed_home = tempdir_under(&std::env::temp_dir());
Also applies to: 342-352
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@crates/relayburn-cli/tests/golden.rs`:
- Line 101: The test currently creates sealed_home via
tempdir_under(&fixture_dir) which places temp dirs inside the repo; change it to
create the sealed HOME in the system temp directory (std::env::temp_dir())
instead — either call the tempdir helper with std::env::temp_dir() as the base
or replace the tempdir_under(&fixture_dir) invocation in the golden test
(sealed_home) with a tempdir created under std::env::temp_dir(); apply the same
change for the other tempdir_under usages in the same file (the later sealed
HOME/temp dir usages) so temp artifacts go to the OS temp directory, not
tests/fixtures.
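The suggestion above depends on a helper that can create a unique directory under any base. The signature of the PR's `tempdir_under` is not shown in this conversation, so the following is a hypothetical stand-in (`unique_dir_under`) using only the standard library, to illustrate pointing the sealed HOME at the OS temp directory.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};
use std::time::{SystemTime, UNIX_EPOCH};

/// Create a fresh directory under `base` and return its path.
/// Hypothetical helper; uniqueness comes from the process id plus a
/// nanosecond timestamp, which is adequate for a test sketch.
fn unique_dir_under(base: &Path, prefix: &str) -> io::Result<PathBuf> {
    let nanos = SystemTime::now()
        .duration_since(UNIX_EPOCH)
        .map(|d| d.as_nanos())
        .unwrap_or(0);
    let dir = base.join(format!("{prefix}-{}-{nanos}", std::process::id()));
    // `create_dir` fails if the directory already exists, so a collision
    // surfaces as an error instead of silently reusing a stale HOME.
    fs::create_dir(&dir)?;
    Ok(dir)
}
```

A caller would pass `&std::env::temp_dir()` as `base` and remove the directory once the invocation finishes; leftovers after a crash then land in the OS temp directory rather than under tests/fixtures.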
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro Plus
Run ID: 33570f6b-6ecb-4b49-8f20-5d731d490051
📒 Files selected for processing (6)
- crates/relayburn-cli/tests/golden.rs
- tests/fixtures/cli-golden/README.md
- tests/fixtures/cli-golden/scripts/build-ledger.mjs
- tests/fixtures/cli-golden/scripts/capture-snapshots.mjs
- tests/fixtures/cli-golden/snapshots/hotspots-json.stdout.txt
- tests/fixtures/cli-golden/snapshots/hotspots.stdout.txt
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/fixtures/cli-golden/snapshots/hotspots-json.stdout.txt
…LOG, Cargo.toml, Cargo.lock conflicts)
# Conflicts:
# CHANGELOG.md
# Cargo.lock
# crates/relayburn-cli/Cargo.toml
…cts after #309 landed)
# Conflicts:
# CHANGELOG.md
# Cargo.lock
# crates/relayburn-cli/Cargo.toml
Summary
Wave 1 D5 of the Rust CLI port (#248). Captures the current TS CLI's output across a synthetic fixture ledger so the eight Wave 2 fan-out PRs have a deterministic golden target to assert against.
The #248 work splits across three parallel Wave 1 PRs:
`HarnessAdapter` trait + lazy `phf` registry + watch-loop integration
The Rust CLI is still a `eprintln!("not yet implemented")` + `exit(1)` stub today; that's intentional. This PR ships the target: Wave 2's eight fan-out PRs flip per-command `enabled` flags in `invocations.json` as each Rust command implementation lands.
What ships
How Wave 2 un-ignores per command
Each Wave 2 PR (D1–D8 in `RUST_PORT_WAVE_PLAN.md`) flips the matching `enabled` flag to `true` in `invocations.json` once its Rust command implementation prints byte-identical output:
The last Wave 2 PR also drops the `BURN_GOLDEN=1` env-var guard from `tests/golden.rs` so the diff runner runs by default in CI from then on.
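The two gates described in this section (the `BURN_GOLDEN=1` env-var guard and the per-invocation `enabled` flag) combine roughly as in this sketch. The `Invocation` struct is a simplified, hypothetical stand-in for an entry in `invocations.json`; only the `enabled` field and the `BURN_GOLDEN` check against `"1"` are taken from the PR text.

```rust
/// Simplified stand-in for one `invocations.json` entry (hypothetical shape).
struct Invocation {
    name: &'static str,
    enabled: bool,
}

/// The runner is inert unless BURN_GOLDEN is exactly "1".
fn golden_gate_open(value: Option<&str>) -> bool {
    value == Some("1")
}

/// Select the invocations that should actually be diffed.
/// In the real test the gate value would come from
/// `std::env::var("BURN_GOLDEN").ok()` via `as_deref()`.
fn runnable<'a>(gate: Option<&str>, invocations: &'a [Invocation]) -> Vec<&'a Invocation> {
    if !golden_gate_open(gate) {
        // Early return: nothing fixture-dependent runs on the skip path.
        return Vec::new();
    }
    invocations.iter().filter(|inv| inv.enabled).collect()
}
```

Once the env-var guard is dropped, only the `enabled` filter remains, so a command with `enabled: false` stays silent even in default CI runs.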
Determinism notes
Test plan