test(cli): help-text snapshots for every CLI command#28267

Open
kitlangton wants to merge 3 commits into worktree-cli-acp-builder from worktree-cli-help-snapshots


@kitlangton kitlangton commented May 19, 2026

Summary

One test file. It spawns `opencode <cmd> --help` for every documented command + key subcommand (35 total) in parallel under `concurrency: 8` and snapshots the stderr output (yargs writes `--help` to stderr, not stdout).
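
As a rough illustration of the bounded-parallelism idea, here is a minimal sketch of a promise pool in plain TypeScript. It is not the code in this PR, which presumably relies on Effect's `concurrency` option; `mapWithConcurrency` and its parameters are hypothetical names, and the task that would spawn `opencode <cmd> --help` and capture stderr is left to the caller.

```typescript
// Hypothetical helper: run `task` over `items` with at most `limit` tasks in flight.
// Results are returned in input order, regardless of completion order.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  task: (item: T) => Promise<R>,
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0

  // Each worker repeatedly claims the next unclaimed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++
      results[index] = await task(items[index])
    }
  }

  // Never start more workers than there are items, and always at least one.
  const workerCount = Math.max(1, Math.min(limit, items.length))
  await Promise.all(Array.from({ length: workerCount }, () => worker()))
  return results
}
```

With a limit of 8, no more than eight help subprocesses would be alive at once, which keeps the run fast without flooding the machine with spawns.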

Pinned snapshots catch flag removals, renames, reordering, and exit-code regressions across the entire user-facing CLI surface in one place. The diff in the .snap file is the surface-change report.

Why this matters

This is the broad-coverage layer that makes the future Effect CLI migration (yargs → effect-smol/cli) safe to attempt: if a refactor preserves the surface, snapshots stay green; if it doesn't, the diff names exactly which command(s) changed.

Implementation notes

  • Snapshots are normalized: the tmpdir prefix that bleeds through `acp --cwd`'s default is rewritten to `<HOME>`, so runs in different sandboxes stay stable. The macOS `/private` realpath form and the unresolved `os.tmpdir()` form are both covered.
  • Failures from individual commands are collected and reported together, so a regression in one command doesn't mask issues in others.
  • Excluded: `opencode completion --help` — it's a yargs built-in that emits top-level help and exits 1; not a real command.
  • ~6s wall-clock thanks to parallel spawns.
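
The path normalization described in the first bullet could look roughly like the sketch below. This is an assumption about the shape of the helper, not the PR's actual code: `normalizePaths` is a hypothetical name, and the caller is expected to pass both spellings of the temp directory (the raw `os.tmpdir()` value and its realpath).

```typescript
// Hypothetical helper: replace every known spelling of the sandbox prefix
// with a stable token so snapshots do not depend on the machine's temp path.
function normalizePaths(text: string, prefixes: readonly string[], token = "<HOME>"): string {
  // Longest prefix first: on macOS the realpath form is "/private" + tmpdir,
  // so replacing the shorter form first would leave a stray "/private".
  const ordered = [...prefixes].sort((a, b) => b.length - a.length)
  let out = text
  for (const prefix of ordered) {
    if (prefix.length === 0) continue // an empty prefix would match everywhere
    out = out.split(prefix).join(token)
  }
  return out
}
```

Sorting by length matters because one form is a suffix of the other; without it, the `/private` variant would normalize to `/private<HOME>` and the snapshot would differ between platforms.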

Stacked on #28263 + #28265

Branches off the acp-builder branch. Once both upstream PRs merge, the diff here shrinks to just the snapshot test + .snap file.

Test plan

  • `bun run test test/cli/help/help-snapshots.test.ts` — 1/1 pass, 34 snapshots, ~6s
  • Re-run for stability — clean
  • `bun run test test/cli/` — 324/324 pass, no regressions
  • `bun run typecheck` clean

Stack

  1. test(cli): subprocess integration tests for opencode acp #28265
  2. test(cli): help-text snapshots for every CLI command #28267 👈 current

@kitlangton kitlangton enabled auto-merge (squash) May 19, 2026 00:37
@kitlangton kitlangton disabled auto-merge May 19, 2026 01:07
@kitlangton kitlangton changed the base branch from dev to worktree-cli-acp-builder May 19, 2026 01:07
@kitlangton kitlangton force-pushed the worktree-cli-acp-builder branch from e38f49f to 8513472 Compare May 19, 2026 01:11

One test file. Spawns `opencode <cmd> --help` for every documented
command + key subcommand (35 in total) in parallel under concurrency:8,
snapshots the stderr output (yargs writes --help to stderr, not stdout).

Snapshots are normalized — the tmpdir prefix that bleeds through
`acp --cwd`'s default is rewritten to `<HOME>` so test runs in
different sandboxes stay stable. macOS `/private` realpath form and
the unresolved `os.tmpdir()` form both covered.

Pinned snapshots catch flag removals, renames, reordering, and exit-code
regressions across the entire user-facing CLI surface in one place.
Diff in the .snap file is the surface-change report.

Excluded: `opencode completion --help` is a yargs built-in that emits
top-level help and exits 1; not a real opencode command.

~6s wall-clock thanks to parallel spawns.

Applied review findings from the simplify pass:

1. Extract fromBunStream(name, get) and forkStderrDrain(stream, into)
   helpers — 4 duplicated Stream.fromReadableStream call sites collapse
   to one factory, and the identical stderr drain across serve/acp is
   now a single helper. Error messages now include the underlying cause
   instead of swallowing it.

2. acp.send awaits proc.stdin.write's backpressure promise. The bare
   Effect.sync was discarding the Promise<number> form, which can
   reorder ndjson lines under pipe-buffer-full conditions and corrupt
   framing.

3. acp.close drops the try/catch around proc.stdin.end() — idempotent
   in Bun, the bare catch only masked future regressions.

4. Effect.ignore on stream drains is now Effect.ignore({ log: true })
   so a real protocol or decode error surfaces in test debug output
   instead of disappearing silently.

5. Help-snapshots replaces the manual failures[] accumulator + continue
   with Effect.partition. Same behavior, no mutable state, declarative.
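
To illustrate item 5, here is a plain-TypeScript sketch of the partition idea: every command's outcome is collected, then split into failures and successes in one step, so a single failing command never hides the others. The commit itself uses `Effect.partition`; the `Outcome` type and both functions below are hypothetical stand-ins that only mirror the concept.

```typescript
// Hypothetical outcome of running one help command.
type Outcome<A> =
  | { command: string; ok: true; value: A }
  | { command: string; ok: false; error: string }

// Split outcomes into [failures, successes], preserving input order in each.
function partition<A>(outcomes: readonly Outcome<A>[]): [string[], A[]] {
  const failures: string[] = []
  const successes: A[] = []
  for (const outcome of outcomes) {
    if (outcome.ok) successes.push(outcome.value)
    else failures.push(`${outcome.command}: ${outcome.error}`)
  }
  return [failures, successes]
}

// Build one report naming every failed command; empty string when all passed.
function formatFailures(failures: readonly string[]): string {
  if (failures.length === 0) return ""
  return `${failures.length} command(s) failed:\n${failures.join("\n")}`
}
```

A test would assert that the formatted report is empty, so a failure message lists all broken commands at once rather than stopping at the first.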

324/324 CLI tests stay green; typecheck clean.

yargs wraps the `[string] [default: "..."]` clause based on the
pre-normalized default value's character length, so a different random
tmpdir width produces a different leading-whitespace count on the
wrapped continuation line. After normalizing the path to `<HOME>` we
were left with a one-space drift between runs.

Collapse the wrap-dependent whitespace immediately before the clause
so the snapshot is byte-stable regardless of home path length.
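
One way to express that collapse is sketched below. The exact pattern used in the commit is not shown here, so the regex and the name `collapseWrapWhitespace` are assumptions; the sketch squeezes any run of spaces or tabs directly before a `[string] [default: ` clause down to a single space and leaves line breaks alone.

```typescript
// Hypothetical helper: make the indentation before the `[string] [default: ...]`
// clause independent of how wide the original (pre-normalized) path was.
function collapseWrapWhitespace(help: string): string {
  // The lookahead keeps the clause itself untouched; only the spaces/tabs
  // immediately before it are replaced.
  return help.replace(/[ \t]+(?=\[string\] \[default: )/g, " ")
}
```

Two runs whose wrapped continuation lines differ only by one leading space would then produce identical snapshot text.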

Verified by deleting the .snap and regenerating across 3 runs.