feat(test): PTY-based interactive CLI snapshot harness#2052

Draft
fengmk2 wants to merge 20 commits into main from rfc/interactive-snapshot-tests
Conversation

fengmk2 commented Jul 5, 2026 (Member)

Implements the RFC included in this PR as rfcs/interactive-snapshot-tests.md.

  • crates/vite_cli_snapshots: PTY snapshot harness (libtest-mimic). Every step runs in a real pseudo-terminal with vt100 grid capture; interactive steps script keystrokes synchronized on OSC 8 milestones; snapshots are Markdown compared with real pass/fail semantics (UPDATE_SNAPSHOTS=1 to accept, .md.new plus unified diff on mismatch). Built on the pty_terminal/snapshot_test crates from vite-task at the already-pinned rev.
  • One fixture tree replaces the local/global split: each case declares vp = "local" | "global" | ["local", "global"]. The parity matrix already caught real drift between the two help outputs. Per-case VP_HOME/HOME/npm-prefix isolation removes serial and the bootstrap byte-match requirement; seed-runtime symlinks a provisioned managed runtime so cases do not download Node per case.
  • vpt test multitool: vtt-aligned subcommands plus json-edit, chmod, and probe (interactive payload proving milestone round-trips before product commands are instrumented).
  • tool migrate-snap-tests <dir> --vp <flavor> [filter]: one-click migration of old steps.json cases with a report; validated by migrating the four check-pass* cases, which pass under the new harness with equivalent assertions.
  • packages/prompts: milestone emission (select/confirm/text, gated on VP_EMIT_MILESTONES=1, byte-identical to the vite-task protocol, unit-tested; render output unchanged when disabled).
  • Entry points: just snapshot-test [filter], pnpm snapshot-test. CI runs the suite on the Rust test job with VP_SNAP_SKIP_FLAVORS=local until the JS build joins during migration.
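The compare/accept semantics described above can be sketched roughly as follows. This is a minimal illustration, not the harness code: `check_snapshot` and `Outcome` are hypothetical names, the `.new` file is written beside the snapshot as an assumption about layout, and the unified diff printing is left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Result of checking one rendered snapshot against its recorded file.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Recorded and actual output are identical.
    Match,
    /// Update mode was on and the recorded file was (re)written.
    Updated,
    /// Output differs; the actual output was saved for inspection.
    Mismatch { new_file: PathBuf },
}

/// Hypothetical helper: compare `actual` with the snapshot stored at `path`.
/// `update` stands for "UPDATE_SNAPSHOTS=1 was set".
fn check_snapshot(path: &Path, actual: &str, update: bool) -> io::Result<Outcome> {
    // A missing snapshot is treated like a mismatch, not like an I/O error.
    let recorded = match fs::read_to_string(path) {
        Ok(text) => Some(text),
        Err(err) if err.kind() == io::ErrorKind::NotFound => None,
        Err(err) => return Err(err),
    };
    if recorded.as_deref() == Some(actual) {
        return Ok(Outcome::Match);
    }
    if update {
        fs::write(path, actual)?;
        return Ok(Outcome::Updated);
    }
    // Keep the actual output next to the snapshot (e.g. `case.md.new`)
    // so a diff can be shown and the trial can fail.
    let mut name = path.as_os_str().to_owned();
    name.push(".new");
    let new_file = PathBuf::from(name);
    fs::write(&new_file, actual)?;
    Ok(Outcome::Mismatch { new_file })
}
```

A caller would typically derive `update` from the environment, for example `std::env::var("UPDATE_SNAPSHOTS").map_or(false, |v| v == "1")`, and turn `Mismatch` into a failed trial.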

Validation: 8 snapshot trials pass in compare mode (including the interactive milestone case), prompts suite 10/10, clippy/rustfmt/vp check clean, just test excludes the harness crate.

Follow-ups per the RFC phasing: local-registry case support, remaining prompt components (multiselect, password, spinner), migration batches and old-harness removal.

  • Windows: the suite runs in the new cli-snapshot-test-windows job via a cross-compiled nextest archive (no Rust toolchain on the runner); first landing on this PR, watch its leg.

netlify Bot commented Jul 5, 2026

Deploy Preview for viteplus-preview canceled.

| Name | Link |
| --- | --- |
| 🔨 Latest commit | bbf3a94 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/viteplus-preview/deploys/6a4a5a362be0500008ccb379 |

socket-security Bot commented Jul 5, 2026

Review the following changes in direct dependencies. Learn more about Socket for GitHub.

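| Diff | Package | Supply Chain Security | Vulnerability | Quality | Maintenance | License |
| --- | --- | --- | --- | --- | --- | --- |
| Added | cargo/nix@0.28.0 | 80 | 100 | 93 | 100 | 70 |
| Added | cargo/winreg@0.10.1 | 82 | 100 | 93 | 100 | 100 |
| Added | cargo/cp_r@0.5.2 | 100 | 100 | 93 | 100 | 100 |
| Added | cargo/toml@1.1.2+spec-1.1.0 | 100 | 100 | 93 | 100 | 100 |
| Added | cargo/libtest-mimic@0.8.2 | 100 | 100 | 99 | 100 | 100 |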

View full report


vp-specific additions with no `vtt` counterpart: `vpt json-edit <file> <dot-path> <value>` (the existing snap-tests `json-edit` helper for fixture manifest edits) and `vpt chmod`.

Reusing `vtt` itself was considered and rejected. Cargo git dependencies provide library code only, never a dependency's binaries, so obtaining the `vtt` executable would require an out-of-band `cargo install --git` pinned in lockstep with the other vite-task git deps across local dev, CI, and nextest archives. Reusing it as a library would mean depending on `vite_task_bin` and dragging the entire `vt` product tree (task engine, TUI, server, fspy) into the harness build for a handful of trivial helpers. And vp-specific subcommands would then need upstream PRs plus dep bumps before tests here could use them. If the duplication ever becomes a maintenance burden, the designated path is upstream extraction: vite-task moves the subcommands into a small library crate (as `pty_terminal` already is for the emulator) and `vtt`/`vpt` become thin bin wrappers over it.
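For illustration, the "thin bin wrappers over a small library crate" shape could look roughly like the sketch below. Everything here is hypothetical: no such shared crate exists yet, `shared_tools` stands in for it, `print-args` is a placeholder subcommand, and the `json-edit` arm only describes the edit instead of performing it.

```rust
/// Stand-in for the subcommands a shared library crate might export
/// (hypothetical; inlined as a module so the sketch is self-contained).
mod shared_tools {
    /// Runs a shared subcommand. Returns `None` when `name` is not one of them.
    pub fn run(name: &str, args: &[String]) -> Option<Result<String, String>> {
        match name {
            // Placeholder subcommand, not a real `vtt` command.
            "print-args" => Some(Ok(args.join(" "))),
            _ => None,
        }
    }
}

/// vp-only subcommands layered on top of the shared set.
fn run_vp_only(name: &str, args: &[String]) -> Option<Result<String, String>> {
    match name {
        "json-edit" => Some(match args {
            // The real helper would edit the file; this only reports the intent.
            [file, path, value] => Ok(format!("would set {path} = {value} in {file}")),
            _ => Err("usage: json-edit <file> <dot-path> <value>".to_string()),
        }),
        _ => None,
    }
}

/// What a thin `vpt` binary would do: try the shared set first,
/// then its own additions, and fail on anything else.
fn dispatch(name: &str, args: &[String]) -> Result<String, String> {
    shared_tools::run(name, args)
        .or_else(|| run_vp_only(name, args))
        .unwrap_or_else(|| Err(format!("unknown subcommand: {name}")))
}
```

With this split, a `vtt` binary would call only `shared_tools::run`, while `vpt` adds its extra arms without copying the shared implementations.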

@wan9chi Could you help work out how to make vt reusable or extendable by vpt, so that vpt only adds its extra commands instead of duplicating the implementation?

fengmk2 added 11 commits July 5, 2026 14:03
Implements rfcs/interactive-snapshot-tests.md:

- crates/vite_cli_snapshots: libtest-mimic runner that executes every step
  in a real PTY (vt100 grid capture), synchronizes interactive input on
  OSC 8 milestones, and compares Markdown snapshots with real pass/fail
  semantics (UPDATE_SNAPSHOTS=1 to accept)
- one fixture tree with per-case vp flavor (local, global, or both for
  parity), per-case VP_HOME/HOME isolation, and managed-runtime seeding
- vpt test multitool (vtt-aligned subcommands plus json-edit, chmod, probe)
- tool migrate-snap-tests: one-click conversion of old steps.json cases,
  validated by migrating the check-pass cases end to end
- packages/prompts: milestone emission (VP_EMIT_MILESTONES=1) wired into
  select, confirm, and text renders
- just snapshot-test recipe, pnpm snapshot-test wrapper, and a CI step on
  the Rust test job (global flavor only until the JS build joins during
  migration)

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Full both-flavor snapshot coverage now runs in cli-snap-test, which builds
packages/cli/dist for the local flavor and reuses the installed release
binary for the global flavor via the new VP_SNAP_GLOBAL_VP override (no
second vite_global_cli compile). The Rust test job keeps its fast
global-only leg for early signal, with a comment explaining the split.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
The archive's package loop is kept in sync with the justfile test recipe,
which already excluded vite_cli_snapshots; the archive copy was missed, so
the relocated cli_snapshots binary panicked at nextest list time on the
Windows runner (its compile-time CARGO_MANIFEST_DIR is a Linux path).

Also prefer the runtime CARGO_MANIFEST_DIR over the compile-time value in
the harness, which is what relocated nextest archives rewrite; this is the
groundwork for the planned Windows legs.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
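A minimal sketch of the runtime-over-compile-time preference described in the commit above. `resolve_manifest_dir` is a hypothetical name, and the two values are passed in as parameters so the logic can be exercised outside Cargo; treating an empty runtime value as unset is an added assumption.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Picks the manifest directory, preferring the value seen at run time.
///
/// In a Cargo-built crate the call site would look like:
/// `resolve_manifest_dir(std::env::var_os("CARGO_MANIFEST_DIR"), env!("CARGO_MANIFEST_DIR"))`
fn resolve_manifest_dir(runtime: Option<OsString>, compile_time: &str) -> PathBuf {
    match runtime {
        // The runtime variable is the one a relocated archive can rewrite.
        Some(dir) if !dir.is_empty() => PathBuf::from(dir),
        // Fall back to the path baked in when the test binary was compiled.
        _ => PathBuf::from(compile_time),
    }
}
```

The compile-time value alone is a path on the build machine, which is why a binary cross-compiled on Linux and run on a Windows runner cannot rely on it.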
The e2e job's repo-wide vp check flagged the markdown table alignment.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
The fixtures under crates/vite_cli_snapshots/tests are workspaces under
test, not repo code, same as the existing snap-tests excludes.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Adds the harness reference README (case/step/interaction schema, vpt
helpers, milestone conventions, env overrides, migration workflow) and
points AGENTS.md and CONTRIBUTING.md at it, marking the legacy snap trees
as migration-only so new cases land in the new harness.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
build-windows-tests now cross-compiles a dedicated -p vite_cli_snapshots
archive (test binary + vpt), and the new cli-snapshot-test-windows job runs
it on windows-latest with no Rust toolchain: prebuilt vp via
VP_SNAP_GLOBAL_VP, JS CLI built on the runner for the local flavor, managed
runtime prewarmed for seed-runtime. vpt resolution now prefers the runtime
CARGO_BIN_EXE_vpt that nextest rewrites under --workspace-remap, matching
the CARGO_MANIFEST_DIR handling.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Reuse: milestone hex encoding via Buffer.toString('hex'); the migrator now
imports the legacy Steps schema from snap-test.ts instead of redeclaring it.
Simplification: shared makeTodo/redirect handling and a verbatim-vpt set in
the migrator; dead NewStep fields dropped; shared find_beside_test_exe and
manifest_dir helpers in the harness; the always-true separator guard from
the upstream port removed.
Efficiency: redaction regexes compiled once per run (LazyLock), diagnostic
sort skipped when no blocks exist, fixture staging filters harness metadata
instead of copy-then-delete, per-step env only cloned when a step overrides
it, and the prompts milestone flag is cached at module load (per-keystroke
path).

Suite output is unchanged: all 8 trials pass against existing snapshots and
the migrator reproduces byte-identical fixtures.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
fengmk2 force-pushed the rfc/interactive-snapshot-tests branch from fefbe9d to e6ed290 on July 5, 2026 06:10
fengmk2 added 9 commits July 5, 2026 14:44
std's canonicalize returns a \\?\ verbatim path on Windows; CMD.EXE,
which runs the local flavor's .cmd shims, rejects verbatim/UNC working
directories ('UNC paths are not supported'), so every local-flavor case
failed instantly on the Windows snapshot job. Also run the Windows suite
with --no-fail-fast: on a snapshot suite every diff is diagnostic signal.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
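One way to avoid handing CMD.EXE a verbatim working directory is to strip the prefix from the canonicalized path, sketched below on plain strings. This is an assumption about the approach, not the change made in the commit above, and `strip_verbatim_prefix` is a hypothetical helper.

```rust
/// Converts `\\?\C:\dir` to `C:\dir` and `\\?\UNC\server\share` to
/// `\\server\share`. Paths without a verbatim prefix are returned unchanged.
fn strip_verbatim_prefix(path: &str) -> String {
    if let Some(rest) = path.strip_prefix(r"\\?\UNC\") {
        // Verbatim UNC form: restore the ordinary `\\server\share` spelling.
        format!(r"\\{rest}")
    } else if let Some(rest) = path.strip_prefix(r"\\?\") {
        // Verbatim drive form: the remainder is already a normal path.
        rest.to_string()
    } else {
        path.to_string()
    }
}
```

Note that the UNC branch still yields a UNC path, which CMD.EXE also rejects as a working directory; only the drive-letter case becomes usable this way.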
The full both-flavor suite already runs in cli-snap-test (linux/mac) and
cli-snapshot-test-windows; the extra leg only re-ran the global cases and
cost a vite_global_cli build inside the Rust test job.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
On Windows ~/.vite-plus/bin/vp.exe is the trampoline, which re-execs
%VP_HOME%/current/bin/vp.exe; the harness gives each case an isolated
VP_HOME with no install inside, so vp_help::help::global failed with
'failed to execute ...current/bin/vp.exe'. Use current/bin/vp.exe (the
real CLI) as VP_SNAP_GLOBAL_VP, matching what bin/vp is on Unix.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Converted with tool migrate-snap-tests (15 help/version steps, zero hand
conversions). The new snapshot keeps every legacy assertion and adds the
banner line the old pipe capture missed; vp -V now records the isolated
workspace state (tools Not found) since the new harness does not symlink
the checkout node_modules into fixtures.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
…llisions

Successfully converted case directories are now removed from the legacy
tree automatically (git history keeps the originals; --keep-old defers).
A case whose target fixture already exists is skipped and reported instead
of clobbering it: the same name in both legacy trees means a hand merge.

Also migrates packages/cli/snap-tests/cli-helper-message as the first such
merge: the fixture gains a cli_helper_message_local case (vp -h / -V) next
to the 15-step global one, sharing the legacy package.json; the global
snapshot is unchanged by the added file.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Extracted from the cli-snap-test matrix per review: a dedicated Linux/macOS
job with no runner.os/shard filter conditions, mirroring the Windows job's
structure (build-upstream for dist + release vp, bootstrap-cli:ci, runtime
prewarm for seed-runtime, then cargo test).

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
build-upstream and the snapshot suite never touch docs/ (the Windows
snapshot job already runs green without it); the step exists in
cli-e2e-test for pnpm tsgo, which this job does not run.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Converted with tool migrate-snap-tests (env-prefixed commands became step
envs, win32 skip became skip-platforms; zero hand conversions; the old case
dir was removed by the migrator). The task-cache flow asserts identically:
cold miss, cache hit with replay trailer, and env-changed miss. Build sizes
and the asset hash are now recorded concretely instead of masked; they are
deterministic per vite version.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
The win32 skip was inherited from the legacy case (added in #544 with no
stated reason). Its plausible causes are gone in the new harness: env
prefixes are structured step envs instead of shell syntax, the task engine
supports Windows, .gitattributes forces LF so the asset content hash is
checkout-stable, and vite prints forward-slash paths on every OS. The
Windows snapshot job is the arbiter; if it disagrees, the skip returns
with a documented reason.

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
fengmk2 commented Jul 5, 2026

Feedback from #2031's Windows run (https://github.com/voidzero-dev/vite-plus/actions/runs/28742954669/job/85229263838): 9 of 31 trials fail, all snapshot mismatches from the new app-command fixtures, and both root causes are Windows gaps in the harness rather than product or fixture bugs.

1. redact.rs: the Windows backslash normalization is gated on an absolute-path redaction having matched the same screen.

In redact_string, the cow_replace("\\", "/") pass lives inside the if let Cow::Owned(..) branch, so it only runs when one of the path redaction pairs (temp dir, VP_HOME, ...) matched that screen. Screens that contain only relative native paths are never normalized. tsdown prints OS-native separators, so every fixture that snapshots pack output fails:

-ℹ entry: src/index.ts        (recorded on macOS)
+ℹ entry: src\index.ts        (Windows actual)
-ℹ dist/index.mjs  0.10 kB │ gzip: 0.11 kB
+ℹ dist\index.mjs  0.10 kB │ gzip: 0.11 kB

Affected trials: app_root_default_package::default_package (both flavors), pack_default_entry (both), single_package::pack_in_place (both), cwd_flag (both). The old harness's replaceUnstableOutput normalized backslashes unconditionally, which is why the old-format versions of these same tests passed Windows CI. Suggested fix: make the per-screen backslash-to-slash normalization unconditional on Windows (independent of whether a redaction pair matched).
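A rough sketch of the suggested shape for fix 1, using only the standard library instead of the `cow_replace` helper. `redact_screen` and its parameters are hypothetical; the real `redact_string` may be structured differently.

```rust
use std::borrow::Cow;

/// Applies each `(needle, placeholder)` redaction pair, then normalizes
/// backslashes to forward slashes when `normalize_separators` is set.
/// A caller would pass `cfg!(windows)` for that flag.
fn redact_screen<'a>(
    screen: &'a str,
    pairs: &[(&str, &str)],
    normalize_separators: bool,
) -> Cow<'a, str> {
    let mut out = Cow::Borrowed(screen);
    // Pairs run first so that needles containing native separators
    // (temp dir, VP_HOME, ...) still match the raw screen text.
    for &(needle, placeholder) in pairs {
        if out.contains(needle) {
            out = Cow::Owned(out.replace(needle, placeholder));
        }
    }
    // Not nested under the "a pair matched" case: this runs for every screen.
    if normalize_separators && out.contains('\\') {
        out = Cow::Owned(out.replace('\\', "/"));
    }
    out
}
```

The trade-off is that every backslash on the screen is rewritten, including ones that are not path separators, which matches what the description above says the old `replaceUnstableOutput` did.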

2. Renderer: ConPTY repaints a row padded to full grid width with explicit spaces when a second console client attaches.

app_root_auto_select::auto_select::global and app_root_default_package::default_package::global capture one line padded with trailing spaces to the terminal width:

-Tip: run this directly with `vp -C apps/web build`
+Tip: run this directly with `vp -C apps/web build`                    ...(~400 spaces)

The signature is precise: it occurs exactly at the boundary where vp's own stdout lines are followed by the spawned tool's first writes, and only in the global flavor, where vp.exe spawns node as a second ConPTY client (the local flavor, with node as the direct PTY child, passes the same fixtures). Suggested fix: trim trailing whitespace per rendered row before snapshot comparison.
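The trimming suggested for fix 2 could be as small as the sketch below. `trim_rows` is a hypothetical name, and where it would hook into the renderer is not shown.

```rust
/// Strips trailing whitespace from every rendered row and rejoins the rows
/// with `\n`. A final trailing newline in the input is not preserved.
fn trim_rows(screen: &str) -> String {
    screen
        .lines()
        .map(str::trim_end)
        .collect::<Vec<_>>()
        .join("\n")
}
```

This would hide the ConPTY padding, at the cost of also hiding trailing spaces that a command printed on purpose.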

Rider on 1 (cosmetic): dist/index.mjs reports 0.11 kB on Windows vs 0.10 kB recorded in two trials (single_package::pack_in_place), so the emitted bundle is a few bytes larger on Windows (line endings or an embedded path, not yet pinned down). Only visible because sizes are not redacted; worth deciding whether sizes should be.

With fixes 1 and 2 in the harness, all nine trials should pass against the already-recorded snapshots; nothing in the #2031 fixtures needs to change. Happy to re-run the Windows job on #2031 once this lands to confirm.
