feat(test): PTY-based interactive CLI snapshot harness #2052
vp-specific additions with no `vtt` counterpart: `vpt json-edit <file> <dot-path> <value>` (the existing snap-tests `json-edit` helper for fixture manifest edits) and `vpt chmod`.

Reusing `vtt` itself was considered and rejected. Cargo git dependencies provide library code only, never a dependency's binaries, so obtaining the `vtt` executable would require an out-of-band `cargo install --git` pinned in lockstep with the other vite-task git deps across local dev, CI, and nextest archives. Reusing it as a library would mean depending on `vite_task_bin` and dragging the entire `vt` product tree (task engine, TUI, server, fspy) into the harness build for a handful of trivial helpers. And vp-specific subcommands would then need upstream PRs plus dep bumps before tests here could use them. If the duplication ever becomes a maintenance burden, the designated path is upstream extraction: vite-task moves the subcommands into a small library crate (as `pty_terminal` already is for the emulator) and `vtt`/`vpt` become thin bin wrappers over it.
@wan9chi Please help me see how to make vt reusable or inheritable/extendable by vpt, so that vpt only needs to add additional commands without duplicating the code implementation.
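To make the "thin bin wrappers" option from the RFC text concrete, here is a minimal sketch of one possible shape, not what vite-task or this PR actually implements. The handler type, the `noop`/`count-args` shared commands, and the stub bodies are all hypothetical placeholders; only the `json-edit` and `chmod` names come from the RFC excerpt.

```rust
use std::collections::BTreeMap;

/// Hypothetical handler signature: arguments in, process exit code out.
type Handler = fn(&[String]) -> i32;

// Placeholder shared helpers; the real vtt subcommands are not shown in this PR.
fn noop(_args: &[String]) -> i32 {
    0
}

fn count_args(args: &[String]) -> i32 {
    args.len() as i32
}

// Stubs for the vp-specific additions; they only validate argument counts.
fn json_edit_stub(args: &[String]) -> i32 {
    // <file> <dot-path> <value>
    if args.len() == 3 { 0 } else { 2 }
}

fn chmod_stub(args: &[String]) -> i32 {
    if args.is_empty() { 2 } else { 0 }
}

/// What an extracted library crate could export: the shared command table.
fn shared_commands() -> BTreeMap<&'static str, Handler> {
    let mut table: BTreeMap<&'static str, Handler> = BTreeMap::new();
    table.insert("noop", noop);
    table.insert("count-args", count_args);
    table
}

/// vpt starts from the shared table and registers only its own extras.
fn vpt_commands() -> BTreeMap<&'static str, Handler> {
    let mut table = shared_commands();
    table.insert("json-edit", json_edit_stub);
    table.insert("chmod", chmod_stub);
    table
}

/// Both binaries would share this dispatcher; 127 mirrors "command not found".
fn dispatch(table: &BTreeMap<&'static str, Handler>, argv: &[String]) -> i32 {
    match argv.split_first() {
        Some((name, rest)) => match table.get(name.as_str()) {
            Some(handler) => handler(rest),
            None => 127,
        },
        None => 2,
    }
}
```

With this layout each binary's `main` would reduce to building its table and calling `dispatch`, so adding a vp-only command never touches the shared code.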
Implements rfcs/interactive-snapshot-tests.md:

- crates/vite_cli_snapshots: libtest-mimic runner that executes every step in a real PTY (vt100 grid capture), synchronizes interactive input on OSC 8 milestones, and compares Markdown snapshots with real pass/fail semantics (UPDATE_SNAPSHOTS=1 to accept)
- one fixture tree with per-case vp flavor (local, global, or both for parity), per-case VP_HOME/HOME isolation, and managed-runtime seeding
- vpt test multitool (vtt-aligned subcommands plus json-edit, chmod, probe)
- tool migrate-snap-tests: one-click conversion of old steps.json cases, validated by migrating the check-pass cases end to end
- packages/prompts: milestone emission (VP_EMIT_MILESTONES=1) wired into select, confirm, and text renders
- just snapshot-test recipe, pnpm snapshot-test wrapper, and a CI step on the Rust test job (global flavor only until the JS build joins during migration)

Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
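For readers unfamiliar with synchronizing on OSC 8, the sketch below shows roughly how a harness could detect a milestone in raw PTY output before sending the next keystroke. OSC 8 framing (`ESC ] 8 ; params ; URI` terminated by BEL or `ESC \`) is standard; the `milestone:` URI scheme and the hex-encoded name are assumptions made for illustration, since the actual vite-task payload layout is not shown in this PR.

```rust
/// Collects the URI of every complete OSC 8 sequence in `bytes`.
/// Closing sequences carry an empty URI and are skipped.
fn osc8_uris(bytes: &[u8]) -> Vec<String> {
    const START: &[u8] = b"\x1b]8;";
    let mut uris = Vec::new();
    let mut i = 0;
    while i + START.len() <= bytes.len() {
        if &bytes[i..i + START.len()] != START {
            i += 1;
            continue;
        }
        let body_start = i + START.len();
        // Look for the string terminator: BEL or ESC \.
        let mut end = body_start;
        let mut term_len = 0;
        while end < bytes.len() {
            if bytes[end] == 0x07 {
                term_len = 1;
                break;
            }
            if bytes[end] == 0x1b && bytes.get(end + 1) == Some(&b'\\') {
                term_len = 2;
                break;
            }
            end += 1;
        }
        if term_len == 0 {
            break; // unterminated sequence: the caller should wait for more output
        }
        // Body is "params;URI"; params may be empty.
        let body = &bytes[body_start..end];
        if let Some(sep) = body.iter().position(|&b| b == b';') {
            let uri = String::from_utf8_lossy(&body[sep + 1..]).into_owned();
            if !uri.is_empty() {
                uris.push(uri);
            }
        }
        i = end + term_len;
    }
    uris
}

/// Decodes a lowercase or uppercase hex string into UTF-8 text.
fn hex_decode(hex: &str) -> Option<String> {
    if hex.len() % 2 != 0 || !hex.bytes().all(|b| b.is_ascii_hexdigit()) {
        return None;
    }
    let bytes: Option<Vec<u8>> = (0..hex.len())
        .step_by(2)
        .map(|i| u8::from_str_radix(&hex[i..i + 2], 16).ok())
        .collect();
    String::from_utf8(bytes?).ok()
}

/// Hypothetical scheme: `milestone:<hex of the milestone name>`.
const SCHEME: &str = "milestone:";

fn milestones(output: &[u8]) -> Vec<String> {
    osc8_uris(output)
        .iter()
        .filter_map(|uri| uri.strip_prefix(SCHEME))
        .filter_map(hex_decode)
        .collect()
}

/// True once the named milestone has appeared in the captured output.
fn reached(output: &[u8], name: &str) -> bool {
    milestones(output).iter().any(|m| m == name)
}
```

A runner built this way would poll `reached` on the accumulated output and only write the scripted keystrokes once it returns true, which avoids sleeping for fixed intervals.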
Full both-flavor snapshot coverage now runs in cli-snap-test, which builds packages/cli/dist for the local flavor and reuses the installed release binary for the global flavor via the new VP_SNAP_GLOBAL_VP override (no second vite_global_cli compile). The Rust test job keeps its fast global-only leg for early signal, with a comment explaining the split. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
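As an illustration of how per-case flavors turn into individual trials, here is a small sketch under stated assumptions. The `Case` struct, the `case::flavor` naming, and the comma-separated skip list are hypothetical; the PR description only shows `VP_SNAP_SKIP_FLAVORS=local` and a trial name such as `vp_help::help::global`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Flavor {
    Local,
    Global,
}

impl Flavor {
    fn name(self) -> &'static str {
        match self {
            Flavor::Local => "local",
            Flavor::Global => "global",
        }
    }
}

/// Hypothetical case definition: a name plus the flavors it should run under.
struct Case {
    name: &'static str,
    flavors: Vec<Flavor>,
}

/// Parses a skip list such as "local" or "local,global" (format assumed).
fn parse_skip(value: &str) -> Vec<Flavor> {
    value
        .split(',')
        .filter_map(|item| match item.trim() {
            "local" => Some(Flavor::Local),
            "global" => Some(Flavor::Global),
            _ => None,
        })
        .collect()
}

/// One trial per (case, flavor) pair that is not skipped.
fn expand_trials(cases: &[Case], skip: &[Flavor]) -> Vec<String> {
    let mut trials = Vec::new();
    for case in cases {
        for flavor in &case.flavors {
            if skip.contains(flavor) {
                continue;
            }
            trials.push(format!("{}::{}", case.name, flavor.name()));
        }
    }
    trials
}
```

Listing a case under both flavors is what makes parity checking possible: the two trials compare against the same fixture, so drift between the local and global CLI shows up as a snapshot mismatch.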
The archive's package loop is kept in sync with the justfile test recipe, which already excluded vite_cli_snapshots; the archive copy was missed, so the relocated cli_snapshots binary panicked at nextest list time on the Windows runner (its compile-time CARGO_MANIFEST_DIR is a Linux path). Also prefer the runtime CARGO_MANIFEST_DIR over the compile-time value in the harness, which is what relocated nextest archives rewrite; this is the groundwork for the planned Windows legs. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
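The runtime-over-compile-time preference can be sketched as follows. This is a minimal illustration rather than the harness's actual helper: `pick_manifest_dir` is a hypothetical name, and `option_env!` is used here (instead of `env!`) only so the snippet also compiles outside Cargo.

```rust
use std::ffi::OsString;
use std::path::PathBuf;

/// Prefers the value present in the process environment (which a relocated
/// nextest archive rewrites) and falls back to the path baked in at build time.
fn pick_manifest_dir(runtime: Option<OsString>, compile_time: Option<&str>) -> Option<PathBuf> {
    runtime
        .filter(|value| !value.is_empty())
        .map(PathBuf::from)
        .or_else(|| compile_time.map(PathBuf::from))
}

/// Wiring against the real sources of the two values.
fn manifest_dir() -> Option<PathBuf> {
    pick_manifest_dir(
        std::env::var_os("CARGO_MANIFEST_DIR"),
        option_env!("CARGO_MANIFEST_DIR"),
    )
}
```

Keeping the decision in a pure function makes the cross-compile case easy to test: a Linux build path baked into the binary loses to whatever the Windows runner provides at run time.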
The e2e job's repo-wide vp check flagged the markdown table alignment. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
The fixtures under crates/vite_cli_snapshots/tests are workspaces under test, not repo code, same as the existing snap-tests excludes. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Adds the harness reference README (case/step/interaction schema, vpt helpers, milestone conventions, env overrides, migration workflow) and points AGENTS.md and CONTRIBUTING.md at it, marking the legacy snap trees as migration-only so new cases land in the new harness. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
build-windows-tests now cross-compiles a dedicated -p vite_cli_snapshots archive (test binary + vpt), and the new cli-snapshot-test-windows job runs it on windows-latest with no Rust toolchain: prebuilt vp via VP_SNAP_GLOBAL_VP, JS CLI built on the runner for the local flavor, managed runtime prewarmed for seed-runtime. vpt resolution now prefers the runtime CARGO_BIN_EXE_vpt that nextest rewrites under --workspace-remap, matching the CARGO_MANIFEST_DIR handling. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Reuse: milestone hex encoding via Buffer.toString('hex'); the migrator now
imports the legacy Steps schema from snap-test.ts instead of redeclaring it.
Simplification: shared makeTodo/redirect handling and a verbatim-vpt set in
the migrator; dead NewStep fields dropped; shared find_beside_test_exe and
manifest_dir helpers in the harness; the always-true separator guard from
the upstream port removed.
Efficiency: redaction regexes compiled once per run (LazyLock), diagnostic
sort skipped when no blocks exist, fixture staging filters harness metadata
instead of copy-then-delete, per-step env only cloned when a step overrides
it, and the prompts milestone flag is cached at module load (per-keystroke
path).
Suite output is unchanged: all 8 trials pass against existing snapshots and
the migrator reproduces byte-identical fixtures.
Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
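The "compiled once per run" pattern looks roughly like the sketch below. To stay dependency-free it uses plain substring rules where the harness uses compiled regexes, and the rule contents and the `BUILDS` counter are hypothetical, present only to make the single initialization observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::LazyLock;

/// Counts how many times the rule table is built (demonstration only).
static BUILDS: AtomicUsize = AtomicUsize::new(0);

/// (needle, placeholder) pairs standing in for compiled redaction regexes.
/// The closure runs on first access and never again.
static REDACTIONS: LazyLock<Vec<(&'static str, &'static str)>> = LazyLock::new(|| {
    BUILDS.fetch_add(1, Ordering::SeqCst);
    vec![
        ("/tmp/case-1234", "<workspace>"),
        ("/home/alice", "<home>"),
    ]
});

/// Applies every rule to one captured line.
fn redact(line: &str) -> String {
    REDACTIONS
        .iter()
        .fold(line.to_string(), |acc, (needle, placeholder)| {
            acc.replace(*needle, placeholder)
        })
}
```

`LazyLock` requires Rust 1.80 or newer; on older toolchains `OnceLock` with an explicit `get_or_init` gives the same effect.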
std's canonicalize returns a \\?\ verbatim path on Windows; CMD.EXE,
which runs the local flavor's .cmd shims, rejects verbatim/UNC working
directories ('UNC paths are not supported'), so every local-flavor case
failed instantly on the Windows snapshot job. Also run the Windows suite
with --no-fail-fast: on a snapshot suite every diff is diagnostic signal.
Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
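The commit body above does not show the fix itself, so the following is only a sketch of one common approach: strip the verbatim prefix from the canonicalized path before using it as a working directory. It operates on strings so it can be exercised on any OS; `strip_verbatim` is a hypothetical name. The `dunce` crate offers a more thorough version of the same idea.

```rust
/// Converts a Windows verbatim path into its conventional form when that is
/// straightforward; anything else is returned unchanged.
fn strip_verbatim(path: &str) -> String {
    if let Some(rest) = path.strip_prefix(r"\\?\UNC\") {
        // \\?\UNC\server\share -> \\server\share
        // (still a UNC path, so CMD.EXE would keep rejecting it as a cwd)
        format!(r"\\{rest}")
    } else if let Some(rest) = path.strip_prefix(r"\\?\") {
        // Only simplify the drive-letter form: \\?\C:\dir -> C:\dir
        let bytes = rest.as_bytes();
        if bytes.len() >= 2 && bytes[0].is_ascii_alphabetic() && bytes[1] == b':' {
            rest.to_string()
        } else {
            path.to_string()
        }
    } else {
        path.to_string()
    }
}
```

A real implementation also has to consider paths that are only expressible in verbatim form (for example names longer than the legacy limit or containing reserved names), which this sketch ignores.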
The full both-flavor suite already runs in cli-snap-test (linux/mac) and cli-snapshot-test-windows; the extra leg only re-ran the global cases and cost a vite_global_cli build inside the Rust test job. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
On Windows ~/.vite-plus/bin/vp.exe is the trampoline, which re-execs %VP_HOME%/current/bin/vp.exe; the harness gives each case an isolated VP_HOME with no install inside, so vp_help::help::global failed with 'failed to execute ...current/bin/vp.exe'. Use current/bin/vp.exe (the real CLI) as VP_SNAP_GLOBAL_VP, matching what bin/vp is on Unix. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Converted with tool migrate-snap-tests (15 help/version steps, zero hand conversions). The new snapshot keeps every legacy assertion and adds the banner line the old pipe capture missed; vp -V now records the isolated workspace state (tools Not found) since the new harness does not symlink the checkout node_modules into fixtures. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
…llisions Successfully converted case directories are now removed from the legacy tree automatically (git history keeps the originals; --keep-old defers). A case whose target fixture already exists is skipped and reported instead of clobbering it: the same name in both legacy trees means a hand merge. Also migrates packages/cli/snap-tests/cli-helper-message as the first such merge: the fixture gains a cli_helper_message_local case (vp -h / -V) next to the 15-step global one, sharing the legacy package.json; the global snapshot is unchanged by the added file. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Extracted from the cli-snap-test matrix per review: a dedicated Linux/macOS job with no runner.os/shard filter conditions, mirroring the Windows job's structure (build-upstream for dist + release vp, bootstrap-cli:ci, runtime prewarm for seed-runtime, then cargo test). Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
build-upstream and the snapshot suite never touch docs/ (the Windows snapshot job already runs green without it); the step exists in cli-e2e-test for pnpm tsgo, which this job does not run. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Converted with tool migrate-snap-tests (env-prefixed commands became step envs, win32 skip became skip-platforms; zero hand conversions; the old case dir was removed by the migrator). The task-cache flow asserts identically: cold miss, cache hit with replay trailer, and env-changed miss. Build sizes and the asset hash are now recorded concretely instead of masked; they are deterministic per vite version. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
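To illustrate what "env-prefixed commands became step envs" means, here is a deliberately naive sketch of the conversion. It is not the migrator's code (the migrator lives in the JS tooling); the `Step` struct is hypothetical, and the splitter ignores shell quoting, so values containing spaces are out of scope.

```rust
use std::collections::BTreeMap;

/// Hypothetical structured step: environment overrides plus the bare command.
struct Step {
    env: BTreeMap<String, String>,
    command: String,
}

/// Shell-style variable names: a letter or underscore, then letters, digits, underscores.
fn is_env_name(name: &str) -> bool {
    let mut chars = name.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_')
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Peels leading `NAME=value` words off a legacy command line.
fn split_env_prefix(line: &str) -> Step {
    let mut env = BTreeMap::new();
    let mut words = line.split_whitespace().peekable();
    while let Some(&word) = words.peek() {
        match word.split_once('=') {
            Some((name, value)) if is_env_name(name) => {
                env.insert(name.to_string(), value.to_string());
                words.next();
            }
            // First word that is not an assignment starts the command.
            _ => break,
        }
    }
    Step {
        env,
        command: words.collect::<Vec<_>>().join(" "),
    }
}
```

Structured envs matter for portability: the prefix form is POSIX shell syntax, while a map of variables can be applied to the child process the same way on every platform.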
The win32 skip was inherited from the legacy case (added in #544 with no stated reason). Its plausible causes are gone in the new harness: env prefixes are structured step envs instead of shell syntax, the task engine supports Windows, .gitattributes forces LF so the asset content hash is checkout-stable, and vite prints forward-slash paths on every OS. The Windows snapshot job is the arbiter; if it disagrees, the skip returns with a documented reason. Claude-Session: https://claude.ai/code/session_01NRgjMi2Vus3iJctudGEWPT
Feedback from #2031's Windows run (https://github.com/voidzero-dev/vite-plus/actions/runs/28742954669/job/85229263838): 9 of 31 trials fail, all snapshot mismatches from the new app-command fixtures, and both root causes are Windows gaps in the harness rather than product or fixture bugs.

1. In Affected trials:
2. Renderer: ConPTY repaints a row padded to full grid width with explicit spaces when a second console client attaches. The signature is precise: it occurs exactly at the boundary where vp's own stdout lines are followed by the spawned tool's first writes, and only in the

Rider on 1 (cosmetic):

With fixes 1 and 2 in the harness, all nine trials should pass against the already-recorded snapshots; nothing in the #2031 fixtures needs to change. Happy to re-run the Windows job on #2031 once this lands to confirm.

Implements the RFC included in this PR as `rfcs/interactive-snapshot-tests.md`.

- `crates/vite_cli_snapshots`: PTY snapshot harness (libtest-mimic). Every step runs in a real pseudo-terminal with vt100 grid capture; interactive steps script keystrokes synchronized on OSC 8 milestones; snapshots are Markdown compared with real pass/fail semantics (`UPDATE_SNAPSHOTS=1` to accept, `.md.new` plus unified diff on mismatch). Built on the pty_terminal/snapshot_test crates from vite-task at the already-pinned rev.
- `vp = "local" | "global" | ["local", "global"]`. The parity matrix already caught real drift between the two help outputs. Per-case `VP_HOME`/`HOME`/npm-prefix isolation removes `serial` and the bootstrap byte-match requirement; `seed-runtime` symlinks a provisioned managed runtime so cases do not download Node per case.
- `vpt` test multitool: vtt-aligned subcommands plus `json-edit`, `chmod`, and `probe` (interactive payload proving milestone round-trips before product commands are instrumented).
- `tool migrate-snap-tests <dir> --vp <flavor> [filter]`: one-click migration of old `steps.json` cases with a report; validated by migrating the four `check-pass*` cases, which pass under the new harness with equivalent assertions.
- `packages/prompts`: milestone emission (select/confirm/text, gated on `VP_EMIT_MILESTONES=1`, byte-identical to the vite-task protocol, unit-tested; render output unchanged when disabled).
- `just snapshot-test [filter]`, `pnpm snapshot-test`. CI runs the suite on the Rust test job with `VP_SNAP_SKIP_FLAVORS=local` until the JS build joins during migration.

Validation: 8 snapshot trials pass in compare mode (including the interactive milestone case), prompts suite 10/10, clippy/rustfmt/`vp check` clean, `just test` excludes the harness crate.

Follow-ups per the RFC phasing: `local-registry` case support, remaining prompt components (multiselect, password, spinner), migration batches and old-harness removal.
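The compare/accept semantics described above can be sketched as below. This is an illustration under assumptions, not the harness's implementation: the `Outcome` enum and function names are hypothetical, a missing snapshot is treated as a mismatch unless updating, and the unified diff that accompanies a mismatch is left out.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

#[derive(Debug, PartialEq)]
enum Outcome {
    /// Recorded snapshot equals the captured output.
    Match,
    /// Update mode: the snapshot file was (re)written.
    Accepted,
    /// Compare mode: the captured output was written next to the snapshot.
    Mismatch { new_file: PathBuf },
}

/// True when the environment asks for snapshots to be accepted.
fn update_requested() -> bool {
    std::env::var("UPDATE_SNAPSHOTS").map(|v| v == "1").unwrap_or(false)
}

fn check_snapshot(snapshot: &Path, actual: &str, update: bool) -> io::Result<Outcome> {
    let expected = match fs::read_to_string(snapshot) {
        Ok(text) => Some(text),
        Err(err) if err.kind() == io::ErrorKind::NotFound => None,
        Err(err) => return Err(err),
    };
    if expected.as_deref() == Some(actual) {
        return Ok(Outcome::Match);
    }
    if update {
        fs::write(snapshot, actual)?;
        return Ok(Outcome::Accepted);
    }
    // case.md -> case.md.new, so the reviewer can diff the two files.
    let mut name = snapshot.as_os_str().to_owned();
    name.push(".new");
    let new_file = PathBuf::from(name);
    fs::write(&new_file, actual)?;
    Ok(Outcome::Mismatch { new_file })
}
```

A test body would call `check_snapshot(path, &captured, update_requested())` and fail the trial on `Mismatch`, which is what gives the suite real pass/fail behaviour instead of silently rewriting files.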