
feat: runner-aware tools #346

Draft
branchseer wants to merge 24 commits into feat/output-restoration from runner-aware-tools

Conversation

@branchseer
Member

@branchseer branchseer commented Apr 18, 2026

Set up an IPC channel between vite-task and the processes it spawns, so that spawned tools can declare at runtime what they actually read, wrote, or depend on. vite-task then uses those declarations to decide what to fingerprint in the cache.

Design notes: docs/runner-task-ipc/.

Problems this PR solves

Every example below is exercised by patches/vite.patch, which wires vite build into the IPC through @voidzero-dev/vite-task-client.

1. Dynamic tracked envs

Before: the user had to declare every relevant env in vite-task.json, statically:

{
  "tasks": {
    "build": { "env": ["NODE_ENV", "VITE_*"], "cache": true }
  }
}

This duplicates knowledge the tool already has. Forgetting NODE_ENV silently skips cache invalidation on mode change. Envs matching envPrefix (VITE_* by default) get inlined into the bundle through import.meta.env.*, so setting envPrefix: 'MYAPP_' in vite.config.js without updating vite-task.json causes drift: the runner still tracks VITE_* while the build output is driven by MYAPP_*.

After: the tool declares its envs at runtime, driven by its own config.

// vite's resolveConfig
fetchEnv("NODE_ENV", { tracked: true });

// vite's loadEnv, one call per configured prefix — these envs are
// exposed to client code as import.meta.env.*, so their values are
// baked into the bundle
for (const prefix of envPrefix) {
  fetchEnvs(`${prefix}*`, { tracked: true });
}

The build task in vite-task.json needs no env: at all. Changing envPrefix in vite.config.js dynamically changes the set of envs the runner tracks, with zero config edits on the runner side.
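As a rough illustration of how a runner could turn these runtime requests into a tracked-env set, here is a minimal sketch. It is not vite-task's implementation (the runner is written in Rust); `envGlobToRegExp`, `collectTrackedEnvs`, and the request shape are hypothetical names chosen for this example, and only `*` is treated as a wildcard.

```javascript
// Hypothetical sketch, not vite-task's actual code.
// Convert an env-name glob such as "VITE_*" into an anchored RegExp.
function envGlobToRegExp(glob) {
  // Escape regex metacharacters except "*", then expand "*" to ".*".
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^${escaped.replace(/\*/g, ".*")}$`);
}

// Collect the name/value pairs that the tool asked the runner to track.
function collectTrackedEnvs(env, requests) {
  const tracked = {};
  for (const { pattern } of requests) {
    const re = envGlobToRegExp(pattern);
    for (const [name, value] of Object.entries(env)) {
      if (re.test(name)) tracked[name] = value;
    }
  }
  // Sort by name so the result does not depend on request order.
  const sorted = Object.entries(tracked).sort(([a], [b]) =>
    a < b ? -1 : a > b ? 1 : 0,
  );
  return Object.fromEntries(sorted);
}

const env = {
  NODE_ENV: "production",
  VITE_API: "/api",
  MYAPP_KEY: "k",
  HOME: "/home/u",
};

// Default envPrefix: NODE_ENV and VITE_API are tracked.
console.log(collectTrackedEnvs(env, [{ pattern: "NODE_ENV" }, { pattern: "VITE_*" }]));
// After switching envPrefix to "MYAPP_": NODE_ENV and MYAPP_KEY are tracked.
console.log(collectTrackedEnvs(env, [{ pattern: "NODE_ENV" }, { pattern: "MYAPP_*" }]));
```

The point of the sketch is that the tracked set is derived entirely from what the tool requested, so a change to envPrefix changes the set without any runner-side configuration.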

2. Exclude tool's cache dir from input/output

Vite stores pre-bundled deps under node_modules/.vite/ and bundled configs under node_modules/.vite-temp/. Every build reads the cache metadata (to check staleness) and writes fresh entries when the cache is stale. Without intervention the runner sees:

  • the reads → implicit inputs, so the cache key depends on dep-cache contents
  • the writes → implicit outputs
  • the same directory both read and written → the runner refuses to cache the run at all (read-write overlap)

There is a workaround already in vite-plus: voidzero-dev/vite-plus#1096 plus its follow-up #1198 hardcode !node_modules/.vite-temp/**, !node_modules/.vite/**/results.json, and !dist/** as negative input globs on every vp subcommand (build, test, pack). That's not good enough:

  • Leaks vite internals into vp. Every time vite changes its cache layout (new path under .vite/, moved temp dir, new subcommand with its own transient files), vp has to ship a matching glob update. It's a lockstep coupling that design-wise shouldn't exist.
  • Input-only, not symmetric. The globs suppress reads for the input fingerprint (which is enough to break the read-write overlap check), but the writes are still captured as outputs — meaning transient cache contents get archived into the runner's cache and restored on every hit, bloating the cache store.
  • Per-subcommand, per-tool duplication. #1198 already had to retrofit the same glob into three subcommands. Any new subcommand, and any third-party tool with similar behavior (Nuxt's .nuxt/, SvelteKit's .svelte-kit/, Next's .next/), needs its own hand-maintained list — vp can't ship it generically.

After:

// in loadCachedDepOptimizationMetadata
const depsCacheDir = getDepsCacheDir(environment);
ignoreInput(depsCacheDir);
ignoreOutput(depsCacheDir);

The declaration lives with the tool that owns the directory. The dep cache is vite's private concern.
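To make the effect of `ignoreInput(dir)` / `ignoreOutput(dir)` concrete, the following sketch filters a list of observed file accesses by directory. It assumes absolute, normalized POSIX-style paths; `isUnder`, `filterAccesses`, and the sample paths are hypothetical and only illustrate directory-aware matching, not the runner's real data structures.

```javascript
// Hypothetical sketch; assumes absolute, normalized POSIX-style paths.
// True when `path` is `dir` itself or lies somewhere beneath it.
function isUnder(dir, path) {
  const base = dir.endsWith("/") ? dir.slice(0, -1) : dir;
  return path === base || path.startsWith(base + "/");
}

// Drop every access that falls under one of the ignored directories.
function filterAccesses(paths, ignoredDirs) {
  return paths.filter((p) => !ignoredDirs.some((dir) => isUnder(dir, p)));
}

const reads = [
  "/app/src/main.js",
  "/app/node_modules/.vite/deps/chunk.js",
  "/app/node_modules/.vite-extra/x.js",
];
const ignoredInputs = ["/app/node_modules/.vite"];

// Only the file under node_modules/.vite/ is dropped; ".vite-extra" merely
// shares a string prefix and is kept.
console.log(filterAccesses(reads, ignoredInputs));
```

Matching on a path-segment boundary rather than a raw string prefix is what keeps sibling directories with similar names from being ignored by accident.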

3. Exclude output from input when a tool clears the folder before writing it

vite build calls emptyDir(outDir) before writing dist/. emptyDir has to read the directory entries to know what to delete — those reads look identical to genuine input reads. Since dist/ is also where vite writes its final output, the runner sees a read-write overlap on the same paths and refuses to cache.

After:

// in prepareOutDir, right before emptyDir()
ignoreInput(outDir);

Only the writes count. The pattern generalizes: any tool that wipes-then-writes the same directory needs to tell the runner "my enumeration reads aren't inputs."
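The interaction between the read-write overlap check and `ignoreInput` can be sketched as below. This is a simplified model under stated assumptions: overlap is detected on exact paths, paths are POSIX-style, and `within` and `checkCacheable` are hypothetical names, not the runner's API.

```javascript
// Hypothetical, simplified model of the overlap rule described above.
function within(dir, path) {
  return path === dir || path.startsWith(dir + "/");
}

// A run is cacheable only if no remaining input path was also written.
function checkCacheable({ reads, writes, ignoredInputs }) {
  const inputs = reads.filter((p) => !ignoredInputs.some((d) => within(d, p)));
  const written = new Set(writes);
  const overlap = inputs.filter((p) => written.has(p));
  return { cacheable: overlap.length === 0, inputs, overlap };
}

// emptyDir() enumerates dist/ before the build writes into it.
const run = {
  reads: ["/app/src/main.js", "/app/dist", "/app/dist/index.html"],
  writes: ["/app/dist/index.html", "/app/dist/assets/main.js"],
};

console.log(checkCacheable({ ...run, ignoredInputs: [] }).cacheable); // false
console.log(checkCacheable({ ...run, ignoredInputs: ["/app/dist"] }).cacheable); // true
```

With the output directory ignored as an input, only `/app/src/main.js` remains in the input set, so the writes to dist/ no longer collide with anything.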

What's in this PR

  • Step 1 — Protocol (vite_task_ipc_shared): message types + serialization shared by both ends.
  • Step 2 — Transport (vite_task_server + vite_task_client): async server, sync blocking client, tested Rust-to-Rust.
  • Step 3 — Extract artifact crate out of fspy for dylib embedding. (Landed on main via refactor: extract materialized_artifact crate out of fspy #344 as materialized_artifact.)
  • Step 4 — JS bridge: vite_task_client_napi + @voidzero-dev/vite-task-client JS wrapper (fetchEnv single-name + fetchEnvs glob, with dedupe against already-set process.env).
  • Step 5 — Runner integration: server started per task execution, client dylib embedded/extracted, IPC envs injected via serve()'s returned iterator.
  • Step 6 — Cache integration: runner consumes reported ignored inputs/outputs, tracked env requests (single + glob), and disable-cache signals when fingerprinting.
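For Step 6, validating tracked envs on cache lookup might look roughly like the sketch below. It covers only single-name tracked envs (glob requests would additionally need to detect newly added matching names), and `findChangedTrackedEnv`, `lookup`, and the entry shape are hypothetical, not the runner's real types.

```javascript
// Hypothetical sketch: compare recorded tracked envs with the current env.
// Returns the name of the first env whose value differs, or null on a match.
function findChangedTrackedEnv(trackedEnvs, currentEnv) {
  for (const [name, recorded] of Object.entries(trackedEnvs)) {
    // An unset env is represented as null, so unset -> set also invalidates.
    const current = Object.prototype.hasOwnProperty.call(currentEnv, name)
      ? currentEnv[name]
      : null;
    if (current !== recorded) return name;
  }
  return null;
}

// Decide hit or miss for a cache entry that stored its tracked envs.
function lookup(entry, currentEnv) {
  const changed = findChangedTrackedEnv(entry.trackedEnvs, currentEnv);
  return changed === null
    ? "cache hit"
    : `cache miss: tracked env '${changed}' changed`;
}

const entry = { trackedEnvs: { NODE_ENV: "production" } };
console.log(lookup(entry, { NODE_ENV: "production" }));
console.log(lookup(entry, { NODE_ENV: "development" }));
```

Storing the tracked envs alongside the cache entry is what lets the check happen before the task runs, even though the tool only declared those envs during a previous execution.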

Test plan

  • Rust integration tests for server/client transport (vite_task_server/tests/integration.rs)
  • E2E snapshot fixtures per client method: ignore_input, ignore_output, fetch_env, fetch_envs_glob, disable_cache
  • E2E test caching a real vite build via patches/vite.patch (vite_build_cache fixture): NODE_ENV-change invalidation, envPrefix-driven tracked-env set change, dist/ write restoration on cache hit

@branchseer branchseer changed the base branch from main to graphite-base/346 April 20, 2026 02:19
@branchseer branchseer changed the base branch from graphite-base/346 to feat/output-restoration April 20, 2026 02:19
Member Author

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.
Learn more

This stack of pull requests is managed by Graphite. Learn more about stacking.

@branchseer branchseer changed the title feat(ipc): runner-aware tools — protocol + transport (partial) feat: runner-aware tools Apr 20, 2026
branchseer and others added 17 commits April 20, 2026 12:18
- vite_task_ipc_shared: shared protocol (Request/GetEnvResponse, NativeStr)
- vite_task_server: per-task IPC server (Handler trait + Recorder)
- vite_task_client: sync Rust client
- vite_task_client_napi + @voidzero-dev/vite-task-client: node addon + JS wrapper
- vite_task: wire IPC server into spawn; inject VP_IPC + VP_RUN_NODE_CLIENT_PATH;
  bundle with fspy via Tracking struct; materialize .node addon on first use

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Step 6 of docs/runner-task-ipc/plan.md.

- Apply `ignoreInputs` to filter inferred fspy reads (directory-aware)
- Apply `ignoreOutputs` to filter auto-detected writes (overlap check + archive)
- Short-circuit cache update on `disableCache()` via new
  `CacheNotUpdatedReason::ToolRequested`
- Embed `tracked: true` envs in `PostRunFingerprint.tracked_envs`; validate
  on lookup by comparing against the current parent env
- Recorder env_map sources from `std::env::vars_os()` so tools can resolve
  envs the user never declared
- Bump cache schema to 13

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
New fixture `ipc_client_test` exercises each IPC method through the JS
wrapper (@voidzero-dev/vite-task-client) inside a real cached task:

- ignoreInput → the ignored dir can mutate without invalidating cache
- ignoreOutput → read-write overlap under an ignored dir still caches
- disableCache → forces re-execution on next run
- fetchEnv(tracked: true) → env change invalidates cache; same value hits

The e2e harness now copies packages/vite-task-client into each staging
node_modules so fixtures can `import { ... } from "@voidzero-dev/vite-task-client"`
without pnpm install.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Applies a small pnpm patch to vite 8.0.8 that auto-injects a runner-aware
plugin at plugin-resolution time. When `VP_RUN_NODE_CLIENT_PATH` is set
(i.e. the child runs under `vp run`), the plugin:
- `ignoreInput(outDir)` — suppress fspy reads of the output dir (emptyDir
  scans dist/ before writing)
- `ignoreInput/Output(<root>/node_modules)` — machine state (pnpm store +
  vite's `.vite`/`.vite-temp` caches) is not user input/output
- `getEnv("NODE_ENV", true)` — tracked; drives DCE and define replacements

New e2e fixture `vite_build_cache` proves `vt run --cache build` produces
a cache hit on the second run and restores `dist/assets/main.js` after
deletion, all with zero manual input/output configuration.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Extensions

Rework `patches/vite.patch` to match the shape of the eventual upstream
Vite PR:

- Drop the synthetic `vite:runner-aware` plugin. Each IPC call is now
  inlined right at the Vite code that triggers the fs / env access:
  - `ignoreInput(outDir)` in `prepareOutDir` before `emptyDir` scans it
  - `ignoreInput(depsCacheDir)` + `ignoreOutput(depsCacheDir)` in
    `loadCachedDepOptimizationMetadata` before the dep optimizer cache
    is read / written
  - `fetchEnv("NODE_ENV", { tracked: true })` in `resolveConfig` before
    `process.env.NODE_ENV` is first consulted
  - `ignoreInput`/`ignoreOutput` of `.vite-temp/` in
    `loadConfigFromBundledFile` (bundled-config temp write+import)
- Static `import` of `@voidzero-dev/vite-task-client` by name — the
  wrapper no-ops when no runner is connected, so no guard is needed at
  the call sites.
- Add a `packageExtensions` entry in `pnpm-workspace.yaml` that injects
  the wrapper as a real dependency of Vite. The final upstream PR would
  instead declare it in `packages/vite/package.json`; the only delta
  between experiment and PR is that one line.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
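The "no guard needed at the call sites" property can be illustrated with a wrapper that degrades to no-ops when no runner is present. This is only a sketch: `createClient`, the injected `loadAddon` function, and the addon's method signatures are assumptions made for the example; the only names taken from this PR are `VP_RUN_NODE_CLIENT_PATH` and the wrapper method names.

```javascript
// Hypothetical sketch of a wrapper that no-ops outside a runner.
// `loadAddon` stands in for however the real wrapper loads its native client.
function createClient(env, loadAddon) {
  const addonPath = env.VP_RUN_NODE_CLIENT_PATH;
  const addon = addonPath ? loadAddon(addonPath) : null;
  return {
    ignoreInput(path) {
      if (addon) addon.ignoreInput(path);
    },
    ignoreOutput(path) {
      if (addon) addon.ignoreOutput(path);
    },
    fetchEnv(name, { tracked = true } = {}) {
      // Without a runner, fall back to the process's own environment.
      return addon ? addon.fetchEnv(name, tracked) : env[name];
    },
  };
}

// Outside a runner: calls succeed and do nothing beyond reading the env.
const standalone = createClient({ NODE_ENV: "test" }, () => {
  throw new Error("addon must not be loaded without a runner");
});
standalone.ignoreInput("/app/dist");
console.log(standalone.fetchEnv("NODE_ENV"));
```

Because the fallback lives inside the wrapper, a tool can import it statically and call it unconditionally, which keeps the patch to the tool small.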
The previous snapshot embedded Vite's minified JS output, which would
churn on every Vite version bump. Add a tiny `vtt stat-file` helper that
reports `exists` / `missing` and use that instead.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Demonstrates end-to-end that Vite's patched `fetchEnv("NODE_ENV", { tracked: true })`
reaches the runner: flipping NODE_ENV between runs yields `tracked env
'NODE_ENV' changed`, while holding it constant still produces a cache hit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous case proved the cache invalidated when NODE_ENV flipped, but
not that the tool actually used the new value. Source now carries a
`process.env.NODE_ENV` branch whose marker (`BUILD_MODE_PROD` /
`BUILD_MODE_DEV`) is DCE-pruned by Vite's define + minifier, so only the
branch matching the current mode survives in the output.

Add a `vtt grep-file` helper to inspect the bundle without dumping its
whole (minified) body into the snapshot, and assert both markers against
the production and development builds.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nv caching

Makes the effect of NODE_ENV changes visible in `dist/assets/main.js`: the
bundle contains only the surviving literal (`PROD build` or `DEV build`)
after Vite's define-plugin substitution + DCE.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…olves

Also update the inspection hint in the comment to match the default
`dist/assets/index-<hash>.js` filename now that vite.config.js is gone.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… fields

Follows the convention introduced in main (#347): per-`[[e2e]]` and per-
step descriptions use the TOML `comment` field instead of bare `#` lines,
so they render under the snapshot headings and inside each step's block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The accept loop's `tokio::select!` could exit via the shutdown branch
before ever observing a connection that had already been established at
the kernel level, so fire-and-forget clients that connect, write, and
exit right before the runner signals stop_accepting would silently lose
their requests. After the main loop exits we now do one non-blocking
`poll!` of `listener.accept()` per iteration until it returns Pending,
ensuring every backlog-queued connection gets its handle_client future
pushed and drained.

Also:
- drop the now-redundant `crates/vite_task_client_napi/tests/e2e.rs`;
  the IPC path is covered end-to-end by the `ipc_client_test` fixture
  plus `vite_build_cache`
- oxfmt the fixture scripts and the JS wrapper

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@branchseer branchseer force-pushed the feat/output-restoration branch from 0008bd7 to 994624a Compare April 20, 2026 04:20
branchseer and others added 5 commits April 20, 2026 12:36
…ests

Two CI fixes rolled together:

1. `cargo-shear --deny-warnings` failed after the removal of
   `vite_task_client_napi/tests/e2e.rs`: the crate still listed the
   test's deps (rustc-hash, tokio, vite_task_server, vite_path) and the
   workspace still referenced `vite_task_client_napi` in non-shear-aware
   ways. Drop those deps from the napi crate and add
   `vite_task_client_napi` to the workspace-level cargo-shear ignore list
   (same rationale as fspy_preload_*: it's an artifact dep loaded by
   string name, not `use`-d in Rust).

2. Revert the speculative server-side drain-accept loop — on Windows
   the interprocess Listener's named-pipe implementation crashed the
   integration test binary at startup (no tests even ran). Instead,
   have each fire-and-forget test end with a tiny `flush(&client)`
   round-trip (a cheap `get_env` that waits for a response). Since
   frames on a single stream are read sequentially by the server, once
   the flush's response returns, every preceding fire-and-forget frame
   has definitely been dispatched to the handler — no server-side race
   fix needed. 10/10 repeat runs pass locally.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`cargo-shear 1.11.1 --deny-warnings` treats the 'test = true on lib
target X but source contains no tests' messages as errors. Add
`test = false` (plus `doctest = false` where missing) to the `[lib]`
sections of the four IPC crates so cargo does not generate empty test
harnesses for them. Integration tests in `tests/*` are unaffected.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI's `RUSTDOCFLAGS='-D warnings' cargo doc --no-deps --document-private-items`
fails on the `[`SpawnFingerprint`]` link in `collect_tracked_envs`'s
docstring — it's not in scope at that site. Rewrite the prose to drop
the link; no information lost.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
typos 1.45.1 rejected the PR because:
- `./patches/vite.patch` includes Vite's own hunk-header line containing
  a truncated identifier (`environmen`) that looks like a typo but isn't
  ours to fix. Add `patches` to `.typos.toml` extend-exclude.
- `docs/runner-task-ipc/index.md:39` had a real typo `respone` → `respond`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
branchseer and others added 2 commits April 20, 2026 13:09
…c abs paths

On Windows, forward-slash paths without a drive letter (`/tmp/x.txt`)
are RELATIVE, so the client's `resolve_path` joined them with the cwd
(`D:\...\tmp\x.txt`) and the server-side assertion blew up. Use
`/tmp/` on unix and `C:\tmp\` on windows so the paths are absolute on
each platform and reach the server unchanged.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
On Windows CI these ignored tests crash their child processes with
"failed to start the persistent thread of the Interprocess linger pool:
Access is denied" from interprocess 2.4 as soon as the Node addon's
client connects. The server-side unit tests on Windows already cover
the IPC protocol; the crash is a downstream interprocess crate issue
that doesn't affect our code paths. Add `platform = "unix"` so the
ignored suite passes on Windows CI, with a comment pointing at the
upstream root cause.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
* @param {string} name
* @param {{ tracked?: boolean }} [options]
*/
export function fetchEnv(name, { tracked = true } = {}) {
Member


Since this is synchronous, and doesn't fetch anything, I recommend calling it getEnv instead.
