perf(kernel): cache filesystem usage for quota checks; fix WASI hot paths#208

Merged
NathanFlurry merged 1 commit into main from
stack/perf-kernel-cache-filesystem-usage-for-quota-checks-fix-wasi-hot-paths-suolmrts
Jul 2, 2026

@NathanFlurry

The WASI syscall dispatch tax (3.1, the biggest known optimization) was
dominated by the KERNEL, not the shim: every fd_open/fd_write ran a full
filesystem usage scan for quota accounting (~10ms each on a populated VFS),
plus redundant WASI post-open stats and per-op fixture churn.

  • Kernel: filesystem usage is cached and maintained incrementally — old/new
    size deltas for file writes/pwrite/truncate (path and fd paths), inode
    deltas for create/remove/symlink; rename and mount/import/snapshot/host-dir
    events invalidate (overlay copy-up topology is not locally delta-able;
    invariants documented per mutation path). The cache populates lazily via
    the RAW filesystem so quota bookkeeping never fires guest-attributable
    permission checks.
  • WASI shim: removed redundant path_open create/truncate stat round-trips;
    per-syscall metrics kept (sub-phase timings) for future attribution.
  • native-baseline: readdir fixtures marker-cached (setup out of timing);
    fs_stat_x32 measures identical work on every lane (batching one lane
    harder than the others distorts the differential — comment documents the
    quantized sub-ms wasm reading).
  • Stale DEFAULT_WASM_EXECUTION_TIMEOUT_MS limits-inventory entry removed
    (constant retired earlier for typed max_fuel + V8 CPU watchdog).
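
The incremental accounting in the Kernel bullet can be sketched roughly as follows. This is a minimal illustration, not the kernel's actual code: `RawFs`, `QuotaKernel`, `Usage`, and `QuotaExceeded` are hypothetical names, the filesystem is a flat in-memory map, and only write, remove, and rename are modeled. It shows the three behaviors described above: lazy population from the raw filesystem, size/inode deltas on ordinary mutations, and invalidation where a delta is not safe to compute locally.

```python
from dataclasses import dataclass, replace
from typing import Dict, Optional


class QuotaExceeded(Exception):
    """Raised when a mutation would push usage past the configured limits."""


@dataclass
class Usage:
    used_bytes: int = 0
    inodes: int = 0


class RawFs:
    """Hypothetical stand-in for the raw filesystem (no permission checks)."""

    def __init__(self) -> None:
        self.files: Dict[str, int] = {}  # path -> size in bytes
        self.scans = 0                   # number of full usage scans performed

    def scan_usage(self) -> Usage:
        # The expensive path: walk everything and total it up.
        self.scans += 1
        return Usage(sum(self.files.values()), len(self.files))


class QuotaKernel:
    def __init__(self, fs: RawFs, max_bytes: int, max_inodes: int) -> None:
        self.fs = fs
        self.max_bytes = max_bytes
        self.max_inodes = max_inodes
        self._cache: Optional[Usage] = None  # None means "invalidated"

    def usage(self) -> Usage:
        # Populate lazily on first use or after an invalidation.
        if self._cache is None:
            self._cache = self.fs.scan_usage()
        return replace(self._cache)  # copy, so callers cannot mutate the cache

    def _reserve(self, d_bytes: int, d_inodes: int) -> None:
        # Check the quota against cached usage, then record the delta.
        # Callers invoke this BEFORE mutating the filesystem.
        current = self.usage()
        if (current.used_bytes + d_bytes > self.max_bytes
                or current.inodes + d_inodes > self.max_inodes):
            raise QuotaExceeded(f"delta bytes={d_bytes} inodes={d_inodes}")
        self._cache = Usage(current.used_bytes + d_bytes,
                            current.inodes + d_inodes)

    def write_file(self, path: str, size: int) -> None:
        old = self.fs.files.get(path)
        # Old/new size delta; one new inode only if the file did not exist.
        self._reserve(size - (old or 0), 0 if old is not None else 1)
        self.fs.files[path] = size

    def remove(self, path: str) -> None:
        old = self.fs.files[path]  # KeyError if the path does not exist
        self._reserve(-old, -1)
        del self.fs.files[path]

    def rename(self, src: str, dst: str) -> None:
        # Assumption mirrored from the description: rename is not treated as
        # locally delta-able, so drop the cache and rescan on next use.
        self.fs.files[dst] = self.fs.files.pop(src)
        self._cache = None
```

In this sketch, a run of writes costs one scan in total instead of one scan per operation; only an invalidating event such as `rename` forces the next quota check to rescan.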

Release results, before -> after (guest = JS lane, wasm = wasm lane):

  • fs_write_small 0.24 -> 0.10ms guest, 25 -> 4ms wasm
  • fs_write_big 20.5 -> 4.4ms guest, 15 -> 14ms wasm
  • stat_storm 0.24 -> 0.07ms guest, 12 -> 5ms wasm
  • readdir_big 10.9 -> 6.1ms guest, 165 -> 14ms wasm (documented floor:
    the op stats all 1000 entries; fd_readdir itself is ~1ms)
  • ecosystem: ls_100 vmCmd 918 -> 490ms, git_init_commit 8.9s -> 1.4s
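
For readers who want the improvements as ratios, the before/after pairs above can be converted with a few lines of Python. This is only arithmetic on the numbers already listed; the `speedup` helper and the dictionary keys are illustrative names, not part of the bench harness.

```python
# (before, after) pairs in milliseconds, copied from the release results above.
RESULTS_MS = {
    "fs_write_small guest": (0.24, 0.10),
    "fs_write_small wasm": (25.0, 4.0),
    "fs_write_big guest": (20.5, 4.4),
    "fs_write_big wasm": (15.0, 14.0),
    "stat_storm guest": (0.24, 0.07),
    "stat_storm wasm": (12.0, 5.0),
    "readdir_big guest": (10.9, 6.1),
    "readdir_big wasm": (165.0, 14.0),
    "ls_100 vmCmd": (918.0, 490.0),
    "git_init_commit": (8900.0, 1400.0),
}


def speedup(before_ms: float, after_ms: float) -> float:
    """Return how many times faster the 'after' measurement is."""
    return before_ms / after_ms


if __name__ == "__main__":
    for name, (before, after) in RESULTS_MS.items():
        print(f"{name:24s} {before:>8.2f} -> {after:>8.2f} ms  "
              f"({speedup(before, after):.1f}x)")
```

The wasm lane of fs_write_small works out to 25 / 4 = 6.25x, while fs_write_big on the wasm lane is close to flat at 15 / 14.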

Kernel/limits/limits_audit/tls/http2 suites green; git + shell-redirect
semantics oracle 20/20; baseline regenerated; bench gate passes.

@NathanFlurry

Stack for rivet-dev/secure-exec

Get stack: forklift get 208
Push local edits: forklift submit
Merge when ready: forklift merge 208

@railway-app railway-app Bot temporarily deployed to secure-exec / secure-exec-pr-208 July 2, 2026 15:42 Destroyed
@NathanFlurry NathanFlurry merged commit 574caf1 into main Jul 2, 2026
1 of 4 checks passed
@NathanFlurry NathanFlurry deleted the stack/perf-kernel-cache-filesystem-usage-for-quota-checks-fix-wasi-hot-paths-suolmrts branch July 2, 2026 15:42
@railway-app railway-app Bot temporarily deployed to secure-exec / preview July 2, 2026 15:42 Inactive
@railway-app railway-app Bot temporarily deployed to secure-exec / production July 2, 2026 15:42 Inactive