docs(w47): investigation note — variance dominates the +24 % signal#84

Merged
chaploud merged 1 commit into main from
develop/w47-investigation-note
Apr 29, 2026
@chaploud
Contributor

Followed the W47 thread on Mac M4 Pro this session. Stable findings:

  • `string_ops` is two nested loops doing `i32.div_u 10 + br_if`: a small hot loop. The JIT engages (the interpreter is ~23× slower).
  • 5-run hyperfine produces σ ≈ 18 % of the mean for this specific benchmark.
  • 20-run remeasurement on commit `9a1c76b`:
    • uncached: 71.8 ± 13.3 ms
    • cached: 74.4 ± 13.5 ms
  • Against the v1.9.1 baseline of 62.2 ms (uncached) / 64.5 ms (cached), this is a uniform ~15 % slowdown on both variants. The "cached only" framing in the original W47 entry was an artifact of the 5-run sample landing on the high side; the cached-vs-uncached delta is noise.
  • A 5-run bisect across v1.9.1 → v1.10.0 (the Zig 0.15 → 0.16 migration window) would be unreliable at this variance. Measurement harness must be stabilised first.
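
To make the "unreliable at this variance" point concrete, here is a rough sketch of the arithmetic using the uncached numbers above. It assumes the v1.9.1 baseline has the same σ as the 20-run remeasurement (the baseline's spread is not reported here) and that runs are independent; the function names are illustrative only and not part of the repo.

```python
import math

def diff_standard_error(sigma_a_ms: float, sigma_b_ms: float, runs: int) -> float:
    """Standard error of the difference between two sample means,
    each taken over `runs` independent runs."""
    return math.sqrt(sigma_a_ms ** 2 / runs + sigma_b_ms ** 2 / runs)

def runs_needed(sigma_ms: float, target_se_ms: float) -> int:
    """Smallest run count for which sigma / sqrt(runs) <= target_se_ms."""
    return math.ceil((sigma_ms / target_se_ms) ** 2)

# Numbers from the remeasurement above (uncached variant).
BASELINE_MS = 62.2   # v1.9.1
CURRENT_MS = 71.8    # commit 9a1c76b
SIGMA_MS = 13.3      # ASSUMPTION: the same sigma is applied to the baseline

slowdown = CURRENT_MS / BASELINE_MS - 1.0
delta_ms = CURRENT_MS - BASELINE_MS

if __name__ == "__main__":
    print(f"slowdown: {slowdown:.1%}, delta: {delta_ms:.1f} ms")
    for runs in (5, 20, 50):
        se = diff_standard_error(SIGMA_MS, SIGMA_MS, runs)
        print(f"{runs:>2} runs: SE of the difference ~ {se:.2f} ms, "
              f"delta / SE ~ {delta_ms / se:.1f}")
```

Under those assumptions, at 5 runs the standard error of the difference is about the same size as the delta itself, so a single bisect step could easily point the wrong way.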

No code change; no fix attempted. Documenting the investigation so the next person to pick up W47 doesn't redo the same 30 minutes of hyperfine sampling.

Files

  • `.dev/w47-investigation.md` (new): self-contained log — hypothesis → repro numbers → why the benchmark is noisy → recommendations (50-run harness or in-process JIT timing; bisect once σ < 5 %; check Zig 0.16 / LLVM 19 codegen of the interpreter dispatch loop).
  • `.dev/checklist.md` W47 entry rewritten to reflect the remeasured numbers + pointer at the new note.
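
The "bisect once σ < 5 %" recommendation could be gated by a small check along these lines. This is a sketch, not what the note prescribes: it assumes the timings come from a `hyperfine --export-json` file, whose per-command entries carry a `times` array in seconds, and the helper names and the example file name are hypothetical.

```python
import json
import statistics

def relative_sigma(times_s: list[float]) -> float:
    """Sample standard deviation expressed as a fraction of the mean."""
    return statistics.stdev(times_s) / statistics.fmean(times_s)

def stable_enough(times_s: list[float], threshold: float = 0.05) -> bool:
    """True when run-to-run spread is below the threshold (default 5 %)."""
    return relative_sigma(times_s) < threshold

def load_times(path: str, index: int = 0) -> list[float]:
    """Read the per-run wall-clock times for one benchmarked command
    from a hyperfine JSON export."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)["results"][index]["times"]

if __name__ == "__main__":
    # Hypothetical file name; produce it with hyperfine's --export-json option.
    times = load_times("string_ops.json")
    print(f"sigma/mean = {relative_sigma(times):.1%}, "
          f"ok to bisect: {stable_enough(times)}")
```

The same check would work on timings from an in-process JIT timer, as long as they are collected as a flat list of per-run durations.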

Test plan

  • Doc-only diff (`.dev/` markdown only)
  • CI green on this branch

@chaploud chaploud merged commit f087bab into main Apr 29, 2026
8 checks passed
@chaploud chaploud deleted the develop/w47-investigation-note branch April 29, 2026 09:13
chaploud added a commit that referenced this pull request Apr 29, 2026
* docs: post-W50 cleanup — drop resume-guide, refocus memo on W53/C-g/W47

Plan B sub-3 (W50) and Plan C (W49) shipped via PRs #80..#83 in the
2026-04-29 PM autonomous session, plus the W47 investigation note in #84.
The .dev/resume-guide.md handover doc is now stale: its "Plan B sub-3 is
the next big lift" framing no longer matches reality, and the per-PR plan
items are all marked complete.

- Delete .dev/resume-guide.md; .dev/memo.md `## Current Task` is the
  single handover surface going forward.
- memo.md: refresh the active-work section to W53 → C-g → W47, with
  the per-item plan inlined (was previously split between memo and
  resume-guide).
- checklist.md: mark W50 done with the four-PR breakdown, refocus
  W49 on the C-g residual (3-platform bench baseline reset), open W53.
- environment.md: explain that the only Windows-skipped CI step left is
  `benchmark` and link to C-g for the rationale.
- roadmap.md: Windows CI guard removal flipped Done; W53 surfaced as
  the next active item.

* fix(w53): route rustup-init stdout through Out-Host to keep return scalar

PowerShell folds every native command's stdout into the enclosing
function's pipeline output. Inside `Install-Rustup`, that meant
rustup-init's `info: downloading component rust-std` (and the
similar lines from `rustup target add wasm32-wasip1`) were piling
up alongside the trailing `return $stampedDir`, so the caller's
`$rustRoot = Install-Rustup ...` was a string array rather than
a single path. The downstream

    $pathsToAdd += (Join-Path (Join-Path $paths['rust'] 'cargo') 'bin')

then exploded on the empty leading element with

    Cannot bind argument to parameter 'Path' because it is an empty string.

— matching the W53 symptom on a fresh GitHub-hosted Windows
runner. Local Windows mini-PC was unaffected because rustup's
"already installed" path is silent on stdout, so nothing leaked
into the function's return value there.

Fix: route both native command invocations through `2>&1 | Out-Host`,
which keeps the lines visible in the CI log but pulls them out of
the function's pipeline output. Also added a defensive check in the
caller so any future regression of this shape fails loud rather
than silently producing a malformed PATH.

ci.yml: drop `-SkipRust` and the separate `Setup Rust` step on the
Windows test job. The runner now goes through a single
`install-tools.ps1` path with a self-contained
`%LOCALAPPDATA%\zwasm-tools\rust-stable\` toolchain, the same as
local Windows users get.

* docs(w53): mark resolved with root-cause + fix summary

Updates the four docs that were tracking W53 as open work to
reflect the rustup-init stdout pollution diagnosis and the
`Out-Host` redirect fix in the previous commit.