
feat(bench): criterion-based corpus baseline + wasm-opt version pinning (v1.0.0 Track E) #116

Merged
avrabe merged 2 commits into main from release/v1.0.0-pr-bench-criterion
May 15, 2026


@avrabe avrabe commented May 15, 2026

Summary

Wires the corpus measurement harness up as a cargo bench and adds wasm-opt version pinning. cargo bench -p loom-testing --bench corpus_baseline now produces the same comparison matrix as scripts/measure_corpus.sh.

What lands

  • loom-testing/benches/corpus_baseline.rs (~870 LOC) — criterion-driven harness that:
    • Runs the per-fixture LOOM vs wasm-opt -O3 vs meld+wasm-opt vs meld+LOOM pipeline.
    • Emits a markdown table to stdout AND to docs/measurements/v<workspace-version>-corpus-baseline-criterion.md.
    • Sums code-section bytes from wasm-tools objdump (the LEB128-correct parser from PR-R).
  • loom-testing/Cargo.toml — adds [[bench]] name = "corpus_baseline" harness = false.
  • scripts/wasm-opt.pinned — pin file with current version version_116 + workflow comments.
  • scripts/check_wasm_opt_version.sh — standalone shell pin-checker (also invoked in-process by the bench at startup).
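The code-section summing step could look roughly like the sketch below. It assumes the `wasm-tools objdump` text output is one section per line with pipe-separated columns, one of which reads `<N> bytes`; that layout and the function name `sum_code_section_bytes` are assumptions for illustration, not taken from the bench source. A component can embed several core modules, which is why the sizes are summed rather than read from a single line.

```rust
/// Sum the byte sizes of every `code` section in objdump-style text.
/// Assumed line shape: `  code | 0x2a - 0x5b | 49 bytes | 2 count`.
pub fn sum_code_section_bytes(objdump_text: &str) -> u64 {
    objdump_text
        .lines()
        .filter_map(|line| {
            let mut cols = line.split('|').map(str::trim);
            // The first column is the section name; keep only code sections.
            if cols.next()? != "code" {
                return None;
            }
            // Pick the column that looks like "<N> bytes" and parse N.
            cols.find_map(|col| col.strip_suffix("bytes")?.trim().parse::<u64>().ok())
        })
        .sum()
}
```

Lines that do not match the assumed shape are skipped, so an empty or unparsable dump yields 0 rather than an error.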

wasm-opt version pinning workflow

  • Skip-if-same: at bench startup, check_wasm_opt_pin() compares wasm-opt --version against scripts/wasm-opt.pinned. Match → silent OK. Mismatch → non-fatal warning with upgrade guidance. Missing → wasm-opt columns marked n/a.
  • Bumping the pin: update scripts/wasm-opt.pinned to the new version string. CI / human reviewers see the diff and can re-run the baseline.
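A minimal sketch of the three-way outcome described above (match, mismatch, missing). The names `PinStatus`, `read_pin` and `classify_pin` are hypothetical, and the pin file is assumed to use `#` for comment lines; the real `check_wasm_opt_pin()` may be structured differently.

```rust
#[derive(Debug, PartialEq, Eq)]
pub enum PinStatus {
    /// Installed wasm-opt equals the pinned token: silent OK.
    Match,
    /// Versions differ: warn, but keep running.
    Mismatch { pinned: String, found: String },
    /// wasm-opt is not installed: wasm-opt columns become n/a.
    Missing,
}

/// Return the first line of the pin file that is neither blank nor a `#` comment.
/// The `#` comment syntax is an assumption about scripts/wasm-opt.pinned.
pub fn read_pin(pin_file_text: &str) -> Option<&str> {
    pin_file_text
        .lines()
        .map(str::trim)
        .find(|line| !line.is_empty() && !line.starts_with('#'))
}

/// Compare the pinned token with the installed one (`None` when wasm-opt is absent).
pub fn classify_pin(pinned: &str, installed: Option<&str>) -> PinStatus {
    match installed {
        None => PinStatus::Missing,
        Some(found) if found == pinned => PinStatus::Match,
        Some(found) => PinStatus::Mismatch {
            pinned: pinned.to_string(),
            found: found.to_string(),
        },
    }
}
```

None of the three states is an error, which mirrors the non-fatal policy: the caller decides how loudly to report each one.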

Run

cargo bench -p loom-testing --bench corpus_baseline           # full run
cargo bench -p loom-testing --bench corpus_baseline -- --test # smoke (one iteration per case)
bash scripts/check_wasm_opt_version.sh                        # standalone pin check

Note

The shell scripts/measure_corpus.sh is unchanged — it remains the manual fallback for environments without cargo.

🤖 Generated with Claude Code

avrabe added 2 commits May 15, 2026 15:35
…ng (v1.0.0 PR-S)

Adds a cargo-bench equivalent to scripts/measure_corpus.sh so the
corpus comparison matrix (LOOM vs wasm-opt -O3 vs meld+wasm-opt vs
meld+LOOM) can be produced via `cargo bench -p loom-testing --bench
corpus_baseline`. The shell script remains as the manual fallback.

What the bench does
- For every fixture in the same workload list the shell harness uses,
  shells out to LOOM, wasm-opt, wasm-tools, and (for components) meld.
- Sums code-section bytes via `wasm-tools objdump` (matches the shell
  harness's awk logic exactly).
- Renders a markdown report identical in shape to
  docs/measurements/v0.9.0-corpus-baseline.md, to:
    - stdout (so `cargo bench` log is grep-able by CI), and
    - docs/measurements/v<workspace-version>-corpus-baseline.md
      (so each bench run produces a versioned baseline artefact).
- Wraps each LOOM pass in `criterion::bench_function`, so timings land
  in target/criterion/ alongside the markdown table.
- Times out individual tool invocations after PER_RUN_TIMEOUT (default
  300s) to keep developer laptops responsive on large components.
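
One way to bound a tool invocation using only the standard library is to poll `Child::try_wait` against a deadline, as sketched here. `run_with_timeout` and the 20 ms poll interval are illustrative assumptions; only the `PER_RUN_TIMEOUT` name and its 300 s default come from the text above, and the bench may implement the timeout differently.

```rust
use std::io;
use std::process::{Command, ExitStatus, Stdio};
use std::thread;
use std::time::{Duration, Instant};

/// Default per-invocation budget (300 s, as stated in the commit message).
pub const PER_RUN_TIMEOUT: Duration = Duration::from_secs(300);

/// Run `cmd` to completion, or kill it once `timeout` has elapsed.
/// Returns `Ok(Some(status))` on normal exit and `Ok(None)` on timeout.
pub fn run_with_timeout(cmd: &mut Command, timeout: Duration) -> io::Result<Option<ExitStatus>> {
    let mut child = cmd.stdout(Stdio::null()).stderr(Stdio::null()).spawn()?;
    let deadline = Instant::now() + timeout;
    loop {
        // try_wait is non-blocking: Some(status) means the child has exited.
        if let Some(status) = child.try_wait()? {
            return Ok(Some(status));
        }
        if Instant::now() >= deadline {
            child.kill()?;
            child.wait()?; // reap the killed child so it does not linger as a zombie
            return Ok(None);
        }
        thread::sleep(Duration::from_millis(20));
    }
}
```

A caller would typically map `Ok(None)` to an n/a cell in the report instead of failing the whole run.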

Output rendering runs at process exit via libc `atexit`, since
criterion_main!() returns cleanly and Rust never runs destructors for
`static` items, so a Drop-based hook would not fire.
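
The exit-hook pattern can be sketched as follows. This is an illustration under assumptions: `atexit` is declared by hand so the example needs no external crate, `REPORT_ROWS`, `record_row` and `register_report_at_exit` are hypothetical names, and the `unsafe extern` block syntax requires Rust 1.82 or newer.

```rust
use std::io::Write;
use std::os::raw::c_int;
use std::sync::Mutex;

// Hand-written declaration of the C runtime's atexit (assumption: the bench
// may bind it differently, for example through the libc crate).
unsafe extern "C" {
    fn atexit(callback: extern "C" fn()) -> c_int;
}

/// Report rows collected while the benchmarks run (hypothetical shape).
static REPORT_ROWS: Mutex<Vec<String>> = Mutex::new(Vec::new());

pub fn record_row(row: &str) {
    REPORT_ROWS.lock().unwrap().push(row.to_string());
}

/// Called by the C runtime during process exit.
extern "C" fn render_report() {
    // Never panic across the C boundary: skip on a poisoned lock, ignore I/O errors.
    if let Ok(rows) = REPORT_ROWS.lock() {
        let mut out = std::io::stdout();
        for row in rows.iter() {
            let _ = writeln!(out, "{row}");
        }
    }
}

/// Register the hook; returns true when the C runtime accepted the callback.
pub fn register_report_at_exit() -> bool {
    // SAFETY: render_report is a plain extern "C" fn() valid for the whole process lifetime.
    unsafe { atexit(render_report) == 0 }
}
```

Registering once at bench startup is enough; the rows recorded afterwards are printed when the process exits.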

wasm-opt version pinning
- scripts/wasm-opt.pinned holds the pinned version_NNN token (initial
  value: version_116, matching the wasm-opt that produced the v0.9.0
  baseline). Comments explain the bump workflow.
- scripts/check_wasm_opt_version.sh is the standalone shell wrapper
  that CI and developers can call to verify the pin pre-flight; it
  parses both `(version_NNN)` and `version N` output forms and prints
  upgrade guidance on mismatch.
- The criterion bench performs the same check in-process at startup
  and surfaces the result in the report header (`pin: ... (match)`,
  `pin: **MISMATCH** ...`, `pin: ... (wasm-opt not installed)`).
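
Parsing the two output forms mentioned above might be done as in this sketch, which normalises either form to a `version_NNN` token. The function name `parse_wasm_opt_version` is hypothetical, and the preference for the `version_NNN` form when both are present is an assumption.

```rust
/// Normalise `wasm-opt --version` output to a `version_NNN` token.
/// Handles both the `(version_NNN)` and the `version N` forms.
pub fn parse_wasm_opt_version(output: &str) -> Option<String> {
    // Try the explicit `version_NNN` form first, then fall back to `version N`.
    for marker in ["version_", "version "] {
        for (idx, _) in output.match_indices(marker) {
            // Collect the run of ASCII digits that follows the marker.
            let digits: String = output[idx + marker.len()..]
                .chars()
                .take_while(|c| c.is_ascii_digit())
                .collect();
            if !digits.is_empty() {
                return Some(format!("version_{digits}"));
            }
        }
    }
    None
}
```

Returning `None` for unrecognised output lets the caller flag the pin as unverified instead of silently reporting a match.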

Pin policy
- Auto-bumping is intentionally NOT performed. We want every wasm-opt
  version change in our baselines to be a deliberate, reviewable
  commit so the docs/measurements/ history stays comparable.
- A mismatch is non-fatal: the bench / harness still runs, but the
  generated report flags the mismatch so reviewers notice.
- If wasm-opt is missing entirely, the bench proceeds with wasm-opt
  columns marked n/a -- matches scripts/measure_corpus.sh behaviour.

Reuse of existing infra
- WORKLOADS catalogue is kept in lock-step with the shell harness.
- Output naming under /tmp/loom-measure-corpus matches the shell
  script so forensic outputs are discoverable from the same place.
…lobbering existing report

The previous version wrote to docs/measurements/v<workspace>-corpus-baseline.md,
which would overwrite the existing shell-harness baseline file. Switch to a
"-criterion.md" suffix so the criterion bench and the shell script can coexist
and the docs/measurements history stays comparable.
@avrabe avrabe merged commit edfafa9 into main May 15, 2026
8 of 20 checks passed
@avrabe avrabe deleted the release/v1.0.0-pr-bench-criterion branch May 15, 2026 13:39