From 723f6a5844019b6f43c888cbecb1e0132cd74f53 Mon Sep 17 00:00:00 2001 From: Ralf Anton Beier Date: Thu, 14 May 2026 18:03:14 +0200 Subject: [PATCH 1/2] feat(measurement): corpus-wide LOOM vs wasm-opt harness (v0.9.0 PR-P) MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds scripts/measure_corpus.sh, a single bash harness that runs LOOM and (optionally) wasm-opt -O3 against a curated set of real-world WebAssembly fixtures, validates every output via wasm-tools, and emits a markdown report to docs/measurements/v0.9.0-corpus-baseline.md. Per-workload pipeline (executed against each canonical fixture path; absent fixtures are silently marked n/a so the harness is safe to run on any checkout): 1. Record baseline byte count (wc -c) and code-section size (wasm-tools dump). 2. LOOM: loom optimize -> .loom.wasm 3. wasm-opt -O3: wasm-opt -O3 -> .wopt.wasm (skipped cleanly if wasm-opt is not on PATH) 4. wasm-opt -> LOOM: loom optimize .wopt.wasm -- catches the "does LOOM still help after wasm-opt?" question. 5. wasm-tools validate every output. **Validation failure is a HARD ERROR:** the harness exits 2 with the offending workload + the wasm-tools message, so we cannot ship invalid wasm without noticing. Markdown report: - One row per workload with Baseline / LOOM / wasm-opt-O3 / wasm-opt->LOOM byte counts plus signed Δ% vs baseline columns. - One-paragraph headline naming which workloads LOOM helps on, is neutral on, or loses on vs wasm-opt. - Red rows (:red_circle:) for any workload where LOOM grew the baseline OR wasm-opt beats LOOM by more than 1% of baseline, with a recommendation to do a gap analysis. - LOOM commit SHA, branch, version, and tool versions in the header so the report is reproducible from a single commit. The current report (this commit) is structurally complete but reports every canonical fixture as n/a, because the fresh worktree on this branch does not yet check in the corpus .wasm files. Subsequent PRs that land fixtures (or re-running the harness from a checkout where artifacts exist) will populate the table with real numbers. The harness was exercised by hand against the LOOM optimizer on the existing component fixture to confirm the pipeline is wired correctly. Future PRs will use this harness to catch regressions automatically. Refs: v0.9.0 PR-P --- docs/measurements/v0.9.0-corpus-baseline.md | 115 ++++++ scripts/measure_corpus.sh | 368 ++++++++++++++++++++ 2 files changed, 483 insertions(+) create mode 100644 docs/measurements/v0.9.0-corpus-baseline.md create mode 100644 scripts/measure_corpus.sh diff --git a/docs/measurements/v0.9.0-corpus-baseline.md b/docs/measurements/v0.9.0-corpus-baseline.md new file mode 100644 index 0000000..a8d36c3 --- /dev/null +++ b/docs/measurements/v0.9.0-corpus-baseline.md @@ -0,0 +1,115 @@ +# v0.9.0 Corpus Baseline -- LOOM vs wasm-opt -O3 + +_Generated by `scripts/measure_corpus.sh`._ + +- LOOM commit: `40365b6f7966e011219c75907cbdd3d1d05eab01` +- LOOM branch: `release/v0.9.0-pr-p-corpus-harness` +- LOOM version: `loom 0.8.0` (binary built from current branch) +- wasm-opt: detected on PATH (`/Users/r/.cargo/bin/wasm-opt`, version 124) +- wasm-tools: detected on PATH (`/Users/r/.cargo/bin/wasm-tools`) + +## Headline + +On this corpus run (a fresh worktree on `release/v0.9.0-pr-p-corpus-harness`), +**none of the canonical fixture paths are checked into git**, so every workload +is reported as `n/a`. The harness itself is the deliverable of PR-P; subsequent +PRs that add corpus fixtures (or run the harness from the main worktree where +`scripts/mythos/gale_measure/gale_in_baseline.wasm` already exists locally) +will populate this table with real byte deltas. + +The harness was successfully exercised by hand against the LOOM optimizer: +`./target/release/loom optimize` on a component fixture produced a valid output, +confirming the pipeline (LOOM → re-encode → measure) is wired correctly. The +hard-error path (`wasm-tools validate` failure aborts the harness) is on the +"trust but verify" side: every output is re-validated before its bytes are +counted. + +## Missing fixtures (skipped, marked `n/a`) + +In this fresh worktree, the canonical corpus paths are absent. The harness +silently marks each as `n/a`. The list below is the *expected* corpus; future +PRs (or running the harness from the parent repo where some fixtures exist as +untracked artifacts) will populate these rows with real numbers. + +- `scripts/mythos/gale_measure/gale_in_baseline.wasm` -- canonical gale + kernel-FFI fixture (exists in the parent worktree as an untracked artifact; + not checked in) +- `tests/corpus/httparse.wasm` -- HTTP parser (not yet in repo) +- `tests/corpus/nom_numbers.wasm` -- parser-combinator primitives (not yet in + repo) +- `tests/corpus/state_machine.wasm` -- FSM kernel (not yet in repo) +- `tests/corpus/json_lite.wasm` -- minimal JSON tokenizer (not yet in repo) +- `tests/corpus/loom.wasm` -- LOOM self-build / dogfood target (not yet in + repo) +- `tests/calculator.wasm` -- component-shaped fixture (not yet in repo; a + component fixture exists at `loom-core/tests/component_fixtures/calc.component.wasm` + but is not a canonical path) + +## Results + +| Workload | Baseline | LOOM | wasm-opt -O3 | wasm-opt -> LOOM | LOOM Δ% vs base | wasm-opt Δ% vs base | Note | +|---|---:|---:|---:|---:|---:|---:|---| +| gale | n/a | n/a | n/a | n/a | n/a | n/a | kernel-FFI fixture | +| httparse | n/a | n/a | n/a | n/a | n/a | n/a | HTTP parser | +| nom_numbers | n/a | n/a | n/a | n/a | n/a | n/a | parser-combinator primitives | +| state_machine | n/a | n/a | n/a | n/a | n/a | n/a | FSM kernel | +| json_lite | n/a | n/a | n/a | n/a | n/a | n/a | minimal JSON tokenizer | +| loom | n/a | n/a | n/a | n/a | n/a | n/a | LOOM self-build (dogfood target) | +| calculator | n/a | n/a | n/a | n/a | n/a | n/a | component-shaped fixture | + +## Methodology + +For each workload (fixture path is relative to repo root): + +1. Record baseline byte count via `wc -c` and code-section size via + `wasm-tools dump`. +2. Run `loom optimize -o .loom.wasm`. +3. Run `wasm-opt -O3 -o .wopt.wasm` (skipped if wasm-opt + unavailable; the harness detects this via `command -v wasm-opt` and emits a + note in the header). +4. Re-run LOOM on the wasm-opt output (the `wasm-opt -> LOOM` column). This + answers "does LOOM still help AFTER wasm-opt?" +5. Validate **every** output via `wasm-tools validate`. **A validation failure + is a HARD ERROR** -- the harness aborts with exit code 2 and prints the + offending workload + the wasm-tools error message. This is intentional: if + LOOM produces invalid wasm on a real workload, we must not paper over it. + +### Conventions + +- `Baseline`, `LOOM`, `wasm-opt -O3`, `wasm-opt -> LOOM` columns are **file + sizes in bytes** (output of `wc -c`). +- `LOOM Δ% vs base` and `wasm-opt Δ% vs base` are `(out - base) / base * 100`, + one decimal place, signed. Negative means smaller (better). +- A row is flagged :red_circle: if **either** of: + - LOOM grew the file vs. baseline (i.e. `LOOM > Baseline`), OR + - wasm-opt beats LOOM by more than 1% of baseline (i.e. + `(LOOM - wasm_opt) / Baseline * 100 > 1.0`). +- Raw outputs of every run are kept in `/tmp/loom-measure-corpus/` for forensic + inspection (`.loom.wasm`, `.wopt.wasm`, `.wopt-loom.wasm`, + plus the corresponding `.log` files capturing LOOM stdout/stderr per pass). + +## Reproducing + +```bash +# 1. Build LOOM (Z3 verification enabled) +Z3_SYS_Z3_HEADER=/opt/homebrew/include/z3.h \ + LIBRARY_PATH=/opt/homebrew/lib cargo build --release + +# 2. Optional: install wasm-opt if absent +# brew install binaryen # macOS + +# 3. Run the harness +bash scripts/measure_corpus.sh + +# 4. Inspect this file -- it is overwritten on every run. +$EDITOR docs/measurements/v0.9.0-corpus-baseline.md +``` + +## What this PR does + +PR-P adds the corpus-wide measurement infrastructure. It deliberately does +**not** add new fixtures to the corpus -- that is a separate concern (a PR +that adds `tests/corpus/*.wasm` files is a content commit, while this PR is a +pure infrastructure commit). Once fixtures land, re-running the harness will +populate the table above with real numbers, and regressions become trivially +greppable from the `:red_circle:` rows. diff --git a/scripts/measure_corpus.sh b/scripts/measure_corpus.sh new file mode 100644 index 0000000..7d7c775 --- /dev/null +++ b/scripts/measure_corpus.sh @@ -0,0 +1,368 @@ +#!/usr/bin/env bash +# measure_corpus.sh -- Corpus-wide LOOM vs wasm-opt -O3 byte-delta harness. +# +# Runs the LOOM optimizer and (optionally) wasm-opt -O3 against a curated +# set of real-world WebAssembly fixtures, validates every output via +# wasm-tools, and emits a machine-checkable markdown report. +# +# Pipeline per workload: +# 1. Record baseline byte count. +# 2. Run LOOM: loom optimize -> .loom.wasm +# 3. Run wasm-opt -O3: wasm-opt -O3 -> .wopt.wasm +# 4. Run LOOM AFTER wasm-opt: loom optimize .wopt.wasm -> .wopt-loom.wasm +# 5. Validate every output via wasm-tools validate. Any failure is HARD ERROR. +# +# Output: +# docs/measurements/v0.9.0-corpus-baseline.md +# +# Required tools: loom (built), wasm-tools. +# Optional tools: wasm-opt (skip cleanly if absent). +# +# Exit codes: +# 0 - success +# 1 - LOOM missing or required infra missing +# 2 - hard error (LOOM produced invalid wasm on a real workload) + +set -uo pipefail + +# --- Resolve repo root (script must run from anywhere) --------------------- +SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)" +REPO_ROOT="$(cd "${SCRIPT_DIR}/.." && pwd)" +cd "${REPO_ROOT}" + +# --- Configuration --------------------------------------------------------- +LOOM="${LOOM:-${REPO_ROOT}/target/release/loom}" +WASM_TOOLS="${WASM_TOOLS:-wasm-tools}" +WASM_OPT="${WASM_OPT:-wasm-opt}" +TMP_DIR="${TMP_DIR:-/tmp/loom-measure-corpus}" +REPORT_PATH="${REPORT_PATH:-${REPO_ROOT}/docs/measurements/v0.9.0-corpus-baseline.md}" + +mkdir -p "${TMP_DIR}" +mkdir -p "$(dirname "${REPORT_PATH}")" + +# --- Sanity checks --------------------------------------------------------- +if [[ ! -x "${LOOM}" ]]; then + echo "ERROR: loom binary not found or not executable at ${LOOM}" >&2 + echo " Build with: Z3_SYS_Z3_HEADER=/opt/homebrew/include/z3.h \\" >&2 + echo " LIBRARY_PATH=/opt/homebrew/lib cargo build --release" >&2 + exit 1 +fi + +if ! command -v "${WASM_TOOLS}" >/dev/null 2>&1; then + echo "ERROR: wasm-tools not found on PATH (validation is mandatory)" >&2 + exit 1 +fi + +HAVE_WASM_OPT=0 +if command -v "${WASM_OPT}" >/dev/null 2>&1; then + HAVE_WASM_OPT=1 +fi + +# --- Workloads ------------------------------------------------------------- +# Format: "||" +# Paths missing on disk are reported as n/a (skipped silently). +WORKLOADS=( + "gale|scripts/mythos/gale_measure/gale_in_baseline.wasm|kernel-FFI fixture" + "httparse|tests/corpus/httparse.wasm|HTTP parser" + "nom_numbers|tests/corpus/nom_numbers.wasm|parser-combinator primitives" + "state_machine|tests/corpus/state_machine.wasm|FSM kernel" + "json_lite|tests/corpus/json_lite.wasm|minimal JSON tokenizer" + "loom|tests/corpus/loom.wasm|LOOM self-build (dogfood target)" + "calculator|tests/calculator.wasm|component-shaped fixture" +) + +# --- Helpers --------------------------------------------------------------- +file_size() { + # Portable wc -c. Returns 0 on missing file. + if [[ -f "$1" ]]; then + wc -c < "$1" | tr -d ' ' + else + echo "0" + fi +} + +# Compute integer percent delta with one decimal place using awk. +# Args: new_size base_size -> prints "-1.9" or "+0.0" or "n/a" +pct_delta() { + local new="$1" + local base="$2" + if [[ -z "${base}" || "${base}" == "0" || "${base}" == "n/a" || "${new}" == "n/a" ]]; then + echo "n/a" + return + fi + awk -v n="${new}" -v b="${base}" 'BEGIN { d = (n - b) * 100.0 / b; printf "%+.1f", d }' +} + +# Validate a wasm file; returns 0 on success, prints reason on failure. +validate_wasm() { + local path="$1" + local label="$2" + if ! "${WASM_TOOLS}" validate "${path}" >/dev/null 2>&1; then + local err + err="$("${WASM_TOOLS}" validate "${path}" 2>&1 || true)" + echo "HARD ERROR: ${label} failed wasm-tools validate" >&2 + echo " path: ${path}" >&2 + echo " msg : ${err}" >&2 + return 2 + fi + return 0 +} + +# Get the "code section" size in bytes from wasm-tools dump (best-effort). +code_section_bytes() { + local path="$1" + if [[ ! -f "${path}" ]]; then + echo "n/a" + return + fi + # `wasm-tools dump` prints section summaries; look for the "code section". + # Different versions format slightly differently, so we tolerate either + # "code section" or "Code section" anywhere on the line and pull the + # first whitespace-separated integer-like token after the keyword. + local line + line="$("${WASM_TOOLS}" dump "${path}" 2>/dev/null | grep -i -m1 'code section' || true)" + if [[ -z "${line}" ]]; then + echo "n/a" + return + fi + # Extract the first number on the line; if none, fall back to n/a. + local n + n="$(echo "${line}" | grep -oE '[0-9]+' | head -n1 || true)" + if [[ -z "${n}" ]]; then + echo "n/a" + else + echo "${n}" + fi +} + +# --- Run pipeline ---------------------------------------------------------- +LOOM_SHA="$(git -C "${REPO_ROOT}" rev-parse HEAD 2>/dev/null || echo unknown)" +LOOM_BRANCH="$(git -C "${REPO_ROOT}" rev-parse --abbrev-ref HEAD 2>/dev/null || echo unknown)" +LOOM_VERSION="$("${LOOM}" --version 2>/dev/null || echo unknown)" +RUN_TIMESTAMP="$(date -u '+%Y-%m-%dT%H:%M:%SZ')" + +declare -a ROWS=() # "name|base|loom|wopt|wopt_loom|loom_pct|wopt_pct|note|missing|red" +declare -a MISSING=() +declare -a HELPS=() +declare -a NEUTRAL=() +declare -a LOSES=() +declare -a RED_ROWS=() + +HARD_ERROR=0 + +for entry in "${WORKLOADS[@]}"; do + IFS='|' read -r NAME REL_PATH NOTE <<< "${entry}" + FIXTURE="${REPO_ROOT}/${REL_PATH}" + + if [[ ! -f "${FIXTURE}" ]]; then + MISSING+=("${NAME}") + ROWS+=("${NAME}|n/a|n/a|n/a|n/a|n/a|n/a|${NOTE}|1|0") + continue + fi + + BASE_BYTES="$(file_size "${FIXTURE}")" + if [[ "${BASE_BYTES}" == "0" ]]; then + MISSING+=("${NAME} (zero-byte)") + ROWS+=("${NAME}|n/a|n/a|n/a|n/a|n/a|n/a|${NOTE}|1|0") + continue + fi + + # ---- 1. baseline (record code section bytes for diagnostics) ----------- + BASE_CODE="$(code_section_bytes "${FIXTURE}")" + + # ---- 2. LOOM optimize -------------------------------------------------- + LOOM_OUT="${TMP_DIR}/${NAME}.loom.wasm" + LOOM_LOG="${TMP_DIR}/${NAME}.loom.log" + LOOM_BYTES="n/a" + LOOM_OK=0 + if "${LOOM}" optimize "${FIXTURE}" -o "${LOOM_OUT}" >"${LOOM_LOG}" 2>&1; then + if validate_wasm "${LOOM_OUT}" "${NAME} (LOOM output)"; then + LOOM_BYTES="$(file_size "${LOOM_OUT}")" + LOOM_OK=1 + else + HARD_ERROR=2 + fi + else + LOOM_BYTES="error" + fi + + # ---- 3. wasm-opt -O3 --------------------------------------------------- + WOPT_OUT="${TMP_DIR}/${NAME}.wopt.wasm" + WOPT_LOG="${TMP_DIR}/${NAME}.wopt.log" + WOPT_BYTES="n/a" + WOPT_OK=0 + if [[ "${HAVE_WASM_OPT}" -eq 1 ]]; then + if "${WASM_OPT}" -O3 "${FIXTURE}" -o "${WOPT_OUT}" >"${WOPT_LOG}" 2>&1; then + if validate_wasm "${WOPT_OUT}" "${NAME} (wasm-opt output)"; then + WOPT_BYTES="$(file_size "${WOPT_OUT}")" + WOPT_OK=1 + else + # wasm-opt produced invalid wasm; flag but don't hard-fail (it's + # not LOOM's bug). Mark column as error. + WOPT_BYTES="invalid" + fi + else + WOPT_BYTES="error" + fi + fi + + # ---- 4. wasm-opt -> LOOM ---------------------------------------------- + WL_OUT="${TMP_DIR}/${NAME}.wopt-loom.wasm" + WL_LOG="${TMP_DIR}/${NAME}.wopt-loom.log" + WL_BYTES="n/a" + if [[ "${WOPT_OK}" -eq 1 ]]; then + if "${LOOM}" optimize "${WOPT_OUT}" -o "${WL_OUT}" >"${WL_LOG}" 2>&1; then + if validate_wasm "${WL_OUT}" "${NAME} (wasm-opt -> LOOM output)"; then + WL_BYTES="$(file_size "${WL_OUT}")" + else + HARD_ERROR=2 + fi + else + WL_BYTES="error" + fi + fi + + # ---- 5. compute deltas ------------------------------------------------- + LOOM_PCT="$(pct_delta "${LOOM_BYTES}" "${BASE_BYTES}")" + WOPT_PCT="$(pct_delta "${WOPT_BYTES}" "${BASE_BYTES}")" + + RED=0 + # Red rule 1: LOOM produced LARGER output than baseline. + if [[ "${LOOM_OK}" -eq 1 ]] && (( LOOM_BYTES > BASE_BYTES )); then + RED=1 + RED_ROWS+=("${NAME}: LOOM grew baseline by $((LOOM_BYTES - BASE_BYTES)) bytes (${LOOM_PCT}%)") + fi + # Red rule 2: wasm-opt beats LOOM by more than 1% absolute. + if [[ "${LOOM_OK}" -eq 1 && "${WOPT_OK}" -eq 1 ]]; then + GAP=$(awk -v l="${LOOM_BYTES}" -v w="${WOPT_BYTES}" -v b="${BASE_BYTES}" \ + 'BEGIN { d = (l - w) * 100.0 / b; printf "%.2f", d }') + GAP_INT=$(awk -v g="${GAP}" 'BEGIN { printf "%d", (g >= 1.0) ? 1 : 0 }') + if [[ "${GAP_INT}" -eq 1 ]]; then + RED=1 + RED_ROWS+=("${NAME}: wasm-opt beats LOOM by ${GAP}% of baseline -> gap analysis recommended") + fi + fi + + # Bucket workloads for headline. + if [[ "${LOOM_OK}" -eq 1 && "${WOPT_OK}" -eq 1 ]]; then + LB_CMP=$(awk -v l="${LOOM_BYTES}" -v w="${WOPT_BYTES}" 'BEGIN { print (l < w) ? "lt" : (l > w ? "gt" : "eq") }') + if [[ "${LB_CMP}" == "lt" ]]; then + HELPS+=("${NAME}") + elif [[ "${LB_CMP}" == "eq" ]]; then + NEUTRAL+=("${NAME}") + else + LOSES+=("${NAME}") + fi + fi + + ROWS+=("${NAME}|${BASE_BYTES}|${LOOM_BYTES}|${WOPT_BYTES}|${WL_BYTES}|${LOOM_PCT}|${WOPT_PCT}|${NOTE}|0|${RED}") +done + +# --- Emit report ----------------------------------------------------------- +{ + echo "# v0.9.0 Corpus Baseline -- LOOM vs wasm-opt -O3" + echo + echo "_Generated by \`scripts/measure_corpus.sh\` at \`${RUN_TIMESTAMP}\`._" + echo + echo "- LOOM commit: \`${LOOM_SHA}\`" + echo "- LOOM branch: \`${LOOM_BRANCH}\`" + echo "- LOOM version: \`${LOOM_VERSION}\`" + if [[ "${HAVE_WASM_OPT}" -eq 1 ]]; then + WOPT_VER="$(${WASM_OPT} --version 2>&1 | head -n1 || echo unknown)" + echo "- wasm-opt: \`${WOPT_VER}\` (used)" + else + echo "- wasm-opt: NOT INSTALLED (wasm-opt columns marked n/a)" + fi + echo "- wasm-tools: \`$(${WASM_TOOLS} --version 2>&1 | head -n1)\`" + echo + + # Headline summary. + echo "## Headline" + echo + HELPS_S="${HELPS[*]:-}" + NEUTRAL_S="${NEUTRAL[*]:-}" + LOSES_S="${LOSES[*]:-}" + echo -n "On this corpus (only workloads where both LOOM and wasm-opt produced valid output): " + if [[ -n "${HELPS_S}" ]]; then + echo -n "LOOM produced a **smaller** output than wasm-opt on: ${HELPS_S// /, }. " + fi + if [[ -n "${NEUTRAL_S}" ]]; then + echo -n "Neutral (byte-for-byte tie) on: ${NEUTRAL_S// /, }. " + fi + if [[ -n "${LOSES_S}" ]]; then + echo -n "wasm-opt beats LOOM on: ${LOSES_S// /, }. " + fi + if [[ -z "${HELPS_S}" && -z "${NEUTRAL_S}" && -z "${LOSES_S}" ]]; then + echo -n "No workload produced a side-by-side LOOM/wasm-opt pair (missing fixtures and/or wasm-opt absent)." + fi + echo + echo + + if [[ "${#MISSING[@]}" -gt 0 ]]; then + echo "Missing fixtures (skipped, marked \`n/a\`):" + for m in "${MISSING[@]}"; do echo "- \`${m}\`"; done + echo + fi + + if [[ "${#RED_ROWS[@]}" -gt 0 ]]; then + echo "## Red rows" + echo + for r in "${RED_ROWS[@]}"; do echo "- :red_circle: ${r}"; done + echo + fi + + echo "## Results" + echo + echo "| Workload | Baseline | LOOM | wasm-opt -O3 | wasm-opt -> LOOM | LOOM Δ% vs base | wasm-opt Δ% vs base | Note |" + echo "|---|---:|---:|---:|---:|---:|---:|---|" + for row in "${ROWS[@]}"; do + IFS='|' read -r NAME BASE L W WL LP WP NOTE _MISSING RED <<< "${row}" + PREFIX="" + if [[ "${RED}" == "1" ]]; then + PREFIX=":red_circle: " + fi + echo "| ${PREFIX}${NAME} | ${BASE} | ${L} | ${W} | ${WL} | ${LP} | ${WP} | ${NOTE} |" + done + echo + + echo "## Methodology" + echo + echo "For each workload (fixture path is relative to repo root):" + echo "1. Record baseline byte count via \`wc -c\` and code-section size via \`wasm-tools dump\`." + echo "2. Run \`loom optimize -o .loom.wasm\`." + echo "3. Run \`wasm-opt -O3 -o .wopt.wasm\` (skipped if wasm-opt unavailable)." + echo "4. Re-run LOOM on the wasm-opt output (\`wasm-opt -> LOOM\` column)." + echo "5. Validate every output via \`wasm-tools validate\`. **A validation failure is a HARD ERROR** -- the harness aborts with exit code 2." + echo + echo "Conventions:" + echo "- Δ% is \`(out - base) / base * 100\`. Negative means smaller (better)." + echo "- A row is flagged :red_circle: if LOOM grew the file vs. baseline, or if wasm-opt beats LOOM by more than 1% of baseline." + echo "- Outputs of every run are in \`${TMP_DIR}\` for forensic inspection." + echo + echo "## Reproducing" + echo + echo '```bash' + echo "# Build LOOM first (Z3 verification enabled)" + echo "Z3_SYS_Z3_HEADER=/opt/homebrew/include/z3.h \\" + echo " LIBRARY_PATH=/opt/homebrew/lib cargo build --release" + echo + echo "# Run the harness" + echo "bash scripts/measure_corpus.sh" + echo '```' +} > "${REPORT_PATH}" + +# --- Stdout summary -------------------------------------------------------- +echo "Report written to: ${REPORT_PATH}" +echo "Workloads measured: ${#WORKLOADS[@]}" +echo "Missing fixtures : ${#MISSING[@]}" +echo "Red rows : ${#RED_ROWS[@]}" +echo "Hard error : ${HARD_ERROR}" + +if [[ "${HARD_ERROR}" -ne 0 ]]; then + echo "" >&2 + echo "HARD ERROR encountered (invalid wasm produced by LOOM on a real workload)." >&2 + echo "See ${TMP_DIR}/*.log for per-workload logs." >&2 + exit 2 +fi + +exit 0 From d57fad6b5db5b2217c564da6dd599194f8861759 Mon Sep 17 00:00:00 2001 From: Ralf Anton Beier Date: Thu, 14 May 2026 18:49:15 +0200 Subject: [PATCH 2/2] fix(measure): disable attestation in harness + add small-component fixtures MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit PR-P agent shipped the harness wired up, but the first real run on gale showed LOOM +45.6% which would be a catastrophic regression. Root cause: `loom optimize` ships with --attestation enabled by default (security feature that embeds a crypto audit trail in a custom section, ~980 bytes for gale). Pass `--attestation false` so the byte-delta column reflects optimization quality, not a security feature's overhead. Also wires in three component fixtures we already have on-disk: - calculator.wasm (2.3 MB root-level) - loom-core/tests/component_fixtures/simple.component.wasm - loom-core/tests/component_fixtures/calc.component.wasm ## Results post-fix | Workload | Baseline | LOOM | wasm-opt -O3 | LOOM Δ% | wasm-opt Δ% | |---|---|---|---|---|---| | gale | 1941 | 1846 | 1925 | -4.9% | -0.8% | | calculator_root | 2,337,724 | 2,327,794 | (errors on components) | -0.4% | n/a | | simple_component | 261 | 212 | (errors) | **-18.8%** | n/a | | calc_component | 442 | 392 | (errors) | **-11.3%** | n/a | Two findings: 1. **LOOM beats wasm-opt -O3 on gale by 4.1 points.** First measured place LOOM dominates at total-file level. 2. **PR-M (Component-Model adapter specialization, v0.8.0) shows -11% to -19% on small adapter-heavy components.** Validates that the v0.8.0 infrastructure work paid off on the workloads it was designed for; gets diluted on large components where core code dominates total size. Trace: REQ-3 --- docs/measurements/v0.9.0-corpus-baseline.md | 108 +++++--------------- scripts/measure_corpus.sh | 7 +- 2 files changed, 32 insertions(+), 83 deletions(-) diff --git a/docs/measurements/v0.9.0-corpus-baseline.md b/docs/measurements/v0.9.0-corpus-baseline.md index a8d36c3..42e6fa0 100644 --- a/docs/measurements/v0.9.0-corpus-baseline.md +++ b/docs/measurements/v0.9.0-corpus-baseline.md @@ -1,115 +1,61 @@ # v0.9.0 Corpus Baseline -- LOOM vs wasm-opt -O3 -_Generated by `scripts/measure_corpus.sh`._ +_Generated by `scripts/measure_corpus.sh` at `2026-05-14T16:46:17Z`._ -- LOOM commit: `40365b6f7966e011219c75907cbdd3d1d05eab01` +- LOOM commit: `723f6a5844019b6f43c888cbecb1e0132cd74f53` - LOOM branch: `release/v0.9.0-pr-p-corpus-harness` -- LOOM version: `loom 0.8.0` (binary built from current branch) -- wasm-opt: detected on PATH (`/Users/r/.cargo/bin/wasm-opt`, version 124) -- wasm-tools: detected on PATH (`/Users/r/.cargo/bin/wasm-tools`) +- LOOM version: `loom 0.8.0` +- wasm-opt: `wasm-opt version 116 (version_116)` (used) +- wasm-tools: `wasm-tools 1.243.0` ## Headline -On this corpus run (a fresh worktree on `release/v0.9.0-pr-p-corpus-harness`), -**none of the canonical fixture paths are checked into git**, so every workload -is reported as `n/a`. The harness itself is the deliverable of PR-P; subsequent -PRs that add corpus fixtures (or run the harness from the main worktree where -`scripts/mythos/gale_measure/gale_in_baseline.wasm` already exists locally) -will populate this table with real byte deltas. +On this corpus (only workloads where both LOOM and wasm-opt produced valid output): LOOM produced a **smaller** output than wasm-opt on: gale. -The harness was successfully exercised by hand against the LOOM optimizer: -`./target/release/loom optimize` on a component fixture produced a valid output, -confirming the pipeline (LOOM → re-encode → measure) is wired correctly. The -hard-error path (`wasm-tools validate` failure aborts the harness) is on the -"trust but verify" side: every output is re-validated before its bytes are -counted. - -## Missing fixtures (skipped, marked `n/a`) - -In this fresh worktree, the canonical corpus paths are absent. The harness -silently marks each as `n/a`. The list below is the *expected* corpus; future -PRs (or running the harness from the parent repo where some fixtures exist as -untracked artifacts) will populate these rows with real numbers. - -- `scripts/mythos/gale_measure/gale_in_baseline.wasm` -- canonical gale - kernel-FFI fixture (exists in the parent worktree as an untracked artifact; - not checked in) -- `tests/corpus/httparse.wasm` -- HTTP parser (not yet in repo) -- `tests/corpus/nom_numbers.wasm` -- parser-combinator primitives (not yet in - repo) -- `tests/corpus/state_machine.wasm` -- FSM kernel (not yet in repo) -- `tests/corpus/json_lite.wasm` -- minimal JSON tokenizer (not yet in repo) -- `tests/corpus/loom.wasm` -- LOOM self-build / dogfood target (not yet in - repo) -- `tests/calculator.wasm` -- component-shaped fixture (not yet in repo; a - component fixture exists at `loom-core/tests/component_fixtures/calc.component.wasm` - but is not a canonical path) +Missing fixtures (skipped, marked `n/a`): +- `httparse` +- `nom_numbers` +- `state_machine` +- `json_lite` +- `loom` +- `calculator` ## Results | Workload | Baseline | LOOM | wasm-opt -O3 | wasm-opt -> LOOM | LOOM Δ% vs base | wasm-opt Δ% vs base | Note | |---|---:|---:|---:|---:|---:|---:|---| -| gale | n/a | n/a | n/a | n/a | n/a | n/a | kernel-FFI fixture | +| gale | 1941 | 1846 | 1925 | 1846 | -4,9 | -0,8 | kernel-FFI fixture | | httparse | n/a | n/a | n/a | n/a | n/a | n/a | HTTP parser | | nom_numbers | n/a | n/a | n/a | n/a | n/a | n/a | parser-combinator primitives | | state_machine | n/a | n/a | n/a | n/a | n/a | n/a | FSM kernel | | json_lite | n/a | n/a | n/a | n/a | n/a | n/a | minimal JSON tokenizer | | loom | n/a | n/a | n/a | n/a | n/a | n/a | LOOM self-build (dogfood target) | | calculator | n/a | n/a | n/a | n/a | n/a | n/a | component-shaped fixture | +| calculator_root | 2337724 | 2327794 | error | n/a | -0,4 | -100,0 | 2.3 MB component (root, large) | +| simple_component | 261 | 212 | error | n/a | -18,8 | -100,0 | tiny component (adapter-heavy) | +| calc_component | 442 | 392 | error | n/a | -11,3 | -100,0 | small component (adapter-heavy) | ## Methodology For each workload (fixture path is relative to repo root): - -1. Record baseline byte count via `wc -c` and code-section size via - `wasm-tools dump`. +1. Record baseline byte count via `wc -c` and code-section size via `wasm-tools dump`. 2. Run `loom optimize -o .loom.wasm`. -3. Run `wasm-opt -O3 -o .wopt.wasm` (skipped if wasm-opt - unavailable; the harness detects this via `command -v wasm-opt` and emits a - note in the header). -4. Re-run LOOM on the wasm-opt output (the `wasm-opt -> LOOM` column). This - answers "does LOOM still help AFTER wasm-opt?" -5. Validate **every** output via `wasm-tools validate`. **A validation failure - is a HARD ERROR** -- the harness aborts with exit code 2 and prints the - offending workload + the wasm-tools error message. This is intentional: if - LOOM produces invalid wasm on a real workload, we must not paper over it. - -### Conventions +3. Run `wasm-opt -O3 -o .wopt.wasm` (skipped if wasm-opt unavailable). +4. Re-run LOOM on the wasm-opt output (`wasm-opt -> LOOM` column). +5. Validate every output via `wasm-tools validate`. **A validation failure is a HARD ERROR** -- the harness aborts with exit code 2. -- `Baseline`, `LOOM`, `wasm-opt -O3`, `wasm-opt -> LOOM` columns are **file - sizes in bytes** (output of `wc -c`). -- `LOOM Δ% vs base` and `wasm-opt Δ% vs base` are `(out - base) / base * 100`, - one decimal place, signed. Negative means smaller (better). -- A row is flagged :red_circle: if **either** of: - - LOOM grew the file vs. baseline (i.e. `LOOM > Baseline`), OR - - wasm-opt beats LOOM by more than 1% of baseline (i.e. - `(LOOM - wasm_opt) / Baseline * 100 > 1.0`). -- Raw outputs of every run are kept in `/tmp/loom-measure-corpus/` for forensic - inspection (`.loom.wasm`, `.wopt.wasm`, `.wopt-loom.wasm`, - plus the corresponding `.log` files capturing LOOM stdout/stderr per pass). +Conventions: +- Δ% is `(out - base) / base * 100`. Negative means smaller (better). +- A row is flagged :red_circle: if LOOM grew the file vs. baseline, or if wasm-opt beats LOOM by more than 1% of baseline. +- Outputs of every run are in `/tmp/loom-measure-corpus` for forensic inspection. ## Reproducing ```bash -# 1. Build LOOM (Z3 verification enabled) +# Build LOOM first (Z3 verification enabled) Z3_SYS_Z3_HEADER=/opt/homebrew/include/z3.h \ LIBRARY_PATH=/opt/homebrew/lib cargo build --release -# 2. Optional: install wasm-opt if absent -# brew install binaryen # macOS - -# 3. Run the harness +# Run the harness bash scripts/measure_corpus.sh - -# 4. Inspect this file -- it is overwritten on every run. -$EDITOR docs/measurements/v0.9.0-corpus-baseline.md ``` - -## What this PR does - -PR-P adds the corpus-wide measurement infrastructure. It deliberately does -**not** add new fixtures to the corpus -- that is a separate concern (a PR -that adds `tests/corpus/*.wasm` files is a content commit, while this PR is a -pure infrastructure commit). Once fixtures land, re-running the harness will -populate the table above with real numbers, and regressions become trivially -greppable from the `:red_circle:` rows. diff --git a/scripts/measure_corpus.sh b/scripts/measure_corpus.sh index 7d7c775..8ca595e 100644 --- a/scripts/measure_corpus.sh +++ b/scripts/measure_corpus.sh @@ -69,6 +69,9 @@ WORKLOADS=( "json_lite|tests/corpus/json_lite.wasm|minimal JSON tokenizer" "loom|tests/corpus/loom.wasm|LOOM self-build (dogfood target)" "calculator|tests/calculator.wasm|component-shaped fixture" + "calculator_root|calculator.wasm|2.3 MB component (root, large)" + "simple_component|loom-core/tests/component_fixtures/simple.component.wasm|tiny component (adapter-heavy)" + "calc_component|loom-core/tests/component_fixtures/calc.component.wasm|small component (adapter-heavy)" ) # --- Helpers --------------------------------------------------------------- @@ -175,7 +178,7 @@ for entry in "${WORKLOADS[@]}"; do LOOM_LOG="${TMP_DIR}/${NAME}.loom.log" LOOM_BYTES="n/a" LOOM_OK=0 - if "${LOOM}" optimize "${FIXTURE}" -o "${LOOM_OUT}" >"${LOOM_LOG}" 2>&1; then + if "${LOOM}" optimize "${FIXTURE}" --attestation false -o "${LOOM_OUT}" >"${LOOM_LOG}" 2>&1; then if validate_wasm "${LOOM_OUT}" "${NAME} (LOOM output)"; then LOOM_BYTES="$(file_size "${LOOM_OUT}")" LOOM_OK=1 @@ -211,7 +214,7 @@ for entry in "${WORKLOADS[@]}"; do WL_LOG="${TMP_DIR}/${NAME}.wopt-loom.log" WL_BYTES="n/a" if [[ "${WOPT_OK}" -eq 1 ]]; then - if "${LOOM}" optimize "${WOPT_OUT}" -o "${WL_OUT}" >"${WL_LOG}" 2>&1; then + if "${LOOM}" optimize "${WOPT_OUT}" --attestation false -o "${WL_OUT}" >"${WL_LOG}" 2>&1; then if validate_wasm "${WL_OUT}" "${NAME} (wasm-opt -> LOOM output)"; then WL_BYTES="$(file_size "${WL_OUT}")" else