fix(opt): decouple slot-stack from inst_id in wasm_to_ir (#121) by avrabe · Pull Request #122 · pulseengine/synth

avrabe · 2026-05-18T16:29:55Z

Summary

Closes #121, the root architectural fix that PR #117's five rounds of continue patches were skating around. The wasm_to_ir lowering now tracks producer vregs via an explicit `slot_stack: Vec<u32>` parallel to `inst_id`, instead of overloading `inst_id` as both the IR id and the wasm-stack-slot index.

This is silicon-priority: the bug Gale reported in the field (wasm modules with Drop/LocalSet/Store mid-stream) is fixed here. PR #117's temporary demotion of `wasm_ops_lower_or_error` to `gating: false` is unaffected (the demote commit was on #117's branch, never reached main), but #117 may now rebase and revert its workflow change once this lands.

What was broken

`OptimizerBridge::wasm_to_ir` used a single `inst_id` counter as both the unique IR-instruction id AND a vreg-slot index. Binary/unary handlers used `inst_id.saturating_sub(N)` to look up operands, assuming a 1:1 correspondence with wasm value stack positions. That assumption broke whenever a wasm op consumed a slot without producing a vreg (Drop, LocalSet, GlobalSet, Stores, control-flow ops...). The result was either:

A loud panic at `get_arm_reg` (the defensive guard at line 1670 — what the fuzz harness kept hitting).
A silent miscompilation reading whatever stale vreg was bound to the consumed slot (the Gale class — much worse on hardware).

The non-optimized path (`select_with_stack`) was unaffected because it uses a real value stack.
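
A minimal Rust sketch of the failure mode, using a toy three-op model rather than the real `wasm_to_ir` code. The `Op` enum and both helper functions are hypothetical names, and the sketch assumes that every op (including Drop) advanced `inst_id` before this PR.

```rust
/// Toy wasm ops (hypothetical subset, not the real synth enum).
#[allow(dead_code)]
#[derive(Clone, Copy)]
enum Op {
    Const(i32),
    Drop,
    Popcnt,
}

/// Pre-fix style lookup: every op bumps `inst_id`, and a unary consumer
/// assumes its operand was produced by the op exactly one id back.
/// Returns the id the last Popcnt would read as its source.
fn popcnt_src_by_inst_id(ops: &[Op]) -> Option<u32> {
    let mut inst_id: u32 = 0;
    let mut src = None;
    for op in ops {
        if let Op::Popcnt = op {
            src = Some(inst_id.saturating_sub(1));
        }
        inst_id += 1;
    }
    src
}

/// Reference behaviour: replay a real value stack of producer ids.
fn popcnt_src_by_value_stack(ops: &[Op]) -> Option<u32> {
    let mut stack: Vec<u32> = Vec::new();
    let mut src = None;
    for (id, op) in ops.iter().enumerate() {
        let id = id as u32;
        match op {
            Op::Const(_) => stack.push(id),
            Op::Drop => {
                stack.pop();
            }
            Op::Popcnt => {
                src = stack.pop();
                stack.push(id);
            }
        }
    }
    src
}
```

For `[Const(7), Const(11), Drop, Popcnt]`, the `inst_id` lookup resolves to id 2 (the Drop itself), while the value-stack replay resolves to id 0 (`Const(7)`). Without the Drop, both lookups agree, which is why the overload went unnoticed on simple straight-line inputs.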

What this PR does

Introduces `slot_stack: Vec<u32>` alongside `inst_id`. Producers push their dest slot; consumers pop to discover their src slots.
Migrates all ~30 op handlers and 136 `inst_id.saturating_sub` call sites to slot_stack semantics.
i64 register-pair model preserved: i64 values occupy 2 consecutive slot_stack entries (lo, hi).
Drop becomes an explicit `slot_stack.pop(); continue;` — no IR instruction emitted.
Nop/Unreachable/Return still have no slot effect (matches the PR #117 fix surface).
Catch-all `_ => Opcode::Nop` preserved as the issue #93 bug-finder for unknown ops.
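
The slot_stack discipline described in the list above can be sketched as follows. This is a simplified model under stated assumptions, not the code in `optimizer_bridge.rs`: `WasmOp`, `IrInst`, `pop_n`, `lower`, and the `next_vreg` allocator are all hypothetical, Drop handles only i32 values here, and underflow is reported as an `Err` string.

```rust
/// Toy wasm ops; a small hypothetical subset.
#[allow(dead_code)]
#[derive(Debug, Clone, Copy)]
enum WasmOp {
    I32Const(i32),
    I64Const(i64),
    I32Popcnt,
    I32Add,
    I64Add,
    Drop, // i32 drop only in this sketch
    Nop,
}

/// Toy IR instruction: `dests` and `srcs` hold vreg slots.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
struct IrInst {
    id: u32,
    name: &'static str,
    dests: Vec<u32>,
    srcs: Vec<u32>,
}

/// Pop `n` slots, failing loudly instead of clamping to slot 0.
fn pop_n(slot_stack: &mut Vec<u32>, n: usize) -> Result<Vec<u32>, String> {
    if slot_stack.len() < n {
        return Err(format!(
            "slot_stack underflow: need {n}, have {}",
            slot_stack.len()
        ));
    }
    // split_off keeps push order, so an i64 pair comes back as [lo, hi].
    Ok(slot_stack.split_off(slot_stack.len() - n))
}

fn lower(ops: &[WasmOp]) -> Result<Vec<IrInst>, String> {
    let mut out = Vec::new();
    let mut slot_stack: Vec<u32> = Vec::new();
    let mut inst_id: u32 = 0; // unique IR id only, never a slot index
    let mut next_vreg: u32 = 0; // fresh-vreg allocator (assumption)

    for op in ops {
        // (name, slots consumed, slots produced); i64 values use two slots.
        let (name, n_src, n_dest) = match op {
            WasmOp::Drop => {
                // Consumes a slot and emits no IR instruction.
                pop_n(&mut slot_stack, 1)?;
                continue;
            }
            WasmOp::Nop => ("nop", 0, 0),
            WasmOp::I32Const(_) => ("i32.const", 0, 1),
            WasmOp::I64Const(_) => ("i64.const", 0, 2),
            WasmOp::I32Popcnt => ("i32.popcnt", 1, 1),
            WasmOp::I32Add => ("i32.add", 2, 1),
            WasmOp::I64Add => ("i64.add", 4, 2),
        };
        let srcs = pop_n(&mut slot_stack, n_src)?;
        let dests: Vec<u32> = (0..n_dest)
            .map(|_| {
                let v = next_vreg;
                next_vreg += 1;
                v
            })
            .collect();
        slot_stack.extend(&dests);
        out.push(IrInst { id: inst_id, name, dests, srcs });
        inst_id += 1;
    }
    Ok(out)
}
```

Calling `lower` on `[I32Const(7), I32Const(11), Drop, I32Popcnt]` yields three instructions, and the popcnt's `srcs` equal the first const's `dests`. A lone `I32Popcnt` returns `Err` rather than reading an arbitrary slot.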

Drive-by fix

`i64_operand_count` was missing the i64 div/rem variants (I64DivS/U, I64RemS/U). The old `inst_id.saturating_sub(4)` math happened to fortuitously work for the existing `i64_div.wast` test due to saturating-sub at slot boundaries; the slot_stack refactor unmasked this as a real `pop()` on an empty stack. Added the four ops to `i64_operand_count` to resolve cleanly.
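
To illustrate why the old arithmetic could hide a missing operand, here is a small Rust comparison of the two lookup styles. The function names are hypothetical; the only library behaviour relied on is that `u32::saturating_sub` clamps at zero and `Vec::pop` returns `None` on an empty vector.

```rust
/// Old style: an offset lookup that clamps at slot 0 and never fails.
fn src_by_offset(inst_id: u32, back: u32) -> u32 {
    inst_id.saturating_sub(back)
}

/// New style: a pop that reports an empty stack to the caller.
fn src_by_pop(slot_stack: &mut Vec<u32>) -> Result<u32, &'static str> {
    slot_stack.pop().ok_or("slot_stack underflow")
}
```

With `inst_id = 2` and a back-offset of 4, `src_by_offset` quietly returns 0, whereas `src_by_pop` on an empty stack returns an error. That difference is what turned the masked operand-count gap into a visible failure.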

Tests

`crates/synth-synthesis/tests/regression_issue_121_slot_stack.rs` — 12 new tests, all passing:

Panic-free shapes (the fuzz-found inputs):

`drop_between_producer_and_consumer` (the PR #117 round-6 input).
`local_set_between_producer_and_consumer`, `global_set_between_producer_and_consumer`, `i32_store_between_producer_and_consumer`.
`block_loop_end_between_producer_and_consumer`, `br_if_between_producer_and_consumer`, `local_tee_then_consumer`.
`double_drop_then_const`, `mixed_i32_i64_with_drop`, `i64_drop_between_i64_consts`.

Semantic correctness (proving the silent-miscompilation path is fixed, not just the panic path):

`drop_preserves_correct_value_for_consumer` — `[const(7), const(11), drop, popcnt]` must operate on 7, not 11. Asserts the Popcnt instruction's src points at const(7)'s slot.
`local_set_preserves_correct_value_for_consumer` — same shape with LocalSet instead of Drop.

Full workspace test (excluding synth-verify — z3 network issue): 1041 passing, 0 regressions. The 4 existing AAPCS/i64 regression tests from PR #100/#101/#103/#104 plus the #93 memset/i64-shift tests all continue to pass.

Test plan

CI green across Test / Clippy / Format / Z3 / Kani / Bazel.
Gating fuzz harness `wasm_ops_lower_or_error` passes on the bd4ae7f/120c187/round-6 corpus seeds (it'll run automatically).
No regression in existing AAPCS / i64 / i32 selector tests.
Once this lands, PR #117 can be rebased; its temporary demotion of `wasm_ops_lower_or_error` to non-gating is no longer needed.

Refs

Issue #121, "wasm_to_ir: inst_id overloaded as both IR-id and vreg-slot — decouple via explicit slot_stack" (silicon-priority; Gale-reported in the field)
PR #117, "fix(lowering): return Err on stack underflow instead of panic — fuzz #113" (fuzz-harness reproductions across six rounds)
Issue #93, "memset/memcpy/memmove i64-codegen produces non-terminating loop on Cortex-M (silicon-blocking)" (silent-drop class; the catch-all bug-finder)
PR #101, "fix(opt): defensive panic on unmapped vreg instead of silent R0 fallback" (defensive panic that surfaced this)

🤖 Generated with Claude Code

`OptimizerBridge::wasm_to_ir` overloaded `inst_id` as both the unique IR instruction id AND a vreg-slot index, with back-references like `inst_id.saturating_sub(2)` assuming a one-to-one correspondence with the wasm value stack. That assumption broke whenever any wasm op consumed a stack slot without producing one: Drop, LocalSet, GlobalSet, the i32/i64 store family, BrIf, and the structural Block/Loop/End markers. The next binary or unary op's back-reference would then index a stale or never-mapped vreg, and `get_arm_reg` would either trip the PR #101 defensive panic or (pre-PR-101) silently fall back to R0, the silent-miscompilation class first surfaced in issue #93.

Gale (the real-hardware test rig) caught WASM modules in the field that tripped this on production silicon; the cargo-fuzz `wasm_ops_lower_or_error` harness on PR #117 surfaced the same class six different ways (Nop/Unreachable/Return were closed there; Drop, LocalSet, Store, Block/Loop/End remained until this PR).

Fix: introduce `slot_stack: Vec<u32>` in `wasm_to_ir` that mirrors the wasm value stack. Each producer pushes its dest vreg onto slot_stack; each consumer pops to discover its source vreg. `inst_id` reverts to its original meaning (a monotonically increasing unique IR id) and is no longer used for slot lookup.

i64 values occupy two consecutive entries on slot_stack (lo first, then hi), matching the (dest_lo, dest_hi) two-vreg-pair layout already used by i64 opcodes. I64ExtendI32U/S aliases dest_lo to the consumed i32 src vreg by IR convention (preserved); I32WrapI64 aliases dest to src_lo (preserved). Drop becomes an explicit `slot_stack.pop(); continue` no-IR-emit arm; Nop/Unreachable/Return emit Opcode::Nop with no slot_stack effect.

Drive-by: `i64_operand_count` was missing I64DivS/I64DivU/I64RemS/I64RemU (so `analyze_i64_local_gets` failed to mark their i64 operands), which was masked by the same inst_id-slot scrambling. Added them; the existing i64-div WAST tests now exercise the i64 LocalGet path instead of fortuitously-correct i32 Loads.

The catch-all `_ => Opcode::Nop` is preserved as a bug-finder: unknown ops do not touch slot_stack, so subsequent consumers fail loudly via `slot_stack.pop().expect(...)` instead of silently mis-binding vregs.

Regression coverage: new `crates/synth-synthesis/tests/regression_issue_121_slot_stack.rs` exercises Drop/LocalSet/GlobalSet/Store/BrIf/Block-Loop-End/LocalTee between producer-and-consumer plus i32 and i64 variants. Two semantic-correctness probes confirm that Popcnt reads the surviving stack value (not the dropped one), proving the fix addresses silent miscompilation, not just the panic.

Test delta: +12 tests, 0 regressions. The 4 fuzz-related regression tests from #100/#101/#103/#104 plus the #93 memset/i64-shift tests all continue to pass.

Refs: issue #121, PR #117 (fuzz-harness reproductions), issue #93 (silent-drop class), PR #101 (defensive panic), PR #100 (fuzz harness).

codecov · 2026-05-18T18:00:02Z

Codecov Report

❌ Patch coverage is 75.32957% with 131 lines in your changes missing coverage. Please review.

Files with missing lines	Patch %	Lines
crates/synth-synthesis/src/optimizer_bridge.rs	75.32%	131 Missing ⚠️


avrabe mentioned this pull request May 18, 2026

wasm_to_ir: inst_id overloaded as both IR-id and vreg-slot — decouple via explicit slot_stack #121

Open

avrabe wants to merge 1 commit into main from fix/issue-121-wasm-to-ir-slot-stack
