fix(adapter): pad CABI alignment in async-lift retptr writeback (LS-A-10)#148
Merged
Conversation
…-10)
Both generate_async_callback_adapter and generate_async_stackful_adapter
wrote result globals to the caller's retptr buffer by advancing the
byte offset by only the flat size of each value (4 or 8 bytes) and
never aligning up to the next field's natural alignment.
For any P3 async lift whose flat-lowered result globals contain an i32
or f32 followed by an i64 or f64 — e.g. record {u32, u64}, tuple
(s32, s64), result<u64, u32> — the canonical ABI requires the 8-byte
field at offset 8 (with 4 bytes of padding after the 4-byte field). The
emitters wrote it at offset 4. Wasm engines treat MemArg.align as a
hint and do not trap on misalignment, so the bug was silent data
corruption: the caller's canon.lower reads the retptr per canonical
ABI and sees stale/zero bytes for the high-order field.
The defect predated #146 / #147; the recent stackful emitter inherited
it via a faithful pattern copy from the callback emitter. The
emit_param_copy_step refactor from #147 did not cover the retptr
writeback, so the duplication remained.
Fix:
- New emit_globals_to_retptr_cabi helper aligns the offset up before
every store (size, align) = natural(val_ty); offset = align_up(offset,
align); offset += size). align_up uses the standard power-of-two bit-
twiddle (n + a - 1) & !(a - 1).
- Both emitters now route through this single CABI-correct writeback.
- Regression test cabi_alignment_stackful_retptr_writes_i64_at_offset_8
builds a synthetic [I32, I64] result, decodes the emitted i64.store
offset via ULEB, and asserts it equals 8.
Discovery:
- v0.8.0 pre-release Mythos delta-pass on adapter/fact.rs per
scripts/mythos/discover.md. Oracle pair: Kani harness (offset
divergence between emitter loop and align_up reference model) +
PoC failing test (the regression test added here).
- LS-A-10 added to safety/stpa/loss-scenarios.yaml under UCA-A-13 with
approved status (the Mythos validate.md round produced a clean
confirmation).
Tests: 207 lib pass (206 prior + 1 regression), 11 p3_async_lowering
pass (callback path unchanged behaviour for same-aligned results).
Refs: LS-A-10, UCA-A-13, H-4, H-4.5
Discovered by: Mythos discover.md v0.8.0 delta-pass
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
avrabe
added a commit
that referenced
this pull request
May 14, 2026
P3 stackful lifting trampoline emitter (#140 / SR-32, sub-#94). Phase 1 (#146) added the four thread_* host intrinsics under pulseengine:async with pinned signatures, the uses_stackful_lift() derived helper, and ADR-1 addendum (a) documenting the two-mode policy. Phase 2 (#147) shipped the trampoline emitter. The dispatcher routes async-lift sites without a [callback] companion to a new generate_async_stackful_adapter; the trampoline calls the lift function directly and relies on the runtime to treat the call as a transparent fiber boundary (transparent suspension on task.wait). The Phase 1 thread_* ABI remains valid for component-internal concurrency but is not consumed by the trampoline. ADR-1 addendum (b) records the shipped design versus the originally sketched thread_new + drive loop and why the simpler direct-call design was chosen. Pre-release Mythos delta-pass on adapter/fact.rs surfaced LS-A-10 — a CABI alignment padding bug in the async-lift retptr writeback that predated this milestone and affected both callback and stackful emitters. Fix landed in #148 before tag: extracted shared emit_globals_to_retptr_cabi helper that pads via align_up before each store. Regression pinned by cabi_alignment_stackful_retptr_writes_i64_at_offset_8. SR-32 status: implemented. Cross-memory (ptr, len) result returns from stackful mode are deferred to a follow-up under #140 reopen if needed; the emitter produces a clear error for that case rather than emit wrong wasm. Tests: 207 lib pass, 11 P3 async integration pass. Refs: #94, #140 (closed), SR-32, LS-A-10, ADR-1, #146, #147, #148 Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Pre-release Mythos delta-pass on `meld-core/src/adapter/fact.rs` (the
Tier-5 file changed by v0.8.0) found a real bug. Release-blocking
for v0.8.0 — this PR fixes it.
The bug
Both `generate_async_callback_adapter` and the new
`generate_async_stackful_adapter` wrote each task.return result global
to the caller's retptr buffer with the byte offset advanced by only
the flat size of the value (4 or 8 bytes) — never aligning up to the
next field's natural alignment.
For any P3 async lift returning an aggregate whose flat lowering mixes
4-byte and 8-byte values — record `{u32, u64}`, tuple `(s32, s64)`,
`result<u64, u32>` — the canonical ABI requires the 8-byte field at
offset 8 (with 4 bytes of padding after the 4-byte field). The emitters
wrote it at offset 4.
Wasm engines treat `MemArg.align` as a hint and do not trap on
misalignment, so this was silent data corruption: the caller's
canon.lower reads the retptr per canonical ABI and sees stale / zero
bytes for the high-order field. Affects every downstream runtime
consuming a meld-fused module (kiln, wasmtime, browsers).
Why both emitters?
The defect pre-dated #146 / #147. The recent stackful emitter inherited
it via a faithful pattern copy from the callback emitter — exactly the
shared-pattern bug category where a refactor for code reuse amplifies
a latent defect because the original code looked correct enough to
copy.
The `emit_param_copy_step` refactor in #147 covered step 0.5 but did
not cover the step-3 retptr writeback. This PR closes that gap.
Fix
New `emit_globals_to_retptr_cabi` helper:
```rust
let (size, align) = natural(val_ty);
offset = align_up(offset, align);
// emit store at offset
offset += size;
```
`align_up` is the standard power-of-two bit-twiddle
`(n + a - 1) & !(a - 1)`. Both emitters now route through this single
CABI-correct writeback.
Test
Regression test `cabi_alignment_stackful_retptr_writes_i64_at_offset_8`
builds a synthetic component with `[I32, I64]` result globals, decodes
the emitted `i64.store` offset via ULEB128, and asserts it equals 8.
Before fix: test fails with offset 4 (confirmed).
After fix: test passes; 207/207 lib + 11/11 P3 async integration green.
Safety case
`safety/stpa/loss-scenarios.yaml` under UCA-A-13 (H-4 / H-4.5).
and the "wasm align is a hint" silent-corruption multiplier.
`scripts/mythos/discover.md`. Oracle pair: Kani harness + PoC test.
v0.8.0 release gate
Per `docs/roadmap.md`: "Releases without a Mythos delta-pass on
changed Tier-5 files are blocked." The pass ran, surfaced LS-A-10,
this PR fixes it and adds the regression. v0.8.0 cut proceeds after
this lands.
Test plan
Refs
and stackful emitter from feat(p3-async): stackful lifting trampoline emitter (#140 phase 2, SR-32) #147
🤖 Generated with Claude Code