fix: LLVM -O2 miscompilation of BEAM unicode/iodata on wasm32 #259

brandonpayton merged 2 commits into main from erlang-otp on Apr 11, 2026
@brandonpayton
Member

Summary

  • Fix LLVM wasm32 backend miscompiling aggregate struct initialization with shadow-stack pointers at -O2
  • Add global.h patch: explicit field-by-field ESTACK/WSTACK init under #ifdef __wasm32__
  • Add Makefile patch: compile erl_unicode.c at -O1 to avoid iodata traversal codegen bug
  • Both patches applied automatically by build-erlang.sh

Details

BEAM's unicode:characters_to_binary/2 BIF and erts_native_filename_need traverse iodata using ESTACK macros (stack-based term traversal). At -O2, LLVM's wasm32 backend miscompiles the aggregate initialization of ErtsEStack/ErtsWStack structs that contain pointers to shadow-stack local arrays. This manifests as:

  • {error, <<>>, {}} from unicode:characters_to_binary([[]])
  • EISDIR error opening start_clean.boot (filename conversion fails)

The bug behaves like a classic heisenbug: adding fprintf(stderr, ...) debug calls changes the code layout enough to prevent the miscompilation.

Root cause: LLVM optimization passes at -O2 incorrectly handle struct fields pointing to __stack_pointer-relative local arrays on wasm32.

Fix: Two-pronged approach:

  1. global.h: Replace aggregate initialization with explicit field-by-field assignment under #ifdef __wasm32__ (defensive)
  2. Makefile: Compile erl_unicode.c at -O1 (avoids the actual codegen bug)
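
The global.h change in item 1 can be illustrated with a minimal sketch. This is not the actual patch: `DemoStack`, `DemoTerm`, `DEMO_STACK_SIZE`, and the `DEMO_DECLARE_*` macros are hypothetical stand-ins, and the real ErtsEStack/ErtsWStack definitions may differ in fields and naming. It only shows the shape of the two initialization styles and how `#ifdef __wasm32__` selects between them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real ERTS definitions. */
typedef unsigned long DemoTerm;
#define DEMO_STACK_SIZE 16

typedef struct {
    DemoTerm *start;    /* base of the backing array */
    DemoTerm *sp;       /* next free slot */
    DemoTerm *end;      /* one past the last slot */
    DemoTerm *fallback; /* the on-stack default array */
    int alloc_type;     /* placeholder for an allocator tag */
} DemoStack;

/* Aggregate form: a single brace initializer holding pointers into a
 * local array. This is the pattern the PR reports as miscompiled. */
#define DEMO_DECLARE_AGGREGATE(s)                 \
    DemoTerm s##_default[DEMO_STACK_SIZE];        \
    DemoStack s = {                               \
        s##_default,                              \
        s##_default,                              \
        s##_default + DEMO_STACK_SIZE,            \
        s##_default,                              \
        0                                         \
    }

/* Field-by-field form: declare the struct, then assign each member. */
#define DEMO_DECLARE_EXPLICIT(s)                  \
    DemoTerm s##_default[DEMO_STACK_SIZE];        \
    DemoStack s;                                  \
    s.start = s##_default;                        \
    s.sp = s##_default;                           \
    s.end = s##_default + DEMO_STACK_SIZE;        \
    s.fallback = s##_default;                     \
    s.alloc_type = 0

/* Use the explicit form only on wasm32, as the PR describes. */
#ifdef __wasm32__
#  define DEMO_DECLARE_STACK(s) DEMO_DECLARE_EXPLICIT(s)
#else
#  define DEMO_DECLARE_STACK(s) DEMO_DECLARE_AGGREGATE(s)
#endif

/* Both forms should describe the same empty stack. */
static int demo_forms_agree(void)
{
    DEMO_DECLARE_AGGREGATE(a);
    DEMO_DECLARE_EXPLICIT(b);
    return (a.end - a.start) == (b.end - b.start)
        && a.sp == a.start && b.sp == b.start
        && a.fallback == a_default && b.fallback == b_default;
}

/* Push 1..n, then pop everything and return the sum. */
static unsigned long demo_sum_roundtrip(int n)
{
    DEMO_DECLARE_STACK(stk);
    unsigned long sum = 0;
    assert(n >= 0 && n <= DEMO_STACK_SIZE);
    for (int i = 1; i <= n; i++) {
        *stk.sp++ = (DemoTerm)i;
    }
    while (stk.sp != stk.start) {
        sum += *--stk.sp;
    }
    return sum;
}
```

On a correct compiler the two forms are equivalent, which is why the description labels the global.h change as defensive; calling `demo_forms_agree()` and `demo_sum_roundtrip(4)` from a small `main` is one way to compare behavior across optimization levels.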

Test plan

  • BEAM boots and runs: io:format("Hello from BEAM!~n"), halt().
  • Arithmetic, string, list comprehension, map, binary, atom operations work
  • Cargo tests: 703 passed
  • Vitest: 199 passed
  • musl libc-test: 0 FAIL (19 XFAIL)
  • POSIX test suite: 0 FAIL (1 XFAIL)

- Rewrite ring.erl with correct ring topology (self as counter,
  spawned processes as forwarders). Old version had a deadlock bug
  where set_next broke the chain.

- Work around BEAM wasm32 ~f format bug: float_to_list/1 uses a broken
  ryu/snprintf path, so use float_to_list/2 with {decimals,N} instead.

- Track thread workers in serve.ts for proper cleanup on exit. BEAM's
  halt(0) hangs in pthread_join because threads are blocked on futex
  waits. Add halt detection: if stdout goes idle for 2s after output,
  force clean exit.

- Remove unused threadChannelOffsets array and keepalive timer.

Ring benchmark results: 1000 processes, 100 rounds = 100K messages
in ~54ms (~1.85M messages/sec).
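
As a quick check of the throughput figure, assuming each of the 1000 processes forwards one message per round and taking the reported ~54 ms as the total wall-clock time:

$$
1000 \times 100 = 100{,}000 \ \text{messages}, \qquad
\frac{100{,}000}{0.054\ \text{s}} \approx 1.85 \times 10^{6} \ \text{messages/s}
$$

That works out to roughly 0.54 µs per message.
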
LLVM's wasm32 backend miscompiles aggregate initialization of structs
containing pointers to shadow-stack local arrays at -O2. This causes
BEAM's ESTACK macros and iodata traversal in erl_unicode.c to produce
garbage, breaking unicode:characters_to_binary/2 and filename conversion.

Two fixes:
- global.h: explicit field-by-field ESTACK/WSTACK init on wasm32
- Makefile: compile erl_unicode.c at -O1 to avoid the codegen bug

Both patches are applied automatically by build-erlang.sh.
@brandonpayton brandonpayton merged commit 6772ef3 into main Apr 11, 2026
@brandonpayton brandonpayton deleted the erlang-otp branch April 11, 2026 12:25
brandonpayton added a commit that referenced this pull request May 29, 2026