test/relaxed_dot: add i16-intermediate overflow boundary cases by matthargett · Pull Request #164 · WebAssembly/relaxed-simd

matthargett · 2026-05-21T08:32:02Z

Summary

Adds two new assert_return cases to
test/core/relaxed-simd/relaxed_dot_product.wast that exercise
the i16-intermediate truncation point with inputs that overflow
i16.

Both new cases use a = b = -128 for all 16 lanes:

i16x8.relaxed_dot_i8x16_i7x16_s — each i16 lane's pair
sum is -128*-128 + -128*-128 = 32768, which overflows
signed-i16. Spec-allowed set: {-32768, 32767} (per-pair wrap
or saturate).
i32x4.relaxed_dot_i8x16_i7x16_add_s with c = 0 — every
pair sum overflows the same way. After extadd_pairwise_i16x8_s
and the final +c, the spec-allowed set is
{-65536, -1, 65534} (the wrap/sat combinations of the two
pair sums).

A direct-byte-sum implementation (one that skips the i16
truncation point and sums all four byte products directly into
i32) produces 65536 per lane, which is outside that
spec-allowed set. The existing assertions in the file don't
catch that shape because their chosen byte values keep the pair
sum within signed-i16 range — e.g. -128 * -127 * 2 = 32512,
no overflow.

How this came up

While adding relaxed-SIMD support to WAMR (the wasm-micro-
runtime) fast-interp at
rebeckerspecialties/wasm-micro-runtime#3,
a chatgpt-codex-connector code-review bot flagged that our
i32x4.relaxed_dot_i8x16_i7x16_add_s implementation skipped the
i16 truncation step and summed all four byte products directly
into i32. We verified the report, fixed the implementation, and
added a unit test pinned at the chosen wrap value.

Then, as a sanity check, we ran the upstream relaxed-SIMD spec
testsuite against the WAMR build with the fix reverted —
expecting it to catch the bug we'd just fixed. It didn't.
All 69 of the relaxed-SIMD assert_returns passed with the
buggy implementation, because every existing test input keeps
the i16 pair sum within range. The bug shape is invisible
to the current test set.

This PR closes that gap.

Verification

WAT well-formed: wast2json --enable-relaxed-simd accepts
the modified file; assertion count goes from 11 → 13.
With WAMR fast-interp's fixed implementation in place, both
new assertions pass (WAMR's result is in the spec-allowed
set).

With the fix reverted, the new add_s assertion fails:

relaxed_dot_product.wast:118 i32x4.relaxed_dot_i8x16_i7x16_add_s(
  a = v128 -128*16, b = v128 -128*16, c = v128 0*4):
  got      v128 i32x4 65536 65536 65536 65536
  expected one of { -65536, -1, 65534 } per lane

— exactly the bug the original PR fixed, now visible at the
spec-test layer.

Notes on the chosen `either` sets

For the _s (no-add) variant I list only the two
homogeneous-across-lanes outcomes {[-32768]*8, [32767]*8},
matching the existing convention in the file (impls are
uniform across lanes; per-lane mixed wrap/sat is in principle
spec-allowed but not enumerated). Happy to expand to the full
combinatorial set if reviewers prefer.
For the _add_s variant the either includes the mixed
-1 outcome (one pair wraps, one saturates) because that's
the within-a-single-lane spec-allowed combination, not a
cross-lane mixing.
"Force-i7-to-zero" interpretation (which would yield 0 for
this input since b has the top bit set) is omitted, matching
the convention of the existing test at lines 63-71 in the same
file that also doesn't enumerate that case.

The existing assertions for both `i16x8.relaxed_dot_i8x16_i7x16_s` and `i32x4.relaxed_dot_i8x16_i7x16_add_s` exercise byte values where the i16 pair sum fits in i16 (e.g. `-128 * -127 * 2 = 32512`, within range). On those inputs an implementation that skips the i16 truncation point and sums byte products directly into the wider lane still produces a spec-allowed answer, so the bug stays hidden. Adding two assertions with `a = b = -128` for all 16 lanes. The i16 pair sum is `-128*-128 + -128*-128 = 32768`, which overflows signed-i16, so the spec-defined `wrap` and `saturate` paths diverge: - `i16x8.relaxed_dot_i8x16_i7x16_s`: {-32768, 32767} - `i32x4.relaxed_dot_i8x16_i7x16_add_s` (c=0): {-65536, -1, 65534} A direct-byte-sum implementation produces 32768 / 65536 respectively, both OUTSIDE the spec-allowed set. How this came up ---------------- While adding relaxed-SIMD support to WAMR (the wasm-micro- runtime) fast-interp at rebeckerspecialties/wasm-micro-runtime#3, a `chatgpt-codex-connector` code-review bot flagged that our `i32x4.relaxed_dot_i8x16_i7x16_add_s` implementation skipped the i16 truncation step and summed all four byte products directly into i32. We verified the report, fixed the implementation, and added a unit test pinned at the chosen wrap value. Then, as a sanity check, we ran the upstream relaxed-SIMD spec testsuite against the WAMR build with the fix REVERTED — to confirm the spec suite would have caught the bug. It didn't: the existing test inputs all stay within the i16 pair-sum range, so the missing i16 truncation didn't change any result. This commit closes that gap. Verified locally: with the fix in place, WAMR produces results in the spec-allowed set. With the fix reverted, the new assertion fails as expected, surfacing the bug. Signed-off-by: Matt Hargett <plaztiksyke@gmail.com>

matthargett mentioned this pull request May 21, 2026

fast-interp: relaxed-SIMD opcode lowering bytecodealliance/wasm-micro-runtime#4950

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

test/relaxed_dot: add i16-intermediate overflow boundary cases#164

test/relaxed_dot: add i16-intermediate overflow boundary cases#164
matthargett wants to merge 1 commit into
WebAssembly:mainfrom
matthargett:relaxed-dot-i16-overflow-test

matthargett commented May 21, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

matthargett commented May 21, 2026

Summary

How this came up

Verification

Notes on the chosen either sets

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Notes on the chosen `either` sets