test/relaxed_dot: add i16-intermediate overflow boundary cases #164
Open
matthargett wants to merge 1 commit into
The existing assertions for both `i16x8.relaxed_dot_i8x16_i7x16_s`
and `i32x4.relaxed_dot_i8x16_i7x16_add_s` exercise byte values
where the i16 pair sum fits in i16 (e.g. `-128 * -127 * 2 =
32512`, within range). On those inputs an implementation that
skips the i16 truncation point and sums byte products directly
into the wider lane still produces a spec-allowed answer, so the
bug stays hidden.
This commit adds two assertions with `a = b = -128` in all 16
lanes. The i16 pair sum is `-128*-128 + -128*-128 = 32768`, which
overflows signed i16, so the spec-defined `wrap` and `saturate`
paths diverge:
- `i16x8.relaxed_dot_i8x16_i7x16_s`: {-32768, 32767}
- `i32x4.relaxed_dot_i8x16_i7x16_add_s` (c=0): {-65536, -1, 65534}
A direct-byte-sum implementation produces 32768 / 65536
respectively, both OUTSIDE the spec-allowed set.
How this came up
----------------
While adding relaxed-SIMD support to WAMR (the wasm-micro-
runtime) fast-interp at rebeckerspecialties/wasm-micro-runtime#3,
a `chatgpt-codex-connector` code-review bot flagged that our
`i32x4.relaxed_dot_i8x16_i7x16_add_s` implementation skipped the
i16 truncation step and summed all four byte products directly
into i32. We verified the report, fixed the implementation, and
added a unit test pinned at the chosen wrap value.
Then, as a sanity check, we ran the upstream relaxed-SIMD spec
testsuite against the WAMR build with the fix REVERTED — to
confirm the spec suite would have caught the bug. It didn't:
the existing test inputs all stay within the i16 pair-sum
range, so the missing i16 truncation didn't change any result.
This commit closes that gap.
Verified locally: with the fix in place, WAMR produces results
in the spec-allowed set. With the fix reverted, the new
assertion fails as expected, surfacing the bug.
Signed-off-by: Matt Hargett <plaztiksyke@gmail.com>
Summary
Adds two new `assert_return` cases to
`test/core/relaxed-simd/relaxed_dot_product.wast` that exercise
the i16-intermediate truncation point with inputs that overflow
i16.

Both new cases use `a = b = -128` for all 16 lanes:

- `i16x8.relaxed_dot_i8x16_i7x16_s`: each i16 lane's pair sum is
  `-128*-128 + -128*-128 = 32768`, which overflows signed i16.
  Spec-allowed set: `{-32768, 32767}` (per-pair wrap or saturate).
- `i32x4.relaxed_dot_i8x16_i7x16_add_s` with `c = 0`: every pair
  sum overflows the same way. After `extadd_pairwise_i16x8_s` and
  the final `+c`, the spec-allowed set is `{-65536, -1, 65534}`
  (the wrap/sat combinations of the two pair sums).

A direct-byte-sum implementation (one that skips the i16
truncation point and sums all four byte products directly into
i32) produces `65536` per lane, which is outside that
spec-allowed set. The existing assertions in the file don't
catch that shape because their chosen byte values keep the pair
sum within signed-i16 range, e.g. `-128 * -127 * 2 = 32512`,
no overflow.
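
For reviewers who want to check the arithmetic, here is a minimal Python sketch of one i32 lane of `i32x4.relaxed_dot_i8x16_i7x16_add_s`. It is not taken from the spec interpreter or from WAMR; the function names are hypothetical, it models only the signed interpretation of `b` and the wrap/saturate choice at the i16 pair sum, and it leaves out i32 wrapping of the final add (which cannot occur with `c = 0` here).

```python
def wrap_i16(x: int) -> int:
    """Reduce x modulo 2**16 into the signed i16 range."""
    return (x + 0x8000) % 0x10000 - 0x8000


def sat_i16(x: int) -> int:
    """Clamp x to the signed i16 range."""
    return max(-0x8000, min(0x7FFF, x))


def pair_sum_outcomes(a0: int, b0: int, a1: int, b1: int) -> set:
    """Results of one i16 pair sum under wrap and under saturate."""
    s = a0 * b0 + a1 * b1
    return {wrap_i16(s), sat_i16(s)}


def add_s_lane_outcomes(a, b, c: int = 0) -> set:
    """Results for one i32 lane, given four signed bytes in a and in b.

    Assumption: each of the two pair sums may independently wrap or saturate.
    """
    lo = pair_sum_outcomes(a[0], b[0], a[1], b[1])
    hi = pair_sum_outcomes(a[2], b[2], a[3], b[3])
    return {x + y + c for x in lo for y in hi}


def direct_sum_lane(a, b, c: int = 0) -> int:
    """The buggy shape: sum all four byte products straight into i32."""
    return sum(x * y for x, y in zip(a, b)) + c


new_a = new_b = [-128] * 4            # new boundary input
old_a, old_b = [-128] * 4, [-127] * 4  # pair sum stays at 32512

print(sorted(add_s_lane_outcomes(new_a, new_b)))  # [-65536, -1, 65534]
print(direct_sum_lane(new_a, new_b))              # 65536
print(sorted(add_s_lane_outcomes(old_a, old_b)))  # [65024]
print(direct_sum_lane(old_a, old_b))              # 65024
```

On the older input both models agree on `65024`, which is why the existing assertions cannot tell them apart. On the new input the direct sum lands outside the allowed set.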
How this came up
While adding relaxed-SIMD support to WAMR (the
wasm-micro-runtime) fast-interp at
rebeckerspecialties/wasm-micro-runtime#3, a
`chatgpt-codex-connector` code-review bot flagged that our
`i32x4.relaxed_dot_i8x16_i7x16_add_s` implementation skipped the
i16 truncation step and summed all four byte products directly
into i32. We verified the report, fixed the implementation, and
added a unit test pinned at the chosen wrap value.

Then, as a sanity check, we ran the upstream relaxed-SIMD spec
testsuite against the WAMR build with the fix reverted,
expecting it to catch the bug we'd just fixed. It didn't.
All 69 of the relaxed-SIMD `assert_return`s passed with the
buggy implementation, because every existing test input keeps
the i16 pair sum within range. The bug shape is invisible
to the current test set.
This PR closes that gap.
Verification
- `wast2json --enable-relaxed-simd` accepts the modified file;
  assertion count goes from 11 → 13.
- With the fix in place, the new assertions pass (WAMR's result
  is in the spec-allowed set).
- With the fix reverted, the new `add_s` assertion fails, so the
  bug surfaces at the spec-test layer.
Notes on the chosen `either` sets

For the `_s` (no-add) variant I list only the two
homogeneous-across-lanes outcomes `{[-32768]*8, [32767]*8}`,
matching the existing convention in the file (impls are
uniform across lanes; per-lane mixed wrap/sat is in principle
spec-allowed but not enumerated). Happy to expand to the full
combinatorial set if reviewers prefer.

For the `_add_s` variant the `either` includes the mixed `-1`
outcome (one pair wraps, one saturates) because that's
the within-a-single-lane spec-allowed combination, not a
cross-lane mixing.

The "force-i7-to-zero" interpretation (which would yield `0` for
this input since `b` has the top bit set) is omitted, matching
the convention of the existing test at lines 63-71 in the same
file that also doesn't enumerate that case.
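
To make the enumeration choice concrete, here is a rough Python sketch of how the listed alternatives behave as a matcher. It assumes both variants enumerate only vectors that are uniform across lanes, as described above; the helper names are hypothetical and this is not how the spec test harness is implemented.

```python
# Per-lane outcomes discussed above (wrap, saturate, and their sums).
I16_PAIR_OUTCOMES = (-32768, 32767)
I32_LANE_OUTCOMES = (-65536, -1, 65534)


def homogeneous_alternatives(outcomes, lanes):
    """One candidate vector per outcome, identical in every lane."""
    return [[v] * lanes for v in outcomes]


def matches_either(result, alternatives):
    """True if the result equals any enumerated alternative vector."""
    return any(list(result) == alt for alt in alternatives)


S_ALTERNATIVES = homogeneous_alternatives(I16_PAIR_OUTCOMES, 8)
ADD_S_ALTERNATIVES = homogeneous_alternatives(I32_LANE_OUTCOMES, 4)

uniform_wrap = [-32768] * 8      # accepted
mixed_lanes = [-32768, 32767] * 4  # spec-allowed in principle, not enumerated
buggy_add_s = [65536] * 4        # direct-byte-sum result, rejected

print(matches_either(uniform_wrap, S_ALTERNATIVES))     # True
print(matches_either(mixed_lanes, S_ALTERNATIVES))      # False
print(matches_either([-1] * 4, ADD_S_ALTERNATIVES))     # True
print(matches_either(buggy_add_s, ADD_S_ALTERNATIVES))  # False
```

The second line of output is the trade-off mentioned above: an implementation that mixes wrap and saturate across lanes would be rejected by the homogeneous list even though the spec permits it in principle.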