xfail upstream numpy 2.3 longdouble FFT flake by spMohanty · Pull Request #28 · AIcrowd/whest

spMohanty · 2026-04-17T19:42:39Z

Summary

Follow-up to #17. Marks an upstream numpy flake as xfail so CI on numpy 2.3 × Linux × longdouble isn't red on an unlucky random draw.

The flake

numpy.fft.tests.test_pocketfft.py::TestFFT1D::test_identity_long_short_reversed[longdouble] uses atol = 5 * np.spacing(longdouble(1.0)). On Linux x86, where longdouble is 80-bit extended precision (ε ≈ 1.08e-19), that is an absolute tolerance of only ~5.4e-19. The test feeds unseeded random data, so ~1 in 10 runs hits a mismatch of roughly 5.5–6 × ε and trips the assertion.

Observed on PR #17 run 24572375511 job 71848669583:

FAILED test_pocketfft.py::TestFFT1D::test_identity_long_short_reversed[longdouble]
Mismatched elements: 1 / 16 (6.25%)
Max absolute difference among violations: 5.96311195e-19

A re-run of the same commit (no code change, just re-rolled random input) went green.

Upstream acknowledgement

numpy itself has patched this. Commit 0c514d9 on main, Oct 2025:

"MAINT, TST: Increase tolerance in fft test. test_identity_long_short_reversed fails fairly often, so increase the test tolerance by a bit."

The fix bumps atol from 5 × spacing → 6 × spacing. It shipped in numpy 2.4 but was not backported to maintenance/2.3.x (verified via git compare).
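The effect of the bump can be checked with plain arithmetic. This is a minimal sketch, assuming the x86 80-bit format, whose 64-bit significand puts the spacing at 1.0 at 2**-63. That value is hardcoded instead of calling np.spacing so the check runs on any host, and the comparison ignores any relative-tolerance term. The observed difference is the figure from the failing CI job.

```python
# Spacing at 1.0 for x86 80-bit extended precision (64-bit significand).
# Hardcoded as an assumption so the check does not depend on the host's longdouble.
EPS_X87 = 2.0 ** -63            # about 1.08e-19

OBSERVED = 5.96311195e-19       # max absolute difference from the failing CI job

old_atol = 5 * EPS_X87          # numpy 2.3
new_atol = 6 * EPS_X87          # numpy 2.4 (commit 0c514d9)


def passes(diff, atol):
    """True if an absolute difference is within the absolute tolerance."""
    return diff <= atol


print(f"observed / spacing      = {OBSERVED / EPS_X87:.2f}")
print(f"passes with 5 x spacing = {passes(OBSERVED, old_atol)}")
print(f"passes with 6 x spacing = {passes(OBSERVED, new_atol)}")
```

The observed mismatch is about 5.5 spacings, so it exceeds the 2.3 tolerance and sits inside the 2.4 one.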

Fix in this PR

One new entry in tests/numpy_compat/xfails.py scoped tightly to the flaky parametrize:

"*TestFFT1D::test_identity_long_short_reversed*longdouble*": (
    "UPSTREAM_NUMPY_FLAKE: numpy 2.3 longdouble FFT tolerance is too "
    "tight (5×spacing on ε≈1.08e-19 leaves no headroom); unseeded "
    "random input hits this occasionally. Fixed upstream in numpy "
    "commit 0c514d9 (atol: 5→6×spacing), shipped in 2.4, not "
    "backported to 2.3. xpass on 2.2/2.4 is expected."
),

strict=False (default in our conftest) means the xpass on 2.2 / 2.4 / macOS-ARM is non-fatal.
Pattern uses *longdouble* wildcards because fnmatch treats literal [longdouble] as a character class; the single / double parametrizes continue to run normally.
The single-line pattern auto-retires if we ever drop numpy 2.3 from the CI matrix.
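The character-class behaviour is easy to demonstrate with the standard library. In this sketch the node-ID path prefix is an assumption about what pytest reports; only the part from TestFFT1D onward comes from the PR.

```python
from fnmatch import fnmatchcase

# Node IDs in the shape pytest reports; the path prefix is an assumption.
BASE = "numpy/fft/tests/test_pocketfft.py::TestFFT1D::test_identity_long_short_reversed"
NODE_IDS = {p: f"{BASE}[{p}]" for p in ("single", "double", "longdouble")}

# "[longdouble]" is read as "one character out of l, o, n, g, d, u, b, e",
# so this pattern cannot match the literal bracketed suffix.
LITERAL = "*TestFFT1D::test_identity_long_short_reversed[longdouble]"

# The pattern added in this PR: wildcards around the parameter name.
WILDCARD = "*TestFFT1D::test_identity_long_short_reversed*longdouble*"

for param, node_id in NODE_IDS.items():
    print(param, fnmatchcase(node_id, LITERAL), fnmatchcase(node_id, WILDCARD))
```

The literal form matches none of the three node IDs, while the wildcard form matches only the longdouble one.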

Test plan

Locally verified on numpy 2.2 / 2.3 / 2.4: the pattern marks only the longdouble parametrize xfail; single and double pass normally.
Verified fnmatch + substring fallback match the real pytest node ID.
No other xfail counts change across the three versions.
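The matching check in the test plan can be approximated without running pytest. The helper below is a hypothetical stand-in, not the code in conftest.py: the name xfail_reason, the shortened reason string, and the exact form of the substring fallback are all assumptions for illustration.

```python
from fnmatch import fnmatchcase

# Hypothetical stand-in for tests/numpy_compat/xfails.py (reason shortened).
XFAIL_PATTERNS = {
    "*TestFFT1D::test_identity_long_short_reversed*longdouble*":
        "UPSTREAM_NUMPY_FLAKE: numpy 2.3 longdouble FFT tolerance is too tight",
}


def xfail_reason(node_id, patterns=XFAIL_PATTERNS):
    """Return the reason of the first matching pattern, or None."""
    for pattern, reason in patterns.items():
        # Glob match first; fall back to a plain substring test (assumed form).
        if fnmatchcase(node_id, pattern) or pattern in node_id:
            return reason
    return None


# In a collection hook, a non-None reason would typically be applied as
#   item.add_marker(pytest.mark.xfail(reason=reason, strict=False))

# Path prefix is an assumption about the node ID pytest reports.
base = "numpy/fft/tests/test_pocketfft.py::TestFFT1D::test_identity_long_short_reversed"
for param in ("single", "double", "longdouble"):
    print(param, xfail_reason(f"{base}[{param}]") is not None)
```

With strict=False on the marker, a run where the test happens to pass is reported as xpass and does not fail the suite.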

Out of scope (tracked separately)

A backport of 0c514d9 to numpy's own maintenance/2.3.x would be a cleaner upstream fix (would likely land in 2.3.6). Willing to send that upstream PR if wanted — just didn't want to couple it with this whest-side mitigation.

numpy 2.3's test_identity_long_short_reversed[longdouble] uses atol = 5 * np.spacing(longdouble(1.0)) ≈ 5.4e-19. On Linux x86, where longdouble is 80-bit extended precision (ε ≈ 1.08e-19), unseeded random input occasionally produces mismatches of ~5.5× spacing, tripping the assertion.

Upstream acknowledges this as flaky. numpy commit 0c514d9 (Oct 2025, "MAINT, TST: Increase tolerance in fft test. test_identity_long_short_reversed fails fairly often") bumped the tolerance from 5×spacing to 6×spacing. That fix shipped in numpy 2.4 but was not backported to 2.3.

Since conftest.pytest_collection_modifyitems uses strict=False, the xfail is non-blocking: on macOS ARM, or on Linux configurations where the mismatch doesn't hit (the common case), the test xpasses, which is reported but doesn't fail the suite. The xfail covers the narrow window of numpy 2.3 × Linux-x86-longdouble × unlucky random input.

The pattern uses a *longdouble* wildcard (not literal [longdouble]) because fnmatch treats square brackets as a character class; the existing TestArrayComparisons pattern follows the same convention.

Reproduces a failure seen on PR #17 run 24572375511 job 71848669583 (numpy 2.3.5, Python 3.12, ubuntu-latest).

spMohanty merged commit fa58cc8 into main Apr 17, 2026
15 checks passed

spMohanty deleted the dev/numpy-2.3-longdouble-fft-xfail branch April 18, 2026 14:40
