xfail upstream numpy 2.3 longdouble FFT flake #28
Merged
numpy 2.3's test_identity_long_short_reversed[longdouble] uses atol = 5 * np.spacing(longdouble(1.0)) ≈ 5.4e-19. On Linux x86, where longdouble is 80-bit extended precision (ε ≈ 1.08e-19), unseeded random input occasionally produces mismatches of ~5.5× spacing, tripping the assertion.
Upstream acknowledges this as flaky. numpy commit 0c514d9 (Oct 2025, "MAINT, TST: Increase tolerance in fft test. test_identity_long_short_reversed fails fairly often") bumped the tolerance from 5×spacing to 6×spacing. That fix shipped in numpy 2.4 but was not backported to 2.3.
Since conftest.pytest_collection_modifyitems uses strict=False, the xfail is non-blocking: on macOS ARM or Linux configurations where the mismatch doesn't hit, the test xpasses (the common case), which is reported but doesn't fail the suite. The xfail covers the narrow window of numpy 2.3 × Linux-x86-longdouble × unlucky random input.
The pattern uses a *longdouble* wildcard (not literal [longdouble]) because fnmatch treats square brackets as a character class; the existing TestArrayComparisons pattern shape demonstrates this convention.
Reproduces a failure seen on PR #17 run 24572375511 job 71848669583 (numpy 2.3.5, Python 3.12, ubuntu-latest).
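The fnmatch behaviour described above can be checked with the standard library alone. The sketch below is illustrative only: the two pattern strings and the shortened node IDs are assumptions made for the demonstration, not the actual entry in this repository, and the "double" parametrization ID is taken from the PR description rather than verified against numpy.

```python
from fnmatch import fnmatchcase

# Shortened, illustrative node IDs (the real pytest node ID also carries the file path).
LONGDOUBLE_ID = "TestFFT1D::test_identity_long_short_reversed[longdouble]"
DOUBLE_ID = "TestFFT1D::test_identity_long_short_reversed[double]"

# Hypothetical patterns, written only to contrast the two spellings.
LITERAL_PATTERN = "*test_identity_long_short_reversed[longdouble]"
WILDCARD_PATTERN = "*test_identity_long_short_reversed*longdouble*"

# "[longdouble]" is a character class: it matches exactly ONE character
# drawn from the set {l, o, n, g, d, u, b, e}, not the bracketed text.
literal_hit = fnmatchcase(LONGDOUBLE_ID, LITERAL_PATTERN)

# The wildcard form avoids brackets entirely and matches the real ID.
wildcard_hit = fnmatchcase(LONGDOUBLE_ID, WILDCARD_PATTERN)

# Other parametrizations are left alone by the wildcard pattern.
other_param_hit = fnmatchcase(DOUBLE_ID, WILDCARD_PATTERN)

print(literal_hit, wildcard_hit, other_param_hit)
```

The literal spelling silently fails to match the longdouble node ID, which is why the wildcard form is the safer convention for bracketed parametrization IDs.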
Summary
Follow-up to #17. Marks an upstream numpy flake as xfail so CI on numpy 2.3 × Linux × longdouble isn't red on an unlucky random draw.
The flake
numpy.fft.tests.test_pocketfft.py::TestFFT1D::test_identity_long_short_reversed[longdouble] uses atol = 5 * np.spacing(longdouble(1.0)). On Linux x86, where longdouble is 80-bit extended precision (ε ≈ 1.08e-19), that gives only ~5.4e-19 of headroom. The test feeds unseeded random data, so ~1 in 10 runs hits a mismatch of roughly 5.5–6 × ε and trips the assertion.
Observed on PR #17 run 24572375511 job 71848669583 (numpy 2.3.5, Python 3.12, ubuntu-latest).
A re-run of the same commit (no code change, just re-rolled random input) went green.
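The tolerance arithmetic can be reproduced without numpy. This is a rough sketch under two assumptions: that the 80-bit extended format has a 64-bit significand, so the spacing at 1.0 is 2**-63, and that the test compares with a purely absolute tolerance. The 5.5× mismatch is the approximate figure quoted above, not a measured value.

```python
import math

# Assumption: x86 80-bit extended precision has a 64-bit significand,
# so the gap between 1.0 and the next representable value is 2**-63.
SPACING = math.ldexp(1.0, -63)  # about 1.08e-19

old_atol = 5 * SPACING  # tolerance used by numpy 2.3 (about 5.4e-19)
new_atol = 6 * SPACING  # tolerance after upstream commit 0c514d9

# Illustrative mismatch of ~5.5 x spacing, the size reported for the flake.
mismatch = 5.5 * SPACING

# Assumption: a pure absolute-tolerance check, |actual - expected| <= atol.
passes_old = mismatch <= old_atol
passes_new = mismatch <= new_atol

print(f"spacing  = {SPACING:.3e}")
print(f"old atol = {old_atol:.3e}, passes: {passes_old}")
print(f"new atol = {new_atol:.3e}, passes: {passes_new}")
```

A mismatch of that size sits between the two tolerances, which is consistent with the test failing on 2.3 and passing once the bound is raised to 6 × spacing.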
Upstream acknowledgement
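numpy itself has patched this. Commit 0c514d9 on main, Oct 2025: "MAINT, TST: Increase tolerance in fft test. test_identity_long_short_reversed fails fairly often".
The fix bumps atol from 5 × spacing to 6 × spacing. It shipped in numpy 2.4 but was not backported to maintenance/2.3.x (verified via git compare).
Fix in this PR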
One new entry in tests/numpy_compat/xfails.py, scoped tightly to the flaky longdouble parametrization.
strict=False (default in our conftest) means the xpass on 2.2 / 2.4 / macOS-ARM is non-fatal.
The pattern uses *longdouble* wildcards because fnmatch treats literal [longdouble] as a character class; the single/double parametrizations continue to run normally.
Test plan
The longdouble parametrization xfails; single and double pass normally.
fnmatch + substring fallback match the real pytest node ID.
Out of scope (tracked separately)
A backport of 0c514d9 to numpy's own maintenance/2.3.x would be a cleaner upstream fix (would likely land in 2.3.6). Willing to send that upstream PR if wanted; just didn't want to couple it with this whest-side mitigation.