TEST: 1.20.x + blas variants #227

h-vetinari · 2021-02-10T21:09:23Z

Following the same scheme as #196, but for the 1.20 branch. Should not be merged due to conda/conda-build#3947.

conda-forge-linter · 2021-02-10T21:09:28Z

Hi! This is the friendly automated conda-forge-linting service.

I just wanted to let you know that I linted all conda-recipes in your PR (recipe) and found it was in an excellent condition.

h-vetinari · 2021-02-10T22:29:20Z

Update with 1.20.1

Down from 9 failures out of 92 runs, there are now 3 failures out of 64 runs (the matrix shrank by 16 cpython3.6 runs and 12 pypy36 runs).

The good news:

win + blis passes 🥳 🚀

The bad news:

pypy still segfaulting on ppc + openblas

Details

lib	before	after
`numpy`	`1.19.5`	`1.20.1`
`libblas`	`3.9.0-7`	`3.9.0-8`
`blis`	`0.8.0-1`	`0.8.0-1`
`openblas`	`0.3.12-pthreads-1`	`0.3.12-pthreads-1`
`mkl`	`2020.4-304`	`2020.4-304`
`netlib`	`3.9.0-3`	`3.9.0-3`

variant	before	after
linux + ppc64le + pypy	segfault	segfault only on openblas
win + blis	first passed test-suite with py38 only, for reasons unclear; 11-13 failures otherwise	passes 🥳
win	3 failures due to `The process tried to write to a nonexistent pipe.` across blis/mkl/openblas	same error returned twice on mkl builds

variant	blis	mkl	netlib	openblas	sum*
linux / x86	✔️	✔️	✔️	✔️	-
linux / aarch	➖	➖	✔️	✔️	-
linux / ppc64le	➖	➖	✔️	✔️ (cpython) / ❌ (pypy)	1F
osx / x86	✔️	✔️	✔️	✔️	-
osx / arm	➖	➖	✔️	✔️	-
win	✔️	✔️ (py39) / ❌ (py37, py38)	✔️	✔️	2F
sum*	-	2F	-	1F	3F

* sum of Failures (out of a total of 64 CI combinations being tested)
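
For reference, the sums in the table can be recomputed from the individual failing jobs. The snippet below is only an illustrative sketch: the `failures` list is transcribed by hand from the table above, and the names `failures`, `by_platform` and `by_blas` are made up for this example.

```python
from collections import Counter

# Failing CI jobs, transcribed by hand from the table above:
# (platform, blas flavour, python implementation/version)
failures = [
    ("linux / ppc64le", "openblas", "pypy"),
    ("win", "mkl", "py37"),
    ("win", "mkl", "py38"),
]

# 92 combinations before, minus the dropped cpython3.6 and pypy36 runs
total_runs = 92 - 16 - 12

# Row sums (per platform) and column sums (per blas flavour)
by_platform = Counter(platform for platform, _, _ in failures)
by_blas = Counter(blas for _, blas, _ in failures)

print(f"{len(failures)} failures out of {total_runs} runs")
for name, count in sorted(by_platform.items()):
    print(f"  {name}: {count}F")
for name, count in sorted(by_blas.items()):
    print(f"  {name}: {count}F")
```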

Build logs:
Azure
Drone
Travis

ppc + openblas + pypy: SEGFAULT

Hard cut in the log:

lib/tests/test_format.py::test_bad_header PASSED                         [ 59%]
lib/tests/test_format.py::test_large_file_support PASSED                 [ 59%]

win + mkl + cpython 3.7 / 3.8: hard error

  File "D:\bld\numpy_1612991625372\_test_env\lib\site-packages\_pytest\_io\terminalwriter.py", line 155, in write
    self._file.write(msg)
OSError: [Errno 22] Invalid argument
##[error]Cmd.exe exited with code '1'.
The process tried to write to a nonexistent pipe.

h-vetinari · 2021-05-08T20:16:01Z

Update for 1.20.2 & new blas builds

From 3 failures out of 64, there are now 20 failures, where 18 are (most likely) due to an openblas bug.

The bad news:

one test fails under openblas for all arches / OSes / python versions, related to nan-handling. Not sure if this comes from numpy or openblas, but since numpy only had a patch-release, I'm guessing this is on the openblas-side. CC @martin-frbg
segfault under ppc + pypy remains
one blis-run regressed under windows (and in a flaky manner as well...)

Details

lib	before	after
`numpy`	`1.20.1`	`1.20.2`
`libblas`	`3.9.0-8`	`3.9.0-9`
`blis`	`0.8.0-1`	`0.8.1-0`
`openblas`	`0.3.12-pthreads-1`	`0.3.15-pthreads-0`
`mkl`	`2020.4-304`	`2021.2-389`
`netlib`	`3.9.0-3`	`3.9.0-5`

variant	before	after
linux + ppc64le + openblas + pypy	segfault	segfault remains 😒
win + blis	passed	11 failures for py38-only for first run, 14 failures for py37-only on rerun
win + mkl	2 failures due to `The process tried to write to a nonexistent pipe.`	happened once again out of 6 runs (incl. restart)

variant	blis	mkl	netlib	openblas	sum*
linux / x86	✔️	✔️	✔️	❌	4F
linux / aarch	➖	➖	✔️	❌	4F
linux / ppc64le	➖	➖	✔️	❌	4F
osx / arm	➖	➖	✔️	❌**	2F
osx / x86	✔️	✔️	✔️	❌	4F
win / x86	✔️ (py38, py39) / ❌ (py37)	✔️	✔️	❌	4F
sum*	1F	-	-	19F	20F

* sum of Failures (out of a total of 64 CI combinations being tested)
** tests are not run for osx-arm, but the only reasonable assumption is that these would fail as well

Build logs:
Azure (& previous run)
Drone
Travis (& previous run)

linux (all arches) / osx / win + openblas: 1 failure numpy.linalg.tests.test_linalg.TestCond.test_nan

=================================== FAILURES ===================================
______________________________ TestCond.test_nan _______________________________

self = <numpy.linalg.tests.test_linalg.TestCond object at 0x7f3853dbc7d0>

    def test_nan(self):
        # nans should be passed through, not converted to infs
        ps = [None, 1, -1, 2, -2, 'fro']
        p_pos = [None, 1, 2, 'fro']
    
        A = np.ones((2, 2))
        A[0,1] = np.nan
        for p in ps:
>           c = linalg.cond(A, p)

[...]/numpy/linalg/tests/test_linalg.py:777: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 
<__array_function__ internals>:6: in cond
    ???
[...]/numpy/linalg/linalg.py:1765: in cond
    s = svd(x, compute_uv=False)
<__array_function__ internals>:6: in svd
    ???
[...]/numpy/linalg/linalg.py:1672: in svd
    s = gufunc(a, signature=signature, extobj=extobj)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

err = 'invalid value', flag = 8

    def _raise_linalgerror_svd_nonconvergence(err, flag):
>       raise LinAlgError("SVD did not converge")
E       numpy.linalg.LinAlgError: SVD did not converge
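
The failing check can be tried outside the test suite with a few lines. This is a minimal sketch that mirrors the loop in `TestCond.test_nan`, assuming only that numpy is importable; `probe_cond_nan` is a hypothetical helper name. Whether a given norm raises or returns a value depends on the BLAS/LAPACK build numpy is linked against, so the sketch records the outcome instead of asserting one.

```python
import numpy as np


def probe_cond_nan():
    """Call linalg.cond on a matrix containing a NaN, for each norm in the test."""
    A = np.ones((2, 2))
    A[0, 1] = np.nan
    outcomes = {}
    for p in [None, 1, -1, 2, -2, "fro"]:
        try:
            # The test expects NaN to be passed through here
            outcomes[p] = float(np.linalg.cond(A, p))
        except np.linalg.LinAlgError as exc:
            # This is the failure mode shown in the traceback above
            outcomes[p] = f"LinAlgError: {exc}"
    return outcomes


for p, outcome in probe_cond_nan().items():
    print(f"p={p!r}: {outcome}")
```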

linux + ppc + pypy: SEGFAULT

lib/tests/test_format.py::test_bad_header PASSED                         [ 60%]
lib/tests/test_format.py::test_large_file_support PASSED                 [ 60%]
/home/conda/feedstock_root/build_artifacts/numpy_1620502851667/test_tmp/run_test.sh: line 9:  2294 Killed                  pytest --verbose --pyargs numpy -k "not (_not_a_real_test or test_einsum_sums_cfloat64 or test_loss_of_precision or test_large_zip or test_may_share_memory_easy_fuzz or test_may_share_memory_harder_fuzz or test_unary_ufunc_call_fuzz or test_count_nonzero_all or test_diophantine_fuzz or test_generalized_sq_cases or test_may_share_memory_harder_fuzz or test_large_zip)" --durations=0
Tests failed for numpy-1.20.2-py37h9e3b4ae_0.tar.bz2 - moving package to /home/conda/feedstock_root/build_artifacts/broken
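
Side note: the `-k` expression in the log above lists `test_may_share_memory_harder_fuzz` and `test_large_zip` twice. A minimal sketch of assembling such a deselection expression from a deduplicated list; `build_k_expression` is a hypothetical helper for illustration, not something that exists in the recipe.

```python
def build_k_expression(skipped_tests):
    """Build a pytest ``-k`` expression that deselects the given tests."""
    # dict.fromkeys drops duplicates while keeping first-seen order
    unique = list(dict.fromkeys(skipped_tests))
    if not unique:
        return ""
    return "not (" + " or ".join(unique) + ")"


skips = [
    "_not_a_real_test",
    "test_einsum_sums_cfloat64",
    "test_loss_of_precision",
    "test_large_zip",
    "test_may_share_memory_easy_fuzz",
    "test_may_share_memory_harder_fuzz",
    "test_unary_ufunc_call_fuzz",
    "test_count_nonzero_all",
    "test_diophantine_fuzz",
    "test_generalized_sq_cases",
    "test_may_share_memory_harder_fuzz",  # duplicate in the logged command
    "test_large_zip",                     # duplicate in the logged command
]

expr = build_k_expression(skips)
print(expr)
```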

win + blis + cpython 3.7: 14 failures

=========================== short test summary info ===========================
FAILED core/tests/test_multiarray.py::TestMatmul::test_dot_equivalent[args4]
FAILED core/tests/test_multiarray.py::TestMatmul::test_matmul_object - Assert...
FAILED linalg/tests/test_linalg.py::TestSolve::test_sq_cases - AssertionError...
FAILED linalg/tests/test_linalg.py::TestSolve::test_generalized_sq_cases - As...
FAILED linalg/tests/test_linalg.py::TestInv::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestInv::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestPinv::test_nonsq_cases - AssertionErr...
FAILED linalg/tests/test_linalg.py::TestPinv::test_generalized_sq_cases - Ass...
FAILED linalg/tests/test_linalg.py::TestPinv::test_generalized_nonsq_cases - ...
FAILED linalg/tests/test_linalg.py::TestDet::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestDet::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestMatrixPower::test_power_is_minus_one[dt13]
FAILED linalg/tests/test_linalg.py::TestCholesky::test_basic_property - Asser...
FAILED linalg/tests/test_regression.py::TestRegression::test_lstsq_complex_larger_rhs
= 14 failed, 13540 passed, 716 skipped, 20 xfailed, 1 xpassed, 229 warnings in 409.45s (0:06:49) =
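
To compare runs without eyeballing the logs, the final summary line can be turned into numbers. A small sketch using only the standard library; `parse_pytest_summary` is a hypothetical helper, and the regular expression is an assumption about the shape of this particular summary line rather than a general pytest parser.

```python
import re


def parse_pytest_summary(line):
    """Return {outcome: count} parsed from a pytest final summary line."""
    counts = {}
    # Match pairs such as "14 failed" or "229 warnings"; the duration
    # ("409.45s") has no space between number and unit, so it is not matched
    for number, outcome in re.findall(r"(\d+) ([a-z]+)", line):
        counts[outcome] = int(number)
    return counts


line = ("= 14 failed, 13540 passed, 716 skipped, 20 xfailed, "
        "1 xpassed, 229 warnings in 409.45s (0:06:49) =")
summary = parse_pytest_summary(line)
print(summary)
```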

win + blis + cpython 3.8 (first run): 11 failures

=========================== short test summary info ===========================
FAILED core/tests/test_multiarray.py::TestMatmul::test_dot_equivalent[args4]
FAILED linalg/tests/test_linalg.py::TestSolve::test_sq_cases - AssertionError...
FAILED linalg/tests/test_linalg.py::TestSolve::test_generalized_sq_cases - As...
FAILED linalg/tests/test_linalg.py::TestInv::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestInv::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestPinv::test_nonsq_cases - AssertionErr...
FAILED linalg/tests/test_linalg.py::TestPinv::test_generalized_sq_cases - Ass...
FAILED linalg/tests/test_linalg.py::TestDet::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestDet::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestMatrixPower::test_power_is_minus_one[dt13]
FAILED linalg/tests/test_linalg.py::TestCholesky::test_basic_property - Asser...
= 11 failed, 13547 passed, 714 skipped, 20 xfailed, 1 xpassed, 227 warnings in 470.35s (0:07:50) =
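
Comparing the two short summaries above shows how the failure sets overlap. The sketch below parses the `FAILED` lines as pasted and computes the difference; `failed_ids` is a hypothetical helper written for this comparison only.

```python
def failed_ids(summary_text):
    """Collect test ids from the FAILED lines of a pytest short summary."""
    ids = set()
    for line in summary_text.splitlines():
        line = line.strip()
        if line.startswith("FAILED "):
            # Drop the (possibly truncated) " - message" suffix, keep only the id
            ids.add(line[len("FAILED "):].split(" - ")[0])
    return ids


py37_rerun = failed_ids("""
FAILED core/tests/test_multiarray.py::TestMatmul::test_dot_equivalent[args4]
FAILED core/tests/test_multiarray.py::TestMatmul::test_matmul_object - Assert...
FAILED linalg/tests/test_linalg.py::TestSolve::test_sq_cases - AssertionError...
FAILED linalg/tests/test_linalg.py::TestSolve::test_generalized_sq_cases - As...
FAILED linalg/tests/test_linalg.py::TestInv::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestInv::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestPinv::test_nonsq_cases - AssertionErr...
FAILED linalg/tests/test_linalg.py::TestPinv::test_generalized_sq_cases - Ass...
FAILED linalg/tests/test_linalg.py::TestPinv::test_generalized_nonsq_cases - ...
FAILED linalg/tests/test_linalg.py::TestDet::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestDet::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestMatrixPower::test_power_is_minus_one[dt13]
FAILED linalg/tests/test_linalg.py::TestCholesky::test_basic_property - Asser...
FAILED linalg/tests/test_regression.py::TestRegression::test_lstsq_complex_larger_rhs
""")

py38_first = failed_ids("""
FAILED core/tests/test_multiarray.py::TestMatmul::test_dot_equivalent[args4]
FAILED linalg/tests/test_linalg.py::TestSolve::test_sq_cases - AssertionError...
FAILED linalg/tests/test_linalg.py::TestSolve::test_generalized_sq_cases - As...
FAILED linalg/tests/test_linalg.py::TestInv::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestInv::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestPinv::test_nonsq_cases - AssertionErr...
FAILED linalg/tests/test_linalg.py::TestPinv::test_generalized_sq_cases - Ass...
FAILED linalg/tests/test_linalg.py::TestDet::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestDet::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestMatrixPower::test_power_is_minus_one[dt13]
FAILED linalg/tests/test_linalg.py::TestCholesky::test_basic_property - Asser...
""")

# Tests that failed only in the py37 rerun, and those common to both runs
only_py37 = sorted(py37_rerun - py38_first)
common = py37_rerun & py38_first

print(f"{len(common)} failures in common")
for test_id in only_py37:
    print("only in py37 rerun:", test_id)
```

In these two logs, every py38 failure also shows up in the py37 rerun; the py37 rerun just has three more.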

martin-frbg · 2021-05-08T21:44:39Z

svd test failure is caused by a change in NaN handling within LAPACK 3.9.1 xGESDD that was merged in 0.3.15, see OpenMathLib/OpenBLAS#3225
What do I need to know to reproduce the ppc64le segfault with pypy?

mattip · 2021-05-09T03:25:36Z

> What do I need to know to reproduce the ppc64le segfault with pypy?

It most likely is a PyPy problem, the ppc64le version is not widely used and there may be bugs in the ppc64le JIT backend. @h-vetinari could you add a pypy --jit off variant to check that hypothesis?

h-vetinari · 2021-05-09T21:16:35Z

@mattip: @h-vetinari could you add a pypy --jit off variant to check that hypothesis?

Took a bit longer than I hoped, because I remembered that Isuru had already done this for a previous scipy PR (conda-forge/scipy-feedstock@da63fd6) and it needed some understanding & refactoring of the run_test.py machinery.

Also unskipped some tests (aside from being good hygiene to try removing old skips occasionally, I also didn't want to port them to the new format if unnecessary), so it could be that "new" failures arise.

recipe/meta.yaml

h-vetinari · 2021-05-09T23:14:14Z

@mattip @martin-frbg
Turning off the JIT resolved the segfault with linux-ppc + pypy + openblas.
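
For anyone wanting to try this locally, here is a rough sketch of how a test invocation with the JIT disabled could be assembled. It is not the actual change made to the recipe; `build_test_command` is a hypothetical helper, and the only assumption taken from this thread is the `pypy --jit off` invocation suggested above.

```python
import platform
import sys


def build_test_command(implementation, disable_jit):
    """Return the argv list for running the numpy test suite."""
    cmd = [sys.executable]
    if implementation == "PyPy" and disable_jit:
        # Interpreter options must come before "-m"
        cmd += ["--jit", "off"]
    cmd += ["-m", "pytest", "--pyargs", "numpy"]
    return cmd


cmd = build_test_command(platform.python_implementation(), disable_jit=True)
print(" ".join(cmd))
```

On CPython the flag is simply left out, so the same helper can serve both interpreters.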

h-vetinari · 2021-05-09T23:30:01Z

> svd test failure is caused by a change in NaN handling within LAPACK 3.9.1 xGESDD that was merged in 0.3.15, see xianyi/OpenBLAS#3225

And thanks a lot for stopping by so quickly! 😊

h-vetinari · 2021-05-11T20:49:41Z

@martin-frbg: What do I need to know to reproduce the ppc64le segfault with pypy?

@mattip: It most likely is a PyPy problem, the ppc64le version is not widely used and there may be bugs in the ppc64le JIT backend. @h-vetinari could you add a pypy --jit off variant to check that hypothesis?

@h-vetinari: Turning off the JIT resolved the segfault with linux-ppc + pypy + openblas.

Just for reference / discussion, if the pypy jit on ppc is to blame, I don't understand why it works with ppc + pypy + netlib, but not with ppc + pypy + openblas.

martin-frbg · 2021-05-12T01:25:56Z

I have no idea about the inner workings of the pypy jit, but maybe it is simply running out of stack space with the default ulimit on ppc?
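
One way to check that hypothesis would be to print the stack limits from inside the test environment. A minimal sketch using the standard-library `resource` module (Unix only); `stack_limits` is a hypothetical helper, and nothing here has been run on the ppc64le CI.

```python
import resource


def stack_limits():
    """Return the (soft, hard) stack size limits as readable strings."""
    soft, hard = resource.getrlimit(resource.RLIMIT_STACK)

    def describe(value):
        # RLIM_INFINITY means no limit is set; otherwise the value is in bytes
        if value == resource.RLIM_INFINITY:
            return "unlimited"
        return f"{value // 1024} KiB"

    return describe(soft), describe(hard)


soft, hard = stack_limits()
print(f"stack soft limit: {soft}, hard limit: {hard}")
```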

h-vetinari · 2021-06-29T19:46:36Z

Update for 1.20.3

From 20 failures out of 64 (18 of which due to numpy/numpy#18914), there is now only 1 (flaky) failure.

The good news:

segfault under ppc + pypy is gone, even when using the jit 🥳

The bad news:

win+blis runs remain flaky, failing occasionally with ~10 failures that are however numerically critical (e.g. producing all-nan where non-nan results are expected, or producing [[0, 0], [0, 0]] instead of [[1, 0], [0, 1]])

Other notable things:

investigated failures around using pytest-xdist on windows; those failures were flaky (mostly 4-5 failures due to memory errors and broken pipes) and didn't (seem to) appear when setting OPENBLAS_NUM_THREADS=1. Because of this flakiness, I'm disregarding those extra CI jobs for now. See this CI run for some examples.
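
A sketch of what such a single-threaded xdist run could look like when launched from Python. `single_threaded_blas_env` is a hypothetical helper; `OPENBLAS_NUM_THREADS` is the variable mentioned above and `-n auto` is the pytest-xdist option for one worker per CPU. The exact invocation used in CI may differ.

```python
import os


def single_threaded_blas_env(base_env=None):
    """Copy an environment and pin OpenBLAS to a single thread."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENBLAS_NUM_THREADS"] = "1"
    return env


env = single_threaded_blas_env()
cmd = ["pytest", "-n", "auto", "--pyargs", "numpy"]

# The run itself would then be started with, for example:
#   subprocess.run(cmd, env=env, check=False)
print("OPENBLAS_NUM_THREADS =", env["OPENBLAS_NUM_THREADS"])
print(" ".join(cmd))
```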

Details

lib	before	after
`numpy`	`1.20.2`	`1.20.3`
`libblas`	`3.9.0-9`	`3.9.0-9`
`blis`	`0.8.1-0`	`0.8.1-0`
`openblas`	`0.3.15-pthreads-0`	`0.3.15-pthreads-1`
`mkl`	`2021.2-389`	`2021.2-389`
`netlib`	`3.9.0-5`	`3.9.0-5`
`pypy`	`7.3.4-4`(?)	`7.3.4-4`

variant	before	after
linux + ppc64le + openblas + pypy	segfault	passes 🥳
win + blis	11 failures for py38-only for first run, 14 failures for py37-only on rerun	12 failures for py37-only
win + mkl	Occasional failures due to `The process tried to write to a nonexistent pipe.`	did not reoccur 🥳

variant	blis	mkl	netlib	openblas	sum*
linux / x86	✔️	✔️	✔️	✔️	-
linux / aarch	➖	➖	✔️	✔️	-
linux / ppc64le	➖	➖	✔️	✔️	-
osx / arm	➖	➖	✔️	✔️	-
osx / x86	✔️	✔️	✔️	✔️	-
win / x86	✔️ / ❌	✔️	✔️	✔️	1F
sum*	1F	-	-	-	1F

* sum of Failures (out of a total of 64 CI combinations being tested)

Build logs:
Azure (previously)
Drone (previously & originally)
Travis (previously & originally)

win + blis + cpython 3.7: 12 failures

=========================== short test summary info ===========================
FAILED core/tests/test_multiarray.py::TestMatmul::test_dot_equivalent[args4]
FAILED core/tests/test_multiarray.py::TestMatmul::test_matmul_object - Assert...
FAILED linalg/tests/test_linalg.py::TestSolve::test_sq_cases - AssertionError...
FAILED linalg/tests/test_linalg.py::TestSolve::test_generalized_sq_cases - As...
FAILED linalg/tests/test_linalg.py::TestInv::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestInv::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestPinv::test_generalized_sq_cases - Ass...
FAILED linalg/tests/test_linalg.py::TestPinv::test_generalized_nonsq_cases - ...
FAILED linalg/tests/test_linalg.py::TestDet::test_sq_cases - AssertionError: ...
FAILED linalg/tests/test_linalg.py::TestDet::test_generalized_sq_cases - Asse...
FAILED linalg/tests/test_linalg.py::TestMatrixPower::test_power_is_minus_one[dt13]
FAILED linalg/tests/test_linalg.py::TestCholesky::test_basic_property - Asser...
= 12 failed, 13576 passed, 716 skipped, 1 deselected, 20 xfailed, 1 xpassed, 229 warnings in 321.30s (0:05:21) =

rgommers · 2021-06-29T19:55:29Z

That's looking pretty good!

h-vetinari · 2021-08-07T18:45:05Z

Time to close this one I think... Work continues in #237.

h-vetinari · 2022-04-13T01:26:43Z

Revival (new PyPy builds and BLAS updates): all green except PPC

Since 1.20 was rebuilt for pypy3.8/3.9, and given several relevant BLAS (& infrastructure) changes in the meantime, I thought I'd revive this PR for one last update.

From 1 failure out of 64 runs, we're now at 12 failures (PPC-only) out of 108 runs.

Notable

Added accelerate BLAS flavour on osx
Testing against PyPy 3.8 and 3.9 added everywhere but for osx-arm
Big bumps for openblas, blis & MKL
In the meantime, the previous blis errors have been tracked down and fixed

Details

variant	before	after
win + blis	12 failures	fixed 🥳
linux + ppc	...	test failures due to emulation problems (on azure)
win + pypy	...	two spurious failures in `test_closing_fid`, resolved by restart
osx + pypy	...	one spurious failure in `test_may_share_memory_easy_fuzz`, resolved by restart

lib	before	after	updated version	updated build
`numpy`	`1.20.3`	`1.20.3`
`libblas`	`3.9.0-9`	`3.9.0-14`		X
`blis`	`0.8.1-0`	`0.9.0-0`	X
`openblas`	`0.3.15-pthreads-1`	`0.3.20-pthreads-0`	X
`mkl`	`2021.2-389`	`2022.0.1-803`	X
`netlib`	`3.9.0-5`	`3.9.0-5`
`pypy`	`7.3.4-4`	`7.3.9-1` (pypy38/39) `7.3.7-3` (pypy37)	X
`qemu-user-static`	?	`6.1.0-8`

variant	accelerate	blis	mkl	netlib	openblas	sum*
linux / x86	➖	✔️	✔️	✔️	✔️	-
linux / aarch	➖	➖	➖	✔️	✔️	-
linux / ppc64le	➖	➖	➖	✖️	✖️	12F
osx / arm	✔️	➖	➖	✔️	✔️	-
osx / x86	✔️	✔️	✔️	✔️	✔️	-
win / x86	➖	✔️	✔️	✔️	✔️	-
sum*	-	-	-	6F	6F	12F

* sum of Failures (out of a total of 108 CI combinations being tested)

Build logs:
Azure

h-vetinari requested review from isuruf, jakirkham, msarahan, ocefpaf, pelson, rgommers and xhochy as code owners February 10, 2021 21:09

h-vetinari marked this pull request as draft February 10, 2021 21:27

h-vetinari mentioned this pull request Feb 10, 2021

TEST: 1.19.x + blas variants #196

Closed

h-vetinari changed the title ~~WIP: 1.20.x + blas variants~~ TEST: 1.20.x + blas variants Feb 10, 2021

h-vetinari force-pushed the 1.20_blas_vars branch from b8fa1e0 to b0d4cda Compare May 8, 2021 18:37

h-vetinari closed this May 8, 2021

h-vetinari reopened this May 8, 2021

isuruf reviewed May 9, 2021

h-vetinari force-pushed the 1.20_blas_vars branch 2 times, most recently from c1a766b to 35c0828 Compare May 9, 2021 22:38

h-vetinari mentioned this pull request May 9, 2021

TestAVXFloat32Transcendental test failure numpy/numpy#15179

Closed

h-vetinari force-pushed the 1.20_blas_vars branch from 79add25 to 605098e Compare May 9, 2021 23:34

h-vetinari closed this May 11, 2021

h-vetinari reopened this May 11, 2021

h-vetinari mentioned this pull request Jun 29, 2021

TEST: 1.21.x + blas variants #237

Closed

h-vetinari closed this Jun 29, 2021

h-vetinari reopened this Jun 29, 2021

h-vetinari mentioned this pull request Jun 29, 2021

Failures with AVX512 in numpy test suite when using blis on windows (through conda-forge) flame/blis#514

Closed

h-vetinari closed this Aug 7, 2021

h-vetinari deleted the 1.20_blas_vars branch August 7, 2021 18:45

h-vetinari mentioned this pull request Jan 3, 2022

TEST: 1.22.x + blas variants #252

Closed

h-vetinari restored the 1.20_blas_vars branch April 6, 2022 21:48

h-vetinari deleted the 1.20_blas_vars branch April 6, 2022 21:56

h-vetinari restored the 1.20_blas_vars branch April 12, 2022 22:24

h-vetinari reopened this Apr 12, 2022

h-vetinari changed the base branch from master to numpy120 April 12, 2022 22:24

test for all blas variants

a18459c

h-vetinari force-pushed the 1.20_blas_vars branch from 93c7582 to 7049b7e Compare April 12, 2022 22:38

conda-forge deleted a comment from conda-forge-linter Apr 12, 2022

h-vetinari added 2 commits April 13, 2022 09:42

MNT: Re-rendered with conda-build 3.21.8, conda-smithy 3.19.0, and conda-forge-pinning 2022.04.12.21.56.28

c70b57e

Revert "skip test suite on PPC due to QEMU bugs"

c68df85

This reverts commit 7e729fa.

h-vetinari force-pushed the 1.20_blas_vars branch from 7049b7e to c68df85 Compare April 12, 2022 22:42

only run fast tests on travis

12dda56

h-vetinari closed this Apr 19, 2022

h-vetinari deleted the 1.20_blas_vars branch April 21, 2022 11:08

h-vetinari mentioned this pull request Jun 23, 2022

TEST: 1.23.x + blas variants #273

Closed

h-vetinari mentioned this pull request Jan 27, 2023

TEST: 1.24.x + blas variants #288

Closed

h-vetinari mentioned this pull request Jun 18, 2023

TEST: 1.25.x + blas variants #293

Draft

h-vetinari mentioned this pull request Jan 9, 2024

TEST: 1.26.x + blas variants #307

Draft
