BUG: nan returned by np.linalg.det while it should be 0 on arm64 mac #22025

matteoacrossi · 2022-07-22T11:56:26Z

Describe the issue:

np.linalg.det on a 4x4 real matrix returns nan, while it should be 0.

If the same code example is run on an intel64 linux machine, the determinant is zero for both matrices.

This happens both with conda-forge and pypi numpy, on both python 3.9 and 3.10.

Reproduce the code example:

import numpy as np
np.linalg.det(np.array([[1.0000000000000000e+000, 0.0000000000000000e+000,
         0.0000000000000000e+000, 0.0000000000000000e+000],
        [0.0000000000000000e+000, 1.0000000000000000e+000,
         3.8307904270117927e-146, 1.4674955295685193e-291],
        [0.0000000000000000e+000, 3.8307904270117927e-146,
         1.4674955295685193e-291, 0.0000000000000000e+000],
        [0.0000000000000000e+000, 1.4674955295685193e-291,
         0.0000000000000000e+000, 0.0000000000000000e+000]]))
/opt/homebrew/Caskroom/miniforge/base/envs/numpy-test/lib/python3.10/site-packages/numpy/linalg/linalg.py:2154: RuntimeWarning: overflow encountered in det
  r = _umath_linalg.det(a, signature=signature)
/opt/homebrew/Caskroom/miniforge/base/envs/numpy-test/lib/python3.10/site-packages/numpy/linalg/linalg.py:2154: RuntimeWarning: invalid value encountered in det
  r = _umath_linalg.det(a, signature=signature)
nan

This one instead works fine:

np.linalg.det(np.array([[1.0000000000000000e+000, 0.0000000000000000e+000,
         0.0000000000000000e+000, 0.0000000000000000e+000],
        [0.0000000000000000e+000, 1.0000000000000000e+000,
         3.8307904315347111e-146, 1.4674955330337903e-291],
        [0.0000000000000000e+000, 3.8307904315347111e-146,
         1.4674955330337899e-291, 0.0000000000000000e+000],
        [0.0000000000000000e+000, 1.4674955330337903e-291,
         0.0000000000000000e+000, 0.0000000000000000e+000]]))

I'm also attaching a .zip file with the matrices saved as .npy files in case there are numerical precision problems:
numpy_nan.zip.

test_nan.npy gives the nan while for test_ok.npy the determinant is correct.

Error message:

No response

NumPy/Python version information:

1.23.1 3.10.0 | packaged by conda-forge | (default, Nov 20 2021, 02:27:15) [Clang 11.1.0 ]

Output of np.show_config():

>>> np.show_config()
openblas64__info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None)]
    runtime_library_dirs = ['/usr/local/lib']
blas_ilp64_opt_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None)]
    runtime_library_dirs = ['/usr/local/lib']
openblas64__lapack_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None), ('HAVE_LAPACKE', None)]
    runtime_library_dirs = ['/usr/local/lib']
lapack_ilp64_opt_info:
    libraries = ['openblas64_', 'openblas64_']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None), ('BLAS_SYMBOL_SUFFIX', '64_'), ('HAVE_BLAS_ILP64', None), ('HAVE_LAPACKE', None)]
    runtime_library_dirs = ['/usr/local/lib']
Supported SIMD extensions in this NumPy install:
    baseline = NEON,NEON_FP16,NEON_VFPV4,ASIMD
    found = ASIMDHP
    not found = ASIMDDP,ASIMDFHM

The text was updated successfully, but these errors were encountered:

rossbar · 2022-07-22T13:16:44Z

Thanks for the report @matteoacrossi - since these are only 4x4 arrays, would you mind embedding the examples here in the issue? I don't like downloading random zip files from the internet :)

You can embed the examples in a fenced Python code block in markdown, like so:

```python
test_nan = np.array([<whatever>])
test_ok = np.array([<whatever>])
```

matteoacrossi · 2022-07-22T13:24:14Z

@rossbar sure, I updated the OP

rossbar · 2022-07-22T14:06:56Z

Ah in this case there may be precision problems with the text representation of the floating point numbers. I can't reproduce but maybe that's because my values aren't the same as yours. I guess we will need to find a better way to share exact values...

matteoacrossi · 2022-07-22T14:37:19Z

That is why I put the .npy files in the zip file. But I can reproduce the problem by pasting the text in the OP on my system (M1 Pro Macbook Pro)

RBrearton · 2022-07-27T14:50:59Z

FYI, with numpy 1.22.4 and python 3.10.5, the following code (copy pastable) gives nan on on my machine (macbook pro, M1 max chip) when it should clearly give 0. It's only slightly modified (simplified a bit) from OP's example.

import numpy as np

np.linalg.det(np.array([[1, 0, 0, 0],
                        [0, 1, 3.8307904270117927e-146, 0],
                        [0, 3.8307904270117927e-146, 1.4674955295685193e-291, 0],
                        [0, 0, 0, 0]]))

The first time I run the code in an interactive python environment, I also get the warning:

/path/to/env/lib/python3.10/site-packages/numpy/linalg/linalg.py:2146: RuntimeWarning: overflow encountered in det
  r = _umath_linalg.det(a, signature=signature)
/path/to/env/lib/python3.10/site-packages/numpy/linalg/linalg.py:2146: RuntimeWarning: invalid value encountered in det
  r = _umath_linalg.det(a, signature=signature)

Weirdly, the RuntimeWarning only shows up once, but the output is always nan (maybe there's just something I don't know about RuntimeWarnings and this makes sense?)

I played around with the last digit of the element whose exponent is e-291 for fun. The values 1.4674955295685193e-291 and 1.4674955295685194e-291 give NaN, but 1.4674955295685192e-291 and 1.4674955295685195e-291 give 0. The values that give 0 also, predictably, don't give a RuntimeWarning.

Looks like a fun bug. I would've been happy to dig deeper, but I have 0 idea how to investigate further.

seberg · 2022-08-08T06:02:48Z

Weirdly, the RuntimeWarning only shows up once, but the output is always nan (maybe there's just something I don't know about RuntimeWarnings and this makes sense?)

This is the default behavior of warnings in Python, you can customize it and further customize it using np.errstate.

It sounds like an OpenBLAS issue, is this in rosetta or native mode? Could you also install threadpoolctl and paste the output of threadpoolctl.threadpool_info()

matteoacrossi · 2022-08-08T06:58:20Z

I'm not using Rosetta (at least, I'm confident that I'm not).

This is the output of threadpoolctl.threadpool_info() when using the conda-forge numpy:

>>> threadpoolctl.threadpool_info()
[{'user_api': 'blas', 
  'internal_api': 'openblas', 
  'prefix': 'libopenblas', 
  'filepath': '/opt/homebrew/Caskroom/miniforge/base/envs/numpy-test/lib/python3.10/site-packages/numpy/.dylibs/libopenblas64_.0.dylib', 
  'version': '0.3.20', 
  'threading_layer': 'pthreads', 
  'architecture': 'armv8', 
  'num_threads': 10}]

When using the pip-installed numpy I get

>>> threadpoolctl.threadpool_info()
[]

Edit (mattip): reformatting

mattip · 2022-08-08T08:42:30Z

From the output above, you are running the arm64 variant and using the homebrew-provided OpenBLAS 0.3.20. It is strange that the pip-installed numpy does not show anything, I think you need to import numpy first:

python -c "import numpy, threadpoolctl, pprint; \
pprint.pprint(threadpoolctl.threadpool_info())"

OpenBLAS released 0.3.21 yesterday, I wonder if that will change anything.

matteoacrossi · 2022-08-08T08:47:20Z

Indeed, after importing numpy I got the output, sorry about that:

[{'architecture': 'armv8',
  'filepath': '/opt/homebrew/Caskroom/miniforge/base/envs/testnumpy/lib/python3.10/site-packages/numpy/.dylibs/libopenblas64_.0.dylib',
  'internal_api': 'openblas',
  'num_threads': 10,
  'prefix': 'libopenblas',
  'threading_layer': 'pthreads',
  'user_api': 'blas',
  'version': '0.3.20'}]

How do I get the latest openblas?

mattip · 2022-08-08T09:10:30Z

How do I get the latest openblas?

You can build from source (not recommended) or wait a few weeks until the conda forge feedstock is updated, and then install that. Note that it seems you are only using the pip installed numpy, to install from conda forge you would have to do

pip uninstall numpy
conda install openblas==0.3.21
conda install --force-reinstall numpy

h-vetinari · 2022-08-08T09:25:52Z

or wait a few weeks until the conda forge feedstock is updated

my goodness, I hope that's not the general impression of the time necessary for builds to arrive in conda-forge 😅

Normally it should be a routine thing (couple of days depending how quickly someone sees it). Occasionally there are other issues - seems there are some segfaults with 0.3.21 that will need solving, and in this case that means a currently indeterminate hold for analysis & fix, but hopefully it'll not be "weeks" even so.

mattip · 2022-08-08T09:50:33Z

Sorry, I didn't mean to disparage the great work the conda-forge people do to get packages out quickly.

h-vetinari · 2022-08-08T10:02:50Z

No offence taken. Long delays can & do happen, but hopefully not often enough to be considered the norm. ;-)

h-vetinari · 2022-08-11T07:46:06Z

Openblas 0.3.21 is available in conda-forge. Could you give it a whirl?

matteoacrossi · 2022-08-21T12:51:29Z

I tried updating Openblas 0.3.21 with conda-forge (checked with threadpool_info()) and I get the same nan problem. Also updated numpy to 1.23.2 but no change.

h-vetinari · 2022-08-21T12:53:38Z

Could you post the output of conda list please?

matteoacrossi · 2022-08-21T12:54:29Z

conda list
# packages in environment at /opt/homebrew/Caskroom/miniforge/base/envs/numpy-test:
#
# Name                    Version                   Build  Channel
bzip2                     1.0.8                h3422bc3_4    conda-forge
ca-certificates           2022.6.15            h4653dfc_0    conda-forge
libblas                   3.9.0           16_osxarm64_openblas    conda-forge
libcblas                  3.9.0           16_osxarm64_openblas    conda-forge
libcxx                    14.0.6               h04bba0f_0    conda-forge
libffi                    3.4.2                h3422bc3_5    conda-forge
libgfortran               5.0.0           11_3_0_hd922786_25    conda-forge
libgfortran5              11.3.0              hdaf2cc0_25    conda-forge
liblapack                 3.9.0           16_osxarm64_openblas    conda-forge
libopenblas               0.3.21          openmp_hc731615_2    conda-forge
libsqlite                 3.39.2               h2c9beb0_1    conda-forge
libzlib                   1.2.12               ha287fd2_2    conda-forge
llvm-openmp               14.0.4               hd125106_0    conda-forge
ncurses                   6.3                  h07bb92c_1    conda-forge
numpy                     1.23.2          py310h127c7cf_0    conda-forge
openssl                   3.0.5                h7aea29f_1    conda-forge
pip                       22.2.2             pyhd8ed1ab_0    conda-forge
python                    3.10.0          h43b31ca_3_cpython    conda-forge
python_abi                3.10                    2_cp310    conda-forge
readline                  8.1.2                h46ed386_0    conda-forge
setuptools                65.1.0          py310hbe9552e_0    conda-forge
sqlite                    3.39.2               h40dfcc0_1    conda-forge
threadpoolctl             3.1.0              pyh8a188c0_0    conda-forge
tk                        8.6.12               he1e0b03_0    conda-forge
tzdata                    2022c                h191b570_0    conda-forge
wheel                     0.37.1             pyhd8ed1ab_0    conda-forge
xz                        5.2.6                h57fd34a_0    conda-forge

BTW @h-vetinari, it says here that it is a known issue. Is there a issue I can subscribe to?

h-vetinari · 2022-08-21T12:58:46Z

Thanks. Looks like you're on the newest gfortran builds, which makes this yet another issue compared to the previous ones.

BTW @h-vetinari, it says here that it is a known issue. Is there a issue I can subscribe to?

Known only in the sense that I was collecting issue links I was aware of, not "known" in the sense that we're not trying to fix it. ;)
That issue is not a bad place to subscribe to.

h-vetinari · 2022-08-21T13:06:15Z

Known only in the sense that I was collecting issue links I was aware of, not "known" in the sense that we're not trying to fix it. ;)

I had missed that my overview issue had been edited. I don't have an issue for tracking this in upstream openblas unfortunately.

martin-frbg · 2022-08-24T11:46:05Z

probably older problem, may not be limited to m1 (at least i could reproduce it with armv8 target as well, while truly generic C kernels are ok - have not tried on non-apple arm64 yet though). not sure if genuinely openblas bug or just an fma accuracy effect or compiler behaviour wrt denormals. istr there are more numpy issues about linalg.det division by zero warnings ?

seberg · 2023-02-21T13:35:48Z

istr there are more numpy issues about linalg.det division by zero warnings ?

There are at least in the complex version. I tracked down the reason (on the OpenBLAS version I am using, which also shows this issue) to the code (also below) https://github.com/xianyi/OpenBLAS/blob/974acb39ff86121a5a94be4853f58bd728b56b81/lapack/getf2/zgetf2_k.c#L115-L125

	if (fabs(temp1) >= fabs(temp2)){
	  ratio = temp2 / temp1;
	  den = dp1 /(temp1 * ( 1 + ratio * ratio));
	  temp3 =  den;
	  temp4 = -ratio * den;
	} else {
	  ratio = temp1 / temp2;
	  den = dp1 /(temp2 * ( 1 + ratio * ratio));
	  temp3 =  ratio * den;
	  temp4 = -den;
	}

The issue is that it is (on my machine) compiled to:

    0x1065527e0 <+532>: 0x1e60c120   fabs   d0, d9
    0x1065527e4 <+536>: 0x1e60c141   fabs   d1, d10
    0x1065527e8 <+540>: 0x1e612000   fcmp   d0, d1
->  0x1065527ec <+544>: 0x1e6a1920   fdiv   d0, d9, d10
    0x1065527f0 <+548>: 0x1f402001   fmadd  d1, d0, d0, d8
    0x1065527f4 <+552>: 0x1e610941   fmul   d1, d10, d1
    0x1065527f8 <+556>: 0x1e611901   fdiv   d1, d8, d1
    0x1065527fc <+560>: 0x1e610800   fmul   d0, d0, d1
    0x106552800 <+564>: 0x1e614021   fneg   d1, d1
    0x106552804 <+568>: 0x1e691942   fdiv   d2, d10, d9
    0x106552808 <+572>: 0x1f422043   fmadd  d3, d2, d2, d8
    0x10655280c <+576>: 0x1e630923   fmul   d3, d9, d3
    0x106552810 <+580>: 0x1e631903   fdiv   d3, d8, d3
    0x106552814 <+584>: 0x1e614042   fneg   d2, d2
    0x106552818 <+588>: 0x1e620862   fmul   d2, d3, d2
    0x10655281c <+592>: 0x1e63bc00   fcsel  d0, d0, d3, lt
    0x106552820 <+596>: 0x1e62bc21   fcsel  d1, d1, d2, lt

(the marked line is the one that causes the warning, it divides 1 by 0.)

If you look closer, the compiler executes both branches there and merges the result or so. That is wrong w.r.t. to FPEs. (I am surprise that doing 4 instead of 2 divisions seems like a an unsafe optimization.)

Wondering if @Developer-Ecosystem-Engineering wants to have a look at that. It isn't quite related to this issue probably, but does lead to spurious warnings in the tests.

martin-frbg · 2023-02-21T14:10:35Z

Uh, I do not think the compiler should be doing this, is this GCC or LLVM being creative ? One could probably slap a #pragma on this file to disable optimization at least with "known bad" compiler versions...

seberg · 2023-02-21T14:16:33Z

Should be clang on M1, maybe more. It does seem s a bit overly creative. If these were vector registers/instructions it would be less weird...

I am seeing it with both the wheel and conda-forge openblas on M1. (I assume both is clang on Mac, but would have to dig for the versions. Probably 13.0.1 for conda, as that is what Python has.)

seberg · 2023-02-21T14:36:00Z

Not identical code, but I think this shows similar thing on clang 14+15 for armv8, I think: https://godbolt.org/z/xE96cTvh7

IIRC, MacOS versioned clang differently, so that may match up.

charris · 2023-02-21T15:02:34Z

We ran into a similar problem with gcc-11.1 where the compiler executed both branches of an if statement in parallel for speed, but did not reset the flags for the branch not taken. The problem was fixed in the next minor gcc release. A local temporary fix was to compile at a lower optimization level.

martin-frbg · 2023-02-21T15:18:12Z

@charris thanks - do you recall if this was also on M1 (or at least either of OSX/arm64) ?

charris · 2023-02-21T15:30:19Z

do you recall if this was also on M1

No, it was Intel. I only noticed because fedora is cutting edge and I upgrade twice a year, which has its drawbacks :) See #18949 for the discussion.

martin-frbg · 2023-02-21T15:53:38Z

Thx, good to know (though probably no point in trying to proactively "fix" such things everywhere in OpenBLAS' C codes). Now to pragma or not to pragma this known case of zgetf2_k.c ...

seberg · 2023-02-21T15:57:41Z

Hehe, I was hoping @Developer-Ecosystem-Engineering has a thought. Also not sure what pragma might actually work (beyond -O0).

The original issue here is maybe still more interesting. I can repro it in the 1.24.2 wheels locally on M1, but I guess that doesn't mean it isn't fixed already.

martin-frbg · 2023-02-21T16:10:17Z

I'm not even sure if clang supports more fine-grained than "optimize off" (especially if we want it to work with older versions), must admit I did not look at who`s behind the account you tagged 🙂

seberg · 2023-02-21T16:11:52Z

Ah, the account is apple engineer(s) looking into M1 specific issues mainly.

martin-frbg · 2023-02-25T23:14:26Z

Returning to the original issue, on non-OSX arm64 I can reproduce it even with a fully generic, C-only build of OpenBLAS with compiler optimizations turned off. Have not tracked down the bit of code where the (alleged) overflow occurs though.

matteoacrossi · 2023-02-26T18:01:21Z

I instead switched to accelerate from OpenBLAS and don't reproduce the problem anymore.

martin-frbg · 2023-02-26T22:28:37Z

ok, problem seems to be in OpenBLAS' lapack/getf2 (getf2_k.c) where a division by a minuscule value overflows (code prevents division by exact zero only, should probably cut off around 1.e-300 or so)

matteoacrossi added the 00 - Bug label Jul 22, 2022

h-vetinari mentioned this issue Aug 16, 2022

Problems on osx-arm (correctness & speed) conda-forge/openblas-feedstock#143

Closed

martin-frbg mentioned this issue Feb 26, 2023

Avoid overflow from division in GETF2 potentially causing NaN OpenMathLib/OpenBLAS#3924

Merged

ilayn mentioned this issue Mar 28, 2023

BUG: complex determinants broken with openblas 0.3.22 scipy/scipy#18208

Closed

martin-frbg mentioned this issue Mar 29, 2023

0.3.22 breaks functions downstream in GNU Octave OpenMathLib/OpenBLAS#3976

Closed

husisy mentioned this issue Sep 4, 2023

BUG: incorrect warning message when call np.linalg.det on apple M1 #24633

Closed

FranzForstmayr mentioned this issue Nov 13, 2023

darwin aarch64 skrf/calibration/tests/test_calibration.py failing with divide by zero (determinate calculations) scikit-rf/scikit-rf#972

Open

purva-thakre mentioned this issue May 10, 2024

macos-latest passes with warnings unitaryfund/mitiq#2357

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

BUG: nan returned by np.linalg.det while it should be 0 on arm64 mac #22025

BUG: nan returned by np.linalg.det while it should be 0 on arm64 mac #22025

matteoacrossi commented Jul 22, 2022 •

edited

Loading

rossbar commented Jul 22, 2022

matteoacrossi commented Jul 22, 2022

rossbar commented Jul 22, 2022

matteoacrossi commented Jul 22, 2022

RBrearton commented Jul 27, 2022

seberg commented Aug 8, 2022

matteoacrossi commented Aug 8, 2022 •

edited by mattip

Loading

mattip commented Aug 8, 2022

matteoacrossi commented Aug 8, 2022

mattip commented Aug 8, 2022

h-vetinari commented Aug 8, 2022

mattip commented Aug 8, 2022

h-vetinari commented Aug 8, 2022

h-vetinari commented Aug 11, 2022

matteoacrossi commented Aug 21, 2022

h-vetinari commented Aug 21, 2022

matteoacrossi commented Aug 21, 2022

h-vetinari commented Aug 21, 2022

h-vetinari commented Aug 21, 2022

martin-frbg commented Aug 24, 2022 •

edited

Loading

seberg commented Feb 21, 2023

martin-frbg commented Feb 21, 2023

seberg commented Feb 21, 2023

seberg commented Feb 21, 2023

charris commented Feb 21, 2023

martin-frbg commented Feb 21, 2023

charris commented Feb 21, 2023

martin-frbg commented Feb 21, 2023

seberg commented Feb 21, 2023

martin-frbg commented Feb 21, 2023

seberg commented Feb 21, 2023

martin-frbg commented Feb 25, 2023

matteoacrossi commented Feb 26, 2023 •

edited

Loading

martin-frbg commented Feb 26, 2023

BUG: nan returned by np.linalg.det while it should be 0 on arm64 mac #22025

BUG: nan returned by np.linalg.det while it should be 0 on arm64 mac #22025

Comments

matteoacrossi commented Jul 22, 2022 • edited Loading

Describe the issue:

Reproduce the code example:

Error message:

NumPy/Python version information:

rossbar commented Jul 22, 2022

matteoacrossi commented Jul 22, 2022

rossbar commented Jul 22, 2022

matteoacrossi commented Jul 22, 2022

RBrearton commented Jul 27, 2022

seberg commented Aug 8, 2022

matteoacrossi commented Aug 8, 2022 • edited by mattip Loading

mattip commented Aug 8, 2022

matteoacrossi commented Aug 8, 2022

mattip commented Aug 8, 2022

h-vetinari commented Aug 8, 2022

mattip commented Aug 8, 2022

h-vetinari commented Aug 8, 2022

h-vetinari commented Aug 11, 2022

matteoacrossi commented Aug 21, 2022

h-vetinari commented Aug 21, 2022

matteoacrossi commented Aug 21, 2022

h-vetinari commented Aug 21, 2022

h-vetinari commented Aug 21, 2022

martin-frbg commented Aug 24, 2022 • edited Loading

seberg commented Feb 21, 2023

martin-frbg commented Feb 21, 2023

seberg commented Feb 21, 2023

seberg commented Feb 21, 2023

charris commented Feb 21, 2023

martin-frbg commented Feb 21, 2023

charris commented Feb 21, 2023

martin-frbg commented Feb 21, 2023

seberg commented Feb 21, 2023

martin-frbg commented Feb 21, 2023

seberg commented Feb 21, 2023

martin-frbg commented Feb 25, 2023

matteoacrossi commented Feb 26, 2023 • edited Loading

martin-frbg commented Feb 26, 2023

matteoacrossi commented Jul 22, 2022 •

edited

Loading

matteoacrossi commented Aug 8, 2022 •

edited by mattip

Loading

martin-frbg commented Aug 24, 2022 •

edited

Loading

matteoacrossi commented Feb 26, 2023 •

edited

Loading