
ENH: use OpenBLAS64 bit interfaces #13956

Closed
refraction-ray opened this issue Jul 10, 2019 · 26 comments

@refraction-ray

refraction-ray commented Jul 10, 2019

Reproducing code example:

See this gist for reproducing codes and system version information.

Basically, I used intelpython3, whose Python and NumPy ship with Intel Parallel Studio XE 2019.3. The NumPy version is 1.16.1.

For a symmetric matrix with dimension greater than or equal to 32767, np.linalg.eigh() immediately returns wrong results of all zeros (no error message). Other eigen functions such as eigvalsh work as expected. The value 32767 (2^15 - 1) reminds me of possible issues with the int16 data type.

This may also be an issue in Intel MKL or the Intel distribution of NumPy or Python (very unlikely, since the issue exists across various distributions of NumPy, including the default one linked against OpenBLAS). Anyway, I will keep the issue here until I have analyzed the problem further and found a better place to report it.

@refraction-ray
Author

refraction-ray commented Jul 10, 2019

  1. Also tested with pip-installed numpy (1.16.3) and Python 3.6.5 on an Ubuntu 18.04 server. A similar issue occurred.
>>> np.__config__.show()
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

For a symmetric matrix m with dimension 32767 or larger, np.linalg.eigh(m) fails with the error

File "blahblah/python3.6/site-packages/numpy/linalg/linalg.py", line 1456, in eigh
    w, vt = gufunc(a, signature=signature, extobj=extobj)
ValueError: On entry to DORMTR parameter number 12 had an illegal value

When one tries the line np.linalg.eigh(m) a second time, Python exits with a Segmentation fault (core dumped) error.
np.linalg.eigvalsh(m), however, behaves correctly.

It is worth noting that this issue is not due to a memory limitation: diagonalization of a matrix of this dimension requires only about 12GB of memory (I watched htop for the whole process), while I have 256GB of memory on the workstation. In other words, you need more than 12GB of memory to reproduce and catch this error.

For a matrix with dimension 32766 or less, everything seems fine. Therefore, I would be surprised if this issue had nothing to do with int16.

Update: 3) Tested on another workstation with Ubuntu 16.04 and Python 3.5.2 with pip-installed NumPy 1.16.2. The problem is exactly as described above.
4) Tested on some supercomputers with RedHat 6.5, Python 3.6.3, and OpenBLAS-linked NumPy 1.14.5. For a symmetric matrix of size 32767 or larger, Python crashes with a segmentation fault on the first call to np.linalg.eigh(m). Again, eigvalsh works well, and matrices smaller than 32767 work well.
5) Tested on the same supercomputer, but with Anaconda 4.2.0, Python 3.5.2, and NumPy 1.13.1; the error is the same as in 4).

Summary: though the error shows slightly different patterns, it persists across a wide range of distributions and LAPACK backends.

Relevant posts:

@mattip
Member

mattip commented Jul 10, 2019

Short reproducer

import numpy as np
n = 32767
b = np.random.rand(n)
m_32767 = np.diag(b)
m_32767.shape
V_32767 = np.linalg.eigh(m_32767)

@seberg
Member

seberg commented Jul 11, 2019

@mattip do you have a simple setup where you can test it with BLIS or maybe MKL to see if this is numpy or openblas? Otherwise, I guess I will try to remember to test it on our new machine.

@refraction-ray
Author

refraction-ray commented Jul 15, 2019

I have identified the issue: it turns out to be an old problem related to the 32-bit int interface of LAPACK. See #5906 (comment). If I understand correctly, this issue is still present in numpy's lapack_lite interface implementation. (The 4-byte (32-bit) int input parameter lwork strongly limits the possible size of the workspace and thus the matrix dimension. To be more specific, the LAPACK routine dsyevd requires the lwork argument to be an integer of at least 1 + 6n + 2n^2. When the matrix dimension n is 2^15 - 1 = 32767, the required lwork exceeds 2^31 - 1, overflowing the 4-byte int type.)
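The overflow arithmetic above can be checked directly; a minimal sketch, computing the dsyevd workspace formula in Python's arbitrary-precision integers so only the comparison against the int32 limit is demonstrated:

```python
import numpy as np

INT32_MAX = np.iinfo(np.int32).max  # 2**31 - 1 = 2147483647

def dsyevd_min_lwork(n):
    # Minimum workspace for LAPACK dsyevd with jobz='V', per the LAPACK
    # docs: 1 + 6*n + 2*n**2. Computed here with Python big ints, so the
    # demonstration itself cannot overflow.
    return 1 + 6 * n + 2 * n * n

print(dsyevd_min_lwork(32766) <= INT32_MAX)  # True: still fits in int32
print(dsyevd_min_lwork(32767) <= INT32_MAX)  # False: overflows int32
```

This matches the reported behavior: n = 32766 works, n = 32767 fails.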

So basically, various LAPACK routines, including the eigensolvers and SVD, break when the matrix dimension is large enough. And the crashing matrix size is not really that large by modern hardware standards (a matrix of roughly O(10^4) x O(10^4), with a working memory footprint of O(10) GB).

IMHO, the need to do eigen- or SVD decompositions on matrices larger than 32767*32767 is becoming common as hardware develops. So it would be better if the 32-bit LAPACK interface in numpy were improved soon. Otherwise, I am not aware of any reasonable and simple way to do such numerical tasks in Python today (directly calling LAPACK routines from C or Fortran is always possible, though).

Temporary workaround: for eigh, use scipy.linalg.eigh(). I have inspected the source code on both sides, and it seems that eigh in numpy uses the dsyevd LAPACK routine while eigh in scipy uses the dsyevr routine. The workspace requirement for dsyevr is small (O(N)) and thus free from 32-bit int overflow even for matrices with large dimensions.
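The difference between the two routines' workspace growth can be sketched numerically. The exact minimum-lwork formulas below are taken from the LAPACK documentation and should be treated as assumptions for this sketch:

```python
# Minimum workspace requirements per the LAPACK docs (assumed formulas):
#   dsyevd (used by numpy.linalg.eigh):  lwork >= 1 + 6*n + 2*n**2  (quadratic)
#   dsyevr (used by scipy.linalg.eigh):  lwork >= 26*n              (linear)

INT32_MAX = 2**31 - 1
n = 32767

lwork_syevd = 1 + 6 * n + 2 * n * n  # quadratic in n: overflows int32 here
lwork_syevr = 26 * n                 # linear in n: stays small

print(lwork_syevd > INT32_MAX)  # True: exceeds a 32-bit LAPACK integer
print(lwork_syevr)              # 851942, far below the 32-bit limit
```

This is consistent with dsyevr-based scipy.linalg.eigh working at sizes where dsyevd-based numpy.linalg.eigh fails.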

@charris
Member

charris commented Jul 15, 2019

Yep. I was discussing this yesterday. I think we should open an issue so we could optionally use libraries compiled with 64-bit integers. They aren't common yet, so it should be a flag, and there needs to be some way of checking what the libraries use, but the time is coming, if not already past, when 32 bits will no longer serve.

@eric-wieser Is it possible to compile the current fallback library with 64 bit integers? IIRC, it is a typedef.
@tylerjereddy Do you know if it is possible to compile OpenBLAS with 64 bit integers?
@stefanv Maybe we should put this on the road map, possibly along with quad precision.
@rgommers @pv Any thoughts about scipy in this regard?

I know that most modern Fortran compilers have a flag that chooses between 32 and 64 bit indexes, so theoretically, it can be done. The problem might be compatibility issues from having NumPy out there with different precisions.

@matthew-brett
Contributor

I believe Julia already uses 64 bit BLAS via OpenBLAS - e.g. JuliaLang/julia#4923

Previous discussion on Numpy lapack_lite: #5906

@seberg
Member

seberg commented Jul 16, 2019

Tagging with 1.18.0, not that we must fix it by then, but it would be awesome if we can make some progress here and do not forget about it.

@tylerjereddy
Contributor

@isuruf & @martin-frbg may also be able to think of some roadblocks for this, if there are any.

@martin-frbg

Can't think of any roadblocks; it might actually make sense to use this example in the OpenBLAS FAQ and in descriptions of the INTERFACE64 build parameter. (I guess the current wording in Makefile.rule, dating back to GotoBLAS, could be both confusing and discouraging.)

@isuruf
Contributor

isuruf commented Jul 18, 2019

Note that for scipy, you'll have to keep providing the 32-bit int interface even if you switch to a 64-bit interface, to avoid breaking downstream packages that use cimport scipy.linalg.cython_blas. This is doable with prefixed symbols.

@pv
Member

pv commented Nov 29, 2019

#15012 implements Julia's approach to this problem. (i) Build OpenBLAS with make INTERFACE64=1 SYMBOLSUFFIX=64_ to get libopenblas64_.so. (ii) export NPY_USE_BLAS64_=1 and build numpy to get 64-bit BLAS/LAPACK used everywhere. (iii) The symbol suffix should prevent the ABI clashes that usually plague 64-bit BLAS/LAPACK, so things don't segfault if you also use a Python library linked to 32-bit BLAS, or embed Python in an application linked to 32-bit BLAS.

@seberg seberg modified the milestones: 1.19.0 release, 1.20.0 release May 6, 2020
@mattip
Member

mattip commented May 6, 2020

We have been testing 64-bit OpenBLAS for a while now. We should start releasing 64-bit wheels and see what breaks.

@charris
Member

charris commented Nov 23, 2020

Pushing this off again. Perhaps we should go to 64 bits for the 1.21 release?

@charris
Member

charris commented May 4, 2021

@mattip @seberg We should discuss using the 64 bit libraries for the 1.21 wheels. If not for 1.21, we should plan on releasing 64 bit wheels for 1.22.0.dev. I'd like to standardize on 64 bits going forward, although we should check that the dtype bundled functions also support it.

@charris charris added the triage review Issue/PR to be discussed at the next triage meeting label May 4, 2021
@seberg
Member

seberg commented May 4, 2021

Good idea to package 64-bit OpenBLAS in the future. What do you mean by the "dtype bundled functions", the NumPy fallbacks?

@charris
Member

charris commented May 4, 2021

"dtype bundled functions"

I was thinking of dtype->f->dotfunc, which may already be fixed.

@seberg
Member

seberg commented May 4, 2021

I just checked, our ->dotfunc seems to be more or less fine. It works around blas limitations and our path uses intp.

@mattip mattip changed the title Wrong results on linalg.eigh for matrix with dimension larger than 32767 ENH: use OpenBLAS64 bit interfaces May 5, 2021
@mattip mattip modified the milestones: 1.21.0 release, 1.22.0 release May 5, 2021
@mattip
Member

mattip commented May 5, 2021

Please remind us to ship 64-bit OpenBLAS in the wheels after the 1.21 release, so there will be time to test it before 1.22.

@mattip mattip removed the triage review Issue/PR to be discussed at the next triage meeting label May 5, 2021
@h-vetinari
Contributor

Please remind us to ship 64-bit OpenBLAS in the wheels after the 1.21 release, so there will be time to test it before 1.22.

1.21.0 has shipped - time to try this out?

@mattip
Member

mattip commented Jun 28, 2021

Sure. This should be a change to use NPY_USE_BLAS_ILP64 in the MacPython/numpy-wheels repo (similar to the way it is done in this repo's CI), plus a release note here.

@rgommers
Member

This change was made, and I think we're happy that it's working. Despite some uncertainty about whether it broke something in SciPy, IIRC the conclusion was that that was unrelated. So I think we're shipping 1.22.0rc1 with a 64-bit OpenBLAS, right @charris?

@charris
Member

charris commented Nov 12, 2021

So I think we're shipping 1.22.0rc1 with a 64-bit OpenBLAS

I am planning on it. The change was made early in the release cycle, so the major downstream projects should have tested against it.

@charris charris closed this as completed Nov 12, 2021
@charris
Member

charris commented Nov 12, 2021

I'm going to close this for now. If we need to change things later we can reopen.

@pearu
Contributor

pearu commented Jan 10, 2024

The issue of np.linalg.eigh returning wrong results or crashing is still real (using numpy 1.26.2):

>>> import numpy as np
>>> n=32767
>>> b=np.random.rand(n)
>>> m_32767=np.diag(b)
>>> m_32767.shape
(32767, 32767)
>>> V_32767=np.linalg.eigh(m_32767)
 ** On entry to DSTEDC parameter number  8 had an illegal value
 ** On entry to DORMTR parameter number 12 had an illegal value
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/pearu/miniconda3/envs/jax-cuda-dev/lib/python3.11/site-packages/numpy/linalg/linalg.py", line 1487, in eigh
    w, vt = gufunc(a, signature=signature, extobj=extobj)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/pearu/miniconda3/envs/jax-cuda-dev/lib/python3.11/site-packages/numpy/linalg/linalg.py", line 118, in _raise_linalgerror_eigenvalues_nonconvergence
    raise LinAlgError("Eigenvalues did not converge")
numpy.linalg.LinAlgError: Eigenvalues did not converge

where the exception "Eigenvalues did not converge" is very likely wrong and misleading. With n == 32766, the above example works fine.

The underlying problem is that the expression 1 + 6 * n + 2 * n * n used to compute lwork for the LAPACK syevd function overflows when n == 32767. This problem is not unique to syevd; it exists for all LAPACK functions whose work array sizes are quadratic with respect to the input size.

Switching to a LAPACK implementation that uses 64-bit integer inputs seemingly resolves the overflow issue, but in fact it just becomes harder to reproduce: the critical size becomes n == 2147483647, where the issue re-emerges as the lwork expression above overflows int64.

I have implemented a solution to the same problem in JAX (google/jax#19288) that will lead to an overflow exception rather than wrong results or crashes. I think something similar is appropriate for NumPy as well.
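A guard in the spirit of that fix can be sketched as follows. The function name and its placement are hypothetical (this is not the actual JAX or NumPy code); the idea is to compute the workspace size with Python's arbitrary-precision integers and raise instead of letting the value silently wrap in the LAPACK integer type:

```python
import numpy as np

def checked_syevd_lwork(n, lapack_int=np.int32):
    # Hypothetical overflow guard (inspired by google/jax#19288):
    # compute the dsyevd workspace size in exact integer arithmetic and
    # raise rather than silently overflowing the LAPACK integer type.
    lwork = 1 + 6 * n + 2 * n * n
    if lwork > np.iinfo(lapack_int).max:
        raise OverflowError(
            f"dsyevd workspace {lwork} for n={n} does not fit in "
            f"{np.dtype(lapack_int).name}"
        )
    return lwork
```

With the 32-bit interface this raises at n = 32767; with a 64-bit interface (lapack_int=np.int64) the same guard would only trigger near n = 2**31, matching the thresholds described above.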

@pearu pearu reopened this Jan 10, 2024
@rgommers
Member

Can you please open a new issue instead @pearu? 64-bit OpenBLAS in wheels was implemented a long time ago and is a large feature that is not about this particular bug.

@pearu
Contributor

pearu commented Jan 10, 2024

@rgommers , done in #25564.

@pearu pearu closed this as completed Jan 10, 2024