ValueError in scipy.linalg.eigvalsh for large matrix #8205

mgbukov · 2017-12-07T22:46:12Z

scipy.linalg.eigvalsh() raises a ValueError for large matrices. The bug appears for arbitrary matrices, and it also shows up in trivial examples, such as a large identity matrix or a large diagonal matrix with random coefficients in [0,1].

Reproducing code example:

import numpy as np 
import scipy.linalg as la

H=np.eye(6470)
#np.random.seed(0)
#H=np.diag(np.random.uniform(size=6470))
E=la.eigvalsh(H)

Error message:

Traceback (most recent call last):
  File "example0.py", line 7, in <module>
    E=la.eigvalsh(H)
  File "/Users/mbukov/anaconda3/lib/python3.6/site-packages/scipy/linalg/decomp.py", line 734, in eigvalsh
    check_finite=check_finite)
  File "/Users/mbukov/anaconda3/lib/python3.6/site-packages/scipy/linalg/decomp.py", line 384, in eigh
    iu=a1.shape[0], overwrite_a=overwrite_a)
ValueError: On entry to DSBRDB parameter number 12 had an illegal value

Scipy/Numpy/Python version information:

0.19.1 1.13.0 sys.version_info(major=3, minor=6, micro=2, releaselevel='final', serial=0)

ilayn · 2017-12-07T23:00:18Z

Hi @mgbukov, this is a duplicate of #6666. I've tested again with SciPy 1.0 and 1.1 and I still can't reproduce it. It might be a conda/LAPACK/MKL combination bug. I'm on OpenBLAS, built with:

>>> la.lapack.ilaver()
(3, 7, 0)

mgbukov · 2017-12-08T00:57:08Z

Hi @ilayn, it could be that this is a LAPACK issue. In my case, I also have

>>> la.lapack.ilaver()
(3, 7, 0)

ilayn · 2017-12-08T07:32:22Z

Judging by Google search results, this looks like a bug specific to MKL rather than LAPACK.

chris-n-self · 2017-12-20T13:41:01Z

I'm having this problem too, I recently changed from homebrew python (on mac OS X) to anaconda and that's when it started happening (https://stackoverflow.com/questions/47836266/error-when-diagonalising-large-matrices-using-anaconda-scipy). So perhaps it is an anaconda bug and should be reported to them.

ilayn · 2017-12-20T14:35:47Z

@chris-n-self I've checked it again, and the routines reported here (such as DSBRDB and ZHBRDB) don't seem to be Reference LAPACK routines. Instead, they only turn up in MKL-related searches. From the naming, they sound like routines that reduce a banded symmetric/Hermitian matrix to (again?) a banded matrix.

That tells me this has to be reported to Intel MKL, which is a particular implementation, not to mention that we can't fix anything about LAPACK bugs ourselves. This is beyond our reach, and I'm almost certain it is beyond Anaconda's too.

If you can test a non-MKL SciPy inside Anaconda, that would settle the issue.
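A minimal sketch of how one might check which backend a given SciPy build is using, assuming `scipy.show_config()` and `scipy.linalg.lapack.ilaver()` are available in the installed version. The helper name `describe_backend` is made up for illustration and is not part of SciPy.

```python
import numpy as np
import scipy
import scipy.linalg as la


def describe_backend():
    """Collect version information for a bug report (hypothetical helper)."""
    return {
        "numpy": np.__version__,
        "scipy": scipy.__version__,
        # ilaver() reports the LAPACK version the linked library claims to
        # implement; it does not say whether that library is MKL or OpenBLAS.
        "lapack": tuple(int(v) for v in la.lapack.ilaver()),
    }


if __name__ == "__main__":
    print(describe_backend())
    # Prints the BLAS/LAPACK build configuration; look for "mkl" or "openblas".
    scipy.show_config()
```

Because two different backends can report the same `ilaver()` tuple, the build configuration output is the more telling part when comparing an MKL environment against a non-MKL one.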

brad-alt · 2017-12-22T04:22:53Z

@ilayn I'm able to reproduce this error in conda Python 2.7 using MKL.
For me the call to scipy.linalg.eigvalsh resolves to the MKL function dsyevr. I wrote a C program that calls MKL's Fortran function dsyevr, and it successfully computed the eigenvalues of a matrix of size 6470+; I tried some much larger matrices without issue as well. There is still probably an issue with MKL, since this all works when SciPy is using OpenBLAS. However, perhaps the MKL issue is specific to how SciPy calls dsyevr (it has work size parameters that should be optimized), and there may be a workaround. I guess I'll look into the Cython code to understand that better, which may reveal the MKL issue...

Summary of test:
My code is basically Intel's example for dsyevr, but I increased the array size, and the arrays had to be dynamically allocated to prevent a segmentation fault (I can post the code if interested): https://software.intel.com/sites/products/documentation/doclib/mkl_sa/11/mkl_lapack_examples/dsyevr_ex.c.htm

Here is the make and test run:
brad@brad-G11CD:/projects/iss8205/dsyevr_mkl$ make
cc -m64 -I/opt/intel/compilers_and_libraries_2018.1.163/linux/mkl/include test.c -Wl,--start-group /opt/intel/compilers_and_libraries_2018.1.163/linux/mkl/lib/intel64/libmkl_intel_lp64.a /opt/intel/compilers_and_libraries_2018.1.163/linux/mkl/lib/intel64/libmkl_sequential.a /opt/intel/compilers_and_libraries_2018.1.163/linux/mkl/lib/intel64/libmkl_core.a -Wl,--end-group -lpthread -lm -ldl -o test
brad@brad-G11CD:/projects/iss8205/dsyevr_mkl$ ./test
DSYEVR Example Program Results
The total number of eigenvalues found:6470

ilayn · 2017-12-22T07:09:19Z

@brd4790 Thank you for the analysis. Indeed, I've seen that we don't query liwork and lwork as we should, but instead go with the minimum required sizes.

Did you call DSYEVR thrice, supplying -1 for lwork and liwork? If yes, could you please also test in your C program whether lowering liwork to 10*n and lwork to 26*n produces problems?

brad-alt · 2017-12-22T12:52:06Z

@ilayn thanks for the suggestion! That does indeed reproduce the error, and it returns all-zero eigenvalues. As noted in #6666, this works for n = 6143 and fails for n > 6143. I can report this to Intel. Otherwise, what do you suggest? Is it worth optimizing the work parameters to work around this issue? Thanks!

Error message was "Intel MKL ERROR: Parameter 12 was incorrect on entry to DSBRDB."

ilayn · 2017-12-22T13:01:03Z

@brd4790 Ouch. No, that means the blame is probably on us in the f2py wrapper 😃 I'll try to compile a PR to fix this as soon as I can confirm it locally.

In the meantime which lwork and liwork values did you use in your previous C program if I may ask? I mean the one that didn't give any errors.

Thank you for dissecting this further for us by the way.

EDIT: By the way, this still shows that MKL has an internal bug, probably because it assumes the optimal lwork size. Nevertheless, we should have used lwork and liwork queries anyhow.
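For readers unfamiliar with workspace queries, here is a rough sketch of the idea from Python. It assumes SciPy 1.5 or later, where `scipy.linalg.lapack` exposes `dsyevr_lwork` alongside `dsyevr`; the function name `eigvals_with_queried_workspace` is hypothetical, and this is not the code used inside `eigh`.

```python
import numpy as np
from scipy.linalg import lapack


def eigvals_with_queried_workspace(a):
    """Eigenvalues of a real symmetric matrix via dsyevr with queried work sizes."""
    n = a.shape[0]
    # Workspace query: the wrapper calls LAPACK with lwork = liwork = -1 and
    # returns the sizes the library would like to be given.
    work, iwork, info = lapack.dsyevr_lwork(n, lower=1)
    if info != 0:
        raise RuntimeError(f"workspace query failed with info={info}")
    lwork, liwork = int(work), int(iwork)

    # Actual computation, eigenvalues only, using the queried sizes.
    w, z, m, isuppz, info = lapack.dsyevr(
        a, compute_v=0, lower=1, lwork=lwork, liwork=liwork
    )
    if info != 0:
        raise RuntimeError(f"dsyevr failed with info={info}")
    return w[:m], (lwork, liwork)


if __name__ == "__main__":
    vals, sizes = eigvals_with_queried_workspace(np.eye(50))
    print(vals[:3], sizes)
```

The point of the two-step call is that the library, not the caller, decides how much scratch space it wants, which sidesteps any assumption about the documented minimum being sufficient.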

brad-alt · 2017-12-22T22:32:24Z

@ilayn right, that's what I was thinking: MKL has a bug, since dsyevr is not living up to its documented contract. I'll still report that to Intel for what it's worth.

To calculate the optimal values I called dsyevr with lwork and liwork set to -1... I have no idea what the optimization is doing. For n = 6144, the query returns lwork = 768000 and liwork = 61440. So liwork is at the minimum, but lwork is not.
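The numbers above can be checked against the minimum sizes mentioned earlier in the thread (26*n for lwork, 10*n for liwork). The helper name below is made up for illustration.

```python
def dsyevr_min_workspace(n):
    """Minimum (lwork, liwork) for DSYEVR: max(1, 26*n) and max(1, 10*n)."""
    return max(1, 26 * n), max(1, 10 * n)


n = 6144
min_lwork, min_liwork = dsyevr_min_workspace(n)
# Values returned by the workspace query, as quoted in the comment above.
reported_lwork, reported_liwork = 768000, 61440

print(min_lwork, min_liwork)          # 159744 61440
print(reported_lwork // n)            # 125
print(reported_liwork == min_liwork)  # True
```

So for n = 6144 the queried lwork is 125*n, roughly 4.8 times the documented minimum, while the queried liwork equals the minimum.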

happy to help, thank you!

ilayn · 2017-12-23T16:11:45Z

@brd4790 It seems that the wrapper did not expose every possibility, and while modifying it I found out that a proper fix will break the <s/d>evr signatures. I've already fixed the wrapper, but it needs a dev-team decision, so apparently it will take some more discussion to fix this. Take a look at the linked PR above.

brad-alt · 2017-12-23T18:41:03Z

@ilayn thanks, that's cool.. I've never seen pyf signature file before. Just out of curiosity, what's the issue with marking the work sizes as optional and leaving the default values at the minimums?

brad-alt · 2017-12-24T12:33:56Z

FYI: Intel responded to me and referenced a similar issue in 2018 "MKLD-3350"
link: intel MKL topic 753935

ilayn · 2017-12-24T13:03:45Z

Interesting, yesterday I got the same problem with DORMQR having an illegal parameter when I exposed all the inputs/outputs. There might be a bit more to this story then.

brad-alt · 2018-10-05T11:17:10Z

@ilayn FYI Intel has fixed the MKL bug with minimum work sizes in release 19.0
https://software.intel.com/en-us/forums/intel-math-kernel-library/topic/753935

hzlucq · 2019-03-04T20:38:20Z

@ilayn, I have IMKL/2019.2.187, numpy-1.16.2, scipy-1.2.1 and python/3.7.0, but I still get the same error with a Hermitian matrix, while a symmetric matrix works correctly, as shown by this simple example:
#################
(ENV_mkl_scipy) [luhuizho@build-node ~]$ cat simple.py
import numpy as np
from scipy.linalg import eigvalsh
print('*'*10 + ' symm matrix ' + '*'*10)
eigvalsh(np.eye(6144))
print('*'*10 + ' h matrix ' + '*'*10)
eigvalsh(np.eye(6144)*1j)
##############
********** symm matrix **********
********** h matrix **********

Intel MKL ERROR: Parameter 12 was incorrect on entry to ZHBRDB.

We don't have trouble with numpy/scipy+openblas.

Could you help ?

thanks,

HuiZhong

marvinlenk · 2019-11-07T16:49:48Z

The problem still exists for me, using Anaconda 2019.10-py37 with mkl 2019.4 and scipy 1.3.1. A rather comprehensive description of my problem can be found here: https://stackoverflow.com/questions/54314529/mkl-error-parameter-12-for-large-matrices-with-scipy-linalg-eigvalsh-in-an

To make a long story short - I used the scipy lapack wrappers to do what is written in the eigvalsh (or rather eigh) routine and it reproduces the same error as mentioned right at the beginning, spitting out

Intel MKL ERROR: Parameter 12 was incorrect on entry to ZHBRDB

Then I checked the zheev function in Fortran (the same function that eigh uses), linked against the same MKL as SciPy, and it worked. So I guess the problem is hidden in the wrapper, probably in how the work sizes are calculated in combination with MKL, since non-MKL versions work perfectly fine. Deeper analysis is beyond my skills at the moment.

ev-br · 2019-11-07T16:58:31Z

> I used the scipy lapack wrappers to do what is written in the eigvalsh (or rather eigh) routine and it reproduces the same error

and

> Then I checked the zheev function in Fortran - so the same function that eigh uses

It would be helpful if you could add here both programs/scripts.

ilayn · 2020-01-07T07:17:50Z

@marvinlenk #11304 I've renewed the wrappers for evr family. Unfortunately, the signature has changed as a result of this unification. Would be great if you can give any feedback you might have. I've already tested your SO example and all works, seemingly, OK .
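As an illustration of the user-facing side of that rewrite, here is a small sketch assuming SciPy 1.5 or later, where `eigvalsh` accepts the `driver` and `subset_by_index` keywords. The matrix is a small random Hermitian one made up for the example, not the matrix from the Stack Overflow question.

```python
import numpy as np
from scipy.linalg import eigvalsh

# Build a small complex Hermitian test matrix (illustrative data only).
rng = np.random.default_rng(0)
n = 200
b = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
h = (b + b.conj().T) / 2

# All eigenvalues, explicitly requesting the ?heevr-based driver.
all_vals = eigvalsh(h, driver="evr")

# Only the three smallest eigenvalues (indices 0..2, inclusive).
lowest3 = eigvalsh(h, driver="evr", subset_by_index=[0, 2])

print(all_vals.shape, lowest3.shape)
```

Running the same calls with a larger `n` on an MKL-backed install is one way to confirm whether the original error still appears in a given environment.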

marvinlenk · 2020-09-17T11:31:54Z

> @marvinlenk #11304 I've renewed the wrappers for evr family. Unfortunately, the signature has changed as a result of this unification. Would be great if you can give any feedback you might have. I've already tested your SO example and all works, seemingly, OK .

@ilayn Sorry for the late reply - it seems to be working like a charm now, thank you very much!

ilayn added the defect and scipy.linalg labels Dec 8, 2017

mgbukov mentioned this issue Dec 9, 2017

report eigvalsh() bug to SciPy QuSpin/QuSpin#3

Open

ilayn mentioned this issue Dec 23, 2017

Add Eigenvalue Range Functionality for Symmetric Eigenvalue Problems #6510

Closed

ilayn mentioned this issue Apr 12, 2018

Intel MKL ERROR in scipy.linalg.eigvalsh for complex matrix larger than 1999 #8713

Closed

ilayn mentioned this issue Sep 4, 2018

EIGH very very slow --> suggesting an easy fix #9212

Closed

dweigand mentioned this issue Nov 13, 2018

Intel MKL error breaks wigner.qfunc qutip/qutip#937

Closed

ilayn mentioned this issue Dec 22, 2019

Hermitian Eigenvalue Problem eigh() API and wrapper change proposal #11262

Closed

ilayn mentioned this issue Jan 2, 2020

ENH: MAINT: Rewrite of eigh() and relevant wrappers #11304

Merged

ilayn added this to the 1.5.0 milestone Feb 11, 2020

ilayn closed this as completed in #11304 Feb 11, 2020

ilayn mentioned this issue Mar 17, 2020

linalg tests failing in runtests.py #11601

Closed

MrRobot2211 mentioned this issue Apr 11, 2021

Updating superop module tests allowing rectangular channels/channels with tracing representation qutip/qutip#1491

Closed
