ValueError in scipy.linalg.eigvalsh for large matrix #8205
Comments
Hi @ilayn, this could be a LAPACK issue. In my case, I also have
Judging by the Google search results, this looks like a bug specific to MKL rather than LAPACK.
I'm having this problem too. It started when I recently switched from Homebrew Python (on Mac OS X) to Anaconda (https://stackoverflow.com/questions/47836266/error-when-diagonalising-large-matrices-using-anaconda-scipy). So perhaps it is an Anaconda bug and should be reported to them.
@chris-n-self I've checked again, and the routines reported here don't seem to be Reference LAPACK routines. Instead, they show up in searches related to MKL (such as DSBRDB, ZHBRDB). From the naming, they sound like routines that reduce a banded symmetric/Hermitian matrix to (again?) a banded matrix. That tells me this has to be reported to Intel MKL, which is a particular implementation; besides, we can't fix LAPACK bugs ourselves anyway. This is beyond our reach, and I'm almost certain beyond Anaconda's too. If you can test a non-MKL SciPy inside Anaconda, that would settle the issue.
@ilayn I'm able to reproduce this error in conda Python 2.7 using MKL - summary of test Here is the make and test run:
@brd4790 Thank you for the analysis. Indeed I've seen that we don't query Did you call DSYEVR thrice with
@ilayn Thanks for the suggestion! That does indeed reproduce the error and returns all-zero eigenvalues. As noted in #6666, this works for n = 6143 and fails for n > 6143. I can report this to Intel. Otherwise, what do you suggest? Is it worth optimizing the work parameters to work around this issue? Thanks! The error message was "Intel MKL ERROR: Parameter 12 was incorrect on entry to DSBRDB."
@brd4790 Ouch. No, that means the blame is probably on us, in the f2py wrapper 😃 I'll try to put together a PR to fix this as soon as I can confirm it locally. In the meantime, which lwork and liwork values did you use in your previous C program, if I may ask? I mean the one that didn't give any errors. Thank you for dissecting this further for us, by the way. EDIT: This still shows that MKL has an internal bug, probably because it assumes the optimal lwork size, but we should have used lwork and liwork queries anyhow.
@ilayn Right, that's what I was thinking: MKL has a bug, since dsyevr is not living up to its documented contract. I'll still report that to Intel, for what it's worth. To calculate the optimal values I called dsyevr with lwork and liwork set to -1. I have no idea what the optimization is doing. For n = 6144, the query returns lwork = 768000 and liwork = 61440. So liwork is at the minimum, but lwork is not. Happy to help, thank you!
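The gap between the documented minimum and the queried optimum can be checked with a few lines of arithmetic. The sketch below assumes the minimum workspace sizes stated in the reference LAPACK documentation for DSYEVR (LWORK >= max(1, 26*N) and LIWORK >= max(1, 10*N)). The helper names are hypothetical, and the queried values are the ones reported above for n = 6144.

```python
def dsyevr_min_lwork(n: int) -> int:
    """Documented minimum size of the real work array for DSYEVR (assumed: 26*N)."""
    return max(1, 26 * n)


def dsyevr_min_liwork(n: int) -> int:
    """Documented minimum size of the integer work array for DSYEVR (assumed: 10*N)."""
    return max(1, 10 * n)


n = 6144
# Values returned by the workspace query (lwork = liwork = -1), as reported in the thread.
queried_lwork, queried_liwork = 768000, 61440

print("min lwork: ", dsyevr_min_lwork(n), " queried:", queried_lwork)
print("min liwork:", dsyevr_min_liwork(n), " queried:", queried_liwork)
```

Under these formulas the queried liwork equals the minimum, while the queried lwork is well above it, which matches the observation that only liwork sits at the minimum.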
@brd4790 It seems that the wrapper did not expose every possibility, and while modifying it I've found out that a proper fix will break the
@ilayn Thanks, that's cool. I've never seen a pyf signature file before. Just out of curiosity, what's the issue with marking the work sizes as optional and leaving the default values at the minimums?
FYI: Intel responded to me and referenced a similar issue in 2018, "MKLD-3350".
Interesting, yesterday I got the same problem with
@ilayn FYI Intel has fixed the MKL bug with minimum work sizes in release 19.0.
@ilayn, I have IMKL/2019.2.187, numpy-1.16.2, scipy-1.2.1 and python/3.7.0, but I still get the same error with a Hermitian matrix, while a symmetric matrix gives the correct result, as shown with the simple example code: Intel MKL ERROR: Parameter 12 was incorrect on entry to ZHBRDB. We don't have any trouble with numpy/scipy + OpenBLAS. Could you help? Thanks, HuiZhong
The problem still exists for me, using Anaconda 2019.10-py37 with mkl 2019.4 and scipy 1.3.1. A rather comprehensive description of my problem can be found here: https://stackoverflow.com/questions/54314529/mkl-error-parameter-12-for-large-matrices-with-scipy-linalg-eigvalsh-in-an To make a long story short: I used the scipy lapack wrappers to do what is written in the eigvalsh (or rather eigh) routine, and it reproduces the same error as mentioned right at the beginning, printing "Intel MKL ERROR: Parameter 12 was incorrect on entry to ZHBRDB". Then I checked the zheev function in Fortran (the same function that eigh uses), linked against the same MKL as scipy, and it worked. So I guess the problem is hidden in the wrapper, probably in how the work sizes are calculated in combination with MKL, since non-MKL versions work perfectly fine. A deeper analysis is currently beyond my skills.
It would be helpful if you could add both programs/scripts here.
@marvinlenk In #11304 I've renewed the wrappers for the evr family. Unfortunately, the signature has changed as a result of this unification. It would be great if you could give any feedback you might have. I've already tested your SO example and everything seems to work OK.
@ilayn Sorry for the late reply. It seems to be working like a charm now, thank you very much!
scipy.linalg.eigvalsh() throws ValueError() for large matrices. The bug appears for arbitrary matrices. However, it also shows up in trivial examples, such as a large identity matrix and large diagonal matrices with random coefficients in [0,1].
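A minimal sketch of the kind of check described above, assuming NumPy and SciPy are installed. This is not the reporter's original script, and the helper name identity_eigenvalues_ok is hypothetical. It builds an identity matrix, calls scipy.linalg.eigvalsh, and reports whether all eigenvalues come back as 1.

```python
import numpy as np
from scipy import linalg


def identity_eigenvalues_ok(n: int) -> bool:
    """Return True if eigvalsh reports every eigenvalue of the n x n identity as 1."""
    a = np.eye(n)
    try:
        w = linalg.eigvalsh(a)
    except ValueError as exc:
        # The thread reports a ValueError on affected builds for large n.
        print(f"n={n}: eigvalsh raised ValueError: {exc}")
        return False
    return bool(np.allclose(w, 1.0))


if __name__ == "__main__":
    # Small size for a quick check. The thread reports failures only for
    # n > 6143 on affected MKL builds; a 6144 x 6144 float64 matrix needs
    # roughly 300 MB of memory.
    print(identity_eigenvalues_ok(64))
```

Calling identity_eigenvalues_ok(6144) on an MKL-linked and on a non-MKL build would be one way to compare the two, memory permitting.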
Reproducing code example:
Error message:
Scipy/Numpy/Python version information: