(numpy) svd segfault with some big matrices #225

Closed
ChristophHaag opened this issue May 27, 2013 · 8 comments

@ChristophHaag

OpenBLAS 0.2.6 is built with USE_OPENMP=1 NO_LAPACK=1 NUM_THREADS=4.
NumPy is 1.7.1 for Python 3; the same happens with Python 2.

Test case:

#!/usr/bin/env python3 
import numpy as np
U, s, VT = np.linalg.svd(np.ones(shape=(51, 64000)), full_matrices=False)

I have not investigated the matrix size thoroughly, but this is about the smallest shape at which it segfaults.
If you make it bigger in one or both dimensions, it still segfaults.
If you make it smaller in either dimension, it seems to work: for example, (50, 1000000) and (1000, 50000) work fine for me, but I haven't tested much beyond that.
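
To narrow down the failing shapes, each trial can be run in its own interpreter so that a segfault in one trial does not take down the script driving the search. The following is only a sketch of that idea, not something from the original report: the helper names `run_snippet`, `describe`, and `probe` are made up for illustration, and the child snippet assumes NumPy is importable and linked against the OpenBLAS build under test.

```python
import subprocess
import sys

# Source run in a child interpreter; {rows} and {cols} are filled in per trial.
# Assumes NumPy is installed in the interpreter given by sys.executable.
SVD_SNIPPET = (
    "import numpy as np\n"
    "np.linalg.svd(np.ones(shape=({rows}, {cols})), full_matrices=False)\n"
)


def run_snippet(code):
    """Run Python source in a fresh interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


def describe(returncode):
    """Label an exit status; on POSIX a negative value means 'killed by signal'."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        return "killed by signal %d" % -returncode
    return "exit code %d" % returncode


def probe(shapes, snippet=SVD_SNIPPET):
    """Map each (rows, cols) shape to a description of how the child exited."""
    return {
        (rows, cols): describe(run_snippet(snippet.format(rows=rows, cols=cols)))
        for rows, cols in shapes
    }
```

Calling `probe([(51, 64000), (50, 1000000), (1000, 50000)])` would try the three shapes mentioned above. The larger shapes allocate several hundred megabytes each, so expect the run to be slow.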

Full backtrace: https://gist.github.com/ChristophHaag/5659298

However, the test case above works when I run it with OMP_NUM_THREADS=1 ./openblas_svd.py. With any more threads than that, it segfaults.
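
The thread-count dependence can be checked by running the same command under several values of OMP_NUM_THREADS and comparing exit statuses. This is a minimal sketch under stated assumptions: `thread_env` and `exit_status_by_threads` are hypothetical helper names, and the command passed in is whatever reproduces the problem (here assumed to be the `./openblas_svd.py` script from the test case).

```python
import os
import subprocess
import sys


def thread_env(num_threads, base=None):
    """Return a copy of the environment with OMP_NUM_THREADS set."""
    env = dict(os.environ if base is None else base)
    env["OMP_NUM_THREADS"] = str(num_threads)
    return env


def exit_status_by_threads(command, thread_counts):
    """Run `command` once per thread count and return {count: returncode}."""
    statuses = {}
    for count in thread_counts:
        # Each run gets its own environment; the parent's is left untouched.
        result = subprocess.run(command, env=thread_env(count))
        statuses[count] = result.returncode
    return statuses
```

For example, `exit_status_by_threads(["./openblas_svd.py"], [1, 2, 4])` would be expected to give 0 for one thread and a negative value for the others if the behaviour described above holds. On POSIX systems `subprocess` reports death by signal N as return code -N.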

@xianyi
Collaborator

xianyi commented May 28, 2013

Thank you for the feedback.
I think this is a bug related to the multi-threaded DGEMM implementation.

@ViralBShah
Contributor

This could possibly also be the reason for some of the failures we see in julia's linear algebra test suite (multi-threaded), which have been hard to track down.

@xianyi
Collaborator

xianyi commented Jun 14, 2013

Hi @ChristophHaag ,

Could you tell me which LAPACK and BLAS functions numpy's svd calls?

Then, I can write the test code in C.

Xianyi

@ChristophHaag
Author

Sorry, I have no idea. I tried looking through the numpy code, but it's confusing.

I think it should be somewhere in here, but I am really not familiar with that code.
https://github.com/numpy/numpy/blob/a02457f1d76dc6727f2118f2d129ce3a5261c253/numpy/linalg/umath_linalg.c.src#L3026

Maybe the numpy people can help more?

@ghost

ghost commented Jun 27, 2013

I also had problems with 0.2.6 and SVD. Sometimes SVD returns a completely incorrect answer with a lot of NaNs. I tracked it down to this: http://icl.cs.utk.edu/lapack-forum/viewtopic.php?f=13&t=4250.

It turns out LAPACK 3.4.2 has many code changes and is buggy. Reverting to LAPACK 3.4.1 (git revert 08c177c) in OpenBLAS solved my problem. This may be a different bug, however.

@xianyi
Collaborator

xianyi commented Jun 29, 2013

Hi all,

I will try to link OpenBLAS with LAPACK 3.4.1. I hope it can fix both the NaN and segfault bugs :)

Xianyi

@xianyi
Collaborator

xianyi commented Jul 15, 2013

Hi @ChristophHaag ,

I think I fixed this segfault bug on the develop branch. Could you test it?

Xianyi

@ChristophHaag
Author

Yes, the segfault is gone. Thanks.
