Segfault in sgesdd under Windows with the Barcelona architecture #603
I didn't use the latest develop branch for the x86-64 build because of another problem with numpy and Haswell that I still have to track down. In the meantime I will rebuild numpy with a more recent OpenBLAS revision. |
@carlkl let me know when you have a new build of openblas with mingwpy / numpy ready so that I can update this issue accordingly. |
@ogrisel, I have a temporary upload on https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/openblas-3f1b576_2015-07-13_amd64.7z. This is not a debug build yet, but it is built from the latest xianyi develop trunk. Compiled without HASWELL kernel ( |
I tried replacing the |
I will upload the debug build to bitbucket as well once it has been built.
|
Alright, let me know when you have new stuff for me to test. Alternatively I could try to write a pure C reproduction case, but I am not very familiar with the C LAPACKE API and Windows-based development environments, so it's not a trivial task for me. @jeromerobert do you think that is necessary for you or other OpenBLAS developers to reproduce and understand the cause of the crash?
Sounds reasonable :) |
See https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/openblas-3f1b576_2015-07-13_amd64_NO_AVX2_debug.7z (Barcelona kernels are included). Please consider using_backtrace.txt inside the archive or use gdb if available. |
I did that (using backtrace.dll), here is the result:
|
I took backtrace from http://code.google.com/p/backtrace-mingw/, and later on from http://dukeworld.duke4.net/eduke32/synthesis (64bit support). It has a BSD licence and is foreseen as an addendum to mingwpy (as are OpenBLAS and other supplementary libraries). |
@ogrisel, the Barcelona kernels are about one year old. Can you test against target OPTERON_SSE3? It is identical to BARCELONA but without the Barcelona-specific kernels. |
@ogrisel, is it the dgesdd function? In your topic, it's I think I can write the C code to test it on Windows. |
Indeed @xianyi this was double precision data. Although I get the same segfault and backtrace by using |
@carlkl
(I am using the MSYS2 bash console to set the environment variable). |
@xianyi great! let me know if you want me to compile your C file and run it on the machine where I observed the segfault in the first place. |
@ogrisel, I recompiled backtrace: https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/backtrace_eduke32.zip. This version should be able to show lineno as well. |
@ogrisel, @xianyi: a new build of libopenblaspy.dll (amd64 for now) is available as debug and release build: https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/openblas-3f1b576_2015-07-13_amd64-NO_AVX2-NO_BARCELONA_ALL.7z. I compiled with NO_AVX2 (no Haswell kernels) and I manually exchanged the fallback Barcelona target with Prescott. |
@ogrisel, can you test the CPU features of your AMD system with https://bitbucket.org/carlkl/mingw-w64-for-python/downloads/cpuid_features.exe? BTW, it is from https://gist.github.com/macton/4dd5fec2113be284796e. |
@carlkl I get a similar segfault with Prescott. Nehalem works for some reason I don't understand. Here is the output of
|
From: http://www.cpu-world.com/CPUs/Bulldozer/AMD-Opteron%204332%20HE%20-%20OS4332OFU6KHK.html
|
@ogrisel, I have new numpy, scipy builds available at anaconda.org:
I used @wernsaar's latest OpenBLAS trunk with some patches to use NEHALEM instead of BARCELONA as the AMD kernel fallback, and excluded three non-SSE2 kernels for X86. The HASWELL segfaults I encountered are gone with this build. BTW: your processor should use the PILEDRIVER kernel; if it does not, one has to check driver/others/dynamic.c IMHO. |
Hi @carlkl , I tried again with the new numpy / scipy builds that you did and now I can do a SVD on my rackspace VM. If I set |
@ogrisel , If you set
Therefore, illegal instruction. |
Alright, any idea which kernel type is the right for this host based on the output of |
@ogrisel, I guess the right kernel should be the one chosen by OpenBLAS itself. Check with |
I checked with |
I guess in https://github.com/xianyi/OpenBLAS/blob/develop/driver/others/dynamic.c |
The question then is why Barcelona is a bad fallback for this host: what instruction set is used by the Barcelona kernel that is not available on this machine? |
@ogrisel, concerning #603 (comment): does the segfault still happen with the newer numpy/scipy wheels from https://anaconda.org/carlkl? I can make an OpenBLAS debug build again for these builds (I used https://github.com/wernsaar/OpenBLAS because of its newer Haswell kernels). Is 64bit only ok for you? |
@carlkl your latest build of openblas / numpy / scipy on https://anaconda.org/carlkl hides the problem successfully by making openblas detect the Nehalem architecture instead of the Barcelona architecture that causes the crash on that machine. |
@ogrisel, that's correct. For these builds I used a patched wernsaar repo. I will put these patches for that build on https://github.com/carlkl/OpenBLAS. |
Ideally it would be great to have the upstream OpenBLAS architecture detection mechanism robust to such a wonky platform (probably caused by the virtualization layer that hides the support for the AVX and maybe other instruction sets). |
Does sgesdd call gemv? I suspect gemv may cause this segfault problem. |
I tried to reproduce the crash with a direct matrix-vector multiplication but I cannot reproduce it. Here is what I tried (among variants):
|
@ogrisel, I guess this issue can be closed? |
@carlkl it is not clear to me what (if anything) was fixed in OpenBLAS to avoid this problem, and I do not have any Barcelona or similar older AMD hardware. BARCELONA still appears to be the default fallback for any AMD target that lacks AVX capability, although this ticket suggests this may be problematic. If I am reading this correctly, you have/had some private fork for anaconda where you |
@martin-frbg, my fork you mentioned is not used anymore. In this fork I replaced the BARCELONA fallback with NEHALEM. However, this issue is quite old now. If the problem still occurs with a newer version of numpy+openblas tested against Barcelona (@ogrisel, is it possible to test this combination?) a new issue should be created IMHO. |
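To make the fallback discussion concrete, here is a minimal sketch of the behaviour described in this thread: a Piledriver host whose VM hides AVX ends up on the AMD fallback core, and the fork swapped that fallback from BARCELONA to NEHALEM. This is not the code in driver/others/dynamic.c. The names `CpuInfo`, `select_core`, `AVX_CORES` and `amd_fallback` are hypothetical, and the set of cores assumed to need AVX is an assumption made for illustration only.

```python
# Illustrative sketch only: NOT OpenBLAS's actual detection code.
# CpuInfo, select_core, AVX_CORES and amd_fallback are hypothetical names.
from dataclasses import dataclass


@dataclass(frozen=True)
class CpuInfo:
    vendor: str      # "AMD" or "Intel"
    microarch: str   # what the silicon really is, e.g. "PILEDRIVER"
    has_avx: bool    # what the (possibly virtualized) machine reports


# Assumption for this sketch: these cores ship kernels that need AVX.
AVX_CORES = {"BULLDOZER", "PILEDRIVER", "SANDYBRIDGE", "HASWELL"}


def select_core(cpu, amd_fallback="BARCELONA"):
    """Pick a core name, falling back when AVX is not visible."""
    if cpu.microarch in AVX_CORES and not cpu.has_avx:
        # The hypervisor hides AVX, so the native kernels cannot be used.
        return amd_fallback if cpu.vendor == "AMD" else "NEHALEM"
    return cpu.microarch


# The host from this issue, as seen from inside the VM.
vm_host = CpuInfo(vendor="AMD", microarch="PILEDRIVER", has_avx=False)
print(select_core(vm_host))                          # default fallback
print(select_core(vm_host, amd_fallback="NEHALEM"))  # what the fork did
```

The sketch only shows which core gets selected. It does not explain why the Barcelona kernels crash on this host, which remained an open question in this thread.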
I'm actually quite happy to get rid of these old tickets, I just did not want to close anything when I have no means of verifying. If you still can test on Barcelona, perhaps we could also kill #494 (a supposedly benign valgrind warning for ddot). Not sure about #607 as that was on Bulldozer, but I suspect that it may have been solved by later fixes as well. |
@martin-frbg, if no one has the technical requirements for testing this configuration (AMD on AVX disabled VM) against a recent OpenBLAS version I propose to close this issue. A new issue should be opened if this " |
Closing without changes to the fallback, as the issue appears to have been seen only in some unspecified VM that did not expose the actual Piledriver hardware, and only when running Windows in this VM. |
I used the OpenBLAS build for Python / NumPy made with mingwpy by @carlkl. You can install it embedded in numpy with:
I installed the 64 bit version of Python using the official installer from python.org.
If you want to install the compilers for debugging on that platform you can use:
Carl said on the numpy mailing list that the OpenBLAS version embedded in his numpy package has digest: fb02cb0 (from April).
The problem can be triggered with the following call:
I used this script and apparently the Barcelona architecture is detected by OpenBLAS on this VM. With smaller data, e.g. (128, 128), it works as expected, and if I force the Nehalem core type it works as well (it prints the results of the SVD):
To reproduce this I used the 2GB Standard instance on rackspace cloud. I can give you access to such a VM in private message if that helps.
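For anyone who wants to repeat the "force the core type" experiment, here is a hedged sketch. It assumes a DYNAMIC_ARCH build of OpenBLAS, which reads the `OPENBLAS_CORETYPE` environment variable when the library is loaded and reports the selected core when `OPENBLAS_VERBOSE=2` is set. The helper names `build_env` and `run_with_coretype` are hypothetical. The SVD snippet is not the original call from this report: the (512, 512) shape and the float32 dtype are guesses.

```python
# Sketch under assumptions: the helper names are hypothetical, and SVD_SNIPPET
# is NOT the original reproduction call (its shape and dtype are guesses).
import os
import subprocess
import sys

SVD_SNIPPET = (
    "import numpy as np; "
    "a = np.random.rand(512, 512).astype(np.float32); "
    "print(np.linalg.svd(a, compute_uv=False)[:3])"
)


def build_env(coretype, verbose=True, base=None):
    """Return a copy of the environment with the OpenBLAS core type forced."""
    env = dict(os.environ if base is None else base)
    env["OPENBLAS_CORETYPE"] = coretype
    if verbose:
        # Ask OpenBLAS to report which core it selected.
        env["OPENBLAS_VERBOSE"] = "2"
    return env


def run_with_coretype(coretype, code=SVD_SNIPPET):
    """Run `code` in a fresh interpreter so the override is set before load."""
    return subprocess.run(
        [sys.executable, "-c", code],
        env=build_env(coretype),
        capture_output=True,
        text=True,
    )
```

A child process is used because the variable has to be set before the OpenBLAS DLL is loaded. Calling `run_with_coretype("Nehalem")` and then `run_with_coretype("Barcelona")` and comparing `returncode` and `stderr` of the two results gives a quick side-by-side check.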
I also tried to reproduce this issue by building OpenBLAS and numpy on the same instance type under Ubuntu 15.04 instead and I cannot reproduce the crash under Linux although I checked that the Barcelona core is detected there as well.