internal compiler error when compiling on ARM v8 ThunderX2 #13622
Comments
Hmm, ARMv8 testing is part of our CI these days, so I wonder what is so different about your setup. Pretty sure Shippable is using Ubuntu for their native builds, so that's one difference. |
Well, one thing I should mention is that we have x64 nodes with an almost identical software stack (same OS, same package versions, Python installed with the same spack command, etc.), and NumPy 1.16.x compiles fine there. |
I'm also seeing this issue on CentOS 7.6. Interestingly, it builds just fine on the same hardware with SLE 12 SP4. |
I am not sure we can help with this too much since our CI succeeds. What compiler is SLE 12 SP4 using? You might be able to extract this as a stand-alone compiler bug by looking a few lines up for the actual gcc call and the gcc-options, then whittling down the example until it fails, starting off by removing unneeded include paths. Mine is below. Note the file you want to compile is actually the processed
```
x86_64-linux-gnu-gcc -Ibuild/src.linux-x86_64-3.6/numpy/core/src/npymath -Inumpy/core/include -Ibuild/src.linux-x86_64-3.6/numpy/core/include/numpy -Inumpy/core/src/common -Inumpy/core/src -Inumpy/core -Inumpy/core/src/npymath -Inumpy/core/src/multiarray -Inumpy/core/src/umath -Inumpy/core/src/npysort -I/usr/include/python3.6m -I/path/to/python/include/python3.6m -Ibuild/src.linux-x86_64-3.6/numpy/core/src/common -Ibuild/src.linux-x86_64-3.6/numpy/core/src/npymath -Ibuild/src.linux-x86_64-3.6/numpy/core/src/common -Ibuild/src.linux-x86_64-3.6/numpy/core/src/npymath -c numpy/core/src/npymath/npy_math.c
```
|
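The reduction advice above (remove unneeded include paths while the failure still reproduces) could be automated roughly as follows. This is only a sketch: `drop_unneeded_includes` and `gcc_hits_ice` are hypothetical helper names, the check simply looks for the text "internal compiler error" in the compiler's stderr, and only the joined `-Ipath` form used in the command above is handled.

```python
import subprocess
from typing import Callable, List


def gcc_hits_ice(cmd: List[str]) -> bool:
    """Run the compiler and report whether it printed an ICE message.

    Hypothetical predicate: assumes the first element of cmd is a gcc
    driver that is installed on the machine running this script.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return "internal compiler error" in result.stderr


def drop_unneeded_includes(
    cmd: List[str], still_fails: Callable[[List[str]], bool]
) -> List[str]:
    """Greedily remove -I options as long as the failure still reproduces."""
    reduced = list(cmd)
    i = 0
    while i < len(reduced):
        if reduced[i].startswith("-I"):
            candidate = reduced[:i] + reduced[i + 1:]
            if still_fails(candidate):
                # The include path was not needed; keep the shorter command.
                # The same index now holds the next argument, so do not advance.
                reduced = candidate
                continue
        i += 1
    return reduced
```

On an affected machine one would call `drop_unneeded_includes(cmd, gcc_hits_ice)` with the full argument list from the build log; on a machine where the compile succeeds the command comes back unchanged.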
It would be useful to know what the value of I attempted to reduce it here but it didn't fail on any of the compilers I tried. |
Experiencing the same issue on CentOS 7.6 on Huawei Taishan 2280 ARM64 servers. Edit: Bug does not occur when using -O0 on GCC 8.2.0 and does not occur at all with GCC 9.1.0 |
I am also experiencing this issue on this environment: I believe that the difference between the aarch64 CI and the failing environments could be related to the GLIBC version. NumPy uses the "numpy" version of npy_cacoshf (npy_math_complex.c.src:1389 in the source) if the GLIBC version is below 2.18, and the "glibc" version otherwise (npy_math_complex.c:5343 in the build). CentOS 7.6 uses GLIBC 2.17. After playing with the compiler/NumPy flags, I confirmed that the issue occurs when GCC tries to inline the function. Adding either the -O0 or the -fno-inline flag makes the compilation succeed, but this is not desirable. The only (ugly) workaround I have found so far that does not affect other functions is to force -O0 on the function definition. Here is the source diff for the workaround (GNU GCC only, of course):
More debugging info:
|
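To check which side of the GLIBC 2.18 threshold mentioned above a given machine falls on, something like the following sketch may help. It only reports the C library version that the Python interpreter is linked against, using the standard library's `platform.libc_ver()`; the helper names `parse_version` and `uses_numpy_fallback` are made up for illustration, and the threshold is taken from the comment above rather than verified against the NumPy sources.

```python
import platform
from typing import Tuple

# Threshold quoted in the comment above; treat it as an assumption.
GLIBC_THRESHOLD = (2, 18)


def parse_version(text: str) -> Tuple[int, ...]:
    """Turn a dotted version string such as '2.17' into (2, 17)."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def uses_numpy_fallback(glibc_version: str) -> bool:
    """True if the version is below the threshold, i.e. the "numpy" code path."""
    return parse_version(glibc_version) < GLIBC_THRESHOLD


if __name__ == "__main__":
    lib, version = platform.libc_ver()
    if lib == "glibc" and version:
        path = "numpy" if uses_numpy_fallback(version) else "glibc"
        print(f"glibc {version}: expect the '{path}' version of the function")
    else:
        print("Could not detect glibc on this interpreter")
```

With the versions reported in this thread, CentOS 7.6 (GLIBC 2.17) would land on the "numpy" side.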
@ginomcevoy Thanks for the informative debug info. You should be able to compile with GCC 7.2.1 even without the |
Note that c99 is only needed for NumPy >= 1.17. |
Yes, I forgot to mention that I was trying to install NumPy 1.17, that is why I didn't go for I was able to compile NumPy after that ugly fix, and only one test failed for me (TestComplexFunctions.test_loss_of_precision[complex256]). |
An ICE would typically mean a compiler bug. I'll see if I can find a centos box to reproduce this. EDIT: I should add that I just built numpy with gcc 8.3.0, so whatever the bug, it's likely to have already been fixed and only a backport may be necessary to get this working again. EDIT2: Fixed gcc version. Sorry, checked the wrong machine, too many tabs open :/ |
Sorry I couldn't find a suitable machine to reproduce this. If someone can give me access I'll be happy to help isolate the problem. |
I've reproduced this on Cortex-A72 with both CentOS 7.6 and Ubuntu 18.04. I'm looking into this. Please report bugs like this to the upstream community (https://gcc.gnu.org/bugzilla/). The GNU Toolchain community prioritizes ICE (internal compiler error) and wrong-code bugs. Most of the time you just need to attach the pre-processed source file and the cc1 command line; add "-v -save-temps" to the compilation flags to get these. |
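As a small illustration of the "-v -save-temps" advice, the sketch below appends those two flags to a compiler command line copied from a build log. `with_report_flags` is a hypothetical helper, not part of NumPy or GCC. With GCC, `-v` makes the driver print the sub-commands it runs (including the cc1 invocation), and `-save-temps` keeps intermediate files such as the pre-processed `.i` file.

```python
import shlex
from typing import List

REPORT_FLAGS = ("-v", "-save-temps")


def with_report_flags(command_line: str) -> List[str]:
    """Split a shell-style command line and append the bug-report flags.

    Flags already present are not added a second time.
    """
    args = shlex.split(command_line)
    for flag in REPORT_FLAGS:
        if flag not in args:
            args.append(flag)
    return args


if __name__ == "__main__":
    # Shortened, illustrative command; use the real one from the build log.
    example = "gcc -O2 -Inumpy/core/include -c numpy/core/src/npymath/npy_math.c"
    print(shlex.join(with_report_flags(example)))
```

The resulting argument list can be passed to `subprocess.run` or pasted back into a shell after `shlex.join`.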
I think it was here: ...and fixed, I believe. NumPy builds with 9.2.0 on Aarch64 for me now. |
@nSircombe , thanks! This saved me some cycles digging into this further. The patch was backported to gcc-7 and gcc-8 release branches, and will be in next update releases, which distros should pick up. |
I've not tested the newer versions of GCC 7 & 8. But I can confirm that |
Confirmed that GCC 7 built from current gcc-7-branch works fine. |
Thanks for the detective work. Should we wait for the toolchain to be released to close this? |
Release 1.17.4 has moved things around, and the above workaround patch is no longer applicable. Here is what works for me at the moment:
Not sure if it is the "best" (least bad) workaround though, so feel free to give some feedback! |
Closing, please reopen if there are still problems with this particular toolchain. |
Compilation of NumPy 1.16.x fails on a machine with a Cavium ThunderX2 CPU.
Versions 1.15.4 and below compile normally.
Reproducing code example:
Get the latest stable release, then run
python3 setup.py install
or
pip install
Error message:
The first error that is not from _configtest.c is
full compilation log: https://pastebin.com/83Jj3knk
Environment:
OS: CentOS Linux release 7.6.1810 (AltArch)
Python 3.7.2, tried several other versions.
GCC 7.4.0, also tried system's default 4.8.5
CPU: