kernel_regress:skx_avx [FAIL] with Clang #3015

Closed
Zha0q1 opened this issue Nov 30, 2020 · 10 comments · Fixed by #3018

Comments

@Zha0q1

Zha0q1 commented Nov 30, 2020

I am trying to build OpenBLAS (0.3.7, 0.3.10, 0.3.12) on Ubuntu 18.04 with Clang (I tried clang-6.0 and clang-10). I consistently get this test error:

TEST 22/23 potrf:smoketest_trivial [OK]
TEST 23/23 kernel_regress:skx_avx [FAIL]
  ERR: test_kernel_regress.c:49  expected 0.000e+00, got 4.127e+02 (diff -4.127e+02, tol 1.000e-10)
RESULTS: 23 tests (22 ok, 1 failed, 0 skipped) ran in 23 ms
make[1]: *** [run_test] Error 1

Build Script:

# Build OpenBLAS from source
RUN mkdir ~/openblas && \
    cd ~/openblas && \
    OPENBLAS_VERSION=0.3.7 && \
    wget \
        https://github.com/xianyi/OpenBLAS/archive/v${OPENBLAS_VERSION}.zip \
        -O openblas.zip && \
    unzip -q openblas.zip -d . && \
    cd OpenBLAS-${OPENBLAS_VERSION} && \
    CXX="clang++-10 -fPIC -mno-avx" CC="clang-10 -fPIC -mno-avx" make -j DYNAMIC_ARCH=1 DYNAMIC_OLDER=1 \
        USE_OPENMP=1 INTERFACE64=1 BINARY=64 NO_AVX=1 NO_AVX2=1 NO_AVX512=1 && \
    make PREFIX=/usr/local/opt install && \

CPU info:


Step 4/17 : RUN lscpu
 ---> Running in c02130b8da64
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              32
On-line CPU(s) list: 0-31
Thread(s) per core:  2
Core(s) per socket:  16
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               79
Model name:          Intel(R) Xeon(R) CPU E5-2686 v4 @ 2.30GHz
Stepping:            1
CPU MHz:             2699.894
CPU max MHz:         3000.0000
CPU min MHz:         1200.0000
BogoMIPS:            4600.09
Hypervisor vendor:   Xen
Virtualization type: full
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            46080K
NUMA node0 CPU(s):   0-31
Flags:               fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc aperfmperf pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single kaiser fsgsbase bmi1 hle avx2 smep bmi2 erms invpcid rtm rdseed adx xsaveopt
Removing intermediate container c02130b8da64

Clang:

Step 6/17 : RUN clang-10 --version
 ---> Running in d3004a7f3279
Ubuntu clang version 10.0.1-++20200708124938+ef32c611aa2-1~exp1~20200707225535.189 
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
@Zha0q1
Author

Zha0q1 commented Dec 1, 2020

Update:
After disabling OpenMP, the build succeeded and the tests passed:

CXX="clang++-6.0 -fPIC" CC="clang-6.0 -fPIC" make -j DYNAMIC_ARCH=1 DYNAMIC_OLDER=1 \
        USE_OPENMP=0 INTERFACE64=1 BINARY=64 && \

This makes me wonder whether I need to set some flag to specify the flavor of OpenMP. I also have gomp on my machine.

@martin-frbg
Collaborator

Normally clang should link its own flavor of omp automatically - I'll need to try to reproduce this (not observed with gcc so far). Your combination of build options does look a bit strange - with the NO_AVX NO_AVX2 NO_AVX512 you disable dedicated support for any cpu newer than Nehalem, so maybe there is "only" a problem with running the clang-built Nehalem version of the DGEMM kernel on Haswell/Broadwell architecture.
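
To see which AVX generations the host CPU actually reports, the Flags line of lscpu can be filtered. This is only a minimal sketch; the helper name avx_flags is hypothetical and not part of OpenBLAS.

```shell
# Sketch: list the AVX-related CPU flags found in a "Flags:" line on stdin.
# avx_flags is a hypothetical helper name, not an OpenBLAS or lscpu feature.
avx_flags() {
    # One token per line, keep whole-word matches such as avx, avx2, avx512f.
    tr ' ' '\n' | grep -xE 'avx[0-9a-z_]*' | sort -u | paste -sd ' ' -
}

# Usage (assumes lscpu is available):
#   lscpu | grep '^Flags' | avx_flags
```

For the Flags line in the lscpu output above, this should report avx and avx2 but no avx512 variant, which matches a Broadwell-class CPU.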

@valpackett

I've just had the same problem with clang11+gfortran9 in a customized version of FreeBSD which removes the "base system" clang/libomp, and only uses pkg-installed llvm.

This is indeed the result of linking both libomp and libgomp. The libgomp comes from $(FEXTRALIB) in Makefile.conf as a result of the f_check script. Excluding gomp in that script is a workaround:

--- f_check.orig
+++ f_check
@@ -337,6 +337,7 @@
            && ($flags !~ /kernel32/)
            && ($flags !~ /advapi32/)
            && ($flags !~ /shell32/)
+           && ($flags !~ /gomp/)
            && ($flags !~ /omp/ || ($vendor !~ /PGI/ && $flags =~ /omp/))
            && ($flags !~ /[0-9]+/)
                && ($flags !~ /^\-l$/)
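
One way to confirm the double linking described above is to look at the shared libraries a built binary depends on. The following is a rough sketch that reads ldd-style output on stdin; the helper name omp_runtimes is hypothetical, and the library path in the usage comment is an assumption based on the PREFIX in the build script.

```shell
# Sketch: report which OpenMP runtimes appear in ldd-style output on stdin.
# omp_runtimes is a hypothetical helper name, not part of OpenBLAS.
omp_runtimes() {
    # Match libomp (LLVM), libgomp (GNU) and libiomp5 (Intel) shared objects,
    # drop the ".so" suffix, and print each runtime once on a single line.
    grep -oE 'lib(g|i)?omp5?\.so' | sed 's/\.so$//' | sort -u | paste -sd ' ' -
}

# Usage (path is an assumption, adjust to your install prefix):
#   ldd /usr/local/opt/lib/libopenblas.so | omp_runtimes
```

If the output names more than one runtime (for example both libgomp and libomp), the library was linked against two OpenMP implementations at once.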

valpackett added a commit to DankBSD/ports that referenced this issue Dec 1, 2020
@Zha0q1
Author

Zha0q1 commented Dec 1, 2020

Normally clang should link its own flavor of omp automatically - I'll need to try to reproduce this (not observed with gcc so far). Your combination of build options does look a bit strange - with the NO_AVX NO_AVX2 NO_AVX512 you disable dedicated support for any cpu newer than Nehalem, so maybe there is "only" a problem with running the clang-built Nehalem version of the DGEMM kernel on Haswell/Broadwell architecture.

I thought the NO_AVX NO_AVX2 NO_AVX512 options might be a fix, but they were not. I added them and later removed them, and saw the same failure throughout. And yes, gcc does not have this issue.

@martin-frbg
Collaborator

Thanks. As far as I can tell, this problem actually goes a bit deeper - when USE_OPENMP is set, the build system translates this into -fopenmp, which for clang appears to imply -lomp - which is all well and good unless (a) libomp is not installed or (b) gfortran is used as the Fortran compiler, for which -fopenmp implies -lgomp. Some sanity can be restored by using -fopenmp=libgomp for the clang parts, but then there would be no clean way to expressly tell it to link with its own libomp.
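
The runtime each compiler driver would link for -fopenmp can be checked without building anything, since both clang and gcc/gfortran accept -### to print the commands they would run. Below is a small sketch that filters such output for OpenMP link flags; the helper name omp_link_flags is hypothetical, and the exact driver output may differ between versions and distributions.

```shell
# Sketch: extract OpenMP runtime link flags from compiler dry-run output
# read on stdin. omp_link_flags is a hypothetical helper name.
omp_link_flags() {
    # Split on spaces and double quotes, then keep exact -l flags for the
    # LLVM (-lomp), GNU (-lgomp) and Intel (-liomp5) runtimes.
    tr -s ' "' '\n\n' | grep -xE -e '-l(omp|gomp|iomp5)' | sort -u | paste -sd ' ' -
}

# Usage (assumes clang-10 and gfortran are installed; nothing is compiled):
#   clang-10 -fopenmp -### -x c   /dev/null 2>&1 | omp_link_flags
#   gfortran -fopenmp -### -x f95 /dev/null 2>&1 | omp_link_flags
```

If the two commands print different flags, a mixed clang plus gfortran build with USE_OPENMP=1 will end up with both runtimes on the link line.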

@Zha0q1
Author

Zha0q1 commented Dec 1, 2020

Thanks. As far as I can tell, this problem actually goes a bit deeper - when USE_OPENMP is set, the build system translates this into -fopenmp, which for clang appears to imply -lomp - which is all well and good unless (a) libomp is not installed or (b) gfortran is used as the Fortran compiler, for which -fopenmp implies -lgomp. Some sanity can be restored by using -fopenmp=libgomp for the clang parts, but then there would be no clean way to expressly tell it to link with its own libomp.

I do have libomp installed, so I think (b) is the cause of my issue.

@valpackett

(a) libomp is not installed

If your clang wants to link with a non-existent libomp, your environment is kinda broken.

(b) gfortran is used as the Fortran compiler, for which -fopenmp implies -lgomp

See my comment above (#3015 (comment)) for a workaround!

@martin-frbg
Collaborator

martin-frbg commented Dec 1, 2020

Brokenness (a) seems to be easily achievable with current openSUSE (at least).
(b) Acknowledged, but I need to build on that to ensure it does not cause trouble for non-clang builds (and on its own it will not fix CC=clang NOFORTRAN builds due to (a)). Another possibility would seem to be to run with make CC="clang -fopenmp=libgomp" USE_OPENMP=1 ... (the test and ctest Makefiles need to be stopped from expressly adding -lomp for this, and the clang9-generated binaries appear to run at a glacial pace with libgomp). There could still be more to this...
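
The -fopenmp=libgomp possibility mentioned above can be sketched as a small wrapper that only adds the flag when clang is paired with gfortran. The helper name pick_cc is hypothetical, and the caveats from the comment (the test Makefiles adding -lomp, and slow binaries with libgomp) still apply.

```shell
# Sketch: choose the CC string so that clang and gfortran share one OpenMP
# runtime. pick_cc is a hypothetical helper name, not part of OpenBLAS.
pick_cc() {
    cc=$1
    fc=$2
    case "$cc:$fc" in
        # clang with gfortran: ask clang to link GNU libgomp instead of libomp.
        clang*:gfortran*) echo "$cc -fopenmp=libgomp" ;;
        # Any other pairing: leave the compiler command unchanged.
        *)                echo "$cc" ;;
    esac
}

# Usage (compiler names are examples):
#   make CC="$(pick_cc clang-10 gfortran)" FC=gfortran USE_OPENMP=1
```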

valpackett added a commit to DankBSD/ports that referenced this issue Dec 2, 2020
valpackett added a commit to DankBSD/ports that referenced this issue Dec 13, 2020
valpackett added a commit to DankBSD/ports that referenced this issue Dec 17, 2020
valpackett added a commit to DankBSD/ports that referenced this issue Dec 18, 2020
valpackett added a commit to DankBSD/ports that referenced this issue Dec 25, 2020
valpackett added a commit to DankBSD/ports that referenced this issue Dec 28, 2020
valpackett added a commit to DankBSD/ports that referenced this issue Jan 3, 2021
valpackett added a commit to DankBSD/ports that referenced this issue Jan 7, 2021
valpackett added a commit to DankBSD/ports that referenced this issue Jan 9, 2021
valpackett added a commit to DankBSD/ports that referenced this issue Jan 14, 2021
valpackett added a commit to DankBSD/ports that referenced this issue Jan 16, 2021
@aninstein

Dear sir, how to solve this problem?

@martin-frbg
Collaborator

This should be fixed in the 0.3.13 release. Do you still see the problem with the current version?

valpackett added a commit to DankBSD/ports that referenced this issue Feb 16, 2021
valpackett added a commit to DankBSD/ports that referenced this issue Feb 20, 2021

4 participants