Description
The following program compiles with gcc, uses OpenMP target offloading on nvptx, and creates CUDA kernels. It performs QR, Cholesky, and LU decompositions with two different algorithms on nvptx targets, and matrix multiplications with Strassen algorithms and Winograd variants on GPU and CPU.
With clang 21.1.0 and
SET (CMAKE_CXX_COMPILER clang++ CACHE STRING "C++ compiler" FORCE)
SET (CMAKE_CXX_FLAGS "${CMAKE_CXX_FLAGS} -std=c++23 -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda --libomptarget-nvptx-bc-path=/usr/lib64/nvptx64-nvidia-cuda/ -Wall")
in the CMakeLists.txt, it compiles, but at runtime it fails to compute the multiplication with the Strassen algorithm and crashes with the message
assertion failure at kmp_csupport.cpp(539): this_thr->th.th_set_nproc >= 1.
OMP: Error #13: Assertion failure at kmp_csupport.cpp(539).
OMP: Hint Please submit a bug report with this message, compile and run commands used, and machine configuration info including native compiler and operating system versions. Faster response will be obtained by including all program sources. For information on submitting this issue, please see https://github.com/llvm/llvm-project/issues/.
Aborted ./arraytest
The program files are attached. The h.txt files are C++ header files; before compiling, remove the .txt ending that was added for the GitHub upload. Compile with
cmake .
followed by
make
clang compiles the program, but after the ordinary matrix multiplication, it fails with the above assertion failure at kmp_csupport.cpp(539) when it starts the Strassen algorithm on the GPU.
clang version 21.1.0
Target: x86_64-pc-linux-gnu
Thread model: posix
InstalledDir: /usr/lib/llvm/21/bin
Configuration file: /etc/clang/21/x86_64-pc-linux-gnu-clang.cfg
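For triage, it may help to first confirm that a trivial target region runs with the same compiler flags before the Strassen code is involved. The following is only a minimal sketch and is not taken from the attached sources; the function names `visible_devices`, `scale_on_target`, and `target_ran_on_host` are hypothetical. It is not claimed to reproduce the assertion, only to separate a general offloading problem from one specific to the Strassen path.

```cpp
#include <cstdio>
#ifdef _OPENMP
#include <omp.h>
#endif

// Hypothetical helper: number of offload devices the OpenMP runtime reports,
// or 0 when the file is built without OpenMP support.
int visible_devices() {
#ifdef _OPENMP
    return omp_get_num_devices();
#else
    return 0;
#endif
}

// Hypothetical helper: doubles every element inside a target region.
// Without an available device (or without -fopenmp) the loop runs on the host.
void scale_on_target(double* data, int n) {
    #pragma omp target teams distribute parallel for map(tofrom: data[0:n])
    for (int i = 0; i < n; ++i) {
        data[i] *= 2.0;
    }
}

// Hypothetical helper: true if a target region executed on the host
// instead of an offload device.
bool target_ran_on_host() {
    int on_host = 1;
#ifdef _OPENMP
    #pragma omp target map(from: on_host)
    {
        on_host = omp_is_initial_device();
    }
#endif
    return on_host != 0;
}

// Prints a short summary that can be pasted into a bug report.
void report_offload_state() {
    std::printf("devices visible: %d\n", visible_devices());
    std::printf("target region ran on host: %s\n",
                target_ran_on_host() ? "yes" : "no");
}
```

Calling `report_offload_state()` and `scale_on_target()` from a small `main`, built with the same `CMAKE_CXX_FLAGS` as above, shows whether basic offloading works in this toolchain setup. If it does, the failure is more likely tied to how the Strassen code enters its parallel or target regions than to the offload setup itself.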
CMakeLists.txt
main_omp.cpp
main_mpi.cpp
mdspan_omp.h.txt
mdspan_data.h.txt
mathfunctions.h.txt
mathfunctions_mpi.h.txt
inkernel_mathfunctions.h.txt
indiceshelperfunctions.h.txt
gpu_mathfunctions.h.txt
datastruct.h.txt
datastruct_mpifunctions.h.txt
datastruct_host_memory_functions.h.txt
datastruct_gpu_memory_functions.h.txt