Unable to enable optimization levels above -O0 on the CUDA GPU test case #1659

Closed
bemichel opened this issue Feb 3, 2024 · 5 comments


bemichel commented Feb 3, 2024

Hi,

Thanks to the docs and your help, I was able to set up a representative CUDA GPU test with Enzyme in both forward and backward mode: https://fwd.gymni.ch/TWC7tS

I am using clang-14 + Enzyme-0.0.81 + CUDA-11.2, and the results look correct:

$> clang++ -DENABLE_ENZYME -I${CUDAPATH}/include test.cu -fplugin=${ENZYMEPATH}/lib/ClangEnzyme-14.so --cuda-gpu-arch=sm_61 -lcudart -L${CUDAPATH}/11.2/lib64
$> ./a.out
[GPU, direct] a[0]         == 12.000000                                                                                     
[GPU, direct] a[nb_cell-1] == 12.000000                                                                                     
[GPU, direct] b[0]         == 437.000000                                                                                    
[GPU, direct] b[nb_cell-1] == 437.000000
[GPU, forward] da[0]         == 1.000000
[GPU, forward] da[nb_cell-1] == 1.000000
[GPU, forward] db[0]         == 72.000000
[GPU, forward] db[nb_cell-1] == 72.000000
[GPU, backward] da[0]         == 72.000000
[GPU, backward] da[nb_cell-1] == 72.000000
[GPU, backward] db[0]         == 0.000000
[GPU, backward] db[nb_cell-1] == 0.000000
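
For readers who want to sanity-check these numbers without opening the linked test, here is a minimal host-side sketch. The kernel body is not shown in this thread, so the function `f` below is an assumption: it was chosen only because it reproduces the printed values (a = 12 gives b = 437, with derivative 72). The names `f`, `f_forward`, `f_reverse`, `Report`, and `check_at` are hypothetical, and the derivative rules are written by hand rather than generated by Enzyme.

```cpp
#include <cassert>

// ASSUMPTION: stand-in for the per-cell computation of the real kernel.
// Picked only because f(12) == 437 and f'(12) == 72, matching the output.
double f(double a) { return 3.0 * a * a + 5.0; }

// Hand-written forward-mode (tangent) rule: db = f'(a) * da.
double f_forward(double a, double da) { return 6.0 * a * da; }

// Hand-written reverse-mode (adjoint) rule: da += f'(a) * db.
// The output shadow db is consumed and reset to zero afterwards.
void f_reverse(double a, double &da, double &db) {
    da += 6.0 * a * db;
    db = 0.0;
}

// Values corresponding to the direct, forward, and backward printouts.
struct Report {
    double b;       // direct result
    double fwd_db;  // forward mode: tangent of b for seed da = 1
    double rev_da;  // backward mode: gradient accumulated into da
    double rev_db;  // backward mode: output shadow after the reverse pass
};

Report check_at(double a) {
    Report r{};
    r.b = f(a);
    r.fwd_db = f_forward(a, 1.0);  // seed da = 1
    double da = 0.0;               // input shadow starts at zero
    double db = 1.0;               // seed db = 1
    f_reverse(a, da, db);
    r.rev_da = da;
    r.rev_db = db;
    return r;
}
```

Under that assumption, `check_at(12.0)` yields b = 437, a forward tangent of 72, a reverse gradient of 72, and a zeroed output shadow, which is consistent with the forward `db` matching the backward `da` in the output above.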

But if I try the same compilation step with -O[123], Enzyme fails:

$> clang++ -O1 -DENABLE_ENZYME -I${CUDAPATH}/include test.cu -fplugin=${ENZYMEPATH}/lib/ClangEnzyme-14.so --cuda-gpu-arch=sm_61 -lcudart -L${CUDAPATH}/11.2/lib64
clang-14: /.../gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/include/llvm/Support/Casting.h:90: static bool llvm::isa_impl_cl<To, From*>::doit(const From*) [with To = llvm::ConstantAsMetadata; From = llvm::Metadata]: Assertion `Val && "isa<> used on a null pointer"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /directory/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14 -cc1 -triple x86_64-unknown-linux-gnu -target-sdk-version=11.2 -aux-triple nvptx64-nvidia-cuda -emit-obj --mrelax-relocations -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name test_jambon.cu -mrelocation-model static -mframe-pointer=none -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -tune-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -fcoverage-compilation-dir=directory/test_enzyme -resource-dir /directory/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/lib/clang/14.0.6 -internal-isystem /directory/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/lib/clang/14.0.6/include/cuda_wrappers -include __clang_cuda_runtime_wrapper.h -D ENABLE_ENZYME -I /directory/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/cuda/11.2/include -I /directory/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/math_libs/11.2/targets/x86_64-linux/include -I/directory/hwloc/2.4.1-gnu831-hpc/include -I/directory/openmpi/4.0.5-gnu831-hpc/include -I/directory/intel/oneapi/mkl/2021.2.0/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8 -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/x86_64-redhat-linux -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../include/c++/8/backward -internal-isystem /directory/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/lib/clang/14.0.6/include -internal-isystem /usr/local/include -internal-isystem 
/usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /directory/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/lib/clang/14.0.6/include -internal-isystem /usr/local/include -internal-isystem /usr/lib/gcc/x86_64-redhat-linux/8/../../../../x86_64-redhat-linux/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /directory/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/cuda/11.2/include -O1 -fdeprecated-macro -fdebug-compilation-dir=directory/test_enzyme -ferror-limit 19 -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fcolor-diagnostics -load /directory/gcc-10.2.0/enzyme-0.0.81-pz4de3ykrazxwzcd3rlouco7s24xmmdu/lib/ClangEnzyme-14.so -fcuda-include-gpubinary /tmp/test_jambon-1b521c.fatbin -cuid=7e7be506b6b6c538 -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/test_jambon-3c93a8.o -x cuda test_jambon.cu
1.      <eof> parser at end of file
2.      Optimizer
 #0 0x000000000333bb6f PrintStackTraceSignalHandler(void*) Signals.cpp:0:0
 #1 0x0000000003338ebe SignalHandler(int) Signals.cpp:0:0
 #2 0x00001492f34f2b30 __restore_rt sigaction.c:0:0
 #3 0x00001492f1f1e84f raise (/lib64/libc.so.6+0x3784f)
 #4 0x00001492f1f08c45 abort (/lib64/libc.so.6+0x21c45)
 #5 0x00001492f1f08b19 _nl_load_domain.cold.0 loadmsgcat.c:0:0
 #6 0x00001492f1f16e36 .annobin___GI___assert_fail.end assert.c:0:0
 #7 0x00001492f1831923 (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-10.2.0/enzyme-0.0.81-pz4de3ykrazxwzcd3rlouco7s24xmmdu/lib/ClangEnzyme-14.so+0x18f923)
 #8 0x00001492f1875f8b EnzymeLogic::CreateForwardDiff(llvm::Function*, DIFFE_TYPE, llvm::ArrayRef<DIFFE_TYPE>, TypeAnalysis&, bool, DerivativeMode, bool, unsigned int, llvm::Type*, FnTypeInfo const&, std::vector<bool, std::allocator<bool> >, AugmentedReturn const*, bool) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-10.2.0/enzyme-0.0.81-pz4de3ykrazxwzcd3rlouco7s24xmmdu/lib/ClangEnzyme-14.so+0x1d3f8b)
 #9 0x00001492f180d8c4 (anonymous namespace)::EnzymeBase::HandleAutoDiff(llvm::Instruction*, unsigned int, llvm::Value*, llvm::Type*, llvm::SmallVectorImpl<llvm::Value*>&, std::map<int, llvm::Type*, std::less<int>, std::allocator<std::pair<int const, llvm::Type*> > > const&, std::vector<DIFFE_TYPE, std::allocator<DIFFE_TYPE> > const&, llvm::Function*, DerivativeMode, (anonymous namespace)::EnzymeBase::Options&, bool) Enzyme.cpp:0:0
#10 0x00001492f180fd30 (anonymous namespace)::EnzymeBase::HandleAutoDiffArguments(llvm::CallInst*, DerivativeMode, bool) Enzyme.cpp:0:0
#11 0x00001492f1812a35 (anonymous namespace)::EnzymeBase::lowerEnzymeCalls(llvm::Function&, std::set<llvm::Function*, std::less<llvm::Function*>, std::allocator<llvm::Function*> >&) Enzyme.cpp:0:0
#12 0x00001492f18166c6 (anonymous namespace)::EnzymeBase::run(llvm::Module&) Enzyme.cpp:0:0
#13 0x00001492f182fd8e llvm::detail::PassModel<llvm::Module, EnzymeNewPM, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-10.2.0/enzyme-0.0.81-pz4de3ykrazxwzcd3rlouco7s24xmmdu/lib/ClangEnzyme-14.so+0x18dd8e)
#14 0x0000000002afb0a9 llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module> >::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x2afb0a9)
#15 0x0000000003648736 (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile> >&) (.constprop.902) BackendUtil.cpp:0:0
#16 0x000000000364a7b3 clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x364a7b3)
#17 0x00000000042cc38d clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x42cc38d)
#18 0x0000000003cdba38 clang::MultiplexConsumer::HandleTranslationUnit(clang::ASTContext&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x3cdba38)
#19 0x00000000050981c9 clang::ParseAST(clang::Sema&, bool, bool) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x50981c9)
#20 0x00000000042cc6e2 clang::CodeGenAction::ExecuteAction() (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x42cc6e2)
#21 0x0000000003ca9231 clang::FrontendAction::Execute() (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x3ca9231)
#22 0x0000000003c3b35a clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x3c3b35a)
#23 0x0000000003d6ef01 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0x3d6ef01)
#24 0x0000000000ed78c4 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0xed78c4)
#25 0x0000000000ed5079 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#26 0x0000000000e08bbc main (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0xe08bbc)
#27 0x00001492f1f0a803 __libc_start_main (/lib64/libc.so.6+0x23803)
#28 0x0000000000ed395e _start (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin/clang-14+0xed395e)
clang-14: error: unable to execute command: Aborted (core dumped)
clang-14: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 14.0.6
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-8.3.1/llvm-14.0.6-mlbglx3o3n5rirgy2xfi4l6f66wjzqhq/bin
clang-14: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-14: note: diagnostic msg: /tmp/test_jambon-69a59d.cu
clang-14: note: diagnostic msg: /tmp/test_jambon-4dc09a/test_jambon-sm_61.cu
clang-14: note: diagnostic msg: /tmp/test_jambon-69a59d.sh
clang-14: note: diagnostic msg: 

********************
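
The assertion at the top of this log comes from LLVM's `isa<>` helper, which requires a non-null pointer. The sketch below imitates that contract with a tiny stand-in type so the failure mode is easier to see. Everything in it is hypothetical (the `Metadata` struct and both functions are not LLVM's classes), and it does not claim to show where Enzyme passes the null pointer or what the eventual fix changes.

```cpp
#include <cassert>

// Hypothetical miniature of a tagged metadata node; not LLVM's real class.
struct Metadata {
    enum Kind { Constant, Other } kind;
};

// Strict query in the style of llvm::isa<>: the caller must pass non-null.
// Passing nullptr trips the assertion, as in the crash log above.
bool is_constant(const Metadata *m) {
    assert(m && "isa<> used on a null pointer");
    return m->kind == Metadata::Constant;
}

// Null-tolerant query: answers false for nullptr instead of asserting.
// LLVM provides a comparable helper, llvm::isa_and_nonnull<>.
bool is_constant_or_null_safe(const Metadata *m) {
    return m != nullptr && m->kind == Metadata::Constant;
}
```

With this sketch, `is_constant(nullptr)` aborts in a build with assertions enabled, while `is_constant_or_null_safe(nullptr)` simply returns false. The log suggests the clang build used here has assertions enabled, which is why the null operand surfaces as an abort inside the Enzyme pass.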

Can my code not be compiled with an optimization level higher than -O0?

Or, as the documentation seems to suggest (https://enzyme.mit.edu/getting_started/CUDAGuide/#cuda-example):

Note that this procedure (using ClangEnzyme as opposed to LLVMEnzyme manually) inserts Enzyme at a specific location in LLVM’s
optimization pipeline. The default ordering should be reasonable, however, the precise ordering of optimization passes may
[impact performance](https://proceedings.mlsys.org/paper/2020/file/4e732ced3463d06de0ca9a15b6153677-Paper.pdf).
If there is a performance issue that you suspect may be due to optimization ordering, please
[open an issue](https://github.com/EnzymeAD/Enzyme/issues/new).

Is there another way to run the compilation/differentiation step so that the -O[123] options can be enabled?

Thanks for your help,

Member

wsmoses commented Feb 3, 2024 via email

Author

bemichel commented Feb 5, 2024

Hi,
I tried compiling/differentiating with Enzyme v0.0.99 (clang-16 + Enzyme-0.0.99 + CUDA-11.2):

$> clang --version
clang version 16.0.6
...
$> nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver                                                                                               
Copyright (c) 2005-2021 NVIDIA Corporation                                                                                          
Built on Thu_Jan_28_19:32:09_PST_2021
Cuda compilation tools, release 11.2, V11.2.142
Build cuda_11.2.r11.2/compiler.29558016_0
$> module show enzyme/0.0.99-gcc-12.1.0-...

It works fine with -O0:

$> clang++ -O0 -DENABLE_ENZYME -I${CUDAPATH}/include test.cu -fplugin=${ENZYMEPATH}/lib/ClangEnzyme-16.so --cuda-gpu-arch=sm_61 -lcudart -L${CUDAPATH}/11.2/lib64
$> ./a.out                                                                               
argc == 1
[GPU, direct] a[0]         == 12.000000
[GPU, direct] a[nb_cell-1] == 12.000000
[GPU, direct] b[0]         == 437.000000
[GPU, direct] b[nb_cell-1] == 437.000000
[GPU, forward] da[0]         == 1.000000
[GPU, forward] da[nb_cell-1] == 1.000000
[GPU, forward] db[0]         == 72.000000
[GPU, forward] db[nb_cell-1] == 72.000000
[GPU, backward] da[0]         == 72.000000
[GPU, backward] da[nb_cell-1] == 72.000000
[GPU, backward] db[0]         == 0.000000
[GPU, backward] db[nb_cell-1] == 0.000000

But with -O1 it seems to return an equivalent error:

$> clang++ -O1 -DENABLE_ENZYME -I${CUDAPATH}/include test.cu -fplugin=${ENZYMEPATH}/lib/ClangEnzyme-16.so --cuda-gpu-arch=sm_61 -lcudart -L${CUDAPATH}/11.2/lib64
...
1.      <eof> parser at end of file
2.      Optimizer
 #0 0x000000000386282b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x386282b)
 #1 0x000000000385fccb SignalHandler(int) Signals.cpp:0:0
 #2 0x00001535e64e6b30 __restore_rt sigaction.c:0:0
 #3 0x00001535e4f1284f raise (/lib64/libc.so.6+0x3784f)
 #4 0x00001535e4efcc45 abort (/lib64/libc.so.6+0x21c45)
 #5 0x00001535e4efcb19 _nl_load_domain.cold.0 loadmsgcat.c:0:0
 #6 0x00001535e4f0ae36 .annobin___GI___assert_fail.end assert.c:0:0
 #7 0x00001535e4974b33 (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/enzyme-0.0.99-52vilpu4js7wisrre2fgjny6gq3o7ut5/lib/ClangEnzyme-16.so+0x365b33)
 #8 0x00001535e49a0437 EnzymeLogic::CreateForwardDiff(RequestContext, llvm::Function*, DIFFE_TYPE, llvm::ArrayRef<DIFFE_TYPE>, TypeAnalysis&, bool, DerivativeMode, bool, unsigned int, llvm::Type*, FnTypeInfo const&, std::vector<bool, std::allocator<bool>>, AugmentedReturn const*, bool) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/enzyme-0.0.99-52vilpu4js7wisrre2fgjny6gq3o7ut5/lib/ClangEnzyme-16.so+0x391437)
 #9 0x00001535e494ed11 (anonymous namespace)::EnzymeBase::HandleAutoDiff(llvm::Instruction*, unsigned int, llvm::Value*, llvm::Type*, llvm::SmallVectorImpl<llvm::Value*>&, std::map<int, llvm::Type*, std::less<int>, std::allocator<std::pair<int const, llvm::Type*>>> const&, std::vector<DIFFE_TYPE, std::allocator<DIFFE_TYPE>> const&, llvm::Function*, DerivativeMode, (anonymous namespace)::EnzymeBase::Options&, bool, llvm::SmallVectorImpl<llvm::CallInst*>&) Enzyme.cpp:0:0
#10 0x00001535e495090b (anonymous namespace)::EnzymeBase::HandleAutoDiffArguments(llvm::CallInst*, DerivativeMode, bool, llvm::SmallVectorImpl<llvm::CallInst*>&) Enzyme.cpp:0:0
#11 0x00001535e4957973 (anonymous namespace)::EnzymeBase::lowerEnzymeCalls(llvm::Function&, std::set<llvm::Function*, std::less<llvm::Function*>, std::allocator<llvm::Function*>>&) Enzyme.cpp:0:0
#12 0x00001535e495b0b8 (anonymous namespace)::EnzymeBase::run(llvm::Module&) Enzyme.cpp:0:0
#13 0x00001535e4973880 llvm::detail::PassModel<llvm::Module, EnzymeNewPM, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/enzyme-0.0.99-52vilpu4js7wisrre2fgjny6gq3o7ut5/lib/ClangEnzyme-16.so+0x364880)
#14 0x0000000003138d2d llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x3138d2d)
#15 0x0000000003c19373 (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&) BackendUtil.cpp:0:0
#16 0x0000000003c1bd5c clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x3c1bd5c)
#17 0x0000000004a8dd52 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x4a8dd52)
#18 0x00000000043edb68 clang::MultiplexConsumer::HandleTranslationUnit(clang::ASTContext&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x43edb68)
#19 0x0000000005966575 clang::ParseAST(clang::Sema&, bool, bool) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x5966575)
#20 0x00000000043b3fb1 clang::FrontendAction::Execute() (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x43b3fb1)
#21 0x000000000433a23b clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x433a23b)
#22 0x000000000446f088 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x446f088)
#23 0x0000000001056a10 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x1056a10)
#24 0x000000000105202a ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#25 0x00000000010531f0 clang_main(int, char**) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x10531f0)
#26 0x00001535e4efe803 __libc_start_main (/lib64/libc.so.6+0x23803)
#27 0x000000000104d3be _start (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x104d3be)
clang-16: error: unable to execute command: Aborted (core dumped)
clang-16: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 16.0.6
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin
clang-16: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-16: note: diagnostic msg: /tmp/test_jambon-e9590d.cu
clang-16: note: diagnostic msg: /tmp/test_jambon-e6f392/test_jambon-sm_61.cu
clang-16: note: diagnostic msg: /tmp/test_jambon-e9590d.sh
clang-16: note: diagnostic msg: 

********************

Do you have any idea what's wrong?

Member

wsmoses commented Feb 5, 2024

Can you paste the full log?

Author

bemichel commented Feb 5, 2024

The full message:

$> clang++ -O1 -DENABLE_ENZYME -I/opt/tools/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/cuda/11.2/include -I/opt/tools/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/math_libs/11.2/targets/x86_64-linux/include test_jambon.cu -fplugin=/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/enzyme-0.0.99-52vilpu4js7wisrre2fgjny6gq3o7ut5/lib/ClangEnzyme-16.so --cuda-gpu-arch=sm_61 -lcudart -L/opt/tools/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/cuda/11.2/lib64
clang-16: /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/include/llvm/Support/Casting.h:109: static bool llvm::isa_impl_cl<To, const From*>::doit(const From*) [with To = llvm::ConstantAsMetadata; From = llvm::Metadata]: Assertion `Val && "isa<> used on a null pointer"' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16 -cc1 -triple x86_64-unknown-linux-gnu -target-sdk-version=11.2 -aux-triple nvptx64-nvidia-cuda -emit-obj -disable-free -clear-ast-before-backend -disable-llvm-verifier -discard-value-names -main-file-name test_jambon.cu -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=none -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -target-cpu x86-64 -tune-cpu generic -mllvm -treat-scalable-fixed-error-as-warning -debugger-tuning=gdb -fcoverage-compilation-dir=/visu/bemichel/dev/SoNICS/dev_doc/test_enzyme -resource-dir /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/lib/clang/16 -internal-isystem /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/lib/clang/16/include/cuda_wrappers -include __clang_cuda_runtime_wrapper.h -D ENABLE_ENZYME -I /opt/tools/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/cuda/11.2/include -I /opt/tools/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/math_libs/11.2/targets/x86_64-linux/include -I/opt/tools/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/math_libs/11.2/include -I/opt/tools/hwloc/2.4.1-gnu831-hpc/include -I/opt/tools/openmpi/4.0.5-gnu831-hpc/include -I/opt/tools/intel/oneapi/tbb/2021.2.0/include -I/opt/tools/intel/oneapi/compiler/2021.2.0/linux/include -I/opt/tools/gcc/10.2.0-gnu831/include -I/scratchm/sonics/opt_el8/linux-rhel8-broadwell/gcc-8.3.1/eigen-3.4.0-xxgaw25zr3gqeeimp5nugzxxxxlzzjfq/include/eigen3 -I/opt/tools/intel/oneapi/mpi/2021.6.0//include -I/opt/tools/intel/oneapi/mkl/2021.2.0/include -I/scratchm/sonics/opt_el8/linux-rhel8-broadwell/gcc-8.3.1/python-3.9.12-enb6bk6hdesnoo6ppwn5jnb3jivt2jcz/include/python3.9 -internal-isystem /opt/tools/gcc/12.1.0-gnu831/lib/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0 
-internal-isystem /opt/tools/gcc/12.1.0-gnu831/lib/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0/x86_64-pc-linux-gnu -internal-isystem /opt/tools/gcc/12.1.0-gnu831/lib/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0/backward -internal-isystem /opt/tools/gcc/12.1.0-gnu831/lib/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0 -internal-isystem /opt/tools/gcc/12.1.0-gnu831/lib/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0/x86_64-pc-linux-gnu -internal-isystem /opt/tools/gcc/12.1.0-gnu831/lib/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../include/c++/12.1.0/backward -internal-isystem /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /opt/tools/gcc/12.1.0-gnu831/lib/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../x86_64-pc-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/lib/clang/16/include -internal-isystem /usr/local/include -internal-isystem /opt/tools/gcc/12.1.0-gnu831/lib/gcc/x86_64-pc-linux-gnu/12.1.0/../../../../x86_64-pc-linux-gnu/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -internal-isystem /opt/tools/nvidia-hpc-sdk/22.2-gnu831/Linux_x86_64/22.2/cuda/11.2/include -O1 -fdeprecated-macro -fdebug-compilation-dir=/visu/bemichel/dev/SoNICS/dev_doc/test_enzyme -ferror-limit 19 -fgnuc-version=4.2.1 -fcxx-exceptions -fexceptions -fcolor-diagnostics -load /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/enzyme-0.0.99-52vilpu4js7wisrre2fgjny6gq3o7ut5/lib/ClangEnzyme-16.so -fcuda-include-gpubinary /tmp/test_jambon-1378cf.fatbin -cuid=c77be1b562716e -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -o /tmp/test_jambon-1b2471.o -x cuda test_jambon.cu
1.      <eof> parser at end of file
2.      Optimizer
 #0 0x000000000386282b llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x386282b)
 #1 0x000000000385fccb SignalHandler(int) Signals.cpp:0:0
 #2 0x000015320199bb30 __restore_rt sigaction.c:0:0
 #3 0x00001532003c784f raise (/lib64/libc.so.6+0x3784f)
 #4 0x00001532003b1c45 abort (/lib64/libc.so.6+0x21c45)
 #5 0x00001532003b1b19 _nl_load_domain.cold.0 loadmsgcat.c:0:0
 #6 0x00001532003bfe36 .annobin___GI___assert_fail.end assert.c:0:0
 #7 0x00001531ffe29b33 (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/enzyme-0.0.99-52vilpu4js7wisrre2fgjny6gq3o7ut5/lib/ClangEnzyme-16.so+0x365b33)
 #8 0x00001531ffe55437 EnzymeLogic::CreateForwardDiff(RequestContext, llvm::Function*, DIFFE_TYPE, llvm::ArrayRef<DIFFE_TYPE>, TypeAnalysis&, bool, DerivativeMode, bool, unsigned int, llvm::Type*, FnTypeInfo const&, std::vector<bool, std::allocator<bool>>, AugmentedReturn const*, bool) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/enzyme-0.0.99-52vilpu4js7wisrre2fgjny6gq3o7ut5/lib/ClangEnzyme-16.so+0x391437)
 #9 0x00001531ffe03d11 (anonymous namespace)::EnzymeBase::HandleAutoDiff(llvm::Instruction*, unsigned int, llvm::Value*, llvm::Type*, llvm::SmallVectorImpl<llvm::Value*>&, std::map<int, llvm::Type*, std::less<int>, std::allocator<std::pair<int const, llvm::Type*>>> const&, std::vector<DIFFE_TYPE, std::allocator<DIFFE_TYPE>> const&, llvm::Function*, DerivativeMode, (anonymous namespace)::EnzymeBase::Options&, bool, llvm::SmallVectorImpl<llvm::CallInst*>&) Enzyme.cpp:0:0
#10 0x00001531ffe0590b (anonymous namespace)::EnzymeBase::HandleAutoDiffArguments(llvm::CallInst*, DerivativeMode, bool, llvm::SmallVectorImpl<llvm::CallInst*>&) Enzyme.cpp:0:0
#11 0x00001531ffe0c973 (anonymous namespace)::EnzymeBase::lowerEnzymeCalls(llvm::Function&, std::set<llvm::Function*, std::less<llvm::Function*>, std::allocator<llvm::Function*>>&) Enzyme.cpp:0:0
#12 0x00001531ffe100b8 (anonymous namespace)::EnzymeBase::run(llvm::Module&) Enzyme.cpp:0:0
#13 0x00001531ffe28880 llvm::detail::PassModel<llvm::Module, EnzymeNewPM, llvm::PreservedAnalyses, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/enzyme-0.0.99-52vilpu4js7wisrre2fgjny6gq3o7ut5/lib/ClangEnzyme-16.so+0x364880)
#14 0x0000000003138d2d llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x3138d2d)
#15 0x0000000003c19373 (anonymous namespace)::EmitAssemblyHelper::RunOptimizationPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&) BackendUtil.cpp:0:0
#16 0x0000000003c1bd5c clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::StringRef, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x3c1bd5c)
#17 0x0000000004a8dd52 clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x4a8dd52)
#18 0x00000000043edb68 clang::MultiplexConsumer::HandleTranslationUnit(clang::ASTContext&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x43edb68)
#19 0x0000000005966575 clang::ParseAST(clang::Sema&, bool, bool) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x5966575)
#20 0x00000000043b3fb1 clang::FrontendAction::Execute() (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x43b3fb1)
#21 0x000000000433a23b clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x433a23b)
#22 0x000000000446f088 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x446f088)
#23 0x0000000001056a10 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x1056a10)
#24 0x000000000105202a ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) driver.cpp:0:0
#25 0x00000000010531f0 clang_main(int, char**) (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x10531f0)
#26 0x00001532003b3803 __libc_start_main (/lib64/libc.so.6+0x23803)
#27 0x000000000104d3be _start (/scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin/clang-16+0x104d3be)
clang-16: error: unable to execute command: Aborted (core dumped)
clang-16: error: clang frontend command failed due to signal (use -v to see invocation)
clang version 16.0.6
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /scratchm/sonics/opt_2024/linux-rhel8-broadwell/gcc-12.1.0/llvm-16.0.6-wnagksngnyalxvmluiow2yuywyd4npx5/bin
clang-16: note: diagnostic msg: 
********************

PLEASE ATTACH THE FOLLOWING FILES TO THE BUG REPORT:
Preprocessed source(s) and associated run script(s) are located at:
clang-16: note: diagnostic msg: /tmp/test_jambon-828cba.cu
clang-16: note: diagnostic msg: /tmp/test_jambon-971738/test_jambon-sm_61.cu
clang-16: note: diagnostic msg: /tmp/test_jambon-828cba.sh
clang-16: note: diagnostic msg: 

********************

Member

wsmoses commented Feb 12, 2024

Okay, this should be fixed by #1697.

wsmoses closed this as completed Feb 12, 2024