[Flang][OpenMP][Offload] HLFIR AssignOp does not lower to a friendly form for AMDGPU which is used for target offloading for OpenMP #74603
Comments
@llvm/issue-subscribers-flang-ir Author: None (agozillon)

This issue was found during the implementation of the following PR (and is dependent on it): #71766

The following example, which attempts to map and assign a value to an allocatable variable on device, compiles and works with the deprecated FIR flow but fails with the new HLFIR flow:

Yielding the following ICE error:

The command used to compile this and hit the error (it should just require a Clang built with AMDGPU support, plus the dependent PR if it's not committed already): flang-new --offload-arch=gfx90a -fopenmp test.f90 -o test.out

From what I can gather from digging a little into the issue, this comes from AMDGPU not supporting DYNAMIC_STACKALLOCA instructions. I think AMDGPU only performs static allocation, but someone with more understanding of that segment of the compiler will know far better than I do.

However, the generation of this AMDGPU-unfriendly code appears to stem from the HLFIR AssignOp, which lowers to a Fortran runtime call; that call likely brings in the instruction requiring a dynamic stack allocation (I've unfortunately not found the exact problematic line, but there are a number of areas that might pose the problem).

The only solution I can currently think of is to opt out of HLFIR AssignOp generation for AMD GPU devices or for OpenMP offload (or both) and utilise the old FIR flow, which does not depend on the runtime call. I am not sure how palatable that is for everyone, though, as I imagine the intent was to discard the old FIR flow in the near future. I am more than open to other suggestions; this is just the option I had in mind.

It also raises the possibility that errors like this will be encountered in other cases where HLFIR operations lower to Fortran runtime calls, but that may be hyperbole, as this is the only case I've encountered so far.
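The example code block itself was not captured in this page. As a hedged sketch, a reproducer matching the description (mapping an allocatable variable to the device and assigning to it inside a target region) might look like the following; the program name and variable are placeholders, not the original test.f90:

```fortran
! Hypothetical reproducer sketched from the issue description; the original
! test.f90 is not shown in this thread.
program assign_to_allocatable
  implicit none
  integer, allocatable :: i

  allocate(i)
!$omp target map(tofrom: i)
  ! On the HLFIR path this assignment lowers to a Fortran runtime call
  i = 10
!$omp end target
  print *, i
  deallocate(i)
end program assign_to_allocatable
```

Compiled with the command above (flang-new --offload-arch=gfx90a -fopenmp), an example along these lines would exercise the AssignOp-to-runtime-call lowering inside the target region.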
I don't think using the old lowering is a good solution in the long term. Did you compile the runtime for offload?
Perhaps I have not in this case; are there specific build commands required to do this? I'm admittedly a little unfamiliar with how the Fortran runtime works in this case, so I'd appreciate any pointers you can give.
I have not done it myself, but there is this post: https://discourse.llvm.org/t/rfc-building-flang-runtime-for-offload-devices/70787
Thank you @clementval, I'll have a look into that thread. I do recollect a little about it; I can't remember where we landed on it, but it does seem Slava made some good headway with an initial patch for CUDA/NVPTX.
I've had a little time to dig into this a little more. It seems that at least the above error has nothing to do with the definition itself, as it's not linked into the module at the time the error occurs; we only have the declaration. It seems more likely that the allocations (box and scalar) generated for the reallocation case are causing the issues. I think (not completely sure; I need to find it or talk to someone) there's an optimisation pass in the backend somewhere that automagically tidies them up where possible, but it doesn't handle them in this case (it at least removes the allocas we create for kernel arguments on lowering, replacing them with other instructions, but that particular case requires a load access).

In any case, if I raise the allocas to the kernel entry point (when using the generalised LLVM dialect lowering to LLVM IR, they currently get injected in the middle of the kernel), I think making them statically allocatable, the previous error is bypassed. However, it then hits another wall in the instruction selector, which tries to directly use a FrameIndex (the allocation); that also seems to be a no-go for at least AMDGPU.

So I am still not entirely sure of a fix, but I did notice another possible workaround: deactivate the realloc portion of the hlfir::AssignOp operation for AMDGPU/OpenMP target offload, perhaps by adding another logic check at https://github.com/llvm/llvm-project/blob/main/flang/lib/Lower/Bridge.cpp#L3491 to toggle isWholeAllocatableAssignment to false (or something along those lines). This is similar to my original suggestion, but less drastic, as it doesn't keep the old FIR flow around. I suppose one other, perhaps more drastic, option would be to create a pass that raises the allocas out of the kernel and turns them into mapped arguments, but that is possibly quite overkill for something like AssignOp (e.g. assigning a single value to an allocatable).
It seems very error-prone to have target-specific lowering, especially for assignment, where Fortran has lots of rules for allocatables and so on. It feels like the adaptation should be done in a target-specific pass, like the target-rewrite pass, to work around the error you are seeing.
Thank you @clementval, I wasn't aware of this pass; I'll have a look into it!
I believe I have found a series of small fixes that will get this test working, I'll open a PR or two for it when I am back from vacation, unless someone else manages to get there first:
So the test will need two dependencies to work: the Fortran runtime built for offload, and a libc built for offload as well, alongside some of the non-library-related modifications to the IR.
…se function pass to finalize, utilised in convertTarget (#78818)

This patch seeks to add a mechanism to raise constant-sized (not ConstantExpr or runtime/dynamic) allocations into the entry block for select functions that have been inserted into a list for processing. This processing occurs during the finalize call, after OutlinedInfo regions have completed. Currently it is only utilised for createOutlinedFunction, which is triggered for TargetOp generation in the OpenMP MLIR dialect lowering to LLVM IR.

This is required for target kernels generated by createOutlinedFunction to avoid subsequent optimization passes making unintentional, malformed optimizations for AMD kernels (unsure if it occurs for other vendors). If the allocas are generated inside the kernel, are not in the entry block, and are subsequently passed to a function, required instructions can be erased or manipulated in a way that causes the kernel to run into an HSA access error.

This fix is related to a series of problems found in: #74603

The problem primarily presents itself for Flang's HLFIR AssignOp, when utilised with a scalar temporary constant on the RHS and a descriptor type on the LHS. It will generate a call to a runtime function, wrap the RHS temporary in a newly allocated descriptor (an LLVM struct), and pass both the LHS and RHS descriptors into the runtime function call. This currently gets embedded into the middle of the target region in the user entry block, which means the allocas are also embedded in the middle, which seems to pose issues when later passes are executed. The issue may present itself in other HLFIR operations, or in unrelated operations that generate allocas as a by-product, but for the moment this one test case is the only scenario in which I've found the problem.

Perhaps this is not the appropriate fix; I am very open to other suggestions. I've tried a few others (at varying levels of the flang/MLIR compiler flow), but this one is the smallest and least intrusive change set. The other two that come to mind (though I've not fully looked into them; the former I tried a little with blocks, but it had a few issues I'd need to think through):

- Having a proper alloca-only block (or region) generated for TargetOps that we could merge into the entry block generated by convertTarget's createOutlinedFunction.
- Diverging a little from Clang's current target generation and using the CodeExtractor to generate the user code as an outlined function invoked from the kernel we make, with our kernel arguments passed into it, similar to the current parallel generation. I am not sure how well this would intermingle with the existing parallel generation that's layered in, though.

Both of these methods seem like quite a divergence from the current status quo, which I am not entirely sure is merited for the small test this change aims to fix.
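As a rough, hypothetical illustration of the hoisting described above (the kernel name, struct contents, and sizes are invented, not taken from the actual compiler output), the finalize-time pass moves a constant-sized alloca from the middle of the outlined kernel into its entry block:

```llvm
; Before: the AssignOp lowering leaves a constant-sized alloca in the
; middle of the outlined kernel body, after control flow has branched.
define amdgpu_kernel void @__omp_offloading_kernel() {
entry:
  br label %user.code
user.code:
  %desc = alloca [48 x i8], align 8   ; descriptor temporary, mid-function
  ret void
}

; After: the pass hoists the constant-sized alloca into the entry block,
; making it a plain static, entry-block allocation.
define amdgpu_kernel void @__omp_offloading_kernel.fixed() {
entry:
  %desc = alloca [48 x i8], align 8
  br label %user.code
user.code:
  ret void
}
```

Only allocas with constant sizes are candidates; ConstantExpr-sized or runtime-sized allocations are left where they are, since hoisting them would not make them static.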
This particular case should now be resolved, as the PRs that help address it have landed.