Failed to load globals when offloading to AMDGPU #54309

@jhuber6

The Thermo4PFM application currently fails to run because the AMDGPU offloading runtime (the "Target AMDGPU RTL" in the logs below) fails to load a global. The project was configured, built, and tested with the following command.

cmake -DCMAKE_CXX_COMPILER=clang++ -DCMAKE_BUILD_TYPE=Release -DWITH_OPENMP_OFFLOAD=ON -DCMAKE_CXX_FLAGS="-fopenmp -fopenmp-cuda-mode -fopenmp-new-driver -fopenmp-targets=amdgcn -Rpass=openmp-opt -Rpass-missed=openmp-opt" && make -j && ctest ./

One of the tests then fails with the following error message.

Libomptarget --> Unable to generate entries table for device id 0.
Libomptarget --> Failed to init globals on device 0
Libomptarget error: Consult https://openmp.llvm.org/design/Runtimes.html for debugging options.
testLoopCALPHADbinaryEquilibrium.cc:106:1: Libomptarget fatal error 1: failure of target construct while offloading is mandatory

The runtime's debug output shows the following about the globals. The last line is the one that fails.

Libomptarget --> Device 0 is ready to use.
Target AMDGPU RTL --> Arg[0] "" (8, 0)
Target AMDGPU RTL --> Arg[1] "" (8, 8)
Target AMDGPU RTL --> Arg[2] "" (8, 16)
Target AMDGPU RTL --> Arg[3] "" (8, 24)
Target AMDGPU RTL --> Arg[4] "" (8, 32)
Target AMDGPU RTL --> Arg[5] "" (8, 40)
Target AMDGPU RTL --> [__omp_offloading_2f_1253762c__ZL29____C_A_T_C_H____T_E_S_T____0v_l106: kernarg seg size] (48 --> 48)
Target AMDGPU RTL --> Modules loaded successful? 1
Target AMDGPU RTL --> Exec Symbol type: 0
Target AMDGPU RTL --> Symbol omptarget_device_environment = 0x7f2aae6f8738 (16 bytes)
Target AMDGPU RTL --> Exec Symbol type: 0
Target AMDGPU RTL --> Symbol __omp_offloading_2f_1253762c__ZL29____C_A_T_C_H____T_E_S_T____0v_l106_exec_mode = 0x7f2aae605480 (1 bytes)
Target AMDGPU RTL --> Exec Symbol type: 1
Target AMDGPU RTL --> Kernel __omp_offloading_2f_1253762c__ZL29____C_A_T_C_H____T_E_S_T____0v_l106 --> 7f2aae605440 symbol 10032 group segsize 0 pvt segsize 48 bytes kernarg
Target AMDGPU RTL --> "Module registering" succeeded
Target AMDGPU RTL --> Setting global device environment after load (16 bytes)
Target AMDGPU RTL --> AMDGPU module successfully loaded!
Target AMDGPU RTL --> No device_state symbol found, skipping initialization
Target AMDGPU RTL --> to find the kernel name: __omp_offloading_2f_1253762c__ZL29____C_A_T_C_H____T_E_S_T____0v_l106 size: 69
Target AMDGPU RTL --> Warning: Loading KernDesc '__omp_offloading_2f_1253762c__ZL29____C_A_T_C_H____T_E_S_T____0v_l106_kern_desc' - symbol not found, Target AMDGPU RTL --> Warning: Loading WGSize '__omp_offloading_2f_1253762c__ZL29____C_A_T_C_H____T_E_S_T____0v_l106_wg_size' - symbol not found, using default value 256
Target AMDGPU RTL --> "Loading WGSize computation property" failed
Target AMDGPU RTL --> After loading global for __omp_offloading_2f_1253762c__ZL29____C_A_T_C_H____T_E_S_T____0v_l106_exec_mode ExecMode = 2
Target AMDGPU RTL --> "Loading computation property" succeeded
Target AMDGPU RTL --> Construct kernelinfo: ExecMode 2
Target AMDGPU RTL --> Entry point 0 maps to __omp_offloading_2f_1253762c__ZL29____C_A_T_C_H____T_E_S_T____0v_l106
Target AMDGPU RTL --> Loading global '_ZN10Thermo4PFML11fun_ptr_arrE' (Failed)

This was run using clang 15.0 on a GFX908 GPU.
