
[Offload] libomptarget force dlopen vendor libraries by default. #92788

Merged
merged 1 commit into llvm:main on May 22, 2024

Conversation

ye-luo
Contributor

@ye-luo ye-luo commented May 20, 2024

Since #87009, libomptarget links all the plugins statically, so all of the plugins' dependencies are exposed to libomptarget: the CUDA plugin depends on libcuda and the amdgpu plugin depends on libhsa unless dlopen is forced. On a cluster with multiple compute node architectures, libomptarget may be built on one node and run on others. At build time, if CMake finds libcuda and `LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA=OFF`, libomptarget links libcuda.so directly, and the resulting libomptarget may fail to run on a node without an NVIDIA driver (for example, a CPU-only or AMD GPU-only machine) with a complaint that libcuda.so cannot be found.

The solution is to set `LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA` and `LIBOMPTARGET_FORCE_DLOPEN_LIBHSA` to `ON`. Preferably this should be the default, to maximize the usability of libomptarget: if CMake detects NVIDIA or AMD software on the node used to build an OS image, the resulting libomptarget may not function on the user side because it requires the vendor runtime libraries to be present.
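For illustration, here is a minimal, hypothetical C++ sketch of the pattern that `LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA=ON` enables: the plugin resolves the driver library at runtime with dlopen/dlsym instead of carrying a hard link-time dependency, so the binary still loads on nodes without an NVIDIA driver. The soname and the cuInit signature follow the public CUDA driver API; everything else is illustrative, not the actual plugin code.

```cpp
// Hypothetical sketch: resolve libcuda at runtime instead of linking it.
#include <dlfcn.h>
#include <cstdio>

// CUDA driver API signature, reproduced here to avoid including cuda.h;
// CUresult is an enum, so int stands in for it.
using cuInit_t = int (*)(unsigned int);

int main() {
  void *Handle = dlopen("libcuda.so.1", RTLD_LAZY);
  if (!Handle) {
    // No NVIDIA driver on this node: report no CUDA devices and move on
    // instead of failing to load the whole runtime.
    std::fprintf(stderr, "libcuda.so.1 not found; disabling CUDA plugin\n");
    return 0;
  }
  auto CuInit = reinterpret_cast<cuInit_t>(dlsym(Handle, "cuInit"));
  if (CuInit && CuInit(/*Flags=*/0) == 0) // 0 == CUDA_SUCCESS
    std::printf("CUDA driver initialized\n");
  dlclose(Handle);
  return 0;
}
```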

@llvmbot
Collaborator

llvmbot commented May 20, 2024

@llvm/pr-subscribers-offload

@llvm/pr-subscribers-backend-amdgpu

Author: Ye Luo (ye-luo)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/92788.diff

2 Files Affected:

  • (modified) offload/plugins-nextgen/amdgpu/CMakeLists.txt (+1-1)
  • (modified) offload/plugins-nextgen/cuda/CMakeLists.txt (+1-1)
diff --git a/offload/plugins-nextgen/amdgpu/CMakeLists.txt b/offload/plugins-nextgen/amdgpu/CMakeLists.txt
index 2f4057c0ae7ef..7630e3788dae0 100644
--- a/offload/plugins-nextgen/amdgpu/CMakeLists.txt
+++ b/offload/plugins-nextgen/amdgpu/CMakeLists.txt
@@ -13,7 +13,7 @@ target_sources(omptarget.rtl.amdgpu PRIVATE src/rtl.cpp)
 target_include_directories(omptarget.rtl.amdgpu PRIVATE
                            ${CMAKE_CURRENT_SOURCE_DIR}/utils)
 
-option(LIBOMPTARGET_FORCE_DLOPEN_LIBHSA "Build with dlopened libhsa" OFF)
+option(LIBOMPTARGET_FORCE_DLOPEN_LIBHSA "Build with dlopened libhsa" ON)
 if(hsa-runtime64_FOUND AND NOT LIBOMPTARGET_FORCE_DLOPEN_LIBHSA)
   message(STATUS "Building AMDGPU plugin linked against libhsa")
   target_link_libraries(omptarget.rtl.amdgpu PRIVATE hsa-runtime64::hsa-runtime64)
diff --git a/offload/plugins-nextgen/cuda/CMakeLists.txt b/offload/plugins-nextgen/cuda/CMakeLists.txt
index 10ff612848ad2..fa5559c5e7dcb 100644
--- a/offload/plugins-nextgen/cuda/CMakeLists.txt
+++ b/offload/plugins-nextgen/cuda/CMakeLists.txt
@@ -10,7 +10,7 @@ add_target_library(omptarget.rtl.cuda CUDA)
 
 target_sources(omptarget.rtl.cuda PRIVATE src/rtl.cpp)
 
-option(LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA "Build with dlopened libcuda" OFF)
+option(LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA "Build with dlopened libcuda" ON)
 if(LIBOMPTARGET_DEP_CUDA_FOUND AND NOT LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA)
   message(STATUS "Building CUDA plugin linked against libcuda")
   target_link_libraries(omptarget.rtl.cuda PRIVATE CUDA::cuda_driver)

@shiltian shiltian requested a review from jhuber6 May 20, 2024 17:17
Contributor

@shiltian shiltian left a comment


This makes sense to me.

@jhuber6
Contributor

jhuber6 commented May 20, 2024

Does #87009 really make a difference? I thought that in the old case we would simply dlopen the old libomptarget.rtl.cuda.so, which would then be linked against libcuda.so directly.

@ye-luo
Contributor Author

ye-luo commented May 20, 2024

> Does #87009 really make a difference? I thought that in the old case we would simply dlopen the old libomptarget.rtl.cuda.so, which would then be linked against libcuda.so directly.

In the old case, the dlopen of libomptarget.rtl.cuda.so fails but the user program continues.
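For context, a simplified, hypothetical sketch of the pre-#87009 scheme described here: libomptarget dlopened each plugin shared library and skipped any that failed to load, so a missing libcuda.so only disabled the CUDA plugin rather than the whole runtime. The plugin names and control flow are illustrative only, not the actual libomptarget code.

```cpp
// Hypothetical sketch of the old plugin-loading loop.
#include <dlfcn.h>
#include <cstdio>

int main() {
  const char *Plugins[] = {"libomptarget.rtl.cuda.so",
                           "libomptarget.rtl.amdgpu.so",
                           "libomptarget.rtl.x86_64.so"};
  for (const char *Name : Plugins) {
    void *Handle = dlopen(Name, RTLD_NOW);
    if (!Handle) {
      // A plugin whose vendor runtime is absent fails to load; the user
      // program keeps running with whatever plugins remain.
      std::fprintf(stderr, "skipping %s: %s\n", Name, dlerror());
      continue;
    }
    std::printf("loaded %s\n", Name);
  }
  return 0;
}
```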

@jhuber6
Contributor

jhuber6 commented May 20, 2024

> In the old case, the dlopen of libomptarget.rtl.cuda.so fails but the user program continues.

I see, I think this makes sense as a sane default. However, I'll wait until we discuss it in the Wednesday meeting to make the switch.

@jhuber6
Contributor

jhuber6 commented May 21, 2024

Would it be possible for you to do a quick check on one of your applications to see if there's any noticeable performance difference? I'm going to guess that calling dlsym ~100 times isn't going to show up as an appreciable difference.
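As a rough illustration of why ~100 dlsym calls amount to a one-time cost: the plugin can resolve its whole entry-point table once during initialization, and every later offload call goes through the cached function pointers. The symbol list below is an illustrative subset, not the plugin's actual table.

```cpp
// Hypothetical sketch: resolve a table of driver symbols once at startup.
#include <dlfcn.h>
#include <cstdio>

int main() {
  void *Handle = dlopen("libcuda.so.1", RTLD_LAZY);
  if (!Handle)
    return 0; // No driver on this node; nothing to resolve.

  const char *Symbols[] = {"cuInit", "cuMemAlloc_v2", "cuMemcpyHtoD_v2",
                           "cuLaunchKernel"};
  constexpr int NumSymbols = sizeof(Symbols) / sizeof(Symbols[0]);
  void *Table[NumSymbols] = {};

  // One pass during initialization; later calls just reuse Table[I].
  for (int I = 0; I < NumSymbols; ++I)
    Table[I] = dlsym(Handle, Symbols[I]);

  for (int I = 0; I < NumSymbols; ++I)
    std::printf("%-16s -> %p\n", Symbols[I], Table[I]);
  dlclose(Handle);
  return 0;
}
```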

@ye-luo
Contributor Author

ye-luo commented May 21, 2024

> Would it be possible for you to do a quick check on one of your applications to see if there's any noticeable performance difference? I'm going to guess that calling dlsym ~100 times isn't going to show up as an appreciable difference.

libomptarget initialization has never been an issue on any machine I have used, so I don't see why dlopen would be a concern. What I mean is that I won't have good measurements to address your concern; maybe you can try it out on a machine where you do have that concern?

Contributor

@jhuber6 jhuber6 left a comment


Looks good after discussion with others. I will probably rework the logic here later to make it more consistent, but this is fine for now.

@ye-luo ye-luo merged commit 831d143 into llvm:main May 22, 2024
7 checks passed
@ye-luo ye-luo deleted the dlopen-vendor branch May 22, 2024 14:40
@jplehr
Contributor

jplehr commented May 22, 2024

This appears to have caused/exposed a lot more intermittent failures of two test cases on our buildbot compared to what it used to produce:
https://lab.llvm.org/buildbot/#/builders/193/builds/52340
https://lab.llvm.org/buildbot/#/builders/193/builds/52344
https://lab.llvm.org/buildbot/#/builders/193/builds/52348

Any ideas why that would be happening?

@jhuber6
Contributor

jhuber6 commented May 22, 2024

> This appears to have caused/exposed a lot more intermittent failures of two test cases on our buildbot compared to what it used to produce: https://lab.llvm.org/buildbot/#/builders/193/builds/52340 https://lab.llvm.org/buildbot/#/builders/193/builds/52344 https://lab.llvm.org/buildbot/#/builders/193/builds/52348
>
> Any ideas why that would be happening?

Confusing, considering that it's failing on x86 and not CUDA or HSA, which is what this change directly affects.

@jplehr
Contributor

jplehr commented May 22, 2024

Yeah, I wonder if it simply exposes something that has been lingering for some time.

@jhuber6
Contributor

jhuber6 commented May 22, 2024

> Yeah, I wonder if it simply exposes something that has been lingering for some time.

I can't reproduce it locally. Do you think you could try running it manually with debugging?

searlmc1 pushed a commit to ROCm/llvm-project that referenced this pull request May 24, 2024
Disabled: [Offload] libomptarget force dlopen vendor libraries by default. (llvm#92788)
Change-Id: Ibe7da67ea294b76e9d159372e1d6151fe6b5cba2