-
Notifications
You must be signed in to change notification settings - Fork 11.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Offload] libomptarget force dlopen vendor libraries by default. #92788
Conversation
Since llvm#87009, libomptarget directly links all the plugins statically. All the dependencies of plugins got exposed to libomptarget. The CUDA plugin depends on libcuda and the amdgpu plugin depends on libhsa if not forced using dlopen. On a cluster with different compute node architectures, libomptarget can be built and run on different nodes. In the build stage, if cmake founds libcuda and `LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA=OFF`, libomptarget links libcuda.so directly and the result libomptarget may not run a node without a NVIDIA driver for example a CPU or AMD GPU only machine with a complaint that libcuda.so not found. The solution is setting `LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA` and `LIBOMPTARGET_FORCE_DLOPEN_LIBHSA` `ON`. Preferably this should be default. If cmake detects NVIDIA or AMD software on an OS imaging building node, the resulted libomptarget may not be able to function on the user side due to the requirement the existence of vendor runtime libraries.
@llvm/pr-subscribers-offload @llvm/pr-subscribers-backend-amdgpu Author: Ye Luo (ye-luo) ChangesSince #87009, libomptarget directly links all the plugins statically. All the dependencies of plugins got exposed to libomptarget. The CUDA plugin depends on libcuda and the amdgpu plugin depends on libhsa if not forced using dlopen. On a cluster with different compute node architectures, libomptarget can be built and run on different nodes. In the build stage, if cmake founds libcuda and The solution is setting Full diff: https://github.com/llvm/llvm-project/pull/92788.diff 2 Files Affected:
diff --git a/offload/plugins-nextgen/amdgpu/CMakeLists.txt b/offload/plugins-nextgen/amdgpu/CMakeLists.txt
index 2f4057c0ae7ef..7630e3788dae0 100644
--- a/offload/plugins-nextgen/amdgpu/CMakeLists.txt
+++ b/offload/plugins-nextgen/amdgpu/CMakeLists.txt
@@ -13,7 +13,7 @@ target_sources(omptarget.rtl.amdgpu PRIVATE src/rtl.cpp)
target_include_directories(omptarget.rtl.amdgpu PRIVATE
${CMAKE_CURRENT_SOURCE_DIR}/utils)
-option(LIBOMPTARGET_FORCE_DLOPEN_LIBHSA "Build with dlopened libhsa" OFF)
+option(LIBOMPTARGET_FORCE_DLOPEN_LIBHSA "Build with dlopened libhsa" ON)
if(hsa-runtime64_FOUND AND NOT LIBOMPTARGET_FORCE_DLOPEN_LIBHSA)
message(STATUS "Building AMDGPU plugin linked against libhsa")
target_link_libraries(omptarget.rtl.amdgpu PRIVATE hsa-runtime64::hsa-runtime64)
diff --git a/offload/plugins-nextgen/cuda/CMakeLists.txt b/offload/plugins-nextgen/cuda/CMakeLists.txt
index 10ff612848ad2..fa5559c5e7dcb 100644
--- a/offload/plugins-nextgen/cuda/CMakeLists.txt
+++ b/offload/plugins-nextgen/cuda/CMakeLists.txt
@@ -10,7 +10,7 @@ add_target_library(omptarget.rtl.cuda CUDA)
target_sources(omptarget.rtl.cuda PRIVATE src/rtl.cpp)
-option(LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA "Build with dlopened libcuda" OFF)
+option(LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA "Build with dlopened libcuda" ON)
if(LIBOMPTARGET_DEP_CUDA_FOUND AND NOT LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA)
message(STATUS "Building CUDA plugin linked against libcuda")
target_link_libraries(omptarget.rtl.cuda PRIVATE CUDA::cuda_driver)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me.
Does #87009 really make a difference? I thought that in the old case we would simply |
In the old case, dlopen |
I see, I think this makes sense as a sane default. However I'll wait until we discuss it in the Wednesday meeting to make the switch. |
Would it be possible for you to do a quick check on one of your applications to see if there's any noticeable performance difference? I'm going to guess that calling |
libomptarget initialization was never an issue on all the machine I have ever used. I don't see why dlopen is a concern. I meant I won't have good measurements for your concern. Maybe you try it out on a machine that you have concern? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good after discussion with others. I will probably rework the logic here later to make it more consistent, but this is fine for now.
This appears to have caused/exposed our buildbot to produce a lot more intermittent failures of two test cases compared to what it used to produce: Any ideas why that would be happening? |
Confusing considering that it's failing on x86 and not CUDA nor HSA which is what this directly affects. |
yeah, i wonder if it simply exposes something that has been lingering for some time. |
I can't reproduce it locally. Think you could try running it manually w/ debugging? |
Disabled: [Offload] libomptarget force dlopen vendor libraries by default. (llvm#92788) Change-Id: Ibe7da67ea294b76e9d159372e1d6151fe6b5cba2
Since #87009, libomptarget directly links all the plugins statically. All the dependencies of plugins got exposed to libomptarget. The CUDA plugin depends on libcuda and the amdgpu plugin depends on libhsa if not forced using dlopen. On a cluster with different compute node architectures, libomptarget can be built and run on different nodes. In the build stage, if cmake founds libcuda and
LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA=OFF
, libomptarget links libcuda.so directly and the result libomptarget may not run a node without a NVIDIA driver for example a CPU or AMD GPU only machine with a complaint that libcuda.so not found.The solution is setting
LIBOMPTARGET_FORCE_DLOPEN_LIBCUDA
andLIBOMPTARGET_FORCE_DLOPEN_LIBHSA
ON
. Preferably this should be default to maximize the usability of libomptarget. If cmake detects NVIDIA or AMD software on an OS imaging building node, the resulted libomptarget may not be able to function on the user side due to the requirement the existence of vendor runtime libraries.