-
Notifications
You must be signed in to change notification settings - Fork 11k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU][OpenMP] Using HSA API calls for APU detection. #90186
base: main
Are you sure you want to change the base?
Conversation
@llvm/pr-subscribers-offload Author: Thorsten Blaß (ThorBl) ChangesA simplified APU detection algorithm will be enabled if ROCm 6.1 is available at the compiler's build time. This algorithm is more reliable and simplified, as it doesn't rely on string comparison of the GFX name. With this new algorithm, the system can effectively distinguish between an MI300A and an MI300X without the need for further checking. Full diff: https://github.com/llvm/llvm-project/pull/90186.diff 3 Files Affected:
diff --git a/offload/plugins-nextgen/amdgpu/CMakeLists.txt b/offload/plugins-nextgen/amdgpu/CMakeLists.txt
index f5f7096137c20f..5fb4a37b6604d2 100644
--- a/offload/plugins-nextgen/amdgpu/CMakeLists.txt
+++ b/offload/plugins-nextgen/amdgpu/CMakeLists.txt
@@ -40,7 +40,9 @@ if(hsa-runtime64_FOUND AND NOT LIBOMPTARGET_FORCE_DLOPEN_LIBHSA)
target_link_libraries(omptarget.rtl.amdgpu PRIVATE hsa-runtime64::hsa-runtime64)
else()
libomptarget_say("Building AMDGPU plugin for dlopened libhsa")
- target_include_directories(omptarget.rtl.amdgpu PRIVATE dynamic_hsa)
+ target_include_directories(omptarget.rtl.amdgpu
+ PRIVATE dynamic_hsa
+ PRIVATE dynamic_hip)
target_sources(omptarget.rtl.amdgpu PRIVATE dynamic_hsa/hsa.cpp)
endif()
diff --git a/offload/plugins-nextgen/amdgpu/dynamic_hip/hip_version.h b/offload/plugins-nextgen/amdgpu/dynamic_hip/hip_version.h
new file mode 100644
index 00000000000000..496331cf86ec8f
--- /dev/null
+++ b/offload/plugins-nextgen/amdgpu/dynamic_hip/hip_version.h
@@ -0,0 +1,16 @@
+#ifndef HIP_VERSION
+#define HIP_VERSION
+
+#define HIP_VERSION_MAJOR 0
+#define HIP_VERSION_MINOR 0
+#define HIP_VERSION_PATCH 0
+#define HIP_VERSION_GITHASH ""
+#define HIP_VERSION_BUILD_ID 0
+#define HIP_VERSION_BUILD_NAME ""
+#define HIP_VERSION \
+ (HIP_VERSION_MAJOR * 10000000 + HIP_VERSION_MINOR * 100000 + \
+ HIP_VERSION_PATCH)
+
+#define __HIP_HAS_GET_PCH 0
+
+#endif /* HIP_VERSION */
diff --git a/offload/plugins-nextgen/amdgpu/src/rtl.cpp b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
index 00650b801b4202..58323dbd7233d2 100644
--- a/offload/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -57,6 +57,12 @@
#endif
#if defined(__has_include)
+#if __has_include("hip/hip_version.h")
+#include "hip/hip_version.h"
+#else
+#include "hip_version.h"
+#endif
+
#if __has_include("hsa/hsa.h")
#include "hsa/hsa.h"
#include "hsa/hsa_ext_amd.h"
@@ -2809,7 +2815,21 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
/// Detect if current architecture is an APU.
Error checkIfAPU() {
- // TODO: replace with ROCr API once it becomes available.
+#if (HIP_VERSION_MAJOR >= 6 && HIP_VERSION_MINOR >= 1)
+
+ uint8_t MemoryProperties[8];
+
+ hsa_status_t Stat = hsa_agent_get_info(
+ getAgent(), (hsa_agent_info_t)HSA_AMD_AGENT_INFO_MEMORY_PROPERTIES,
+ MemoryProperties);
+
+ if (auto Err = Plugin::check(Stat, "Error: Unable to fetch the memory "
+ "properties of the GPU. (%s)\n"))
+ return Err;
+
+ IsAPU = hsa_flag_isset64(MemoryProperties,
+ HSA_AMD_MEMORY_PROPERTY_AGENT_IS_APU);
+#else
llvm::StringRef StrGfxName(ComputeUnitKind);
IsAPU = llvm::StringSwitch<bool>(StrGfxName)
.Case("gfx940", true)
@@ -2832,6 +2852,7 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
IsAPU = true;
return Plugin::success();
}
+#endif
return Plugin::success();
}
|
@llvm/pr-subscribers-backend-amdgpu Author: Thorsten Blaß (ThorBl) ChangesA simplified APU detection algorithm will be enabled if ROCm 6.1 is available at the compiler's build time. This algorithm is more reliable and simplified, as it doesn't rely on string comparison of the GFX name. With this new algorithm, the system can effectively distinguish between an MI300A and an MI300X without the need for further checking. Full diff: https://github.com/llvm/llvm-project/pull/90186.diff 3 Files Affected:
diff --git a/offload/plugins-nextgen/amdgpu/CMakeLists.txt b/offload/plugins-nextgen/amdgpu/CMakeLists.txt
index f5f7096137c20f..5fb4a37b6604d2 100644
--- a/offload/plugins-nextgen/amdgpu/CMakeLists.txt
+++ b/offload/plugins-nextgen/amdgpu/CMakeLists.txt
@@ -40,7 +40,9 @@ if(hsa-runtime64_FOUND AND NOT LIBOMPTARGET_FORCE_DLOPEN_LIBHSA)
target_link_libraries(omptarget.rtl.amdgpu PRIVATE hsa-runtime64::hsa-runtime64)
else()
libomptarget_say("Building AMDGPU plugin for dlopened libhsa")
- target_include_directories(omptarget.rtl.amdgpu PRIVATE dynamic_hsa)
+ target_include_directories(omptarget.rtl.amdgpu
+ PRIVATE dynamic_hsa
+ PRIVATE dynamic_hip)
target_sources(omptarget.rtl.amdgpu PRIVATE dynamic_hsa/hsa.cpp)
endif()
diff --git a/offload/plugins-nextgen/amdgpu/dynamic_hip/hip_version.h b/offload/plugins-nextgen/amdgpu/dynamic_hip/hip_version.h
new file mode 100644
index 00000000000000..496331cf86ec8f
--- /dev/null
+++ b/offload/plugins-nextgen/amdgpu/dynamic_hip/hip_version.h
@@ -0,0 +1,16 @@
+#ifndef HIP_VERSION
+#define HIP_VERSION
+
+#define HIP_VERSION_MAJOR 0
+#define HIP_VERSION_MINOR 0
+#define HIP_VERSION_PATCH 0
+#define HIP_VERSION_GITHASH ""
+#define HIP_VERSION_BUILD_ID 0
+#define HIP_VERSION_BUILD_NAME ""
+#define HIP_VERSION \
+ (HIP_VERSION_MAJOR * 10000000 + HIP_VERSION_MINOR * 100000 + \
+ HIP_VERSION_PATCH)
+
+#define __HIP_HAS_GET_PCH 0
+
+#endif /* HIP_VERSION */
diff --git a/offload/plugins-nextgen/amdgpu/src/rtl.cpp b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
index 00650b801b4202..58323dbd7233d2 100644
--- a/offload/plugins-nextgen/amdgpu/src/rtl.cpp
+++ b/offload/plugins-nextgen/amdgpu/src/rtl.cpp
@@ -57,6 +57,12 @@
#endif
#if defined(__has_include)
+#if __has_include("hip/hip_version.h")
+#include "hip/hip_version.h"
+#else
+#include "hip_version.h"
+#endif
+
#if __has_include("hsa/hsa.h")
#include "hsa/hsa.h"
#include "hsa/hsa_ext_amd.h"
@@ -2809,7 +2815,21 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
/// Detect if current architecture is an APU.
Error checkIfAPU() {
- // TODO: replace with ROCr API once it becomes available.
+#if (HIP_VERSION_MAJOR >= 6 && HIP_VERSION_MINOR >= 1)
+
+ uint8_t MemoryProperties[8];
+
+ hsa_status_t Stat = hsa_agent_get_info(
+ getAgent(), (hsa_agent_info_t)HSA_AMD_AGENT_INFO_MEMORY_PROPERTIES,
+ MemoryProperties);
+
+ if (auto Err = Plugin::check(Stat, "Error: Unable to fetch the memory "
+ "properties of the GPU. (%s)\n"))
+ return Err;
+
+ IsAPU = hsa_flag_isset64(MemoryProperties,
+ HSA_AMD_MEMORY_PROPERTY_AGENT_IS_APU);
+#else
llvm::StringRef StrGfxName(ComputeUnitKind);
IsAPU = llvm::StringSwitch<bool>(StrGfxName)
.Case("gfx940", true)
@@ -2832,6 +2852,7 @@ struct AMDGPUDeviceTy : public GenericDeviceTy, AMDGenericDeviceTy {
IsAPU = true;
return Plugin::success();
}
+#endif
return Plugin::success();
}
|
if (auto Err = Plugin::check(Stat, "Error: Unable to fetch the memory " | ||
"properties of the GPU. (%s)\n")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we get a better / not generic error from Stat?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not aware there is a way to get a more precise error description than what is encoded in the hsa_status_t enum. I will have another look to see if I can find anything, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this require HIP at all? We really don't want to link against the HIP runtime.
#else | ||
#include "hip_version.h" | ||
#endif | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing that HSA didn't bump up a HSA_AMD_INTERFACE_VERSION_MINOR
for this feature or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you are right! They didn't bump up HSA_AMD_INTERFACE_VERSION_MINOR. Despite new features, the HSA versions shipped with (at least) ROCm 6.0.2 and ROCm 6.1 have the same HSA_AMD_INTERFACE_VERSION_MINOR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose we need to be more vocal to the HSA developers that we really need the API to be versioned whenever a new interface is added. I really don't want us to depend on HIP here just for a version check that should be in HSA anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I talked to one of the HSA guys. HSA_AMD_INTERFACE_VERSION_MINOR will be bumped up and made available with the 6.2 release. An internal PR for this has already been created. So, I tend to ride it out. I will adjust the #if condition to check for HSA_AMD_INTERFACE_VERSION_MINOR >= 5.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, so we'll automatically pick this up once the ROCm installation upgrades? How about in the dlopen(libhsa-runtime64.so
case? I think we want an actual function to get the versions as well in those cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, if ROCm 6.2 is eventually available during the build process, the new implementation will be used.
Regarding your second question. Are you referring to a method that determines the HSA version at runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So it seems that HSA bumped up to 6.1 and introduced new runtime functions but neglected to bump up any sort of version indicator that would let us detect if they are available. So now, even though these are HSA functions, we're including a completely unrelated library just to check if they exist. This is a really annoying situation and I wish HSA would actually version things.
Honestly I think we should probably just add a version for this into |
A simplified APU detection algorithm will be enabled if ROCm 6.1 is available at the compiler's build time. This algorithm is more reliable and simplified, as it doesn't rely on string comparison of the GFX name. With this new algorithm, the system can effectively distinguish between an MI300A and an MI300X without the need for further checking.
e9635da
to
127c59e
Compare
✅ With the latest revision this PR passed the C/C++ code formatter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG, thanks.
A simplified APU detection algorithm will be enabled if ROCm 6.1 is available at the compiler's build time. This algorithm is more reliable and simplified, as it doesn't rely on string comparison of the GFX name. With this new algorithm, the system can effectively distinguish between an MI300A and an MI300X without the need for further checking.