-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[AMDGPU] Defaults for missing dimensions in SYCL required wg size #72652
Conversation
SYCL allows for required work group to be partially specified (i.e. not all 3 dimensions): * https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:kernel.attributes This fails AMDGPU's attribute verification. The patch aims to provide the default values for missing dimensions when dealing with SYCL kernels. Rather than modifying the module's metadata it uses internal data to padd missing values.
1644043
to
c2c9e5e
Compare
@llvm/pr-subscribers-backend-amdgpu Author: Jakub Chlanda (jchlanda) ChangesSYCL allows for required work group to be partially specified (i.e. not all 3 dimensions): This fails AMDGPU's attribute verification. The patch aims to provide the default values for missing dimensions when dealing with SYCL kernels. Rather than modifying the module's metadata it uses internal data to padd missing values. Full diff: https://github.com/llvm/llvm-project/pull/72652.diff 3 Files Affected:
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
index b51a876750b58b0..bff30bda357e2b9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
@@ -173,14 +173,18 @@ std::string MetadataStreamerMsgPackV4::getTypeName(Type *Ty,
}
msgpack::ArrayDocNode
-MetadataStreamerMsgPackV4::getWorkGroupDimensions(MDNode *Node) const {
+MetadataStreamerMsgPackV4::getWorkGroupDimensions(const Function &Func,
+ MDNode *Node) const {
auto Dims = HSAMetadataDoc->getArrayNode();
- if (Node->getNumOperands() != 3)
+ if (Node->getNumOperands() != 3 && !Func.hasFnAttribute("sycl-module-id"))
return Dims;
for (auto &Op : Node->operands())
Dims.push_back(Dims.getDocument()->getNode(
uint64_t(mdconst::extract<ConstantInt>(Op)->getZExtValue())));
+ for (unsigned I = Dims.size(); I < 3; ++I)
+ Dims.push_back(Dims.getDocument()->getNode(1));
+
return Dims;
}
@@ -233,9 +237,9 @@ void MetadataStreamerMsgPackV4::emitKernelAttrs(const Function &Func,
msgpack::MapDocNode Kern) {
if (auto Node = Func.getMetadata("reqd_work_group_size"))
- Kern[".reqd_workgroup_size"] = getWorkGroupDimensions(Node);
+ Kern[".reqd_workgroup_size"] = getWorkGroupDimensions(Func, Node);
if (auto Node = Func.getMetadata("work_group_size_hint"))
- Kern[".workgroup_size_hint"] = getWorkGroupDimensions(Node);
+ Kern[".workgroup_size_hint"] = getWorkGroupDimensions(Func, Node);
if (auto Node = Func.getMetadata("vec_type_hint")) {
Kern[".vec_type_hint"] = Kern.getDocument()->getNode(
getTypeName(
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
index 18a7b5d7a9633e8..3214f096f27b9a9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
@@ -81,7 +81,8 @@ class MetadataStreamerMsgPackV4 : public MetadataStreamer {
std::string getTypeName(Type *Ty, bool Signed) const;
- msgpack::ArrayDocNode getWorkGroupDimensions(MDNode *Node) const;
+ msgpack::ArrayDocNode getWorkGroupDimensions(const Function &Func,
+ MDNode *Node) const;
msgpack::MapDocNode getHSAKernelProps(const MachineFunction &MF,
const SIProgramInfo &ProgramInfo,
diff --git a/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll b/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll
new file mode 100644
index 000000000000000..1999a55ff31ee5a
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll
@@ -0,0 +1,42 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs < %s | FileCheck %s
+
+; Make sure that SYCL kernels with less than 3 dimensions specified in required
+; work group size, have those dimensions padded up with 1.
+
+; CHECK-LABEL: .name: sycl_kernel_1dim
+; CHECK: .reqd_workgroup_size:
+; CHECK-NEXT: - 3
+; CHECK-NEXT: - 1
+; CHECK-NEXT: - 1
+define weak_odr protected amdgpu_kernel void @sycl_kernel_1dim() #1 !reqd_work_group_size !0 {
+entry:
+ ret void
+}
+
+; CHECK-LABEL: .name: sycl_kernel_2dim
+; CHECK: .reqd_workgroup_size:
+; CHECK-NEXT: - 5
+; CHECK-NEXT: - 7
+; CHECK-NEXT: - 1
+define weak_odr protected amdgpu_kernel void @sycl_kernel_2dim() #1 !reqd_work_group_size !1 {
+entry:
+ ret void
+}
+
+; CHECK-LABEL: .name: sycl_kernel_3dim
+; CHECK: .reqd_workgroup_size:
+; CHECK-NEXT: - 11
+; CHECK-NEXT: - 13
+; CHECK-NEXT: - 17
+define weak_odr protected amdgpu_kernel void @sycl_kernel_3dim() #1 !reqd_work_group_size !2 {
+entry:
+ ret void
+}
+
+attributes #0 = { nounwind speculatable memory(none) }
+attributes #1 = { "sycl-module-id"="reqd_work_group_size_check_exception.cpp" }
+
+
+!0 = !{i32 3}
+!1 = !{i32 5, i32 7}
+!2 = !{i32 11, i32 13, i32 17}
|
Friendly ping @arsenm |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think we need to have one single spec for the metadata and not modify based on any is-this-language type annotations. The different producers and consumers should just agree on the operand constraints of the metadata
auto Dims = HSAMetadataDoc->getArrayNode(); | ||
if (Node->getNumOperands() != 3) | ||
if (Node->getNumOperands() != 3 && !Func.hasFnAttribute("sycl-module-id")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The attribute check should not be necessary. The metadata should have a standalone interpretation and not depend on any other attributes
; CHECK-NEXT: - 3 | ||
; CHECK-NEXT: - 1 | ||
; CHECK-NEXT: - 1 | ||
define weak_odr protected amdgpu_kernel void @sycl_kernel_1dim() #1 !reqd_work_group_size !0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't need the weak_odrs
@@ -0,0 +1,42 @@ | |||
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs < %s | FileCheck %s |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't bother with -verify-machineinstrs, it's unlikely to catch anything here and will be caught in EXPENSIVE_CHECKS builds
Closing this PR, we've decide to handle it internally in SYCL (intel/llvm#13600). It aligns with OpenCL's and always pads missing dimensions with |
SYCL allows for required work group to be partially specified (i.e. not all 3 dimensions):
This fails AMDGPU's attribute verification. The patch aims to provide the default values for missing dimensions when dealing with SYCL kernels. Rather than modifying the module's metadata it uses internal data to padd missing values.