Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AMDGPU] Defaults for missing dimensions in SYCL required wg size #72652

Closed

Conversation

jchlanda
Copy link
Contributor

@jchlanda jchlanda commented Nov 17, 2023

SYCL allows for required work group to be partially specified (i.e. not all 3 dimensions):

This fails AMDGPU's attribute verification. The patch aims to provide the default values for missing dimensions when dealing with SYCL kernels. Rather than modifying the module's metadata it uses internal data to padd missing values.

SYCL allows for required work group to be partially specified (i.e. not
all 3 dimensions):
* https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:kernel.attributes
This fails AMDGPU's attribute verification. The patch aims to provide
the default values for missing dimensions when dealing with SYCL
kernels. Rather than modifying the module's metadata it uses internal
data to padd missing values.
@jchlanda jchlanda force-pushed the jakub/amd_rqwgs_sycl_streamer_padd branch from 1644043 to c2c9e5e Compare November 20, 2023 14:08
@jchlanda jchlanda requested a review from arsenm November 21, 2023 13:14
@jchlanda jchlanda marked this pull request as ready for review November 21, 2023 13:16
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 21, 2023

@llvm/pr-subscribers-backend-amdgpu

Author: Jakub Chlanda (jchlanda)

Changes

SYCL allows for required work group to be partially specified (i.e. not all 3 dimensions):

This fails AMDGPU's attribute verification. The patch aims to provide the default values for missing dimensions when dealing with SYCL kernels. Rather than modifying the module's metadata it uses internal data to padd missing values.


Full diff: https://github.com/llvm/llvm-project/pull/72652.diff

3 Files Affected:

  • (modified) llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp (+8-4)
  • (modified) llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h (+2-1)
  • (added) llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll (+42)
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
index b51a876750b58b0..bff30bda357e2b9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
@@ -173,14 +173,18 @@ std::string MetadataStreamerMsgPackV4::getTypeName(Type *Ty,
 }
 
 msgpack::ArrayDocNode
-MetadataStreamerMsgPackV4::getWorkGroupDimensions(MDNode *Node) const {
+MetadataStreamerMsgPackV4::getWorkGroupDimensions(const Function &Func,
+                                                  MDNode *Node) const {
   auto Dims = HSAMetadataDoc->getArrayNode();
-  if (Node->getNumOperands() != 3)
+  if (Node->getNumOperands() != 3 && !Func.hasFnAttribute("sycl-module-id"))
     return Dims;
 
   for (auto &Op : Node->operands())
     Dims.push_back(Dims.getDocument()->getNode(
         uint64_t(mdconst::extract<ConstantInt>(Op)->getZExtValue())));
+  for (unsigned I = Dims.size(); I < 3; ++I)
+    Dims.push_back(Dims.getDocument()->getNode(1));
+
   return Dims;
 }
 
@@ -233,9 +237,9 @@ void MetadataStreamerMsgPackV4::emitKernelAttrs(const Function &Func,
                                                 msgpack::MapDocNode Kern) {
 
   if (auto Node = Func.getMetadata("reqd_work_group_size"))
-    Kern[".reqd_workgroup_size"] = getWorkGroupDimensions(Node);
+    Kern[".reqd_workgroup_size"] = getWorkGroupDimensions(Func, Node);
   if (auto Node = Func.getMetadata("work_group_size_hint"))
-    Kern[".workgroup_size_hint"] = getWorkGroupDimensions(Node);
+    Kern[".workgroup_size_hint"] = getWorkGroupDimensions(Func, Node);
   if (auto Node = Func.getMetadata("vec_type_hint")) {
     Kern[".vec_type_hint"] = Kern.getDocument()->getNode(
         getTypeName(
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
index 18a7b5d7a9633e8..3214f096f27b9a9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
@@ -81,7 +81,8 @@ class MetadataStreamerMsgPackV4 : public MetadataStreamer {
 
   std::string getTypeName(Type *Ty, bool Signed) const;
 
-  msgpack::ArrayDocNode getWorkGroupDimensions(MDNode *Node) const;
+  msgpack::ArrayDocNode getWorkGroupDimensions(const Function &Func,
+                                               MDNode *Node) const;
 
   msgpack::MapDocNode getHSAKernelProps(const MachineFunction &MF,
                                         const SIProgramInfo &ProgramInfo,
diff --git a/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll b/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll
new file mode 100644
index 000000000000000..1999a55ff31ee5a
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll
@@ -0,0 +1,42 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs < %s | FileCheck %s
+
+; Make sure that SYCL kernels with less than 3 dimensions specified in required
+; work group size, have those dimensions padded up with 1.
+
+; CHECK-LABEL: .name:           sycl_kernel_1dim
+; CHECK:    .reqd_workgroup_size:
+; CHECK-NEXT:      - 3
+; CHECK-NEXT:      - 1
+; CHECK-NEXT:      - 1
+define weak_odr protected amdgpu_kernel void @sycl_kernel_1dim() #1 !reqd_work_group_size !0 {
+entry:
+  ret void
+}
+
+; CHECK-LABEL: .name:           sycl_kernel_2dim
+; CHECK:    .reqd_workgroup_size:
+; CHECK-NEXT:      - 5
+; CHECK-NEXT:      - 7
+; CHECK-NEXT:      - 1
+define weak_odr protected amdgpu_kernel void @sycl_kernel_2dim() #1 !reqd_work_group_size !1 {
+entry:
+  ret void
+}
+
+; CHECK-LABEL: .name:           sycl_kernel_3dim
+; CHECK:    .reqd_workgroup_size:
+; CHECK-NEXT:      - 11 
+; CHECK-NEXT:      - 13
+; CHECK-NEXT:      - 17
+define weak_odr protected amdgpu_kernel void @sycl_kernel_3dim() #1 !reqd_work_group_size !2 {
+entry:
+  ret void
+}
+
+attributes #0 = { nounwind speculatable memory(none) }
+attributes #1 = { "sycl-module-id"="reqd_work_group_size_check_exception.cpp" }
+
+
+!0 = !{i32 3}
+!1 = !{i32 5, i32 7}
+!2 = !{i32 11, i32 13, i32 17}

@jchlanda
Copy link
Contributor Author

Friendly ping @arsenm

Copy link
Contributor

@arsenm arsenm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think we need to have one single spec for the metadata and not modify based on any is-this-language type annotations. The different producers and consumers should just agree on the operand constraints of the metadata

auto Dims = HSAMetadataDoc->getArrayNode();
if (Node->getNumOperands() != 3)
if (Node->getNumOperands() != 3 && !Func.hasFnAttribute("sycl-module-id"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The attribute check should not be necessary. The metadata should have a standalone interpretation and not depend on any other attributes

; CHECK-NEXT: - 3
; CHECK-NEXT: - 1
; CHECK-NEXT: - 1
define weak_odr protected amdgpu_kernel void @sycl_kernel_1dim() #1 !reqd_work_group_size !0 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't need the weak_odrs

@@ -0,0 +1,42 @@
; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs < %s | FileCheck %s
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't bother with -verify-machineinstrs, it's unlikely to catch anything here and will be caught in EXPENSIVE_CHECKS builds

@jchlanda
Copy link
Contributor Author

Closing this PR, we've decide to handle it internally in SYCL (intel/llvm#13600). It aligns with OpenCL's and always pads missing dimensions with 1.

@jchlanda jchlanda closed this May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants