[AMDGPU] Defaults for missing dimensions in SYCL required wg size #72652

jchlanda · 2023-11-17T14:36:18Z

SYCL allows for required work group to be partially specified (i.e. not all 3 dimensions):

https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:kernel.attributes

This fails AMDGPU's attribute verification. The patch aims to provide the default values for missing dimensions when dealing with SYCL kernels. Rather than modifying the module's metadata it uses internal data to padd missing values.

SYCL allows for required work group to be partially specified (i.e. not all 3 dimensions): * https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:kernel.attributes This fails AMDGPU's attribute verification. The patch aims to provide the default values for missing dimensions when dealing with SYCL kernels. Rather than modifying the module's metadata it uses internal data to padd missing values.

llvmbot · 2023-11-21T13:16:52Z

@llvm/pr-subscribers-backend-amdgpu

Author: Jakub Chlanda (jchlanda)

Changes

SYCL allows for required work group to be partially specified (i.e. not all 3 dimensions):

https://registry.khronos.org/SYCL/specs/sycl-2020/html/sycl-2020.html#sec:kernel.attributes

This fails AMDGPU's attribute verification. The patch aims to provide the default values for missing dimensions when dealing with SYCL kernels. Rather than modifying the module's metadata it uses internal data to padd missing values.

Full diff: https://github.com/llvm/llvm-project/pull/72652.diff

3 Files Affected:

(modified) llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp (+8-4)
(modified) llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h (+2-1)
(added) llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll (+42)

diff --git a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
index b51a876750b58b0..bff30bda357e2b9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
+++ b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp
@@ -173,14 +173,18 @@ std::string MetadataStreamerMsgPackV4::getTypeName(Type *Ty,
 }
 
 msgpack::ArrayDocNode
-MetadataStreamerMsgPackV4::getWorkGroupDimensions(MDNode *Node) const {
+MetadataStreamerMsgPackV4::getWorkGroupDimensions(const Function &Func,
+                                                  MDNode *Node) const {
   auto Dims = HSAMetadataDoc->getArrayNode();
-  if (Node->getNumOperands() != 3)
+  if (Node->getNumOperands() != 3 && !Func.hasFnAttribute("sycl-module-id"))
     return Dims;
 
   for (auto &Op : Node->operands())
     Dims.push_back(Dims.getDocument()->getNode(
         uint64_t(mdconst::extract<ConstantInt>(Op)->getZExtValue())));
+  for (unsigned I = Dims.size(); I < 3; ++I)
+    Dims.push_back(Dims.getDocument()->getNode(1));
+
   return Dims;
 }
 
@@ -233,9 +237,9 @@ void MetadataStreamerMsgPackV4::emitKernelAttrs(const Function &Func,
                                                 msgpack::MapDocNode Kern) {
 
   if (auto Node = Func.getMetadata("reqd_work_group_size"))
-    Kern[".reqd_workgroup_size"] = getWorkGroupDimensions(Node);
+    Kern[".reqd_workgroup_size"] = getWorkGroupDimensions(Func, Node);
   if (auto Node = Func.getMetadata("work_group_size_hint"))
-    Kern[".workgroup_size_hint"] = getWorkGroupDimensions(Node);
+    Kern[".workgroup_size_hint"] = getWorkGroupDimensions(Func, Node);
   if (auto Node = Func.getMetadata("vec_type_hint")) {
     Kern[".vec_type_hint"] = Kern.getDocument()->getNode(
         getTypeName(
diff --git a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
index 18a7b5d7a9633e8..3214f096f27b9a9 100644
--- a/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
+++ b/llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.h
@@ -81,7 +81,8 @@ class MetadataStreamerMsgPackV4 : public MetadataStreamer {
 
   std::string getTypeName(Type *Ty, bool Signed) const;
 
-  msgpack::ArrayDocNode getWorkGroupDimensions(MDNode *Node) const;
+  msgpack::ArrayDocNode getWorkGroupDimensions(const Function &Func,
+                                               MDNode *Node) const;
 
   msgpack::MapDocNode getHSAKernelProps(const MachineFunction &MF,
                                         const SIProgramInfo &ProgramInfo,
diff --git a/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll b/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll
new file mode 100644
index 000000000000000..1999a55ff31ee5a
--- /dev/null
+++ b/llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll
@@ -0,0 +1,42 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs < %s | FileCheck %s
+
+; Make sure that SYCL kernels with less than 3 dimensions specified in required
+; work group size, have those dimensions padded up with 1.
+
+; CHECK-LABEL: .name:           sycl_kernel_1dim
+; CHECK:    .reqd_workgroup_size:
+; CHECK-NEXT:      - 3
+; CHECK-NEXT:      - 1
+; CHECK-NEXT:      - 1
+define weak_odr protected amdgpu_kernel void @sycl_kernel_1dim() #1 !reqd_work_group_size !0 {
+entry:
+  ret void
+}
+
+; CHECK-LABEL: .name:           sycl_kernel_2dim
+; CHECK:    .reqd_workgroup_size:
+; CHECK-NEXT:      - 5
+; CHECK-NEXT:      - 7
+; CHECK-NEXT:      - 1
+define weak_odr protected amdgpu_kernel void @sycl_kernel_2dim() #1 !reqd_work_group_size !1 {
+entry:
+  ret void
+}
+
+; CHECK-LABEL: .name:           sycl_kernel_3dim
+; CHECK:    .reqd_workgroup_size:
+; CHECK-NEXT:      - 11 
+; CHECK-NEXT:      - 13
+; CHECK-NEXT:      - 17
+define weak_odr protected amdgpu_kernel void @sycl_kernel_3dim() #1 !reqd_work_group_size !2 {
+entry:
+  ret void
+}
+
+attributes #0 = { nounwind speculatable memory(none) }
+attributes #1 = { "sycl-module-id"="reqd_work_group_size_check_exception.cpp" }
+
+
+!0 = !{i32 3}
+!1 = !{i32 5, i32 7}
+!2 = !{i32 11, i32 13, i32 17}

jchlanda · 2023-11-27T11:15:04Z

Friendly ping @arsenm

arsenm

I still think we need to have one single spec for the metadata and not modify based on any is-this-language type annotations. The different producers and consumers should just agree on the operand constraints of the metadata

arsenm · 2023-11-28T11:31:58Z

llvm/lib/Target/AMDGPU/AMDGPUHSAMetadataStreamer.cpp

  auto Dims = HSAMetadataDoc->getArrayNode();
-  if (Node->getNumOperands() != 3)
+  if (Node->getNumOperands() != 3 && !Func.hasFnAttribute("sycl-module-id"))


The attribute check should not be necessary. The metadata should have a standalone interpretation and not depend on any other attributes

arsenm · 2023-11-28T11:32:27Z

llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll

+; CHECK-NEXT:      - 3
+; CHECK-NEXT:      - 1
+; CHECK-NEXT:      - 1
+define weak_odr protected amdgpu_kernel void @sycl_kernel_1dim() #1 !reqd_work_group_size !0 {


Don't need the weak_odrs

arsenm · 2023-11-28T11:32:54Z

llvm/test/CodeGen/AMDGPU/required_work_group_size_sycl.ll

@@ -0,0 +1,42 @@
+; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs < %s | FileCheck %s


Don't bother with -verify-machineinstrs, it's unlikely to catch anything here and will be caught in EXPENSIVE_CHECKS builds

jchlanda · 2024-05-21T07:27:26Z

Closing this PR, we've decide to handle it internally in SYCL (intel/llvm#13600). It aligns with OpenCL's and always pads missing dimensions with 1.

jchlanda force-pushed the jakub/amd_rqwgs_sycl_streamer_padd branch from 1644043 to c2c9e5e Compare November 20, 2023 14:08

jchlanda requested a review from arsenm November 21, 2023 13:14

jchlanda marked this pull request as ready for review November 21, 2023 13:16

llvmbot added the backend:AMDGPU label Nov 21, 2023

jchlanda mentioned this pull request Nov 21, 2023

[AMDGPU] Defaults for missing dimensions in SYCL required wg size #68872

Closed

arsenm reviewed Nov 28, 2023

View reviewed changes

PR test fixes

4cfe82c

jchlanda mentioned this pull request May 6, 2024

[SYCL] Add work_group_num_dim metadata intel/llvm#13600

Merged

jchlanda closed this May 21, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[AMDGPU] Defaults for missing dimensions in SYCL required wg size #72652

[AMDGPU] Defaults for missing dimensions in SYCL required wg size #72652

jchlanda commented Nov 17, 2023 •

edited

llvmbot commented Nov 21, 2023

jchlanda commented Nov 27, 2023

arsenm left a comment

arsenm Nov 28, 2023

arsenm Nov 28, 2023

arsenm Nov 28, 2023

jchlanda commented May 21, 2024

		@@ -0,0 +1,42 @@
		; RUN: llc -mtriple=amdgcn-amd-amdhsa -mcpu=gfx90a -verify-machineinstrs < %s \| FileCheck %s

[AMDGPU] Defaults for missing dimensions in SYCL required wg size #72652

[AMDGPU] Defaults for missing dimensions in SYCL required wg size #72652

Conversation

jchlanda commented Nov 17, 2023 • edited

llvmbot commented Nov 21, 2023

jchlanda commented Nov 27, 2023

arsenm left a comment

Choose a reason for hiding this comment

arsenm Nov 28, 2023

Choose a reason for hiding this comment

arsenm Nov 28, 2023

Choose a reason for hiding this comment

arsenm Nov 28, 2023

Choose a reason for hiding this comment

jchlanda commented May 21, 2024

jchlanda commented Nov 17, 2023 •

edited