[MLIR] Enable GPU Dialect to SYCL runtime integration #71430

silee2 · 2023-11-06T19:06:00Z

GPU Dialect lowering to SYCL runtime is driven by spirv.target_env attached to gpu.module. As a result of this, spirv.target_env remains as an input to LLVMIR Translation.
A SPIRVToLLVMIRTranslation without any actual translation is added to avoid an unregistered error in mlir-cpu-runner.
SelectObjectAttr.cpp is updated to

Pass binary size argument to getModuleLoadFn
Pass parameter count to getKernelLaunchFn
This change does not impact CUDA and ROCM usage since both mlir_cuda_runtime and mlir_rocm_runtime are already updated to accept and ignore the extra arguments.

GPU Dialect lowering to SYCL runtime is driven by spirv.target_env attached to gpu.module. As a result of this, spirv.target_env remains as an input to LLVMIR Translation. A SPIRVToLLVMIRTranslation without any actual translation is added to avoid an unregistered error in mlir-cpu-runner. SelectObjectAttr.cpp is updated to 1) Pass binary size argument to getModuleLoadFn 2) Pass parameter count to getKernelLaunchFn This change does not impact CUDA and ROCM usage since both mlir_cuda_runtime and mlir_rocm_runtime are already updated to accept and ignore the extra arguments.

llvmbot · 2023-11-06T19:06:32Z

@llvm/pr-subscribers-mlir-spirv
@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir-llvm

Author: Sang Ik Lee (silee2)

Changes

GPU Dialect lowering to SYCL runtime is driven by spirv.target_env attached to gpu.module. As a result of this, spirv.target_env remains as an input to LLVMIR Translation.
A SPIRVToLLVMIRTranslation without any actual translation is added to avoid an unregistered error in mlir-cpu-runner.
SelectObjectAttr.cpp is updated to

Pass binary size argument to getModuleLoadFn
Pass parameter count to getKernelLaunchFn
This change does not impact CUDA and ROCM usage since both mlir_cuda_runtime and mlir_rocm_runtime are already updated to accept and ignore the extra arguments.

Full diff: https://github.com/llvm/llvm-project/pull/71430.diff

13 Files Affected:

(modified) mlir/include/mlir/Target/LLVMIR/Dialect/All.h (+3)
(added) mlir/include/mlir/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.h (+31)
(modified) mlir/lib/Target/LLVMIR/CMakeLists.txt (+1)
(modified) mlir/lib/Target/LLVMIR/Dialect/CMakeLists.txt (+1)
(modified) mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp (+39-11)
(added) mlir/lib/Target/LLVMIR/Dialect/SPIRV/CMakeLists.txt (+13)
(added) mlir/lib/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.cpp (+31)
(modified) mlir/test/CMakeLists.txt (+4)
(added) mlir/test/Integration/GPU/SYCL/gpu-to-spirv.mlir (+54)
(added) mlir/test/Integration/GPU/SYCL/lit.local.cfg (+2)
(modified) mlir/test/Target/LLVMIR/gpu.mlir (+5-4)
(modified) mlir/test/lit.cfg.py (+3)
(modified) mlir/test/lit.site.cfg.py.in (+1)

diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/All.h b/mlir/include/mlir/Target/LLVMIR/Dialect/All.h
index 0563b9bf3d475a4..5dfc15afb75931a 100644
--- a/mlir/include/mlir/Target/LLVMIR/Dialect/All.h
+++ b/mlir/include/mlir/Target/LLVMIR/Dialect/All.h
@@ -26,6 +26,7 @@
 #include "mlir/Target/LLVMIR/Dialect/OpenACC/OpenACCToLLVMIRTranslation.h"
 #include "mlir/Target/LLVMIR/Dialect/OpenMP/OpenMPToLLVMIRTranslation.h"
 #include "mlir/Target/LLVMIR/Dialect/ROCDL/ROCDLToLLVMIRTranslation.h"
+#include "mlir/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.h"
 #include "mlir/Target/LLVMIR/Dialect/X86Vector/X86VectorToLLVMIRTranslation.h"
 
 namespace mlir {
@@ -45,6 +46,7 @@ static inline void registerAllToLLVMIRTranslations(DialectRegistry &registry) {
   registerOpenACCDialectTranslation(registry);
   registerOpenMPDialectTranslation(registry);
   registerROCDLDialectTranslation(registry);
+  registerSPIRVDialectTranslation(registry);
   registerX86VectorDialectTranslation(registry);
 
   // Extension required for translating GPU offloading Ops.
@@ -61,6 +63,7 @@ registerAllGPUToLLVMIRTranslations(DialectRegistry &registry) {
   registerLLVMDialectTranslation(registry);
   registerNVVMDialectTranslation(registry);
   registerROCDLDialectTranslation(registry);
+  registerSPIRVDialectTranslation(registry);
 
   // Extension required for translating GPU offloading Ops.
   gpu::registerOffloadingLLVMTranslationInterfaceExternalModels(registry);
diff --git a/mlir/include/mlir/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.h b/mlir/include/mlir/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.h
new file mode 100644
index 000000000000000..e9580a10b4ca780
--- /dev/null
+++ b/mlir/include/mlir/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.h
@@ -0,0 +1,31 @@
+//===- SPIRVToLLVMIRTranslation.h - SPIRV to LLVM IR ------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This provides registration calls for SPIRV dialect to LLVM IR translation.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef MLIR_TARGET_LLVMIR_DIALECT_SPIRV_SPIRVTOLLVMIRTRANSLATION_H
+#define MLIR_TARGET_LLVMIR_DIALECT_SPIRV_SPIRVTOLLVMIRTRANSLATION_H
+
+namespace mlir {
+
+class DialectRegistry;
+class MLIRContext;
+
+/// Register the SPIRV dialect and the translation from it to the LLVM IR in the
+/// given registry;
+void registerSPIRVDialectTranslation(DialectRegistry &registry);
+
+/// Register the SPIRV dialect and the translation from it in the registry
+/// associated with the given context.
+void registerSPIRVDialectTranslation(MLIRContext &context);
+
+} // namespace mlir
+
+#endif // MLIR_TARGET_LLVMIR_DIALECT_SPIRV_SPIRVTOLLVMIRTRANSLATION_H
diff --git a/mlir/lib/Target/LLVMIR/CMakeLists.txt b/mlir/lib/Target/LLVMIR/CMakeLists.txt
index 5db0885d70d6e7a..531c15a8703e948 100644
--- a/mlir/lib/Target/LLVMIR/CMakeLists.txt
+++ b/mlir/lib/Target/LLVMIR/CMakeLists.txt
@@ -58,6 +58,7 @@ add_mlir_translation_library(MLIRToLLVMIRTranslationRegistration
   MLIROpenACCToLLVMIRTranslation
   MLIROpenMPToLLVMIRTranslation
   MLIRROCDLToLLVMIRTranslation
+  MLIRSPIRVToLLVMIRTranslation
   )
 
 add_mlir_translation_library(MLIRTargetLLVMIRImport
diff --git a/mlir/lib/Target/LLVMIR/Dialect/CMakeLists.txt b/mlir/lib/Target/LLVMIR/Dialect/CMakeLists.txt
index fb0e5cd0649f636..c9d916d8a5d82d1 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/CMakeLists.txt
+++ b/mlir/lib/Target/LLVMIR/Dialect/CMakeLists.txt
@@ -9,4 +9,5 @@ add_subdirectory(NVVM)
 add_subdirectory(OpenACC)
 add_subdirectory(OpenMP)
 add_subdirectory(ROCDL)
+add_subdirectory(SPIRV)
 add_subdirectory(X86Vector)
diff --git a/mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp b/mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp
index 47fe6973778cd7f..6ea0dac89a42c18 100644
--- a/mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp
+++ b/mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp
@@ -52,6 +52,10 @@ class SelectObjectAttrImpl
 std::string getBinaryIdentifier(StringRef binaryName) {
   return binaryName.str() + "_bin_cst";
 }
+// Returns an identifier for the global int64 holding the binary size.
+std::string getBinarySizeIdentifier(StringRef binaryName) {
+  return binaryName.str() + "_bin_size_cst";
+}
 } // namespace
 
 void mlir::gpu::registerOffloadingLLVMTranslationInterfaceExternalModels(
@@ -124,6 +128,17 @@ LogicalResult SelectObjectAttrImpl::embedBinary(
   serializedObj->setLinkage(llvm::GlobalValue::LinkageTypes::InternalLinkage);
   serializedObj->setAlignment(llvm::MaybeAlign(8));
   serializedObj->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::None);
+
+  // Embed the object size as a global constant.
+  llvm::Constant *binarySize =
+      llvm::ConstantInt::get(builder.getInt64Ty(), object.getObject().size());
+  llvm::GlobalVariable *serializedSize = new llvm::GlobalVariable(
+      *module, binarySize->getType(), true,
+      llvm::GlobalValue::LinkageTypes::InternalLinkage, binarySize,
+      getBinarySizeIdentifier(op.getName()));
+  serializedSize->setLinkage(llvm::GlobalValue::LinkageTypes::InternalLinkage);
+  serializedSize->setAlignment(llvm::MaybeAlign(8));
+  serializedSize->setUnnamedAddr(llvm::GlobalValue::UnnamedAddr::None);
   return success();
 }
 
@@ -172,6 +187,7 @@ class LaunchKernel {
   IRBuilderBase &builder;
   mlir::LLVM::ModuleTranslation &moduleTranslation;
   Type *i32Ty{};
+  Type *i64Ty{};
   Type *voidTy{};
   Type *intPtrTy{};
   PointerType *ptrTy{};
@@ -213,6 +229,7 @@ llvm::LaunchKernel::LaunchKernel(
     mlir::LLVM::ModuleTranslation &moduleTranslation)
     : module(module), builder(builder), moduleTranslation(moduleTranslation) {
   i32Ty = builder.getInt32Ty();
+  i64Ty = builder.getInt64Ty();
   ptrTy = builder.getPtrTy(0);
   voidTy = builder.getVoidTy();
   intPtrTy = builder.getIntPtrTy(module.getDataLayout());
@@ -221,11 +238,11 @@ llvm::LaunchKernel::LaunchKernel(
 llvm::FunctionCallee llvm::LaunchKernel::getKernelLaunchFn() {
   return module.getOrInsertFunction(
       "mgpuLaunchKernel",
-      FunctionType::get(
-          voidTy,
-          ArrayRef<Type *>({ptrTy, intPtrTy, intPtrTy, intPtrTy, intPtrTy,
-                            intPtrTy, intPtrTy, i32Ty, ptrTy, ptrTy, ptrTy}),
-          false));
+      FunctionType::get(voidTy,
+                        ArrayRef<Type *>({ptrTy, intPtrTy, intPtrTy, intPtrTy,
+                                          intPtrTy, intPtrTy, intPtrTy, i32Ty,
+                                          ptrTy, ptrTy, ptrTy, i64Ty}),
+                        false));
 }
 
 llvm::FunctionCallee llvm::LaunchKernel::getModuleFunctionFn() {
@@ -237,7 +254,7 @@ llvm::FunctionCallee llvm::LaunchKernel::getModuleFunctionFn() {
 llvm::FunctionCallee llvm::LaunchKernel::getModuleLoadFn() {
   return module.getOrInsertFunction(
       "mgpuModuleLoad",
-      FunctionType::get(ptrTy, ArrayRef<Type *>({ptrTy}), false));
+      FunctionType::get(ptrTy, ArrayRef<Type *>({ptrTy, i64Ty}), false));
 }
 
 llvm::FunctionCallee llvm::LaunchKernel::getModuleLoadJITFn() {
@@ -377,10 +394,21 @@ llvm::LaunchKernel::createKernelLaunch(mlir::gpu::LaunchFuncOp op,
   if (!binary)
     return op.emitError() << "Couldn't find the binary: " << binaryIdentifier;
 
+  llvm::Constant *paramsCount =
+      llvm::ConstantInt::get(i64Ty, op.getNumKernelOperands());
+
+  std::string binarySizeIdentifier = getBinarySizeIdentifier(moduleName);
+  Value *binarySizeVar = module.getGlobalVariable(binarySizeIdentifier, true);
+  if (!binarySizeVar)
+    return op.emitError() << "Couldn't find the binary size: "
+                          << binarySizeIdentifier;
+  Value *binarySize =
+      dyn_cast<llvm::GlobalVariable>(binarySizeVar)->getInitializer();
+
   Value *moduleObject =
       object.getFormat() == gpu::CompilationTarget::Assembly
           ? builder.CreateCall(getModuleLoadJITFn(), {binary, optV})
-          : builder.CreateCall(getModuleLoadFn(), {binary});
+          : builder.CreateCall(getModuleLoadFn(), {binary, binarySize});
 
   // Load the kernel function.
   Value *moduleFunction = builder.CreateCall(
@@ -401,10 +429,10 @@ llvm::LaunchKernel::createKernelLaunch(mlir::gpu::LaunchFuncOp op,
 
   // Create the launch call.
   Value *nullPtr = ConstantPointerNull::get(ptrTy);
-  builder.CreateCall(
-      getKernelLaunchFn(),
-      ArrayRef<Value *>({moduleFunction, gx, gy, gz, bx, by, bz,
-                         dynamicMemorySize, stream, argArray, nullPtr}));
+  builder.CreateCall(getKernelLaunchFn(),
+                     ArrayRef<Value *>({moduleFunction, gx, gy, gz, bx, by, bz,
+                                        dynamicMemorySize, stream, argArray,
+                                        nullPtr, paramsCount}));
 
   // Sync & destroy the stream, for synchronous launches.
   if (handleStream) {
diff --git a/mlir/lib/Target/LLVMIR/Dialect/SPIRV/CMakeLists.txt b/mlir/lib/Target/LLVMIR/Dialect/SPIRV/CMakeLists.txt
new file mode 100644
index 000000000000000..850b95b8ddc77a0
--- /dev/null
+++ b/mlir/lib/Target/LLVMIR/Dialect/SPIRV/CMakeLists.txt
@@ -0,0 +1,13 @@
+add_mlir_translation_library(MLIRSPIRVToLLVMIRTranslation
+  SPIRVToLLVMIRTranslation.cpp
+
+  LINK_COMPONENTS
+  Core
+
+  LINK_LIBS PUBLIC
+  MLIRIR
+  MLIRLLVMDialect
+  MLIRSPIRVDialect
+  MLIRSupport
+  MLIRTargetLLVMIRExport
+  )
diff --git a/mlir/lib/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.cpp b/mlir/lib/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.cpp
new file mode 100644
index 000000000000000..06038a17f2ef666
--- /dev/null
+++ b/mlir/lib/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.cpp
@@ -0,0 +1,31 @@
+//===- SPIRVToLLVMIRTranslation.cpp - Translate SPIRV to LLVM IR ----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements a translation between the MLIR SPIRV dialect and
+// LLVM IR.
+//
+//===----------------------------------------------------------------------===//
+
+#include "mlir/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.h"
+#include "mlir/Dialect/SPIRV/IR/SPIRVDialect.h"
+#include "mlir/IR/BuiltinAttributes.h"
+#include "mlir/IR/Operation.h"
+#include "mlir/Target/LLVMIR/ModuleTranslation.h"
+
+using namespace mlir;
+using namespace mlir::LLVM;
+
+void mlir::registerSPIRVDialectTranslation(DialectRegistry &registry) {
+  registry.insert<spirv::SPIRVDialect>();
+}
+
+void mlir::registerSPIRVDialectTranslation(MLIRContext &context) {
+  DialectRegistry registry;
+  registerSPIRVDialectTranslation(registry);
+  context.appendDialectRegistry(registry);
+}
diff --git a/mlir/test/CMakeLists.txt b/mlir/test/CMakeLists.txt
index d81f3c4b1e20c5a..c26826d1b2c62fe 100644
--- a/mlir/test/CMakeLists.txt
+++ b/mlir/test/CMakeLists.txt
@@ -139,6 +139,10 @@ if(MLIR_ENABLE_ROCM_RUNNER)
   list(APPEND MLIR_TEST_DEPENDS mlir_rocm_runtime)
 endif()
 
+if(MLIR_ENABLE_SYCL_RUNNER)
+  list(APPEND MLIR_TEST_DEPENDS mlir_sycl_runtime)
+endif()
+
 list(APPEND MLIR_TEST_DEPENDS MLIRUnitTests)
 
 if(LLVM_BUILD_EXAMPLES)
diff --git a/mlir/test/Integration/GPU/SYCL/gpu-to-spirv.mlir b/mlir/test/Integration/GPU/SYCL/gpu-to-spirv.mlir
new file mode 100644
index 000000000000000..bc6f3cea080df20
--- /dev/null
+++ b/mlir/test/Integration/GPU/SYCL/gpu-to-spirv.mlir
@@ -0,0 +1,54 @@
+// RUN: mlir-opt %s -pass-pipeline='builtin.module(spirv-attach-target{ver=v1.0 caps=Addresses,Int64,Kernel},convert-gpu-to-spirv{use-64bit-index=true},gpu.module(spirv.module(spirv-lower-abi-attrs,spirv-update-vce)),func.func(llvm-request-c-wrappers),convert-scf-to-cf,convert-cf-to-llvm,convert-arith-to-llvm,convert-math-to-llvm,convert-func-to-llvm,gpu-to-llvm{use-bare-pointers-for-kernels=true},gpu-module-to-binary,expand-strided-metadata,lower-affine,finalize-memref-to-llvm,reconcile-unrealized-casts)' \
+// RUN: | mlir-cpu-runner \
+// RUN:   --shared-libs=%mlir_sycl_runtime \
+// RUN:   --shared-libs=%mlir_runner_utils \
+// RUN:   --entry-point-result=void \
+// RUN: | FileCheck %s
+
+module @add attributes {gpu.container_module} {
+  memref.global "private" constant @__constant_3x3xi64_0 : memref<3x3xi64> = dense<[[1, 4098, 3], [16777220, 5, 4294967302], [7, 1099511627784, 9]]>
+  memref.global "private" constant @__constant_3x3xi64 : memref<3x3xi64> = dense<[[1, 2, 3], [4, 5, 4102], [16777223, 4294967304, 1099511627785]]>
+  func.func @main() {
+    %0 = memref.get_global @__constant_3x3xi64 : memref<3x3xi64>
+    %1 = memref.get_global @__constant_3x3xi64_0 : memref<3x3xi64>
+    %2 = call @test(%0, %1) : (memref<3x3xi64>, memref<3x3xi64>) -> memref<3x3xi64>
+    %cast = memref.cast %2 : memref<3x3xi64> to memref<*xi64>
+    call @printMemrefI64(%cast) : (memref<*xi64>) -> ()
+    return
+  }
+  func.func private @printMemrefI64(memref<*xi64>)
+  func.func @test(%arg0: memref<3x3xi64>, %arg1: memref<3x3xi64>) -> memref<3x3xi64> {
+  %c3 = arith.constant 3 : index
+  %c1 = arith.constant 1 : index
+  %mem = gpu.alloc host_shared () : memref<3x3xi64>
+  memref.copy %arg1, %mem : memref<3x3xi64> to memref<3x3xi64>
+  %memref_0 = gpu.alloc host_shared () : memref<3x3xi64>
+  memref.copy %arg0, %memref_0 : memref<3x3xi64> to memref<3x3xi64>
+  %memref_2 = gpu.alloc host_shared () : memref<3x3xi64>
+  %2 = gpu.wait async
+  %3 = gpu.launch_func async [%2] @test_kernel::@test_kernel blocks in (%c3, %c3, %c1) threads in (%c1, %c1, %c1) args(%memref_0 : memref<3x3xi64>, %mem : memref<3x3xi64>, %memref_2 : memref<3x3xi64>)
+  gpu.wait [%3]
+  %alloc = memref.alloc() : memref<3x3xi64>
+  memref.copy %memref_2, %alloc : memref<3x3xi64> to memref<3x3xi64>
+  %4 = gpu.wait async
+  %5 = gpu.dealloc async [%4] %memref_2 : memref<3x3xi64>
+  %6 = gpu.dealloc async [%5] %memref_0 : memref<3x3xi64>
+  %7 = gpu.dealloc async [%6] %mem : memref<3x3xi64>
+  gpu.wait [%7]
+  return %alloc : memref<3x3xi64>
+  }
+  gpu.module @test_kernel attributes {spirv.target_env = #spirv.target_env<#spirv.vce<v1.0, [Addresses, Int64, Kernel], []>, api=OpenCL, #spirv.resource_limits<>>} {
+    gpu.func @test_kernel(%arg0: memref<3x3xi64>, %arg1: memref<3x3xi64>, %arg2: memref<3x3xi64>) kernel attributes {gpu.known_block_size = array<i32: 1, 1, 1>, gpu.known_grid_size = array<i32: 3, 3, 1>, spirv.entry_point_abi = #spirv.entry_point_abi<>} {
+      %0 = gpu.block_id  x
+      %1 = gpu.block_id  y
+      %2 = memref.load %arg0[%0, %1] : memref<3x3xi64>
+      %3 = memref.load %arg1[%0, %1] : memref<3x3xi64>
+      %4 = arith.addi %2, %3 : i64
+      memref.store %4, %arg2[%0, %1] : memref<3x3xi64>
+      gpu.return
+    }
+  }
+  // CHECK: [2,   4100,   6],
+  // CHECK: [16777224,   10,   4294971404],
+  // CHECK: [16777230,   1103806595088,   1099511627794]
+}
diff --git a/mlir/test/Integration/GPU/SYCL/lit.local.cfg b/mlir/test/Integration/GPU/SYCL/lit.local.cfg
new file mode 100644
index 000000000000000..75bac1882eed5c9
--- /dev/null
+++ b/mlir/test/Integration/GPU/SYCL/lit.local.cfg
@@ -0,0 +1,2 @@
+if not config.enable_sycl_runner:
+    config.unsupported = True
diff --git a/mlir/test/Target/LLVMIR/gpu.mlir b/mlir/test/Target/LLVMIR/gpu.mlir
index fddbbee962c1aee..8a3fc13e0b9af71 100644
--- a/mlir/test/Target/LLVMIR/gpu.mlir
+++ b/mlir/test/Target/LLVMIR/gpu.mlir
@@ -4,6 +4,7 @@
 module attributes {gpu.container_module} {
   // CHECK: [[ARGS_TY:%.*]] = type { i32, i32 }
   // CHECK: @kernel_module_bin_cst = internal constant [4 x i8] c"BLOB", align 8
+  // CHECK:  @kernel_module_bin_size_cst = internal constant i64 4, align 8
   // CHECK: @kernel_module_kernel_kernel_name = private unnamed_addr constant [7 x i8] c"kernel\00", align 1
   gpu.binary @kernel_module  [#gpu.object<#nvvm.target, "BLOB">]
   llvm.func @foo() {
@@ -17,10 +18,10 @@ module attributes {gpu.container_module} {
     // CHECK: store i32 32, ptr [[ARG1]], align 4
     // CHECK: %{{.*}} = getelementptr ptr, ptr [[ARGS_ARRAY]], i32 1
     // CHECK: store ptr [[ARG1]], ptr %{{.*}}, align 8
-    // CHECK: [[MODULE:%.*]] = call ptr @mgpuModuleLoad(ptr @kernel_module_bin_cst)
+    // CHECK: [[MODULE:%.*]] = call ptr @mgpuModuleLoad(ptr @kernel_module_bin_cst, i64 4)
     // CHECK: [[FUNC:%.*]] = call ptr @mgpuModuleGetFunction(ptr [[MODULE]], ptr @kernel_module_kernel_kernel_name)
     // CHECK: [[STREAM:%.*]] = call ptr @mgpuStreamCreate()
-    // CHECK: call void @mgpuLaunchKernel(ptr [[FUNC]], i64 8, i64 8, i64 8, i64 8, i64 8, i64 8, i32 256, ptr [[STREAM]], ptr [[ARGS_ARRAY]], ptr null)
+    // CHECK: call void @mgpuLaunchKernel(ptr [[FUNC]], i64 8, i64 8, i64 8, i64 8, i64 8, i64 8, i32 256, ptr [[STREAM]], ptr [[ARGS_ARRAY]], ptr null, i64 2)
     // CHECK: call void @mgpuStreamSynchronize(ptr [[STREAM]])
     // CHECK: call void @mgpuStreamDestroy(ptr [[STREAM]])
     // CHECK: call void @mgpuModuleUnload(ptr [[MODULE]])
@@ -59,9 +60,9 @@ module attributes {gpu.container_module} {
     // CHECK: = call ptr @mgpuStreamCreate()
     // CHECK-NEXT: = alloca {{.*}}, align 8
     // CHECK-NEXT: [[ARGS:%.*]] = alloca ptr, i64 0, align 8
-    // CHECK-NEXT: [[MODULE:%.*]] = call ptr @mgpuModuleLoad(ptr @kernel_module_bin_cst)
+    // CHECK-NEXT: [[MODULE:%.*]] = call ptr @mgpuModuleLoad(ptr @kernel_module_bin_cst, i64 4)
     // CHECK-NEXT: [[FUNC:%.*]] = call ptr @mgpuModuleGetFunction(ptr [[MODULE]], ptr @kernel_module_kernel_kernel_name)
-    // CHECK-NEXT: call void @mgpuLaunchKernel(ptr [[FUNC]], i64 8, i64 8, i64 8, i64 8, i64 8, i64 8, i32 0, ptr {{.*}}, ptr [[ARGS]], ptr null)
+    // CHECK-NEXT: call void @mgpuLaunchKernel(ptr [[FUNC]], i64 8, i64 8, i64 8, i64 8, i64 8, i64 8, i32 0, ptr {{.*}}, ptr [[ARGS]], ptr null, i64 0)
     // CHECK-NEXT: call void @mgpuModuleUnload(ptr [[MODULE]])
     // CHECK-NEXT: call void @mgpuStreamSynchronize(ptr %{{.*}})
     // CHECK-NEXT: call void @mgpuStreamDestroy(ptr %{{.*}})
diff --git a/mlir/test/lit.cfg.py b/mlir/test/lit.cfg.py
index da8488373862c36..cb8e1ab9d8a4ca8 100644
--- a/mlir/test/lit.cfg.py
+++ b/mlir/test/lit.cfg.py
@@ -126,6 +126,9 @@ def add_runtime(name):
 if config.enable_cuda_runner:
     tools.extend([add_runtime("mlir_cuda_runtime")])
 
+if config.enable_sycl_runner:
+    tools.extend([add_runtime("mlir_sycl_runtime")])
+
 # The following tools are optional
 tools.extend(
     [
diff --git a/mlir/test/lit.site.cfg.py.in b/mlir/test/lit.site.cfg.py.in
index 2de40ba5e8e57e6..c994de0d3d16b7e 100644
--- a/mlir/test/lit.site.cfg.py.in
+++ b/mlir/test/lit.site.cfg.py.in
@@ -31,6 +31,7 @@ config.run_rocm_tests = @MLIR_ENABLE_ROCM_CONVERSIONS@
 config.enable_rocm_runner = @MLIR_ENABLE_ROCM_RUNNER@
 config.gpu_compilation_format = "@MLIR_GPU_COMPILATION_TEST_FORMAT@"
 config.rocm_test_chipset = "@ROCM_TEST_CHIPSET@"
+config.enable_sycl_runner = @MLIR_ENABLE_SYCL_RUNNER@
 config.enable_spirv_cpu_runner = @MLIR_ENABLE_SPIRV_CPU_RUNNER@
 config.enable_vulkan_runner = @MLIR_ENABLE_VULKAN_RUNNER@
 config.enable_bindings_python = @MLIR_ENABLE_BINDINGS_PYTHON@

silee2 · 2023-11-06T19:07:48Z

This is last part of refactoring #65539

silee2 · 2023-11-07T18:31:25Z

@fabianmcg This is remaining part of SPIR-V target for GPU compilation pipeline.

fabianmcg · 2023-11-10T14:58:23Z

mlir/include/mlir/Target/LLVMIR/Dialect/All.h

@@ -61,6 +63,7 @@ registerAllGPUToLLVMIRTranslations(DialectRegistry &registry) {
  registerLLVMDialectTranslation(registry);
  registerNVVMDialectTranslation(registry);
  registerROCDLDialectTranslation(registry);
+  registerSPIRVDialectTranslation(registry);


Adding a full translation library for LLVM from SPIRV is not required, strictly speaking the only required change is adding here a call to registry.insert<spirv::SPIRVDialect>();, so that mlir-translate is able to load the dialect at parsing. @joker-eph thoughts on a library vs inlining the call?

@joker-eph Any thoughts on library vs inlining the call?

Trying to understand...
What is the unit test that requires this?

The call to registry.insert<spirv::SPIRVDialect>(); is needed so that mlir-translate can parse the code containing the SPIR-V target attribute, nothing more; there's no translation happening from SPIR-V to LLVM. If the call is not added, then mlir-translate throws an error because SPIR-V never gets registered.

The question is, should an empty translation to LLVM be added to mirror all other * to LLVM translation code structure, or is inlining the call ok? I definitely prefer the second option -one less CMake target.

Which mlir-translate test requires that? I don't find it by skimming the patch.

@silee2 can you add a test here using the GPU SPIR-V target attribute?

@fabianmcg Added test.
Any other request? The PR has been open for a while and blocking other work that depends on it. Would be great if you can approve soon.

The PR has been open for a while and blocking other work that depends on it. Would be great if you can approve soon.

Well, we were waiting on the test that you just added today!

fabianmcg · 2023-11-10T14:59:20Z

mlir/include/mlir/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.h

+class DialectRegistry;
+class MLIRContext;
+
+/// Register the SPIRV dialect and the translation from it to the LLVM IR in the


Nit: SPIRV -> SPIR-V in all occurrences.

fabianmcg · 2023-11-10T15:01:00Z

mlir/test/Integration/GPU/SYCL/gpu-to-spirv.mlir

@@ -0,0 +1,54 @@
+// RUN: mlir-opt %s -pass-pipeline='builtin.module(spirv-attach-target{ver=v1.0 caps=Addresses,Int64,Kernel},convert-gpu-to-spirv{use-64bit-index=true},gpu.module(spirv.module(spirv-lower-abi-attrs,spirv-update-vce)),func.func(llvm-request-c-wrappers),convert-scf-to-cf,convert-cf-to-llvm,convert-arith-to-llvm,convert-math-to-llvm,convert-func-to-llvm,gpu-to-llvm{use-bare-pointers-for-kernels=true},gpu-module-to-binary,expand-strided-metadata,lower-affine,finalize-memref-to-llvm,reconcile-unrealized-casts)' \


Can we get a couple more integration tests?

Added two more tests.

fabianmcg · 2023-11-10T15:02:03Z

mlir/lib/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.cpp

+//
+//===----------------------------------------------------------------------===//
+//
+// This file implements a translation between the MLIR SPIRV dialect and


SPIRV -> SPIR-V

fabianmcg · 2023-11-10T15:09:22Z

mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp

+      llvm::ConstantInt::get(i64Ty, op.getNumKernelOperands());
+
+  std::string binarySizeIdentifier = getBinarySizeIdentifier(moduleName);
+  Value *binarySizeVar = module.getGlobalVariable(binarySizeIdentifier, true);


Why can't you use the initializer in Value *binary = module.getGlobalVariable(binaryIdentifier, true); to obtain the size of the binary?

fabianmcg · 2023-11-14T00:36:10Z

mlir/test/Target/LLVMIR/gpu.mlir

@@ -4,6 +4,7 @@
 module attributes {gpu.container_module} {
  // CHECK: [[ARGS_TY:%.*]] = type { i32, i32 }
  // CHECK: @kernel_module_bin_cst = internal constant [4 x i8] c"BLOB", align 8
+  // CHECK:  @kernel_module_bin_size_cst = internal constant i64 4, align 8


Please update the tests with the new changes.

fabianmcg · 2023-11-14T00:46:17Z

mlir/lib/Target/LLVMIR/Dialect/GPU/SelectObjectAttr.cpp

+  auto binaryVar = dyn_cast<llvm::GlobalVariable>(binary);
+  llvm::Constant *binaryInit = binaryVar->getInitializer();
+  auto binaryDataSeq = dyn_cast<llvm::ConstantDataSequential>(binaryInit);


Some form of check must be placed on the pointers:

Suggested change

auto binaryVar = dyn_cast<llvm::GlobalVariable>(binary);

llvm::Constant *binaryInit = binaryVar->getInitializer();

auto binaryDataSeq = dyn_cast<llvm::ConstantDataSequential>(binaryInit);

auto binaryVar = dyn_cast<llvm::GlobalVariable>(binary);

assert(binaryVar && "expected a global variable");

llvm::Constant *binaryInit = binaryVar->getInitializer();

auto binaryDataSeq = dyn_cast_or_null<llvm::ConstantDataSequential>(binaryInit);

assert(binaryDataSeq && "expected a valid initializer");

Added pointer checks.

mlir/test/Integration/GPU/SYCL/gpu-addf32-to-spirv.mlir

silee2 · 2023-11-16T18:59:57Z

@antiagainst @kuhar Any comments?

joker-eph · 2023-11-17T05:38:26Z

mlir/lib/Target/LLVMIR/Dialect/SPIRV/SPIRVToLLVMIRTranslation.cpp

+using namespace mlir::LLVM;
+
+void mlir::registerSPIRVDialectTranslation(DialectRegistry &registry) {
+  registry.insert<spirv::SPIRVDialect>();


I'm confused here: there is no translation interface.

There is no translation but dialect registration is still required as spirv.target_env is attached and appears in input.
Other option is to register SPIR-V dialect directly here: mlir/include/mlir/Target/LLVMIR/Dialect/All.h
@joker-eph Any thoughts? Or better option?

I don't quite understand why the API is called "registerSPIRVDialectTranslation" when it itself registers a dialect and no translation actually...

I'm sorry I missed these. I had the same concern, the alternative was adding an inline registration call. However, I then realized that adding #include "mlir/Dialect/SPIRV/IR/SPIRVDialect.h" to mlir/Target/LLVMIR/Dialect/All.h was needed , which seemed even more undesirable thus I changed my mind and agreed on the current scheme.
One way to clean it, is removing inline function from mlir/Target/LLVMIR/Dialect/All.h and creating a MLIRToLLVMIR library.

antiagainst

LGTM regarding SPIR-V. But please address comments from others. :)

fabianmcg

LGTM! I'll squash and merge as soon it passes all pre check tests.

llvmbot added mlir:llvm mlir:gpu mlir:spirv mlir labels Nov 6, 2023

fabianmcg self-requested a review November 7, 2023 18:41

fabianmcg reviewed Nov 10, 2023

View reviewed changes

fabianmcg requested review from antiagainst and joker-eph November 10, 2023 15:10

kuhar self-requested a review November 10, 2023 17:42

silee2 added 4 commits November 13, 2023 19:04

Merge branch 'main' into gpu_sycl_integration

47ac95e

Address reviewer comments.

50c621e

Add more integration tests.

b4068d5

Reorder code.

95559ef

fabianmcg reviewed Nov 14, 2023

View reviewed changes

Dinistro reviewed Nov 14, 2023

View reviewed changes

mlir/test/Integration/GPU/SYCL/gpu-addf32-to-spirv.mlir Show resolved Hide resolved

silee2 added 2 commits November 14, 2023 19:28

Address reviewer comments.

2783bc1

Merge branch 'main' into gpu_sycl_integration

e3e14f0

joker-eph reviewed Nov 17, 2023

View reviewed changes

fabianmcg mentioned this pull request Nov 20, 2023

[mlir][gpu] Support Cluster of Thread Blocks in gpu.launch_func #72871

Merged

antiagainst approved these changes Nov 26, 2023

View reviewed changes

silee2 added 4 commits November 27, 2023 21:38

Merge remote-tracking branch 'origin/main' into gpu_sycl_integration

117ed91

Merge remote-tracking branch 'origin/main' into gpu_sycl_integration

59f5fdc

Merge branch 'main' into gpu_sycl_integration

61f2be9

Add test for selecting gpu.object based on spirv.target_env

044132a

fabianmcg self-requested a review December 5, 2023 19:13

fabianmcg requested a review from Dinistro December 5, 2023 19:13

fabianmcg approved these changes Dec 5, 2023

View reviewed changes

fabianmcg merged commit 7fc792c into llvm:main Dec 5, 2023
4 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MLIR] Enable GPU Dialect to SYCL runtime integration #71430

[MLIR] Enable GPU Dialect to SYCL runtime integration #71430

silee2 commented Nov 6, 2023

llvmbot commented Nov 6, 2023 •

edited

Loading

silee2 commented Nov 6, 2023

silee2 commented Nov 7, 2023

fabianmcg Nov 10, 2023 •

edited

Loading

silee2 Nov 15, 2023

joker-eph Nov 17, 2023

fabianmcg Nov 17, 2023 •

edited

Loading

joker-eph Nov 18, 2023 •

edited

Loading

fabianmcg Nov 20, 2023

silee2 Dec 5, 2023

joker-eph Dec 6, 2023

fabianmcg Nov 10, 2023

silee2 Nov 13, 2023

fabianmcg Nov 10, 2023

silee2 Nov 13, 2023

fabianmcg Nov 10, 2023

silee2 Nov 13, 2023

fabianmcg Nov 10, 2023

silee2 Nov 13, 2023

fabianmcg Nov 14, 2023

silee2 Nov 14, 2023

fabianmcg Nov 14, 2023

silee2 Nov 14, 2023

silee2 commented Nov 16, 2023

joker-eph Nov 17, 2023

silee2 Nov 17, 2023

silee2 Nov 28, 2023

joker-eph Dec 6, 2023 •

edited

Loading

fabianmcg Dec 6, 2023

antiagainst left a comment

fabianmcg left a comment

		@@ -0,0 +1,54 @@
		// RUN: mlir-opt %s -pass-pipeline='builtin.module(spirv-attach-target{ver=v1.0 caps=Addresses,Int64,Kernel},convert-gpu-to-spirv{use-64bit-index=true},gpu.module(spirv.module(spirv-lower-abi-attrs,spirv-update-vce)),func.func(llvm-request-c-wrappers),convert-scf-to-cf,convert-cf-to-llvm,convert-arith-to-llvm,convert-math-to-llvm,convert-func-to-llvm,gpu-to-llvm{use-bare-pointers-for-kernels=true},gpu-module-to-binary,expand-strided-metadata,lower-affine,finalize-memref-to-llvm,reconcile-unrealized-casts)' \

[MLIR] Enable GPU Dialect to SYCL runtime integration #71430

[MLIR] Enable GPU Dialect to SYCL runtime integration #71430

Conversation

silee2 commented Nov 6, 2023

llvmbot commented Nov 6, 2023 • edited Loading

silee2 commented Nov 6, 2023

silee2 commented Nov 7, 2023

fabianmcg Nov 10, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

fabianmcg Nov 17, 2023 • edited Loading

Choose a reason for hiding this comment

joker-eph Nov 18, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

silee2 commented Nov 16, 2023

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

joker-eph Dec 6, 2023 • edited Loading

Choose a reason for hiding this comment

Choose a reason for hiding this comment

antiagainst left a comment

Choose a reason for hiding this comment

fabianmcg left a comment

Choose a reason for hiding this comment

llvmbot commented Nov 6, 2023 •

edited

Loading

fabianmcg Nov 10, 2023 •

edited

Loading

fabianmcg Nov 17, 2023 •

edited

Loading

joker-eph Nov 18, 2023 •

edited

Loading

joker-eph Dec 6, 2023 •

edited

Loading