[MLIR][GPU-LLVM] Define -convert-gpu-to-llvm-spv pass #90972

Merged: 8 commits into llvm:main on May 31, 2024

Conversation

victor-eds (Contributor)

Define a pass for converting GPU operations to LLVM operations suitable for ingestion by SPIR-V backend tools (a sketch of the lowering follows below).

Supported operations:

- `gpu.block_id`
- `gpu.global_id`
- `gpu.block_dim`
- `gpu.thread_id`
- `gpu.grid_dim`
- `gpu.barrier`
- `gpu.shuffle`

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
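
For illustration, a minimal sketch of the lowering, assembled from the doc comments in the patch below. The `gpu.module` wrapper and function are illustrative; the mangled builtin names and call shapes are the ones the patch documents:

```mlir
gpu.module @kernels {
  func.func @example() {
    // Lowers to a call to the OpenCL work-item builtin, e.g. with 64-bit
    // index width:
    //   %c1 = llvm.mlir.constant(1 : i32) : i32
    //   %0 = llvm.call spir_funccc @_Z12get_local_idj(%c1) : (i32) -> i64
    %thread_id_y = gpu.thread_id y
    // Lowers to a barrier call passing CLK_LOCAL_MEM_FENCE (1), i.e.
    // work-group memory scope:
    //   %c1 = llvm.mlir.constant(1 : i32) : i32
    //   llvm.call spir_funccc @_Z7barrierj(%c1) : (i32) -> ()
    gpu.barrier
    return
  }
}
```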
@victor-eds self-assigned this May 3, 2024
@victor-eds added the mlir label May 3, 2024
@victor-eds requested a review from kuhar May 3, 2024 16:17
@llvmbot (Collaborator) commented May 4, 2024

@llvm/pr-subscribers-mlir-gpu

@llvm/pr-subscribers-mlir

Author: Victor Perez (victor-eds)

Changes

Define a pass for converting GPU operations to LLVM operations suitable for ingestion by SPIR-V backend tools.

Supported operations:

- `gpu.block_id`
- `gpu.global_id`
- `gpu.block_dim`
- `gpu.thread_id`
- `gpu.grid_dim`
- `gpu.barrier`
- `gpu.shuffle`

Patch is 28.18 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/90972.diff

7 Files Affected:

- (added) `mlir/include/mlir/Conversion/GPUToLLVMSPV/GPUToLLVMSPVPass.h` (+27)
- (modified) `mlir/include/mlir/Conversion/Passes.h` (+1)
- (modified) `mlir/include/mlir/Conversion/Passes.td` (+18)
- (modified) `mlir/lib/Conversion/CMakeLists.txt` (+1)
- (added) `mlir/lib/Conversion/GPUToLLVMSPV/CMakeLists.txt` (+12)
- (added) `mlir/lib/Conversion/GPUToLLVMSPV/GPUToLLVMSPV.cpp` (+327)
- (added) `mlir/test/Conversion/GPUToLLVMSPV/gpu-to-llvm-spv.mlir` (+216)
diff --git a/mlir/include/mlir/Conversion/GPUToLLVMSPV/GPUToLLVMSPVPass.h b/mlir/include/mlir/Conversion/GPUToLLVMSPV/GPUToLLVMSPVPass.h
new file mode 100644
index 00000000000000..e156c3093e21be
--- /dev/null
+++ b/mlir/include/mlir/Conversion/GPUToLLVMSPV/GPUToLLVMSPVPass.h
@@ -0,0 +1,27 @@
+//===- GPUToLLVMSPVPass.h - Convert GPU kernel to LLVM operations *- C++ -*-==//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef MLIR_CONVERSION_GPUTOLLVMSPV_GPUTOLLVMSPVPASS_H_
+#define MLIR_CONVERSION_GPUTOLLVMSPV_GPUTOLLVMSPVPASS_H_
+
+#include <memory>
+
+namespace mlir {
+class DialectRegistry;
+class LLVMTypeConverter;
+class RewritePatternSet;
+class Pass;
+
+#define GEN_PASS_DECL_CONVERTGPUOPSTOLLVMSPVOPS
+#include "mlir/Conversion/Passes.h.inc"
+
+void populateGpuToLLVMSPVConversionPatterns(LLVMTypeConverter &converter,
+                                            RewritePatternSet &patterns);
+} // namespace mlir
+
+#endif // MLIR_CONVERSION_GPUTOLLVMSPV_GPUTOLLVMSPVPASS_H_
diff --git a/mlir/include/mlir/Conversion/Passes.h b/mlir/include/mlir/Conversion/Passes.h
index 2179ae18ac074b..7700299b3a4f32 100644
--- a/mlir/include/mlir/Conversion/Passes.h
+++ b/mlir/include/mlir/Conversion/Passes.h
@@ -34,6 +34,7 @@
 #include "mlir/Conversion/FuncToLLVM/ConvertFuncToLLVMPass.h"
 #include "mlir/Conversion/FuncToSPIRV/FuncToSPIRVPass.h"
 #include "mlir/Conversion/GPUCommon/GPUCommonPass.h"
+#include "mlir/Conversion/GPUToLLVMSPV/GPUToLLVMSPVPass.h"
 #include "mlir/Conversion/GPUToNVVM/GPUToNVVMPass.h"
 #include "mlir/Conversion/GPUToROCDL/GPUToROCDLPass.h"
 #include "mlir/Conversion/GPUToSPIRV/GPUToSPIRVPass.h"
diff --git a/mlir/include/mlir/Conversion/Passes.td b/mlir/include/mlir/Conversion/Passes.td
index d094ee3b36ab95..6d1d942e5412b7 100644
--- a/mlir/include/mlir/Conversion/Passes.td
+++ b/mlir/include/mlir/Conversion/Passes.td
@@ -508,6 +508,24 @@ def LowerHostCodeToLLVMPass : Pass<"lower-host-to-llvm", "ModuleOp"> {
   let dependentDialects = ["LLVM::LLVMDialect"];
 }
 
+//===----------------------------------------------------------------------===//
+// GPUToLLVMSPV
+//===----------------------------------------------------------------------===//
+
+def ConvertGpuOpsToLLVMSPVOps : Pass<"convert-gpu-to-llvm-spv", "gpu::GPUModuleOp"> {
+  let summary =
+    "Generate LLVM operations to be ingested by a SPIR-V backend for gpu operations";
+  let dependentDialects = [
+    "LLVM::LLVMDialect",
+    "spirv::SPIRVDialect",
+  ];
+  let options = [
+    Option<"indexBitwidth", "index-bitwidth", "unsigned",
+           /*default=kDeriveIndexBitwidthFromDataLayout*/"0",
+           "Bitwidth of the index type, 0 to use size of machine word">,
+  ];
+}
+
 //===----------------------------------------------------------------------===//
 // GPUToNVVM
 //===----------------------------------------------------------------------===//
diff --git a/mlir/lib/Conversion/CMakeLists.txt b/mlir/lib/Conversion/CMakeLists.txt
index 41ab7046b91ce3..0a03a2e133db18 100644
--- a/mlir/lib/Conversion/CMakeLists.txt
+++ b/mlir/lib/Conversion/CMakeLists.txt
@@ -23,6 +23,7 @@ add_subdirectory(FuncToEmitC)
 add_subdirectory(FuncToLLVM)
 add_subdirectory(FuncToSPIRV)
 add_subdirectory(GPUCommon)
+add_subdirectory(GPUToLLVMSPV)
 add_subdirectory(GPUToNVVM)
 add_subdirectory(GPUToROCDL)
 add_subdirectory(GPUToSPIRV)
diff --git a/mlir/lib/Conversion/GPUToLLVMSPV/CMakeLists.txt b/mlir/lib/Conversion/GPUToLLVMSPV/CMakeLists.txt
new file mode 100644
index 00000000000000..da5650b2b68dde
--- /dev/null
+++ b/mlir/lib/Conversion/GPUToLLVMSPV/CMakeLists.txt
@@ -0,0 +1,12 @@
+add_mlir_conversion_library(MLIRGPUToLLVMSPV
+  GPUToLLVMSPV.cpp
+
+  DEPENDS
+  MLIRConversionPassIncGen
+
+  LINK_LIBS PUBLIC
+  MLIRGPUDialect
+  MLIRLLVMCommonConversion
+  MLIRLLVMDialect
+  MLIRSPIRVDialect
+)
diff --git a/mlir/lib/Conversion/GPUToLLVMSPV/GPUToLLVMSPV.cpp b/mlir/lib/Conversion/GPUToLLVMSPV/GPUToLLVMSPV.cpp
new file mode 100644
index 00000000000000..91dad6ff713522
--- /dev/null
+++ b/mlir/lib/Conversion/GPUToLLVMSPV/GPUToLLVMSPV.cpp
@@ -0,0 +1,327 @@
+//===- GPUToLLVMSPV.cpp - Convert GPU operations to LLVM dialect ----------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "mlir/Conversion/GPUToLLVMSPV/GPUToLLVMSPVPass.h"
+
+#include "mlir/Conversion/LLVMCommon/ConversionTarget.h"
+#include "mlir/Conversion/LLVMCommon/LoweringOptions.h"
+#include "mlir/Conversion/LLVMCommon/Pattern.h"
+#include "mlir/Conversion/LLVMCommon/TypeConverter.h"
+#include "mlir/Dialect/GPU/IR/GPUDialect.h"
+#include "mlir/Dialect/LLVMIR/LLVMAttrs.h"
+#include "mlir/Dialect/LLVMIR/LLVMDialect.h"
+#include "mlir/Dialect/LLVMIR/LLVMTypes.h"
+#include "mlir/Dialect/SPIRV/IR/SPIRVDialect.h"
+#include "mlir/Dialect/SPIRV/IR/TargetAndABI.h"
+#include "mlir/IR/BuiltinTypes.h"
+#include "mlir/IR/Matchers.h"
+#include "mlir/IR/PatternMatch.h"
+#include "mlir/IR/SymbolTable.h"
+#include "mlir/Pass/Pass.h"
+#include "mlir/Support/LLVM.h"
+#include "mlir/Transforms/DialectConversion.h"
+
+#include "llvm/ADT/TypeSwitch.h"
+#include "llvm/Support/FormatVariadic.h"
+
+namespace mlir {
+#define GEN_PASS_DEF_CONVERTGPUOPSTOLLVMSPVOPS
+#include "mlir/Conversion/Passes.h.inc"
+} // namespace mlir
+
+namespace {
+using namespace mlir;
+
+//===----------------------------------------------------------------------===//
+// Helper Functions
+//===----------------------------------------------------------------------===//
+
+LLVM::LLVMFuncOp lookupOrCreateSPIRVFn(Operation *symbolTable, StringRef name,
+                                       ArrayRef<Type> paramTypes,
+                                       Type resultType) {
+  auto func = dyn_cast_or_null<LLVM::LLVMFuncOp>(
+      SymbolTable::lookupSymbolIn(symbolTable, name));
+  if (!func) {
+    OpBuilder b(symbolTable->getRegion(0));
+    func = b.create<LLVM::LLVMFuncOp>(
+        symbolTable->getLoc(), name,
+        LLVM::LLVMFunctionType::get(resultType, paramTypes));
+    func.setCConv(LLVM::cconv::CConv::SPIR_FUNC);
+  }
+  return func;
+}
+
+LLVM::CallOp createSPIRVBuiltinCall(Location loc,
+                                    ConversionPatternRewriter &rewriter,
+                                    LLVM::LLVMFuncOp func, ValueRange args) {
+  auto call = rewriter.create<LLVM::CallOp>(loc, func, args);
+  call.setCConv(func.getCConv());
+  return call;
+}
+
+//===----------------------------------------------------------------------===//
+// Barriers
+//===----------------------------------------------------------------------===//
+
+/// Replace `gpu.barrier` with an `llvm.call` to `barrier` with
+/// `CLK_LOCAL_MEM_FENCE` argument, indicating work-group memory scope:
+/// ```
+/// // gpu.barrier
+/// %c1 = llvm.mlir.constant(1: i32) : i32
+/// llvm.call spir_funccc @_Z7barrierj(%c1) : (i32) -> ()
+/// ```
+struct GPUBarrierConversion final : ConvertOpToLLVMPattern<gpu::BarrierOp> {
+  using ConvertOpToLLVMPattern::ConvertOpToLLVMPattern;
+
+  LogicalResult
+  matchAndRewrite(gpu::BarrierOp op, OpAdaptor adaptor,
+                  ConversionPatternRewriter &rewriter) const final {
+    constexpr StringLiteral funcName = "_Z7barrierj";
+
+    Operation *moduleOp = op->getParentWithTrait<OpTrait::SymbolTable>();
+    assert(moduleOp && "Expecting module");
+    Type flagTy = rewriter.getI32Type();
+    Type voidTy = rewriter.getType<LLVM::LLVMVoidType>();
+    LLVM::LLVMFuncOp func =
+        lookupOrCreateSPIRVFn(moduleOp, funcName, flagTy, voidTy);
+
+    // Value used by SPIR-V backend to represent `CLK_LOCAL_MEM_FENCE`.
+    // See `llvm/lib/Target/SPIRV/SPIRVBuiltins.td`.
+    constexpr int64_t localMemFenceFlag = 1;
+    Location loc = op->getLoc();
+    Value flag =
+        rewriter.create<LLVM::ConstantOp>(loc, flagTy, localMemFenceFlag);
+    rewriter.replaceOp(op, createSPIRVBuiltinCall(loc, rewriter, func, flag));
+    return success();
+  }
+};
+
+//===----------------------------------------------------------------------===//
+// SPIR-V Builtins
+//===----------------------------------------------------------------------===//
+
+/// Replace `gpu.*` with an `llvm.call` to the corresponding SPIR-V builtin with
+/// a constant argument for the `dimension` attribute. Return type will depend
+/// on index width option:
+/// ```
+/// // %thread_id_y = gpu.thread_id y
+/// %c1 = llvm.mlir.constant(1: i32) : i32
+/// %0 = llvm.call spir_funccc @_Z12get_local_idj(%c1) : (i32) -> i64
+/// ```
+struct LaunchConfigConversion : ConvertToLLVMPattern {
+  LaunchConfigConversion(StringRef funcName, StringRef rootOpName,
+                         MLIRContext *context,
+                         const LLVMTypeConverter &typeConverter,
+                         PatternBenefit benefit)
+      : ConvertToLLVMPattern(rootOpName, context, typeConverter, benefit),
+        funcName(funcName) {}
+
+  virtual gpu::Dimension getDimension(Operation *op) const = 0;
+
+  LogicalResult
+  matchAndRewrite(Operation *op, ArrayRef<Value> operands,
+                  ConversionPatternRewriter &rewriter) const final {
+    Operation *moduleOp = op->getParentWithTrait<OpTrait::SymbolTable>();
+    assert(moduleOp && "Expecting module");
+    Type dimTy = rewriter.getI32Type();
+    Type indexTy = getTypeConverter()->getIndexType();
+    LLVM::LLVMFuncOp func =
+        lookupOrCreateSPIRVFn(moduleOp, funcName, dimTy, indexTy);
+
+    Location loc = op->getLoc();
+    gpu::Dimension dim = getDimension(op);
+    Value dimVal = rewriter.create<LLVM::ConstantOp>(loc, dimTy,
+                                                     static_cast<int64_t>(dim));
+    rewriter.replaceOp(op, createSPIRVBuiltinCall(loc, rewriter, func, dimVal));
+    return success();
+  }
+
+  StringRef funcName;
+};
+
+template <typename SourceOp>
+struct LaunchConfigOpConversion final : LaunchConfigConversion {
+  static StringRef getFuncName();
+
+  explicit LaunchConfigOpConversion(const LLVMTypeConverter &typeConverter,
+                                    PatternBenefit benefit = 1)
+      : LaunchConfigConversion(getFuncName(), SourceOp::getOperationName(),
+                               &typeConverter.getContext(), typeConverter,
+                               benefit) {}
+
+  gpu::Dimension getDimension(Operation *op) const final {
+    return cast<SourceOp>(op).getDimension();
+  }
+};
+
+template <>
+StringRef LaunchConfigOpConversion<gpu::BlockIdOp>::getFuncName() {
+  return "_Z12get_group_idj";
+}
+
+template <>
+StringRef LaunchConfigOpConversion<gpu::GridDimOp>::getFuncName() {
+  return "_Z14get_num_groupsj";
+}
+
+template <>
+StringRef LaunchConfigOpConversion<gpu::BlockDimOp>::getFuncName() {
+  return "_Z14get_local_sizej";
+}
+
+template <>
+StringRef LaunchConfigOpConversion<gpu::ThreadIdOp>::getFuncName() {
+  return "_Z12get_local_idj";
+}
+
+template <>
+StringRef LaunchConfigOpConversion<gpu::GlobalIdOp>::getFuncName() {
+  return "_Z13get_global_idj";
+}
+
+//===----------------------------------------------------------------------===//
+// Shuffles
+//===----------------------------------------------------------------------===//
+
+/// Replace `gpu.shuffle` with an `llvm.call` to the corresponding SPIR-V
+/// builtin for `shuffleResult`, keeping `value` and `offset` arguments, and a
+/// `true` constant for the `valid` result type. Conversion will only take place
+/// if `width` is constant and equal to the `subgroup` pass option:
+/// ```
+/// // %0 = gpu.shuffle idx %value, %offset, %width : f64
+/// %0 = llvm.call spir_funccc @_Z17sub_group_shuffledj(%value, %offset)
+///     : (f64, i32) -> f64
+/// ```
+struct GPUShuffleConversion final : ConvertOpToLLVMPattern<gpu::ShuffleOp> {
+  using ConvertOpToLLVMPattern::ConvertOpToLLVMPattern;
+
+  static StringRef getBaseName(gpu::ShuffleMode mode) {
+    switch (mode) {
+    case gpu::ShuffleMode::IDX:
+      return "sub_group_shuffle";
+    case gpu::ShuffleMode::XOR:
+      return "sub_group_shuffle_xor";
+    case gpu::ShuffleMode::UP:
+      return "sub_group_shuffle_up";
+    case gpu::ShuffleMode::DOWN:
+      return "sub_group_shuffle_down";
+    }
+    llvm_unreachable("Unhandled shuffle mode");
+  }
+
+  static StringRef getTypeMangling(Type type) {
+    return TypeSwitch<Type, StringRef>(type)
+        .Case<Float32Type>([](auto) { return "fj"; })
+        .Case<Float64Type>([](auto) { return "dj"; })
+        .Case<IntegerType>([](auto intTy) {
+          switch (intTy.getWidth()) {
+          case 32:
+            return "ij";
+          case 64:
+            return "lj";
+          }
+          llvm_unreachable("Invalid integer width");
+        });
+  }
+
+  static std::string getFuncName(gpu::ShuffleOp op) {
+    StringRef baseName = getBaseName(op.getMode());
+    StringRef typeMangling = getTypeMangling(op.getType(0));
+    return llvm::formatv("_Z{0}{1}{2}", baseName.size(), baseName,
+                         typeMangling);
+  }
+
+  /// Get the subgroup size from the target or return a default.
+  static int getSubgroupSize(Operation *op) {
+    return spirv::lookupTargetEnvOrDefault(op)
+        .getResourceLimits()
+        .getSubgroupSize();
+  }
+
+  static bool hasValidWidth(gpu::ShuffleOp op) {
+    llvm::APInt val;
+    Value width = op.getWidth();
+    return matchPattern(width, m_ConstantInt(&val)) &&
+           val == getSubgroupSize(op);
+  }
+
+  LogicalResult
+  matchAndRewrite(gpu::ShuffleOp op, OpAdaptor adaptor,
+                  ConversionPatternRewriter &rewriter) const final {
+    if (!hasValidWidth(op))
+      return rewriter.notifyMatchFailure(
+          op, "shuffle width and subgroup size mismatch");
+
+    std::string funcName = getFuncName(op);
+
+    Operation *moduleOp = op->getParentWithTrait<OpTrait::SymbolTable>();
+    assert(moduleOp && "Expecting module");
+    Type valueType = adaptor.getValue().getType();
+    Type offsetType = adaptor.getOffset().getType();
+    Type resultType = valueType;
+    LLVM::LLVMFuncOp func = lookupOrCreateSPIRVFn(
+        moduleOp, funcName, {valueType, offsetType}, resultType);
+
+    Location loc = op->getLoc();
+    std::array<Value, 2> args{adaptor.getValue(), adaptor.getOffset()};
+    Value result =
+        createSPIRVBuiltinCall(loc, rewriter, func, args).getResult();
+    Value trueVal =
+        rewriter.create<LLVM::ConstantOp>(loc, rewriter.getI1Type(), true);
+    rewriter.replaceOp(op, {result, trueVal});
+    return success();
+  }
+};
+
+//===----------------------------------------------------------------------===//
+// GPU To LLVM-SPV Pass.
+//===----------------------------------------------------------------------===//
+
+struct GPUToLLVMSPVConversionPass final
+    : impl::ConvertGpuOpsToLLVMSPVOpsBase<GPUToLLVMSPVConversionPass> {
+  using Base::Base;
+
+  void runOnOperation() final {
+    MLIRContext *context = &getContext();
+    RewritePatternSet patterns(context);
+
+    LowerToLLVMOptions options(context);
+    if (indexBitwidth != kDeriveIndexBitwidthFromDataLayout)
+      options.overrideIndexBitwidth(indexBitwidth);
+
+    LLVMTypeConverter converter(context, options);
+    LLVMConversionTarget target(*context);
+
+    target.addIllegalOp<gpu::BarrierOp, gpu::BlockDimOp, gpu::BlockIdOp,
+                        gpu::GlobalIdOp, gpu::GridDimOp, gpu::ShuffleOp,
+                        gpu::ThreadIdOp>();
+
+    populateGpuToLLVMSPVConversionPatterns(converter, patterns);
+
+    if (failed(applyPartialConversion(getOperation(), target,
+                                      std::move(patterns))))
+      signalPassFailure();
+  }
+};
+} // namespace
+
+//===----------------------------------------------------------------------===//
+// GPU To LLVM-SPV Patterns.
+//===----------------------------------------------------------------------===//
+
+namespace mlir {
+void populateGpuToLLVMSPVConversionPatterns(LLVMTypeConverter &typeConverter,
+                                            RewritePatternSet &patterns) {
+  patterns.add<GPUBarrierConversion, GPUShuffleConversion,
+               LaunchConfigOpConversion<gpu::BlockIdOp>,
+               LaunchConfigOpConversion<gpu::GridDimOp>,
+               LaunchConfigOpConversion<gpu::BlockDimOp>,
+               LaunchConfigOpConversion<gpu::ThreadIdOp>,
+               LaunchConfigOpConversion<gpu::GlobalIdOp>>(typeConverter);
+}
+} // namespace mlir
diff --git a/mlir/test/Conversion/GPUToLLVMSPV/gpu-to-llvm-spv.mlir b/mlir/test/Conversion/GPUToLLVMSPV/gpu-to-llvm-spv.mlir
new file mode 100644
index 00000000000000..689c2efcf00afb
--- /dev/null
+++ b/mlir/test/Conversion/GPUToLLVMSPV/gpu-to-llvm-spv.mlir
@@ -0,0 +1,216 @@
+// RUN: mlir-opt -pass-pipeline="builtin.module(gpu.module(convert-gpu-to-llvm-spv))" -split-input-file -verify-diagnostics %s \
+// RUN: | FileCheck --check-prefixes=CHECK-64,CHECK %s
+// RUN: mlir-opt -pass-pipeline="builtin.module(gpu.module(convert-gpu-to-llvm-spv{index-bitwidth=32}))" -split-input-file -verify-diagnostics %s \
+// RUN: | FileCheck --check-prefixes=CHECK-32,CHECK %s
+
+gpu.module @builtins {
+  // CHECK-64:    llvm.func spir_funccc @_Z14get_num_groupsj(i32) -> i64
+  // CHECK-64:    llvm.func spir_funccc @_Z12get_local_idj(i32) -> i64
+  // CHECK-64:    llvm.func spir_funccc @_Z14get_local_sizej(i32) -> i64
+  // CHECK-64:    llvm.func spir_funccc @_Z13get_global_idj(i32) -> i64
+  // CHECK-64:    llvm.func spir_funccc @_Z12get_group_idj(i32) -> i64
+  // CHECK-32:    llvm.func spir_funccc @_Z14get_num_groupsj(i32) -> i32
+  // CHECK-32:    llvm.func spir_funccc @_Z12get_local_idj(i32) -> i32
+  // CHECK-32:    llvm.func spir_funccc @_Z14get_local_sizej(i32) -> i32
+  // CHECK-32:    llvm.func spir_funccc @_Z13get_global_idj(i32) -> i32
+  // CHECK-32:    llvm.func spir_funccc @_Z12get_group_idj(i32) -> i32
+
+  // CHECK-LABEL: gpu_block_id
+  func.func @gpu_block_id() -> (index, index, index) {
+    // CHECK:         [[C0:%.*]] = llvm.mlir.constant(0 : i32) : i32
+    // CHECK-64:      llvm.call spir_funccc @_Z12get_group_idj([[C0]]) : (i32) -> i64
+    // CHECK-32:      llvm.call spir_funccc @_Z12get_group_idj([[C0]]) : (i32) -> i32
+    %block_id_x = gpu.block_id x
+    // CHECK:         [[C1:%.*]] = llvm.mlir.constant(1 : i32) : i32
+    // CHECK-64:      llvm.call spir_funccc @_Z12get_group_idj([[C1]]) : (i32) -> i64
+    // CHECK-32:      llvm.call spir_funccc @_Z12get_group_idj([[C1]]) : (i32) -> i32
+    %block_id_y = gpu.block_id y
+    // CHECK:         [[C2:%.*]] = llvm.mlir.constant(2 : i32) : i32
+    // CHECK-64:      llvm.call spir_funccc @_Z12get_group_idj([[C2]]) : (i32) -> i64
+    // CHECK-32:      llvm.call spir_funccc @_Z12get_group_idj([[C2]]) : (i32) -> i32
+    %block_id_z = gpu.block_id z
+    return %block_id_x, %block_id_y, %block_id_z : index, index, index
+  }
+
+  // CHECK-LABEL: gpu_global_id
+  func.func @gpu_global_id() -> (index, index, index) {
+    // CHECK:         [[C0:%.*]] = llvm.mlir.constant(0 : i32) : i32
+    // CHECK-64:      llvm.call spir_funccc @_Z13get_global_idj([[C0]]) : (i32) -> i64
+    // CHECK-32:      llvm.call spir_funccc @_Z13get_global_idj([[C0]]) : (i32) -> i32
+    %global_id_x = gpu.global_id x
+    // CHECK:         [[C1:%.*]] = llvm.mlir.constant(1 : i32) : i32
+    // CHECK-64:      llvm.call spir_funccc @_Z13get_global_idj([[C1]]) : (i32) -> i64
+    // CHECK-32:      llvm.call spir_funccc @_Z13get_global_idj([[C1]]) : (i32) -> i32
+    %global_id_y = gpu.global_id y
+    // CHECK:         [[C2:%.*]] = llvm.mlir.constant(2 : i32) : i32
+    // CHECK-64:      llvm.call spir_funccc @_Z13get_global_idj([[C2]]) : (i32) -> i64
+    // CHECK-32:      llvm.call spir_funccc @_Z13get_global_idj([[C2]]) : (i32) -> i32
+    %global_id_z = gpu.global_id z
+    return %global_id_x, %global_id_y, %global_id_z : index, index, index
+  }
+
+  // CHECK-LABEL: gpu_block_dim
+  func.func @gpu_block_dim() -> (index, index, index) {
+    // CHECK:         [[C0:%.*]] = llvm.mlir.constant(0 : i32) : i32
+    // CHECK-64:      llvm.call spir_fun...
[truncated]
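
As a rough usage sketch, the pass can be exercised the way the new test does. The pipeline string in the RUN comment is taken from the test's RUN lines above; the file name, module, and kernel are hypothetical. Per the shuffle pattern's doc comment, `gpu.shuffle` only converts when `width` is a constant equal to the subgroup size (32 under the default SPIR-V target environment):

```mlir
// RUN: mlir-opt -pass-pipeline="builtin.module(gpu.module(convert-gpu-to-llvm-spv))" shuffle.mlir
// (shuffle.mlir is a hypothetical file name.)
gpu.module @shuffles {
  func.func @gpu_shuffle(%value : f64, %offset : i32) -> f64 {
    // The width operand must be a constant matching the subgroup size;
    // otherwise the pattern reports a match failure.
    %width = arith.constant 32 : i32
    // Expected lowering (per the pattern's doc comment):
    //   %0 = llvm.call spir_funccc @_Z17sub_group_shuffledj(%value, %offset)
    //       : (f64, i32) -> f64
    %shfl, %valid = gpu.shuffle idx %value, %offset, %width : f64
    return %shfl : f64
  }
}
```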

@tschuett (Member) commented May 6, 2024

You are still too aggressive with anonymous namespaces; see the coding standards.

@victor-eds requested a review from tschuett May 6, 2024 16:01
@kuhar (Member) left a comment

LGTM modulo the declaration/definition split from the most recent revision; I'd strongly prefer that to be undone.

@victor-eds (Contributor, Author)

@antiagainst @joker-eph any further comments on this?

@victor-eds (Contributor, Author)

I've received no negative comments and all conversations are resolved, so I will proceed to land this by Friday. I still want to leave some time for further reviews, as only @kuhar has approved so far.

@joker-eph (Collaborator)

No specific concerns for me, thanks!

@victor-eds merged commit 98d5d34 into llvm:main May 31, 2024
7 checks passed