[mlir][gpu][bufferization] Implement BufferDeallocationOpInterface for gpu.terminator #66880

maerhart

This is necessary to support buffer deallocation of IR that contains gpu.launch operations, because gpu.launch does not implement the RegionBranchOpInterface. Implementing that interface for gpu.launch would require the interface to support regions with unstructured control flow and produced arguments/results.

llvmbot commented Sep 20, 2023

@llvm/pr-subscribers-mlir-bufferization
@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-scf

@llvm/pr-subscribers-mlir-gpu

Changes


Full diff: https://github.com/llvm/llvm-project/pull/66880.diff

10 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Bufferization/IR/BufferDeallocationOpInterface.h (+13)
  • (added) mlir/include/mlir/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.h (+22)
  • (modified) mlir/include/mlir/InitAllDialects.h (+2)
  • (modified) mlir/lib/Dialect/Bufferization/IR/BufferDeallocationOpInterface.cpp (+41)
  • (modified) mlir/lib/Dialect/Bufferization/Transforms/OwnershipBasedBufferDeallocation.cpp (+6-29)
  • (modified) mlir/lib/Dialect/GPU/CMakeLists.txt (+1)
  • (added) mlir/lib/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.cpp (+37)
  • (modified) mlir/lib/Dialect/SCF/Transforms/BufferDeallocationOpInterfaceImpl.cpp (+3-24)
  • (added) mlir/test/Dialect/GPU/bufferization-buffer-deallocation.mlir (+18)
  • (modified) utils/bazel/llvm-project-overlay/mlir/BUILD.bazel (+1)
diff --git a/mlir/include/mlir/Dialect/Bufferization/IR/BufferDeallocationOpInterface.h b/mlir/include/mlir/Dialect/Bufferization/IR/BufferDeallocationOpInterface.h
index 7ac4592de7875fb..4f72eb0c62f298f 100644
--- a/mlir/include/mlir/Dialect/Bufferization/IR/BufferDeallocationOpInterface.h
+++ b/mlir/include/mlir/Dialect/Bufferization/IR/BufferDeallocationOpInterface.h
@@ -205,6 +205,19 @@ class DeallocationState {
   Liveness liveness;
 };
 
+namespace deallocation_impl {
+/// Insert a `bufferization.dealloc` operation right before 'op' which has to be
+/// a terminator without any successors. Note that it is not required to have
+/// the ReturnLike trait attached. The MemRef values in the 'operands' argument
+/// will be added to the list of retained values and their updated ownership
+/// values will be appended to the 'updatedOperandOwnerships' list. 'op' is not
+/// modified in any way. Returns failure or the original 'op'.
+FailureOr<Operation *>
+insertDeallocOpForReturnLike(DeallocationState &state, Operation *op,
+                             ValueRange operands,
+                             SmallVectorImpl<Value> &updatedOperandOwnerships);
+} // namespace deallocation_impl
+
 } // namespace bufferization
 } // namespace mlir
 
diff --git a/mlir/include/mlir/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.h b/mlir/include/mlir/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.h
new file mode 100644
index 000000000000000..16cf96980de136d
--- /dev/null
+++ b/mlir/include/mlir/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.h
@@ -0,0 +1,22 @@
+//===- BufferDeallocationOpInterfaceImpl.h ----------------------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef MLIR_DIALECT_GPU_TRANSFORMS_BUFFERDEALLOCATIONOPINTERFACEIMPL_H
+#define MLIR_DIALECT_GPU_TRANSFORMS_BUFFERDEALLOCATIONOPINTERFACEIMPL_H
+
+namespace mlir {
+
+class DialectRegistry;
+
+namespace gpu {
+void registerBufferDeallocationOpInterfaceExternalModels(
+    DialectRegistry &registry);
+} // namespace gpu
+} // namespace mlir
+
+#endif // MLIR_DIALECT_GPU_TRANSFORMS_BUFFERDEALLOCATIONOPINTERFACEIMPL_H
diff --git a/mlir/include/mlir/InitAllDialects.h b/mlir/include/mlir/InitAllDialects.h
index 5b2b1ed24d5173d..8a085d91cedffbd 100644
--- a/mlir/include/mlir/InitAllDialects.h
+++ b/mlir/include/mlir/InitAllDialects.h
@@ -36,6 +36,7 @@
 #include "mlir/Dialect/EmitC/IR/EmitC.h"
 #include "mlir/Dialect/Func/IR/FuncOps.h"
 #include "mlir/Dialect/GPU/IR/GPUDialect.h"
+#include "mlir/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.h"
 #include "mlir/Dialect/IRDL/IR/IRDL.h"
 #include "mlir/Dialect/Index/IR/IndexDialect.h"
 #include "mlir/Dialect/LLVMIR/LLVMDialect.h"
@@ -144,6 +145,7 @@ inline void registerAllDialects(DialectRegistry &registry) {
   builtin::registerCastOpInterfaceExternalModels(registry);
   cf::registerBufferizableOpInterfaceExternalModels(registry);
   cf::registerBufferDeallocationOpInterfaceExternalModels(registry);
+  gpu::registerBufferDeallocationOpInterfaceExternalModels(registry);
   linalg::registerBufferizableOpInterfaceExternalModels(registry);
   linalg::registerTilingInterfaceExternalModels(registry);
   linalg::registerValueBoundsOpInterfaceExternalModels(registry);
diff --git a/mlir/lib/Dialect/Bufferization/IR/BufferDeallocationOpInterface.cpp b/mlir/lib/Dialect/Bufferization/IR/BufferDeallocationOpInterface.cpp
index 407d75e2426e9f9..8d21446f1eb777e 100644
--- a/mlir/lib/Dialect/Bufferization/IR/BufferDeallocationOpInterface.cpp
+++ b/mlir/lib/Dialect/Bufferization/IR/BufferDeallocationOpInterface.cpp
@@ -272,3 +272,44 @@ bool ValueComparator::operator()(const Value &lhs, const Value &rhs) const {
   assert(lhsRegion && "this should only happen if lhs == rhs");
   return false;
 }
+
+//===----------------------------------------------------------------------===//
+// Implementation utilities
+//===----------------------------------------------------------------------===//
+
+FailureOr<Operation *> deallocation_impl::insertDeallocOpForReturnLike(
+    DeallocationState &state, Operation *op, ValueRange operands,
+    SmallVectorImpl<Value> &updatedOperandOwnerships) {
+  assert(op->hasTrait<OpTrait::IsTerminator>() && "must be a terminator");
+  assert(!op->hasSuccessors() && "must not have any successors");
+  // Collect the values to deallocate and retain and use them to create the
+  // dealloc operation.
+  OpBuilder builder(op);
+  Block *block = op->getBlock();
+  SmallVector<Value> memrefs, conditions, toRetain;
+  if (failed(state.getMemrefsAndConditionsToDeallocate(
+          builder, op->getLoc(), block, memrefs, conditions)))
+    return failure();
+
+  state.getMemrefsToRetain(block, /*toBlock=*/nullptr, operands, toRetain);
+  if (memrefs.empty() && toRetain.empty())
+    return op;
+
+  auto deallocOp = builder.create<bufferization::DeallocOp>(
+      op->getLoc(), memrefs, conditions, toRetain);
+
+  // We want to replace the current ownership of the retained values with the
+  // result values of the dealloc operation as they are always unique.
+  state.resetOwnerships(deallocOp.getRetained(), block);
+  for (auto [retained, ownership] :
+       llvm::zip(deallocOp.getRetained(), deallocOp.getUpdatedConditions()))
+    state.updateOwnership(retained, ownership, block);
+
+  unsigned numMemrefOperands = llvm::count_if(operands, isMemref);
+  auto newOperandOwnerships =
+      deallocOp.getUpdatedConditions().take_front(numMemrefOperands);
+  updatedOperandOwnerships.append(newOperandOwnerships.begin(),
+                                  newOperandOwnerships.end());
+
+  return op;
+}
diff --git a/mlir/lib/Dialect/Bufferization/Transforms/OwnershipBasedBufferDeallocation.cpp b/mlir/lib/Dialect/Bufferization/Transforms/OwnershipBasedBufferDeallocation.cpp
index 09d30835828084d..94a26f3aff5e019 100644
--- a/mlir/lib/Dialect/Bufferization/Transforms/OwnershipBasedBufferDeallocation.cpp
+++ b/mlir/lib/Dialect/Bufferization/Transforms/OwnershipBasedBufferDeallocation.cpp
@@ -47,10 +47,6 @@ static Value buildBoolValue(OpBuilder &builder, Location loc, bool value) {
 
 static bool isMemref(Value v) { return v.getType().isa<BaseMemRefType>(); }
 
-static bool isMemrefOperand(OpOperand &operand) {
-  return isMemref(operand.get());
-}
-
 //===----------------------------------------------------------------------===//
 // Backedges analysis
 //===----------------------------------------------------------------------===//
@@ -917,35 +913,16 @@ BufferDeallocation::handleInterface(RegionBranchTerminatorOpInterface op) {
   MutableOperandRange operands =
       op.getMutableSuccessorOperands(RegionBranchPoint::parent());
 
-  // Collect the values to deallocate and retain and use them to create the
-  // dealloc operation.
-  Block *block = op->getBlock();
-  SmallVector<Value> memrefs, conditions, toRetain;
-  if (failed(state.getMemrefsAndConditionsToDeallocate(
-          builder, op.getLoc(), block, memrefs, conditions)))
-    return failure();
-
-  state.getMemrefsToRetain(block, nullptr, OperandRange(operands), toRetain);
-  if (memrefs.empty() && toRetain.empty())
-    return op.getOperation();
-
-  auto deallocOp = builder.create<bufferization::DeallocOp>(
-      op.getLoc(), memrefs, conditions, toRetain);
-
-  // We want to replace the current ownership of the retained values with the
-  // result values of the dealloc operation as they are always unique.
-  state.resetOwnerships(deallocOp.getRetained(), block);
-  for (auto [retained, ownership] :
-       llvm::zip(deallocOp.getRetained(), deallocOp.getUpdatedConditions()))
-    state.updateOwnership(retained, ownership, block);
+  SmallVector<Value> updatedOwnerships;
+  auto result = deallocation_impl::insertDeallocOpForReturnLike(
+      state, op, OperandRange(operands), updatedOwnerships);
+  if (failed(result) || !*result)
+    return result;
 
   // Add an additional operand for every MemRef for the ownership indicator.
   if (!funcWithoutDynamicOwnership) {
-    unsigned numMemRefs = llvm::count_if(operands, isMemrefOperand);
     SmallVector<Value> newOperands{OperandRange(operands)};
-    auto ownershipValues =
-        deallocOp.getUpdatedConditions().take_front(numMemRefs);
-    newOperands.append(ownershipValues.begin(), ownershipValues.end());
+    newOperands.append(updatedOwnerships.begin(), updatedOwnerships.end());
     operands.assign(newOperands);
   }
 
diff --git a/mlir/lib/Dialect/GPU/CMakeLists.txt b/mlir/lib/Dialect/GPU/CMakeLists.txt
index 6244132c073a4a6..9158929d383cf54 100644
--- a/mlir/lib/Dialect/GPU/CMakeLists.txt
+++ b/mlir/lib/Dialect/GPU/CMakeLists.txt
@@ -79,6 +79,7 @@ add_mlir_dialect_library(MLIRGPUTransforms
   MLIRAffineUtils
   MLIRArithDialect
   MLIRAsyncDialect
+  MLIRBufferizationDialect
   MLIRBuiltinToLLVMIRTranslation
   MLIRDataLayoutInterfaces
   MLIRExecutionEngineUtils
diff --git a/mlir/lib/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.cpp b/mlir/lib/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.cpp
new file mode 100644
index 000000000000000..6ccc0a26426c147
--- /dev/null
+++ b/mlir/lib/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.cpp
@@ -0,0 +1,37 @@
+//===- BufferDeallocationOpInterfaceImpl.cpp ------------------------------===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+
+#include "mlir/Dialect/GPU/Transforms/BufferDeallocationOpInterfaceImpl.h"
+#include "mlir/Dialect/Bufferization/IR/BufferDeallocationOpInterface.h"
+#include "mlir/Dialect/Bufferization/IR/Bufferization.h"
+#include "mlir/Dialect/GPU/IR/GPUDialect.h"
+
+using namespace mlir;
+using namespace mlir::bufferization;
+
+namespace {
+/// Inserts a `bufferization.dealloc` right before `gpu.terminator` (no operands).
+struct GPUTerminatorOpInterface
+    : public BufferDeallocationOpInterface::ExternalModel<
+          GPUTerminatorOpInterface, gpu::TerminatorOp> {
+  FailureOr<Operation *> process(Operation *op, DeallocationState &state,
+                                 const DeallocationOptions &options) const {
+    SmallVector<Value> updatedOperandOwnerships;
+    return deallocation_impl::insertDeallocOpForReturnLike(
+        state, op, {}, updatedOperandOwnerships);
+  }
+};
+
+} // namespace
+
+void mlir::gpu::registerBufferDeallocationOpInterfaceExternalModels(
+    DialectRegistry &registry) {
+  registry.addExtension(+[](MLIRContext *ctx, GPUDialect *dialect) {
+    gpu::TerminatorOp::attachInterface<GPUTerminatorOpInterface>(*ctx);
+  });
+}
diff --git a/mlir/lib/Dialect/SCF/Transforms/BufferDeallocationOpInterfaceImpl.cpp b/mlir/lib/Dialect/SCF/Transforms/BufferDeallocationOpInterfaceImpl.cpp
index 88cb3e9b097147f..4ded8ba55013dc6 100644
--- a/mlir/lib/Dialect/SCF/Transforms/BufferDeallocationOpInterfaceImpl.cpp
+++ b/mlir/lib/Dialect/SCF/Transforms/BufferDeallocationOpInterfaceImpl.cpp
@@ -47,33 +47,12 @@ struct InParallelOpInterface
   FailureOr<Operation *> process(Operation *op, DeallocationState &state,
                                  const DeallocationOptions &options) const {
     auto inParallelOp = cast<scf::InParallelOp>(op);
-    OpBuilder builder(op);
     if (!inParallelOp.getBody()->empty())
       return op->emitError("only supported when nested region is empty");
 
-    // Collect the values to deallocate and retain and use them to create the
-    // dealloc operation.
-    Block *block = op->getBlock();
-    SmallVector<Value> memrefs, conditions, toRetain;
-    if (failed(state.getMemrefsAndConditionsToDeallocate(
-            builder, op->getLoc(), block, memrefs, conditions)))
-      return failure();
-
-    state.getMemrefsToRetain(block, /*toBlock=*/nullptr, {}, toRetain);
-    if (memrefs.empty() && toRetain.empty())
-      return op;
-
-    auto deallocOp = builder.create<bufferization::DeallocOp>(
-        op->getLoc(), memrefs, conditions, toRetain);
-
-    // We want to replace the current ownership of the retained values with the
-    // result values of the dealloc operation as they are always unique.
-    state.resetOwnerships(deallocOp.getRetained(), block);
-    for (auto [retained, ownership] :
-         llvm::zip(deallocOp.getRetained(), deallocOp.getUpdatedConditions()))
-      state.updateOwnership(retained, ownership, block);
-
-    return op;
+    SmallVector<Value> updatedOperandOwnership;
+    return deallocation_impl::insertDeallocOpForReturnLike(
+        state, op, {}, updatedOperandOwnership);
   }
 };
 
diff --git a/mlir/test/Dialect/GPU/bufferization-buffer-deallocation.mlir b/mlir/test/Dialect/GPU/bufferization-buffer-deallocation.mlir
new file mode 100644
index 000000000000000..25349967e61d3e0
--- /dev/null
+++ b/mlir/test/Dialect/GPU/bufferization-buffer-deallocation.mlir
@@ -0,0 +1,18 @@
+// RUN: mlir-opt %s -buffer-deallocation-pipeline --allow-unregistered-dialect | FileCheck %s
+
+func.func @gpu_launch() {
+  %c1 = arith.constant 1 : index
+  gpu.launch blocks(%arg0, %arg1, %arg2) in (%arg6 = %c1, %arg7 = %c1, %arg8 = %c1)
+    threads(%arg3, %arg4, %arg5) in (%arg9 = %c1, %arg10 = %c1, %arg11 = %c1) {
+    %alloc = memref.alloc() : memref<2xf32>
+    "test.memref_user"(%alloc) : (memref<2xf32>) -> ()
+    gpu.terminator
+  }
+  return
+}
+
+// CHECK-LABEL: func @gpu_launch
+//       CHECK:   gpu.launch
+//       CHECK:     [[ALLOC:%.+]] = memref.alloc(
+//       CHECK:     memref.dealloc [[ALLOC]]
+//       CHECK:     gpu.terminator
diff --git a/utils/bazel/llvm-project-overlay/mlir/BUILD.bazel b/utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
index 7ae9b6173ec727f..3c167abbd5e9ae3 100644
--- a/utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
+++ b/utils/bazel/llvm-project-overlay/mlir/BUILD.bazel
@@ -4927,6 +4927,7 @@ cc_library(
         ":ArithDialect",
         ":AsmParser",
         ":AsyncDialect",
+        ":BufferizationDialect",
         ":ControlFlowDialect",
         ":DLTIDialect",
         ":DialectUtils",

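The ownership bookkeeping at the end of insertDeallocOpForReturnLike (the count_if plus take_front step) can be modeled outside MLIR. The following is only a rough sketch under stated assumptions, not the actual implementation: ToyValue and toyUpdatedOperandOwnerships are hypothetical names, ownerships are represented as plain strings, and the retained list is assumed to start with the memref operands, which is what the take_front call in the diff appears to rely on.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an SSA value; this is not an MLIR type.
struct ToyValue {
  std::string name;
  bool isMemref;
};

// Toy model of the helper's bookkeeping: every retained value gets one
// updated ownership, and the ownerships that belong to the terminator's
// memref operands are the first entries of that list.
std::vector<std::string>
toyUpdatedOperandOwnerships(const std::vector<ToyValue> &operands,
                            const std::vector<ToyValue> &otherRetained) {
  // Assumption: memref operands are placed first in the retained list,
  // followed by any other memrefs that have to be retained.
  std::vector<std::string> retained;
  for (const ToyValue &v : operands)
    if (v.isMemref)
      retained.push_back(v.name);
  for (const ToyValue &v : otherRetained)
    if (v.isMemref)
      retained.push_back(v.name);

  // One updated ownership per retained value, in the same order.
  std::vector<std::string> updatedConditions;
  for (const std::string &name : retained)
    updatedConditions.push_back("own(" + name + ")");

  // Counterpart of llvm::count_if(operands, isMemref) followed by
  // take_front(numMemrefOperands) in the diff.
  auto numMemrefOperands =
      std::count_if(operands.begin(), operands.end(),
                    [](const ToyValue &v) { return v.isMemref; });
  assert(static_cast<std::size_t>(numMemrefOperands) <=
         updatedConditions.size());
  return std::vector<std::string>(updatedConditions.begin(),
                                  updatedConditions.begin() +
                                      numMemrefOperands);
}
```

In this model, operands (%a: memref, %i: index, %b: memref) together with one further retained memref %c yield own(%a) and own(%b). An empty operand list, as passed by the gpu.terminator and scf.in_parallel callers in this patch, yields no operand ownerships at all.
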
maerhart force-pushed the merhart_bufferdeallocationinterface_gpu_terminator branch from 544823b to f03c09d on September 20, 2023 10:02
…r gpu.terminator

This is necessary to support deallocation of IR with gpu.launch operations
because it does not implement the RegionBranchOpInterface. Implementing the
interface would require it to support regions with unstructured control flow
and produced arguments/results.
maerhart force-pushed the merhart_bufferdeallocationinterface_gpu_terminator branch from f03c09d to 1192f5a on September 20, 2023 10:11
maerhart merged commit 522c1d0 into llvm:main on Sep 20, 2023
2 checks passed
maerhart deleted the merhart_bufferdeallocationinterface_gpu_terminator branch on September 20, 2023 10:28
Guzhu-AMD pushed a commit to GPUOpen-Drivers/llvm-project that referenced this pull request Sep 28, 2023
Local branch amd-gfx 286fb30 Merged main:71c83fb8b618 into amd-gfx:71157ae85e25
Remote branch main 522c1d0 [mlir][gpu][bufferization] Implement BufferDeallocationOpInterface for gpu.terminator (llvm#66880)

Change-Id: Ia9139d64b9c0614f8597c2ba63023ae3a7a10051