[MLIR][OpenMP] Make omp.distribute into a loop wrapper #87239

skatrak · 2024-04-01T13:50:23Z

This patch updates the definition of omp.distribute to enforce the restrictions of a wrapper operation.

This patch introduces an operation intended to hold loop information associated to the `omp.distribute`, `omp.simdloop`, `omp.taskloop` and `omp.wsloop` operations. This is a stopgap solution to unblock work on transitioning these operations to becoming wrappers, as discussed in [this RFC](https://discourse.llvm.org/t/rfc-representing-combined-composite-constructs-in-the-openmp-dialect/76986). Long-term, this operation will likely be replaced by `omp.canonical_loop`, which is being designed to address missing support for loop transformations, etc.

This patch defines a common interface to be shared by all OpenMP loop wrapper operations. The main restrictions these operations must meet in order to be considered a wrapper are: - They contain a single region. - Their region contains a single block. - Their block only contains another loop wrapper or `omp.loop_nest` and a terminator. The new interface is attached to the `omp.parallel`, `omp.wsloop`, `omp.simdloop`, `omp.distribute` and `omp.taskloop` operations. It is not currently enforced that these operations meet the wrapper restrictions, which would break existing OpenMP loop-generating code. Rather, this will be introduced progressively in subsequent patches.

This patch updates the definition of `omp.distribute` to enforce the restrictions of a wrapper operation.

llvmbot · 2024-04-01T13:50:57Z

@llvm/pr-subscribers-mlir-openmp
@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-flang-openmp

Author: Sergio Afonso (skatrak)

Changes

This patch updates the definition of omp.distribute to enforce the restrictions of a wrapper operation.

Full diff: https://github.com/llvm/llvm-project/pull/87239.diff

4 Files Affected:

(modified) mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td (+18-4)
(modified) mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp (+11)
(modified) mlir/test/Dialect/OpenMP/invalid.mlir (+33-1)
(modified) mlir/test/Dialect/OpenMP/ops.mlir (+31-7)

diff --git a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
index a7bf93deae2fb3..8dbfe447616e11 100644
--- a/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
+++ b/mlir/include/mlir/Dialect/OpenMP/OpenMPOps.td
@@ -816,7 +816,8 @@ def YieldOp : OpenMP_Op<"yield",
 //===----------------------------------------------------------------------===//
 def DistributeOp : OpenMP_Op<"distribute", [AttrSizedOperandSegments,
                              DeclareOpInterfaceMethods<LoopWrapperInterface>,
-                             RecursiveMemoryEffects]> {
+                             RecursiveMemoryEffects,
+                             SingleBlockImplicitTerminator<"TerminatorOp">]> {
   let summary = "distribute construct";
   let description = [{
     The distribute construct specifies that the iterations of one or more loops
@@ -831,15 +832,28 @@ def DistributeOp : OpenMP_Op<"distribute", [AttrSizedOperandSegments,
     The distribute loop construct specifies that the iterations of the loop(s)
     will be executed in parallel by threads in the current context. These
     iterations are spread across threads that already exist in the enclosing
-    region. The lower and upper bounds specify a half-open range: the
-    range includes the lower bound but does not include the upper bound. If the
-    `inclusive` attribute is specified then the upper bound is also included.
+    region.
+    
+    The body region can contain a single block which must contain a single
+    operation and a terminator. The operation must be another compatible loop
+    wrapper or an `omp.loop_nest`.
 
     The `dist_schedule_static` attribute specifies the  schedule for this
     loop, determining how the loop is distributed across the parallel threads.
     The optional `schedule_chunk` associated with this determines further
     controls this distribution.
 
+    ```mlir
+    omp.distribute <clauses> {
+      omp.loop_nest (%i1, %i2) : index = (%c0, %c0) to (%c10, %c10) step (%c1, %c1) {
+        %a = load %arrA[%i1, %i2] : memref<?x?xf32>
+        %b = load %arrB[%i1, %i2] : memref<?x?xf32>
+        %sum = arith.addf %a, %b : f32
+        store %sum, %arrC[%i1, %i2] : memref<?x?xf32>
+        omp.yield
+      }
+    }
+    ```
     // TODO: private_var, firstprivate_var, lastprivate_var, collapse
   }];
   let arguments = (ins
diff --git a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
index 564c23201db4fd..b407d27ef53e39 100644
--- a/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
+++ b/mlir/lib/Dialect/OpenMP/IR/OpenMPDialect.cpp
@@ -1516,6 +1516,17 @@ LogicalResult DistributeOp::verify() {
     return emitError(
         "expected equal sizes for allocate and allocator variables");
 
+  if (!isWrapper())
+    return emitOpError() << "must be a loop wrapper";
+
+  if (LoopWrapperInterface nested = getNestedWrapper()) {
+    // Check for the allowed leaf constructs that may appear in a composite
+    // construct directly after DISTRIBUTE.
+    if (!isa<ParallelOp, SimdLoopOp>(nested))
+      return emitError() << "only supported nested wrappers are 'omp.parallel' "
+                            "and 'omp.simdloop'";
+  }
+
   return success();
 }
 
diff --git a/mlir/test/Dialect/OpenMP/invalid.mlir b/mlir/test/Dialect/OpenMP/invalid.mlir
index 8f4103dabee5df..35f5d24deb5d17 100644
--- a/mlir/test/Dialect/OpenMP/invalid.mlir
+++ b/mlir/test/Dialect/OpenMP/invalid.mlir
@@ -1847,7 +1847,16 @@ func.func @omp_target_depend(%data_var: memref<i32>) {
 
 // -----
 
-func.func @omp_distribute(%data_var : memref<i32>) -> () {
+func.func @omp_distribute_schedule(%chunk_size : i32) -> () {
+  // expected-error @below {{op chunk size set without dist_schedule_static being present}}
+  "omp.distribute"(%chunk_size) <{operandSegmentSizes = array<i32: 1, 0, 0>}> ({
+      "omp.terminator"() : () -> ()
+    }) : (i32) -> ()
+}
+
+// -----
+
+func.func @omp_distribute_allocate(%data_var : memref<i32>) -> () {
   // expected-error @below {{expected equal sizes for allocate and allocator variables}}
   "omp.distribute"(%data_var) <{operandSegmentSizes = array<i32: 0, 1, 0>}> ({
       "omp.terminator"() : () -> ()
@@ -1856,6 +1865,29 @@ func.func @omp_distribute(%data_var : memref<i32>) -> () {
 
 // -----
 
+func.func @omp_distribute_wrapper() -> () {
+  // expected-error @below {{op must be a loop wrapper}}
+  "omp.distribute"() ({
+      %0 = arith.constant 0 : i32
+      "omp.terminator"() : () -> ()
+    }) : () -> ()
+}
+
+// -----
+
+func.func @omp_distribute_nested_wrapper(%data_var : memref<i32>) -> () {
+  // expected-error @below {{only supported nested wrappers are 'omp.parallel' and 'omp.simdloop'}}
+  "omp.distribute"() ({
+      "omp.wsloop"() ({
+        %0 = arith.constant 0 : i32
+        "omp.terminator"() : () -> ()
+      }) : () -> ()
+      "omp.terminator"() : () -> ()
+    }) : () -> ()
+}
+
+// -----
+
 omp.private {type = private} @x.privatizer : i32 alloc {
 ^bb0(%arg0: i32):
   %0 = arith.constant 0.0 : f32
diff --git a/mlir/test/Dialect/OpenMP/ops.mlir b/mlir/test/Dialect/OpenMP/ops.mlir
index 8d9acab67e0358..a7b0832eff21f3 100644
--- a/mlir/test/Dialect/OpenMP/ops.mlir
+++ b/mlir/test/Dialect/OpenMP/ops.mlir
@@ -559,30 +559,54 @@ func.func @omp_simdloop_pretty_multiple(%lb1 : index, %ub1 : index, %step1 : ind
 }
 
 // CHECK-LABEL: omp_distribute
-func.func @omp_distribute(%chunk_size : i32, %data_var : memref<i32>) -> () {
+func.func @omp_distribute(%chunk_size : i32, %data_var : memref<i32>, %arg0 : i32) -> () {
   // CHECK: omp.distribute
   "omp.distribute" () ({
-    omp.terminator
+    "omp.loop_nest" (%arg0, %arg0, %arg0) ({
+    ^bb0(%iv: i32):
+      "omp.yield"() : () -> ()
+    }) : (i32, i32, i32) -> ()
+    "omp.terminator"() : () -> ()
   }) {} : () -> ()
   // CHECK: omp.distribute
   omp.distribute {
-    omp.terminator
+    omp.loop_nest (%iv) : i32 = (%arg0) to (%arg0) step (%arg0) {
+      omp.yield
+    }
   }
   // CHECK: omp.distribute dist_schedule_static
   omp.distribute dist_schedule_static {
-    omp.terminator
+    omp.loop_nest (%iv) : i32 = (%arg0) to (%arg0) step (%arg0) {
+      omp.yield
+    }
   }
   // CHECK: omp.distribute dist_schedule_static chunk_size(%{{.+}} : i32)
   omp.distribute dist_schedule_static chunk_size(%chunk_size : i32) {
-    omp.terminator
+    omp.loop_nest (%iv) : i32 = (%arg0) to (%arg0) step (%arg0) {
+      omp.yield
+    }
   }
   // CHECK: omp.distribute order(concurrent)
   omp.distribute order(concurrent) {
-    omp.terminator
+    omp.loop_nest (%iv) : i32 = (%arg0) to (%arg0) step (%arg0) {
+      omp.yield
+    }
   }
   // CHECK: omp.distribute allocate(%{{.+}} : memref<i32> -> %{{.+}} : memref<i32>)
   omp.distribute allocate(%data_var : memref<i32> -> %data_var : memref<i32>) {
-    omp.terminator
+    omp.loop_nest (%iv) : i32 = (%arg0) to (%arg0) step (%arg0) {
+      omp.yield
+    }
+  }
+  // CHECK: omp.distribute
+  omp.distribute {
+    // TODO Remove induction variables from omp.simdloop.
+    omp.simdloop for (%iv) : i32 = (%arg0) to (%arg0) step (%arg0) {
+      omp.loop_nest (%iv2) : i32 = (%arg0) to (%arg0) step (%arg0) {
+        omp.yield
+      }
+      omp.yield
+    }
   }
 return
 }

…/spr/loop-nest-02-wrapper-iface

…s/skatrak/spr/loop-nest-03-distribute-mlir

skatrak · 2024-04-15T10:45:52Z

This PR is now up to date and ready to land. I'll wait one more day to give time in case someone else wants to share any further feedback. Thank you!

mjklemm

LGTM

skatrak added 3 commits March 29, 2024 16:07

[MLIR][OpenMP] Make omp.distribute into a loop wrapper

4537071

This patch updates the definition of `omp.distribute` to enforce the restrictions of a wrapper operation.

skatrak requested review from ergawy, Meinersbur, jsjodin, kiranchandramohan, abidh, mjklemm, raghavendhra, agozillon, DominikAdamski, TIFitis, kparzysz and bhandarkar-pranav April 1, 2024 13:50

llvmbot added mlir mlir:openmp flang:openmp labels Apr 1, 2024

Meinersbur approved these changes Apr 2, 2024

View reviewed changes

skatrak added 4 commits April 2, 2024 14:15

Update op description according to review comments

e3c440c

Address review comments

904f274

Merge branch 'users/skatrak/spr/loop-nest-01-mlir' into users/skatrak…

99e8123

…/spr/loop-nest-02-wrapper-iface

Merge branch 'users/skatrak/spr/loop-nest-02-wrapper-iface' into user…

66e13f9

…s/skatrak/spr/loop-nest-03-distribute-mlir

Base automatically changed from users/skatrak/spr/loop-nest-02-wrapper-iface to main April 15, 2024 09:33

Merge branch 'main' into users/skatrak/spr/loop-nest-03-distribute-mlir

abcf52a

mjklemm approved these changes Apr 15, 2024

View reviewed changes

skatrak mentioned this pull request Apr 15, 2024

[Flang][OpenMP][Lower] Split MLIR codegen for clauses and constructs #86963

Merged

Merge branch 'main' into users/skatrak/spr/loop-nest-03-distribute-mlir

8ce3f95

skatrak merged commit 70fe6ad into main Apr 16, 2024
3 of 4 checks passed

skatrak deleted the users/skatrak/spr/loop-nest-03-distribute-mlir branch April 16, 2024 09:24

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[MLIR][OpenMP] Make omp.distribute into a loop wrapper #87239

[MLIR][OpenMP] Make omp.distribute into a loop wrapper #87239

skatrak commented Apr 1, 2024

llvmbot commented Apr 1, 2024 •

edited

skatrak commented Apr 15, 2024

mjklemm left a comment

[MLIR][OpenMP] Make omp.distribute into a loop wrapper #87239

[MLIR][OpenMP] Make omp.distribute into a loop wrapper #87239

Conversation

skatrak commented Apr 1, 2024

llvmbot commented Apr 1, 2024 • edited

skatrak commented Apr 15, 2024

mjklemm left a comment

Choose a reason for hiding this comment

llvmbot commented Apr 1, 2024 •

edited