[mlir][memref][vector] Fold memref.subview into out-of-bounds vector transfer ops #80517

dcaballe · 2024-02-03T00:22:18Z

This PR removes a precondition check to fold a memref.subview into a vector.transfer_read or vector.transfer_write with in_bounds set to false. There is no reason to not to do so as long as the same in_bounds is preserved after the folding, which is what the implementation was doing already.

… ops This PR removes a precondition check to fold a `memref.subview` into a `vector.transfer_read` or `vector.transfer_write` with `in_bounds` set to false. There is no reason to not to do so as long as the same `in_bounds` is preserved after the folding, which is what the implementation was doing already.

llvmbot · 2024-02-03T00:22:59Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-memref

Author: Diego Caballero (dcaballe)

Changes

This PR removes a precondition check to fold a memref.subview into a vector.transfer_read or vector.transfer_write with in_bounds set to false. There is no reason to not to do so as long as the same in_bounds is preserved after the folding, which is what the implementation was doing already.

Full diff: https://github.com/llvm/llvm-project/pull/80517.diff

2 Files Affected:

(modified) mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp (-2)
(modified) mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir (+41-1)

diff --git a/mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp b/mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
index aa44455ada7f9..c15056cd168c9 100644
--- a/mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
+++ b/mlir/lib/Dialect/MemRef/Transforms/FoldMemRefAliasOps.cpp
@@ -348,8 +348,6 @@ preconditionsFoldSubViewOpImpl(RewriterBase &rewriter, XferOp xferOp,
   static_assert(
       !llvm::is_one_of<vector::TransferReadOp, vector::TransferWriteOp>::value,
       "must be a vector transfer op");
-  if (xferOp.hasOutOfBoundsDim())
-    return rewriter.notifyMatchFailure(xferOp, "out of bounds transfer dim");
   if (!subviewOp.hasUnitStride()) {
     return rewriter.notifyMatchFailure(
         xferOp, "non-1 stride subview, need to track strides in folded memref");
diff --git a/mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir b/mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
index 5b853a6cc5a37..6ad13d2176b34 100644
--- a/mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
+++ b/mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir
@@ -111,6 +111,27 @@ func.func @fold_subview_with_transfer_read(%arg0 : memref<12x32xf32>, %arg1 : in
 
 // -----
 
+func.func @fold_subview_with_oob_transfer_read(%arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index, %arg4 : index) -> vector<32xf32> {
+  %f1 = arith.constant 1.0 : f32
+
+  %0 = memref.subview %arg0[%arg1, %arg2][8, 8][1, 1] : memref<12x32xf32> to memref<8x8xf32, strided<[256, 1], offset: ?>>
+  %1 = vector.transfer_read %0[%arg3, %arg4], %f1 {in_bounds = [false]} : memref<8x8xf32, strided<[256, 1], offset: ?>>, vector<32xf32>
+  return %1 : vector<32xf32>
+}
+
+//      CHECK: #[[MAP:[a-zA-Z0-9]+]] = affine_map<()[s0, s1] -> (s0 + s1)>
+//      CHECK: func @fold_subview_with_oob_transfer_read
+// CHECK-SAME:   %[[MEM:[a-zA-Z0-9_]+]]: memref<12x32xf32>
+// CHECK-SAME:   %[[SZ0:[a-zA-Z0-9_]+]]: index
+// CHECK-SAME:   %[[SZ1:[a-zA-Z0-9_]+]]: index
+// CHECK-SAME:   %[[IDX0:[a-zA-Z0-9_]+]]: index
+// CHECK-SAME:   %[[IDX1:[a-zA-Z0-9_]+]]: index
+//      CHECK:   %[[M0:[a-zA-Z0-9_]+]] = affine.apply #[[MAP]]()[%[[SZ0]], %[[IDX0]]]
+//      CHECK:   %[[M1:[a-zA-Z0-9_]+]] = affine.apply #[[MAP]]()[%[[SZ1]], %[[IDX1]]]
+//      CHECK:   vector.transfer_read %[[MEM]][%[[M0]], %[[M1]]], %{{[a-zA-Z0-9]+}} : memref<12x32xf32>, vector<32xf32>
+
+// -----
+
 func.func @fold_static_stride_subview_with_transfer_write_0d(
     %arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index,
     %v : vector<f32>) {
@@ -141,6 +162,25 @@ func.func @fold_static_stride_subview_with_transfer_write(%arg0 : memref<12x32xf
 
 // -----
 
+func.func @fold_subview_with_oob_transfer_write(%arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index, %arg4 : index, %arg5 : vector<32xf32>) {
+  %0 = memref.subview %arg0[%arg1, %arg2][8, 8][1, 1] : memref<12x32xf32> to memref<8x8xf32, strided<[256, 1], offset: ?>>
+  vector.transfer_write %arg5, %0[%arg3, %arg4] {in_bounds = [false]} : vector<32xf32>, memref<8x8xf32, strided<[256, 1], offset: ?>>
+  return
+}
+//      CHECK: #[[MAP:[a-zA-Z0-9]+]] = affine_map<()[s0, s1] -> (s0 + s1)>
+//      CHECK: func @fold_subview_with_oob_transfer_write
+// CHECK-SAME:   %[[MEM:[a-zA-Z0-9_]+]]: memref<12x32xf32>
+// CHECK-SAME:   %[[SZ0:[a-zA-Z0-9_]+]]: index
+// CHECK-SAME:   %[[SZ1:[a-zA-Z0-9_]+]]: index
+// CHECK-SAME:   %[[IDX0:[a-zA-Z0-9_]+]]: index
+// CHECK-SAME:   %[[IDX1:[a-zA-Z0-9_]+]]: index
+// CHECK-SAME:   %[[ST1:[a-zA-Z0-9_]+]]: vector<32xf32>
+//      CHECK:   %[[M0:[a-zA-Z0-9_]+]] = affine.apply #[[MAP]]()[%[[SZ0]], %[[IDX0]]]
+//      CHECK:   %[[M1:[a-zA-Z0-9_]+]] = affine.apply #[[MAP]]()[%[[SZ1]], %[[IDX1]]]
+//      CHECK:   vector.transfer_write %[[ST1]], %[[MEM]][%[[M0]], %[[M1]]] : vector<32xf32>, memref<12x32xf32>
+
+// -----
+
 func.func @fold_rank_reducing_subview_with_load
     (%arg0 : memref<?x?x?x?x?x?xf32>, %arg1 : index, %arg2 : index,
      %arg3 : index, %arg4 : index, %arg5 : index, %arg6 : index,
@@ -633,7 +673,7 @@ func.func @fold_load_keep_nontemporal(%arg0 : memref<12x32xf32>, %arg1 : index,
 // -----
 
 // CHECK-LABEL: func @fold_store_keep_nontemporal(
-//      CHECK:   memref.store %{{.+}}, %{{.+}}[%{{.+}}, %{{.+}}]  {nontemporal = true} : memref<12x32xf32> 
+//      CHECK:   memref.store %{{.+}}, %{{.+}}[%{{.+}}, %{{.+}}]  {nontemporal = true} : memref<12x32xf32>
 func.func @fold_store_keep_nontemporal(%arg0 : memref<12x32xf32>, %arg1 : index, %arg2 : index, %arg3 : index, %arg4 : index, %arg5 : f32) {
   %0 = memref.subview %arg0[%arg1, %arg2][4, 4][2, 3] :
     memref<12x32xf32> to memref<4x4xf32, strided<[64, 3], offset: ?>>

matthias-springer · 2024-02-05T08:04:34Z

Is this really correct? The memref.subview could cause more padding to be added.

%0 = memref.subview %m[0][10][1] : memref<100xf32> to memref<10xf32>
// this will add padding
%1 = vector.transfer_read %0[%c0], %p {in_bounds = [false]} : memref<10xf32>, vector<20xf32>

It looks like after this change the IR would be:

// this will not add padding
%1 = vector.transfer_read %m[%c0], %p {in_bounds = [false]} : memref<100xf32>, vector<20xf32>

MaheshRavishankar · 2024-02-06T04:21:09Z

Is this really correct? The memref.subview could cause more padding to be added.

%0 = memref.subview %m[0][10][1] : memref<100xf32> to memref<10xf32>
// this will add padding
%1 = vector.transfer_read %0[%c0], %p {in_bounds = [false]} : memref<10xf32>, vector<20xf32>

It looks like after this change the IR would be:

// this will not add padding
%1 = vector.transfer_read %m[%c0], %p {in_bounds = [false]} : memref<100xf32>, vector<20xf32>

Geez, maybe the vector.transfer_read is carrying too much semantics... Is the padding value optional? If so we can fold this in the optional cases.

remove the approval because there are issues that I was not aware of.

dcaballe · 2024-02-06T07:52:53Z

Geez, maybe the vector.transfer_read is carrying too much semantics... Is the padding value optional? If so we can fold this in the optional cases.

No, it's required esp. for in_bounds = false since we use the pad value to fill the out-of-bounds elements. I think the in_bounds attribute became obsolete with the introduction of masking. It is actually a scalar implementation of it which should be redundant at this point but... we are still generating it :)

dcaballe requested review from MaheshRavishankar, hanhanW and nicolasvasilache February 3, 2024 00:22

llvmbot added mlir mlir:memref labels Feb 3, 2024

dcaballe mentioned this pull request Feb 3, 2024

[CPU] i4 pack op fails to compile iree-org/iree#16285

Closed

hanhanW previously approved these changes Feb 5, 2024

View reviewed changes

dcaballe closed this Apr 26, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[mlir][memref][vector] Fold memref.subview into out-of-bounds vector transfer ops #80517

[mlir][memref][vector] Fold memref.subview into out-of-bounds vector transfer ops #80517

Uh oh!

dcaballe commented Feb 3, 2024

Uh oh!

llvmbot commented Feb 3, 2024 •

edited

Loading

Uh oh!

matthias-springer commented Feb 5, 2024

Uh oh!

MaheshRavishankar commented Feb 6, 2024

Uh oh!

dcaballe commented Feb 6, 2024

Uh oh!

Uh oh!

[mlir][memref][vector] Fold memref.subview into out-of-bounds vector transfer ops #80517

[mlir][memref][vector] Fold memref.subview into out-of-bounds vector transfer ops #80517

Uh oh!

Conversation

dcaballe commented Feb 3, 2024

Uh oh!

llvmbot commented Feb 3, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

matthias-springer commented Feb 5, 2024

Uh oh!

MaheshRavishankar commented Feb 6, 2024

Uh oh!

dcaballe commented Feb 6, 2024

Uh oh!

Uh oh!

llvmbot commented Feb 3, 2024 •

edited

Loading