Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mlir][vector] Make the in_bounds attribute mandatory #97049

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

banach-space
Copy link
Contributor

@banach-space banach-space commented Jun 28, 2024

At the moment, the in_bounds attribute has two confusing/contradicting
properties:

  1. It is both optional and has an effective default-value.
  2. The default value is "out-of-bounds" for non-broadcast dims, and
    "in-bounds" for broadcast dims.

(see the isDimInBounds vector interface method for an example of this
"default" behaviour [1]).

This PR aims to clarify the logic surrounding the in_bounds attribute
by:

  • making the attribute mandatory (i.e. it is always present),
  • always setting the default value to "out of bounds" (that's
    consistent with the current behaviour for the most common cases).

Broadcast dimensions in tests

As per [2], the broadcast dimensions requires the corresponding
in_bounds attribute to be true:

  vector.transfer_read op requires broadcast dimensions to be in-bounds

The changes in this PR mean that we can no longer rely on the
default value in cases like the following (dim 0 is a broadcast dim):

  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>

Instead, the broadcast dimension has to explicitly be marked as "in
bounds:

  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>

All tests with broadcast dims are updated accordingly.

Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The following patterns in "Vectorization.cpp" are updated to explicitly
set the in_bounds attribute to false:

  • LinalgCopyVTRForwardingPattern and LinalgCopyVTWForwardingPattern

Also, vectorizeAffineLoad (from "SuperVectorize.cpp") and
vectorizeAsLinalgGeneric (from "Vectorization.cpp") are updated to
make sure that xfer Ops created by these hooks set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would complain

Note that there is no mechanism to verify whether the corresponding
memory access are indeed in bounds. Still, this is consistent with the
current behaviour where the broadcast dim would be implicitly assumed
to be "in bounds".

[1]

if (!$_op.getInBounds())
return false;
auto inBounds = ::llvm::cast<::mlir::ArrayAttr>(*$_op.getInBounds());
return ::llvm::cast<::mlir::BoolAttr>(inBounds[dim]).getValue();

[2] https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop

@banach-space banach-space force-pushed the andrzej/make_in_bounds_mandatory branch 2 times, most recently from e7848f5 to 925fd02 Compare July 1, 2024 09:39
@llvmbot
Copy link
Collaborator

llvmbot commented Jul 2, 2024

@llvm/pr-subscribers-mlir-affine
@llvm/pr-subscribers-mlir-sparse
@llvm/pr-subscribers-mlir-tensor
@llvm/pr-subscribers-mlir-bufferization
@llvm/pr-subscribers-mlir-scf
@llvm/pr-subscribers-mlir-memref
@llvm/pr-subscribers-mlir-nvgpu
@llvm/pr-subscribers-mlir-sme

@llvm/pr-subscribers-mlir

Author: Andrzej Warzyński (banach-space)

Changes

Makes the in_bounds attribute for vector.transfer_read and
vector.transfer_write Ops mandatory. In addition, makes the Asm printer
always print this attribute - tests are updated accordingly.

Updates in tests

Default in_bounds value

Originally, most tests would skip the in_bounds attribute - this was
equivalent to setting all values to false [1]. With this change, this
has to be done explicitly when writing a test. Note, especially when
reviewing this change, that the vast majority of newly inserted
in_bounds attributes are set to false to preserve the original
semantics of the tests.

There is only one exception - for broadcast dimensions the newly
inserted in_bounds attribute is set to true. As per [2]:

  vector.transfer_read op requires broadcast dimensions to be in-bounds

This matches the original semantics:

  • the in_bounds attribute in the context of broadcast dimensions
    would only be checked when present,
  • the verifier wasn't aware of the default value set in [1],

This means that effectively, the attribute was set to false even for
broadcast dims, but the verifier wasn't aware of that. This change makes
that behaviour more explicit by setting the attribute to true for
broadcast dims. In all other cases, the attribute is set to false - if
that's not the case, consider that as a typo.

0-D vectors

Reading and writing to/from 0D vectors also requires the in_bounds
attribute. In this case, the attribute has to be empty:

vector.transfer_write %5, %m1[] {in_bounds=[]} : vector&lt;f32&gt;, memref&lt;f32&gt;

CHECK lines

With this PR, the in_bounds attribute is always print. This required
updating the CHECK lines that previously assumed that the attribute
would be skipped. To keep this type of changes simple, I've only added
{{.*}} to make sure that tests pass.

Changes in "Vectorization.cpp"

The following patterns are updated to explicitly set the in_bounds
attribute to false:

  • LinalgCopyVTRForwardingPattern and LinalgCopyVTWForwardingPattern

Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The updates in vectorizeAffineLoad (SuperVectorize.cpp) and
vectorizeAsLinalgGeneric (Vectorization.cpp) are introduced to make
sure that xfer Ops created by these vectorisers set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would fail.

Note that there is no mechanism to verify whether the corresponding
memory access are indeed in bounds. Previously, when in_bounds was
optional, the verification would skip checking the attribute if it
wasn't present. However, it would default to false in other places.
Put differently, this change does not change the existing behaviour, it
merely makes it more explicit.

[1]

if (!$_op.getInBounds())
return false;
auto inBounds = ::llvm::cast<::mlir::ArrayAttr>(*$_op.getInBounds());
return ::llvm::cast<::mlir::BoolAttr>(inBounds[dim]).getValue();

[2] https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop


Patch is 314.73 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/97049.diff

104 Files Affected:

  • (modified) mlir/include/mlir/Dialect/Vector/IR/VectorOps.td (+2-2)
  • (modified) mlir/include/mlir/IR/AffineMap.h (+4)
  • (modified) mlir/include/mlir/Interfaces/VectorInterfaces.td (+2-2)
  • (modified) mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp (+13-1)
  • (modified) mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp (+19-7)
  • (modified) mlir/lib/Dialect/Vector/IR/VectorOps.cpp (+20-23)
  • (modified) mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp (+2-2)
  • (modified) mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp (+2-6)
  • (modified) mlir/lib/IR/AffineMap.cpp (+17)
  • (modified) mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir (+1-1)
  • (modified) mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir (+8-8)
  • (modified) mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir (+1-1)
  • (modified) mlir/test/Conversion/VectorToSCF/vector-to-scf-mask-and-permutation-map.mlir (+2-2)
  • (modified) mlir/test/Conversion/VectorToSCF/vector-to-scf.mlir (+33-33)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vector_utils.mlir (+1-1)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_1d.mlir (+10-10)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_2d.mlir (+5-5)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_affine_apply.mlir (+6-6)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_2d.mlir (+1-1)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_outer_loop_transpose_2d.mlir (+4-4)
  • (modified) mlir/test/Dialect/Affine/SuperVectorize/vectorize_transpose_2d.mlir (+4-4)
  • (modified) mlir/test/Dialect/ArmSME/vector-legalization.mlir (+5-5)
  • (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-analysis-bottom-up-from-terminators.mlir (+2-2)
  • (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize-partial.mlir (+7-7)
  • (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-bufferize.mlir (+5-5)
  • (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-analysis.mlir (+11-11)
  • (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize-force-copy-before-write.mlir (+2-2)
  • (modified) mlir/test/Dialect/Bufferization/Transforms/one-shot-module-bufferize.mlir (+4-4)
  • (modified) mlir/test/Dialect/Bufferization/Transforms/transform-ops.mlir (+5-5)
  • (modified) mlir/test/Dialect/Linalg/forward-vector-transfers.mlir (+8-5)
  • (modified) mlir/test/Dialect/Linalg/hoisting.mlir (+50-50)
  • (modified) mlir/test/Dialect/Linalg/one-shot-bufferize.mlir (+3-3)
  • (modified) mlir/test/Dialect/Linalg/transform-op-bufferize-to-allocation.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/vectorization-with-patterns.mlir (+9-9)
  • (modified) mlir/test/Dialect/Linalg/vectorization.mlir (+1-1)
  • (modified) mlir/test/Dialect/Linalg/vectorize-tensor-extract.mlir (+1-1)
  • (modified) mlir/test/Dialect/MemRef/extract-address-computations.mlir (+10-10)
  • (modified) mlir/test/Dialect/MemRef/fold-memref-alias-ops.mlir (+1-1)
  • (modified) mlir/test/Dialect/NVGPU/transform-pipeline-shared.mlir (+4-4)
  • (modified) mlir/test/Dialect/SCF/one-shot-bufferize-analysis.mlir (+19-19)
  • (modified) mlir/test/Dialect/SCF/one-shot-bufferize.mlir (+4-3)
  • (modified) mlir/test/Dialect/Tensor/fold-tensor-subset-ops.mlir (+1-1)
  • (modified) mlir/test/Dialect/Tensor/one-shot-bufferize.mlir (+2-2)
  • (modified) mlir/test/Dialect/Vector/bufferize-invalid.mlir (+1-1)
  • (modified) mlir/test/Dialect/Vector/canonicalize.mlir (+22-22)
  • (modified) mlir/test/Dialect/Vector/invalid.mlir (+27-26)
  • (modified) mlir/test/Dialect/Vector/lower-vector-mask.mlir (+6-6)
  • (modified) mlir/test/Dialect/Vector/one-shot-bufferize.mlir (+4-4)
  • (modified) mlir/test/Dialect/Vector/ops.mlir (+49-49)
  • (modified) mlir/test/Dialect/Vector/scalar-vector-transfer-to-memref.mlir (+6-6)
  • (modified) mlir/test/Dialect/Vector/value-bounds-op-interface-impl.mlir (+1-1)
  • (modified) mlir/test/Dialect/Vector/vector-emulate-narrow-type.mlir (+2-2)
  • (modified) mlir/test/Dialect/Vector/vector-transfer-collapse-inner-most-dims.mlir (+7-7)
  • (modified) mlir/test/Dialect/Vector/vector-transfer-drop-unit-dims-patterns.mlir (+7-7)
  • (modified) mlir/test/Dialect/Vector/vector-transfer-flatten.mlir (+12-12)
  • (modified) mlir/test/Dialect/Vector/vector-transfer-full-partial-split-copy-transform.mlir (+4-4)
  • (modified) mlir/test/Dialect/Vector/vector-transfer-full-partial-split.mlir (+11-11)
  • (modified) mlir/test/Dialect/Vector/vector-transfer-permutation-lowering.mlir (+2-2)
  • (modified) mlir/test/Dialect/Vector/vector-transfer-to-vector-load-store.mlir (+21-21)
  • (modified) mlir/test/Dialect/Vector/vector-transfer-unroll.mlir (+12-12)
  • (modified) mlir/test/Dialect/Vector/vector-transforms.mlir (+19-19)
  • (modified) mlir/test/Dialect/Vector/vector-warp-distribute.mlir (+33-33)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/dual_sparse_conv_2d.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/padded_sparse_conv_2d.mlir (+2-2)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_block_matmul.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cast.mlir (+10-10)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_cmp.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_collapse_shape.mlir (+6-6)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_1d_nwc_wcf.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d.mlir (+2-2)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_55.mlir (+6-6)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nchw_fchw.mlir (+4-4)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_2d_nhwc_hwcf.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conv_3d_ndhwc_dhwcf.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_element.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2dense.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_conversion_sparse2sparse.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_coo_test.mlir (+4-4)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_dilated_conv_2d_nhwc_hwcf.mlir (+4-4)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_expand_shape.mlir (+6-6)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_filter_conv2d.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_index_dense.mlir (+8-8)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_matvec.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pack.mlir (+5-5)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_permute.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_pooling_nhwc.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_quantized_matmul.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_push_back.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_rewrite_sort_coo.mlir (+15-15)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_matmul.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_sampled_mm_fusion.mlir (+2-2)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_spmm.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_strided_conv_2d_nhwc_hwcf.mlir (+4-4)
  • (modified) mlir/test/Integration/Dialect/SparseTensor/CPU/sparse_unary.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Standard/CPU/test-ceil-floor-pos-neg.mlir (+1-1)
  • (modified) mlir/test/Integration/Dialect/Vector/CPU/realloc.mlir (+3-3)
  • (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-read-1d.mlir (+5-5)
  • (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-read-2d.mlir (+9-9)
  • (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-read-3d.mlir (+9-4)
  • (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-read.mlir (+3-3)
  • (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-to-loops.mlir (+7-7)
  • (modified) mlir/test/Integration/Dialect/Vector/CPU/transfer-write.mlir (+4-3)
  • (modified) mlir/test/Transforms/loop-invariant-subset-hoisting.mlir (+39-39)
diff --git a/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td b/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
index 097e5e6fb0d61..4a77291b7fafb 100644
--- a/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
+++ b/mlir/include/mlir/Dialect/Vector/IR/VectorOps.td
@@ -1363,7 +1363,7 @@ def Vector_TransferReadOp :
                    AffineMapAttr:$permutation_map,
                    AnyType:$padding,
                    Optional<VectorOf<[I1]>>:$mask,
-                   OptionalAttr<BoolArrayAttr>:$in_bounds)>,
+                   BoolArrayAttr:$in_bounds)>,
     Results<(outs AnyVectorOfAnyRank:$vector)> {
 
   let summary = "Reads a supervector from memory into an SSA vector value.";
@@ -1607,7 +1607,7 @@ def Vector_TransferWriteOp :
                    Variadic<Index>:$indices,
                    AffineMapAttr:$permutation_map,
                    Optional<VectorOf<[I1]>>:$mask,
-                   OptionalAttr<BoolArrayAttr>:$in_bounds)>,
+                   BoolArrayAttr:$in_bounds)>,
     Results<(outs Optional<AnyRankedTensor>:$result)> {
 
   let summary = "The vector.transfer_write op writes a supervector to memory.";
diff --git a/mlir/include/mlir/IR/AffineMap.h b/mlir/include/mlir/IR/AffineMap.h
index cce141253989e..01772f7782ba8 100644
--- a/mlir/include/mlir/IR/AffineMap.h
+++ b/mlir/include/mlir/IR/AffineMap.h
@@ -156,6 +156,10 @@ class AffineMap {
   bool isMinorIdentityWithBroadcasting(
       SmallVectorImpl<unsigned> *broadcastedDims = nullptr) const;
 
+  // TODO: Document
+  void
+  getBroadcastDims(SmallVectorImpl<unsigned> *broadcastedDims = nullptr) const;
+
   /// Return true if this affine map can be converted to a minor identity with
   /// broadcast by doing a permute. Return a permutation (there may be
   /// several) to apply to get to a minor identity with broadcasts.
diff --git a/mlir/include/mlir/Interfaces/VectorInterfaces.td b/mlir/include/mlir/Interfaces/VectorInterfaces.td
index 781d6d3e3f813..f6682f2eabe1e 100644
--- a/mlir/include/mlir/Interfaces/VectorInterfaces.td
+++ b/mlir/include/mlir/Interfaces/VectorInterfaces.td
@@ -98,7 +98,7 @@ def VectorTransferOpInterface : OpInterface<"VectorTransferOpInterface"> {
         dimension whether it is in-bounds or not. (Broadcast dimensions are
         always in-bounds).
       }],
-      /*retTy=*/"::std::optional<::mlir::ArrayAttr>",
+      /*retTy=*/"::mlir::ArrayAttr",
       /*methodName=*/"getInBounds",
       /*args=*/(ins)
     >,
@@ -242,7 +242,7 @@ def VectorTransferOpInterface : OpInterface<"VectorTransferOpInterface"> {
         return true;
       if (!$_op.getInBounds())
         return false;
-      auto inBounds = ::llvm::cast<::mlir::ArrayAttr>(*$_op.getInBounds());
+      auto inBounds = $_op.getInBounds();
       return ::llvm::cast<::mlir::BoolAttr>(inBounds[dim]).getValue();
     }
 
diff --git a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
index 71e9648a5e00f..033f35391849e 100644
--- a/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
+++ b/mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp
@@ -1223,8 +1223,20 @@ static Operation *vectorizeAffineLoad(AffineLoadOp loadOp,
   LLVM_DEBUG(dbgs() << "\n[early-vect]+++++ permutationMap: ");
   LLVM_DEBUG(permutationMap.print(dbgs()));
 
+  // Make sure that the in_bounds attribute corresponding to a broadcast dim
+  // is set to `true` - that's required by the xfer Op.
+  // FIXME: We're not veryfying whether the corresponding access is in bounds.
+  // TODO: Use masking instead.
+  SmallVector<unsigned> broadcastedDims = {};
+  permutationMap.getBroadcastDims(&broadcastedDims);
+  SmallVector<bool> inBounds(vectorType.getRank(), false);
+
+  for (auto idx : broadcastedDims)
+    inBounds[idx] = true;
+
   auto transfer = state.builder.create<vector::TransferReadOp>(
-      loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap);
+      loadOp.getLoc(), vectorType, loadOp.getMemRef(), indices, permutationMap,
+      ArrayRef<bool>(inBounds));
 
   // Register replacement for future uses in the scope.
   state.registerOpVectorReplacement(loadOp, transfer);
diff --git a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
index 3a75d2ac08157..c15d101afa94b 100644
--- a/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
+++ b/mlir/lib/Dialect/Linalg/Transforms/Vectorization.cpp
@@ -1338,8 +1338,18 @@ vectorizeAsLinalgGeneric(RewriterBase &rewriter, VectorizationState &state,
 
     SmallVector<Value> indices(linalgOp.getShape(opOperand).size(), zero);
 
+    // Make sure that the in_bounds attribute corresponding to a broadcast dim
+    // is `true`
+    SmallVector<unsigned> broadcastedDims = {};
+    readMap.getBroadcastDims(&broadcastedDims);
+    SmallVector<bool> inBounds(readType.getRank(), false);
+
+    for (auto idx : broadcastedDims)
+      inBounds[idx] = true;
+
     Operation *read = rewriter.create<vector::TransferReadOp>(
-        loc, readType, opOperand->get(), indices, readMap);
+        loc, readType, opOperand->get(), indices, readMap,
+        ArrayRef<bool>(inBounds));
     read = state.maskOperation(rewriter, read, linalgOp, maskingMap);
     Value readValue = read->getResult(0);
 
@@ -2676,11 +2686,12 @@ LogicalResult LinalgCopyVTRForwardingPattern::matchAndRewrite(
   // The `masked` attribute is only valid on this padded buffer.
   // When forwarding to vector.transfer_read, the attribute must be reset
   // conservatively.
+  auto vectorType = xferOp.getVectorType();
   Value res = rewriter.create<vector::TransferReadOp>(
-      xferOp.getLoc(), xferOp.getVectorType(), in, xferOp.getIndices(),
+      xferOp.getLoc(), vectorType, in, xferOp.getIndices(),
       xferOp.getPermutationMapAttr(), xferOp.getPadding(), xferOp.getMask(),
-      // in_bounds is explicitly reset
-      /*inBoundsAttr=*/ArrayAttr());
+      rewriter.getBoolArrayAttr(
+          SmallVector<bool>(vectorType.getRank(), false)));
 
   if (maybeFillOp)
     rewriter.eraseOp(maybeFillOp);
@@ -2734,11 +2745,12 @@ LogicalResult LinalgCopyVTWForwardingPattern::matchAndRewrite(
   // The `masked` attribute is only valid on this padded buffer.
   // When forwarding to vector.transfer_write, the attribute must be reset
   // conservatively.
+  auto vector = xferOp.getVector();
   rewriter.create<vector::TransferWriteOp>(
-      xferOp.getLoc(), xferOp.getVector(), out, xferOp.getIndices(),
+      xferOp.getLoc(), vector, out, xferOp.getIndices(),
       xferOp.getPermutationMapAttr(), xferOp.getMask(),
-      // in_bounds is explicitly reset
-      /*inBoundsAttr=*/ArrayAttr());
+      rewriter.getBoolArrayAttr(
+          SmallVector<bool>(vector.getType().getRank(), false)));
 
   rewriter.eraseOp(copyOp);
   rewriter.eraseOp(xferOp);
diff --git a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
index 149723f51cc12..cb4cb92a66a93 100644
--- a/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
+++ b/mlir/lib/Dialect/Vector/IR/VectorOps.cpp
@@ -3817,7 +3817,8 @@ void TransferReadOp::build(OpBuilder &builder, OperationState &result,
   auto permutationMapAttr = AffineMapAttr::get(permutationMap);
   auto inBoundsAttr = (inBounds && !inBounds.value().empty())
                           ? builder.getBoolArrayAttr(inBounds.value())
-                          : ArrayAttr();
+                          : builder.getBoolArrayAttr(
+                                SmallVector<bool>(vectorType.getRank(), false));
   build(builder, result, vectorType, source, indices, permutationMapAttr,
         inBoundsAttr);
 }
@@ -3832,7 +3833,8 @@ void TransferReadOp::build(OpBuilder &builder, OperationState &result,
   auto permutationMapAttr = AffineMapAttr::get(permutationMap);
   auto inBoundsAttr = (inBounds && !inBounds.value().empty())
                           ? builder.getBoolArrayAttr(inBounds.value())
-                          : ArrayAttr();
+                          : builder.getBoolArrayAttr(
+                                SmallVector<bool>(vectorType.getRank(), false));
   build(builder, result, vectorType, source, indices, permutationMapAttr,
         padding,
         /*mask=*/Value(), inBoundsAttr);
@@ -3950,17 +3952,15 @@ verifyTransferOp(VectorTransferOpInterface op, ShapedType shapedType,
            << inferredMaskType << ") and mask operand type (" << maskType
            << ") don't match";
 
-  if (inBounds) {
-    if (permutationMap.getNumResults() != static_cast<int64_t>(inBounds.size()))
-      return op->emitOpError("expects the optional in_bounds attr of same rank "
-                             "as permutation_map results: ")
-             << AffineMapAttr::get(permutationMap)
-             << " vs inBounds of size: " << inBounds.size();
-    for (unsigned int i = 0; i < permutationMap.getNumResults(); ++i)
-      if (isa<AffineConstantExpr>(permutationMap.getResult(i)) &&
-          !llvm::cast<BoolAttr>(inBounds.getValue()[i]).getValue())
-        return op->emitOpError("requires broadcast dimensions to be in-bounds");
-  }
+  if (permutationMap.getNumResults() != static_cast<int64_t>(inBounds.size()))
+    return op->emitOpError("expects the in_bounds attr of same rank "
+                           "as permutation_map results: ")
+           << AffineMapAttr::get(permutationMap)
+           << " vs inBounds of size: " << inBounds.size();
+  for (unsigned int i = 0; i < permutationMap.getNumResults(); ++i)
+    if (isa<AffineConstantExpr>(permutationMap.getResult(i)) &&
+        !llvm::cast<BoolAttr>(inBounds.getValue()[i]).getValue())
+      return op->emitOpError("requires broadcast dimensions to be in-bounds");
 
   return success();
 }
@@ -3970,9 +3970,6 @@ static void printTransferAttrs(OpAsmPrinter &p, VectorTransferOpInterface op) {
   elidedAttrs.push_back(TransferReadOp::getOperandSegmentSizeAttr());
   if (op.getPermutationMap().isMinorIdentity())
     elidedAttrs.push_back(op.getPermutationMapAttrName());
-  // Elide in_bounds attribute if all dims are out-of-bounds.
-  if (llvm::none_of(op.getInBoundsValues(), [](bool b) { return b; }))
-    elidedAttrs.push_back(op.getInBoundsAttrName());
   p.printOptionalAttrDict(op->getAttrs(), elidedAttrs);
 }
 
@@ -4080,8 +4077,7 @@ LogicalResult TransferReadOp::verify() {
 
   if (failed(verifyTransferOp(cast<VectorTransferOpInterface>(getOperation()),
                               shapedType, vectorType, maskType,
-                              inferredMaskType, permutationMap,
-                              getInBounds() ? *getInBounds() : ArrayAttr())))
+                              inferredMaskType, permutationMap, getInBounds())))
     return failure();
 
   if (auto sourceVectorElementType =
@@ -4354,9 +4350,11 @@ void TransferWriteOp::build(OpBuilder &builder, OperationState &result,
                             AffineMap permutationMap,
                             std::optional<ArrayRef<bool>> inBounds) {
   auto permutationMapAttr = AffineMapAttr::get(permutationMap);
-  auto inBoundsAttr = (inBounds && !inBounds.value().empty())
-                          ? builder.getBoolArrayAttr(inBounds.value())
-                          : ArrayAttr();
+  auto inBoundsAttr =
+      (inBounds && !inBounds.value().empty())
+          ? builder.getBoolArrayAttr(inBounds.value())
+          : builder.getBoolArrayAttr(SmallVector<bool>(
+                llvm::cast<VectorType>(vector.getType()).getRank(), false));
   build(builder, result, vector, dest, indices, permutationMapAttr,
         /*mask=*/Value(), inBoundsAttr);
 }
@@ -4462,8 +4460,7 @@ LogicalResult TransferWriteOp::verify() {
 
   if (failed(verifyTransferOp(cast<VectorTransferOpInterface>(getOperation()),
                               shapedType, vectorType, maskType,
-                              inferredMaskType, permutationMap,
-                              getInBounds() ? *getInBounds() : ArrayAttr())))
+                              inferredMaskType, permutationMap, getInBounds())))
     return failure();
 
   return verifyPermutationMap(permutationMap,
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
index f53bb5157eb37..dfeb7bc53adad 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorMask.cpp
@@ -224,7 +224,7 @@ struct MaskedTransferReadOpPattern
     rewriter.replaceOpWithNewOp<TransferReadOp>(
         maskingOp.getOperation(), readOp.getVectorType(), readOp.getSource(),
         readOp.getIndices(), readOp.getPermutationMap(), readOp.getPadding(),
-        maskingOp.getMask(), readOp.getInBounds().value_or(ArrayAttr()));
+        maskingOp.getMask(), readOp.getInBounds());
     return success();
   }
 };
@@ -246,7 +246,7 @@ struct MaskedTransferWriteOpPattern
     rewriter.replaceOpWithNewOp<TransferWriteOp>(
         maskingOp.getOperation(), resultType, writeOp.getVector(),
         writeOp.getSource(), writeOp.getIndices(), writeOp.getPermutationMap(),
-        maskingOp.getMask(), writeOp.getInBounds().value_or(ArrayAttr()));
+        maskingOp.getMask(), writeOp.getInBounds());
     return success();
   }
 };
diff --git a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
index c31c51489ecc9..b3c6dec47f6be 100644
--- a/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
+++ b/mlir/lib/Dialect/Vector/Transforms/LowerVectorTransfer.cpp
@@ -133,9 +133,7 @@ struct TransferReadPermutationLowering
 
     // Transpose in_bounds attribute.
     ArrayAttr newInBoundsAttr =
-        op.getInBounds() ? inverseTransposeInBoundsAttr(
-                               rewriter, op.getInBounds().value(), permutation)
-                         : ArrayAttr();
+        inverseTransposeInBoundsAttr(rewriter, op.getInBounds(), permutation);
 
     // Generate new transfer_read operation.
     VectorType newReadType = VectorType::get(
@@ -208,9 +206,7 @@ struct TransferWritePermutationLowering
 
     // Transpose in_bounds attribute.
     ArrayAttr newInBoundsAttr =
-        op.getInBounds() ? inverseTransposeInBoundsAttr(
-                               rewriter, op.getInBounds().value(), permutation)
-                         : ArrayAttr();
+        inverseTransposeInBoundsAttr(rewriter, op.getInBounds(), permutation);
 
     // Generate new transfer_write operation.
     Value newVec = rewriter.create<vector::TransposeOp>(
diff --git a/mlir/lib/IR/AffineMap.cpp b/mlir/lib/IR/AffineMap.cpp
index e5993eb08dc8b..3df6d1e81d0f8 100644
--- a/mlir/lib/IR/AffineMap.cpp
+++ b/mlir/lib/IR/AffineMap.cpp
@@ -188,6 +188,23 @@ bool AffineMap::isMinorIdentityWithBroadcasting(
   return true;
 }
 
+void AffineMap::getBroadcastDims(
+    SmallVectorImpl<unsigned> *broadcastedDims) const {
+  if (broadcastedDims)
+    broadcastedDims->clear();
+  for (const auto &idxAndExpr : llvm::enumerate(getResults())) {
+    unsigned resIdx = idxAndExpr.index();
+    AffineExpr expr = idxAndExpr.value();
+    if (auto constExpr = dyn_cast<AffineConstantExpr>(expr)) {
+      // Each result may be either a constant 0 (broadcasted dimension).
+      if (constExpr.getValue() != 0)
+        continue;
+      if (broadcastedDims)
+        broadcastedDims->push_back(resIdx);
+    }
+  }
+}
+
 /// Return true if this affine map can be converted to a minor identity with
 /// broadcast by doing a permute. Return a permutation (there may be
 /// several) to apply to get to a minor identity with broadcasts.
diff --git a/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir b/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
index 1abadcc345cd2..d73efd41cce05 100644
--- a/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
+++ b/mlir/test/Conversion/VectorToLLVM/vector-mask-to-llvm.mlir
@@ -73,6 +73,6 @@ func.func @genbool_var_1d_scalable(%arg0: index) -> vector<[11]xi1> {
 
 func.func @transfer_read_1d(%A : memref<?xf32>, %i: index) -> vector<16xf32> {
   %d = arith.constant -1.0: f32
-  %f = vector.transfer_read %A[%i], %d {permutation_map = affine_map<(d0) -> (d0)>} : memref<?xf32>, vector<16xf32>
+  %f = vector.transfer_read %A[%i], %d {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} : memref<?xf32>, vector<16xf32>
   return %f : vector<16xf32>
 }
diff --git a/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir b/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
index 09b79708a9ab2..36a3c6eeb175f 100644
--- a/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
+++ b/mlir/test/Conversion/VectorToLLVM/vector-to-llvm.mlir
@@ -1689,10 +1689,10 @@ func.func @matrix_ops_index(%A: vector<64xindex>, %B: vector<48xindex>) -> vecto
 func.func @transfer_read_1d(%A : memref<?xf32>, %base: index) -> vector<17xf32> {
   %f7 = arith.constant 7.0: f32
   %f = vector.transfer_read %A[%base], %f7
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     memref<?xf32>, vector<17xf32>
   vector.transfer_write %f, %A[%base]
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     vector<17xf32>, memref<?xf32>
   return %f: vector<17xf32>
 }
@@ -1763,10 +1763,10 @@ func.func @transfer_read_1d(%A : memref<?xf32>, %base: index) -> vector<17xf32>
 func.func @transfer_read_index_1d(%A : memref<?xindex>, %base: index) -> vector<17xindex> {
   %f7 = arith.constant 7: index
   %f = vector.transfer_read %A[%base], %f7
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     memref<?xindex>, vector<17xindex>
   vector.transfer_write %f, %A[%base]
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     vector<17xindex>, memref<?xindex>
   return %f: vector<17xindex>
 }
@@ -1786,7 +1786,7 @@ func.func @transfer_read_index_1d(%A : memref<?xindex>, %base: index) -> vector<
 func.func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: index) -> vector<17xf32> {
   %f7 = arith.constant 7.0: f32
   %f = vector.transfer_read %A[%base0, %base1], %f7
-      {permutation_map = affine_map<(d0, d1) -> (d1)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0, d1) -> (d1)>} :
     memref<?x?xf32>, vector<17xf32>
   return %f: vector<17xf32>
 }
@@ -1815,10 +1815,10 @@ func.func @transfer_read_2d_to_1d(%A : memref<?x?xf32>, %base0: index, %base1: i
 func.func @transfer_read_1d_non_zero_addrspace(%A : memref<?xf32, 3>, %base: index) -> vector<17xf32> {
   %f7 = arith.constant 7.0: f32
   %f = vector.transfer_read %A[%base], %f7
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     memref<?xf32, 3>, vector<17xf32>
   vector.transfer_write %f, %A[%base]
-      {permutation_map = affine_map<(d0) -> (d0)>} :
+      {in_bounds = [false], permutation_map = affine_map<(d0) -> (d0)>} :
     vector<17xf32>, memref<?xf32, 3>
   return %f: vector<17xf32>
 }
@@ -1866,7 +1866,7 @@ func.func @transfer_read_1d_inbounds(%A : memref<?xf32>, %base: index) -> vector
 func.func @transfer_read_1d_mask(%A : memref<?xf32>, %base : index) -> vector<5xf32> {
   %m = arith.constant dense<[0, 0, 1, 0, 1]> : vector<5xi1>
   %f7 = arith.constant 7.0: f32
-  %f = vector.transfer_read %A[%base], %f7, %m : memref<?xf32>, vector<5xf32>
+  %f = vector.transfer_read %A[%base], %f7, %m {in_bounds=[false]} : memref<?xf32>, vector<5xf32>
   return %f: vector<5xf32>
 }
 
diff --git a/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir b/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
index 7d97829c06599..598530eecec64 100644
--- a/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
+++ b/mlir/test/Conversion/VectorToSCF/unrolled-vector-to-loops.mlir
@@ -51,7 +51,7 @@ func.func @transfer_read_out_of_bounds(%A : memref<?x?x?xf32>) -> (vector<2x3x4x
   // CHECK: vector.transfer_read {{.*}} : memref<?x?x?xf32>, vector<4xf32>
   // CHECK: vector.insert {{.*}} [1, 2] : vector<4xf32> into vector<2x3x4xf32>
   // CHECK-NOT: scf.for
-  %vec = vector.transfer_read %A[%c0, %c0, %c0], %f0 : memref<?x?x?xf32>, vector<2x3x4xf32>
+  %vec = vector.transfer_read %A[%c0, %c0, %c0], %f0 {in_...
[truncated]

@banach-space
Copy link
Contributor Author

For context - this is a follow-up for #96031. Sadly, there's a lot of churn in the tests. To make reviewing easier - *test and code updates have been split into 2 separate commits.

The updates in the tests are quite dense. Ideally we should avoid that, but I struggled to find a solution that would work. I might be missing something obvious - suggestions are very welcome.

@@ -1363,7 +1363,7 @@ def Vector_TransferReadOp :
AffineMapAttr:$permutation_map,
AnyType:$padding,
Optional<VectorOf<[I1]>>:$mask,
OptionalAttr<BoolArrayAttr>:$in_bounds)>,
BoolArrayAttr:$in_bounds)>,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use a DenseBoolArrayAttr?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ping on this thread? (maybe you missed it)
(It might be something to address in a separate PR as well)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did try this briefly, but hit build errors :(

Yes, I'd rather leave this for a separate PR.

@joker-eph
Copy link
Collaborator

We're lacking a motivation in the description, mostly around why we wouldn't want a default value?

@banach-space
Copy link
Contributor Author

banach-space commented Jul 2, 2024

We're lacking a motivation in the description

Added extra paragraph to the summary.

, mostly around why we wouldn't want a default value?

I blamed the limitations of ODS. But it might be a limitation in my understanding of ODS - pointers are very welcome.

I found a way to keep the default value. I will be updating this shortly.

@banach-space banach-space force-pushed the andrzej/make_in_bounds_mandatory branch from 01bf155 to b46f737 Compare July 4, 2024 20:37
@banach-space
Copy link
Contributor Author

Summary of updates:

  1. Restored the default value for the in_bounds attribute ("out of bounds")
  2. Reverted most of the updates in tests (these are no longer needed) - this significantly reduces the code churn
  3. Re-wrote the summary (added more context re broadcast dims)

@banach-space banach-space force-pushed the andrzej/make_in_bounds_mandatory branch from b46f737 to b24edbf Compare July 4, 2024 20:43
Copy link

github-actions bot commented Jul 5, 2024

✅ With the latest revision this PR passed the Python code formatter.

@banach-space banach-space force-pushed the andrzej/make_in_bounds_mandatory branch 2 times, most recently from b94af64 to 89af514 Compare July 5, 2024 20:35
)
vector.TransferReadOp(
vector_type, A, [zero, zero], identity_map_attr, padding
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're losing the default in Python? Can this be recovered somehow?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be recovered somehow?

I don't think so.

From a quick scan, the Python bindings are build based on what's in ODS. However, the default value for in_bounds is implemented by customising C++:

  • the Op parsers (updated here), and
  • the Op builders (already flexible enough).

IIUC, such C++ customisation are not visible/available through Python bindings.

Copy link
Contributor

@dcaballe dcaballe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for addressing the feedback. Much better! LGTM but please wait for Mehdi's ok his open questions.

/// * (d0, d1, d2) -> (0, d1) gives [0]
/// * (d0, d1, d2) -> (d2, d1) gives []
/// * (d0, d1, d2, d4) -> (d0, 0, d1, 0) gives [1, 3]
SmallVector<unsigned> getBroadcastDims() const;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Have you looked at isMinorIdentityWithBroadcasting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes - that was my source of inspiration :) It's just below getBroadcastDims (though I am about to move it a bit).

Note, isMinorIdentityWithBroadcasting wouldn't work as maps in xfer_read/xfer_write can be permutations. I also tried re-using getBroadcastDims inside isMinorIdentityWithBroadcasting, but that didn't feel right (perhaps isMinorIdentityWithBroadcasting should be split?).

@@ -1223,8 +1223,19 @@ static Operation *vectorizeAffineLoad(AffineLoadOp loadOp,
LLVM_DEBUG(dbgs() << "\n[early-vect]+++++ permutationMap: ");
LLVM_DEBUG(permutationMap.print(dbgs()));

// Make sure that the in_bounds attribute corresponding to a broadcast dim
// is set to `true` - that's required by the xfer Op.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nothing for this PR but this is still a constraint that seems unnecessary and makes this complicated, as we can see here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed. Expect a PR to remove this shortly :) (~next week)

mlir/lib/Dialect/Vector/IR/VectorOps.cpp Outdated Show resolved Hide resolved
Copy link
Contributor

@hanhanW hanhanW left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me, just some nits and questions.

mlir/lib/Dialect/Affine/Transforms/SuperVectorize.cpp Outdated Show resolved Hide resolved
mlir/lib/IR/AffineMap.cpp Outdated Show resolved Hide resolved
mlir/lib/IR/AffineMap.cpp Outdated Show resolved Hide resolved
out-of-bounds. A `vector.transfer_read` can be lowered to a simple load if
all dimensions are specified to be within bounds and no `mask` was
specified. Note that non-vector dimensions *must* always be in-bounds.
Non-vector and broadcast dimensions *must* always be in-bounds. The
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dumb question: what does Non-vector mean here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also don't know :) That note was introduced by @matthias-springer in https://reviews.llvm.org/D155719. Matthias, do you remember what cases that was referring to?

Makes the `in_bounds` attribute for vector.transfer_read and
vector.transfer_write Ops mandatory. In addition, makes the Asm printer
always print this attribute - tests are updated accordingly.

1. Updates in tests - default `in_bounds` value

Originally, most tests would skip the `in_bounds` attribute - this was
equivalent to setting all values to `false` [1]. With this change, this
has to be done explicitly when writing a test. Note, especially when
reviewing this change, that the vast majority of newly inserted
`in_bounds` attributes are set to `false` to preserve the original
semantics of the tests.

There is only one exception - for broadcast dimensions the newly
inserted `in_bounds` attribute is set to `true`. As per [2]:
```
  vector.transfer_read op requires broadcast dimensions to be in-bounds
```

This matches the original semantics:
  * the `in_bounds` attribute in the context of broadcast dimensions
    would only be checked when present,
  * the verifier wasn't aware of the default value set in [1],

This means that effectively, the attribute was set to `false` even for
broadcast dims, but the verifier wasn't aware of that. This change makes
that behaviour more explicit by setting the attribute to `true` for
broadcast dims. In all other cases, the attribute is set to `false` - if
that's not the case, consider that as a typo.

2. Updates in tests - 0-D vectors

Reading and writing to/from 0D vectors also requires the `in_bounds`
attribute. In this case, the attribute has to be empty:

```mlir
vector.transfer_write %5, %m1[] {in_bounds=[]} : vector<f32>, memref<f32>
```

3. Updates in tests - CHECK lines

With this PR, the `in_bounds` attribute is always print. This required
updating the `CHECK` lines that previously assumed that the attribute
would be skipped. To keep this type of changes simple, I've only added
`{{.*}}` to make sure that tests pass.

4. Changes in "Vectorization.cpp"

The following patterns are updated to explicitly set the `in_bounds`
attribute to `false`:
  * `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`

5. Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The updates in `vectorizeAffineLoad` (SuperVectorize.cpp) and
`vectorizeAsLinalgGeneric` (Vectorization.cpp) are introduced to make
sure that xfer Ops created by these vectorisers set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would fail.

Note that there is no mechanism to verify whether the corresponding
memory access are indeed in bounds. Previously, when `in_bounds` was
optional, the verification would skip checking the attribute if it
wasn't present. However, it would default to `false` in other places.
Put differently, this change does not change the existing behaviour, it
merely makes it more explicit.

[1] https://github.com/llvm/llvm-project/blob/4145ad2bac4bb99d5034d60c74bb2789f6c6e802/mlir/include/mlir/Interfaces/VectorInterfaces.td#L243-L246
[2] https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop
At the moment, the in_bounds attribute has two confusing/contradicting
properties:
  1. It is both optional _and_ has an effective default-value.
  2. The default value is "out-of-bounds" for non-broadcast dims, and
     "in-bounds" for broadcast dims.

(see the `isDimInBounds` vector interface method for an example of this
"default" behaviour [1]).

This PR aims to clarify the logic surrounding the `in_bounds` attribute
by:
  * making the attribute it mandatory (i.e. it is always present),
  * always setting the default value to "out of bounds" (that's
    consistent with the current behaviour for the most common cases).

1. Broadcast dimensions in tests

As per [2], the broadcast dimensions requires the corresponding
`in_bounds` attribute to be `true`:
```
  vector.transfer_read op requires broadcast dimensions to be in-bounds
```

The changes in this PR mean that we can no longer rely on the
default value in cases like the following (dim 0 is a broadcast dim):
```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

Instead, the broadcast dimension has to explicitly be marked as "in
bounds:

```mlir
  %read = vector.transfer_read %A[%base1, %base2], %f, %mask
      {in_bounds = [true, false], permutation_map = affine_map<(d0, d1) -> (0, d1)>} :
    memref<?x?xf32>, vector<4x9xf32>
```

All tests with broadcast dims are updated accordingly.

2. Changes in "SuperVectorize.cpp" and "Vectorization.cpp"

The following patterns in "Vectorization.cpp" are updated to explicitly
set the `in_bounds` attribute to `false`:
  * `LinalgCopyVTRForwardingPattern` and `LinalgCopyVTWForwardingPattern`

Also, `vectorizeAffineLoad` (from "SuperVectorize.cpp") and
`vectorizeAsLinalgGeneric` (from "Vectorization.cpp") are updated to
make sure that xfer Ops created by these hooks set the dimension
corresponding to broadcast dims as "in bounds". Otherwise, the Op
verifier would complain

Note that there is no mechanism to verify whether the corresponding
memory access are indeed in bounds. Still, this is consistent with the
current behaviour where the broadcast dim would be implicitly assumed
to be "in bounds".

[1] https://github.com/llvm/llvm-project/blob/4145ad2bac4bb99d5034d60c74bb2789f6c6e802/mlir/include/mlir/Interfaces/VectorInterfaces.td#L243-L246
[2] https://mlir.llvm.org/docs/Dialects/Vector/#vectortransfer_read-vectortransferreadop
…te mandatory

Address comment from Diego and Han-Chung
@banach-space banach-space force-pushed the andrzej/make_in_bounds_mandatory branch from 5f82c64 to 1bc7464 Compare July 12, 2024 07:59
@banach-space
Copy link
Contributor Author

Thanks for taking a look @dcaballe and @hanhanW 🙏🏻 The latest fixup address all your comments ( great pointers - thanks!)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants