[mlir] Verify non-negative `offset` and `size` #72059

rikhuijzer · 2023-11-12T17:56:16Z

In #71153, the memref.subview canonicalizer crashes due to a negative size being passed as an operand. During SubViewOp::verify this negative size is not yet detectable since it is dynamic and only available after constant folding, which happens during the canonicalization passes. As discussed in https://discourse.llvm.org/t/rfc-more-opfoldresult-and-mixed-indices-in-ops-that-deal-with-shaped-values/72510, the verifier should not be extended as it should "only verify local aspects of an operation".

This patch fixes #71153 by not folding in the aforementioned situation.

Also, this patch adds a basic offset and size check in the OffsetSizeAndStrideOpInterface verifier.

Note: only offset and size are checked because stride is allowed to be negative (54d81e4).

The interface now catches the negative size, so an `op.emitError` can be emitted. That works, but the code then continues and returns an empty result? It may be better to refactor the interface first, because it is very restrictive in its current form.
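To illustrate the rule described above (static offsets and sizes must be non-negative, dynamic entries are skipped, strides are left alone), here is a minimal standalone sketch. It does not use MLIR and is not the code from this patch: `kDynamicSentinel`, `checkNonNegative`, and `verifyOffsetsAndSizes` are hypothetical names, and the sentinel is an assumed stand-in for `ShapedType::kDynamic`.

```cpp
#include <cstdint>
#include <limits>
#include <optional>
#include <string>
#include <vector>

// Assumption: stand-in for ShapedType::kDynamic, the sentinel that marks a
// value known only at runtime.
constexpr int64_t kDynamicSentinel = std::numeric_limits<int64_t>::min();

// Hypothetical helper (not an MLIR API): returns an error message for the
// first static entry that is negative, or std::nullopt if every entry passes.
std::optional<std::string> checkNonNegative(const std::vector<int64_t> &values,
                                            const std::string &name) {
  for (int64_t v : values) {
    if (v == kDynamicSentinel)
      continue; // Dynamic entry: nothing to check statically.
    if (v < 0)
      return "expected " + name + " to be non-negative, but got " +
             std::to_string(v);
  }
  return std::nullopt;
}

// Hypothetical verifier: offsets and sizes are checked; strides are
// deliberately ignored because negative strides are allowed.
std::optional<std::string>
verifyOffsetsAndSizes(const std::vector<int64_t> &offsets,
                      const std::vector<int64_t> &sizes,
                      const std::vector<int64_t> & /*strides*/) {
  if (auto err = checkNonNegative(offsets, "offsets"))
    return err;
  return checkNonNegative(sizes, "sizes");
}
```

Note that a size of zero passes the check, and a negative stride never produces an error in this sketch.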

llvmbot · 2023-11-12T17:56:43Z

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-tensor

@llvm/pr-subscribers-mlir-memref

Author: Rik Huijzer (rikhuijzer)

Changes

In #71153, the memref.subview canonicalizer crashes due to a negative size being passed as an operand. During SubViewOp::verify this negative size is not yet detectable since it is dynamic and only available after constant folding, which happens during the canonicalization passes. As discussed in <https://discourse.llvm.org/t/rfc-more-opfoldresult-and-mixed-indices-in-ops-that-deal-with-shaped-values/72510>, the verifier should not be extended as it should "only verify local aspects of an operation". Furthermore, the discussion talks about possible solutions here, but it seems to me that there is no consensus yet?

This patch proposes to add a basic offset and size check in SubViewOp::verify and add a slightly more clear assertion error after the constant folding. Without this assertion, it would crash with the following message:

<unknown>:0: error: invalid memref size
Assertion failed: (succeeded(ConcreteT::verify(getDefaultDiagnosticEmitFn(ctx), args...))), function get, file StorageUniquerSupport.h, line 181.

Note: this patch only checks offset and size because stride is allowed to be negative (54d81e4).

Full diff: https://github.com/llvm/llvm-project/pull/72059.diff

3 Files Affected:

(modified) mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp (+20-1)
(modified) mlir/lib/Dialect/Tensor/IR/TensorOps.cpp (+1-1)
(modified) mlir/test/Dialect/MemRef/invalid.mlir (+16)

diff --git a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
index 215a8f5e7d18be0..86d6f6bf6ad5388 100644
--- a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
+++ b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
@@ -2621,6 +2621,15 @@ Type SubViewOp::inferResultType(MemRefType sourceMemRefType,
   dispatchIndexOpFoldResults(offsets, dynamicOffsets, staticOffsets);
   dispatchIndexOpFoldResults(sizes, dynamicSizes, staticSizes);
   dispatchIndexOpFoldResults(strides, dynamicStrides, staticStrides);
+
+  for (int64_t offset : staticOffsets) {
+    if (!ShapedType::isDynamic(offset))
+      assert(offset >= 0 && "expected subview offsets to be non-negative");
+  }
+  for (int64_t size : staticSizes) {
+    if (!ShapedType::isDynamic(size))
+      assert(size >= 0 && "expected subview sizes to be non-negative");
+  }
   return SubViewOp::inferResultType(sourceMemRefType, staticOffsets,
                                     staticSizes, staticStrides);
 }
@@ -2842,8 +2851,18 @@ static LogicalResult produceSubViewErrorMsg(SliceVerificationResult result,
   llvm_unreachable("unexpected subview verification result");
 }
 
-/// Verifier for SubViewOp.
 LogicalResult SubViewOp::verify() {
+  for (int64_t offset : getStaticOffsets()) {
+    if (offset < 0 && !ShapedType::isDynamic(offset))
+      return emitError("expected subview offsets to be non-negative, but got ")
+             << offset;
+  }
+  for (int64_t size : getStaticSizes()) {
+    if (size < 0 && !ShapedType::isDynamic(size))
+      return emitError("expected subview sizes to be non-negative, but got ")
+             << size;
+  }
+
   MemRefType baseType = getSourceType();
   MemRefType subViewType = getType();
 
diff --git a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
index 6fc45379111fc34..ab915c0e786aeb5 100644
--- a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+++ b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
@@ -1242,7 +1242,7 @@ struct StaticTensorGenerate : public OpRewritePattern<GenerateOp> {
 
     for (int64_t newdim : newShape) {
       // This check also occurs in the verifier, but we need it here too
-      // since intermediate passes may have some replaced dynamic dimensions
+      // since intermediate passes may have replaced some dynamic dimensions
       // by constants.
       if (newdim < 0 && !ShapedType::isDynamic(newdim))
         return failure();
diff --git a/mlir/test/Dialect/MemRef/invalid.mlir b/mlir/test/Dialect/MemRef/invalid.mlir
index cb5977e302a993f..38c0bcc3f2491c2 100644
--- a/mlir/test/Dialect/MemRef/invalid.mlir
+++ b/mlir/test/Dialect/MemRef/invalid.mlir
@@ -611,6 +611,22 @@ func.func @invalid_view(%arg0 : index, %arg1 : index, %arg2 : index) {
 
 // -----
 
+func.func @invalid_subview(%input: memref<4x1024xf32>) -> memref<2x256xf32, strided<[1024, 1], offset: 2304>> {
+  // expected-error@+1 {{expected subview offsets to be non-negative, but got -1}}
+  %0 = memref.subview %input[-1, 256] [2, 256] [1, 1] : memref<4x1024xf32> to memref<2x256xf32, strided<[1024, 1], offset: 2304>>
+  return %0 : memref<2x256xf32, strided<[1024, 1], offset: 2304>>
+}
+
+// -----
+
+func.func @invalid_subview(%input: memref<4x1024xf32>) -> memref<2x256xf32, strided<[1024, 1], offset: 2304>> {
+  // expected-error@+1 {{expected subview sizes to be non-negative, but got -1}}
+  %0 = memref.subview %input[2, 256] [-1, 256] [1, 1] : memref<4x1024xf32> to memref<2x256xf32, strided<[1024, 1], offset: 2304>>
+  return %0 : memref<2x256xf32, strided<[1024, 1], offset: 2304>>
+}
+
+// -----
+
 func.func @invalid_subview(%arg0 : index, %arg1 : index, %arg2 : index) {
   %0 = memref.alloc() : memref<8x16x4xf32>
   // expected-error@+1 {{expected mixed offsets rank to match mixed sizes rank (2 vs 3) so the rank of the result type is well-formed}}

matthias-springer · 2023-11-13T02:40:41Z

mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp

@@ -2621,6 +2621,15 @@ Type SubViewOp::inferResultType(MemRefType sourceMemRefType,
  dispatchIndexOpFoldResults(offsets, dynamicOffsets, staticOffsets);
  dispatchIndexOpFoldResults(sizes, dynamicSizes, staticSizes);
  dispatchIndexOpFoldResults(strides, dynamicStrides, staticStrides);
+
+  for (int64_t offset : staticOffsets) {


nit: loops can be wrapped in #ifndef NDEBUG

rikhuijzer · 2023-11-13

Because loops are relatively expensive, you mean? Is this comment outdated once the logic moves to the interface and is specified as an invariant?
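For reference, the `#ifndef NDEBUG` suggestion could look roughly like the sketch below. It is a standalone illustration rather than the code in this PR: `kDynamicSentinel` is an assumed stand-in for `ShapedType::kDynamic`, and `assertNonNegativeStatics` is a hypothetical helper name. The point is that `assert` expands to nothing in release builds, so guarding the loops makes it explicit that they exist only for debug checking.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <vector>

// Assumption: stand-in for ShapedType::kDynamic.
constexpr int64_t kDynamicSentinel = std::numeric_limits<int64_t>::min();

// Hypothetical debug-only invariant check. With NDEBUG defined, the loops
// are compiled out entirely instead of being left as empty iterations.
void assertNonNegativeStatics(const std::vector<int64_t> &staticOffsets,
                              const std::vector<int64_t> &staticSizes) {
#ifndef NDEBUG
  for (int64_t offset : staticOffsets)
    assert((offset == kDynamicSentinel || offset >= 0) &&
           "expected subview offsets to be non-negative");
  for (int64_t size : staticSizes)
    assert((size == kDynamicSentinel || size >= 0) &&
           "expected subview sizes to be non-negative");
#else
  // Silence unused-parameter warnings in release builds.
  (void)staticOffsets;
  (void)staticSizes;
#endif
}
```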

matthias-springer · 2023-11-13T02:48:38Z

mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp

 LogicalResult SubViewOp::verify() {
+  for (int64_t offset : getStaticOffsets()) {


Can we move those checks to mlir::detail::verifyOffsetSizeAndStrideOp and mention in the interface description that offsets and sizes must be non-negative?

rikhuijzer · 2023-11-13

Yes, that makes a lot of sense IMO. I find it quite impressive of the MLIR developers that adding the check (e4e40f2) causes none of the tests to fail! Hopefully nobody downstream depends on this logic, even though it shouldn't work anyway.

EDIT: On second thought, probably not. The builders for MemRef and such are quite strict. So even if the logic didn't fail in the verifier, it would still fail at a later point.

joker-eph · 2023-11-13T04:19:28Z

This looks like a good fix for the verifier, but we should also fix the canonicalization to not create invalid IR!

rikhuijzer · 2023-11-13T08:15:10Z

> This looks like a good fix for the verifier, but we should also fix the canonicalization to not create invalid IR!

Then I think that I'll implement the suggestions from Matthias in this PR and leave the canonicalization for a future PR. For that future PR, could you tell me what the preferred result would be? What would be the valid IR here that the canonicalization should return?

joker-eph · 2023-11-13T11:24:34Z

> > This looks like a good fix for the verifier, but we should also fix the canonicalization to not create invalid IR!
>
> Then I think that I'll implement the suggestions from Matthias in this PR and leave the canonicalization for a future PR. For that future PR, could you tell me what the preferred result would be? What would be the valid IR here that the canonicalization should return?

It is related to this PR in the sense that we can’t create invalid IR: so if you make it illegal in the verifier then the canonicalizer can’t fold to this form and must check before doing so.
If we want to leave the canonicalizer folding, then the verifier should accept it instead.

Both options are correct, but they should be consistent. Not folding negative dimensions seems like a straightforward thing, I think?

rikhuijzer · 2023-11-14T19:52:42Z

> Both options are correct, but they should be consistent. Not folding negative dimension seems like a straightforward thing I think?

After finding the right minimal reproducer, it was yes. Thanks.

I've updated the first comment of this PR now that folding on negative dimensions will be avoided.
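The idea of "not folding on negative dimensions" can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the canonicalizer code from this PR: `MaybeConstant`, `kDynamicSentinel`, and `foldNonNegativeConstants` are hypothetical names, and real MLIR operands are modeled as plain optional integers.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <optional>
#include <vector>

// Assumption: stand-in for ShapedType::kDynamic.
constexpr int64_t kDynamicSentinel = std::numeric_limits<int64_t>::min();

// Hypothetical model of a dynamic operand: it holds a value if the operand
// is defined by a constant, and is empty if it is unknown until runtime.
using MaybeConstant = std::optional<int64_t>;

// Folds constant operands into the static list, but leaves an entry dynamic
// when its constant is negative, so the folded op never carries a static
// negative value that the verifier would reject. Returns true if any entry
// was folded. Both vectors are expected to have the same length.
bool foldNonNegativeConstants(const std::vector<MaybeConstant> &operands,
                              std::vector<int64_t> &staticValues) {
  bool changed = false;
  for (std::size_t i = 0; i < staticValues.size() && i < operands.size(); ++i) {
    if (staticValues[i] != kDynamicSentinel)
      continue; // Already static.
    if (!operands[i].has_value())
      continue; // Not a constant: nothing to fold.
    if (*operands[i] < 0)
      continue; // Negative constant: keep it dynamic instead of folding.
    staticValues[i] = *operands[i];
    changed = true;
  }
  return changed;
}
```

With this guard the verifier and the folder stay consistent: anything the folder produces also passes the non-negativity check.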

joker-eph

Nice, thanks!


rikhuijzer added 8 commits November 11, 2023 20:43

[mlir][memref] Verify subview offsets and sizes

8f90ae9

Allow zero sizes

32adf97

Add assertion in outer `inferResultType`

7323aa0

Update comment

1859a9b

Revert some changes

68778fb

Revert some changes

6ef2dda

Remove comment

4036cd1

rikhuijzer requested review from matthias-springer and dcaballe November 12, 2023 17:56

llvmbot added mlir mlir:tensor mlir:memref labels Nov 12, 2023

matthias-springer reviewed Nov 13, 2023


rikhuijzer added 2 commits November 13, 2023 09:40

Merge branch 'llvm:main' into rh/verify-subview-sizes-offsets

e3a222b

Move logic to verifyOffsetSizeAndStrideOp

e4e40f2

rikhuijzer requested a review from matthias-springer November 13, 2023 11:13

rikhuijzer changed the title ~~[mlir][memref] Detect negative offset or size for subview~~ [mlir][memref] Detect negative offset or size Nov 13, 2023

rikhuijzer changed the title ~~[mlir][memref] Detect negative offset or size~~ [mlir] Verify non-negative offset or size Nov 13, 2023

rikhuijzer changed the title ~~[mlir] Verify non-negative offset or size~~ [mlir] Verify non-negative offset and size Nov 13, 2023

rikhuijzer added 3 commits November 13, 2023 15:25

Remove outdated assertion

6a2f5b8

Put old comment back to avoid file touch

d59fe3a

Fix crash in canonicalizer

83934f0

rikhuijzer requested a review from joker-eph November 14, 2023 19:52

joker-eph approved these changes Nov 15, 2023


rikhuijzer merged commit 1949fe9 into llvm:main Nov 16, 2023
3 checks passed

rikhuijzer deleted the rh/verify-subview-sizes-offsets branch November 16, 2023 06:42
