Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[mlir] Verify non-negative offset and size #72059

Merged
merged 13 commits into from
Nov 16, 2023
Merged

[mlir] Verify non-negative offset and size #72059

merged 13 commits into from
Nov 16, 2023

Conversation

rikhuijzer
Copy link
Member

@rikhuijzer rikhuijzer commented Nov 12, 2023

In #71153, the memref.subview canonicalizer crashes due to a negative size being passed as an operand. During SubViewOp::verify this negative size is not yet detectable since it is dynamic and only available after constant folding, which happens during the canonicalization passes. As discussed in https://discourse.llvm.org/t/rfc-more-opfoldresult-and-mixed-indices-in-ops-that-deal-with-shaped-values/72510, the verifier should not be extended as it should "only verify local aspects of an operation".

This patch fixes #71153 by not folding in aforementioned situation.

Also, this patch adds a basic offset and size check in the OffsetSizeAndStrideOpInterface verifier.

Note: only offset and size are checked because stride is allowed to be negative (54d81e4).

This now does catch the negative size inside the interface so that an
`op.emitError` can be thrown. That works, but then continues to
return an empty result?

Instead, the interface can probably be refactored first because it's
very restrictive in its current form.
@llvmbot
Copy link
Collaborator

llvmbot commented Nov 12, 2023

@llvm/pr-subscribers-mlir
@llvm/pr-subscribers-mlir-tensor

@llvm/pr-subscribers-mlir-memref

Author: Rik Huijzer (rikhuijzer)

Changes

In #71153, the memref.subview canonicalizer crashes due to a negative size being passed as an operand. During SubViewOp::verify this negative size is not yet detectable since it is dynamic and only available after constant folding, which happens during the canonicalization passes. As discussed in <https://discourse.llvm.org/t/rfc-more-opfoldresult-and-mixed-indices-in-ops-that-deal-with-shaped-values/72510>, the verifier should not be extended as it should "only verify local aspects of an operation". Furthermore, the discussion talks about possible solutions here, but it seems to me that there is no consensus yet?

This patch proposes to add a basic offset and size check in SubViewOp::verify and add a slightly more clear assertion error after the constant folding. Without this assertion, it would crash with the following message:

&lt;unknown&gt;:0: error: invalid memref size
Assertion failed: (succeeded(ConcreteT::verify(getDefaultDiagnosticEmitFn(ctx), args...))), function get, file StorageUniquerSupport.h, line 181.

Note: this patch only checks offset and size because stride is allowed to be negative (54d81e4).


Full diff: https://github.com/llvm/llvm-project/pull/72059.diff

3 Files Affected:

  • (modified) mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp (+20-1)
  • (modified) mlir/lib/Dialect/Tensor/IR/TensorOps.cpp (+1-1)
  • (modified) mlir/test/Dialect/MemRef/invalid.mlir (+16)
diff --git a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
index 215a8f5e7d18be0..86d6f6bf6ad5388 100644
--- a/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
+++ b/mlir/lib/Dialect/MemRef/IR/MemRefOps.cpp
@@ -2621,6 +2621,15 @@ Type SubViewOp::inferResultType(MemRefType sourceMemRefType,
   dispatchIndexOpFoldResults(offsets, dynamicOffsets, staticOffsets);
   dispatchIndexOpFoldResults(sizes, dynamicSizes, staticSizes);
   dispatchIndexOpFoldResults(strides, dynamicStrides, staticStrides);
+
+  for (int64_t offset : staticOffsets) {
+    if (!ShapedType::isDynamic(offset))
+      assert(offset >= 0 && "expected subview offsets to be non-negative");
+  }
+  for (int64_t size : staticSizes) {
+    if (!ShapedType::isDynamic(size))
+      assert(size >= 0 && "expected subview sizes to be non-negative");
+  }
   return SubViewOp::inferResultType(sourceMemRefType, staticOffsets,
                                     staticSizes, staticStrides);
 }
@@ -2842,8 +2851,18 @@ static LogicalResult produceSubViewErrorMsg(SliceVerificationResult result,
   llvm_unreachable("unexpected subview verification result");
 }
 
-/// Verifier for SubViewOp.
 LogicalResult SubViewOp::verify() {
+  for (int64_t offset : getStaticOffsets()) {
+    if (offset < 0 && !ShapedType::isDynamic(offset))
+      return emitError("expected subview offsets to be non-negative, but got ")
+             << offset;
+  }
+  for (int64_t size : getStaticSizes()) {
+    if (size < 0 && !ShapedType::isDynamic(size))
+      return emitError("expected subview sizes to be non-negative, but got ")
+             << size;
+  }
+
   MemRefType baseType = getSourceType();
   MemRefType subViewType = getType();
 
diff --git a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
index 6fc45379111fc34..ab915c0e786aeb5 100644
--- a/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
+++ b/mlir/lib/Dialect/Tensor/IR/TensorOps.cpp
@@ -1242,7 +1242,7 @@ struct StaticTensorGenerate : public OpRewritePattern<GenerateOp> {
 
     for (int64_t newdim : newShape) {
       // This check also occurs in the verifier, but we need it here too
-      // since intermediate passes may have some replaced dynamic dimensions
+      // since intermediate passes may have replaced some dynamic dimensions
       // by constants.
       if (newdim < 0 && !ShapedType::isDynamic(newdim))
         return failure();
diff --git a/mlir/test/Dialect/MemRef/invalid.mlir b/mlir/test/Dialect/MemRef/invalid.mlir
index cb5977e302a993f..38c0bcc3f2491c2 100644
--- a/mlir/test/Dialect/MemRef/invalid.mlir
+++ b/mlir/test/Dialect/MemRef/invalid.mlir
@@ -611,6 +611,22 @@ func.func @invalid_view(%arg0 : index, %arg1 : index, %arg2 : index) {
 
 // -----
 
+func.func @invalid_subview(%input: memref<4x1024xf32>) -> memref<2x256xf32, strided<[1024, 1], offset: 2304>> {
+  // expected-error@+1 {{expected subview offsets to be non-negative, but got -1}}
+  %0 = memref.subview %input[-1, 256] [2, 256] [1, 1] : memref<4x1024xf32> to memref<2x256xf32, strided<[1024, 1], offset: 2304>>
+  return %0 : memref<2x256xf32, strided<[1024, 1], offset: 2304>>
+}
+
+// -----
+
+func.func @invalid_subview(%input: memref<4x1024xf32>) -> memref<2x256xf32, strided<[1024, 1], offset: 2304>> {
+  // expected-error@+1 {{expected subview sizes to be non-negative, but got -1}}
+  %0 = memref.subview %input[2, 256] [-1, 256] [1, 1] : memref<4x1024xf32> to memref<2x256xf32, strided<[1024, 1], offset: 2304>>
+  return %0 : memref<2x256xf32, strided<[1024, 1], offset: 2304>>
+}
+
+// -----
+
 func.func @invalid_subview(%arg0 : index, %arg1 : index, %arg2 : index) {
   %0 = memref.alloc() : memref<8x16x4xf32>
   // expected-error@+1 {{expected mixed offsets rank to match mixed sizes rank (2 vs 3) so the rank of the result type is well-formed}}

@@ -2621,6 +2621,15 @@ Type SubViewOp::inferResultType(MemRefType sourceMemRefType,
dispatchIndexOpFoldResults(offsets, dynamicOffsets, staticOffsets);
dispatchIndexOpFoldResults(sizes, dynamicSizes, staticSizes);
dispatchIndexOpFoldResults(strides, dynamicStrides, staticStrides);

for (int64_t offset : staticOffsets) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: loops can be wrapped in #ifndef NDEBUG

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because loops are relatively expensive you mean? Is this comment outdated when the logic moves to the interface and is specified as an invariant?

LogicalResult SubViewOp::verify() {
for (int64_t offset : getStaticOffsets()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move those checks to mlir::detail::verifyOffsetSizeAndStrideOp and mention in the interface description that offsets and sizes must be non-negative?

Copy link
Member Author

@rikhuijzer rikhuijzer Nov 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes that makes a lot of sense IMO. I find it quite impressive from the MLIR developers that adding the check (e4e40f2) causes none of the tests to fail! Hopefully nobody downstream depends on this logic even though it shouldn't work anyway.

EDIT: On second thought. Probably not. The builders for MemRef and such are quite strict. So maybe the logic wouldn't fail in the verifier, it would still fail at a later point.

@joker-eph
Copy link
Collaborator

This looks like a good fix for the verifier, but we should also fix the canonicalization to not create invalid IR!

@rikhuijzer
Copy link
Member Author

This looks like a good fix for the verifier, but we should also fix the canonicalization to not create invalid IR!

Then I think that I'll implement the suggestions from Matthias in this PR and leave the canonicalization for a future PR. For that future PR, could you tell me what the preferred result would be? What would be the valid IR here that the canonicalization should return?

@rikhuijzer rikhuijzer changed the title [mlir][memref] Detect negative offset or size for subview [mlir][memref] Detect negative offset or size Nov 13, 2023
@rikhuijzer rikhuijzer changed the title [mlir][memref] Detect negative offset or size [mlir] Verify non-negative offset or size Nov 13, 2023
@rikhuijzer rikhuijzer changed the title [mlir] Verify non-negative offset or size [mlir] Verify non-negative offset and size Nov 13, 2023
@joker-eph
Copy link
Collaborator

This looks like a good fix for the verifier, but we should also fix the canonicalization to not create invalid IR!

Then I think that I'll implement the suggestions from Matthias in this PR and leave the canonicalization for a future PR. For that future PR, could you tell me what the preferred result would be? What would be the valid IR here that the canonicalization should return?

It is related to this PR in the sense that we can’t create invalid IR: so if you make it illegal in the verifier then the canonicalizer can’t fold to this form and must check before doing so.
If we want to leave the canonicalizer folding, then the verifier should accept it instead.

Both options are correct, but they should be consistent. Not folding negative dimension seems like a straightforward thing I think?

@rikhuijzer
Copy link
Member Author

Both options are correct, but they should be consistent. Not folding negative dimension seems like a straightforward thing I think?

After finding the right minimal reproducer, it was yes. Thanks.

I've updated the first comment of this PR now that folding on negative dimensions will be avoided.

Copy link
Collaborator

@joker-eph joker-eph left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice, thanks!

@rikhuijzer rikhuijzer merged commit 1949fe9 into llvm:main Nov 16, 2023
3 checks passed
@rikhuijzer rikhuijzer deleted the rh/verify-subview-sizes-offsets branch November 16, 2023 06:42
sr-tream pushed a commit to sr-tream/llvm-project that referenced this pull request Nov 20, 2023
In llvm#71153, the `memref.subview` canonicalizer crashes due to a negative
`size` being passed as an operand. During `SubViewOp::verify` this
negative `size` is not yet detectable since it is dynamic and only
available after constant folding, which happens during the
canonicalization passes. As discussed in
<https://discourse.llvm.org/t/rfc-more-opfoldresult-and-mixed-indices-in-ops-that-deal-with-shaped-values/72510>,
the verifier should not be extended as it should "only verify local
aspects of an operation".

This patch fixes llvm#71153 by not folding in aforementioned situation.

Also, this patch adds a basic offset and size check in the
`OffsetSizeAndStrideOpInterface` verifier.

Note: only `offset` and `size` are checked because `stride` is allowed
to be negative
(llvm@54d81e4).
zahiraam pushed a commit to zahiraam/llvm-project that referenced this pull request Nov 20, 2023
In llvm#71153, the `memref.subview` canonicalizer crashes due to a negative
`size` being passed as an operand. During `SubViewOp::verify` this
negative `size` is not yet detectable since it is dynamic and only
available after constant folding, which happens during the
canonicalization passes. As discussed in
<https://discourse.llvm.org/t/rfc-more-opfoldresult-and-mixed-indices-in-ops-that-deal-with-shaped-values/72510>,
the verifier should not be extended as it should "only verify local
aspects of an operation".

This patch fixes llvm#71153 by not folding in aforementioned situation.

Also, this patch adds a basic offset and size check in the
`OffsetSizeAndStrideOpInterface` verifier.

Note: only `offset` and `size` are checked because `stride` is allowed
to be negative
(llvm@54d81e4).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[mlir] Canonicalizer crashed with assertion failure in mlir::memref::SubViewOp::inferResultType
4 participants