Skip to content

[mlir] [scf] fix crash when conversion from scf to control flow #107221

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 6, 2024

Conversation

cxy-1993
Copy link
Contributor

@cxy-1993 cxy-1993 commented Sep 4, 2024

This patch fixed a crash when scf.parallel's region donesn't terminate with reduce op. This can happend in dialect conversion.

This patch fixed a crash when scf.parallel's region donesn't terminate with
reduce op. This can happend in dialect conversion.
@llvmbot
Copy link
Member

llvmbot commented Sep 4, 2024

@llvm/pr-subscribers-mlir

Author: donald chen (cxy-1993)

Changes

This patch fixed a crash when scf.parallel's region donesn't terminate with reduce op. This can happend in dialect conversion.


Full diff: https://github.com/llvm/llvm-project/pull/107221.diff

1 Files Affected:

  • (modified) mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp (+4-1)
diff --git a/mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp b/mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp
index 2372ab5b82a772..45f3bcfa393be8 100644
--- a/mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp
+++ b/mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp
@@ -482,7 +482,10 @@ LogicalResult
 ParallelLowering::matchAndRewrite(ParallelOp parallelOp,
                                   PatternRewriter &rewriter) const {
   Location loc = parallelOp.getLoc();
-  auto reductionOp = cast<ReduceOp>(parallelOp.getBody()->getTerminator());
+  auto reductionOp = dyn_cast<ReduceOp>(parallelOp.getBody()->getTerminator());
+  if (!reductionOp) {
+    return failure();
+  }
 
   // For a parallel loop, we essentially need to create an n-dimensional loop
   // nest. We do this by translating to scf.for ops and have those lowered in

Copy link
Member

@matthias-springer matthias-springer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scf.parallel should always have a terminator. Are you running the pass with invalid IR?

This is what the documentation says: The body region must contain exactly one block that terminates with a scf.reduce operation. If an scf.parallel op has no reductions, the terminator has no operands and no regions.

@cxy-1993
Copy link
Contributor Author

cxy-1993 commented Sep 5, 2024

scf.parallel should always have a terminator. Are you running the pass with invalid IR?

This is what the documentation says: The body region must contain exactly one block that terminates with a scf.reduce operation. If an scf.parallel op has no reductions, the terminator has no operands and no regions.

This is generally true, but core dumps can occur during the conversion process. For example, consider a scenario where we mix affine.parallel -> scf.parallel patterns with descending scf.parallel patterns and use partial conversion for dialect conversion: When descending from affine.parallel to scf.parallel, the deletion of affine.yield has not yet been committed, leading to a situation where the body terminator of scf.parallel is not an scf.reduce. In this case, we avoid such core dumps.

Copy link
Member

@matthias-springer matthias-springer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SCFToControlFlow are RewritePatterns. These should not even be used in a dialect conversion. getBody()->getTerminator() is also not safe to use in a dialect conversion as you pointed out. So there are multiple issues with this code. But let's merge this, as it at least improves the situation a little bit.

@cxy-1993
Copy link
Contributor Author

cxy-1993 commented Sep 5, 2024

SCFToControlFlow are RewritePatterns. These should not even be used in a dialect conversion. getBody()->getTerminator() is also not safe to use in a dialect conversion as you pointed out. So there are multiple issues with this code. But let's merge this, as it at least improves the situation a little bit.

Thanks for the context!

However, I don't quite understand the statement that rewrite patterns cannot be used in dialect conversion. Isn't the conversion from scf to controlflow a dialect conversion, and what is the definition of dialect conversion? Are you suggesting that we should be cautious about using partial conversion?

@matthias-springer
Copy link
Member

matthias-springer commented Sep 5, 2024

The patterns in a dialect conversion should be ConversionPatterns. The way the API is built suggests that conversion patterns and rewrite patterns can be mixed in a greedy pattern rewrite or dialect conversion, but that is not actually the case. E.g., you are supposed to lookup operands in the adaptor during a dialect conversion, but rewrite patterns don't have an adaptor.

@cxy-1993
Copy link
Contributor Author

cxy-1993 commented Sep 5, 2024

Thank you for pointing that out. I hadn't noticed the difference between conversion patterns and rewrite patterns before.

But in the above case, we're just chaining the affine->scf and scf->cf rewrite patterns together without mixing them with other conversion patterns. Furthermore, these patterns already use partial conversion (lower-affine, scf-to-cf). Does this mean that using methods like getBody()->getTerminator() is inherently dangerous for all patterns, whether they're conversion patterns or rewrite patterns?

@matthias-springer
Copy link
Member

matthias-springer commented Sep 5, 2024

getBody()->getTerminator() is fine in a RewritePattern. Generally speaking, looking at the IR is fine in a RewritePattern because all IR changes materialize immediately.

Using a RewritePattern in a dialect conversion (e.g., partial conversion) or using getBody()->getTerminator() in a ConversionPattern is dangerous because what you see may be outdated IR.

The SCFToCF patterns are RewritePatterns but they are used in a dialect conversion. That's the problem that I'm seeing.

@cxy-1993
Copy link
Contributor Author

cxy-1993 commented Sep 6, 2024

getBody()->getTerminator() is fine in a RewritePattern. Generally speaking, looking at the IR is fine in a RewritePattern because all IR changes materialize immediately.

Using a RewritePattern in a dialect conversion (e.g., partial conversion) or using getBody()->getTerminator() in a ConversionPattern is dangerous because what you see may be outdated IR.

The SCFToCF patterns are RewritePatterns but they are used in a dialect conversion. That's the problem that I'm seeing.

Thanks for your explaination!

@cxy-1993 cxy-1993 merged commit c02fd17 into llvm:main Sep 6, 2024
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants