-
Notifications
You must be signed in to change notification settings - Fork 14.8k
[mlir] [scf] fix crash when conversion from scf to control flow #107221
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This patch fixed a crash when scf.parallel's region donesn't terminate with reduce op. This can happend in dialect conversion.
@llvm/pr-subscribers-mlir Author: donald chen (cxy-1993) ChangesThis patch fixed a crash when scf.parallel's region donesn't terminate with reduce op. This can happend in dialect conversion. Full diff: https://github.com/llvm/llvm-project/pull/107221.diff 1 Files Affected:
diff --git a/mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp b/mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp
index 2372ab5b82a772..45f3bcfa393be8 100644
--- a/mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp
+++ b/mlir/lib/Conversion/SCFToControlFlow/SCFToControlFlow.cpp
@@ -482,7 +482,10 @@ LogicalResult
ParallelLowering::matchAndRewrite(ParallelOp parallelOp,
PatternRewriter &rewriter) const {
Location loc = parallelOp.getLoc();
- auto reductionOp = cast<ReduceOp>(parallelOp.getBody()->getTerminator());
+ auto reductionOp = dyn_cast<ReduceOp>(parallelOp.getBody()->getTerminator());
+ if (!reductionOp) {
+ return failure();
+ }
// For a parallel loop, we essentially need to create an n-dimensional loop
// nest. We do this by translating to scf.for ops and have those lowered in
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
scf.parallel
should always have a terminator. Are you running the pass with invalid IR?
This is what the documentation says: The body region must contain exactly one block that terminates with a scf.reduce operation. If an scf.parallel op has no reductions, the terminator has no operands and no regions.
This is generally true, but core dumps can occur during the conversion process. For example, consider a scenario where we mix affine.parallel -> scf.parallel patterns with descending scf.parallel patterns and use partial conversion for dialect conversion: When descending from affine.parallel to scf.parallel, the deletion of affine.yield has not yet been committed, leading to a situation where the body terminator of scf.parallel is not an scf.reduce. In this case, we avoid such core dumps. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SCFToControlFlow
are RewritePatterns. These should not even be used in a dialect conversion. getBody()->getTerminator()
is also not safe to use in a dialect conversion as you pointed out. So there are multiple issues with this code. But let's merge this, as it at least improves the situation a little bit.
Thanks for the context! However, I don't quite understand the statement that rewrite patterns cannot be used in dialect conversion. Isn't the conversion from scf to controlflow a dialect conversion, and what is the definition of dialect conversion? Are you suggesting that we should be cautious about using partial conversion? |
The patterns in a dialect conversion should be |
Thank you for pointing that out. I hadn't noticed the difference between conversion patterns and rewrite patterns before. But in the above case, we're just chaining the affine->scf and scf->cf rewrite patterns together without mixing them with other conversion patterns. Furthermore, these patterns already use partial conversion (lower-affine, scf-to-cf). Does this mean that using methods like getBody()->getTerminator() is inherently dangerous for all patterns, whether they're conversion patterns or rewrite patterns? |
Using a RewritePattern in a dialect conversion (e.g., partial conversion) or using The |
Thanks for your explaination! |
This patch fixed a crash when scf.parallel's region donesn't terminate with reduce op. This can happend in dialect conversion.