-
Notifications
You must be signed in to change notification settings - Fork 11.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[mlir][llvm] Drop unreachable basic block during import #78467
Conversation
06db79b
to
e2e5051
Compare
@llvm/pr-subscribers-mlir @llvm/pr-subscribers-mlir-llvm Author: Tobias Gysi (gysit) ChangesThis revision updates the LLVM IR import to support unreachable basic blocks. An unreachable block may dominate itself and a value defined inside the block may thus be used before its definition. The import does not support such dependencies. We thus delete the unreachable basic block since MLIR does not have basic block labels that can be reached using an indirect call and unreachable blocks are indeed dead for us. Additionally, add a small poison constant import test. Full diff: https://github.com/llvm/llvm-project/pull/78467.diff 4 Files Affected:
diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp
index e905408a1e0897..928d8077175ccf 100644
--- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp
@@ -26,6 +26,7 @@
#include "mlir/Interfaces/DataLayoutInterfaces.h"
#include "mlir/Tools/mlir-translate/Translation.h"
+#include "llvm/ADT/DepthFirstIterator.h"
#include "llvm/ADT/PostOrderIterator.h"
#include "llvm/ADT/ScopeExit.h"
#include "llvm/ADT/StringSet.h"
@@ -132,18 +133,17 @@ static LogicalResult convertInstructionImpl(OpBuilder &odsBuilder,
return failure();
}
-/// Get a topologically sorted list of blocks for the given function.
+/// Get a topologically sorted list of blocks for the given basic blocks.
static SetVector<llvm::BasicBlock *>
-getTopologicallySortedBlocks(llvm::Function *func) {
+getTopologicallySortedBlocks(ArrayRef<llvm::BasicBlock *> basicBlocks) {
SetVector<llvm::BasicBlock *> blocks;
- for (llvm::BasicBlock &bb : *func) {
- if (!blocks.contains(&bb)) {
- llvm::ReversePostOrderTraversal<llvm::BasicBlock *> traversal(&bb);
+ for (llvm::BasicBlock *basicBlock : basicBlocks) {
+ if (!blocks.contains(basicBlock)) {
+ llvm::ReversePostOrderTraversal<llvm::BasicBlock *> traversal(basicBlock);
blocks.insert(traversal.begin(), traversal.end());
}
}
- assert(blocks.size() == func->size() && "some blocks are not sorted");
-
+ assert(blocks.size() == basicBlocks.size() && "some blocks are not sorted");
return blocks;
}
@@ -1859,11 +1859,26 @@ LogicalResult ModuleImport::processFunction(llvm::Function *func) {
if (func->isDeclaration())
return success();
- // Eagerly create all blocks.
- for (llvm::BasicBlock &bb : *func) {
- Block *block =
- builder.createBlock(&funcOp.getBody(), funcOp.getBody().end());
- mapBlock(&bb, block);
+ // Collect the set of basic blocks reachable from the function's entry block.
+ // This step is crucial as LLVM IR can contain unreachable blocks that
+ // self-dominate. As a result, an operation might utilize a variable it
+ // defines, which the import does not support. Given that MLIR lacks block
+ // label support, we can safely remove unreachable blocks, as there are no
+ // indirect branch instructions that could potentially target these blocks.
+ llvm::df_iterator_default_set<llvm::BasicBlock *> reachable;
+ for (llvm::BasicBlock *basicBlock : llvm::depth_first_ext(func, reachable))
+ (void)basicBlock;
+
+ // Eagerly create all reachable blocks.
+ SmallVector<llvm::BasicBlock *> reachableBasicBlocks;
+ for (llvm::BasicBlock &basicBlock : *func) {
+ // Skip unreachable blocks.
+ if (!reachable.contains(&basicBlock))
+ continue;
+ Region &body = funcOp.getBody();
+ Block *block = builder.createBlock(&body, body.end());
+ mapBlock(&basicBlock, block);
+ reachableBasicBlocks.push_back(&basicBlock);
}
// Add function arguments to the entry block.
@@ -1876,10 +1891,11 @@ LogicalResult ModuleImport::processFunction(llvm::Function *func) {
// Process the blocks in topological order. The ordered traversal ensures
// operands defined in a dominating block have a valid mapping to an MLIR
// value once a block is translated.
- SetVector<llvm::BasicBlock *> blocks = getTopologicallySortedBlocks(func);
+ SetVector<llvm::BasicBlock *> blocks =
+ getTopologicallySortedBlocks(reachableBasicBlocks);
setConstantInsertionPointToStart(lookupBlock(blocks.front()));
- for (llvm::BasicBlock *bb : blocks)
- if (failed(processBasicBlock(bb, lookupBlock(bb))))
+ for (llvm::BasicBlock *basicBlock : blocks)
+ if (failed(processBasicBlock(basicBlock, lookupBlock(basicBlock))))
return failure();
// Process the debug intrinsics that require a delayed conversion after
diff --git a/mlir/test/Target/LLVMIR/Import/constant.ll b/mlir/test/Target/LLVMIR/Import/constant.ll
index cd2d00ec0aa783..3c46f5b20c31ca 100644
--- a/mlir/test/Target/LLVMIR/Import/constant.ll
+++ b/mlir/test/Target/LLVMIR/Import/constant.ll
@@ -47,6 +47,16 @@ define void @undef_constant(i32 %arg0) {
; // -----
+; CHECK-LABEL: @poison_constant
+define void @poison_constant(double %arg0) {
+ ; CHECK: %[[POISON:.+]] = llvm.mlir.poison : f64
+ ; CHECK: llvm.fadd %[[POISON]], %{{.*}} : f64
+ %1 = fadd double poison, %arg0
+ ret void
+}
+
+; // -----
+
; CHECK-LABEL: @null_constant
define ptr @null_constant() {
; CHECK: %[[NULL:[0-9]+]] = llvm.mlir.zero : !llvm.ptr
diff --git a/mlir/test/Target/LLVMIR/Import/exception.ll b/mlir/test/Target/LLVMIR/Import/exception.ll
index de227645cc15bb..440d89ec147f76 100644
--- a/mlir/test/Target/LLVMIR/Import/exception.ll
+++ b/mlir/test/Target/LLVMIR/Import/exception.ll
@@ -12,34 +12,35 @@ define i32 @invokeLandingpad() personality ptr @__gxx_personality_v0 {
; CHECK: %[[a1:[0-9]+]] = llvm.mlir.addressof @_ZTIii : !llvm.ptr
; CHECK: %[[a3:[0-9]+]] = llvm.alloca %{{[0-9]+}} x i8 {alignment = 1 : i64} : (i32) -> !llvm.ptr
%1 = alloca i8
- ; CHECK: llvm.invoke @foo(%[[a3]]) to ^bb2 unwind ^bb1 : (!llvm.ptr) -> ()
- invoke void @foo(ptr %1) to label %4 unwind label %2
+ ; CHECK: llvm.invoke @foo(%[[a3]]) to ^[[bb1:.*]] unwind ^[[bb4:.*]] : (!llvm.ptr) -> ()
+ invoke void @foo(ptr %1) to label %bb1 unwind label %bb4
-; CHECK: ^bb1:
+; CHECK: ^[[bb1]]:
+bb1:
+ ; CHECK: %{{[0-9]+}} = llvm.invoke @bar(%[[a3]]) to ^[[bb2:.*]] unwind ^[[bb4]] : (!llvm.ptr) -> !llvm.ptr
+ %2 = invoke ptr @bar(ptr %1) to label %bb2 unwind label %bb4
+
+; CHECK: ^[[bb2]]:
+bb2:
+ ; CHECK: llvm.invoke @vararg_foo(%[[a3]], %{{.*}}) to ^[[bb3:.*]] unwind ^[[bb4]] vararg(!llvm.func<void (ptr, ...)>) : (!llvm.ptr, i32) -> ()
+ invoke void (ptr, ...) @vararg_foo(ptr %1, i32 0) to label %bb3 unwind label %bb4
+
+; CHECK: ^[[bb3]]:
+bb3:
+ ; CHECK: llvm.invoke %{{.*}}(%[[a3]], %{{.*}}) to ^[[bb5:.*]] unwind ^[[bb4]] vararg(!llvm.func<void (ptr, ...)>) : !llvm.ptr, (!llvm.ptr, i32) -> ()
+ invoke void (ptr, ...) undef(ptr %1, i32 0) to label %bb5 unwind label %bb4
+
+; CHECK: ^[[bb4]]:
+bb4:
; CHECK: %{{[0-9]+}} = llvm.landingpad (catch %{{[0-9]+}} : !llvm.ptr) (catch %[[a1]] : !llvm.ptr) (filter %{{[0-9]+}} : !llvm.array<1 x i1>) : !llvm.struct<(ptr, i32)>
%3 = landingpad { ptr, i32 } catch ptr @_ZTIi catch ptr @_ZTIii
filter [1 x i1] [i1 1]
resume { ptr, i32 } %3
-; CHECK: ^bb2:
+; CHECK: ^[[bb5]]:
+bb5:
; CHECK: llvm.return %{{[0-9]+}} : i32
ret i32 1
-
-; CHECK: ^bb3:
- ; CHECK: %{{[0-9]+}} = llvm.invoke @bar(%[[a3]]) to ^bb2 unwind ^bb1 : (!llvm.ptr) -> !llvm.ptr
- %6 = invoke ptr @bar(ptr %1) to label %4 unwind label %2
-
-; CHECK: ^bb4:
- ; CHECK: llvm.invoke @vararg_foo(%[[a3]], %{{.*}}) to ^bb2 unwind ^bb1 vararg(!llvm.func<void (ptr, ...)>) : (!llvm.ptr, i32) -> ()
- invoke void (ptr, ...) @vararg_foo(ptr %1, i32 0) to label %4 unwind label %2
-
-; CHECK: ^bb5:
- ; CHECK: llvm.invoke %{{.*}}(%[[a3]], %{{.*}}) to ^bb2 unwind ^bb1 vararg(!llvm.func<void (ptr, ...)>) : !llvm.ptr, (!llvm.ptr, i32) -> ()
- invoke void (ptr, ...) undef(ptr %1, i32 0) to label %4 unwind label %2
-
-; CHECK: ^bb6:
- ; CHECK: llvm.return %{{[0-9]+}} : i32
- ret i32 0
}
declare i32 @foo2()
diff --git a/mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll b/mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll
new file mode 100644
index 00000000000000..15d7f2f3f09c89
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll
@@ -0,0 +1,13 @@
+; RUN: mlir-translate -import-llvm %s | FileCheck %s
+
+; CHECK-LABEL: llvm.func @unreachable_block
+; CHECK-NEXT: llvm.return
+; CHECK-NOT: llvm.fadd
+define void @unreachable_block(float %0) {
+.entry:
+ ret void
+
+unreachable:
+ %1 = fadd float %0, %1
+ br label %unreachable
+}
|
I'm confused because the title says "Import unreachable basic blocks" but the description says "We thus delete the unreachable basic block"? |
I tried to improve title and comment. That was indeed a failed attempt of coming up with a short title. The revision indeed deletes unreachable basic blocks before the import since importing an unreachable block is not possible in the general case. The problem is that instructions can use a value that they define themselves. This is possible since an unreachable block can dominate itself. The fadd in the following code is an example for such a self dependency.
|
I know about self-referencing instructions, but I don't quite get why we can't import this in MLIR? |
Turns out the MLIR also supports this. I was honestly not aware of this, and I fail to see any usecase for this, as the conversion framework doesn't touch unreachable blocks. It seems adding support for that might be possible by relaxing some assumptions, but the canonicalizer will afterwards just remove these unreachable blocks. Thus, I'm in favor of taking the path of least resistance: |
The import in its current form cannot support this since all operand values need to exist before the operation is converted. There is currently no mechanism to create temporary values that then later on are magically replaced. A possibly solution may be to have an LLVM pass before the import that introduces phi instructions for such values. That sounds really complex to get right in the general case though. Is there a use case for such unreachable blocks in MLIR? Canonicalize immediately deletes such a block since there to the best of my knowledge there is no way to reach an unreachable block in MLIR (since there is no indirect branch or similar). I would thus strongly favor the current solution which is much simpler and leads to the same result. |
This revision updates the LLVM IR import to support unreachable basic blocks. An unreachable block may dominate itself and a value defined inside the block may thus be used before its definition. The import does not support such dependencies. We thus delete the unreachable basic block since MLIR does not have basic block labels that can be reached using an indirect call and unreachable blocks are indeed dead for us. Additionally, add a small poison constant import test.
e2e5051
to
9699192
Compare
Sure, but such mechanism could be implemented fairly easily right?
There may not be actual use-cases to import them in practice. It just makes me always a bit nervous when the translation isn't "faithful" ;) |
It is definitely more complex than the current solution. It requires at least one additional mapping from the initially unknown values to all their uses and some dummy poison values that are used initially. Plus I guess we can only check at the end of the translation if all unknown values have been set. If possible I would like to avoid that complexity. |
I prototyped a version that imports the code as is without deleting the unreachable blocks. It turns out to be quite tricky to get right since the change interferes with the error handling of the import. In particular, the As this PR fixes an existing bug, I suggest to land it (it is definitely an improvement over the status quo). Once we have a concrete use case, we can follow up with an version that introduces placeholder variables to support self-referencing instructions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments. I'm fine with fixing this bug and deferring a potential support for unreachable basic blocks to a later revision.
This revision updates the LLVM IR import to support unreachable basic blocks. An unreachable block may dominate itself and a value defined inside the block may thus be used before its definition. The import does not support such dependencies. We thus delete the unreachable basic blocks before the import. This is possible since MLIR does not have basic block labels that can be reached using an indirect call and unreachable blocks can indeed be deleted safely.
Additionally, add a small poison constant import test.