[mlir][llvm] Drop unreachable basic block during import #78467

gysit · 2024-01-17T16:37:03Z

This revision updates the LLVM IR import to support unreachable basic blocks. An unreachable block may dominate itself and a value defined inside the block may thus be used before its definition. The import does not support such dependencies. We thus delete the unreachable basic blocks before the import. This is possible since MLIR does not have basic block labels that can be reached using an indirect call and unreachable blocks can indeed be deleted safely.

Additionally, add a small poison constant import test.

llvmbot · 2024-01-17T18:36:48Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-llvm

Author: Tobias Gysi (gysit)

Changes

This revision updates the LLVM IR import to support unreachable basic blocks. An unreachable block may dominate itself and a value defined inside the block may thus be used before its definition. The import does not support such dependencies. We thus delete the unreachable basic block since MLIR does not have basic block labels that can be reached using an indirect call and unreachable blocks are indeed dead for us.

Additionally, add a small poison constant import test.

Full diff: https://github.com/llvm/llvm-project/pull/78467.diff

4 Files Affected:

(modified) mlir/lib/Target/LLVMIR/ModuleImport.cpp (+31-15)
(modified) mlir/test/Target/LLVMIR/Import/constant.ll (+10)
(modified) mlir/test/Target/LLVMIR/Import/exception.ll (+21-20)
(added) mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll (+13)

diff --git a/mlir/lib/Target/LLVMIR/ModuleImport.cpp b/mlir/lib/Target/LLVMIR/ModuleImport.cpp
index e905408a1e0897..928d8077175ccf 100644
--- a/mlir/lib/Target/LLVMIR/ModuleImport.cpp
+++ b/mlir/lib/Target/LLVMIR/ModuleImport.cpp
@@ -26,6 +26,7 @@
 #include "mlir/Interfaces/DataLayoutInterfaces.h"
 #include "mlir/Tools/mlir-translate/Translation.h"
 
+#include "llvm/ADT/DepthFirstIterator.h"
 #include "llvm/ADT/PostOrderIterator.h"
 #include "llvm/ADT/ScopeExit.h"
 #include "llvm/ADT/StringSet.h"
@@ -132,18 +133,17 @@ static LogicalResult convertInstructionImpl(OpBuilder &odsBuilder,
   return failure();
 }
 
-/// Get a topologically sorted list of blocks for the given function.
+/// Get a topologically sorted list of blocks for the given basic blocks.
 static SetVector<llvm::BasicBlock *>
-getTopologicallySortedBlocks(llvm::Function *func) {
+getTopologicallySortedBlocks(ArrayRef<llvm::BasicBlock *> basicBlocks) {
   SetVector<llvm::BasicBlock *> blocks;
-  for (llvm::BasicBlock &bb : *func) {
-    if (!blocks.contains(&bb)) {
-      llvm::ReversePostOrderTraversal<llvm::BasicBlock *> traversal(&bb);
+  for (llvm::BasicBlock *basicBlock : basicBlocks) {
+    if (!blocks.contains(basicBlock)) {
+      llvm::ReversePostOrderTraversal<llvm::BasicBlock *> traversal(basicBlock);
       blocks.insert(traversal.begin(), traversal.end());
     }
   }
-  assert(blocks.size() == func->size() && "some blocks are not sorted");
-
+  assert(blocks.size() == basicBlocks.size() && "some blocks are not sorted");
   return blocks;
 }
 
@@ -1859,11 +1859,26 @@ LogicalResult ModuleImport::processFunction(llvm::Function *func) {
   if (func->isDeclaration())
     return success();
 
-  // Eagerly create all blocks.
-  for (llvm::BasicBlock &bb : *func) {
-    Block *block =
-        builder.createBlock(&funcOp.getBody(), funcOp.getBody().end());
-    mapBlock(&bb, block);
+  // Collect the set of basic blocks reachable from the function's entry block.
+  // This step is crucial as LLVM IR can contain unreachable blocks that
+  // self-dominate. As a result, an operation might utilize a variable it
+  // defines, which the import does not support. Given that MLIR lacks block
+  // label support, we can safely remove unreachable blocks, as there are no
+  // indirect branch instructions that could potentially target these blocks.
+  llvm::df_iterator_default_set<llvm::BasicBlock *> reachable;
+  for (llvm::BasicBlock *basicBlock : llvm::depth_first_ext(func, reachable))
+    (void)basicBlock;
+
+  // Eagerly create all reachable blocks.
+  SmallVector<llvm::BasicBlock *> reachableBasicBlocks;
+  for (llvm::BasicBlock &basicBlock : *func) {
+    // Skip unreachable blocks.
+    if (!reachable.contains(&basicBlock))
+      continue;
+    Region &body = funcOp.getBody();
+    Block *block = builder.createBlock(&body, body.end());
+    mapBlock(&basicBlock, block);
+    reachableBasicBlocks.push_back(&basicBlock);
   }
 
   // Add function arguments to the entry block.
@@ -1876,10 +1891,11 @@ LogicalResult ModuleImport::processFunction(llvm::Function *func) {
   // Process the blocks in topological order. The ordered traversal ensures
   // operands defined in a dominating block have a valid mapping to an MLIR
   // value once a block is translated.
-  SetVector<llvm::BasicBlock *> blocks = getTopologicallySortedBlocks(func);
+  SetVector<llvm::BasicBlock *> blocks =
+      getTopologicallySortedBlocks(reachableBasicBlocks);
   setConstantInsertionPointToStart(lookupBlock(blocks.front()));
-  for (llvm::BasicBlock *bb : blocks)
-    if (failed(processBasicBlock(bb, lookupBlock(bb))))
+  for (llvm::BasicBlock *basicBlock : blocks)
+    if (failed(processBasicBlock(basicBlock, lookupBlock(basicBlock))))
       return failure();
 
   // Process the debug intrinsics that require a delayed conversion after
diff --git a/mlir/test/Target/LLVMIR/Import/constant.ll b/mlir/test/Target/LLVMIR/Import/constant.ll
index cd2d00ec0aa783..3c46f5b20c31ca 100644
--- a/mlir/test/Target/LLVMIR/Import/constant.ll
+++ b/mlir/test/Target/LLVMIR/Import/constant.ll
@@ -47,6 +47,16 @@ define void @undef_constant(i32 %arg0) {
 
 ; // -----
 
+; CHECK-LABEL: @poison_constant
+define void @poison_constant(double %arg0) {
+  ; CHECK:  %[[POISON:.+]] = llvm.mlir.poison : f64
+  ; CHECK:  llvm.fadd %[[POISON]], %{{.*}} : f64
+  %1 = fadd double poison, %arg0
+  ret void
+}
+
+; // -----
+
 ; CHECK-LABEL: @null_constant
 define ptr @null_constant() {
   ; CHECK:  %[[NULL:[0-9]+]] = llvm.mlir.zero : !llvm.ptr
diff --git a/mlir/test/Target/LLVMIR/Import/exception.ll b/mlir/test/Target/LLVMIR/Import/exception.ll
index de227645cc15bb..440d89ec147f76 100644
--- a/mlir/test/Target/LLVMIR/Import/exception.ll
+++ b/mlir/test/Target/LLVMIR/Import/exception.ll
@@ -12,34 +12,35 @@ define i32 @invokeLandingpad() personality ptr @__gxx_personality_v0 {
   ; CHECK: %[[a1:[0-9]+]] = llvm.mlir.addressof @_ZTIii : !llvm.ptr
   ; CHECK: %[[a3:[0-9]+]] = llvm.alloca %{{[0-9]+}} x i8 {alignment = 1 : i64} : (i32) -> !llvm.ptr
   %1 = alloca i8
-  ; CHECK: llvm.invoke @foo(%[[a3]]) to ^bb2 unwind ^bb1 : (!llvm.ptr) -> ()
-  invoke void @foo(ptr %1) to label %4 unwind label %2
+  ; CHECK: llvm.invoke @foo(%[[a3]]) to ^[[bb1:.*]] unwind ^[[bb4:.*]] : (!llvm.ptr) -> ()
+  invoke void @foo(ptr %1) to label %bb1 unwind label %bb4
 
-; CHECK: ^bb1:
+; CHECK: ^[[bb1]]:
+bb1:
+  ; CHECK: %{{[0-9]+}} = llvm.invoke @bar(%[[a3]]) to ^[[bb2:.*]] unwind ^[[bb4]] : (!llvm.ptr) -> !llvm.ptr
+  %2 = invoke ptr @bar(ptr %1) to label %bb2 unwind label %bb4
+
+; CHECK: ^[[bb2]]:
+bb2:
+  ; CHECK: llvm.invoke @vararg_foo(%[[a3]], %{{.*}}) to ^[[bb3:.*]] unwind ^[[bb4]] vararg(!llvm.func<void (ptr, ...)>) : (!llvm.ptr, i32) -> ()
+  invoke void (ptr, ...) @vararg_foo(ptr %1, i32 0) to label %bb3 unwind label %bb4
+
+; CHECK: ^[[bb3]]:
+bb3:
+  ; CHECK: llvm.invoke %{{.*}}(%[[a3]], %{{.*}}) to ^[[bb5:.*]] unwind ^[[bb4]] vararg(!llvm.func<void (ptr, ...)>) : !llvm.ptr, (!llvm.ptr, i32) -> ()
+  invoke void (ptr, ...) undef(ptr %1, i32 0) to label %bb5 unwind label %bb4
+
+; CHECK: ^[[bb4]]:
+bb4:
   ; CHECK: %{{[0-9]+}} = llvm.landingpad (catch %{{[0-9]+}} : !llvm.ptr) (catch %[[a1]] : !llvm.ptr) (filter %{{[0-9]+}} : !llvm.array<1 x i1>) : !llvm.struct<(ptr, i32)>
   %3 = landingpad { ptr, i32 } catch ptr @_ZTIi catch ptr @_ZTIii
           filter [1 x i1] [i1 1]
   resume { ptr, i32 } %3
 
-; CHECK: ^bb2:
+; CHECK: ^[[bb5]]:
+bb5:
   ; CHECK: llvm.return %{{[0-9]+}} : i32
   ret i32 1
-
-; CHECK: ^bb3:
-  ; CHECK: %{{[0-9]+}} = llvm.invoke @bar(%[[a3]]) to ^bb2 unwind ^bb1 : (!llvm.ptr) -> !llvm.ptr
-  %6 = invoke ptr @bar(ptr %1) to label %4 unwind label %2
-
-; CHECK: ^bb4:
-  ; CHECK: llvm.invoke @vararg_foo(%[[a3]], %{{.*}}) to ^bb2 unwind ^bb1 vararg(!llvm.func<void (ptr, ...)>) : (!llvm.ptr, i32) -> ()
-  invoke void (ptr, ...) @vararg_foo(ptr %1, i32 0) to label %4 unwind label %2
-
-; CHECK: ^bb5:
-  ; CHECK: llvm.invoke %{{.*}}(%[[a3]], %{{.*}}) to ^bb2 unwind ^bb1 vararg(!llvm.func<void (ptr, ...)>) : !llvm.ptr, (!llvm.ptr, i32) -> ()
-  invoke void (ptr, ...) undef(ptr %1, i32 0) to label %4 unwind label %2
-
-; CHECK: ^bb6:
-  ; CHECK: llvm.return %{{[0-9]+}} : i32
-  ret i32 0
 }
 
 declare i32 @foo2()
diff --git a/mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll b/mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll
new file mode 100644
index 00000000000000..15d7f2f3f09c89
--- /dev/null
+++ b/mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll
@@ -0,0 +1,13 @@
+; RUN: mlir-translate -import-llvm %s | FileCheck %s
+
+; CHECK-LABEL: llvm.func @unreachable_block
+; CHECK-NEXT: llvm.return
+; CHECK-NOT: llvm.fadd
+define void @unreachable_block(float %0) {
+.entry:
+  ret void
+
+unreachable:
+  %1 = fadd float %0, %1
+  br label %unreachable
+}

joker-eph · 2024-01-17T18:46:07Z

I'm confused because the title says "Import unreachable basic blocks" but the description says "We thus delete the unreachable basic block"?

gysit · 2024-01-17T19:22:11Z

I'm confused because the title says "Import unreachable basic blocks" but the description says "We thus delete the unreachable basic block"?

I tried to improve title and comment. That was indeed a failed attempt of coming up with a short title.

The revision indeed deletes unreachable basic blocks before the import since importing an unreachable block is not possible in the general case. The problem is that instructions can use a value that they define themselves. This is possible since an unreachable block can dominate itself.

The fadd in the following code is an example for such a self dependency.

define void @unreachable_block(float %0) {
.entry:
  ret void

unreachable:
  %1 = fadd float %0, %1
  br label %unreachable
}

joker-eph · 2024-01-17T22:22:59Z

The revision indeed deletes unreachable basic blocks before the import since importing an unreachable block is not possible in the general case. The problem is that instructions can use a value that they define themselves. This is possible since an unreachable block can dominate itself.

I know about self-referencing instructions, but I don't quite get why we can't import this in MLIR?

Dinistro · 2024-01-18T06:53:36Z

Turns out the MLIR also supports this. I was honestly not aware of this, and I fail to see any usecase for this, as the conversion framework doesn't touch unreachable blocks. It seems adding support for that might be possible by relaxing some assumptions, but the canonicalizer will afterwards just remove these unreachable blocks.

Thus, I'm in favor of taking the path of least resistance:
If this can be done easily, lets import these self-referencing operations, otherwise drop these constructs directly in the import, as done in this PR.

gysit · 2024-01-18T07:33:20Z

I know about self-referencing instructions, but I don't quite get why we can't import this in MLIR?

The import in its current form cannot support this since all operand values need to exist before the operation is converted. There is currently no mechanism to create temporary values that then later on are magically replaced. A possibly solution may be to have an LLVM pass before the import that introduces phi instructions for such values. That sounds really complex to get right in the general case though.

Is there a use case for such unreachable blocks in MLIR? Canonicalize immediately deletes such a block since there to the best of my knowledge there is no way to reach an unreachable block in MLIR (since there is no indirect branch or similar). I would thus strongly favor the current solution which is much simpler and leads to the same result.

mlir/test/Target/LLVMIR/Import/constant.ll

mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll

This revision updates the LLVM IR import to support unreachable basic blocks. An unreachable block may dominate itself and a value defined inside the block may thus be used before its definition. The import does not support such dependencies. We thus delete the unreachable basic block since MLIR does not have basic block labels that can be reached using an indirect call and unreachable blocks are indeed dead for us. Additionally, add a small poison constant import test.

joker-eph · 2024-01-18T20:27:19Z

The import in its current form cannot support this since all operand values need to exist before the operation is converted. There is currently no mechanism to create temporary values that then later on are magically replaced.

Sure, but such mechanism could be implemented fairly easily right?

Is there a use case for such unreachable blocks in MLIR?

There may not be actual use-cases to import them in practice. It just makes me always a bit nervous when the translation isn't "faithful" ;)

gysit · 2024-01-18T20:53:08Z

Sure, but such mechanism could be implemented fairly easily right?

It is definitely more complex than the current solution. It requires at least one additional mapping from the initially unknown values to all their uses and some dummy poison values that are used initially. Plus I guess we can only check at the end of the translation if all unknown values have been set. If possible I would like to avoid that complexity.

gysit · 2024-01-19T09:45:09Z

I prototyped a version that imports the code as is without deleting the unreachable blocks. It turns out to be quite tricky to get right since the change interferes with the error handling of the import. In particular, the convertValue method is used in different contexts and only in some of them it is ok to introduce a placeholder value. After the import, the unreachable blocks either way need to be deleted since the conversion infrastructure does not support them AFAIK.

As this PR fixes an existing bug, I suggest to land it (it is definitely an improvement over the status quo). Once we have a concrete use case, we can follow up with an version that introduces placeholder variables to support self-referencing instructions.

Dinistro

Thanks for addressing the comments. I'm fine with fixing this bug and deferring a potential support for unreachable basic blocks to a later revision.

gysit force-pushed the unreachable-blocks branch 2 times, most recently from 06db79b to e2e5051 Compare January 17, 2024 16:46

gysit requested a review from Dinistro January 17, 2024 16:51

gysit marked this pull request as ready for review January 17, 2024 18:36

llvmbot added mlir:llvm mlir labels Jan 17, 2024

gysit changed the title ~~[mlir][llvm] Import unreachable basic blocks~~ [mlir][llvm] Drop unreachable basic block during import Jan 17, 2024

Dinistro reviewed Jan 18, 2024

View reviewed changes

mlir/test/Target/LLVMIR/Import/constant.ll Show resolved Hide resolved

mlir/test/Target/LLVMIR/Import/unreachable-blocks.ll Outdated Show resolved Hide resolved

gysit force-pushed the unreachable-blocks branch from e2e5051 to 9699192 Compare January 18, 2024 20:14

gysit requested a review from Dinistro January 18, 2024 20:16

Dinistro approved these changes Jan 19, 2024

View reviewed changes

gysit merged commit 9dd0eb9 into llvm:main Jan 19, 2024
3 of 4 checks passed

gysit deleted the unreachable-blocks branch January 19, 2024 10:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[mlir][llvm] Drop unreachable basic block during import #78467

[mlir][llvm] Drop unreachable basic block during import #78467

gysit commented Jan 17, 2024 •

edited

Loading

llvmbot commented Jan 17, 2024 •

edited

Loading

joker-eph commented Jan 17, 2024

gysit commented Jan 17, 2024

joker-eph commented Jan 17, 2024

Dinistro commented Jan 18, 2024

gysit commented Jan 18, 2024

joker-eph commented Jan 18, 2024

gysit commented Jan 18, 2024

gysit commented Jan 19, 2024

Dinistro left a comment

[mlir][llvm] Drop unreachable basic block during import #78467

[mlir][llvm] Drop unreachable basic block during import #78467

Conversation

gysit commented Jan 17, 2024 • edited Loading

llvmbot commented Jan 17, 2024 • edited Loading

joker-eph commented Jan 17, 2024

gysit commented Jan 17, 2024

joker-eph commented Jan 17, 2024

Dinistro commented Jan 18, 2024

gysit commented Jan 18, 2024

joker-eph commented Jan 18, 2024

gysit commented Jan 18, 2024

gysit commented Jan 19, 2024

Dinistro left a comment

Choose a reason for hiding this comment

gysit commented Jan 17, 2024 •

edited

Loading

llvmbot commented Jan 17, 2024 •

edited

Loading