Skip to content

Conversation

spaits
Copy link
Contributor

@spaits spaits commented Sep 17, 2025

Fixes #148052 .

Last PR did not account for the scenario, when more than one instruction used the catchpad label.
In that case I have deleted uses, which were already "choosen to be iterated over" by the early increment iterator. This issue was not visible in normal release build on x86, but luckily later on the address sanitizer build it has found it on the buildbot.

Here is the diff from the last version of this PR: #158435

diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index 91e245e5e8f5..1dd8cb4ee584 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -106,7 +106,8 @@ void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
       // first block, the we would have possible cleanupret and catchret
       // instructions with poison arguments, which wouldn't be valid.
       if (isa<FuncletPadInst>(I)) {
-        for (User *User : make_early_inc_range(I.users())) {
+        SmallPtrSet<BasicBlock *, 4> UniqueEHRetBlocksToDelete;
+        for (User *User : I.users()) {
           Instruction *ReturnInstr = dyn_cast<Instruction>(User);
           // If we have a cleanupret or catchret block, replace it with just an
           // unreachable. The other alternative, that may use a catchpad is a
@@ -114,33 +115,12 @@ void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
           if (isa<CatchReturnInst>(ReturnInstr) ||
               isa<CleanupReturnInst>(ReturnInstr)) {
             BasicBlock *ReturnInstrBB = ReturnInstr->getParent();
-            // This catchret or catchpad basic block is detached now. Let the
-            // successors know it.
-            // This basic block also may have some predecessors too. For
-            // example the following LLVM-IR is valid:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //   [cleanupret_block]
-            //
-            // The IR after the cleanup will look like this:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //     [unreachable]
-            //
-            // So regular_block will lead to an unreachable block, which is also
-            // valid. There is no need to replace regular_block with unreachable
-            // in this context now.
-            // On the other hand, the cleanupret/catchret block's successors
-            // need to know about the deletion of their predecessors.
-            emptyAndDetachBlock(ReturnInstrBB, Updates, KeepOneInputPHIs);
+            UniqueEHRetBlocksToDelete.insert(ReturnInstrBB);
           }
         }
+        for (BasicBlock *EHRetBB :
+             make_early_inc_range(UniqueEHRetBlocksToDelete))
+          emptyAndDetachBlock(EHRetBB, Updates, KeepOneInputPHIs);
       }
     }

…llvm#157363)

Fixes llvm#148052 .

When removing EH Pad blocks, the value defined by them becomes poison. These poison values are then used by `catchret` and `cleanupret`, which is invalid. This commit replaces those unreachable `catchret` and `cleanupret` instructions with `unreachable`.
@spaits spaits marked this pull request as ready for review September 17, 2025 20:34
@spaits spaits requested a review from rnk September 17, 2025 20:35
@llvmbot
Copy link
Member

llvmbot commented Sep 17, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Gábor Spaits (spaits)

Changes

Fixes #148052 .

Last PR did not account for the scenario, when more than one instruction used the catchpad label.
In that case I have deleted uses, which were already "choosen to be iterated over" by the early increment iterator. This issue was not visible in normal release build on x86, but luckily later on the address sanitizer build it has found it on the buildbot.

Here is the diff from the last version of this PR: #158435

diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index 91e245e5e8f5..1dd8cb4ee584 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -106,7 +106,8 @@ void llvm::detachDeadBlocks(ArrayRef&lt;BasicBlock *&gt; BBs,
       // first block, the we would have possible cleanupret and catchret
       // instructions with poison arguments, which wouldn't be valid.
       if (isa&lt;FuncletPadInst&gt;(I)) {
-        for (User *User : make_early_inc_range(I.users())) {
+        SmallPtrSet&lt;BasicBlock *, 4&gt; UniqueEHRetBlocksToDelete;
+        for (User *User : I.users()) {
           Instruction *ReturnInstr = dyn_cast&lt;Instruction&gt;(User);
           // If we have a cleanupret or catchret block, replace it with just an
           // unreachable. The other alternative, that may use a catchpad is a
@@ -114,33 +115,12 @@ void llvm::detachDeadBlocks(ArrayRef&lt;BasicBlock *&gt; BBs,
           if (isa&lt;CatchReturnInst&gt;(ReturnInstr) ||
               isa&lt;CleanupReturnInst&gt;(ReturnInstr)) {
             BasicBlock *ReturnInstrBB = ReturnInstr-&gt;getParent();
-            // This catchret or catchpad basic block is detached now. Let the
-            // successors know it.
-            // This basic block also may have some predecessors too. For
-            // example the following LLVM-IR is valid:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //   [cleanupret_block]
-            //
-            // The IR after the cleanup will look like this:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //     [unreachable]
-            //
-            // So regular_block will lead to an unreachable block, which is also
-            // valid. There is no need to replace regular_block with unreachable
-            // in this context now.
-            // On the other hand, the cleanupret/catchret block's successors
-            // need to know about the deletion of their predecessors.
-            emptyAndDetachBlock(ReturnInstrBB, Updates, KeepOneInputPHIs);
+            UniqueEHRetBlocksToDelete.insert(ReturnInstrBB);
           }
         }
+        for (BasicBlock *EHRetBB :
+             make_early_inc_range(UniqueEHRetBlocksToDelete))
+          emptyAndDetachBlock(EHRetBB, Updates, KeepOneInputPHIs);
       }
     }

Full diff: https://github.com/llvm/llvm-project/pull/159379.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Utils/BasicBlockUtils.cpp (+65-28)
  • (added) llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll (+236)
diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index cad0b4c12b54e..1dd8cb4ee584c 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -58,37 +58,74 @@ static cl::opt<unsigned> MaxDeoptOrUnreachableSuccessorCheckDepth(
              "is followed by a block that either has a terminating "
              "deoptimizing call or is terminated with an unreachable"));
 
-void llvm::detachDeadBlocks(
-    ArrayRef<BasicBlock *> BBs,
-    SmallVectorImpl<DominatorTree::UpdateType> *Updates,
-    bool KeepOneInputPHIs) {
+/// Zap all the instructions in the block and replace them with an unreachable
+/// instruction and notify the basic block's successors that one of their
+/// predecessors is going away.
+static void
+emptyAndDetachBlock(BasicBlock *BB,
+                    SmallVectorImpl<DominatorTree::UpdateType> *Updates,
+                    bool KeepOneInputPHIs) {
+  // Loop through all of our successors and make sure they know that one
+  // of their predecessors is going away.
+  SmallPtrSet<BasicBlock *, 4> UniqueSuccessors;
+  for (BasicBlock *Succ : successors(BB)) {
+    Succ->removePredecessor(BB, KeepOneInputPHIs);
+    if (Updates && UniqueSuccessors.insert(Succ).second)
+      Updates->push_back({DominatorTree::Delete, BB, Succ});
+  }
+
+  // Zap all the instructions in the block.
+  while (!BB->empty()) {
+    Instruction &I = BB->back();
+    // If this instruction is used, replace uses with an arbitrary value.
+    // Because control flow can't get here, we don't care what we replace the
+    // value with. Note that since this block is unreachable, and all values
+    // contained within it must dominate their uses, that all uses will
+    // eventually be removed (they are themselves dead).
+    if (!I.use_empty())
+      I.replaceAllUsesWith(PoisonValue::get(I.getType()));
+    BB->back().eraseFromParent();
+  }
+  new UnreachableInst(BB->getContext(), BB);
+  assert(BB->size() == 1 && isa<UnreachableInst>(BB->getTerminator()) &&
+         "The successor list of BB isn't empty before "
+         "applying corresponding DTU updates.");
+}
+
+void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
+                            SmallVectorImpl<DominatorTree::UpdateType> *Updates,
+                            bool KeepOneInputPHIs) {
   for (auto *BB : BBs) {
-    // Loop through all of our successors and make sure they know that one
-    // of their predecessors is going away.
-    SmallPtrSet<BasicBlock *, 4> UniqueSuccessors;
-    for (BasicBlock *Succ : successors(BB)) {
-      Succ->removePredecessor(BB, KeepOneInputPHIs);
-      if (Updates && UniqueSuccessors.insert(Succ).second)
-        Updates->push_back({DominatorTree::Delete, BB, Succ});
+    auto NonFirstPhiIt = BB->getFirstNonPHIIt();
+    if (NonFirstPhiIt != BB->end()) {
+      Instruction &I = *NonFirstPhiIt;
+      // Exception handling funclets need to be explicitly addressed.
+      // These funclets must begin with cleanuppad or catchpad and end with
+      // cleanupred or catchret. The return instructions can be in different
+      // basic blocks than the pad instruction. If we would only delete the
+      // first block, the we would have possible cleanupret and catchret
+      // instructions with poison arguments, which wouldn't be valid.
+      if (isa<FuncletPadInst>(I)) {
+        SmallPtrSet<BasicBlock *, 4> UniqueEHRetBlocksToDelete;
+        for (User *User : I.users()) {
+          Instruction *ReturnInstr = dyn_cast<Instruction>(User);
+          // If we have a cleanupret or catchret block, replace it with just an
+          // unreachable. The other alternative, that may use a catchpad is a
+          // catchswitch. That does not need special handling for now.
+          if (isa<CatchReturnInst>(ReturnInstr) ||
+              isa<CleanupReturnInst>(ReturnInstr)) {
+            BasicBlock *ReturnInstrBB = ReturnInstr->getParent();
+            UniqueEHRetBlocksToDelete.insert(ReturnInstrBB);
+          }
+        }
+        for (BasicBlock *EHRetBB :
+             make_early_inc_range(UniqueEHRetBlocksToDelete))
+          emptyAndDetachBlock(EHRetBB, Updates, KeepOneInputPHIs);
+      }
     }
 
-    // Zap all the instructions in the block.
-    while (!BB->empty()) {
-      Instruction &I = BB->back();
-      // If this instruction is used, replace uses with an arbitrary value.
-      // Because control flow can't get here, we don't care what we replace the
-      // value with.  Note that since this block is unreachable, and all values
-      // contained within it must dominate their uses, that all uses will
-      // eventually be removed (they are themselves dead).
-      if (!I.use_empty())
-        I.replaceAllUsesWith(PoisonValue::get(I.getType()));
-      BB->back().eraseFromParent();
-    }
-    new UnreachableInst(BB->getContext(), BB);
-    assert(BB->size() == 1 &&
-           isa<UnreachableInst>(BB->getTerminator()) &&
-           "The successor list of BB isn't empty before "
-           "applying corresponding DTU updates.");
+    // Detaching and emptying the current basic block.
+    emptyAndDetachBlock(BB, Updates, KeepOneInputPHIs);
   }
 }
 
diff --git a/llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll b/llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll
new file mode 100644
index 0000000000000..0f0fc78ec7add
--- /dev/null
+++ b/llvm/test/Transforms/SimplifyCFG/unreachable-multi-basic-block-funclet.ll
@@ -0,0 +1,236 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -passes=simplifycfg -S < %s | FileCheck %s
+
+; cleanuppad/cleanupret
+
+define void @unreachable_cleanuppad_linear(i64 %shapes.1) personality ptr null {
+; CHECK-LABEL: define void @unreachable_cleanuppad_linear(
+; CHECK-SAME: i64 [[SHAPES_1:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[_7:%.*]] = icmp ult i64 0, [[SHAPES_1]]
+; CHECK-NEXT:    ret void
+;
+start:
+  %_7 = icmp ult i64 0, %shapes.1
+  ret void
+
+funclet:
+  %cleanuppad = cleanuppad within none []
+  br label %funclet_end
+
+funclet_end:
+  cleanupret from %cleanuppad unwind to caller
+}
+
+define void @unreachable_cleanuppad_linear_middle_block(i64 %shapes.1) personality ptr null {
+; CHECK-LABEL: define void @unreachable_cleanuppad_linear_middle_block(
+; CHECK-SAME: i64 [[SHAPES_1:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[_7:%.*]] = icmp ult i64 0, [[SHAPES_1]]
+; CHECK-NEXT:    ret void
+;
+start:
+  %_7 = icmp ult i64 0, %shapes.1
+  ret void
+
+funclet:
+  %cleanuppad = cleanuppad within none []
+  br label %middle_block
+
+middle_block:
+  %tmp1 = add i64 %shapes.1, 42
+  br label %funclet_end
+
+funclet_end:
+  cleanupret from %cleanuppad unwind to caller
+}
+
+define void @unreachable_cleanuppad_multiple_predecessors(i64 %shapes.1) personality ptr null {
+; CHECK-LABEL: define void @unreachable_cleanuppad_multiple_predecessors(
+; CHECK-SAME: i64 [[SHAPES_1:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[_7:%.*]] = icmp ult i64 0, [[SHAPES_1]]
+; CHECK-NEXT:    ret void
+;
+start:
+  %_7 = icmp ult i64 0, %shapes.1
+  ret void
+
+funclet:
+  %cleanuppad = cleanuppad within none []
+  switch i64 %shapes.1, label %otherwise [ i64 0, label %one
+  i64 1, label %two
+  i64 42, label %three ]
+one:
+  br label %funclet_end
+
+two:
+  br label %funclet_end
+
+three:
+  br label %funclet_end
+
+otherwise:
+  br label %funclet_end
+
+funclet_end:
+  cleanupret from %cleanuppad unwind to caller
+}
+
+; catchpad/catchret
+
+define void @unreachable_catchpad_linear(i64 %shapes.1) personality ptr null {
+; CHECK-LABEL: define void @unreachable_catchpad_linear(
+; CHECK-SAME: i64 [[SHAPES_1:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[_7:%.*]] = icmp ult i64 0, [[SHAPES_1]]
+; CHECK-NEXT:    ret void
+;
+start:
+  %_7 = icmp ult i64 0, %shapes.1
+  ret void
+
+dispatch:
+  %cs = catchswitch within none [label %funclet] unwind to caller
+
+funclet:
+  %cleanuppad = catchpad within %cs []
+  br label %funclet_end
+
+
+funclet_end:
+  catchret from %cleanuppad to label %unreachable
+
+unreachable:
+  unreachable
+}
+
+define void @unreachable_catchpad_multiple_predecessors(i64 %shapes.1) personality ptr null {
+; CHECK-LABEL: define void @unreachable_catchpad_multiple_predecessors(
+; CHECK-SAME: i64 [[SHAPES_1:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[_7:%.*]] = icmp ult i64 0, [[SHAPES_1]]
+; CHECK-NEXT:    ret void
+;
+start:
+  %_7 = icmp ult i64 0, %shapes.1
+  ret void
+
+dispatch:
+  %cs = catchswitch within none [label %funclet] unwind to caller
+
+funclet:
+  %cleanuppad = catchpad within %cs []
+  switch i64 %shapes.1, label %otherwise [ i64 0, label %one
+  i64 1, label %two
+  i64 42, label %three ]
+one:
+  br label %funclet_end
+
+two:
+  br label %funclet_end
+
+three:
+  br label %funclet_end
+
+otherwise:
+  br label %funclet_end
+
+funclet_end:
+  catchret from %cleanuppad to label %unreachable
+
+unreachable:
+  unreachable
+}
+
+; Issue reproducer
+
+define void @gh148052(i64 %shapes.1) personality ptr null {
+; CHECK-LABEL: define void @gh148052(
+; CHECK-SAME: i64 [[SHAPES_1:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[START:.*:]]
+; CHECK-NEXT:    [[_7:%.*]] = icmp ult i64 0, [[SHAPES_1]]
+; CHECK-NEXT:    call void @llvm.assume(i1 [[_7]])
+; CHECK-NEXT:    ret void
+;
+start:
+  %_7 = icmp ult i64 0, %shapes.1
+  br i1 %_7, label %bb1, label %panic
+
+bb1:
+  %_11 = icmp ult i64 0, %shapes.1
+  br i1 %_11, label %bb3, label %panic1
+
+panic:
+  unreachable
+
+bb3:
+  ret void
+
+panic1:
+  invoke void @func(i64 0, i64 0, ptr null)
+  to label %unreachable unwind label %funclet_bb14
+
+funclet_bb14:
+  %cleanuppad = cleanuppad within none []
+  br label %bb13
+
+unreachable:
+  unreachable
+
+bb10:
+  cleanupret from %cleanuppad5 unwind to caller
+
+funclet_bb10:
+  %cleanuppad5 = cleanuppad within none []
+  br label %bb10
+
+bb13:
+  cleanupret from %cleanuppad unwind label %funclet_bb10
+}
+
+%struct.foo = type { ptr, %struct.eggs, ptr }
+%struct.eggs = type { ptr, ptr, ptr }
+
+declare x86_thiscallcc ptr @quux(ptr, ptr, i32)
+
+define x86_thiscallcc ptr @baz(ptr %arg, ptr %arg1, ptr %arg2, i1 %arg3, ptr %arg4) personality ptr null {
+; CHECK-LABEL: define x86_thiscallcc ptr @baz(
+; CHECK-SAME: ptr [[ARG:%.*]], ptr [[ARG1:%.*]], ptr [[ARG2:%.*]], i1 [[ARG3:%.*]], ptr [[ARG4:%.*]]) personality ptr null {
+; CHECK-NEXT:  [[BB:.*:]]
+; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x %struct.foo], align 4
+; CHECK-NEXT:    [[INVOKE:%.*]] = call x86_thiscallcc ptr @quux(ptr null, ptr null, i32 0) #[[ATTR1:[0-9]+]]
+; CHECK-NEXT:    unreachable
+;
+bb:
+  %alloca = alloca [2 x %struct.foo], align 4
+  %invoke = invoke x86_thiscallcc ptr @quux(ptr null, ptr null, i32 0)
+  to label %bb5 unwind label %bb10
+
+bb5:                                              ; preds = %bb
+  %getelementptr = getelementptr i8, ptr %arg, i32 20
+  %call = call x86_thiscallcc ptr null(ptr null, ptr null, i32 0)
+  br label %bb6
+
+bb6:                                              ; preds = %bb10, %bb5
+  %phi = phi ptr [ null, %bb10 ], [ null, %bb5 ]
+  ret ptr %phi
+
+bb7:                                              ; No predecessors!
+  %cleanuppad = cleanuppad within none []
+  %getelementptr8 = getelementptr i8, ptr %arg2, i32 -20
+  %icmp = icmp eq ptr %arg, null
+  br label %bb9
+
+bb9:                                              ; preds = %bb7
+  cleanupret from %cleanuppad unwind label %bb10
+
+bb10:                                             ; preds = %bb9, %bb
+  %phi11 = phi ptr [ %arg, %bb9 ], [ null, %bb ]
+  %cleanuppad12 = cleanuppad within none []
+  %getelementptr13 = getelementptr i8, ptr %phi11, i32 -20
+  store i32 0, ptr %phi11, align 4
+  br label %bb6
+}
+
+declare void @func(i64, i64, ptr)

@spaits
Copy link
Contributor Author

spaits commented Sep 17, 2025

@rnk I am sorry, that my last PR didn't worked and sorry for wasting your time and requesting your review once again.

I have done some testing with the Address Sanitizer on the tests that failed on the buildbot. They work now.

I couldn't run check-llvm suite since I have limited computing resources (There are some OCaml binding tests, that fail even before my patch on my machine. When using a sanitizer build these failure logs are much larger. My machine just freezes. (It reports some missing symbols and crashes :( ) ).

Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, thanks for debugging it! I added a suggestion, please add that and run it through testing again.

@rnk
Copy link
Collaborator

rnk commented Sep 17, 2025

I couldn't run check-llvm suite since I have limited computing resources (There are some OCaml binding tests, that fail even before my patch on my machine. When using a sanitizer build these failure logs are much larger. My machine just freezes. (It reports some missing symbols and crashes :( ) ).

Yeah, LLVM's build and test requirements continue to go up and to the right. It is a real barrier to contribution, as Nuno highlighted the other week. I think our default build and test configuration could use a bit more curation, like paring back the OCaml binding tests. Those are integration tests that are likely to fail out of the box in new environments, and they probably shouldn't be in check-llvm by default.

@spaits
Copy link
Contributor Author

spaits commented Sep 18, 2025

Thank you @rnk .

  • I have removed the unnecessary make_early_inc_range.
  • I have moved UniqueEHRetBlocksToDelete a bit higher in scope and call clear on it before usage. (Maybe this way we can save some memory usage, probably the compiler would optimize this anyway :) ).
  • Run check-llvm-transforms and check-llvm-codegen test suites with ASan for targets RISC-V, X86 and WebAssembly.

Update: I have run the tests on all available targets with ASan and they are passing.

@rnk rnk merged commit 0a47e8c into llvm:main Sep 20, 2025
9 checks passed
@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 20, 2025

LLVM Buildbot has detected a new failure on builder mlir-nvidia running on mlir-nvidia while building llvm at step 7 "test-build-check-mlir-build-only-check-mlir".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/138/builds/19359

Here is the relevant piece of the build log for the reference
Step 7 (test-build-check-mlir-build-only-check-mlir) failure: test (failure)
******************** TEST 'MLIR :: Integration/GPU/CUDA/async.mlir' FAILED ********************
Exit Code: 1

Command Output (stdout):
--
# RUN: at line 1
/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-kernel-outlining  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -pass-pipeline='builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary="format=fatbin"  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-runner    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_async_runtime.so    --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so    --entry-point-result=void -O0  | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-kernel-outlining
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt '-pass-pipeline=builtin.module(gpu.module(strip-debuginfo,convert-gpu-to-nvvm),nvvm-attach-target)'
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -gpu-async-region -gpu-to-llvm -reconcile-unrealized-casts -gpu-module-to-binary=format=fatbin
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -async-to-async-runtime -async-runtime-ref-counting
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-opt -convert-async-to-llvm -convert-func-to-llvm -convert-arith-to-llvm -convert-cf-to-llvm -reconcile-unrealized-casts
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/mlir-runner --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_cuda_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_async_runtime.so --shared-libs=/vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/lib/libmlir_runner_utils.so --entry-point-result=void -O0
# .---command stderr------------
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuStreamWaitEvent(stream, event, 0)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventSynchronize(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# | 'cuEventDestroy(event)' failed with 'CUDA_ERROR_CONTEXT_IS_DESTROYED'
# `-----------------------------
# executed command: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.obj/bin/FileCheck /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# .---command stderr------------
# | /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir:68:12: error: CHECK: expected string not found in input
# |  // CHECK: [84, 84]
# |            ^
# | <stdin>:1:1: note: scanning from here
# | Unranked Memref base@ = 0x59f1d3014fd0 rank = 1 offset = 0 sizes = [2] strides = [1] data = 
# | ^
# | <stdin>:2:1: note: possible intended match here
# | [42, 42]
# | ^
# | 
# | Input file: <stdin>
# | Check file: /vol/worker/mlir-nvidia/mlir-nvidia/llvm.src/mlir/test/Integration/GPU/CUDA/async.mlir
# | 
# | -dump-input=help explains the following input dump.
# | 
# | Input was:
# | <<<<<<
# |             1: Unranked Memref base@ = 0x59f1d3014fd0 rank = 1 offset = 0 sizes = [2] strides = [1] data =  
# | check:68'0     X~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ error: no match found
# |             2: [42, 42] 
# | check:68'0     ~~~~~~~~~
# | check:68'1     ?         possible intended match
...

@llvm-ci
Copy link
Collaborator

llvm-ci commented Sep 20, 2025

LLVM Buildbot has detected a new failure on builder clang-aarch64-quick running on linaro-clang-aarch64-quick while building llvm at step 5 "ninja check 1".

Full details are available at: https://lab.llvm.org/buildbot/#/builders/65/builds/22898

Here is the relevant piece of the build log for the reference
Step 5 (ninja check 1) failure: stage 1 checked (failure)
******************** TEST 'Clangd Unit Tests :: ./ClangdTests/289/332' FAILED ********************
Script(shard):
--
GTEST_OUTPUT=json:/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests-Clangd Unit Tests-3521222-289-332.json GTEST_SHUFFLE=0 GTEST_TOTAL_SHARDS=332 GTEST_SHARD_INDEX=289 /home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests
--

Note: This is test shard 290 of 332.
[==========] Running 4 tests from 4 test suites.
[----------] Global test environment set-up.
[----------] 1 test from ConfigCompileTests
[ RUN      ] ConfigCompileTests.CompileCommands
Config fragment: compiling <unknown>:0 -> 0x0000B8AEE1E63A60 (trusted=false)
[       OK ] ConfigCompileTests.CompileCommands (16 ms)
[----------] 1 test from ConfigCompileTests (16 ms total)

[----------] 1 test from HeaderSourceSwitchTest
[ RUN      ] HeaderSourceSwitchTest.ClangdServerIntegration
ASTWorker building file /clangd-test/src/lib/test.cpp version null with command 
[/clangd-test/src/lib]
clang -I/clangd-test/src/include /clangd-test/src/lib/test.cpp
Driver produced command: cc1 -cc1 -triple aarch64-unknown-linux-gnu -fsyntax-only -disable-free -clear-ast-before-backend -main-file-name test.cpp -mrelocation-model pic -pic-level 2 -pic-is-pie -mframe-pointer=non-leaf -fmath-errno -ffp-contract=on -fno-rounding-math -mconstructor-aliases -funwind-tables=2 -enable-tlsdesc -target-cpu generic -target-feature +v8a -target-feature +fp-armv8 -target-feature +neon -target-abi aapcs -debugger-tuning=gdb -fdebug-compilation-dir=/clangd-test/src/lib -fcoverage-compilation-dir=/clangd-test/src/lib -resource-dir lib/clang/22 -I /clangd-test/src/include -internal-isystem lib/clang/22/include -internal-isystem /usr/local/include -internal-externc-isystem /include -internal-externc-isystem /usr/include -fdeprecated-macro -ferror-limit 19 -fno-signed-char -fgnuc-version=4.2.1 -fskip-odr-check-in-gmf -fcxx-exceptions -fexceptions -no-round-trip-args -target-feature -fmv -faddrsig -D__GCC_HAVE_DWARF2_CFI_ASM=1 -x c++ /clangd-test/src/lib/test.cpp
Building first preamble for /clangd-test/src/lib/test.cpp version null
not idle after addDocument
UNREACHABLE executed at ../llvm/clang-tools-extra/clangd/unittests/SyncAPI.cpp:22!
Built preamble of size 820468 for file /clangd-test/src/lib/test.cpp version null in 10.01 seconds
indexed preamble AST for /clangd-test/src/lib/test.cpp version null:
  symbol slab: 1 symbols, 4448 bytes
  ref slab: 0 symbols, 0 refs, 128 bytes
  relations slab: 0 relations, 24 bytes
Build dynamic index for header symbols with estimated memory usage of 6956 bytes
indexed file AST for /clangd-test/src/lib/test.cpp version null:
  symbol slab: 1 symbols, 4448 bytes
  ref slab: 1 symbols, 1 refs, 4248 bytes
  relations slab: 0 relations, 24 bytes
Build dynamic index for main-file symbols with estimated memory usage of 11520 bytes
 #0 0x0000b8aed2c03724 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xc83724)
 #1 0x0000b8aed2c011ec llvm::sys::RunSignalHandlers() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xc811ec)
 #2 0x0000b8aed2c04580 SignalHandler(int, siginfo_t*, void*) Signals.cpp:0:0
 #3 0x0000efd91acd78f8 (linux-vdso.so.1+0x8f8)
 #4 0x0000efd91a7ef1f0 __pthread_kill_implementation ./nptl/./nptl/pthread_kill.c:44:76
 #5 0x0000efd91a7aa67c gsignal ./signal/../sysdeps/posix/raise.c:27:6
 #6 0x0000efd91a797130 abort ./stdlib/./stdlib/abort.c:81:7
 #7 0x0000b8aed2bb0260 llvm::RTTIRoot::anchor() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xc30260)
 #8 0x0000b8aed2a5a7e0 clang::clangd::runCodeComplete(clang::clangd::ClangdServer&, llvm::StringRef, clang::clangd::Position, clang::clangd::CodeCompleteOptions) (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xada7e0)
 #9 0x0000b8aed2849390 clang::clangd::(anonymous namespace)::HeaderSourceSwitchTest_ClangdServerIntegration_Test::TestBody() HeaderSourceSwitchTests.cpp:0:0
#10 0x0000b8aed2c5c078 testing::Test::Run() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcdc078)
#11 0x0000b8aed2c5d39c testing::TestInfo::Run() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcdd39c)
#12 0x0000b8aed2c5dfd8 testing::TestSuite::Run() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcddfd8)
#13 0x0000b8aed2c6e358 testing::internal::UnitTestImpl::RunAllTests() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcee358)
#14 0x0000b8aed2c6dca4 testing::UnitTest::Run() (/home/tcwg-buildbot/worker/clang-aarch64-quick/stage1/tools/clang/tools/extra/clangd/unittests/./ClangdTests+0xcedca4)
...

SeongjaeP pushed a commit to SeongjaeP/llvm-project that referenced this pull request Sep 23, 2025
…llvm#159379)

Fixes llvm#148052 .

Last PR did not account for the scenario, when more than one instruction
used the `catchpad` label.
In that case I have deleted uses, which were already "choosen to be
iterated over" by the early increment iterator. This issue was not
visible in normal release build on x86, but luckily later on the address
sanitizer build it has found it on the buildbot.

Here is the diff from the last version of this PR: llvm#158435
```diff
diff --git a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
index 91e245e..1dd8cb4 100644
--- a/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
+++ b/llvm/lib/Transforms/Utils/BasicBlockUtils.cpp
@@ -106,7 +106,8 @@ void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
       // first block, the we would have possible cleanupret and catchret
       // instructions with poison arguments, which wouldn't be valid.
       if (isa<FuncletPadInst>(I)) {
-        for (User *User : make_early_inc_range(I.users())) {
+        SmallPtrSet<BasicBlock *, 4> UniqueEHRetBlocksToDelete;
+        for (User *User : I.users()) {
           Instruction *ReturnInstr = dyn_cast<Instruction>(User);
           // If we have a cleanupret or catchret block, replace it with just an
           // unreachable. The other alternative, that may use a catchpad is a
@@ -114,33 +115,12 @@ void llvm::detachDeadBlocks(ArrayRef<BasicBlock *> BBs,
           if (isa<CatchReturnInst>(ReturnInstr) ||
               isa<CleanupReturnInst>(ReturnInstr)) {
             BasicBlock *ReturnInstrBB = ReturnInstr->getParent();
-            // This catchret or catchpad basic block is detached now. Let the
-            // successors know it.
-            // This basic block also may have some predecessors too. For
-            // example the following LLVM-IR is valid:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //   [cleanupret_block]
-            //
-            // The IR after the cleanup will look like this:
-            //
-            //   [cleanuppad_block]
-            //            |
-            //    [regular_block]
-            //            |
-            //     [unreachable]
-            //
-            // So regular_block will lead to an unreachable block, which is also
-            // valid. There is no need to replace regular_block with unreachable
-            // in this context now.
-            // On the other hand, the cleanupret/catchret block's successors
-            // need to know about the deletion of their predecessors.
-            emptyAndDetachBlock(ReturnInstrBB, Updates, KeepOneInputPHIs);
+            UniqueEHRetBlocksToDelete.insert(ReturnInstrBB);
           }
         }
+        for (BasicBlock *EHRetBB :
+             make_early_inc_range(UniqueEHRetBlocksToDelete))
+          emptyAndDetachBlock(EHRetBB, Updates, KeepOneInputPHIs);
       }
     }
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Bug in SimplifyCFG causes ICE
4 participants