@fhahn
Contributor

@fhahn fhahn commented Nov 7, 2025

Currently, when InstCombine sinks an instruction, it drops any assumes that would prevent the sinking. Removing dereferenceable assumptions this early can inhibit vectorization of early-exit loops in practice.

Special-case dereferenceable assumptions so that they block sinking. This can be combined with a separate change to drop dereferenceable assumptions after vectorization: https://clang.godbolt.org/z/jGqcx3sbs

Not sure if there is a better solution.
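
For context, the loop shape exercised by std-find.ll corresponds roughly to the C++ sketch below. The function name `find_one` is hypothetical, and the sketch leaves out the alignment and dereferenceable assumptions that the IR test carries. Vectorizing such a loop means loading several elements per iteration, possibly past the first match, so the vectorizer has to know that the whole range is safe to read; that is the fact the dereferenceable assumption provides.

```cpp
#include <cstdint>

// Hypothetical stand-in for the early-exit loop in std-find.ll:
// scan [first, last) for the first element equal to 1.
const std::int16_t *find_one(const std::int16_t *first,
                             const std::int16_t *last) {
  for (; first != last; ++first) {
    if (*first == 1)
      return first; // early exit: leaves the loop before reaching `last`
  }
  return last; // not found
}
```

A scalar loop only touches elements up to the match. A vectorized version reads a full vector each iteration, which is why losing the assumption before the vectorizer runs matters.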

@llvmbot llvmbot added the llvm:instcombine (covers the InstCombine, InstSimplify and AggressiveInstCombine passes) and llvm:transforms labels Nov 7, 2025
@llvmbot
Member

llvmbot commented Nov 7, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Florian Hahn (fhahn)

Changes


Full diff: https://github.com/llvm/llvm-project/pull/166945.diff

3 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+12)
  • (modified) llvm/test/Transforms/InstCombine/sink-dereferenceable-assume.ll (+4-3)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/std-find.ll (+46-56)
diff --git a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
index b158e0f626850..cea5be2468feb 100644
--- a/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstructionCombining.cpp
@@ -5413,6 +5413,18 @@ bool InstCombinerImpl::tryToSinkInstruction(Instruction *I,
         return false;
   }
 
+  // Do not sink if there are dereferenceable assumes that would be removed.
+  for (User *User : I->users()) {
+    auto *CI = dyn_cast<CallInst>(User);
+    if (!CI || CI->getParent() == DestBlock)
+      continue;
+
+    if (CI->getIntrinsicID() == Intrinsic::assume &&
+        CI->getOperandBundle("dereferenceable")) {
+      return false;
+    }
+  }
+
   I->dropDroppableUses([&](const Use *U) {
     auto *I = dyn_cast<Instruction>(U->getUser());
     if (I && I->getParent() != DestBlock) {
diff --git a/llvm/test/Transforms/InstCombine/sink-dereferenceable-assume.ll b/llvm/test/Transforms/InstCombine/sink-dereferenceable-assume.ll
index 953132309900b..353e0bb191ed1 100644
--- a/llvm/test/Transforms/InstCombine/sink-dereferenceable-assume.ll
+++ b/llvm/test/Transforms/InstCombine/sink-dereferenceable-assume.ll
@@ -5,11 +5,12 @@ define i64 @test_sink_with_dereferenceable_assume(ptr %p, ptr %q, i1 %cond) {
 ; CHECK-LABEL: define i64 @test_sink_with_dereferenceable_assume(
 ; CHECK-SAME: ptr [[P:%.*]], ptr [[Q:%.*]], i1 [[COND:%.*]]) {
 ; CHECK-NEXT:  [[ENTRY:.*:]]
-; CHECK-NEXT:    br i1 [[COND]], label %[[THEN:.*]], label %[[ELSE:.*]]
-; CHECK:       [[THEN]]:
-; CHECK-NEXT:    [[Q_INT:%.*]] = ptrtoint ptr [[Q]] to i64
 ; CHECK-NEXT:    [[P_INT:%.*]] = ptrtoint ptr [[P]] to i64
+; CHECK-NEXT:    [[Q_INT:%.*]] = ptrtoint ptr [[Q]] to i64
 ; CHECK-NEXT:    [[DIFF:%.*]] = sub i64 [[Q_INT]], [[P_INT]]
+; CHECK-NEXT:    call void @llvm.assume(i1 true) [ "dereferenceable"(ptr [[P]], i64 [[DIFF]]) ]
+; CHECK-NEXT:    br i1 [[COND]], label %[[THEN:.*]], label %[[ELSE:.*]]
+; CHECK:       [[THEN]]:
 ; CHECK-NEXT:    ret i64 [[DIFF]]
 ; CHECK:       [[ELSE]]:
 ; CHECK-NEXT:    ret i64 0
diff --git a/llvm/test/Transforms/PhaseOrdering/AArch64/std-find.ll b/llvm/test/Transforms/PhaseOrdering/AArch64/std-find.ll
index 33e3e83770e7f..e9149795954ec 100644
--- a/llvm/test/Transforms/PhaseOrdering/AArch64/std-find.ll
+++ b/llvm/test/Transforms/PhaseOrdering/AArch64/std-find.ll
@@ -133,75 +133,65 @@ define ptr @std_find_caller(ptr noundef %first, ptr noundef %last) {
 ; CHECK-LABEL: define noundef ptr @std_find_caller(
 ; CHECK-SAME: ptr noundef [[FIRST:%.*]], ptr noundef [[LAST:%.*]]) local_unnamed_addr #[[ATTR0]] {
 ; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    [[FIRST3:%.*]] = ptrtoint ptr [[FIRST]] to i64
+; CHECK-NEXT:    [[LAST_I64:%.*]] = ptrtoint ptr [[LAST]] to i64
+; CHECK-NEXT:    [[PTR_SUB:%.*]] = sub i64 [[LAST_I64]], [[FIRST3]]
 ; CHECK-NEXT:    call void @llvm.assume(i1 true) [ "align"(ptr [[FIRST]], i64 2) ]
 ; CHECK-NEXT:    call void @llvm.assume(i1 true) [ "align"(ptr [[LAST]], i64 2) ]
+; CHECK-NEXT:    call void @llvm.assume(i1 true) [ "dereferenceable"(ptr [[FIRST]], i64 [[PTR_SUB]]) ]
 ; CHECK-NEXT:    [[PRE_I:%.*]] = icmp eq ptr [[FIRST]], [[LAST]]
 ; CHECK-NEXT:    br i1 [[PRE_I]], label %[[STD_FIND_GENERIC_IMPL_EXIT:.*]], label %[[LOOP_HEADER_I_PREHEADER:.*]]
 ; CHECK:       [[LOOP_HEADER_I_PREHEADER]]:
-; CHECK-NEXT:    [[LAST2:%.*]] = ptrtoint ptr [[LAST]] to i64
-; CHECK-NEXT:    [[FIRST3:%.*]] = ptrtoint ptr [[FIRST]] to i64
-; CHECK-NEXT:    [[LAST_I64:%.*]] = ptrtoint ptr [[LAST]] to i64
-; CHECK-NEXT:    [[FIRST1:%.*]] = ptrtoint ptr [[FIRST]] to i64
-; CHECK-NEXT:    [[PTR_SUB:%.*]] = sub i64 [[LAST_I64]], [[FIRST1]]
 ; CHECK-NEXT:    [[SCEVGEP:%.*]] = getelementptr i8, ptr [[FIRST]], i64 [[PTR_SUB]]
-; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[LAST2]], -2
+; CHECK-NEXT:    [[TMP0:%.*]] = add i64 [[LAST_I64]], -2
 ; CHECK-NEXT:    [[TMP1:%.*]] = sub i64 [[TMP0]], [[FIRST3]]
 ; CHECK-NEXT:    [[TMP2:%.*]] = lshr exact i64 [[TMP1]], 1
 ; CHECK-NEXT:    [[TMP3:%.*]] = add nuw i64 [[TMP2]], 1
-; CHECK-NEXT:    [[XTRAITER:%.*]] = and i64 [[TMP3]], 3
-; CHECK-NEXT:    [[TMP4:%.*]] = and i64 [[TMP1]], 6
-; CHECK-NEXT:    [[LCMP_MOD_NOT:%.*]] = icmp eq i64 [[TMP4]], 6
-; CHECK-NEXT:    br i1 [[LCMP_MOD_NOT]], label %[[LOOP_HEADER_I_PROL_LOOPEXIT:.*]], label %[[LOOP_HEADER_I_PROL:.*]]
-; CHECK:       [[LOOP_HEADER_I_PROL]]:
-; CHECK-NEXT:    [[PTR_IV_I_PROL:%.*]] = phi ptr [ [[PTR_IV_NEXT_I_PROL:%.*]], %[[LOOP_LATCH_I_PROL:.*]] ], [ [[FIRST]], %[[LOOP_HEADER_I_PREHEADER]] ]
-; CHECK-NEXT:    [[PROL_ITER:%.*]] = phi i64 [ [[PROL_ITER_NEXT:%.*]], %[[LOOP_LATCH_I_PROL]] ], [ 0, %[[LOOP_HEADER_I_PREHEADER]] ]
-; CHECK-NEXT:    [[L_I_PROL:%.*]] = load i16, ptr [[PTR_IV_I_PROL]], align 2
-; CHECK-NEXT:    [[C_1_I_PROL:%.*]] = icmp eq i16 [[L_I_PROL]], 1
-; CHECK-NEXT:    br i1 [[C_1_I_PROL]], label %[[STD_FIND_GENERIC_IMPL_EXIT]], label %[[LOOP_LATCH_I_PROL]]
-; CHECK:       [[LOOP_LATCH_I_PROL]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_I_PROL]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I_PROL]], i64 2
-; CHECK-NEXT:    [[PROL_ITER_NEXT]] = add i64 [[PROL_ITER]], 1
+; CHECK-NEXT:    [[MIN_ITERS_CHECK:%.*]] = icmp ult i64 [[TMP1]], 158
+; CHECK-NEXT:    br i1 [[MIN_ITERS_CHECK]], label %[[LOOP_HEADER_I_PREHEADER2:.*]], label %[[VECTOR_PH:.*]]
+; CHECK:       [[VECTOR_PH]]:
+; CHECK-NEXT:    [[XTRAITER:%.*]] = and i64 [[TMP3]], -8
+; CHECK-NEXT:    br label %[[VECTOR_BODY:.*]]
+; CHECK:       [[VECTOR_BODY]]:
+; CHECK-NEXT:    [[INDEX:%.*]] = phi i64 [ 0, %[[VECTOR_PH]] ], [ [[PROL_ITER_NEXT:%.*]], %[[VECTOR_BODY]] ]
+; CHECK-NEXT:    [[OFFSET_IDX:%.*]] = shl i64 [[INDEX]], 1
+; CHECK-NEXT:    [[NEXT_GEP:%.*]] = getelementptr i8, ptr [[FIRST]], i64 [[OFFSET_IDX]]
+; CHECK-NEXT:    [[WIDE_LOAD:%.*]] = load <8 x i16>, ptr [[NEXT_GEP]], align 2
+; CHECK-NEXT:    [[WIDE_LOAD_FR:%.*]] = freeze <8 x i16> [[WIDE_LOAD]]
+; CHECK-NEXT:    [[TMP4:%.*]] = icmp eq <8 x i16> [[WIDE_LOAD_FR]], splat (i16 1)
+; CHECK-NEXT:    [[PROL_ITER_NEXT]] = add nuw i64 [[INDEX]], 8
+; CHECK-NEXT:    [[TMP5:%.*]] = bitcast <8 x i1> [[TMP4]] to i8
+; CHECK-NEXT:    [[TMP6:%.*]] = icmp ne i8 [[TMP5]], 0
 ; CHECK-NEXT:    [[PROL_ITER_CMP_NOT:%.*]] = icmp eq i64 [[PROL_ITER_NEXT]], [[XTRAITER]]
-; CHECK-NEXT:    br i1 [[PROL_ITER_CMP_NOT]], label %[[LOOP_HEADER_I_PROL_LOOPEXIT]], label %[[LOOP_HEADER_I_PROL]], !llvm.loop [[LOOP3:![0-9]+]]
-; CHECK:       [[LOOP_HEADER_I_PROL_LOOPEXIT]]:
-; CHECK-NEXT:    [[PTR_IV_I_UNR:%.*]] = phi ptr [ [[FIRST]], %[[LOOP_HEADER_I_PREHEADER]] ], [ [[PTR_IV_NEXT_I_PROL]], %[[LOOP_LATCH_I_PROL]] ]
-; CHECK-NEXT:    [[TMP5:%.*]] = icmp ult i64 [[TMP1]], 6
-; CHECK-NEXT:    br i1 [[TMP5]], label %[[STD_FIND_GENERIC_IMPL_EXIT]], label %[[LOOP_HEADER_I:.*]]
+; CHECK-NEXT:    [[TMP8:%.*]] = or i1 [[TMP6]], [[PROL_ITER_CMP_NOT]]
+; CHECK-NEXT:    br i1 [[TMP8]], label %[[MIDDLE_SPLIT:.*]], label %[[VECTOR_BODY]], !llvm.loop [[LOOP3:![0-9]+]]
+; CHECK:       [[MIDDLE_SPLIT]]:
+; CHECK-NEXT:    [[TMP9:%.*]] = shl i64 [[XTRAITER]], 1
+; CHECK-NEXT:    [[TMP10:%.*]] = getelementptr i8, ptr [[FIRST]], i64 [[TMP9]]
+; CHECK-NEXT:    br i1 [[TMP6]], label %[[VECTOR_EARLY_EXIT:.*]], label %[[MIDDLE_BLOCK:.*]]
+; CHECK:       [[MIDDLE_BLOCK]]:
+; CHECK-NEXT:    [[CMP_N:%.*]] = icmp eq i64 [[TMP3]], [[XTRAITER]]
+; CHECK-NEXT:    br i1 [[CMP_N]], label %[[STD_FIND_GENERIC_IMPL_EXIT]], label %[[LOOP_HEADER_I_PREHEADER2]]
+; CHECK:       [[LOOP_HEADER_I_PREHEADER2]]:
+; CHECK-NEXT:    [[PTR_IV_I_PH:%.*]] = phi ptr [ [[FIRST]], %[[LOOP_HEADER_I_PREHEADER]] ], [ [[TMP10]], %[[MIDDLE_BLOCK]] ]
+; CHECK-NEXT:    br label %[[LOOP_HEADER_I:.*]]
+; CHECK:       [[VECTOR_EARLY_EXIT]]:
+; CHECK-NEXT:    [[TMP11:%.*]] = tail call i64 @llvm.experimental.cttz.elts.i64.v8i1(<8 x i1> [[TMP4]], i1 true)
+; CHECK-NEXT:    [[TMP12:%.*]] = add i64 [[INDEX]], [[TMP11]]
+; CHECK-NEXT:    [[TMP13:%.*]] = shl i64 [[TMP12]], 1
+; CHECK-NEXT:    [[TMP14:%.*]] = getelementptr i8, ptr [[FIRST]], i64 [[TMP13]]
+; CHECK-NEXT:    br label %[[STD_FIND_GENERIC_IMPL_EXIT]]
 ; CHECK:       [[LOOP_HEADER_I]]:
-; CHECK-NEXT:    [[PTR_IV_I:%.*]] = phi ptr [ [[PTR_IV_NEXT_I_3:%.*]], %[[LOOP_LATCH_I_3:.*]] ], [ [[PTR_IV_I_UNR]], %[[LOOP_HEADER_I_PROL_LOOPEXIT]] ]
+; CHECK-NEXT:    [[PTR_IV_I:%.*]] = phi ptr [ [[PTR_IV_NEXT_I:%.*]], %[[LOOP_LATCH_I:.*]] ], [ [[PTR_IV_I_PH]], %[[LOOP_HEADER_I_PREHEADER2]] ]
 ; CHECK-NEXT:    [[L_I:%.*]] = load i16, ptr [[PTR_IV_I]], align 2
 ; CHECK-NEXT:    [[C_1_I:%.*]] = icmp eq i16 [[L_I]], 1
-; CHECK-NEXT:    br i1 [[C_1_I]], label %[[STD_FIND_GENERIC_IMPL_EXIT]], label %[[LOOP_LATCH_I:.*]]
+; CHECK-NEXT:    br i1 [[C_1_I]], label %[[STD_FIND_GENERIC_IMPL_EXIT]], label %[[LOOP_LATCH_I]]
 ; CHECK:       [[LOOP_LATCH_I]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_I:%.*]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I]], i64 2
-; CHECK-NEXT:    [[L_I_1:%.*]] = load i16, ptr [[PTR_IV_NEXT_I]], align 2
-; CHECK-NEXT:    [[C_1_I_1:%.*]] = icmp eq i16 [[L_I_1]], 1
-; CHECK-NEXT:    br i1 [[C_1_I_1]], label %[[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT11:.*]], label %[[LOOP_LATCH_I_1:.*]]
-; CHECK:       [[LOOP_LATCH_I_1]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_I_1:%.*]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I]], i64 4
-; CHECK-NEXT:    [[L_I_2:%.*]] = load i16, ptr [[PTR_IV_NEXT_I_1]], align 2
-; CHECK-NEXT:    [[C_1_I_2:%.*]] = icmp eq i16 [[L_I_2]], 1
-; CHECK-NEXT:    br i1 [[C_1_I_2]], label %[[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT9:.*]], label %[[LOOP_LATCH_I_2:.*]]
-; CHECK:       [[LOOP_LATCH_I_2]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_I_2:%.*]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I]], i64 6
-; CHECK-NEXT:    [[L_I_3:%.*]] = load i16, ptr [[PTR_IV_NEXT_I_2]], align 2
-; CHECK-NEXT:    [[C_1_I_3:%.*]] = icmp eq i16 [[L_I_3]], 1
-; CHECK-NEXT:    br i1 [[C_1_I_3]], label %[[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT7:.*]], label %[[LOOP_LATCH_I_3]]
-; CHECK:       [[LOOP_LATCH_I_3]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_I_3]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I]], i64 8
-; CHECK-NEXT:    [[EC_I_3:%.*]] = icmp eq ptr [[PTR_IV_NEXT_I_3]], [[LAST]]
-; CHECK-NEXT:    br i1 [[EC_I_3]], label %[[STD_FIND_GENERIC_IMPL_EXIT]], label %[[LOOP_HEADER_I]]
-; CHECK:       [[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT7]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_I_2_LE:%.*]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I]], i64 6
-; CHECK-NEXT:    br label %[[STD_FIND_GENERIC_IMPL_EXIT]]
-; CHECK:       [[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT9]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_I_1_LE:%.*]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I]], i64 4
-; CHECK-NEXT:    br label %[[STD_FIND_GENERIC_IMPL_EXIT]]
-; CHECK:       [[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT11]]:
-; CHECK-NEXT:    [[PTR_IV_NEXT_I_LE:%.*]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I]], i64 2
-; CHECK-NEXT:    br label %[[STD_FIND_GENERIC_IMPL_EXIT]]
+; CHECK-NEXT:    [[PTR_IV_NEXT_I]] = getelementptr inbounds nuw i8, ptr [[PTR_IV_I]], i64 2
+; CHECK-NEXT:    [[EC_I:%.*]] = icmp eq ptr [[PTR_IV_NEXT_I]], [[LAST]]
+; CHECK-NEXT:    br i1 [[EC_I]], label %[[STD_FIND_GENERIC_IMPL_EXIT]], label %[[LOOP_HEADER_I]], !llvm.loop [[LOOP4:![0-9]+]]
 ; CHECK:       [[STD_FIND_GENERIC_IMPL_EXIT]]:
-; CHECK-NEXT:    [[RES_I:%.*]] = phi ptr [ [[FIRST]], %[[ENTRY]] ], [ [[SCEVGEP]], %[[LOOP_HEADER_I_PROL_LOOPEXIT]] ], [ [[PTR_IV_NEXT_I_2_LE]], %[[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT7]] ], [ [[PTR_IV_NEXT_I_1_LE]], %[[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT9]] ], [ [[PTR_IV_NEXT_I_LE]], %[[STD_FIND_GENERIC_IMPL_EXIT_LOOPEXIT_UNR_LCSSA_LOOPEXIT_SPLIT_LOOP_EXIT11]] ], [ [[SCEVGEP]], %[[LOOP_LATCH_I_3]] ], [ [[PTR_IV_I]], %[[LOOP_HEADER_I]] ], [ [[PTR_IV_I_PROL]], %[[LOOP_HEADER_I_PROL]] ]
+; CHECK-NEXT:    [[RES_I:%.*]] = phi ptr [ [[FIRST]], %[[ENTRY]] ], [ [[SCEVGEP]], %[[MIDDLE_BLOCK]] ], [ [[TMP14]], %[[VECTOR_EARLY_EXIT]] ], [ [[SCEVGEP]], %[[LOOP_LATCH_I]] ], [ [[PTR_IV_I]], %[[LOOP_HEADER_I]] ]
 ; CHECK-NEXT:    ret ptr [[RES_I]]
 ;
 entry:
@@ -241,6 +231,6 @@ declare void @llvm.assume(i1 noundef)
 ; CHECK: [[LOOP0]] = distinct !{[[LOOP0]], [[META1:![0-9]+]], [[META2:![0-9]+]]}
 ; CHECK: [[META1]] = !{!"llvm.loop.isvectorized", i32 1}
 ; CHECK: [[META2]] = !{!"llvm.loop.unroll.runtime.disable"}
-; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META4:![0-9]+]]}
-; CHECK: [[META4]] = !{!"llvm.loop.unroll.disable"}
+; CHECK: [[LOOP3]] = distinct !{[[LOOP3]], [[META1]], [[META2]]}
+; CHECK: [[LOOP4]] = distinct !{[[LOOP4]], [[META2]], [[META1]]}
 ;.

}

// Do not sink if there are dereferenceable assumes that would be removed.
for (User *User : I->users()) {
Member

Can we bail out this case in getOptionalSinkBlockForInst?

Contributor Author

Yep, thanks. We can do better in getOptionalSinkBlockForInst: by simply not skipping such assumes there, we can still sink up to the dereferenceable assume. I also added some extra tests.
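
The idea of not skipping such assumes can be illustrated with a toy model. This is a simplified sketch and not the LLVM implementation: `ToyUser`, `pickSinkBlock` and the `keepDerefAssumes` flag are hypothetical names, blocks are plain integers, and the only rule modeled is that all counted users must live in one block for that block to be a sink target.

```cpp
#include <optional>
#include <vector>

// Toy model of a user of the instruction being sunk (hypothetical type).
struct ToyUser {
  int block;        // id of the block containing the user
  bool droppable;   // e.g. a use in an llvm.assume operand bundle
  bool derefAssume; // droppable use that is a "dereferenceable" assume
};

// Returns the single block that holds all counted users, or nullopt if the
// counted users are spread over several blocks (or there are none).
std::optional<int> pickSinkBlock(const std::vector<ToyUser> &users,
                                 bool keepDerefAssumes) {
  std::optional<int> target;
  for (const ToyUser &u : users) {
    // Droppable users are ignored, unless they are dereferenceable assumes
    // and we want those to count like ordinary users.
    if (u.droppable && !(keepDerefAssumes && u.derefAssume))
      continue;
    if (!target)
      target = u.block;
    else if (*target != u.block)
      return std::nullopt; // users in different blocks: no sink target
  }
  return target;
}
```

In this model, a dereferenceable assume in the defining block keeps the instruction there, while one located in the same block as the other users still lets the instruction move into that block.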

Member

Did you forget to push the changes?

Contributor Author

Ah yes, sorry about that. It should be pushed now.

fhahn added 3 commits November 8, 2025 14:52
@fhahn fhahn force-pushed the ic-sinking-with-deref-assumptions branch from cdcf6e8 to ae29bfe Compare November 8, 2025 14:54
Member

@dtcxzyw dtcxzyw left a comment

LGTM.

@fhahn fhahn enabled auto-merge (squash) November 9, 2025 20:40
@fhahn fhahn merged commit 700b77b into llvm:main Nov 9, 2025
9 of 10 checks passed
@fhahn fhahn deleted the ic-sinking-with-deref-assumptions branch November 9, 2025 21:23
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Nov 9, 2025
…f assumptions. (#166945)

PR: llvm/llvm-project#166945
fhahn added a commit that referenced this pull request Nov 10, 2025
…166947)

This patch adds another run of DropUnnecessaryAssumes after
vectorization, to clean up assumes that are no longer needed after this
point.

The main example of such an assume is currently dereferenceable
assumptions. This complements
#166945, which avoids sinking
code if it would mean removing a dereferenceable assumption.

There are a few additional cases where some unneeded assumes are left
over after vectorization that also get cleaned up.

The main motivation is to work together with
#166945, but there may be a
better solution.

Adding another instance of this pass to the pipeline is not great, but
compile-time impact seems in the noise:
https://llvm-compile-time-tracker.com/compare.php?from=55e71fe08b6406ec7ce2c81ce042e48717acf204&to=85da4ee3a74126f557cdc74c7b40e048dacb3fc4&stat=instructions:u

PR: #166947
llvm-sync bot pushed a commit to arm/arm-toolchain that referenced this pull request Nov 10, 2025
…rization. (#166947)

PR: llvm/llvm-project#166947
fhahn added a commit to fhahn/llvm-project that referenced this pull request Nov 10, 2025
…ns. (llvm#166945)

PR: llvm#166945

(cherry picked from commit 700b77b)