Skip to content

Conversation

mtrofin
Copy link
Member

@mtrofin mtrofin commented Oct 11, 2025

There are 2 cases:

  • either the select​ condition is a vector of bools, case in which we don't currently have a way to represent the per-element branch probabilities anyway;
  • or the select condition is a scalar, for example from a llvm.vector.reduce​. We could potentially try and do more here - if the reduced vector contained conditions from other selects, for instance

In either case, IIUC, chances are the select​ doesn't get lowered to a branch, at least I'm not seeing any evidence of that in an internal complex application (CSFDO + ThinLTO). Seems sufficient to mark the selects are unknown (for profiled functions); since that metadata carries with it the pass name (DEBUG_TYPE​) that marked it as such, we can revisit this if we detect later lowerings of these selects that would have required an actual profile.

Issue #147390

Copy link
Member Author

mtrofin commented Oct 11, 2025

This stack of pull requests is managed by Graphite. Learn more about stacking.

@mtrofin mtrofin force-pushed the users/mtrofin/10-10-_slp_ branch from e2d5f62 to 1236326 Compare October 11, 2025 02:28
@mtrofin mtrofin marked this pull request as ready for review October 11, 2025 02:28
@mtrofin mtrofin changed the title [slp] [slp][profcheck] Mark selects as having unknown profile Oct 11, 2025
@llvmbot
Copy link
Member

llvmbot commented Oct 11, 2025

@llvm/pr-subscribers-vectorizers
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-llvm-ir

Author: Mircea Trofin (mtrofin)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/162960.diff

4 Files Affected:

  • (modified) llvm/include/llvm/IR/IRBuilder.h (+5)
  • (modified) llvm/lib/IR/IRBuilder.cpp (+13)
  • (modified) llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp (+9-6)
  • (modified) llvm/utils/profcheck-xfail.txt (-76)
diff --git a/llvm/include/llvm/IR/IRBuilder.h b/llvm/include/llvm/IR/IRBuilder.h
index 041a4ce112275..dacda0afc7f03 100644
--- a/llvm/include/llvm/IR/IRBuilder.h
+++ b/llvm/include/llvm/IR/IRBuilder.h
@@ -2548,6 +2548,11 @@ class IRBuilderBase {
       std::optional<RoundingMode> Rounding = std::nullopt,
       std::optional<fp::ExceptionBehavior> Except = std::nullopt);
 
+  LLVM_ABI Value *CreateSelectWithUnknownProfile(Value *C, Value *True,
+                                                 Value *False,
+                                                 StringRef PassName,
+                                                 const Twine &Name = "");
+
   LLVM_ABI Value *CreateSelect(Value *C, Value *True, Value *False,
                                const Twine &Name = "",
                                Instruction *MDFrom = nullptr);
diff --git a/llvm/lib/IR/IRBuilder.cpp b/llvm/lib/IR/IRBuilder.cpp
index 614c3a9abb8d0..347bb66cb1145 100644
--- a/llvm/lib/IR/IRBuilder.cpp
+++ b/llvm/lib/IR/IRBuilder.cpp
@@ -25,6 +25,7 @@
 #include "llvm/IR/Module.h"
 #include "llvm/IR/NoFolder.h"
 #include "llvm/IR/Operator.h"
+#include "llvm/IR/ProfDataUtils.h"
 #include "llvm/IR/Statepoint.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Value.h"
@@ -1002,6 +1003,18 @@ CallInst *IRBuilderBase::CreateConstrainedFPCall(
   return C;
 }
 
+Value *IRBuilderBase::CreateSelectWithUnknownProfile(Value *C, Value *True,
+                                                     Value *False,
+                                                     StringRef PassName,
+                                                     const Twine &Name) {
+  Value *Ret = CreateSelectFMF(C, True, False, {}, Name);
+  if (auto *SI = dyn_cast<SelectInst>(Ret)) {
+    setExplicitlyUnknownBranchWeightsIfProfiled(
+        *SI, *SI->getParent()->getParent(), PassName);
+    }
+  return Ret;
+}
+
 Value *IRBuilderBase::CreateSelect(Value *C, Value *True, Value *False,
                                    const Twine &Name, Instruction *MDFrom) {
   return CreateSelectFMF(C, True, False, {}, Name, MDFrom);
diff --git a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
index cfa8d2703592e..08df8faad487c 100644
--- a/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
+++ b/llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
@@ -19414,7 +19414,8 @@ Value *BoUpSLP::vectorizeTree(TreeEntry *E) {
       }
       assert(getNumElements(Cond->getType()) == TrueNumElements &&
              "Cannot vectorize Instruction::Select");
-      Value *V = Builder.CreateSelect(Cond, True, False);
+      Value *V =
+          Builder.CreateSelectWithUnknownProfile(Cond, True, False, DEBUG_TYPE);
       V = FinalShuffle(V, E);
 
       E->VectorizedValue = V;
@@ -23532,18 +23533,19 @@ class HorizontalReduction {
     switch (Kind) {
     case RecurKind::Or: {
       if (UseSelect && OpTy == CmpInst::makeCmpResultType(OpTy))
-        return Builder.CreateSelect(
+        return Builder.CreateSelectWithUnknownProfile(
             LHS, ConstantInt::getAllOnesValue(CmpInst::makeCmpResultType(OpTy)),
-            RHS, Name);
+            RHS, DEBUG_TYPE, Name);
       unsigned RdxOpcode = RecurrenceDescriptor::getOpcode(Kind);
       return Builder.CreateBinOp((Instruction::BinaryOps)RdxOpcode, LHS, RHS,
                                  Name);
     }
     case RecurKind::And: {
       if (UseSelect && OpTy == CmpInst::makeCmpResultType(OpTy))
-        return Builder.CreateSelect(
+        return Builder.CreateSelectWithUnknownProfile(
             LHS, RHS,
-            ConstantInt::getNullValue(CmpInst::makeCmpResultType(OpTy)), Name);
+            ConstantInt::getNullValue(CmpInst::makeCmpResultType(OpTy)),
+            DEBUG_TYPE, Name);
       unsigned RdxOpcode = RecurrenceDescriptor::getOpcode(Kind);
       return Builder.CreateBinOp((Instruction::BinaryOps)RdxOpcode, LHS, RHS,
                                  Name);
@@ -23564,7 +23566,8 @@ class HorizontalReduction {
       if (UseSelect) {
         CmpInst::Predicate Pred = llvm::getMinMaxReductionPredicate(Kind);
         Value *Cmp = Builder.CreateICmp(Pred, LHS, RHS, Name);
-        return Builder.CreateSelect(Cmp, LHS, RHS, Name);
+        return Builder.CreateSelectWithUnknownProfile(Cmp, LHS, RHS, DEBUG_TYPE,
+                                                      Name);
       }
       [[fallthrough]];
     case RecurKind::FMax:
diff --git a/llvm/utils/profcheck-xfail.txt b/llvm/utils/profcheck-xfail.txt
index 092d63d2d40df..42b1293b08aec 100644
--- a/llvm/utils/profcheck-xfail.txt
+++ b/llvm/utils/profcheck-xfail.txt
@@ -1310,82 +1310,6 @@ Transforms/SimpleLoopUnswitch/pr60736.ll
 Transforms/SimpleLoopUnswitch/trivial-unswitch-freeze-individual-conditions.ll
 Transforms/SimpleLoopUnswitch/trivial-unswitch.ll
 Transforms/SimpleLoopUnswitch/trivial-unswitch-logical-and-or.ll
-Transforms/SLPVectorizer/AArch64/gather-root.ll
-Transforms/SLPVectorizer/AArch64/horizontal.ll
-Transforms/SLPVectorizer/AArch64/loadi8.ll
-Transforms/SLPVectorizer/AArch64/phi-node-bitwidt-op-not.ll
-Transforms/SLPVectorizer/AArch64/uselistorder.ll
-Transforms/SLPVectorizer/AArch64/vec3-reorder-reshuffle.ll
-Transforms/SLPVectorizer/AArch64/vectorizable-selects-min-max.ll
-Transforms/SLPVectorizer/AArch64/vectorizable-selects-uniform-cmps.ll
-Transforms/SLPVectorizer/AMDGPU/horizontal-store.ll
-Transforms/SLPVectorizer/bool-logical-op-reduction-with-poison.ll
-Transforms/SLPVectorizer/call-arg-reduced-by-minbitwidth.ll
-Transforms/SLPVectorizer/const-bool-logical-or-reduction.ll
-Transforms/SLPVectorizer/extracts-with-undefs.ll
-Transforms/SLPVectorizer/freeze-signedness-missed.ll
-Transforms/SLPVectorizer/gathered-consecutive-loads-different-types.ll
-Transforms/SLPVectorizer/gather_extract_from_vectorbuild.ll
-Transforms/SLPVectorizer/insert-element-build-vector-const.ll
-Transforms/SLPVectorizer/insert-element-build-vector-inseltpoison.ll
-Transforms/SLPVectorizer/insert-element-build-vector.ll
-Transforms/SLPVectorizer/logical-ops-poisonous-repeated.ll
-Transforms/SLPVectorizer/minbitwidth-node-with-multi-users.ll
-Transforms/SLPVectorizer/minbitwidth-user-not-min.ll
-Transforms/SLPVectorizer/partial-register-extract.ll
-Transforms/SLPVectorizer/reduction-gather-non-scheduled-extracts.ll
-Transforms/SLPVectorizer/reorder-node.ll
-Transforms/SLPVectorizer/reused-buildvector-matching-vectorized-node.ll
-Transforms/SLPVectorizer/revec.ll
-Transforms/SLPVectorizer/RISCV/remarks_cmp_sel_min_max.ll
-Transforms/SLPVectorizer/RISCV/remarks-insert-into-small-vector.ll
-Transforms/SLPVectorizer/RISCV/reordered-interleaved-loads.ll
-Transforms/SLPVectorizer/RISCV/revec.ll
-Transforms/SLPVectorizer/RISCV/select-profitability.ll
-Transforms/SLPVectorizer/RISCV/shuffled-gather-casted.ll
-Transforms/SLPVectorizer/RISCV/unsigned-node-trunc-with-signed-users.ll
-Transforms/SLPVectorizer/slp-deleted-inst.ll
-Transforms/SLPVectorizer/SystemZ/cmp-ptr-minmax.ll
-Transforms/SLPVectorizer/SystemZ/ext-not-resized-op-resized.ll
-Transforms/SLPVectorizer/SystemZ/minbitwidth-trunc.ll
-Transforms/SLPVectorizer/X86/bool-mask.ll
-Transforms/SLPVectorizer/X86/bv-root-part-of-graph.ll
-Transforms/SLPVectorizer/X86/cmp-after-intrinsic-call-minbitwidth.ll
-Transforms/SLPVectorizer/X86/cmp-as-alternate-ops.ll
-Transforms/SLPVectorizer/X86/cmp_sel.ll
-Transforms/SLPVectorizer/X86/crash_7zip.ll
-Transforms/SLPVectorizer/X86/crash_clear_undefs.ll
-Transforms/SLPVectorizer/X86/crash_cmpop.ll
-Transforms/SLPVectorizer/X86/debug-counter.ll
-Transforms/SLPVectorizer/X86/debug-info-salvage.ll
-Transforms/SLPVectorizer/X86/extractelement-single-use-many-nodes.ll
-Transforms/SLPVectorizer/X86/extracts-non-extendable.ll
-Transforms/SLPVectorizer/X86/ext-used-scalar-different-bitwidth.ll
-Transforms/SLPVectorizer/X86/gather-node-same-as-vect-but-order.ll
-Transforms/SLPVectorizer/X86/horizontal-minmax.ll
-Transforms/SLPVectorizer/X86/insert-after-bundle.ll
-Transforms/SLPVectorizer/X86/jumbled-load-multiuse.ll
-Transforms/SLPVectorizer/X86/minbitwidth-icmp-to-trunc.ll
-Transforms/SLPVectorizer/X86/minbw-user-non-sizable.ll
-Transforms/SLPVectorizer/X86/non-load-reduced-as-part-of-bv.ll
-Transforms/SLPVectorizer/X86/ordering-bug.ll
-Transforms/SLPVectorizer/X86/phi-node-bitwidt-op-not.ll
-Transforms/SLPVectorizer/X86/phi-node-reshuffled-part.ll
-Transforms/SLPVectorizer/X86/pr46983.ll
-Transforms/SLPVectorizer/X86/pr49933.ll
-Transforms/SLPVectorizer/X86/propagate_ir_flags.ll
-Transforms/SLPVectorizer/X86/reduction-bool-logic-op-inside.ll
-Transforms/SLPVectorizer/X86/reduction-logical.ll
-Transforms/SLPVectorizer/X86/resized-bv-values-non-power-of2-node.ll
-Transforms/SLPVectorizer/X86/reused-reductions-with-minbitwidth.ll
-Transforms/SLPVectorizer/X86/select-reduction-op.ll
-Transforms/SLPVectorizer/X86/shrink_after_reorder.ll
-Transforms/SLPVectorizer/X86/subvector-minbitwidth-unsigned-value.ll
-Transforms/SLPVectorizer/X86/undef_vect.ll
-Transforms/SLPVectorizer/X86/used-reduced-op.ll
-Transforms/SLPVectorizer/X86/vec3-reorder-reshuffle.ll
-Transforms/SLPVectorizer/X86/vectorize-widest-phis.ll
-Transforms/SLPVectorizer/X86/whole-registers-compare.ll
 Transforms/SROA/addrspacecast.ll
 Transforms/SROA/phi-and-select.ll
 Transforms/SROA/phi-gep.ll

Copy link

github-actions bot commented Oct 11, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@mtrofin mtrofin force-pushed the users/mtrofin/10-10-_slp_ branch from 1236326 to 025aef2 Compare October 11, 2025 02:35
Base automatically changed from users/mtrofin/10-10-_profcheck_update_xfail_list to main October 11, 2025 02:41
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This generally LGTM, though I do wonder whether vector selects shouldn't just be excluded wholesale from profcheck? Vector selects are not going to be converted back to branches.

setExplicitlyUnknownBranchWeightsIfProfiled(
*SI, *SI->getParent()->getParent(), PassName);
}
return Ret;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could use this in 01e19e8 as well -- the tradeoff being having to pass the pass name, but getting to keep IRBuilder usage. The pass name requirement is annoying.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ack.

FWIW, I'm actively thinking how to avoid the explicit name and still get the property we want - namely, identification of origin of "unknowns", which we can use to farm out opportunities for better profiling. If there aren't too many cases, maybe it's not too bad, but I want to have an alternative ready.

@mtrofin
Copy link
Member Author

mtrofin commented Oct 11, 2025

This generally LGTM, though I do wonder whether vector selects shouldn't just be excluded wholesale from profcheck? Vector selects are not going to be converted back to branches.

I was thinking about that too, my worry is that a wholesale exclusion would allow anything to pass in the future (like, a branch). I realize, on one hand, that it should probably not happen in this case, but on the other hand, we're replacing a one-liner with another one-liner.

Copy link
Member

@alexey-bataev alexey-bataev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LG for SLP part

Copy link
Contributor

@boomanaiden154 boomanaiden154 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, although the xfail list looks a bit wonky.

tools/UpdateTestChecks/update_test_checks/various_ir_values_dbgrecords.test
Transforms/AggressiveInstCombine/lower-table-based-cttz-basics.ll
Transforms/AggressiveInstCombine/lower-table-based-cttz-dereferencing-pointer.ll
Transforms/AggressiveInstCombine/lower-table-based-cttz-non-argument-value.ll
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seem to be quite a few removals in here that come from other patches? Might be fixed by a rebase/merge though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants