Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SelectOpt] Add handling for Select-like operations. #77284

Merged
merged 4 commits into from
Jan 22, 2024

Conversation

davemgreen
Copy link
Collaborator

Some operations behave like selects. For example or(zext(c), y) is the same as select(c, y|1, y)` and instcombine can canonicalize the select to the or form. These operations can still be worthwhile converting to branch as opposed to keeping as a select or or instruction.

This patch attempts to add some basic handling for them, creating a SelectLike abstraction in the select optimization pass. The backend can opt into handling or(zext(c),x) as a select if it could be profitable, and the select optimization pass attempts to handle them in much the same way as a select(c, x|1, x). The Or(x, 1) may need to be added as a new instruction, generated as the or is converted to branches.

This helps fix a regression from selects being converted to or's recently.

@llvmbot
Copy link
Collaborator

llvmbot commented Jan 8, 2024

@llvm/pr-subscribers-backend-aarch64

@llvm/pr-subscribers-llvm-analysis

Author: David Green (davemgreen)

Changes

Some operations behave like selects. For example or(zext(c), y) is the same as select(c, y|1, y)` and instcombine can canonicalize the select to the or form. These operations can still be worthwhile converting to branch as opposed to keeping as a select or or instruction.

This patch attempts to add some basic handling for them, creating a SelectLike abstraction in the select optimization pass. The backend can opt into handling or(zext(c),x) as a select if it could be profitable, and the select optimization pass attempts to handle them in much the same way as a select(c, x|1, x). The Or(x, 1) may need to be added as a new instruction, generated as the or is converted to branches.

This helps fix a regression from selects being converted to or's recently.


Patch is 36.50 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/77284.diff

6 Files Affected:

  • (modified) llvm/include/llvm/Analysis/TargetTransformInfo.h (+10)
  • (modified) llvm/include/llvm/Analysis/TargetTransformInfoImpl.h (+4)
  • (modified) llvm/lib/Analysis/TargetTransformInfo.cpp (+5)
  • (modified) llvm/lib/CodeGen/SelectOptimize.cpp (+270-114)
  • (modified) llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h (+10)
  • (modified) llvm/test/CodeGen/AArch64/selectopt.ll (+24-6)
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfo.h b/llvm/include/llvm/Analysis/TargetTransformInfo.h
index 048912beaba5a1..06a18616cc69f3 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfo.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfo.h
@@ -934,6 +934,12 @@ class TargetTransformInfo {
   /// Should the Select Optimization pass be enabled and ran.
   bool enableSelectOptimize() const;
 
+  /// Should the Select Optimization pass treat the given instruction like a
+  /// select, potentially converting it to a conditional branch. This can
+  /// include select-like instructions like or(zext(c), x) that can be converted
+  /// to selects.
+  bool shouldTreatInstructionLikeSelect(Instruction *I) const;
+
   /// Enable matching of interleaved access groups.
   bool enableInterleavedAccessVectorization() const;
 
@@ -1875,6 +1881,7 @@ class TargetTransformInfo::Concept {
   virtual MemCmpExpansionOptions
   enableMemCmpExpansion(bool OptSize, bool IsZeroCmp) const = 0;
   virtual bool enableSelectOptimize() = 0;
+  virtual bool shouldTreatInstructionLikeSelect(Instruction *I) = 0;
   virtual bool enableInterleavedAccessVectorization() = 0;
   virtual bool enableMaskedInterleavedAccessVectorization() = 0;
   virtual bool isFPVectorizationPotentiallyUnsafe() = 0;
@@ -2411,6 +2418,9 @@ class TargetTransformInfo::Model final : public TargetTransformInfo::Concept {
   bool enableSelectOptimize() override {
     return Impl.enableSelectOptimize();
   }
+  bool shouldTreatInstructionLikeSelect(Instruction *I) override {
+    return Impl.shouldTreatInstructionLikeSelect(I);
+  }
   bool enableInterleavedAccessVectorization() override {
     return Impl.enableInterleavedAccessVectorization();
   }
diff --git a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
index 2be7256423e422..6bfc747d52c1ab 100644
--- a/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
+++ b/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
@@ -378,6 +378,10 @@ class TargetTransformInfoImplBase {
 
   bool enableSelectOptimize() const { return true; }
 
+  bool shouldTreatInstructionLikeSelect(Instruction *I) {
+    return isa<SelectInst>(I);
+  }
+
   bool enableInterleavedAccessVectorization() const { return false; }
 
   bool enableMaskedInterleavedAccessVectorization() const { return false; }
diff --git a/llvm/lib/Analysis/TargetTransformInfo.cpp b/llvm/lib/Analysis/TargetTransformInfo.cpp
index 67246afa23147a..3aa2f741c22df0 100644
--- a/llvm/lib/Analysis/TargetTransformInfo.cpp
+++ b/llvm/lib/Analysis/TargetTransformInfo.cpp
@@ -600,6 +600,11 @@ bool TargetTransformInfo::enableSelectOptimize() const {
   return TTIImpl->enableSelectOptimize();
 }
 
+bool TargetTransformInfo::shouldTreatInstructionLikeSelect(
+    Instruction *I) const {
+  return TTIImpl->shouldTreatInstructionLikeSelect(I);
+}
+
 bool TargetTransformInfo::enableInterleavedAccessVectorization() const {
   return TTIImpl->enableInterleavedAccessVectorization();
 }
diff --git a/llvm/lib/CodeGen/SelectOptimize.cpp b/llvm/lib/CodeGen/SelectOptimize.cpp
index 1316919e65dacc..01bc3e62eb181c 100644
--- a/llvm/lib/CodeGen/SelectOptimize.cpp
+++ b/llvm/lib/CodeGen/SelectOptimize.cpp
@@ -42,6 +42,7 @@
 #include <stack>
 
 using namespace llvm;
+using namespace llvm::PatternMatch;
 
 #define DEBUG_TYPE "select-optimize"
 
@@ -114,12 +115,6 @@ class SelectOptimizeImpl {
   PreservedAnalyses run(Function &F, FunctionAnalysisManager &FAM);
   bool runOnFunction(Function &F, Pass &P);
 
-private:
-  // Select groups consist of consecutive select instructions with the same
-  // condition.
-  using SelectGroup = SmallVector<SelectInst *, 2>;
-  using SelectGroups = SmallVector<SelectGroup, 2>;
-
   using Scaled64 = ScaledNumber<uint64_t>;
 
   struct CostInfo {
@@ -129,6 +124,151 @@ class SelectOptimizeImpl {
     Scaled64 NonPredCost;
   };
 
+  /// SelectLike is an abstraction over SelectInst and other operations that can
+  /// act like selects. For example Or(Zext(icmp), X) can be treated like
+  /// select(icmp, X|1, X).
+  class SelectLike {
+  private:
+    SelectLike(Instruction *SI) : SI(SI) {}
+
+    Instruction *SI;
+
+  public:
+    /// Match a select or select-like instruction, returning a SelectLike.
+    static SelectLike match(Instruction *I) {
+      // Select instruction are what we are usually looking for. If the select
+      // is a logical-and/logical-or then it is better treated as a and/or by
+      // the backend.
+      if (isa<SelectInst>(I) &&
+          !PatternMatch::match(I,
+                               m_CombineOr(m_LogicalAnd(m_Value(), m_Value()),
+                                           m_LogicalOr(m_Value(), m_Value()))))
+        return SelectLike(I);
+
+      // An Or(zext(i1 X), Y) can also be treated like a select, with condition
+      // C and values Y|1 and Y.
+      Value *X;
+      if (PatternMatch::match(
+              I, m_c_Or(m_OneUse(m_ZExt(m_Value(X))), m_Value())) &&
+          X->getType()->isIntegerTy(1))
+        return SelectLike(I);
+
+      return SelectLike(nullptr);
+    }
+
+    bool isValid() { return SI; }
+    operator bool() { return isValid(); }
+
+    Instruction *getSI() { return SI; }
+    const Instruction *getSI() const { return SI; }
+
+    Type *getType() const { return SI->getType(); }
+
+    /// Return the condition for the SelectLike instruction. For example the
+    /// condition of a select or c in `or(zext(c), x)`
+    Value *getCondition() const {
+      if (auto *Sel = dyn_cast<SelectInst>(SI))
+        return Sel->getCondition();
+      // Or(zext) case
+      if (auto *BO = dyn_cast<BinaryOperator>(SI)) {
+        Value *X;
+        if (PatternMatch::match(BO->getOperand(0),
+                                m_OneUse(m_ZExt(m_Value(X)))))
+          return X;
+        if (PatternMatch::match(BO->getOperand(1),
+                                m_OneUse(m_ZExt(m_Value(X)))))
+          return X;
+      }
+
+      llvm_unreachable("Unhandled case in getCondition");
+    }
+
+    /// Return the true value for the SelectLike instruction. Note this may not
+    /// exist for all SelectLike instructions. For example, for `or(zext(c), x)`
+    /// the true value would be `or(x,1)`. As this value does not exist, nullptr
+    /// is returned.
+    Value *getTrueValue() const {
+      if (auto *Sel = dyn_cast<SelectInst>(SI))
+        return Sel->getTrueValue();
+      // Or(zext) case - The true value is Or(X), so return nullptr as the value
+      // does not yet exist.
+      if (isa<BinaryOperator>(SI))
+        return nullptr;
+
+      llvm_unreachable("Unhandled case in getTrueValue");
+    }
+
+    /// Return the false value for the SelectLike instruction. For example the
+    /// getFalseValue of a select or `x` in `or(zext(c), x)` (which is
+    /// `select(c, x|1, x)`)
+    Value *getFalseValue() const {
+      if (auto *Sel = dyn_cast<SelectInst>(SI))
+        return Sel->getFalseValue();
+      // Or(zext) case - return the operand which is not the zext.
+      if (auto *BO = dyn_cast<BinaryOperator>(SI)) {
+        Value *X;
+        if (PatternMatch::match(BO->getOperand(0),
+                                m_OneUse(m_ZExt(m_Value(X)))))
+          return BO->getOperand(1);
+        if (PatternMatch::match(BO->getOperand(1),
+                                m_OneUse(m_ZExt(m_Value(X)))))
+          return BO->getOperand(0);
+      }
+
+      llvm_unreachable("Unhandled case in getFalseValue");
+    }
+
+    /// Return the NonPredCost cost of the true op, given the costs in
+    /// InstCostMap. This may need to be generated for select-like instructions.
+    Scaled64 getTrueOpCost(DenseMap<const Instruction *, CostInfo> &InstCostMap,
+                           const TargetTransformInfo *TTI) {
+      if (auto *Sel = dyn_cast<SelectInst>(SI))
+        if (auto *I = dyn_cast<Instruction>(Sel->getTrueValue()))
+          return InstCostMap.contains(I) ? InstCostMap[I].NonPredCost
+                                         : Scaled64::getZero();
+
+      // Or case - add the cost of an extra Or to the cost of the False case.
+      if (isa<BinaryOperator>(SI))
+        if (auto I = dyn_cast<Instruction>(getFalseValue()))
+          if (InstCostMap.contains(I)) {
+            InstructionCost OrCost = TTI->getArithmeticInstrCost(
+                Instruction::Or, I->getType(), TargetTransformInfo::TCK_Latency,
+                {TargetTransformInfo::OK_AnyValue,
+                 TargetTransformInfo::OP_None},
+                {TTI::OK_UniformConstantValue, TTI::OP_PowerOf2});
+            return InstCostMap[I].NonPredCost +
+                   Scaled64::get(*OrCost.getValue());
+          }
+
+      return Scaled64::getZero();
+    }
+
+    /// Return the NonPredCost cost of the false op, given the costs in
+    /// InstCostMap. This may need to be generated for select-like instructions.
+    Scaled64
+    getFalseOpCost(DenseMap<const Instruction *, CostInfo> &InstCostMap,
+                   const TargetTransformInfo *TTI) {
+      if (auto *Sel = dyn_cast<SelectInst>(SI))
+        if (auto *I = dyn_cast<Instruction>(Sel->getFalseValue()))
+          return InstCostMap.contains(I) ? InstCostMap[I].NonPredCost
+                                         : Scaled64::getZero();
+
+      // Or case - return the cost of the false case
+      if (isa<BinaryOperator>(SI))
+        if (auto I = dyn_cast<Instruction>(getFalseValue()))
+          if (InstCostMap.contains(I))
+            return InstCostMap[I].NonPredCost;
+
+      return Scaled64::getZero();
+    }
+  };
+
+private:
+  // Select groups consist of consecutive select instructions with the same
+  // condition.
+  using SelectGroup = SmallVector<SelectLike, 2>;
+  using SelectGroups = SmallVector<SelectGroup, 2>;
+
   // Converts select instructions of a function to conditional jumps when deemed
   // profitable. Returns true if at least one select was converted.
   bool optimizeSelects(Function &F);
@@ -156,12 +296,12 @@ class SelectOptimizeImpl {
 
   // Determines if a select group should be converted to a branch (base
   // heuristics).
-  bool isConvertToBranchProfitableBase(const SmallVector<SelectInst *, 2> &ASI);
+  bool isConvertToBranchProfitableBase(const SelectGroup &ASI);
 
   // Returns true if there are expensive instructions in the cold value
   // operand's (if any) dependence slice of any of the selects of the given
   // group.
-  bool hasExpensiveColdOperand(const SmallVector<SelectInst *, 2> &ASI);
+  bool hasExpensiveColdOperand(const SelectGroup &ASI);
 
   // For a given source instruction, collect its backwards dependence slice
   // consisting of instructions exclusively computed for producing the operands
@@ -170,7 +310,7 @@ class SelectOptimizeImpl {
                              Instruction *SI, bool ForSinking = false);
 
   // Returns true if the condition of the select is highly predictable.
-  bool isSelectHighlyPredictable(const SelectInst *SI);
+  bool isSelectHighlyPredictable(SelectLike SI);
 
   // Loop-level checks to determine if a non-predicated version (with branches)
   // of the given loop is more profitable than its predicated version.
@@ -189,14 +329,14 @@ class SelectOptimizeImpl {
   std::optional<uint64_t> computeInstCost(const Instruction *I);
 
   // Returns the misprediction cost of a given select when converted to branch.
-  Scaled64 getMispredictionCost(const SelectInst *SI, const Scaled64 CondCost);
+  Scaled64 getMispredictionCost(SelectLike SI, const Scaled64 CondCost);
 
   // Returns the cost of a branch when the prediction is correct.
   Scaled64 getPredictedPathCost(Scaled64 TrueCost, Scaled64 FalseCost,
-                                const SelectInst *SI);
+                                SelectLike SI);
 
   // Returns true if the target architecture supports lowering a given select.
-  bool isSelectKindSupported(SelectInst *SI);
+  bool isSelectKindSupported(SelectLike SI);
 };
 
 class SelectOptimize : public FunctionPass {
@@ -368,15 +508,24 @@ void SelectOptimizeImpl::optimizeSelectsInnerLoops(Function &F,
 /// select instructions in \p Selects, look through the defining select
 /// instruction until the true/false value is not defined in \p Selects.
 static Value *
-getTrueOrFalseValue(SelectInst *SI, bool isTrue,
-                    const SmallPtrSet<const Instruction *, 2> &Selects) {
+getTrueOrFalseValue(SelectOptimizeImpl::SelectLike SI, bool isTrue,
+                    const SmallPtrSet<const Instruction *, 2> &Selects,
+                    IRBuilder<> &IB) {
   Value *V = nullptr;
-  for (SelectInst *DefSI = SI; DefSI != nullptr && Selects.count(DefSI);
+  for (SelectInst *DefSI = dyn_cast<SelectInst>(SI.getSI());
+       DefSI != nullptr && Selects.count(DefSI);
        DefSI = dyn_cast<SelectInst>(V)) {
-    assert(DefSI->getCondition() == SI->getCondition() &&
+    assert(DefSI->getCondition() == SI.getCondition() &&
            "The condition of DefSI does not match with SI");
     V = (isTrue ? DefSI->getTrueValue() : DefSI->getFalseValue());
   }
+
+  if (isa<BinaryOperator>(SI.getSI())) {
+    V = SI.getFalseValue();
+    if (isTrue)
+      V = IB.CreateOr(V, ConstantInt::get(V->getType(), 1));
+  }
+
   assert(V && "Failed to get select true/false value");
   return V;
 }
@@ -424,20 +573,22 @@ void SelectOptimizeImpl::convertProfitableSIGroups(SelectGroups &ProfSIGroups) {
     SmallVector<std::stack<Instruction *>, 2> TrueSlices, FalseSlices;
     typedef std::stack<Instruction *>::size_type StackSizeType;
     StackSizeType maxTrueSliceLen = 0, maxFalseSliceLen = 0;
-    for (SelectInst *SI : ASI) {
+    for (SelectLike SI : ASI) {
       // For each select, compute the sinkable dependence chains of the true and
       // false operands.
-      if (auto *TI = dyn_cast<Instruction>(SI->getTrueValue())) {
+      if (auto *TI = dyn_cast_or_null<Instruction>(SI.getTrueValue())) {
         std::stack<Instruction *> TrueSlice;
-        getExclBackwardsSlice(TI, TrueSlice, SI, true);
+        getExclBackwardsSlice(TI, TrueSlice, SI.getSI(), true);
         maxTrueSliceLen = std::max(maxTrueSliceLen, TrueSlice.size());
         TrueSlices.push_back(TrueSlice);
       }
-      if (auto *FI = dyn_cast<Instruction>(SI->getFalseValue())) {
-        std::stack<Instruction *> FalseSlice;
-        getExclBackwardsSlice(FI, FalseSlice, SI, true);
-        maxFalseSliceLen = std::max(maxFalseSliceLen, FalseSlice.size());
-        FalseSlices.push_back(FalseSlice);
+      if (auto *FI = dyn_cast_or_null<Instruction>(SI.getFalseValue())) {
+        if (isa<SelectInst>(SI.getSI()) || !FI->hasOneUse()) {
+          std::stack<Instruction *> FalseSlice;
+          getExclBackwardsSlice(FI, FalseSlice, SI.getSI(), true);
+          maxFalseSliceLen = std::max(maxFalseSliceLen, FalseSlice.size());
+          FalseSlices.push_back(FalseSlice);
+        }
       }
     }
     // In the case of multiple select instructions in the same group, the order
@@ -469,10 +620,10 @@ void SelectOptimizeImpl::convertProfitableSIGroups(SelectGroups &ProfSIGroups) {
     }
 
     // We split the block containing the select(s) into two blocks.
-    SelectInst *SI = ASI.front();
-    SelectInst *LastSI = ASI.back();
-    BasicBlock *StartBlock = SI->getParent();
-    BasicBlock::iterator SplitPt = ++(BasicBlock::iterator(LastSI));
+    SelectLike SI = ASI.front();
+    SelectLike LastSI = ASI.back();
+    BasicBlock *StartBlock = SI.getSI()->getParent();
+    BasicBlock::iterator SplitPt = ++(BasicBlock::iterator(LastSI.getSI()));
     BasicBlock *EndBlock = StartBlock->splitBasicBlock(SplitPt, "select.end");
     BFI->setBlockFreq(EndBlock, BFI->getBlockFreq(StartBlock));
     // Delete the unconditional branch that was just created by the split.
@@ -481,8 +632,8 @@ void SelectOptimizeImpl::convertProfitableSIGroups(SelectGroups &ProfSIGroups) {
     // Move any debug/pseudo instructions that were in-between the select
     // group to the newly-created end block.
     SmallVector<Instruction *, 2> DebugPseudoINS;
-    auto DIt = SI->getIterator();
-    while (&*DIt != LastSI) {
+    auto DIt = SI.getSI()->getIterator();
+    while (&*DIt != LastSI.getSI()) {
       if (DIt->isDebugOrPseudoInst())
         DebugPseudoINS.push_back(&*DIt);
       DIt++;
@@ -496,18 +647,19 @@ void SelectOptimizeImpl::convertProfitableSIGroups(SelectGroups &ProfSIGroups) {
     BasicBlock *TrueBlock = nullptr, *FalseBlock = nullptr;
     BranchInst *TrueBranch = nullptr, *FalseBranch = nullptr;
     if (!TrueSlicesInterleaved.empty()) {
-      TrueBlock = BasicBlock::Create(LastSI->getContext(), "select.true.sink",
+      TrueBlock = BasicBlock::Create(EndBlock->getContext(), "select.true.sink",
                                      EndBlock->getParent(), EndBlock);
       TrueBranch = BranchInst::Create(EndBlock, TrueBlock);
-      TrueBranch->setDebugLoc(LastSI->getDebugLoc());
+      TrueBranch->setDebugLoc(LastSI.getSI()->getDebugLoc());
       for (Instruction *TrueInst : TrueSlicesInterleaved)
         TrueInst->moveBefore(TrueBranch);
     }
     if (!FalseSlicesInterleaved.empty()) {
-      FalseBlock = BasicBlock::Create(LastSI->getContext(), "select.false.sink",
-                                      EndBlock->getParent(), EndBlock);
+      FalseBlock =
+          BasicBlock::Create(EndBlock->getContext(), "select.false.sink",
+                             EndBlock->getParent(), EndBlock);
       FalseBranch = BranchInst::Create(EndBlock, FalseBlock);
-      FalseBranch->setDebugLoc(LastSI->getDebugLoc());
+      FalseBranch->setDebugLoc(LastSI.getSI()->getDebugLoc());
       for (Instruction *FalseInst : FalseSlicesInterleaved)
         FalseInst->moveBefore(FalseBranch);
     }
@@ -517,10 +669,10 @@ void SelectOptimizeImpl::convertProfitableSIGroups(SelectGroups &ProfSIGroups) {
       assert(TrueBlock == nullptr &&
              "Unexpected basic block transform while optimizing select");
 
-      FalseBlock = BasicBlock::Create(SI->getContext(), "select.false",
+      FalseBlock = BasicBlock::Create(StartBlock->getContext(), "select.false",
                                       EndBlock->getParent(), EndBlock);
       auto *FalseBranch = BranchInst::Create(EndBlock, FalseBlock);
-      FalseBranch->setDebugLoc(SI->getDebugLoc());
+      FalseBranch->setDebugLoc(SI.getSI()->getDebugLoc());
     }
 
     // Insert the real conditional branch based on the original condition.
@@ -541,44 +693,36 @@ void SelectOptimizeImpl::convertProfitableSIGroups(SelectGroups &ProfSIGroups) {
       TT = TrueBlock;
       FT = FalseBlock;
     }
-    IRBuilder<> IB(SI);
-    auto *CondFr =
-        IB.CreateFreeze(SI->getCondition(), SI->getName() + ".frozen");
-    IB.CreateCondBr(CondFr, TT, FT, SI);
+    IRBuilder<> IB(SI.getSI());
+    auto *CondFr = IB.CreateFreeze(SI.getCondition(),
+                                   SI.getCondition()->getName() + ".frozen");
 
     SmallPtrSet<const Instruction *, 2> INS;
-    INS.insert(ASI.begin(), ASI.end());
+    for (auto SI : ASI)
+      INS.insert(SI.getSI());
+
     // Use reverse iterator because later select may use the value of the
     // earlier select, and we need to propagate value through earlier select
     // to get the PHI operand.
     for (auto It = ASI.rbegin(); It != ASI.rend(); ++It) {
-      SelectInst *SI = *It;
+      SelectLike SI = *It;
       // The select itself is replaced with a PHI Node.
-      PHINode *PN = PHINode::Create(SI->getType(), 2, "");
+      PHINode *PN = PHINode::Create(SI.getType(), 2, "");
       PN->insertBefore(EndBlock->begin());
-      PN->takeName(SI);
-      PN->addIncoming(getTrueOrFalseValue(SI, true, INS), TrueBlock);
-      PN->addIncoming(getTrueOrFalseValue(SI, false, INS), FalseBlock);
-      PN->setDebugLoc(SI->getDebugLoc());
-
-      SI->replaceAllUsesWith(PN);
-      SI->eraseFromParent();
-      INS.erase(SI);
+      PN->takeName(SI.getSI());
+      PN->addIncoming(getTrueOrFalseValue(SI, true, INS, IB), TrueBlock);
+      PN->addIncoming(getTrueOrFalseValue(SI, false, INS, IB), FalseBlock);
+      PN->setDebugLoc(SI.getSI()->getDebugLoc());
+      SI.getSI()->replaceAllUsesWith(PN);
+      INS.erase(SI.getSI());
       ++NumSelectsConverted;
     }
-  }
-}
-
-static bool isSpecialSelect(SelectInst *SI) {
-  using namespace llvm::PatternMatch;
+    IB.CreateCondBr(CondFr, TT, FT, SI.getSI());
 
-  // If the selec...
[truncated]

@davemgreen
Copy link
Collaborator Author

I've pushed a rebase to try and move away from the failing CI tests.

Copy link
Contributor

@david-arm david-arm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a couple of comments so far, but will review the SelectOptimize code soon as well!

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h Outdated Show resolved Hide resolved
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h Outdated Show resolved Hide resolved
llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved
llvm/lib/CodeGen/SelectOptimize.cpp Show resolved Hide resolved
llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h Outdated Show resolved Hide resolved
Copy link
Contributor

@sapostolakis sapostolakis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. This sounds like a good idea.
Left some comments.
Main issue is that I would like to have a flag that could disable, if needed, the select-like (but not proper select) part. The rest of the comments are nits.

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h Outdated Show resolved Hide resolved
llvm/lib/CodeGen/SelectOptimize.cpp Show resolved Hide resolved
llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved
llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved
@sjoerdmeijer
Copy link
Collaborator

Thanks for working on this! I will take this patch, see how it behaves on the Neoverse V2, and will let you know soon.

@sjoerdmeijer
Copy link
Collaborator

sjoerdmeijer commented Jan 17, 2024

This improves MCF from SPEC INT 2017 by more than 9% on the Neoverse V2, which is great!

@sjoerdmeijer
Copy link
Collaborator

sjoerdmeijer commented Jan 17, 2024

I might be wrong, but I suspect we could do with some more test-cases, especially negative tests when this shouldn't trigger. From a quick look, things like when the pattern doesn't exactly matches, one use, dbg instructions, if it is not a terminator, etc.

llvm/lib/CodeGen/SelectOptimize.cpp Outdated Show resolved Hide resolved
// An Or(zext(i1 X), Y) can also be treated like a select, with condition
// C and values Y|1 and Y.
Value *X;
if (PatternMatch::match(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is PatternMatch:: needed with using namespace llvm::PatternMatch; above?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, It needs to not match the match() function in this class.

Copy link
Collaborator Author

@davemgreen davemgreen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I might be wrong, but I suspect we could do with some more test-cases, especially negative tests when this shouldn't trigger. From a quick look, things like when the pattern doesn't exactly matches, one use, dbg instructions, if it is not a terminator, etc.

There should be tests for one uses select groups and the positions in selectopt.ll. See 365fbbf. Debug info tests we would not usually add unless we had good reason, they can be quite bulky, but I've added a run line. Anything else specific you think needs adding?

// An Or(zext(i1 X), Y) can also be treated like a select, with condition
// C and values Y|1 and Y.
Value *X;
if (PatternMatch::match(
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, It needs to not match the match() function in this class.

@sjoerdmeijer
Copy link
Collaborator

I might be wrong, but I suspect we could do with some more test-cases, especially negative tests when this shouldn't trigger. From a quick look, things like when the pattern doesn't exactly matches, one use, dbg instructions, if it is not a terminator, etc.

There should be tests for one uses select groups and the positions in selectopt.ll. See 365fbbf. Debug info tests we would not usually add unless we had good reason, they can be quite bulky, but I've added a run line. Anything else specific you think needs adding?

Ok, thanks for clarifying, looks fine then.

Copy link
Collaborator

@sjoerdmeijer sjoerdmeijer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good change to me, but give it a few days in case there are more comments.

@sapostolakis
Copy link
Contributor

LGTM as well. No further comments from me.

Some operations behave like selects. For example `or(zext(c), y)` is the same
as select(c, y|1, y)` and instcombine can canonicalize the select to the or
form. These operations can still be worthwhile converting to branch as opposed
to keeping as a select or Or instruction.

This patch attempts to add some basic handling for them, creating a SelectLike
abstraction in the select optimization pass. The backend can opt into handling
`or(zext(c),x)` as a select if it could be profitable, and the select
optimization pass attempts to handle them in much the same way as a
`select(c, x|1, x)`. The Or(x, 1) may need to be added as a new instruction,
generated as the or is converted to branches.

This helps fix a regression from selects being converted to or's recently.
@davemgreen davemgreen merged commit a2d68b4 into llvm:main Jan 22, 2024
3 of 4 checks passed
@davemgreen davemgreen deleted the gh-selectoptor branch January 22, 2024 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants