
[InstCombine] Improve handling of not and free inversion. #66787

Closed

Conversation

goldsteinn
Contributor

@goldsteinn goldsteinn commented Sep 19, 2023

  • [InstCombine] Add tests for folding (X + Y) - (W + Z); NFC
  • [InstCombine] Add folds for (X + Y) - (W + Z)
  • [InstCombine] Make isFreeToInvert check recursively.
  • [InstCombine] Add additional tests for free inversion; NFC
  • [InstCombine] add getFreeInverted to perform folds for free inversion of op
  • [InstCombine] Replace isFreeToInvert + CreateNot with getFreelyInverted
  • [InstCombine] Add transform for (~X + ~Y) -> -2 - Y - X
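The arithmetic behind the not-related bullets can be sanity-checked outside of LLVM. This is a hedged standalone sketch (plain Python modeling i8 with a 0xFF mask, not part of the patch) of the identities the folds rely on, all following from ~V == -1 - V in two's complement:

```python
MASK = 0xFF  # model i8 wrapping arithmetic

def bnot(v):
    # ~V == -1 - V (mod 256)
    return ~v & MASK

C = 123  # an arbitrary immediate constant
for x in range(256):
    for y in range(256):
        # ~X + ~Y  ->  -2 - Y - X
        assert (bnot(x) + bnot(y)) & MASK == (-2 - y - x) & MASK
    # ~(X + C)  ->  (-1 - C) - X
    assert bnot((x + C) & MASK) == ((-1 - C) - x) & MASK
    # ~(C - X)  ->  X + (-1 - C)
    assert bnot((C - x) & MASK) == (x + (-1 - C)) & MASK
```

Each identity holds for every 8-bit value, which is why the inversion is "free": no extra instruction is needed once the constant is folded.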

@goldsteinn goldsteinn requested review from arsenm, rotateright, nikic and RKSimon and removed request for arsenm September 19, 2023 16:08
@RKSimon
Collaborator

RKSimon commented Sep 19, 2023

I'm curious how you intend to commit this stack of PRs? Are you intending to squash + merge?

@goldsteinn
Contributor Author

I'm curious how you intend to commit this stack of PRs? Are you intending to squash + merge?

I was/am intending to push each commit discretely.
As they are approved, I generally create patches, apply them to main, then push.
Not the best workflow for PRs?
Is there a better way to create a PR with many independent commits?

@goldsteinn goldsteinn changed the title goldsteinn/and canonicalization [InstCombine] Improve (icmp pred (and X, Y), ...) fold. Sep 21, 2023
@RKSimon
Collaborator

RKSimon commented Sep 21, 2023

I'm still trying to work out how best to create stacked PRs (especially when adding extra tests and wanting to show the diff) - everyone seems to have their own opinion on them right now :)

@@ -245,32 +246,38 @@ class LLVM_LIBRARY_VISIBILITY InstCombiner {
if (match(V, PatternMatch::m_AnyIntegralConstant()))
return true;

if (Depth++ >= MaxAnalysisRecursionDepth)
Collaborator

(style) For recursive methods, add Depth on the recursive calls themselves (i.e. isFreeToInvert(X, Y, Depth + 1)).

Contributor Author

This is something there is no consistency on across LLVM. I know in SelectionDAG we pass Depth + 1 at each call site, but somewhere like ValueTracking we use Depth++ on entry and then pass Depth along.

Personally I like the Depth++ way because you don't have to type Depth + 1 everywhere (easier to not mess up imo), and it seems more consistent with what I see elsewhere in InstCombine / the middle end.

Unless you feel strongly, I would prefer to keep it as is.
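As a standalone illustration of the two depth-limiting conventions being discussed (plain Python, not LLVM code; the names and `MAX_DEPTH` value here are made up stand-ins for MaxAnalysisRecursionDepth):

```python
MAX_DEPTH = 4  # stand-in for MaxAnalysisRecursionDepth

# Style A ("SelectionDAG-like"): increment at the call site, Depth + 1.
def depth_limited_sum_a(xs, depth=0):
    if depth >= MAX_DEPTH or not xs:
        return 0
    return xs[0] + depth_limited_sum_a(xs[1:], depth + 1)

# Style B ("ValueTracking-like"): bump once on entry (Depth++), then every
# recursive call below reuses the already-bumped value.
def depth_limited_sum_b(xs, depth=0):
    if depth >= MAX_DEPTH or not xs:
        return 0
    depth += 1
    return xs[0] + depth_limited_sum_b(xs[1:], depth)

# Both conventions cut off after the same number of levels.
data = [1, 2, 3, 4, 5, 6]
assert depth_limited_sum_a(data) == depth_limited_sum_b(data) == 1 + 2 + 3 + 4
```

The two are behaviorally identical; the trade-off is purely about where the `+ 1` lives (one bump per function body versus one per call site).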


@goldsteinn
Contributor Author

I'm still trying to work out how best to create stacked PRs (especially when adding extra tests and wanting to show the diff) - everyone seems to have their own opinion on them right now :)

I don't really know either. If you know a better way than this, I'm all ears.

What might improve the situation is if we could create branches on the llvm-project repo, i.e. users/<gh-username>/<branch-name>. Then we could create a series of PRs, i.e.
main <- users/<user>/feature-patch-0 <- users/<user>/feature-patch-1 <- ... <- users/<user>/feature-patch-N

@goldsteinn
Contributor Author

ping.

@goldsteinn goldsteinn force-pushed the goldsteinn/and-canonicalization branch from 216da55 to 0e6a750 Compare October 3, 2023 20:04
@goldsteinn
Contributor Author

ping.

@goldsteinn goldsteinn force-pushed the goldsteinn/and-canonicalization branch from 0e6a750 to 98aa59b Compare October 22, 2023 05:30
@goldsteinn
Contributor Author

ping.

@nikic
Contributor

nikic commented Nov 15, 2023

Only looked at the first three commits, which imho could be a separate PR (maybe drop the rest from this one for now).

I think after adding getFreelyInverted() it would be good to also go through existing users of isFreeToInvert() and replace them with getFreelyInverted() where this is easy, e.g. in foldICmpOrXX(). Directly creating the inverted op instead of letting it get inverted separately seems a lot more robust to me.

@goldsteinn
Contributor Author

Only looked at the first three commits, which imho could be a separate PR (maybe drop the rest from this one for now).

I think after adding getFreelyInverted() it would be good to also go through existing users of isFreeToInvert() and replace them with getFreelyInverted() where this is easy, e.g. in foldICmpOrXX(). Directly creating the inverted op instead of letting it get inverted separately seems a lot more robust to me.

Thank you for the review.

Regarding replacing isFreeToInvert + CreateNot with getFreelyInverted: I do add a fold that uses getFreelyInverted in the not fold. My feeling was that we have places where we create a not as a canonicalization effort, and more places recognize m_Not than isFreeToInvert, so it made sense to let the new pattern be checked by other folds before stripping the not. But I'm not married to that idea.

@llvmbot
Collaborator

llvmbot commented Nov 18, 2023

@llvm/pr-subscribers-llvm-transforms

Author: None (goldsteinn)

Changes
  • [InstCombine] Make isFreeToInvert check recursively.
  • [InstCombine] Add additional tests for free inversion; NFC
  • [InstCombine] add getFreeInverted to perform folds for free inversion of op
  • [InstCombine] Add tests for expanding foldICmpWithLowBitMaskedVal; NFC
  • [InstCombine] Improve mask detection in foldICmpWithLowBitMaskedVal
  • [InstCombine] Recognize (icmp eq/ne (and X, ~Mask), 0) pattern in foldICmpWithLowBitMaskedVal
  • [InstCombine] Make the (icmp eq/ne (and X, Y), X) canonicalization work for non-const operands
  • [InstCombine] Make getKnownSign a member function of InstCombiner; NFC
  • [InstCombine] Add transforms (icmp spred (and X, Y), X) if X or Y are known signed/unsigned
  • [InstCombine] Add tests for folding multiuse (icmp eq/ne (or X, Y), Y); NFC
  • [InstCombine] Folding multiuse (icmp eq/ne (or X, Y), Y) for 2 uses of Y
  • [InstCombine] Add tests for transforming (or/and (icmp eq/ne X,0),(icmp eq/ne X,Pow2OrZero)); NFC
  • [InstCombine] Add transforms for (or/and (icmp eq/ne X,0),(icmp eq/ne X,Pow2OrZero))

Patch is 38.32 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/66787.diff

12 Files Affected:

  • (modified) llvm/include/llvm/Transforms/InstCombine/InstCombiner.h (+96-31)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp (+8-6)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp (+36-19)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (+6-6)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp (+11-10)
  • (modified) llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp (+10-10)
  • (added) llvm/test/Transforms/InstCombine/free-inversion.ll (+377)
  • (modified) llvm/test/Transforms/InstCombine/icmp-of-or-x.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/minmax-intrinsics.ll (+9-10)
  • (modified) llvm/test/Transforms/InstCombine/pr38915.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/xor.ll (+12-12)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-selectandorcost.ll (+1-1)
diff --git a/llvm/include/llvm/Transforms/InstCombine/InstCombiner.h b/llvm/include/llvm/Transforms/InstCombine/InstCombiner.h
index f8b3874267ded3b..fcf72fd5ff9c678 100644
--- a/llvm/include/llvm/Transforms/InstCombine/InstCombiner.h
+++ b/llvm/include/llvm/Transforms/InstCombine/InstCombiner.h
@@ -233,49 +233,114 @@ class LLVM_LIBRARY_VISIBILITY InstCombiner {
                                                 PatternMatch::m_Value()));
   }
 
-  /// Return true if the specified value is free to invert (apply ~ to).
-  /// This happens in cases where the ~ can be eliminated.  If WillInvertAllUses
-  /// is true, work under the assumption that the caller intends to remove all
-  /// uses of V and only keep uses of ~V.
-  ///
-  /// See also: canFreelyInvertAllUsersOf()
-  static bool isFreeToInvert(Value *V, bool WillInvertAllUses) {
+  /// Return nonnull value if V is free to invert (with condition) regarding
+  /// WillInvertAllUses.
+  /// If Builder is nonnull, it will return a simplified ~V
+  /// If Builder is null, it will return an arbitrary nonnull value (not
+  /// dereferenceable).
+  static Value *getFreelyInverted(Value *V, bool WillInvertAllUses,
+                                  BuilderTy *Builder, unsigned Depth = 0) {
+    using namespace llvm::PatternMatch;
+    static Value *const NonNull = reinterpret_cast<Value *>(uintptr_t(1));
     // ~(~(X)) -> X.
-    if (match(V, m_Not(PatternMatch::m_Value())))
-      return true;
+    Value *A, *B;
+    if (match(V, m_Not(m_Value(A))))
+      return A;
 
+    Constant *C;
     // Constants can be considered to be not'ed values.
-    if (match(V, PatternMatch::m_AnyIntegralConstant()))
-      return true;
+    if (match(V, m_ImmConstant(C)))
+      return ConstantExpr::getNot(C);
+
+    if (Depth++ >= MaxAnalysisRecursionDepth)
+      return nullptr;
+
+    // The rest of the cases require that we invert all uses so don't bother
+    // doing the analysis if we know we can't use the result.
+    if (!WillInvertAllUses)
+      return nullptr;
 
     // Compares can be inverted if all of their uses are being modified to use
     // the ~V.
-    if (isa<CmpInst>(V))
-      return WillInvertAllUses;
+    if (auto *I = dyn_cast<CmpInst>(V)) {
+      if (Builder != nullptr)
+        return Builder->CreateCmp(I->getInversePredicate(), I->getOperand(0),
+                                  I->getOperand(1));
+      return NonNull;
+    }
 
     // If `V` is of the form `A + Constant` then `-1 - V` can be folded into
     // `(-1 - Constant) - A` if we are willing to invert all of the uses.
-    if (match(V, m_Add(PatternMatch::m_Value(), PatternMatch::m_ImmConstant())))
-      return WillInvertAllUses;
+    if (match(V, m_Add(m_Value(A), m_Value(B)))) {
+      if (auto *BV = getFreelyInverted(B, B->hasOneUse(), Builder, Depth))
+        return Builder ? Builder->CreateSub(BV, A) : NonNull;
+      if (auto *AV = getFreelyInverted(A, A->hasOneUse(), Builder, Depth))
+        return Builder ? Builder->CreateSub(AV, B) : NonNull;
+      return nullptr;
+    }
+
+    // If `V` is of the form `A ^ ~B` then `~(A ^ ~B)` can be folded
+    // into `A ^ B` if we are willing to invert all of the uses.
+    if (match(V, m_Xor(m_Value(A), m_Value(B)))) {
+      if (auto *BV = getFreelyInverted(B, B->hasOneUse(), Builder, Depth))
+        return Builder ? Builder->CreateXor(A, BV) : NonNull;
+      if (auto *AV = getFreelyInverted(A, A->hasOneUse(), Builder, Depth))
+        return Builder ? Builder->CreateXor(AV, B) : NonNull;
+      return nullptr;
+    }
 
     // If `V` is of the form `Constant - A` then `-1 - V` can be folded into
     // `A + (-1 - Constant)` if we are willing to invert all of the uses.
-    if (match(V, m_Sub(PatternMatch::m_ImmConstant(), PatternMatch::m_Value())))
-      return WillInvertAllUses;
-
-    // Selects with invertible operands are freely invertible
-    if (match(V,
-              m_Select(PatternMatch::m_Value(), m_Not(PatternMatch::m_Value()),
-                       m_Not(PatternMatch::m_Value()))))
-      return WillInvertAllUses;
-
-    // Min/max may be in the form of intrinsics, so handle those identically
-    // to select patterns.
-    if (match(V, m_MaxOrMin(m_Not(PatternMatch::m_Value()),
-                            m_Not(PatternMatch::m_Value()))))
-      return WillInvertAllUses;
-
-    return false;
+    if (match(V, m_Sub(m_Value(A), m_Value(B)))) {
+      if (auto *AV = getFreelyInverted(A, A->hasOneUse(), Builder, Depth))
+        return Builder ? Builder->CreateAdd(AV, B) : NonNull;
+      return nullptr;
+    }
+
+    // If `V` is of the form `(~A) s>> B` then `~((~A) s>> B)` can be folded
+    // into `A s>> B` if we are willing to invert all of the uses.
+    if (match(V, m_AShr(m_Value(A), m_Value(B)))) {
+      if (auto *AV = getFreelyInverted(A, A->hasOneUse(), Builder, Depth))
+        return Builder ? Builder->CreateAShr(AV, B) : NonNull;
+      return nullptr;
+    }
+
+    Value *Cond;
+    // LogicOps are special in that we canonicalize them at the cost of an
+    // instruction.
+    bool IsSelect = match(V, m_Select(m_Value(Cond), m_Value(A), m_Value(B))) &&
+                    !shouldAvoidAbsorbingNotIntoSelect(*cast<SelectInst>(V));
+    // Selects/min/max with invertible operands are freely invertible
+    if (IsSelect || match(V, m_MaxOrMin(m_Value(A), m_Value(B)))) {
+      if (!getFreelyInverted(A, A->hasOneUse(), /*Builder*/ nullptr, Depth))
+        return nullptr;
+      if (Value *NotB = getFreelyInverted(B, B->hasOneUse(), Builder, Depth)) {
+        if (Builder != nullptr) {
+          Value *NotA = getFreelyInverted(A, A->hasOneUse(), Builder, Depth);
+          assert(
+              NotA != nullptr &&
+              "Unable to build inverted value for known freely invertable op");
+          if (auto *II = dyn_cast<IntrinsicInst>(V))
+            return Builder->CreateBinaryIntrinsic(
+                getInverseMinMaxIntrinsic(II->getIntrinsicID()), NotA, NotB);
+          return Builder->CreateSelect(Cond, NotA, NotB);
+        }
+        return NonNull;
+      }
+    }
+
+    return nullptr;
+  }
+
+  /// Return true if the specified value is free to invert (apply ~ to).
+  /// This happens in cases where the ~ can be eliminated.  If WillInvertAllUses
+  /// is true, work under the assumption that the caller intends to remove all
+  /// uses of V and only keep uses of ~V.
+  ///
+  /// See also: canFreelyInvertAllUsersOf()
+  static bool isFreeToInvert(Value *V, bool WillInvertAllUses) {
+    return getFreelyInverted(V, WillInvertAllUses, /*Builder*/ nullptr,
+                             /*Depth*/ 0) != nullptr;
   }
 
   /// Given i1 V, can every user of V be freely adapted if V is changed to !V ?
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
index 318992b55e4f9f8..a7d801c452e9ad7 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
@@ -2198,12 +2198,14 @@ Instruction *InstCombinerImpl::visitSub(BinaryOperator &I) {
   // (~X) - (~Y) --> Y - X
   // This is placed after the other reassociations and explicitly excludes a
   // sub-of-sub pattern to avoid infinite looping.
-  if (isFreeToInvert(Op0, Op0->hasOneUse()) &&
-      isFreeToInvert(Op1, Op1->hasOneUse()) &&
-      !match(Op0, m_Sub(m_ImmConstant(), m_Value()))) {
-    Value *NotOp0 = Builder.CreateNot(Op0);
-    Value *NotOp1 = Builder.CreateNot(Op1);
-    return BinaryOperator::CreateSub(NotOp1, NotOp0);
+  if (!match(Op0, m_Sub(m_ImmConstant(), m_Value())) &&
+      isFreeToInvert(Op0, Op0->hasOneUse())) {
+    if (Value *NotOp1 = getFreelyInverted(Op1, Op1->hasOneUse(), &Builder)) {
+      Value *NotOp0 = getFreelyInverted(Op0, Op0->hasOneUse(), &Builder);
+      assert(NotOp0 != nullptr &&
+             "isFreeToInvert desynced with getFreelyInverted");
+      return BinaryOperator::CreateSub(NotOp1, NotOp0);
+    }
   }
 
   auto m_AddRdx = [](Value *&Vec) {
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
index 46af9bf5eed003a..819bcb81f4a111d 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp
@@ -2503,16 +2503,26 @@ Instruction *InstCombinerImpl::visitAnd(BinaryOperator &I) {
       return BinaryOperator::CreateAnd(Op1, B);
 
     // (A ^ B) & ((B ^ C) ^ A) -> (A ^ B) & ~C
-    if (match(Op0, m_Xor(m_Value(A), m_Value(B))))
-      if (match(Op1, m_Xor(m_Xor(m_Specific(B), m_Value(C)), m_Specific(A))))
-        if (Op1->hasOneUse() || isFreeToInvert(C, C->hasOneUse()))
-          return BinaryOperator::CreateAnd(Op0, Builder.CreateNot(C));
+    if (match(Op0, m_Xor(m_Value(A), m_Value(B)))) {
+      if (match(Op1, m_Xor(m_Xor(m_Specific(B), m_Value(C)), m_Specific(A)))) {
+        Value *NotC = Op1->hasOneUse()
+                          ? Builder.CreateNot(C)
+                          : getFreelyInverted(C, C->hasOneUse(), &Builder);
+        if (NotC != nullptr)
+          return BinaryOperator::CreateAnd(Op0, NotC);
+      }
+    }
 
     // ((A ^ C) ^ B) & (B ^ A) -> (B ^ A) & ~C
-    if (match(Op0, m_Xor(m_Xor(m_Value(A), m_Value(C)), m_Value(B))))
-      if (match(Op1, m_Xor(m_Specific(B), m_Specific(A))))
-        if (Op0->hasOneUse() || isFreeToInvert(C, C->hasOneUse()))
+    if (match(Op0, m_Xor(m_Xor(m_Value(A), m_Value(C)), m_Value(B)))) {
+      if (match(Op1, m_Xor(m_Specific(B), m_Specific(A)))) {
+        Value *NotC = Op0->hasOneUse()
+                          ? Builder.CreateNot(C)
+                          : getFreelyInverted(C, C->hasOneUse(), &Builder);
+        if (NotC != nullptr)
+          return BinaryOperator::CreateAnd(Op1, NotC);
+      }
+    }
 
     // (A | B) & (~A ^ B) -> A & B
     // (A | B) & (B ^ ~A) -> A & B
@@ -3997,14 +4007,14 @@ static Instruction *visitMaskedMerge(BinaryOperator &I,
 static Instruction *sinkNotIntoXor(BinaryOperator &I, Value *X, Value *Y,
                                    InstCombiner::BuilderTy &Builder) {
   // We only want to do the transform if it is free to do.
-  if (InstCombiner::isFreeToInvert(X, X->hasOneUse())) {
-    // Ok, good.
-  } else if (InstCombiner::isFreeToInvert(Y, Y->hasOneUse())) {
+  Value *NotX = InstCombiner::getFreelyInverted(X, X->hasOneUse(), &Builder);
+  if (NotX == nullptr) {
     std::swap(X, Y);
-  } else
+    NotX = InstCombiner::getFreelyInverted(X, X->hasOneUse(), &Builder);
+  }
+  if (NotX == nullptr)
     return nullptr;
 
-  Value *NotX = Builder.CreateNot(X, X->getName() + ".not");
   return BinaryOperator::CreateXor(NotX, Y, I.getName() + ".demorgan");
 }
 
@@ -4317,13 +4327,15 @@ Instruction *InstCombinerImpl::foldNot(BinaryOperator &I) {
   auto *II = dyn_cast<IntrinsicInst>(NotOp);
   if (II && II->hasOneUse()) {
     if (match(NotOp, m_MaxOrMin(m_Value(X), m_Value(Y))) &&
-        isFreeToInvert(X, X->hasOneUse()) &&
-        isFreeToInvert(Y, Y->hasOneUse())) {
-      Intrinsic::ID InvID = getInverseMinMaxIntrinsic(II->getIntrinsicID());
-      Value *NotX = Builder.CreateNot(X);
-      Value *NotY = Builder.CreateNot(Y);
-      Value *InvMaxMin = Builder.CreateBinaryIntrinsic(InvID, NotX, NotY);
-      return replaceInstUsesWith(I, InvMaxMin);
+        isFreeToInvert(X, X->hasOneUse())) {
+      if (Value *NotY = getFreelyInverted(Y, Y->hasOneUse(), &Builder)) {
+        Intrinsic::ID InvID = getInverseMinMaxIntrinsic(II->getIntrinsicID());
+        Value *NotX = getFreelyInverted(X, X->hasOneUse(), &Builder);
+        assert(NotX != nullptr &&
+               "isFreeToInvert desynced with getFreelyInverted");
+        Value *InvMaxMin = Builder.CreateBinaryIntrinsic(InvID, NotX, NotY);
+        return replaceInstUsesWith(I, InvMaxMin);
+      }
     }
     if (match(NotOp, m_c_MaxOrMin(m_Not(m_Value(X)), m_Value(Y)))) {
       Intrinsic::ID InvID = getInverseMinMaxIntrinsic(II->getIntrinsicID());
@@ -4374,6 +4386,11 @@ Instruction *InstCombinerImpl::foldNot(BinaryOperator &I) {
   if (Instruction *NewXor = foldNotXor(I, Builder))
     return NewXor;
 
+  // TODO: Could handle multi-use better by checking if all uses of NotOp (other
+  // than I) can be inverted.
+  if (Value *R = getFreelyInverted(NotOp, NotOp->hasOneUse(), &Builder))
+    return replaceInstUsesWith(I, R);
+
   return nullptr;
 }
 
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
index 64cbfebf0102876..f8346cd03849acf 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp
@@ -1698,12 +1698,12 @@ Instruction *InstCombinerImpl::visitCallInst(CallInst &CI) {
     auto moveNotAfterMinMax = [&](Value *X, Value *Y) -> Instruction * {
       Value *A;
       if (match(X, m_OneUse(m_Not(m_Value(A)))) &&
-          !isFreeToInvert(A, A->hasOneUse()) &&
-          isFreeToInvert(Y, Y->hasOneUse())) {
-        Value *NotY = Builder.CreateNot(Y);
-        Intrinsic::ID InvID = getInverseMinMaxIntrinsic(IID);
-        Value *InvMaxMin = Builder.CreateBinaryIntrinsic(InvID, A, NotY);
-        return BinaryOperator::CreateNot(InvMaxMin);
+          !isFreeToInvert(A, A->hasOneUse())) {
+        if (Value *NotY = getFreelyInverted(Y, Y->hasOneUse(), &Builder)) {
+          Intrinsic::ID InvID = getInverseMinMaxIntrinsic(IID);
+          Value *InvMaxMin = Builder.CreateBinaryIntrinsic(InvID, A, NotY);
+          return BinaryOperator::CreateNot(InvMaxMin);
+        }
       }
       return nullptr;
     };
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 9bc84c7dd6e1539..74865c487c2b2d2 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -3226,10 +3226,12 @@ Instruction *InstCombinerImpl::foldICmpBitCast(ICmpInst &Cmp) {
   // icmp eq/ne (bitcast (not X) to iN), -1 --> icmp eq/ne (bitcast X to iN), 0
   // Example: are all elements equal? --> are zero elements not equal?
   // TODO: Try harder to reduce compare of 2 freely invertible operands?
-  if (Cmp.isEquality() && C->isAllOnes() && Bitcast->hasOneUse() &&
-      isFreeToInvert(BCSrcOp, BCSrcOp->hasOneUse())) {
-    Value *Cast = Builder.CreateBitCast(Builder.CreateNot(BCSrcOp), DstType);
-    return new ICmpInst(Pred, Cast, ConstantInt::getNullValue(DstType));
+  if (Cmp.isEquality() && C->isAllOnes() && Bitcast->hasOneUse()) {
+    if (Value *NotBCSrcOp =
+            getFreelyInverted(BCSrcOp, BCSrcOp->hasOneUse(), &Builder)) {
+      Value *Cast = Builder.CreateBitCast(NotBCSrcOp, DstType);
+      return new ICmpInst(Pred, Cast, ConstantInt::getNullValue(DstType));
+    }
   }
 
   // If this is checking if all elements of an extended vector are clear or not,
@@ -4494,14 +4496,13 @@ static Instruction *foldICmpOrXX(ICmpInst &I, const SimplifyQuery &Q,
 
   if (ICmpInst::isEquality(Pred) && Op0->hasOneUse()) {
     // icmp (X | Y) eq/ne Y --> (X & ~Y) eq/ne 0 if Y is freely invertible
-    if (IC.isFreeToInvert(Op1, Op1->hasOneUse()))
-      return new ICmpInst(Pred,
-                          IC.Builder.CreateAnd(A, IC.Builder.CreateNot(Op1)),
+    if (Value *NotOp1 =
+            IC.getFreelyInverted(Op1, Op1->hasOneUse(), &IC.Builder))
+      return new ICmpInst(Pred, IC.Builder.CreateAnd(A, NotOp1),
                           Constant::getNullValue(Op1->getType()));
     // icmp (X | Y) eq/ne Y --> (~X | Y) eq/ne -1 if X  is freely invertible.
-    if (IC.isFreeToInvert(A, A->hasOneUse()))
-      return new ICmpInst(Pred,
-                          IC.Builder.CreateOr(Op1, IC.Builder.CreateNot(A)),
+    if (Value *NotA = IC.getFreelyInverted(A, A->hasOneUse(), &IC.Builder))
+      return new ICmpInst(Pred, IC.Builder.CreateOr(Op1, NotA),
                           Constant::getAllOnesValue(Op1->getType()));
   }
   return nullptr;
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
index 71c2d68881441ac..2dda46986f0fd08 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineSelect.cpp
@@ -3075,18 +3075,18 @@ Instruction *InstCombinerImpl::foldSelectOfBools(SelectInst &SI) {
     return SelectInst::Create(TrueVal, OrV, Zero);
   }
   // select (c & b), a, b -> select b, (select ~c, true, a), false
-  if (match(CondVal, m_OneUse(m_c_And(m_Value(C), m_Specific(FalseVal)))) &&
-      isFreeToInvert(C, C->hasOneUse())) {
-    Value *NotC = Builder.CreateNot(C);
-    Value *OrV = Builder.CreateSelect(NotC, One, TrueVal);
-    return SelectInst::Create(FalseVal, OrV, Zero);
+  if (match(CondVal, m_OneUse(m_c_And(m_Value(C), m_Specific(FalseVal))))) {
+    if (Value *NotC = getFreelyInverted(C, C->hasOneUse(), &Builder)) {
+      Value *OrV = Builder.CreateSelect(NotC, One, TrueVal);
+      return SelectInst::Create(FalseVal, OrV, Zero);
+    }
   }
   // select (a | c), a, b -> select a, true, (select ~c, b, false)
-  if (match(CondVal, m_OneUse(m_c_Or(m_Specific(TrueVal), m_Value(C)))) &&
-      isFreeToInvert(C, C->hasOneUse())) {
-    Value *NotC = Builder.CreateNot(C);
-    Value *AndV = Builder.CreateSelect(NotC, FalseVal, Zero);
-    return SelectInst::Create(TrueVal, One, AndV);
+  if (match(CondVal, m_OneUse(m_c_Or(m_Specific(TrueVal), m_Value(C))))) {
+    if (Value *NotC = getFreelyInverted(C, C->hasOneUse(), &Builder)) {
+      Value *AndV = Builder.CreateSelect(NotC, FalseVal, Zero);
+      return SelectInst::Create(TrueVal, One, AndV);
+    }
   }
   // select (c & ~b), a, b -> select b, true, (select c, a, false)
   if (match(CondVal,
diff --git a/llvm/test/Transforms/InstCombine/free-inversion.ll b/llvm/test/Transforms/InstCombine/free-inversion.ll
new file mode 100644
index 000000000000000..4e5eef2b69b4dc0
--- /dev/null
+++ b/llvm/test/Transforms/InstCombine/free-inversion.ll
@@ -0,0 +1,377 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
+; RUN: opt < %s -passes=instcombine -S | FileCheck %s
+
+declare i8 @llvm.smin.i8(i8, i8)
+declare i8 @llvm.umin.i8(i8, i8)
+declare i8 @llvm.smax.i8(i8, i8)
+declare i8 @llvm.umax.i8(i8, i8)
+
+declare void @use.i8(i8)
+
+define i8 @xor_1(i8 %a, i1 %c, i8 %x, i8 %y) {
+; CHECK-LABEL: @xor_1(
+; CHECK-NEXT:    [[TMP1:%.*]] = xor i8 [[Y:%.*]], -124
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[C:%.*]], i8 [[X:%.*]], i8 [[TMP1]]
+; CHECK-NEXT:    [[NOT_BA:%.*]] = xor i8 [[TMP2]], [[A:%.*]]
+; CHECK-NEXT:    ret i8 [[NOT_BA]]
+;
+  %nx = xor i8 %x, -1
+  %yy = xor i8 %y, 123
+  %b = select i1 %c, i8 %nx, i8 %yy
+  %ba = xor i8 %b, %a
+  %not_ba = xor i8 %ba, -1
+  ret i8 %not_ba
+}
+
+define i8 @xor_2(i8 %a, i1 %c, i8 %x, i8 %y) {
+; CHECK-LABEL: @xor_2(
+; CHECK-NEXT:    [[TMP1:%.*]] = xor i8 [[Y:%.*]], -124
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[C:%.*]], i8 [[X:%.*]], i8 [[TMP1]]
+; CHECK-NEXT:    [[NOT_AB:%.*]] = xor i8 [[TMP2]], [[A:%.*]]
+; CHECK-NEXT:    ret i8 [[NOT_AB]]
+;
+  %nx = xor i8 %x, -1
+  %yy = xor i8 %y, 123
+  %b = select i1 %c, i8 %nx, i8 %yy
+  %ab = xor i8 %a, %b
+  %not_ab = xor i8 %ab, -1
+  ret i8 %not_ab
+}
+
+define i8 @xor_fail(i8 %a, i1 %c, i8 %x, i8 %y) {
+; CHECK-LABEL: @xor_fail(
+; CHECK-NEXT:    [[NX:%.*]] = xor i8 [[X:%.*]], -1
+; CHECK-NEXT:    [[B:%.*]] = select i1 [[C:%.*]], i8 [[NX]], i8 [[Y:%.*]]
+; CHECK-NEXT:    [[AB:%.*]] = xor i8 [[B]], [[A:%.*]]
+; CHECK-NEXT:    [[NOT_AB:%.*]] = xor i8 [[AB]], -1
+; CHECK-NEXT:    ret i8 [[NOT_AB]]
+;
+  %nx = xor i8 %x, -1
+  %b = select i1 %c, i8 %nx, i8 %y
+  %ab = xor i8 %a, %b
+  %not_ab = xor i8 %ab, -1
+  ret i8 %not_ab
+}
+
+define i8 @add_1(i8 %a, i1 %c, i8 %x, i8 %y) {
+; CHECK-LABEL: @add_1(
+; CHECK-NEXT:    [[TMP1:%.*]] = xor i8 [[Y:%.*]], -124
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[C:%.*]], i8 [[X:%.*]], i8 [[TMP1]]
+; CHECK-NEXT:    [[NOT_BA:%.*]] = sub i8 [[TMP2]], [[A:%.*]]
+; CHECK-NEXT:    ret i8 [[NOT_BA]]
+;
+  %nx = xor i8 %x, -1
+  %yy = xor i8 %y, 123
+  %b = select i1 %c, i8 %nx, i8 %yy
+  %ba = add i8 %b, %a
+  %not_ba = xor i8 %ba, -1
+  ret i8 %not_ba
+}
+
+define i8 @add_2(i8 %a, i1 %c, i8 %x, i8 %y) {
+; CHECK-LABEL: @add_2(
+; CHECK-NEXT:    [[TMP1:%.*]] = xor i8 [[Y:%.*]], -124
+; CHECK-NEXT:    [[TMP2:%.*]] = select i1 [[C:%.*...
[truncated]

@goldsteinn goldsteinn changed the title [InstCombine] Improve (icmp pred (and X, Y), ...) fold. [InstCombine] Improve handling of not and free inversion. Nov 18, 2023
@goldsteinn
Contributor Author

Only looked at the first three commits, which imho could be a separate PR (maybe drop the rest from this one for now).
I think after adding getFreelyInverted() it would be good to also go through existing users of isFreeToInvert() and replace them with getFreelyInverted() where this is easy, e.g. in foldICmpOrXX(). Directly creating the inverted op instead of letting it get inverted separately seems a lot more robust to me.

Thank you for the review.

Regarding replacing isFreeToInvert + CreateNot with getFreelyInverted: I do add a fold that uses getFreelyInverted in the not fold. My feeling was that we have places where we create a not as a canonicalization effort, and more places recognize m_Not than isFreeToInvert, so it made sense to let the new pattern be checked by other folds before stripping the not. But I'm not married to that idea.

Okay, it's split into four commits related to the free inversion (re-titled).
The fourth commit is essentially NFC (although not quite, due to changes in instruction order/names) and puts getFreelyInverted in a few places.

@goldsteinn goldsteinn force-pushed the goldsteinn/and-canonicalization branch from 35dbbb9 to 8748b39 Compare November 18, 2023 18:03
@goldsteinn goldsteinn force-pushed the goldsteinn/and-canonicalization branch from 8748b39 to 525bd44 Compare November 19, 2023 18:51
@goldsteinn goldsteinn force-pushed the goldsteinn/and-canonicalization branch from 525bd44 to f129fc9 Compare November 19, 2023 20:25
@goldsteinn goldsteinn force-pushed the goldsteinn/and-canonicalization branch 2 times, most recently from 1a66941 to 8772bb0 Compare November 19, 2023 21:21
Contributor

@nikic nikic left a comment

LGTM with the last commit removed (the test coverage is quite insufficient around m_AddSubInvertable, and I don't particularly want this fold anyway).

@@ -2224,12 +2224,17 @@ Instruction *InstCombinerImpl::visitSub(BinaryOperator &I) {
// (~X) - (~Y) --> Y - X
// This is placed after the other reassociations and explicitly excludes a
// sub-of-sub pattern to avoid infinite looping.
Contributor

The comment above is outdated / redundant with the one below.

Contributor

Well, at least the part about the "sub-of-sub pattern".

Contributor Author

Although the whole comment is outdated, as ensuring we consume a not is what precludes the infinite loop.

@@ -233,66 +233,38 @@ class LLVM_LIBRARY_VISIBILITY InstCombiner {
PatternMatch::m_Value()));
}

/// Return nonnull value if V is free to invert (with condition) regarding
Contributor

I'm not sure what the (with condition) bit here is supposed to mean.

@@ -233,66 +233,38 @@ class LLVM_LIBRARY_VISIBILITY InstCombiner {
PatternMatch::m_Value()));
}

/// Return nonnull value if V is free to invert (with condition) regarding
/// WillInvertAllUses.
/// If Builder is nonnull, it will return a simplified ~V
Contributor

nit: period at end of sentence.

@goldsteinn
Contributor Author

goldsteinn commented Nov 20, 2023

LGTM with the last commit removed (the test coverage is quite insufficient around m_AddSubInvertable, and I don't particularly want this fold anyway).

I'd be happy to improve the coverage. Why are you opposed to the commit as a whole?
Also, do you mean removed entirely, or is changing it to `ConsumeLHS && ConsumeRHS` okay?

@nikic
Contributor

nikic commented Nov 20, 2023

LGTM with the last commit removed (the test coverage is quite insufficient around m_AddSubInvertable, and I don't particularly want this fold anyway).

I'd be happy to improve the coverage. Why are you opposed to the commit as a whole? Also, do you mean removed entirely, or is changing it to `ConsumeLHS && ConsumeRHS` okay?

It looks like a fold with very low value but very high risk. Limiting it to ConsumeLHS && ConsumeRHS (with &&, not ||, and no special add/sub cases) would probably be acceptable.

@goldsteinn goldsteinn force-pushed the goldsteinn/and-canonicalization branch from 8772bb0 to 893bea1 Compare November 20, 2023 18:45
If `Y` and `Z` are constant then we can simplify to `(X - W) + (Y -
Z)`. If `Y == Z` we can fold to `X - W`.

Note that these transforms already exist outside of InstCombine. The purpose
of this commit is primarily to make it so that folds can generate these
simplifiable patterns without having to worry about creating an infinite
loop.
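A minimal sketch of the simplification described above (plain Python integers; `Y` and `Z` stand in for the constant operands):

```python
# (X + Y) - (W + Z), with Y and Z constant, reassociates to
# (X - W) + (Y - Z); when Y == Z the constant term cancels to X - W.
def folded(x, w, y, z):
    return (x - w) + (y - z)

Y, Z = 5, 9
for x in range(-6, 7):
    for w in range(-6, 7):
        assert (x + Y) - (w + Z) == folded(x, w, Y, Z)
        assert (x + Y) - (w + Y) == x - w  # Y == Z case
```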
Some instructions (select/min/max) are inverted by just inverting their
operands, so whether they are free to invert is really just whether their
operands are free to invert.

Differential Revision: https://reviews.llvm.org/D159056
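The observation above can be checked directly: `~` (which maps `x` to `-x - 1`) reverses order, so pushing it through a select/min/max just inverts the operands. A small Python check:

```python
# ~ is an order-reversing bijection, so:
#   ~max(a, b) == min(~a, ~b),  ~min(a, b) == max(~a, ~b)
#   ~select(c, a, b) == select(c, ~a, ~b)
for a in range(-8, 9):
    for b in range(-8, 9):
        assert ~max(a, b) == min(~a, ~b)
        assert ~min(a, b) == max(~a, ~b)
        for c in (True, False):
            assert ~(a if c else b) == ((~a) if c else (~b))
```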
…on of op

With the current logic of `if (isFreeToInvert(Op)) return Not(Op)`, it's
fairly easy to either 1) cause regressions or 2) create infinite loops
if the folds we have for `Not(Op)` ever de-sync with the cases we
know are freely invertible.

This patch adds `getFreeInverted`, which builds the freely inverted op
while checking for free inversion, to alleviate this problem.
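A hypothetical, much-simplified model of that idea (the real helper lives in C++ inside InstCombine; the tiny AST and function names below are illustrative only, not the actual API). The point is that one recursive walk both decides invertibility and builds `~V`, so the "is it free?" answer can never de-sync from the folds that construct the inversion:

```python
from dataclasses import dataclass

@dataclass
class Const:          # a known constant
    value: int

@dataclass
class Var:            # an opaque value we cannot invert for free
    name: str

@dataclass
class Not:            # bitwise not of an operand
    op: object

@dataclass
class Select:         # select(cond, tval, fval)
    cond: object
    tval: object
    fval: object

def get_freely_inverted(v):
    """Return an expression equal to ~v if it can be built for free, else None."""
    if isinstance(v, Const):
        return Const(~v.value)           # constants invert for free
    if isinstance(v, Not):
        return v.op                      # ~~X --> X
    if isinstance(v, Select):
        t = get_freely_inverted(v.tval)
        f = get_freely_inverted(v.fval)
        if t is not None and f is not None:
            return Select(v.cond, t, f)  # invert a select via its arms
    return None                          # not freely invertible

def evaluate(v):
    if isinstance(v, Const):
        return v.value
    if isinstance(v, Not):
        return ~evaluate(v.op)
    return evaluate(v.tval) if evaluate(v.cond) else evaluate(v.fval)

e = Select(Const(1), Not(Const(5)), Const(7))
inv = get_freely_inverted(e)
assert inv is not None and evaluate(inv) == ~evaluate(e)
assert get_freely_inverted(Var("x")) is None
```

With the old split (`isFreeToInvert` deciding, then a separate `Not(Op)` builder folding), those two code paths could disagree; building the inverted value in the same walk makes that impossible.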
…nverted`

This is nearly an NFC; the only change is potentially the order in which
values are created/named.

Otherwise it is a slight speed boost/simplification: we avoid having to
go through the `getFreelyInverted` recursive logic twice to simplify
the extra `not` op.
@goldsteinn
Contributor Author

LGTM with the last commit removed (the test coverage is quite insufficient around m_AddSubInvertable, and I don't particularly want this fold anyway).

I'd be happy to improve the coverage. Why are you opposed to the commit as a whole? Also, do you mean removed entirely, or is changing it to `ConsumeLHS && ConsumeRHS` okay?

It looks like a fold with very low value but very high risk. Limiting it to ConsumeLHS && ConsumeRHS (with &&, not ||, and no special add/sub cases) would probably be acceptable.
Okay, done (added because I noticed that, after changing the logic for `~Y - ~X`, it had a regression).

@nikic nikic left a comment

LGTM

Comment on lines 1670 to 1672
if (isFreeToInvert(LHS, LHS->hasOneUse(), ConsumesLHS) &&
isFreeToInvert(RHS, RHS->hasOneUse(), ConsumesRHS) && ConsumesLHS &&
ConsumesRHS) {
Contributor

Suggested change
if (isFreeToInvert(LHS, LHS->hasOneUse(), ConsumesLHS) &&
isFreeToInvert(RHS, RHS->hasOneUse(), ConsumesRHS) && ConsumesLHS &&
ConsumesRHS) {
if (isFreeToInvert(LHS, LHS->hasOneUse(), ConsumesLHS) && ConsumesLHS &&
isFreeToInvert(RHS, RHS->hasOneUse(), ConsumesRHS) && ConsumesRHS) {

Can short-circuit the second call.

@@ -1662,6 +1662,24 @@ Instruction *InstCombinerImpl::visitAdd(BinaryOperator &I) {
m_c_UMin(m_Deferred(A), m_Deferred(B))))))
return BinaryOperator::CreateWithCopiedFlags(Instruction::Add, A, B, &I);

// (~X) + (~Y) --> (-2 - Y) - X
{
// To ensure we can save instructions we need to ensure that we either
Contributor

Suggested change
// To ensure we can save instructions we need to ensure that we either
// To ensure we can save instructions we need to ensure that we

"isFreeToInvert desynced with getFreelyInverted");
Value *RHSMinus2 =
Builder.CreateSub(ConstantInt::get(RHS->getType(), -2), NotRHS);
return BinaryOperator::CreateSub(RHSMinus2, NotLHS);
Contributor

I believe InstCombine's canonical form for this is -2 - (X + Y) rather than (-2 - X) - Y.
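Both spellings denote the same value, so this is purely a choice of canonical form; with `~x == -x - 1`, a quick Python check:

```python
# ~x + ~y == (-x - 1) + (-y - 1) == -(x + y) - 2
# so -2 - (x + y) and (-2 - x) - y are the same value.
for x in range(-8, 9):
    for y in range(-8, 9):
        assert ~x + ~y == -2 - (x + y)
        assert -2 - (x + y) == (-2 - x) - y
```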

@goldsteinn goldsteinn force-pushed the goldsteinn/and-canonicalization branch from 893bea1 to 0b79a63 Compare November 20, 2023 22:43
petar-avramovic added a commit to petar-avramovic/llpc that referenced this pull request Nov 22, 2023
Update shader tests after a set of upstream changes in inst-combine.
Last patch in the set is 5271d330770573d2df772aa0fa50d958f44400aa
llvm/llvm-project#66787
Disable affected tests in this patch.
TODO: re-enable tests once everything has propagated.
trenouf pushed a commit to GPUOpen-Drivers/llpc that referenced this pull request Nov 22, 2023
Update shader tests after a set of upstream changes in inst-combine.
Last patch in the set is 5271d330770573d2df772aa0fa50d958f44400aa
llvm/llvm-project#66787
Disable affected tests in this patch.
TODO: re-enable tests once everything has propagated.