[InstCombine] Fold ctpop(X) eq/ne 1 if X is non-zero #67268
@llvm/pr-subscribers-llvm-transforms
Changes
This patch folds pattern
Full diff: https://github.com/llvm/llvm-project/pull/67268.diff
2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index a219dac7acfbe16..9aafd83d42d0756 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -3412,6 +3412,14 @@ static Instruction *foldCtpopPow2Test(ICmpInst &I, IntrinsicInst *CtpopLhs,
const SimplifyQuery &Q) {
assert(CtpopLhs->getIntrinsicID() == Intrinsic::ctpop &&
"Non-ctpop intrin in ctpop fold");
+
+ const ICmpInst::Predicate Pred = I.getPredicate();
+ // If we know X is non-zero, we can fold isPow2OrZero into isPow2.
+ if (Pred == ICmpInst::ICMP_ULT && CRhs == 2 &&
+ isKnownNonZero(CtpopLhs, Q.DL, /*Depth*/ 0, Q.AC, Q.CxtI, Q.DT))
+ return ICmpInst::Create(Instruction::ICmp, ICmpInst::ICMP_EQ, CtpopLhs,
+ ConstantInt::get(CtpopLhs->getType(), 1));
+
if (!CtpopLhs->hasOneUse())
return nullptr;
@@ -3423,7 +3431,6 @@ static Instruction *foldCtpopPow2Test(ICmpInst &I, IntrinsicInst *CtpopLhs,
// If we know any bit of X can be folded to:
// IsPow2 : X & (~Bit) == 0
// NotPow2 : X & (~Bit) != 0
- const ICmpInst::Predicate Pred = I.getPredicate();
if (((I.isEquality() || Pred == ICmpInst::ICMP_UGT) && CRhs == 1) ||
(Pred == ICmpInst::ICMP_ULT && CRhs == 2)) {
Value *Op = CtpopLhs->getArgOperand(0);
diff --git a/llvm/test/Transforms/InstCombine/ispow2.ll b/llvm/test/Transforms/InstCombine/ispow2.ll
index bbd693b11b388ad..60eb522a144927f 100644
--- a/llvm/test/Transforms/InstCombine/ispow2.ll
+++ b/llvm/test/Transforms/InstCombine/ispow2.ll
@@ -198,7 +198,7 @@ define i1 @is_pow2_non_zero(i32 %x) {
; CHECK-NEXT: [[NOTZERO:%.*]] = icmp ne i32 [[X:%.*]], 0
; CHECK-NEXT: call void @llvm.assume(i1 [[NOTZERO]])
; CHECK-NEXT: [[T0:%.*]] = tail call i32 @llvm.ctpop.i32(i32 [[X]]), !range [[RNG0]]
-; CHECK-NEXT: [[CMP:%.*]] = icmp ult i32 [[T0]], 2
+; CHECK-NEXT: [[CMP:%.*]] = icmp eq i32 [[T0]], 1
; CHECK-NEXT: ret i1 [[CMP]]
;
%notzero = icmp ne i32 %x, 0
ctpop < 2 is better than ctpop == 1 for the backend, and we will not be able to recover from this transform in the backend if it comes from an assume.
Maybe leave a comment explaining as much though. But nikic is right, we intentionally don't do this. In the backend, if we still have non-zero info, we optimize this properly.
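One plausible reading of "better for the backend", sketched below under the assumption of a target without a native population-count instruction: the "at most one bit set" test needs a single and-and-compare, while the "exactly one bit set" test needs an additional zero check. The function names are hypothetical and this is not LLVM's lowering code; actual lowering is target-dependent.

```cpp
#include <cassert>
#include <cstdint>

// Models ctpop(X) <u 2: true when x has at most one bit set
// (x is a power of two, or zero). One AND plus one compare.
inline bool atMostOneBit(std::uint32_t x) { return (x & (x - 1)) == 0; }

// Models ctpop(X) == 1: true when x has exactly one bit set.
// Written this way it needs an extra test to exclude zero.
inline bool exactlyOneBit(std::uint32_t x) {
  return x != 0 && atMostOneBit(x);
}

// Checks that the two tests agree on every non-zero 16-bit value,
// which is why the cheaper form suffices once X is known non-zero.
inline bool agreeOnNonZero16() {
  for (std::uint32_t x = 1; x <= 0xFFFFu; ++x)
    if (atMostOneBit(x) != exactlyOneBit(x))
      return false;
  return true;
}
```

The two tests differ only at zero, so the non-zero fact has to survive until the point where the cheaper form is chosen.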
Force-pushed from 511e83e to 0b124db.
Title changed from "ctpop(X) <u 2 into ctpop(X) == 1 if X is non-zero" to "ctpop(X) == 1 into ctpop(X) <u 2 if X is non-zero".
Title changed from "ctpop(X) == 1 into ctpop(X) <u 2 if X is non-zero" to "ctpop(X) eq/ne 1 if X is non-zero".
Force-pushed from 0b124db to 772cecc.
This fold goes against the usual direction in IR, and in particular conflicts with this generic fold:
llvm-project/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
Lines 6207 to 6210 in bb764ec
// A <u C -> A == C-1 if min(A)+1 == C
if (*CmpC == Op0Min + 1)
  return new ICmpInst(ICmpInst::ICMP_EQ, Op0,
                      ConstantInt::get(Op1->getType(), *CmpC - 1));
llvm-project/llvm/lib/Analysis/ValueTracking.cpp
Line 1560 in d3505c2
case Intrinsic::ctpop: {
...0?1
then we could run into an infinite compile loop at that point.
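The concern about an infinite compile loop can be illustrated with a toy model of two rewrite rules that undo each other. This is only a sketch: `CmpForm`, `genericRangeFold`, `ctpopNonZeroFold`, and `countRewrites` are hypothetical names, not LLVM APIs, and `minCtpop` stands in for whatever lower bound a range analysis might report for ctpop(X).

```cpp
#include <cassert>

// Toy stand-in for the two compare shapes under discussion (not LLVM types).
enum class CmpForm { EqOne, UltTwo };

// Models the generic fold "A <u C -> A == C-1 if min(A)+1 == C" for C == 2.
// minCtpop is the (hypothetical) known lower bound of ctpop(X).
inline CmpForm genericRangeFold(CmpForm f, unsigned minCtpop) {
  if (f == CmpForm::UltTwo && minCtpop + 1 == 2)
    return CmpForm::EqOne;
  return f;
}

// Models the fold from this PR: ctpop(X) == 1 -> ctpop(X) <u 2
// when X is known to be non-zero.
inline CmpForm ctpopNonZeroFold(CmpForm f, bool xKnownNonZero) {
  if (f == CmpForm::EqOne && xKnownNonZero)
    return CmpForm::UltTwo;
  return f;
}

// Applies one fold at a time until nothing changes or the cap is hit.
// Returns the number of rewrites performed; hitting the cap means the
// two folds keep undoing each other.
inline unsigned countRewrites(CmpForm f, unsigned minCtpop,
                              bool xKnownNonZero, unsigned cap) {
  unsigned rewrites = 0;
  while (rewrites < cap) {
    CmpForm g = genericRangeFold(f, minCtpop);
    if (g == f)
      g = ctpopNonZeroFold(f, xKnownNonZero);
    if (g == f)
      break; // fixpoint reached
    f = g;
    ++rewrites;
  }
  return rewrites;
}
```

In this model, `countRewrites(CmpForm::EqOne, 0, true, 100)` settles after one rewrite, while passing a lower bound of 1 makes the two folds alternate until the cap is reached.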
Can we canonicalize
In theory yes, but in practice CGP drops assumes at the very start, and I don't want to have a special ctpop fold run before that. I'd only take that if we stop dropping assumes in CGP (which is on the long-term roadmap, but tricky).
This patch does the following folds if we know X is non-zero:
ctpop(X) == 1 -> ctpop(X) <u 2
ctpop(X) != 1 -> ctpop(X) >u 1
The latter forms give better codegen than the former: https://godbolt.org/z/5beeq8fGz
Alive2: https://alive2.llvm.org/ce/z/GQRQ5T
Fixes #57328.
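As a quick sanity check alongside the Alive2 proof, the two folds can be brute-forced on a small domain. The sketch below uses a hand-written population count and hypothetical helper names; it checks 16-bit values only and is not a substitute for the formal proof.

```cpp
#include <cassert>
#include <cstdint>

// Portable population count: clears the lowest set bit each iteration.
inline unsigned popcount32(std::uint32_t x) {
  unsigned n = 0;
  while (x != 0) {
    x &= x - 1;
    ++n;
  }
  return n;
}

// Returns true if, for every non-zero 16-bit X, both rewrites keep the
// comparison result unchanged:
//   ctpop(X) == 1  ->  ctpop(X) <u 2
//   ctpop(X) != 1  ->  ctpop(X) >u 1
inline bool foldsHoldForNonZero16() {
  for (std::uint32_t x = 1; x <= 0xFFFFu; ++x) {
    unsigned c = popcount32(x);
    if ((c == 1) != (c < 2))
      return false;
    if ((c != 1) != (c > 1))
      return false;
  }
  return true;
}

// X == 0 is the counterexample that makes the non-zero precondition
// necessary: ctpop(0) == 1 is false, but ctpop(0) <u 2 is true.
inline bool foldBreaksAtZero() {
  unsigned c = popcount32(0);
  return (c == 1) != (c < 2);
}
```

Both folds rely on ctpop(X) being at least 1 for non-zero X, which is what rules out the zero case.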