-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[InstCombine] Fold fcmp ogt (x - y), 0 into fcmp ogt x, y #85245 #85506
base: main
Are you sure you want to change the base?
Conversation
@dtcxzyw, please check this out before making it ready for review. |
@@ -7972,6 +7972,11 @@ Instruction *InstCombinerImpl::visitFCmpInst(FCmpInst &I) { | |||
Constant *RHSC; | |||
if (match(Op0, m_Instruction(LHSI)) && match(Op1, m_Constant(RHSC))) { | |||
switch (LHSI->getOpcode()) { | |||
case Instruction::FSub: | |||
if (Pred == FCmpInst::FCMP_OGT && match(RHSC, m_PosZeroFP()) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it holds for other predicates?
Does it holds when RHSC
is -0.0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it holds for other predicates?
I think FCMP_OLT
, FCMP_ONE
will work. Are there any others that work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the unordered versions of all of them too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes it should work for -0 as well as +0. Yes it should work for all predicates. But this optimization is not valid if denormals are being flushed - does InstCombine have a way to query that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
F.getDenormalType(fltSemantics) == DenormalMode::getIEEE() works, but is pretty ugly
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the tests you have written here, yes it will always fail. You will need to have additional tests covering the no-infinity knowledge.
Also, you're not taking advantage of the fast math flags on the fsub itself by directly looking at the operands. IIRC there's another overload of isKnownNeverInfinity with passed in flags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ogt olt one ueq uge ule
you do not need to worry about infinity.
For ugt ult une oeq oge ole
you need to rule out the +inf - +inf = nan
and -inf - -inf = nan
cases. To do that, you can check for any of:
- The subtraction has the ninf flag.
- The subtraction has the nnan flag.
- X is known never infinity.
- Y is known never infinity.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or you could just submit a first patch that only handles the easy predicates ogt olt one ueq uge ule
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you suggested, I've already incorporated fast flags into the fsub
operation for some tests. Additionally, I couldn't locate another version of isKnownNeverInfinity
that accepts flags
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
inline KnownFPClass computeKnownFPClass(const Value *V, FastMathFlags FMF, |
@@ -7972,6 +7972,11 @@ Instruction *InstCombinerImpl::visitFCmpInst(FCmpInst &I) { | |||
Constant *RHSC; | |||
if (match(Op0, m_Instruction(LHSI)) && match(Op1, m_Constant(RHSC))) { | |||
switch (LHSI->getOpcode()) { | |||
case Instruction::FSub: | |||
if (Pred == FCmpInst::FCMP_OGT && match(RHSC, m_PosZeroFP()) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the unordered versions of all of them too
call void @use(float %fs) | ||
%cmp = fcmp ogt float %fs, 0.000000e+00 | ||
ret i1 %cmp | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some tests to show fast math flag preservation behavior? Also some vectors, and all the handled predicate types
879940d
to
2d32ffe
Compare
I finished some dirty work. Does it look alright? |
case FCmpInst::FCMP_OGE: | ||
case FCmpInst::FCMP_OLE: { | ||
BinaryOperator *SubI = cast<BinaryOperator>(LHSI); | ||
if (!computeKnownFPClass(SubI->getOperand(0), SubI->getFastMathFlags(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This handles fsub ninf
but fsub nnan
would also be sufficient.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jayfoad Is there a more efficient way to achieve this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jayfoad Perhaps we could handle this that way.
if (!computeKnownFPClass(SubI->getOperand(0), SubI->getFastMathFlags(),
fcNan | fcInf, LHSI, 0)
.isKnownNever(fcNan | fcInf) &&
!computeKnownFPClass(SubI->getOperand(1), SubI->getFastMathFlags(),
fcNan | fcInf, LHSI, 0)
.isKnownNever(fcNan | fcInf))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@SahilPatidar isKnownNeverInfOrNaN
would be better :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dtcxzyw, could you please help me understand why it isn't working for the nnan, ninf and fast flags?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, sorry for my mistake.
IIRC @jayfoad means that we should make sure the result of fsub is never NaN.
Can you try this?
if (!isKnownNeverNaN(SubI, /*Depth=*/0, getSimplifyQuery().getWithInstruction(&I)))
break;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
edit - isKnownNeverInfOrNaN
works only with the fast
flag, not with individual flags like ninf
or nnan
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (!isKnownNeverNaN(SubI, /*Depth=*/0, getSimplifyQuery().getWithInstruction(&I))) break;
That's not quite precise. It's OK for the result to be nan if one of the inputs is nan, e.g. x - nan = nan
. The only case we want to disallow is inf - inf = nan
.
I think it's probably simplest to add explicit checks for the fast math flags, like:
if (!SubI.hasNoNaNs() && !SubI.hasNoInfs() &&
!isKnownNeverInf(SubI->getOperand(0)) &&
!isKnownNeverInf(SubI->getOperand(1))
This also has the advantage of doing the quick tests for flags before any of the potentially slow calls to "isKnownNever" functions.
; CHECK-NEXT: [[CMP:%.*]] = fcmp une <8 x float> [[X:%.*]], [[Y:%.*]] | ||
; CHECK-NEXT: ret <8 x i1> [[CMP]] | ||
; | ||
%fs = fsub fast <8 x float> %x, %y |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None of these test flags on the fcmp
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Still needs a check for denormals being flushed, as mentioned earlier.
2d32ffe
to
724cfef
Compare
@jayfoad, how can I check for denormals being flushed? |
@jayfoad, |
You would need to take a test where your optimization does apply and then make a copy of it with an attribute like this, and show that the optimization does not apply to it:
See the LangRef for the allowed values of the |
@jayfoad, I tried setting the |
Need to add these in some test functions |
!isKnownNeverInfinity(LHSI->getOperand(0), /*Depth=*/0, | ||
getSimplifyQuery().getWithInstruction(&I)) && | ||
!isKnownNeverInfinity(LHSI->getOperand(1), /*Depth=*/0, | ||
getSimplifyQuery().getWithInstruction(&I))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!isKnownNeverInfinity(LHSI->getOperand(0), /*Depth=*/0, | |
getSimplifyQuery().getWithInstruction(&I)) && | |
!isKnownNeverInfinity(LHSI->getOperand(1), /*Depth=*/0, | |
getSimplifyQuery().getWithInstruction(&I))) | |
!isKnownNeverInfinity(LHSI->getOperand(1), /*Depth=*/0, | |
getSimplifyQuery().getWithInstruction(&I)) && | |
!isKnownNeverInfinity(LHSI->getOperand(0), /*Depth=*/0, | |
getSimplifyQuery().getWithInstruction(&I))) |
724cfef
to
773d159
Compare
@arsenm Is there any further comment? |
LGTM. I see "This pull request is still a work in progress Draft pull requests cannot be merged." |
LGTM with some nits, also has some conflicts. Should be converted to a not-draft PR |
getSimplifyQuery().getWithInstruction(&I))) | ||
break; | ||
} | ||
LLVM_FALLTHROUGH; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can just use [[fallthrough]] now. Also belongs before the }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need the braces anyway.
ret <8 x i1> %cmp | ||
} | ||
|
||
define <8 x i1> @fcmp_ugt_fsub_const_vec_denormal_positive-zero(<8 x float> %x, <8 x float> %y) "denormal-fp-math"="positive-zero" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
define <8 x i1> @fcmp_ugt_fsub_const_vec_denormal_positive-zero(<8 x float> %x, <8 x float> %y) "denormal-fp-math"="positive-zero" { | |
define <8 x i1> @fcmp_ugt_fsub_const_vec_denormal_positive-zero(<8 x float> %x, <8 x float> %y) "denormal-fp-math"="positive-zero,positive-zero" { |
ret <8 x i1> %cmp | ||
} | ||
|
||
define <8 x i1> @fcmp_ogt_fsub_const_vec_denormal_preserve-sign(<8 x float> %x, <8 x float> %y) "denormal-fp-math"="preserve-sign" { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
define <8 x i1> @fcmp_ogt_fsub_const_vec_denormal_preserve-sign(<8 x float> %x, <8 x float> %y) "denormal-fp-math"="preserve-sign" { | |
define <8 x i1> @fcmp_ogt_fsub_const_vec_denormal_preserve-sign(<8 x float> %x, <8 x float> %y) "denormal-fp-math"="preserve-sign,preserve-sign" { |
case FCmpInst::FCMP_OEQ: | ||
case FCmpInst::FCMP_OGE: | ||
case FCmpInst::FCMP_OLE: { | ||
if (!LHSI->hasNoNaNs() && !LHSI->hasNoInfs() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a dag style comment of what this is doing
case FCmpInst::FCMP_UEQ: | ||
case FCmpInst::FCMP_UGE: | ||
case FCmpInst::FCMP_ULE: | ||
if (match(RHSC, m_AnyZeroFP()) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add a dag style comment of what this is doing
@llvm/pr-subscribers-llvm-transforms Author: None (SahilPatidar) ChangesResolve #85245 Patch is 20.55 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85506.diff 2 Files Affected:
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
index 7092fb5e509bb..baf8c35f94fba 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineCompares.cpp
@@ -8037,6 +8037,44 @@ Instruction *InstCombinerImpl::visitFCmpInst(FCmpInst &I) {
if (Instruction *NV = FoldOpIntoSelect(I, cast<SelectInst>(LHSI)))
return NV;
break;
+ case Instruction::FSub:
+ switch (Pred) {
+ default:
+ break;
+ case FCmpInst::FCMP_UGT:
+ case FCmpInst::FCMP_ULT:
+ case FCmpInst::FCMP_UNE:
+ case FCmpInst::FCMP_OEQ:
+ case FCmpInst::FCMP_OGE:
+ case FCmpInst::FCMP_OLE:
+ // fsub x, y --> isnnan(x, y) && isninf(x, y)
+ if (!LHSI->hasNoNaNs() && !LHSI->hasNoInfs() &&
+ !isKnownNeverInfinity(LHSI->getOperand(1), /*Depth=*/0,
+ getSimplifyQuery().getWithInstruction(&I)) &&
+ !isKnownNeverInfinity(LHSI->getOperand(0), /*Depth=*/0,
+ getSimplifyQuery().getWithInstruction(&I)))
+ break;
+
+ [[fallthrough]];
+ case FCmpInst::FCMP_OGT:
+ case FCmpInst::FCMP_OLT:
+ case FCmpInst::FCMP_ONE:
+ case FCmpInst::FCMP_UEQ:
+ case FCmpInst::FCMP_UGE:
+ case FCmpInst::FCMP_ULE:
+ // fcmp pred (x - y), 0 --> fcmp pred x, y
+ if (match(RHSC, m_AnyZeroFP()) &&
+ match(LHSI, m_FSub(m_Value(X), m_Value(Y))) &&
+ I.getFunction()->getDenormalMode(
+ LHSI->getType()->getScalarType()->getFltSemantics()) ==
+ DenormalMode::getIEEE()) {
+ replaceOperand(I, 0, X);
+ replaceOperand(I, 1, Y);
+ return &I;
+ }
+ break;
+ }
+ break;
case Instruction::PHI:
if (Instruction *NV = foldOpIntoPhi(I, cast<PHINode>(LHSI)))
return NV;
diff --git a/llvm/test/Transforms/InstCombine/fcmp.ll b/llvm/test/Transforms/InstCombine/fcmp.ll
index 4d907800219d6..1f7b3fca0b00e 100644
--- a/llvm/test/Transforms/InstCombine/fcmp.ll
+++ b/llvm/test/Transforms/InstCombine/fcmp.ll
@@ -1289,7 +1289,7 @@ define <1 x i1> @bitcast_1vec_eq0(i32 %x) {
define i1 @fcmp_fadd_zero_ugt(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_ugt(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp ugt float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ugt float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1299,7 +1299,7 @@ define i1 @fcmp_fadd_zero_ugt(float %x, float %y) {
define i1 @fcmp_fadd_zero_uge(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_uge(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp uge float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp uge float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1309,7 +1309,7 @@ define i1 @fcmp_fadd_zero_uge(float %x, float %y) {
define i1 @fcmp_fadd_zero_ogt(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_ogt(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp ogt float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ogt float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1319,7 +1319,7 @@ define i1 @fcmp_fadd_zero_ogt(float %x, float %y) {
define i1 @fcmp_fadd_zero_oge(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_oge(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp oge float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp oge float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1329,7 +1329,7 @@ define i1 @fcmp_fadd_zero_oge(float %x, float %y) {
define i1 @fcmp_fadd_zero_ult(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_ult(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp ult float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ult float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1339,7 +1339,7 @@ define i1 @fcmp_fadd_zero_ult(float %x, float %y) {
define i1 @fcmp_fadd_zero_ule(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_ule(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp ule float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ule float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1349,7 +1349,7 @@ define i1 @fcmp_fadd_zero_ule(float %x, float %y) {
define i1 @fcmp_fadd_zero_olt(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_olt(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1359,7 +1359,7 @@ define i1 @fcmp_fadd_zero_olt(float %x, float %y) {
define i1 @fcmp_fadd_zero_ole(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_ole(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp ole float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ole float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1369,7 +1369,7 @@ define i1 @fcmp_fadd_zero_ole(float %x, float %y) {
define i1 @fcmp_fadd_zero_oeq(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_oeq(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp oeq float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp oeq float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1379,7 +1379,7 @@ define i1 @fcmp_fadd_zero_oeq(float %x, float %y) {
define i1 @fcmp_fadd_zero_one(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_one(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp one float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp one float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1389,7 +1389,7 @@ define i1 @fcmp_fadd_zero_one(float %x, float %y) {
define i1 @fcmp_fadd_zero_ueq(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_ueq(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp ueq float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ueq float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1399,7 +1399,7 @@ define i1 @fcmp_fadd_zero_ueq(float %x, float %y) {
define i1 @fcmp_fadd_zero_une(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_une(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp une float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp une float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1409,7 +1409,7 @@ define i1 @fcmp_fadd_zero_une(float %x, float %y) {
define i1 @fcmp_fadd_zero_ord(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_ord(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp ord float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ord float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1419,7 +1419,7 @@ define i1 @fcmp_fadd_zero_ord(float %x, float %y) {
define i1 @fcmp_fadd_zero_uno(float %x, float %y) {
; CHECK-LABEL: @fcmp_fadd_zero_uno(
-; CHECK-NEXT: [[CMP:%.*]] = fcmp uno float [[ADD:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp uno float [[X:%.*]], [[Y:%.*]]
; CHECK-NEXT: ret i1 [[CMP]]
;
%add = fadd float %x, 0.000000e+00
@@ -1718,3 +1718,372 @@ define <2 x i1> @fcmp_une_sel_x_negx_with_any_fpzero_nnan_vec(<2 x i1> %cond, <2
%icmp = fcmp nnan une <2 x float> %sel, <float 0.0, float -0.0>
ret <2 x i1> %icmp
}
+
+define i1 @fcmp_oeq_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_oeq_fsub_const(
+; CHECK-NEXT: [[FS:%.*]] = fsub float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp oeq float [[FS]], 0.000000e+00
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp oeq float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define i1 @fcmp_oge_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_oge_fsub_const(
+; CHECK-NEXT: [[FS:%.*]] = fsub float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp oge float [[FS]], 0.000000e+00
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp oge float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define i1 @fcmp_ole_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_ole_fsub_const(
+; CHECK-NEXT: [[FS:%.*]] = fsub float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ole float [[FS]], 0.000000e+00
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp ole float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define i1 @fcmp_ueq_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_ueq_fsub_const(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ueq float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp ueq float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define i1 @fcmp_uge_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_uge_fsub_const(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp uge float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp uge float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define i1 @fcmp_ule_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_ule_fsub_const(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ule float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp ule float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define i1 @fcmp_ugt_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_ugt_fsub_const(
+; CHECK-NEXT: [[FS:%.*]] = fsub float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ugt float [[FS]], 0.000000e+00
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp ugt float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define i1 @fcmp_ult_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_ult_fsub_const(
+; CHECK-NEXT: [[FS:%.*]] = fsub float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ult float [[FS]], 0.000000e+00
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp ult float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define i1 @fcmp_une_fsub_const(float %x, float %y) {
+; CHECK-LABEL: @fcmp_une_fsub_const(
+; CHECK-NEXT: [[FS:%.*]] = fsub float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp une float [[FS]], 0.000000e+00
+; CHECK-NEXT: ret i1 [[CMP]]
+;
+ %fs = fsub float %x, %y
+ %cmp = fcmp une float %fs, 0.000000e+00
+ ret i1 %cmp
+}
+
+define <8 x i1> @fcmp_uge_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_uge_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf uge <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf uge <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ule_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ule_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf ule <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf ule <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ueq_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ueq_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf ueq <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf ueq <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_oge_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_oge_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf oge <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf oge <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ole_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ole_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf ole <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf ole <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_oeq_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_oeq_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf oeq <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf oeq <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ogt_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ogt_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf ogt <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf ogt <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_olt_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_olt_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf olt <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf olt <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_one_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_one_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf one <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf one <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ugt_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ugt_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf ugt <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf ugt <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ult_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ult_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf ult <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf ult <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_une_fsub_const_ninf_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_une_fsub_const_ninf_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ninf une <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub ninf <8 x float> %x, %y
+ %cmp = fcmp ninf une <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_uge_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_uge_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan uge <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan uge <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ule_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ule_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan ule <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan ule <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ueq_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ueq_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan ueq <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan ueq <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_oge_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_oge_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan oge <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan oge <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ole_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ole_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan ole <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan ole <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_oeq_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_oeq_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan oeq <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan oeq <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ogt_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ogt_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan ogt <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan ogt <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_olt_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_olt_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan olt <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan olt <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_one_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_one_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan one <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan one <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ugt_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ugt_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan ugt <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan ugt <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ult_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_ult_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan ult <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan ult <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_une_fsub_const_nnan_vec(<8 x float> %x, <8 x float> %y) {
+; CHECK-LABEL: @fcmp_une_fsub_const_nnan_vec(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp nnan une <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub nnan <8 x float> %x, %y
+ %cmp = fcmp nnan une <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ugt_fsub_const_vec_denormal_positive-zero(<8 x float> %x, <8 x float> %y) "denormal-fp-math"="positive-zero,positive-zero" {
+; CHECK-LABEL: @fcmp_ugt_fsub_const_vec_denormal_positive-zero(
+; CHECK-NEXT: [[FS:%.*]] = fsub <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ogt <8 x float> [[FS]], zeroinitializer
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub <8 x float> %x, %y
+ %cmp = fcmp ogt <8 x float> %fs, zeroinitializer
+ ret <8 x i1> %cmp
+}
+
+define <8 x i1> @fcmp_ogt_fsub_const_vec_denormal_dynamic(<8 x float> %x, <8 x float> %y) "denormal-fp-math"="dynamic,dynamic" {
+; CHECK-LABEL: @fcmp_ogt_fsub_const_vec_denormal_dynamic(
+; CHECK-NEXT: [[FS:%.*]] = fsub <8 x float> [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ogt <8 x float> [[FS]], zeroinitializer
+; CHECK-NEXT: ret <8 x i1> [[CMP]]
+;
+ %fs = fsub <8 x float> %x, %y
+ %cmp = fcmp ogt <8 x...
[truncated]
|
case FCmpInst::FCMP_OEQ: | ||
case FCmpInst::FCMP_OGE: | ||
case FCmpInst::FCMP_OLE: | ||
// fsub x, y --> isnnan(x, y) && isninf(x, y) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comment seems off, this isn't transforming it into this logical test
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I apologize for my poor commenting skills.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm with the comment fixed
✅ With the latest revision this PR passed the C/C++ code formatter. |
Co-authored-by: Jay Foad <jay.foad@gmail.com>
Transforms/PhaseOrdering/X86/vector-reductions.ll fails precheck. It does have some fcmp + fsub patterns, so likely a real failure |
After updating the test, it is breaking define i32 @TestVectorsEqualFP(ptr noalias %Vec0, ptr noalias %Vec1, float %Tole
; CHECK-NEXT: [[TMP0:%.*]] = load <4 x float>, ptr [[VEC0:%.*]], align 4
; CHECK-NEXT: [[TMP1:%.*]] = load <4 x float>, ptr [[VEC1:%.*]], align 4
; CHECK-NEXT: [[TMP2:%.*]] = fsub fast <4 x float> [[TMP0]], [[TMP1]]
-; CHECK-NEXT: [[TMP3:%.*]] = tail call fast <4 x float> @llvm.fabs.v4f32(<4 x float> [[TMP2]])
-; CHECK-NEXT: [[TMP4:%.*]] = tail call fast float @llvm.vector.reduce.fadd.v4f32(float -0.000000e+00, <4 x float> [[TMP3]])
-; CHECK-NEXT: [[CMP4:%.*]] = fcmp fast ole float [[TMP4]], [[TOLERANCE:%.*]]
+; CHECK-NEXT: [[TMP3:%.*]] = fcmp fast oge <4 x float> [[TMP0]], [[TMP1]]
+; CHECK-NEXT: [[TMP4:%.*]] = fneg fast <4 x float> [[TMP2]]
+; CHECK-NEXT: [[TMP5:%.*]] = select <4 x i1> [[TMP3]], <4 x float> [[TMP2]], <4 x float> [[TMP4]]
+; CHECK-NEXT: [[TMP6:%.*]] = tail call fast float @llvm.vector.reduce.fadd.v4f32(float -0.000000e+00, <4 x float> [[TMP5]])
+; CHECK-NEXT: [[CMP4:%.*]] = fcmp fast ole float [[TMP6]], [[TOLERANCE:%.*]]
; CHECK-NEXT: [[COND5:%.*]] = zext i1 [[CMP4]] to i32
; CHECK-NEXT: ret i32 [[COND5]]
;
|
This regression looks like it's because you are missing a hasOneUse check on the fsub. You should add some multi-use negative tests |
case FCmpInst::FCMP_OLE: | ||
// Skip optimization: fsub x, y unless guaranteed !isinf(x) || | ||
// !isinf(y). | ||
if (!LHSI->hasNoNaNs() && !LHSI->hasNoInfs() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could have had the hasOneUse check skip the complex isKnownNeverInfinity calls
%fs = fsub float %x, %y | ||
call void @use(float %fs) | ||
%cmp = fcmp ueq float %fs, 0.000000e+00 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this failing for the right reason? Don't you need the known-not-inf case for the fsub?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
essentially LGTM, except the hasOneUse duplication makes less sense than moving the whole pattern check before the predicate test switch with the isKnownNeverInf checks
case FCmpInst::FCMP_OLE: | ||
// Skip optimization: fsub x, y unless guaranteed !isinf(x) || | ||
// !isinf(y). | ||
if (!LHSI->hasOneUse() || |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of duplicating the hasOneUse check, you could hoist the whole inner pattern match below before the condition type handling. It would also probably be nicer to read splitting the whole switch into a helper function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do it this way:
if (LHSI->hasOneUse())
if (Instruction *NV = foldFCmpFSubIntoFCmp(I, LHSI, RHSC, *this))
return NV;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I replaced the switch block with a function.
Resolve #85245
Alive2: https://alive2.llvm.org/ce/z/azHAHq