-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[InstSimplify] Simplify fcmp implied by dominating fcmp #161090
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-analysis @llvm/pr-subscribers-backend-amdgpu Author: Yingwei Zheng (dtcxzyw) ChangesThis patch simplifies an fcmp into true/false if it is implied by a dominating fcmp.
Note: It doesn't fix #70985, as the second fcmp in the motivating case is not dominated by the edge. We may need to adjust JumpThreading to handle this case. Comptime impact (~+0.1%): https://llvm-compile-time-tracker.com/compare.php?from=a728f213c863e4dd19f8969a417148d2951323c0&to=8ca70404fb0d66a824f39d83050ac38e2f1b25b9&stat=instructions:u Full diff: https://github.com/llvm/llvm-project/pull/161090.diff 5 Files Affected:
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 07f4a8e5c889e..0d978d4da125e 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -4164,6 +4164,10 @@ static Value *simplifyFCmpInst(CmpPredicate Pred, Value *LHS, Value *RHS,
return ConstantInt::get(RetTy, Pred == CmpInst::FCMP_UNO);
}
+ if (std::optional<bool> Res =
+ isImpliedByDomCondition(Pred, LHS, RHS, Q.CxtI, Q.DL))
+ return ConstantInt::getBool(RetTy, *Res);
+
const APFloat *C = nullptr;
match(RHS, m_APFloatAllowPoison(C));
std::optional<KnownFPClass> FullKnownClassLHS;
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 6f11b250cf21f..4d049fc356273 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -39,6 +39,7 @@
#include "llvm/IR/Attributes.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constant.h"
+#include "llvm/IR/ConstantFPRange.h"
#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
@@ -9447,6 +9448,69 @@ isImpliedCondICmps(CmpPredicate LPred, const Value *L0, const Value *L1,
return std::nullopt;
}
+/// Return true if LHS implies RHS (expanded to its components as "R0 RPred R1")
+/// is true. Return false if LHS implies RHS is false. Otherwise, return
+/// std::nullopt if we can't infer anything.
+static std::optional<bool>
+isImpliedCondFCmps(FCmpInst::Predicate LPred, const Value *L0, const Value *L1,
+ FCmpInst::Predicate RPred, const Value *R0, const Value *R1,
+ const DataLayout &DL, bool LHSIsTrue) {
+ // The rest of the logic assumes the LHS condition is true. If that's not the
+ // case, invert the predicate to make it so.
+ if (!LHSIsTrue)
+ LPred = FCmpInst::getInversePredicate(LPred);
+
+ // We can have non-canonical operands, so try to normalize any common operand
+ // to L0/R0.
+ if (L0 == R1) {
+ std::swap(R0, R1);
+ RPred = FCmpInst::getSwappedPredicate(RPred);
+ }
+ if (R0 == L1) {
+ std::swap(L0, L1);
+ LPred = FCmpInst::getSwappedPredicate(LPred);
+ }
+ if (L1 == R1) {
+ // If we have L0 == R0 and L1 == R1, then make L1/R1 the constants.
+ if (L0 != R0 || match(L0, m_ImmConstant())) {
+ std::swap(L0, L1);
+ LPred = ICmpInst::getSwappedCmpPredicate(LPred);
+ std::swap(R0, R1);
+ RPred = ICmpInst::getSwappedCmpPredicate(RPred);
+ }
+ }
+
+ // Can we infer anything when the two compares have matching operands?
+ if (L0 == R0 && L1 == R1) {
+ if ((LPred & RPred) == LPred)
+ return true;
+ if ((LPred & ~RPred) == LPred)
+ return false;
+ }
+
+ // See if we can infer anything if operand-0 matches and we have at least one
+ // constant.
+ const APFloat *L1C, *R1C;
+ if (L0 == R0 && match(L1, m_APFloat(L1C)) && match(R1, m_APFloat(R1C))) {
+ if (std::optional<ConstantFPRange> DomCR =
+ ConstantFPRange::makeExactFCmpRegion(LPred, *L1C)) {
+ if (std::optional<ConstantFPRange> ImpliedCR =
+ ConstantFPRange::makeExactFCmpRegion(RPred, *R1C)) {
+ if (ImpliedCR->contains(*DomCR))
+ return true;
+ }
+ if (std::optional<ConstantFPRange> ImpliedCR =
+ ConstantFPRange::makeExactFCmpRegion(
+ FCmpInst::getInversePredicate(RPred), *R1C)) {
+ if (ImpliedCR->contains(*DomCR))
+ return false;
+ }
+ }
+ }
+
+ return std::nullopt;
+}
+
/// Return true if LHS implies RHS is true. Return false if LHS implies RHS is
/// false. Otherwise, return std::nullopt if we can't infer anything. We
/// expect the RHS to be an icmp and the LHS to be an 'and', 'or', or a 'select'
@@ -9502,15 +9566,24 @@ llvm::isImpliedCondition(const Value *LHS, CmpPredicate RHSPred,
LHSIsTrue = !LHSIsTrue;
// Both LHS and RHS are icmps.
- if (const auto *LHSCmp = dyn_cast<ICmpInst>(LHS))
- return isImpliedCondICmps(LHSCmp->getCmpPredicate(), LHSCmp->getOperand(0),
- LHSCmp->getOperand(1), RHSPred, RHSOp0, RHSOp1,
- DL, LHSIsTrue);
- const Value *V;
- if (match(LHS, m_NUWTrunc(m_Value(V))))
- return isImpliedCondICmps(CmpInst::ICMP_NE, V,
- ConstantInt::get(V->getType(), 0), RHSPred,
- RHSOp0, RHSOp1, DL, LHSIsTrue);
+ if (RHSOp0->getType()->getScalarType()->isIntOrPtrTy()) {
+ if (const auto *LHSCmp = dyn_cast<ICmpInst>(LHS))
+ return isImpliedCondICmps(LHSCmp->getCmpPredicate(),
+ LHSCmp->getOperand(0), LHSCmp->getOperand(1),
+ RHSPred, RHSOp0, RHSOp1, DL, LHSIsTrue);
+ const Value *V;
+ if (match(LHS, m_NUWTrunc(m_Value(V))))
+ return isImpliedCondICmps(CmpInst::ICMP_NE, V,
+ ConstantInt::get(V->getType(), 0), RHSPred,
+ RHSOp0, RHSOp1, DL, LHSIsTrue);
+ } else {
+ assert(RHSOp0->getType()->isFPOrFPVectorTy() &&
+ "Expected floating point type only!");
+ if (const auto *LHSCmp = dyn_cast<FCmpInst>(LHS))
+ return isImpliedCondFCmps(LHSCmp->getPredicate(), LHSCmp->getOperand(0),
+ LHSCmp->getOperand(1), RHSPred, RHSOp0, RHSOp1,
+ DL, LHSIsTrue);
+ }
/// The LHS should be an 'or', 'and', or a 'select' instruction. We expect
/// the RHS to be an icmp.
@@ -9547,6 +9620,13 @@ std::optional<bool> llvm::isImpliedCondition(const Value *LHS, const Value *RHS,
return InvertRHS ? !*Implied : *Implied;
return std::nullopt;
}
+ if (const FCmpInst *RHSCmp = dyn_cast<FCmpInst>(RHS)) {
+ if (auto Implied = isImpliedCondition(
+ LHS, RHSCmp->getPredicate(), RHSCmp->getOperand(0),
+ RHSCmp->getOperand(1), DL, LHSIsTrue, Depth))
+ return InvertRHS ? !*Implied : *Implied;
+ return std::nullopt;
+ }
const Value *V;
if (match(RHS, m_NUWTrunc(m_Value(V)))) {
diff --git a/llvm/test/CodeGen/AMDGPU/sgpr-copy.ll b/llvm/test/CodeGen/AMDGPU/sgpr-copy.ll
index c82b3410a0c42..5bc9cdb5de527 100644
--- a/llvm/test/CodeGen/AMDGPU/sgpr-copy.ll
+++ b/llvm/test/CodeGen/AMDGPU/sgpr-copy.ll
@@ -256,7 +256,7 @@ endif: ; preds = %else, %if
define amdgpu_kernel void @copy1(ptr addrspace(1) %out, ptr addrspace(1) %in0) {
entry:
%tmp = load float, ptr addrspace(1) %in0
- %tmp1 = fcmp oeq float %tmp, 0.000000e+00
+ %tmp1 = fcmp one float %tmp, 0.000000e+00
br i1 %tmp1, label %if0, label %endif
if0: ; preds = %entry
diff --git a/llvm/test/Transforms/InstCombine/clamp-to-minmax.ll b/llvm/test/Transforms/InstCombine/clamp-to-minmax.ll
index 7f3276608c5da..0ccaa9c393654 100644
--- a/llvm/test/Transforms/InstCombine/clamp-to-minmax.ll
+++ b/llvm/test/Transforms/InstCombine/clamp-to-minmax.ll
@@ -172,10 +172,8 @@ define float @clamp_negative_wrong_const(float %x) {
; Like @clamp_test_1 but both are min
define float @clamp_negative_same_op(float %x) {
; CHECK-LABEL: @clamp_negative_same_op(
-; CHECK-NEXT: [[INNER_CMP_INV:%.*]] = fcmp fast oge float [[X:%.*]], 2.550000e+02
-; CHECK-NEXT: [[INNER_SEL:%.*]] = select nnan ninf i1 [[INNER_CMP_INV]], float 2.550000e+02, float [[X]]
-; CHECK-NEXT: [[OUTER_CMP:%.*]] = fcmp fast ult float [[X]], 1.000000e+00
-; CHECK-NEXT: [[R:%.*]] = select i1 [[OUTER_CMP]], float [[INNER_SEL]], float 1.000000e+00
+; CHECK-NEXT: [[OUTER_CMP_INV:%.*]] = fcmp fast oge float [[X:%.*]], 1.000000e+00
+; CHECK-NEXT: [[R:%.*]] = select nnan ninf i1 [[OUTER_CMP_INV]], float 1.000000e+00, float [[X]]
; CHECK-NEXT: ret float [[R]]
;
%inner_cmp = fcmp fast ult float %x, 255.0
diff --git a/llvm/test/Transforms/InstSimplify/domcondition.ll b/llvm/test/Transforms/InstSimplify/domcondition.ll
index 43be5de6bf95d..2893bb155bbe7 100644
--- a/llvm/test/Transforms/InstSimplify/domcondition.ll
+++ b/llvm/test/Transforms/InstSimplify/domcondition.ll
@@ -278,3 +278,210 @@ end:
}
declare void @foo(i32)
+
+
+define i1 @simplify_fcmp_implied_by_dom_cond_range_true(float %x) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_range_true(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 true
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, 0.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp olt float %x, 1.0
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_in_else_implied_by_dom_cond_range_true(float %x) {
+; CHECK-LABEL: @simplify_fcmp_in_else_implied_by_dom_cond_range_true(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 1.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 true
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 true
+;
+ %cmp = fcmp olt float %x, 1.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ ret i1 true
+
+if.else:
+ %cmp2 = fcmp uge float %x, 0.5
+ ret i1 %cmp2
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_range_false(float %x) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_range_false(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 false
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, 0.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp ogt float %x, 1.0
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_pred_true(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_pred_true(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 true
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp ole float %x, %y
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_pred_false(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_pred_false(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 false
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp ogt float %x, %y
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_pred_commuted(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_pred_commuted(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 true
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp oge float %y, %x
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+; Negative tests
+
+define i1 @simplify_fcmp_implied_by_dom_cond_wrong_range(float %x) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_wrong_range(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: [[CMP2:%.*]] = fcmp olt float [[X]], -1.000000e+00
+; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, 0.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp olt float %x, -1.0
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_range_mismatched_operand(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_range_mismatched_operand(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: [[CMP2:%.*]] = fcmp olt float [[Y:%.*]], 1.000000e+00
+; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, 0.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp olt float %y, 1.0
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_wrong_pred(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_wrong_pred(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ole float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: [[CMP2:%.*]] = fcmp olt float [[X]], [[Y]]
+; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp ole float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp olt float %x, %y
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_pred_mismatched_operand(float %x, float %y, float %z) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_pred_mismatched_operand(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: [[CMP2:%.*]] = fcmp ole float [[X]], [[Z:%.*]]
+; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp ole float %x, %z
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
|
@llvm/pr-subscribers-llvm-transforms Author: Yingwei Zheng (dtcxzyw) ChangesThis patch simplifies an fcmp into true/false if it is implied by a dominating fcmp.
Note: It doesn't fix #70985, as the second fcmp in the motivating case is not dominated by the edge. We may need to adjust JumpThreading to handle this case. Comptime impact (~+0.1%): https://llvm-compile-time-tracker.com/compare.php?from=a728f213c863e4dd19f8969a417148d2951323c0&to=8ca70404fb0d66a824f39d83050ac38e2f1b25b9&stat=instructions:u Full diff: https://github.com/llvm/llvm-project/pull/161090.diff 5 Files Affected:
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index 07f4a8e5c889e..0d978d4da125e 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -4164,6 +4164,10 @@ static Value *simplifyFCmpInst(CmpPredicate Pred, Value *LHS, Value *RHS,
return ConstantInt::get(RetTy, Pred == CmpInst::FCMP_UNO);
}
+ if (std::optional<bool> Res =
+ isImpliedByDomCondition(Pred, LHS, RHS, Q.CxtI, Q.DL))
+ return ConstantInt::getBool(RetTy, *Res);
+
const APFloat *C = nullptr;
match(RHS, m_APFloatAllowPoison(C));
std::optional<KnownFPClass> FullKnownClassLHS;
diff --git a/llvm/lib/Analysis/ValueTracking.cpp b/llvm/lib/Analysis/ValueTracking.cpp
index 6f11b250cf21f..4d049fc356273 100644
--- a/llvm/lib/Analysis/ValueTracking.cpp
+++ b/llvm/lib/Analysis/ValueTracking.cpp
@@ -39,6 +39,7 @@
#include "llvm/IR/Attributes.h"
#include "llvm/IR/BasicBlock.h"
#include "llvm/IR/Constant.h"
+#include "llvm/IR/ConstantFPRange.h"
#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/DerivedTypes.h"
@@ -9447,6 +9448,69 @@ isImpliedCondICmps(CmpPredicate LPred, const Value *L0, const Value *L1,
return std::nullopt;
}
+/// Return true if LHS implies RHS (expanded to its components as "R0 RPred R1")
+/// is true. Return false if LHS implies RHS is false. Otherwise, return
+/// std::nullopt if we can't infer anything.
+static std::optional<bool>
+isImpliedCondFCmps(FCmpInst::Predicate LPred, const Value *L0, const Value *L1,
+ FCmpInst::Predicate RPred, const Value *R0, const Value *R1,
+ const DataLayout &DL, bool LHSIsTrue) {
+ // The rest of the logic assumes the LHS condition is true. If that's not the
+ // case, invert the predicate to make it so.
+ if (!LHSIsTrue)
+ LPred = FCmpInst::getInversePredicate(LPred);
+
+ // We can have non-canonical operands, so try to normalize any common operand
+ // to L0/R0.
+ if (L0 == R1) {
+ std::swap(R0, R1);
+ RPred = FCmpInst::getSwappedPredicate(RPred);
+ }
+ if (R0 == L1) {
+ std::swap(L0, L1);
+ LPred = FCmpInst::getSwappedPredicate(LPred);
+ }
+ if (L1 == R1) {
+ // If we have L0 == R0 and L1 == R1, then make L1/R1 the constants.
+ if (L0 != R0 || match(L0, m_ImmConstant())) {
+ std::swap(L0, L1);
+ LPred = ICmpInst::getSwappedCmpPredicate(LPred);
+ std::swap(R0, R1);
+ RPred = ICmpInst::getSwappedCmpPredicate(RPred);
+ }
+ }
+
+ // Can we infer anything when the two compares have matching operands?
+ if (L0 == R0 && L1 == R1) {
+ if ((LPred & RPred) == LPred)
+ return true;
+ if ((LPred & ~RPred) == LPred)
+ return false;
+ }
+
+ // See if we can infer anything if operand-0 matches and we have at least one
+ // constant.
+ const APFloat *L1C, *R1C;
+ if (L0 == R0 && match(L1, m_APFloat(L1C)) && match(R1, m_APFloat(R1C))) {
+ if (std::optional<ConstantFPRange> DomCR =
+ ConstantFPRange::makeExactFCmpRegion(LPred, *L1C)) {
+ if (std::optional<ConstantFPRange> ImpliedCR =
+ ConstantFPRange::makeExactFCmpRegion(RPred, *R1C)) {
+ if (ImpliedCR->contains(*DomCR))
+ return true;
+ }
+ if (std::optional<ConstantFPRange> ImpliedCR =
+ ConstantFPRange::makeExactFCmpRegion(
+ FCmpInst::getInversePredicate(RPred), *R1C)) {
+ if (ImpliedCR->contains(*DomCR))
+ return false;
+ }
+ }
+ }
+
+ return std::nullopt;
+}
+
/// Return true if LHS implies RHS is true. Return false if LHS implies RHS is
/// false. Otherwise, return std::nullopt if we can't infer anything. We
/// expect the RHS to be an icmp and the LHS to be an 'and', 'or', or a 'select'
@@ -9502,15 +9566,24 @@ llvm::isImpliedCondition(const Value *LHS, CmpPredicate RHSPred,
LHSIsTrue = !LHSIsTrue;
// Both LHS and RHS are icmps.
- if (const auto *LHSCmp = dyn_cast<ICmpInst>(LHS))
- return isImpliedCondICmps(LHSCmp->getCmpPredicate(), LHSCmp->getOperand(0),
- LHSCmp->getOperand(1), RHSPred, RHSOp0, RHSOp1,
- DL, LHSIsTrue);
- const Value *V;
- if (match(LHS, m_NUWTrunc(m_Value(V))))
- return isImpliedCondICmps(CmpInst::ICMP_NE, V,
- ConstantInt::get(V->getType(), 0), RHSPred,
- RHSOp0, RHSOp1, DL, LHSIsTrue);
+ if (RHSOp0->getType()->getScalarType()->isIntOrPtrTy()) {
+ if (const auto *LHSCmp = dyn_cast<ICmpInst>(LHS))
+ return isImpliedCondICmps(LHSCmp->getCmpPredicate(),
+ LHSCmp->getOperand(0), LHSCmp->getOperand(1),
+ RHSPred, RHSOp0, RHSOp1, DL, LHSIsTrue);
+ const Value *V;
+ if (match(LHS, m_NUWTrunc(m_Value(V))))
+ return isImpliedCondICmps(CmpInst::ICMP_NE, V,
+ ConstantInt::get(V->getType(), 0), RHSPred,
+ RHSOp0, RHSOp1, DL, LHSIsTrue);
+ } else {
+ assert(RHSOp0->getType()->isFPOrFPVectorTy() &&
+ "Expected floating point type only!");
+ if (const auto *LHSCmp = dyn_cast<FCmpInst>(LHS))
+ return isImpliedCondFCmps(LHSCmp->getPredicate(), LHSCmp->getOperand(0),
+ LHSCmp->getOperand(1), RHSPred, RHSOp0, RHSOp1,
+ DL, LHSIsTrue);
+ }
/// The LHS should be an 'or', 'and', or a 'select' instruction. We expect
/// the RHS to be an icmp.
@@ -9547,6 +9620,13 @@ std::optional<bool> llvm::isImpliedCondition(const Value *LHS, const Value *RHS,
return InvertRHS ? !*Implied : *Implied;
return std::nullopt;
}
+ if (const FCmpInst *RHSCmp = dyn_cast<FCmpInst>(RHS)) {
+ if (auto Implied = isImpliedCondition(
+ LHS, RHSCmp->getPredicate(), RHSCmp->getOperand(0),
+ RHSCmp->getOperand(1), DL, LHSIsTrue, Depth))
+ return InvertRHS ? !*Implied : *Implied;
+ return std::nullopt;
+ }
const Value *V;
if (match(RHS, m_NUWTrunc(m_Value(V)))) {
diff --git a/llvm/test/CodeGen/AMDGPU/sgpr-copy.ll b/llvm/test/CodeGen/AMDGPU/sgpr-copy.ll
index c82b3410a0c42..5bc9cdb5de527 100644
--- a/llvm/test/CodeGen/AMDGPU/sgpr-copy.ll
+++ b/llvm/test/CodeGen/AMDGPU/sgpr-copy.ll
@@ -256,7 +256,7 @@ endif: ; preds = %else, %if
define amdgpu_kernel void @copy1(ptr addrspace(1) %out, ptr addrspace(1) %in0) {
entry:
%tmp = load float, ptr addrspace(1) %in0
- %tmp1 = fcmp oeq float %tmp, 0.000000e+00
+ %tmp1 = fcmp one float %tmp, 0.000000e+00
br i1 %tmp1, label %if0, label %endif
if0: ; preds = %entry
diff --git a/llvm/test/Transforms/InstCombine/clamp-to-minmax.ll b/llvm/test/Transforms/InstCombine/clamp-to-minmax.ll
index 7f3276608c5da..0ccaa9c393654 100644
--- a/llvm/test/Transforms/InstCombine/clamp-to-minmax.ll
+++ b/llvm/test/Transforms/InstCombine/clamp-to-minmax.ll
@@ -172,10 +172,8 @@ define float @clamp_negative_wrong_const(float %x) {
; Like @clamp_test_1 but both are min
define float @clamp_negative_same_op(float %x) {
; CHECK-LABEL: @clamp_negative_same_op(
-; CHECK-NEXT: [[INNER_CMP_INV:%.*]] = fcmp fast oge float [[X:%.*]], 2.550000e+02
-; CHECK-NEXT: [[INNER_SEL:%.*]] = select nnan ninf i1 [[INNER_CMP_INV]], float 2.550000e+02, float [[X]]
-; CHECK-NEXT: [[OUTER_CMP:%.*]] = fcmp fast ult float [[X]], 1.000000e+00
-; CHECK-NEXT: [[R:%.*]] = select i1 [[OUTER_CMP]], float [[INNER_SEL]], float 1.000000e+00
+; CHECK-NEXT: [[OUTER_CMP_INV:%.*]] = fcmp fast oge float [[X:%.*]], 1.000000e+00
+; CHECK-NEXT: [[R:%.*]] = select nnan ninf i1 [[OUTER_CMP_INV]], float 1.000000e+00, float [[X]]
; CHECK-NEXT: ret float [[R]]
;
%inner_cmp = fcmp fast ult float %x, 255.0
diff --git a/llvm/test/Transforms/InstSimplify/domcondition.ll b/llvm/test/Transforms/InstSimplify/domcondition.ll
index 43be5de6bf95d..2893bb155bbe7 100644
--- a/llvm/test/Transforms/InstSimplify/domcondition.ll
+++ b/llvm/test/Transforms/InstSimplify/domcondition.ll
@@ -278,3 +278,210 @@ end:
}
declare void @foo(i32)
+
+
+define i1 @simplify_fcmp_implied_by_dom_cond_range_true(float %x) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_range_true(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 true
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, 0.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp olt float %x, 1.0
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_in_else_implied_by_dom_cond_range_true(float %x) {
+; CHECK-LABEL: @simplify_fcmp_in_else_implied_by_dom_cond_range_true(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 1.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 true
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 true
+;
+ %cmp = fcmp olt float %x, 1.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ ret i1 true
+
+if.else:
+ %cmp2 = fcmp uge float %x, 0.5
+ ret i1 %cmp2
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_range_false(float %x) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_range_false(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 false
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, 0.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp ogt float %x, 1.0
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_pred_true(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_pred_true(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 true
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp ole float %x, %y
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_pred_false(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_pred_false(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 false
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp ogt float %x, %y
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_pred_commuted(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_pred_commuted(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: ret i1 true
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp oge float %y, %x
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+; Negative tests
+
+define i1 @simplify_fcmp_implied_by_dom_cond_wrong_range(float %x) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_wrong_range(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: [[CMP2:%.*]] = fcmp olt float [[X]], -1.000000e+00
+; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, 0.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp olt float %x, -1.0
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_range_mismatched_operand(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_range_mismatched_operand(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], 0.000000e+00
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: [[CMP2:%.*]] = fcmp olt float [[Y:%.*]], 1.000000e+00
+; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, 0.0
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp olt float %y, 1.0
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_wrong_pred(float %x, float %y) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_wrong_pred(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp ole float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: [[CMP2:%.*]] = fcmp olt float [[X]], [[Y]]
+; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp ole float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp olt float %x, %y
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
+
+define i1 @simplify_fcmp_implied_by_dom_cond_pred_mismatched_operand(float %x, float %y, float %z) {
+; CHECK-LABEL: @simplify_fcmp_implied_by_dom_cond_pred_mismatched_operand(
+; CHECK-NEXT: [[CMP:%.*]] = fcmp olt float [[X:%.*]], [[Y:%.*]]
+; CHECK-NEXT: br i1 [[CMP]], label [[IF_THEN:%.*]], label [[IF_ELSE:%.*]]
+; CHECK: if.then:
+; CHECK-NEXT: [[CMP2:%.*]] = fcmp ole float [[X]], [[Z:%.*]]
+; CHECK-NEXT: ret i1 [[CMP2]]
+; CHECK: if.else:
+; CHECK-NEXT: ret i1 false
+;
+ %cmp = fcmp olt float %x, %y
+ br i1 %cmp, label %if.then, label %if.else
+
+if.then:
+ %cmp2 = fcmp ole float %x, %z
+ ret i1 %cmp2
+
+if.else:
+ ret i1 false
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this is the fcmp
version of the already existing isImpliedCondICmps
, implementation and tests look good to me.
We may need to adjust JumpThreading to handle this case.
Why can't it be done here if I may ask?
Generally, only one component is involved in one patch. Although this patch doesn't resolve the original issue, it still removes many redundant fcmps in real-world code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LG once llvm/test/CodeGen/AMDGPU/sgpr-copy.ll is addressed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This look ok. I'm not a fan of isImpliedByDomCondition() because it's a hack that only supports dominating conditions from a directly preceding branch -- we're going to need ConstantRangeFP support in one of the lattice propagating passes to cover the general case. But I guess it does work for some cases...
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/190/builds/28528 Here is the relevant piece of the build log for the reference
|
This patch simplifies an fcmp into true/false if it is implied by a dominating fcmp. As an initial support, it only handles two cases: + `fcmp pred1, X, Y -> fcmp pred2, X, Y`: use set operations. + `fcmp pred1, X, C1 -> fcmp pred2, X, C2`: use `ConstantFPRange` and set operations. Note: It doesn't fix llvm#70985, as the second fcmp in the motivating case is not dominated by the edge. We may need to adjust JumpThreading to handle this case. Comptime impact (~+0.1%): https://llvm-compile-time-tracker.com/compare.php?from=a728f213c863e4dd19f8969a417148d2951323c0&to=8ca70404fb0d66a824f39d83050ac38e2f1b25b9&stat=instructions:u IR diff: dtcxzyw/llvm-opt-benchmark#2848
This patch simplifies an fcmp into true/false if it is implied by a dominating fcmp.
As an initial support, it only handles two cases:
fcmp pred1, X, Y -> fcmp pred2, X, Y
: use set operations.fcmp pred1, X, C1 -> fcmp pred2, X, C2
: useConstantFPRange
and set operations.Note: It doesn't fix #70985, as the second fcmp in the motivating case is not dominated by the edge. We may need to adjust JumpThreading to handle this case.
Comptime impact (~+0.1%): https://llvm-compile-time-tracker.com/compare.php?from=a728f213c863e4dd19f8969a417148d2951323c0&to=8ca70404fb0d66a824f39d83050ac38e2f1b25b9&stat=instructions:u
IR diff: dtcxzyw/llvm-opt-benchmark#2848