[InstCombine] Enable FAdd simplifications when user can ignore sign bit #157757
@llvm/pr-subscribers-backend-amdgpu @llvm/pr-subscribers-llvm-analysis
Author: Vedant Paranjape (VedantParanjape)
Changes
When the result of an FAdd is used by fabs, we can safely ignore the sign bit of an fp zero. This patch enables an instruction simplification that folds fadd x, 0 ==> x, which would otherwise not apply because the compiler cannot prove that x isn't -0. If the result of the fadd is used by fabs, the sign of the zero does not matter and the fold is still valid.
Fixes #154238
Full diff: https://github.com/llvm/llvm-project/pull/157757.diff
2 Files Affected:
diff --git a/llvm/lib/Analysis/InstructionSimplify.cpp b/llvm/lib/Analysis/InstructionSimplify.cpp
index ebe329aa1d5fe..7f555c24f71a8 100644
--- a/llvm/lib/Analysis/InstructionSimplify.cpp
+++ b/llvm/lib/Analysis/InstructionSimplify.cpp
@@ -5723,7 +5723,9 @@ simplifyFAddInst(Value *Op0, Value *Op1, FastMathFlags FMF,
   // fadd X, 0 ==> X, when we know X is not -0
   if (canIgnoreSNaN(ExBehavior, FMF))
     if (match(Op1, m_PosZeroFP()) &&
-        (FMF.noSignedZeros() || cannotBeNegativeZero(Op0, Q)))
+        (FMF.noSignedZeros() || cannotBeNegativeZero(Op0, Q) ||
+         (Q.CxtI && !Q.CxtI->use_empty() &&
+          canIgnoreSignBitOfZero(*(Q.CxtI->use_begin())))))
       return Op0;
   if (!isDefaultFPEnvironment(ExBehavior, Rounding))
diff --git a/llvm/test/Transforms/InstSimplify/fold-fadd-with-zero-gh154238.ll b/llvm/test/Transforms/InstSimplify/fold-fadd-with-zero-gh154238.ll
new file mode 100644
index 0000000000000..bb12328574dda
--- /dev/null
+++ b/llvm/test/Transforms/InstSimplify/fold-fadd-with-zero-gh154238.ll
@@ -0,0 +1,12 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt < %s -passes=instsimplify -S | FileCheck %s
+define float @src(float %arg1) {
+; CHECK-LABEL: define float @src(
+; CHECK-SAME: float [[ARG1:%.*]]) {
+; CHECK-NEXT: [[V3:%.*]] = call float @llvm.fabs.f32(float [[ARG1]])
+; CHECK-NEXT: ret float [[V3]]
+;
+ %v2 = fadd float %arg1, 0.000000e+00
+ %v3 = call float @llvm.fabs.f32(float %v2)
+ ret float %v3
+}
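The floating-point fact behind the fold can be checked outside LLVM. The following is a minimal sketch in plain C++, assuming IEEE-754 floats and the default round-to-nearest mode; the helper names (add_pos_zero, fabs_of_add, same_value_and_sign) are hypothetical and are not part of the patch. It shows that x + 0.0 differs from x only when x is -0, and that fabs hides that difference.

```cpp
#include <cassert>
#include <cmath>

// Models `fadd float %x, 0.0` (assumes default round-to-nearest).
float add_pos_zero(float x) { return x + 0.0f; }

// Models the pattern in the test case: fabs(fadd x, 0.0).
float fabs_of_add(float x) { return std::fabs(add_pos_zero(x)); }

// Compares value and sign bit, so that +0 and -0 count as different.
bool same_value_and_sign(float a, float b) {
    return a == b && std::signbit(a) == std::signbit(b);
}

// Self-check: the fadd flips -0 to +0, but fabs of either is +0.
void check_fabs_hides_zero_sign() {
    assert(!same_value_and_sign(add_pos_zero(-0.0f), -0.0f));
    assert(same_value_and_sign(fabs_of_add(-0.0f), std::fabs(-0.0f)));
}
```

Calling check_fabs_hides_zero_sign() exercises the one input (-0) for which the unguarded fold would be observable.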
@llvm/pr-subscribers-llvm-transforms
I reviewed the failing test case (CodeGen/AMDGPU/fcanonicalize-elimination.ll)
Force-pushed: 0e7d005 to b230ed7
It seems that on older architectures it emits a vmax, and on newer ones a vmul. It does so to make sure the fast-math flags are copied over correctly.
-        (FMF.noSignedZeros() || cannotBeNegativeZero(Op0, Q)))
+        (FMF.noSignedZeros() || cannotBeNegativeZero(Op0, Q) ||
+         (Q.CxtI && !Q.CxtI->use_empty() &&
+          canIgnoreSignBitOfZero(*(Q.CxtI->use_begin())))))
You can't do use-based reasoning inside InstSimplify; this must happen in InstCombine.
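Whether the fold is observable depends entirely on what consumes the fadd. The following is a minimal sketch, not taken from the patch, that models two possible users in plain C++ (fadd_zero, user_fabs, and user_copysign are hypothetical names, and IEEE-754 round-to-nearest is assumed): fabs cannot tell x from x + 0.0, while copysign can. Treating this as the motivation for the reviewer's comment is an assumption.

```cpp
#include <cassert>
#include <cmath>

// Models `fadd float %x, 0.0` (assumes default round-to-nearest).
float fadd_zero(float x) { return x + 0.0f; }

// A user that ignores the sign of a zero input.
float user_fabs(float v) { return std::fabs(v); }

// A user that observes the sign of a zero input:
// returns +1.0 for +0 and -1.0 for -0.
float user_copysign(float v) { return std::copysign(1.0f, v); }

// Self-check: folding fadd x, 0 ==> x is invisible to fabs
// but changes the result seen by copysign when x is -0.
void check_user_sensitivity() {
    assert(user_fabs(fadd_zero(-0.0f)) == user_fabs(-0.0f));
    assert(user_copysign(fadd_zero(-0.0f)) != user_copysign(-0.0f));
}
```

If the fadd had both kinds of user, replacing it with x would change the copysign result, so inspecting a single use is only sufficient when that use is the only one. The InstCombine snippet quoted later in this thread pairs m_OneUse with canIgnoreSignBitOfZero, which fits that reading.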
Made the changes as proposed; also changed canIgnoreSignBitOfNaN for uniformity.
This reverts the previous change requested in #141015 (comment) :)
Okay, makes sense: as in, move the complete optimization to InstCombine.
Force-pushed: f6e3f21 to 3aaf155
@dtcxzyw made the changes.
@arsenm Tracking here which call operators this can be implemented for: operators whose identity element is zero.
@zyw-bot mfuzz
Co-authored-by: Yingwei Zheng <dtcxzyw@qq.com>
LG
  Value *A;
  if (match(&I, m_OneUse(m_FSub(m_Value(A), m_AnyZeroFP()))) &&
      canIgnoreSignBitOfZero(*I.use_begin()))
    return replaceInstUsesWith(I, A);
This fold doesn't make sense, as fsub x, C will be canonicalized to fadd x, -C. The test case you added makes even less sense because it uses fsub x, 0.0, aka fadd x, -0.0, which is always a no-op and does not actually depend on the use-based logic.
Please remove this code again.
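The claim that fsub x, 0.0 (equivalently fadd x, -0.0) is always a no-op can be checked directly. The following is a minimal sketch in plain C++, assuming IEEE-754 floats and the default round-to-nearest mode; sub_pos_zero, add_neg_zero, and identical_with_sign are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>

// Models `fsub float %x, 0.0`.
float sub_pos_zero(float x) { return x - 0.0f; }

// Models the canonical form `fadd float %x, -0.0`.
float add_neg_zero(float x) { return x + (-0.0f); }

// Compares value and sign bit, so that +0 and -0 count as different.
bool identical_with_sign(float a, float b) {
    return a == b && std::signbit(a) == std::signbit(b);
}

// Self-check: both forms return x unchanged, including for -0 and +0,
// so no reasoning about the user is needed to drop them.
void check_sub_zero_is_noop() {
    const float inputs[] = {-0.0f, 0.0f, 2.5f, -2.5f};
    for (float x : inputs) {
        assert(identical_with_sign(sub_pos_zero(x), x));
        assert(identical_with_sign(add_neg_zero(x), x));
    }
}
```

Unlike x + 0.0, which turns -0 into +0, neither form here changes the sign of a zero input, which is why the use-based check adds nothing for this case.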
Since FSub X, 0 gets canonicalised to FAdd X, -0, the optimization doesn't make much sense for FSub. Remove it from IC, along with the accompanying test case.