[X86] Set x87 fld1/fldz pseudo instructions as rematerializable #74592
@llvm/pr-subscribers-backend-x86

Author: Simon Pilgrim (RKSimon)

Changes

No need to generate/spill/restore to cpu stack

Cleanup work to allow us to properly use isFPImmLegal and fix some regressions encountered while looking at #74304

Full diff: https://github.com/llvm/llvm-project/pull/74592.diff

3 Files Affected:
diff --git a/llvm/lib/Target/X86/X86InstrFPStack.td b/llvm/lib/Target/X86/X86InstrFPStack.td
index ef4c011c669ad..09655d9391211 100644
--- a/llvm/lib/Target/X86/X86InstrFPStack.td
+++ b/llvm/lib/Target/X86/X86InstrFPStack.td
@@ -524,7 +524,7 @@ def XCH_F : FPI<0xD9, MRM1r, (outs), (ins RSTi:$op), "fxch\t$op">;
}
// Floating point constant loads.
-let SchedRW = [WriteZero], Uses = [FPCW] in {
+let SchedRW = [WriteZero], Uses = [FPCW], isReMaterializable = 1 in {
def LD_Fp032 : FpIf32<(outs RFP32:$dst), (ins), ZeroArgFP,
[(set RFP32:$dst, fpimm0)]>;
def LD_Fp132 : FpIf32<(outs RFP32:$dst), (ins), ZeroArgFP,
diff --git a/llvm/lib/Target/X86/X86InstrInfo.cpp b/llvm/lib/Target/X86/X86InstrInfo.cpp
index ea3bf1f101c1e..bf0845a6708b0 100644
--- a/llvm/lib/Target/X86/X86InstrInfo.cpp
+++ b/llvm/lib/Target/X86/X86InstrInfo.cpp
@@ -782,6 +782,12 @@ bool X86InstrInfo::isReallyTriviallyReMaterializable(
break;
case X86::LOAD_STACK_GUARD:
+ case X86::LD_Fp032:
+ case X86::LD_Fp064:
+ case X86::LD_Fp080:
+ case X86::LD_Fp132:
+ case X86::LD_Fp164:
+ case X86::LD_Fp180:
case X86::AVX1_SETALLONES:
case X86::AVX2_SETALLONES:
case X86::AVX512_128_SET0:
diff --git a/llvm/test/CodeGen/X86/swifterror.ll b/llvm/test/CodeGen/X86/swifterror.ll
index 5814146a54613..8fff6405d0d89 100644
--- a/llvm/test/CodeGen/X86/swifterror.ll
+++ b/llvm/test/CodeGen/X86/swifterror.ll
@@ -243,8 +243,6 @@ define float @caller2(ptr %error_ref) {
; CHECK-i386-NEXT: .cfi_offset %edi, -8
; CHECK-i386-NEXT: movl 32(%esp), %esi
; CHECK-i386-NEXT: leal 16(%esp), %edi
-; CHECK-i386-NEXT: fld1
-; CHECK-i386-NEXT: fstps {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Folded Spill
; CHECK-i386-NEXT: LBB2_1: ## %bb_loop
; CHECK-i386-NEXT: ## =>This Inner Loop Header: Depth=1
; CHECK-i386-NEXT: movl $0, 16(%esp)
@@ -255,7 +253,7 @@ define float @caller2(ptr %error_ref) {
; CHECK-i386-NEXT: jne LBB2_4
; CHECK-i386-NEXT: ## %bb.2: ## %cont
; CHECK-i386-NEXT: ## in Loop: Header=BB2_1 Depth=1
-; CHECK-i386-NEXT: flds {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Folded Reload
+; CHECK-i386-NEXT: fld1
; CHECK-i386-NEXT: fxch %st(1)
; CHECK-i386-NEXT: fucompp
; CHECK-i386-NEXT: fnstsw %ax
@@ -270,7 +268,7 @@ define float @caller2(ptr %error_ref) {
; CHECK-i386-NEXT: fstp %st(0)
; CHECK-i386-NEXT: movl %ecx, (%esp)
; CHECK-i386-NEXT: calll _free
-; CHECK-i386-NEXT: flds {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Folded Reload
+; CHECK-i386-NEXT: fld1
; CHECK-i386-NEXT: addl $20, %esp
; CHECK-i386-NEXT: popl %esi
; CHECK-i386-NEXT: popl %edi
@@ -470,8 +468,6 @@ define float @foo_loop(ptr swifterror %error_ptr_ref, i32 %cc, float %cc2) {
; CHECK-i386-NEXT: fstps {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Folded Spill
; CHECK-i386-NEXT: movl 36(%esp), %esi
; CHECK-i386-NEXT: movl 32(%esp), %edi
-; CHECK-i386-NEXT: fld1
-; CHECK-i386-NEXT: fstps {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Folded Spill
; CHECK-i386-NEXT: LBB4_1: ## %bb_loop
; CHECK-i386-NEXT: ## =>This Inner Loop Header: Depth=1
; CHECK-i386-NEXT: testl %esi, %esi
@@ -486,7 +482,7 @@ define float @foo_loop(ptr swifterror %error_ptr_ref, i32 %cc, float %cc2) {
; CHECK-i386-NEXT: LBB4_3: ## %bb_cont
; CHECK-i386-NEXT: ## in Loop: Header=BB4_1 Depth=1
; CHECK-i386-NEXT: flds {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Folded Reload
-; CHECK-i386-NEXT: flds {{[-0-9]+}}(%e{{[sb]}}p) ## 4-byte Folded Reload
+; CHECK-i386-NEXT: fld1
; CHECK-i386-NEXT: fxch %st(1)
; CHECK-i386-NEXT: fucompp
; CHECK-i386-NEXT: fnstsw %ax
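
The swifterror.ll changes above show the intended effect: the `fld1` plus spill before the loop disappears, and each reload becomes a fresh `fld1`. As a rough illustration of that trade-off, here is a toy C++ model. It is not LLVM code: `ToyInstr` and `keepLiveAcrossPressure` are hypothetical names invented for this sketch, and the real decision is made by the register allocator after consulting `isReallyTriviallyReMaterializable`.

```cpp
#include <string>
#include <vector>

// Toy model only: this type does not exist in LLVM.
struct ToyInstr {
    std::string mnemonic;     // e.g. "fld1"
    bool hasInputs;           // true if the instruction reads registers or memory
    bool isReMaterializable;  // stands in for the TableGen flag of the same name
};

// Returns the instructions a simplified allocator would emit to make a value
// available at `numUses` later use points when it cannot stay in a register.
std::vector<std::string> keepLiveAcrossPressure(const ToyInstr& def, int numUses) {
    std::vector<std::string> out;
    if (def.isReMaterializable && !def.hasInputs) {
        // Cheap, input-free constant: recompute it at every use, no stack slot.
        for (int i = 0; i < numUses; ++i) {
            out.push_back(def.mnemonic);
        }
        return out;
    }
    // Otherwise: define once, spill to a stack slot, reload at every use.
    out.push_back(def.mnemonic);
    out.push_back("fstps <slot>");
    for (int i = 0; i < numUses; ++i) {
        out.push_back("flds <slot>");
    }
    return out;
}
```

With two uses, as in `caller2`, a rematerializable `fld1` yields two `fld1` instructions, while the non-rematerializable variant yields `fld1`, one `fstps`, and two `flds`, which mirrors the removed and added CHECK lines.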
71a400d to 477c113
LGTM.