-
Notifications
You must be signed in to change notification settings - Fork 10.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[X86] Resolve FIXME: Add FPCW as a rounding control register #82452
Conversation
@llvm/pr-subscribers-backend-x86 Author: AtariDreams (AtariDreams) ChangesIn tests, this seems to only cause the wait instruction to disappear, which I do not see as necessary in these particular cases. Patch is 29.15 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/82452.diff 10 Files Affected:
diff --git a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
index be8275c92e11ae..27784dc99202a0 100644
--- a/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
+++ b/llvm/lib/Target/X86/X86ISelLoweringCall.cpp
@@ -670,9 +670,7 @@ const MCPhysReg *X86TargetLowering::getScratchRegisters(CallingConv::ID) const {
}
ArrayRef<MCPhysReg> X86TargetLowering::getRoundingControlRegisters() const {
- // FIXME: We should def X86::FPCW for x87 as well. But it affects a lot of lit
- // tests at the moment, which is not what we expected.
- static const MCPhysReg RCRegs[] = {X86::MXCSR};
+ static const MCPhysReg RCRegs[] = {X86::MXCSR, X86::FPCW};
return RCRegs;
}
diff --git a/llvm/test/CodeGen/X86/fp-intrinsics.ll b/llvm/test/CodeGen/X86/fp-intrinsics.ll
index d2b45ee1e03e63..1387cac7fe1a70 100644
--- a/llvm/test/CodeGen/X86/fp-intrinsics.ll
+++ b/llvm/test/CodeGen/X86/fp-intrinsics.ll
@@ -301,7 +301,6 @@ define double @f6() #0 {
; X87-NEXT: fstpl {{[0-9]+}}(%esp)
; X87-NEXT: fldl {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll pow
; X87-NEXT: addl $28, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -413,7 +412,6 @@ define double @f8() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll sin
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -464,7 +462,6 @@ define double @f9() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll cos
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -515,7 +512,6 @@ define double @f10() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll exp
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -566,7 +562,6 @@ define double @f11() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: fldl {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll exp2
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -617,7 +612,6 @@ define double @f12() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll log
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -668,7 +662,6 @@ define double @f13() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll log10
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -719,7 +712,6 @@ define double @f14() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll log2
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -770,7 +762,6 @@ define double @f15() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: fldl {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll rint
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -818,7 +809,6 @@ define double @f16() #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: fldl {{\.?LCPI[0-9]+_[0-9]+}}
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll nearbyint
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -867,7 +857,6 @@ define double @f19() #0 {
; X87-NEXT: fstpl {{[0-9]+}}(%esp)
; X87-NEXT: fld1
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll fmod
; X87-NEXT: addl $28, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -1607,7 +1596,6 @@ define i32 @f23(double %x) #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: fldl {{[0-9]+}}(%esp)
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll lrint
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -1655,7 +1643,6 @@ define i32 @f24(float %x) #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{[0-9]+}}(%esp)
; X87-NEXT: fstps (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll lrintf
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -1703,7 +1690,6 @@ define i64 @f25(double %x) #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: fldl {{[0-9]+}}(%esp)
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll llrint
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -1751,7 +1737,6 @@ define i64 @f26(float %x) #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{[0-9]+}}(%esp)
; X87-NEXT: fstps (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll llrintf
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -1799,7 +1784,6 @@ define i32 @f27(double %x) #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: fldl {{[0-9]+}}(%esp)
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll lround
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -1846,7 +1830,6 @@ define i32 @f28(float %x) #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{[0-9]+}}(%esp)
; X87-NEXT: fstps (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll lroundf
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -1893,7 +1876,6 @@ define i64 @f29(double %x) #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: fldl {{[0-9]+}}(%esp)
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll llround
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
@@ -1940,7 +1922,6 @@ define i64 @f30(float %x) #0 {
; X87-NEXT: .cfi_def_cfa_offset 16
; X87-NEXT: flds {{[0-9]+}}(%esp)
; X87-NEXT: fstps (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll llroundf
; X87-NEXT: addl $12, %esp
; X87-NEXT: .cfi_def_cfa_offset 4
diff --git a/llvm/test/CodeGen/X86/fp-strict-libcalls-msvc32.ll b/llvm/test/CodeGen/X86/fp-strict-libcalls-msvc32.ll
index 1bc308bef8cccf..c9dd8a7374569f 100644
--- a/llvm/test/CodeGen/X86/fp-strict-libcalls-msvc32.ll
+++ b/llvm/test/CodeGen/X86/fp-strict-libcalls-msvc32.ll
@@ -7,7 +7,6 @@ define float @ceil(float %x) #0 {
; CHECK-NEXT: subl $12, %esp
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _ceil
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
@@ -24,7 +23,6 @@ define float @cos(float %x) #0 {
; CHECK-NEXT: subl $12, %esp
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _cos
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
@@ -41,7 +39,6 @@ define float @exp(float %x) #0 {
; CHECK-NEXT: subl $12, %esp
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _exp
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
@@ -58,7 +55,6 @@ define float @floor(float %x) #0 {
; CHECK-NEXT: subl $12, %esp
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _floor
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
@@ -78,7 +74,6 @@ define float @frem(float %x, float %y) #0 {
; CHECK-NEXT: fxch %st(1)
; CHECK-NEXT: fstpl {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _fmod
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
@@ -95,7 +90,6 @@ define float @log(float %x) #0 {
; CHECK-NEXT: subl $12, %esp
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _log
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
@@ -112,7 +106,6 @@ define float @log10(float %x) #0 {
; CHECK-NEXT: subl $12, %esp
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _log10
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
@@ -132,7 +125,6 @@ define float @pow(float %x, float %y) #0 {
; CHECK-NEXT: fxch %st(1)
; CHECK-NEXT: fstpl {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _pow
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
@@ -149,7 +141,6 @@ define float @sin(float %x) #0 {
; CHECK-NEXT: subl $12, %esp
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
; CHECK-NEXT: fstpl (%esp)
-; CHECK-NEXT: wait
; CHECK-NEXT: calll _sin
; CHECK-NEXT: fstps {{[0-9]+}}(%esp)
; CHECK-NEXT: flds {{[0-9]+}}(%esp)
diff --git a/llvm/test/CodeGen/X86/fp-strict-scalar-cmp.ll b/llvm/test/CodeGen/X86/fp-strict-scalar-cmp.ll
index cb1876fee05aea..47a83e9ae581a2 100644
--- a/llvm/test/CodeGen/X86/fp-strict-scalar-cmp.ll
+++ b/llvm/test/CodeGen/X86/fp-strict-scalar-cmp.ll
@@ -4186,7 +4186,6 @@ define void @foo(float %0, float %1) #0 {
; X87-CMOV-NEXT: flds {{[0-9]+}}(%esp)
; X87-CMOV-NEXT: fucompi %st(1), %st
; X87-CMOV-NEXT: fstp %st(0)
-; X87-CMOV-NEXT: wait
; X87-CMOV-NEXT: ja bar # TAILCALL
; X87-CMOV-NEXT: # %bb.1:
; X87-CMOV-NEXT: retl
diff --git a/llvm/test/CodeGen/X86/fp-strict-scalar.ll b/llvm/test/CodeGen/X86/fp-strict-scalar.ll
index f1be74f5c3ac4c..5939bb4189b7ab 100644
--- a/llvm/test/CodeGen/X86/fp-strict-scalar.ll
+++ b/llvm/test/CodeGen/X86/fp-strict-scalar.ll
@@ -660,7 +660,6 @@ define double @fma_f64(double %a, double %b, double %c) nounwind strictfp {
; X87-NEXT: fstpl {{[0-9]+}}(%esp)
; X87-NEXT: fstpl {{[0-9]+}}(%esp)
; X87-NEXT: fstpl (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll fma
; X87-NEXT: addl $24, %esp
; X87-NEXT: retl
@@ -717,7 +716,6 @@ define float @fma_f32(float %a, float %b, float %c) nounwind strictfp {
; X87-NEXT: fstps {{[0-9]+}}(%esp)
; X87-NEXT: fstps {{[0-9]+}}(%esp)
; X87-NEXT: fstps (%esp)
-; X87-NEXT: wait
; X87-NEXT: calll fmaf
; X87-NEXT: addl $12, %esp
; X87-NEXT: retl
diff --git a/llvm/test/CodeGen/X86/fp128-cast-strict.ll b/llvm/test/CodeGen/X86/fp128-cast-strict.ll
index f141153d059acb..a7cc2ae49e005f 100644
--- a/llvm/test/CodeGen/X86/fp128-cast-strict.ll
+++ b/llvm/test/CodeGen/X86/fp128-cast-strict.ll
@@ -28,7 +28,7 @@ define dso_local void @TestFPExtF16_F128() nounwind strictfp {
; X64-AVX512-LABEL: TestFPExtF16_F128:
; X64-AVX512: # %bb.0: # %entry
; X64-AVX512-NEXT: pushq %rax
-; X64-AVX512-NEXT: vmovsh vf16(%rip), %xmm0
+; X64-AVX512-NEXT: vmovsh {{.*#+}} xmm0 = mem[0],zero,zero,zero,zero,zero,zero,zero
; X64-AVX512-NEXT: callq __extendhftf2@PLT
; X64-AVX512-NEXT: vmovaps %xmm0, vf128(%rip)
; X64-AVX512-NEXT: popq %rax
@@ -167,7 +167,6 @@ define dso_local void @TestFPExtF80_F128() nounwind strictfp {
; X64-SSE-NEXT: subq $24, %rsp
; X64-SSE-NEXT: fldt vf80(%rip)
; X64-SSE-NEXT: fstpt (%rsp)
-; X64-SSE-NEXT: wait
; X64-SSE-NEXT: callq __extendxftf2@PLT
; X64-SSE-NEXT: movaps %xmm0, vf128(%rip)
; X64-SSE-NEXT: addq $24, %rsp
@@ -178,7 +177,6 @@ define dso_local void @TestFPExtF80_F128() nounwind strictfp {
; X64-AVX-NEXT: subq $24, %rsp
; X64-AVX-NEXT: fldt vf80(%rip)
; X64-AVX-NEXT: fstpt (%rsp)
-; X64-AVX-NEXT: wait
; X64-AVX-NEXT: callq __extendxftf2@PLT
; X64-AVX-NEXT: vmovaps %xmm0, vf128(%rip)
; X64-AVX-NEXT: addq $24, %rsp
diff --git a/llvm/test/CodeGen/X86/fp80-strict-libcalls.ll b/llvm/test/CodeGen/X86/fp80-strict-libcalls.ll
index 4d50b15e5c185b..9462acd53c4392 100644
--- a/llvm/test/CodeGen/X86/fp80-strict-libcalls.ll
+++ b/llvm/test/CodeGen/X86/fp80-strict-libcalls.ll
@@ -12,7 +12,6 @@ define x86_fp80 @fma(x86_fp80 %x, x86_fp80 %y, x86_fp80 %z) nounwind strictfp {
; X86-NEXT: fstpt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll fmal
; X86-NEXT: addl $36, %esp
; X86-NEXT: retl
@@ -26,7 +25,6 @@ define x86_fp80 @fma(x86_fp80 %x, x86_fp80 %y, x86_fp80 %z) nounwind strictfp {
; X64-NEXT: fstpt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq fmal@PLT
; X64-NEXT: addq $56, %rsp
; X64-NEXT: retq
@@ -43,7 +41,6 @@ define x86_fp80 @frem(x86_fp80 %x, x86_fp80 %y) nounwind strictfp {
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll fmodl
; X86-NEXT: addl $24, %esp
; X86-NEXT: retl
@@ -55,7 +52,6 @@ define x86_fp80 @frem(x86_fp80 %x, x86_fp80 %y) nounwind strictfp {
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq fmodl@PLT
; X64-NEXT: addq $40, %rsp
; X64-NEXT: retq
@@ -70,7 +66,6 @@ define x86_fp80 @ceil(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll ceill
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -80,7 +75,6 @@ define x86_fp80 @ceil(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq ceill@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -95,7 +89,6 @@ define x86_fp80 @cos(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll cosl
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -105,7 +98,6 @@ define x86_fp80 @cos(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq cosl@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -120,7 +112,6 @@ define x86_fp80 @exp(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll expl
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -130,7 +121,6 @@ define x86_fp80 @exp(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq expl@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -145,7 +135,6 @@ define x86_fp80 @exp2(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll exp2l
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -155,7 +144,6 @@ define x86_fp80 @exp2(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq exp2l@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -170,7 +158,6 @@ define x86_fp80 @floor(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll floorl
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -180,7 +167,6 @@ define x86_fp80 @floor(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq floorl@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -195,7 +181,6 @@ define x86_fp80 @log(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll logl
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -205,7 +190,6 @@ define x86_fp80 @log(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq logl@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -220,7 +204,6 @@ define x86_fp80 @log10(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll log10l
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -230,7 +213,6 @@ define x86_fp80 @log10(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq log10l@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -245,7 +227,6 @@ define x86_fp80 @log2(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll log2l
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -255,7 +236,6 @@ define x86_fp80 @log2(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq log2l@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -272,7 +252,6 @@ define x86_fp80 @maxnum(x86_fp80 %x, x86_fp80 %y) nounwind strictfp {
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll fmaxl
; X86-NEXT: addl $24, %esp
; X86-NEXT: retl
@@ -284,7 +263,6 @@ define x86_fp80 @maxnum(x86_fp80 %x, x86_fp80 %y) nounwind strictfp {
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq fmaxl@PLT
; X64-NEXT: addq $40, %rsp
; X64-NEXT: retq
@@ -301,7 +279,6 @@ define x86_fp80 @minnum(x86_fp80 %x, x86_fp80 %y) nounwind strictfp {
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll fminl
; X86-NEXT: addl $24, %esp
; X86-NEXT: retl
@@ -313,7 +290,6 @@ define x86_fp80 @minnum(x86_fp80 %x, x86_fp80 %y) nounwind strictfp {
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq fminl@PLT
; X64-NEXT: addq $40, %rsp
; X64-NEXT: retq
@@ -328,7 +304,6 @@ define x86_fp80 @nearbyint(x86_fp80 %x) nounwind strictfp {
; X86-NEXT: subl $12, %esp
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll nearbyintl
; X86-NEXT: addl $12, %esp
; X86-NEXT: retl
@@ -338,7 +313,6 @@ define x86_fp80 @nearbyint(x86_fp80 %x) nounwind strictfp {
; X64-NEXT: subq $24, %rsp
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq nearbyintl@PLT
; X64-NEXT: addq $24, %rsp
; X64-NEXT: retq
@@ -355,7 +329,6 @@ define x86_fp80 @pow(x86_fp80 %x, x86_fp80 %y) nounwind strictfp {
; X86-NEXT: fldt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll powl
; X86-NEXT: addl $24, %esp
; X86-NEXT: retl
@@ -367,7 +340,6 @@ define x86_fp80 @pow(x86_fp80 %x, x86_fp80 %y) nounwind strictfp {
; X64-NEXT: fldt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt {{[0-9]+}}(%rsp)
; X64-NEXT: fstpt (%rsp)
-; X64-NEXT: wait
; X64-NEXT: callq powl@PLT
; X64-NEXT: addq $40, %rsp
; X64-NEXT: retq
@@ -385,7 +357,6 @@ define x86_fp80 @powi(x86_fp80 %x, i32 %y) nounwind strictfp {
; X86-NEXT: movl {{[0-9]+}}(%esp), %eax
; X86-NEXT: movl %eax, {{[0-9]+}}(%esp)
; X86-NEXT: fstpt (%esp)
-; X86-NEXT: wait
; X86-NEXT: calll __powixf2
; X86-NEXT: addl $16, %esp
; X86-NEXT: retl
@@ -395,7 +366,6 @@ define x86_fp80 @powi(x86_fp80 %x, i32 %y) nounwind strictfp {
; X64-NEXT:...
[truncated]
|
|
I think the wait instructions are intended. The affected tests are all for constrained intrinsics which care the precise exception. Per to SDM
We don't know what instructions a function call may execute, so should emit |
09d9ee3
to
16fcef2
Compare
In that case, let's fix the wait instruction to do that. |
In that case, I think I have a solution. Question: do you know if implicit-defs for function calls are counted as operands in a call instruction? |
93c408b
to
60ac384
Compare
@phoebewang So what do you think about that now? |
Is it possible to test this change? I don't know what |
It's for adding those registers to the usedregs array so they aren't marked as killed between function calls. This had the side effect of causing the wait insertion pass to not consider call sites as places to insert wait before, so as a result I'm sure other workarounds had to be done. |
My point is that there should be some other test change that proves that this patch does something useful. Making a change and then another change to undo the only observed effect doesn't seem useful by itself. |
4b4552e
to
986e88a
Compare
3d0c0f6
to
542ce93
Compare
48d68f5
to
82e6233
Compare
9938093
to
7df048c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, sorry for missing phoebe's suggestion
The reason adding fpcr broke tests is because that caused LLVM to no longer kill the instruction before a call, which prevented LLVM from treating x87 as an operand, which meant the call was not eligible for a wait before it as a result. This patch now has LLVM add wait to the end of a x87 instruction is a call is immediately after. To prevent tests from breaking, another fix had to be made: Now, we check if the instruction after a waiting instruction is a call, and if so, we insert the wait.
@phoebewang @KanRobert I do not have commit permissions. |
This commit is associated with a |
To prevent tests from breaking, another fix had to be made: Now, we check if the instruction after a waiting instruction is a call, and if so, we insert the wait.