New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix PPC_FP on non-sse4.1 code paths. #681
Conversation
|
The other possible solution that phire suggested was to clear the x87 FPU flags on the NaN path (so if a NAN appears, the flags get set back accordingly). This has a latency of 22 cycles on Haswell, so it might stall the FPU for quite a while, but if that path is rare it might not matter. |
|
Could this not be abstracted out into a function called PTEST or something equivalent(considering it is emulating it, may as well toss a concrete name onto it)? This is sort of duplicating code. |
|
It's only a partial emulation; ptest sets both the C and Z flags while this only partially replicates the functionality (just, only the part this code needs). I mean, it could still be a separate function, it's just not quite generic enough to be a PTEST replacement in full (like the MOVDDUP replacement and similar things in the x64emitter), I think? |
|
Making it its own function would be sort of nice, because then you can move the m_zero to be a local. |
|
Also note the only reason I have the m_zero there is because I ran out of registers (I'd need three, I think, unless I'm missing something, to use pxor). |
|
Comments fixed to c++ style. |
|
The "AX could get trashed" comment for ConvertSingleToDouble and such should probably be changed to "EAX could get trashed", I think |
|
This is a serious issue for users without SSE4.1, is there anything I can do to help test it on my computers? |
|
@JMC47 What I'd do is checkout the branch, find any files with #ifdef _M_SSE4.1 or whatever and put an #undef _M_SSE4.1 at the top of the file before compiling. That's a pretty derpy way to do it, though – you can probably just pass a compiler flag that does the same job better, but I don't know enough about MSVC to tell you what it is. |
|
You could delete the SSE4 side of the branch in those two functions to force it to run the pre-SSE4 code, maybe? |
|
lgtm once the comment in the header file is fixed. @JMC47 If you want to test this, |
The Invalid bit on the x87 fpu is sticky, so once a single NaN goes through the old code on CPUs without sse4.1 all future floats are mutilated. Patch to emulate PTEST by Fiora. Fixes issue 7237 and issue 7510.
|
@JMC47 I set the bSSE4_1 variable to false to test the whole jit as these users would experience it. x64CPUDetect.cpp:160 The comment in the header has been fixed. This can be merged if everyone is happy. |
|
You still haven't answered the code de-duplication suggestion. You can encapsulate a member into a local that way. EDIT: Never mind, looks good to me. |
|
oh, I was hoping everyone had forgotten about that :D |
|
Code isn't common, notice one block uses movsd and the other uses movss. |
Fix PPC_FP on non-sse4.1 code paths.
The Invalid bit on the x87 fpu is sticky, so once a single NaN goes
through the old code on CPUs without sse4.1 all future floats are
mutilated.
Patch to emulate PTEST by Fiora.
Should fix issue 7237 and issue 7510.