Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix PPC_FP on non-sse4.1 code paths. #681

Merged
merged 1 commit into from Jul 31, 2014
Merged

Conversation

phire
Copy link
Member

@phire phire commented Jul 27, 2014

The Invalid bit on the x87 fpu is sticky, so once a single NaN goes
through the old code on CPUs without sse4.1 all future floats are
mutilated.

Patch to emulate PTEST by Fiora.

Should fix issue 7237 and issue 7510.

@FioraAeterna
Copy link
Contributor

The other possible solution that phire suggested was to clear the x87 FPU flags on the NaN path (so if a NAN appears, the flags get set back accordingly). This has a latency of 22 cycles on Haswell, so it might stall the FPU for quite a while, but if that path is rare it might not matter.

@lioncash
Copy link
Member

Could this not be abstracted out into a function called PTEST or something equivalent(considering it is emulating it, may as well toss a concrete name onto it)? This is sort of duplicating code.

@FioraAeterna
Copy link
Contributor

It's only a partial emulation; ptest sets both the C and Z flags while this only partially replicates the functionality (just, only the part this code needs). I mean, it could still be a separate function, it's just not quite generic enough to be a PTEST replacement in full (like the MOVDDUP replacement and similar things in the x64emitter), I think?

@lioncash
Copy link
Member

Making it its own function would be sort of nice, because then you can move the m_zero to be a local.

@FioraAeterna
Copy link
Contributor

Also note the only reason I have the m_zero there is because I ran out of registers (I'd need three, I think, unless I'm missing something, to use pxor).

@phire
Copy link
Member Author

phire commented Jul 27, 2014

Comments fixed to c++ style.

@FioraAeterna
Copy link
Contributor

The "AX could get trashed" comment for ConvertSingleToDouble and such should probably be changed to "EAX could get trashed", I think

@JMC47
Copy link
Contributor

JMC47 commented Jul 27, 2014

This is a serious issue for users without SSE4.1, is there anything I can do to help test it on my computers?

@pauldacheez
Copy link
Contributor

@JMC47 What I'd do is checkout the branch, find any files with #ifdef _M_SSE4.1 or whatever and put an #undef _M_SSE4.1 at the top of the file before compiling. That's a pretty derpy way to do it, though – you can probably just pass a compiler flag that does the same job better, but I don't know enough about MSVC to tell you what it is.

@FioraAeterna
Copy link
Contributor

You could delete the SSE4 side of the branch in those two functions to force it to run the pre-SSE4 code, maybe?

@Tilka
Copy link
Member

Tilka commented Jul 30, 2014

lgtm once the comment in the header file is fixed. @JMC47 If you want to test this, #undef _M_SSE4.1 won't change anything, you need to delete the branch.

The Invalid bit on the x87 fpu is sticky, so once a single NaN goes
through the old code on CPUs without sse4.1 all future floats are
mutilated.

Patch to emulate PTEST by Fiora.

Fixes issue 7237 and issue 7510.
@phire
Copy link
Member Author

phire commented Jul 31, 2014

@JMC47 I set the bSSE4_1 variable to false to test the whole jit as these users would experience it. x64CPUDetect.cpp:160

The comment in the header has been fixed. This can be merged if everyone is happy.

@lioncash
Copy link
Member

You still haven't answered the code de-duplication suggestion. You can encapsulate a member into a local that way.

EDIT: Never mind, looks good to me.

@phire
Copy link
Member Author

phire commented Jul 31, 2014

oh, I was hoping everyone had forgotten about that :D

@phire
Copy link
Member Author

phire commented Jul 31, 2014

Code isn't common, notice one block uses movsd and the other uses movss.

lioncash added a commit that referenced this pull request Jul 31, 2014
Fix PPC_FP on non-sse4.1 code paths.
@lioncash lioncash merged commit f507827 into dolphin-emu:master Jul 31, 2014
@phire phire deleted the non-sse4_1 branch November 2, 2014 06:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
6 participants