JIT: Optimize JitAsmCommon, Float, and PS implementations #686

Merged
merged 4 commits into from Sep 17, 2014

FioraAeterna
Contributor

I did various optimizations to JitAsmCommon to improve the quantization/dequantization routines and save a few instructions, plus take advantage of the SSE4 instructions I added a bit back.

@lioncash
Member

@dolphin-emu-bot rebuild

@Tilka
Member

Tilka commented Jul 28, 2014

Haven't yet looked much at the paired-single commit, but the other two commits lgtm.

@FioraAeterna
Contributor Author

I pushed another commit to this branch with some similar optimizations for the regular, non-paired float functions. I also added MOVLPD/HPD, which should help dealing with the unpaired floats a little bit, since they can read/write to/from the top/bottom halves of an SSE register.
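For readers unfamiliar with MOVLPD/MOVHPD, here is a minimal sketch of the half-register access described above, written with SSE2 intrinsics rather than the JIT emitter. The helper names (`LoadLow`, `LoadHigh`, `StoreLow`, `StoreHigh`, `HalfRegisterRoundTrip`) are made up for illustration and are not part of this patch; a compiler will typically lower these intrinsics to MOVLPD/MOVHPD but is free to choose an equivalent encoding.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Replace only the low 64 bits of reg with *src; the high half is kept.
inline __m128d LoadLow(__m128d reg, const double* src) { return _mm_loadl_pd(reg, src); }

// Replace only the high 64 bits of reg with *src; the low half is kept.
inline __m128d LoadHigh(__m128d reg, const double* src) { return _mm_loadh_pd(reg, src); }

// Write just one half of the register to memory.
inline void StoreLow(double* dst, __m128d reg) { _mm_storel_pd(dst, reg); }
inline void StoreHigh(double* dst, __m128d reg) { _mm_storeh_pd(dst, reg); }

// Usage sketch: overwrite one half at a time and check the other is untouched.
inline bool HalfRegisterRoundTrip()
{
  const double new_low = 1.5, new_high = 2.5;
  __m128d reg = _mm_set_pd(8.0, 7.0);  // high = 8.0, low = 7.0

  reg = LoadLow(reg, &new_low);  // high stays 8.0
  double low = 0.0, high = 0.0;
  StoreLow(&low, reg);
  StoreHigh(&high, reg);
  if (low != 1.5 || high != 8.0)
    return false;

  reg = LoadHigh(reg, &new_high);  // low stays 1.5
  StoreLow(&low, reg);
  StoreHigh(&high, reg);
  return low == 1.5 && high == 2.5;
}
```

The point for unpaired floats is that a single value can be moved in or out of either half without disturbing the other half and without a separate shuffle.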

@Parlane
Member

Parlane commented Jul 28, 2014

@dolphin-emu-bot rebuild

@FioraAeterna
Contributor Author

Fixed a bug with my blendvpd implementation of ps_sel.

@phire
Member

phire commented Jul 28, 2014

@dolphin-emu-bot rebuild

@neobrain
Member

I'm highly sceptical about this. SSE optimizations have historically been prone to introducing regressions, often ones that are hard to find (e.g. due to alignment requirements on OS X, which is a less tested platform).

Given that these patches are purely for optimization, I'd say this needs some cputests in https://github.com/dolphin-emu/hwtests to make sure behavior is correct.

@pauldacheez
Contributor

Not sure if relevant, but Clang gives me an "LLVM ERROR" when compiling with -sse4 or -march=native anyway. I'm not sure if that implies that OS X builds compile without any SSE4.x opts at all (including Dolphin's) or if it just does ours and doesn't do extra opts.

@Tilka
Member

Tilka commented Jul 30, 2014

@FioraAeterna Maybe you could extract the 3-byte opcode commit into a separate PR to get it merged more quickly? Also, what do you think about adding some wrapper functions for the CPU checks? Maybe even a macro that stringifies the instruction function names so you don't have to provide them every time?
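A rough sketch of what such a wrapper plus stringifying macro could look like. Everything here is hypothetical: `CpuFeatures`, `RequireFeature`, `REQUIRE_CPU_FEATURE`, and `EmitBLENDVPD` are illustrative names, not Dolphin's actual emitter API, and throwing an exception is just a stand-in for whatever error reporting the emitter really uses.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the CPU feature flags the JIT consults.
struct CpuFeatures
{
  bool bSSE3 = false;
  bool bSSE4_1 = false;
};

// Wrapper function: one place that performs the check and builds the message.
inline void RequireFeature(bool supported, const char* feature, const char* op)
{
  if (!supported)
    throw std::runtime_error(std::string("Trying to use ") + op + " without " + feature);
}

// The macro stringifies both the flag and the instruction name, so call
// sites never have to spell either of them out as a string literal.
#define REQUIRE_CPU_FEATURE(cpu, flag, op) RequireFeature((cpu).flag, #flag, #op)

// Hypothetical emitter function guarded by the check.
inline void EmitBLENDVPD(const CpuFeatures& cpu)
{
  REQUIRE_CPU_FEATURE(cpu, bSSE4_1, BLENDVPD);
  // ... the opcode bytes would be written here ...
}
```

Calling `EmitBLENDVPD` with `bSSE4_1` unset would fail with a message naming both `BLENDVPD` and `bSSE4_1`, without either string appearing at the call site.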

@FioraAeterna FioraAeterna changed the title Optimize JitAsmCommon and Paired Singles, add support for some SSE4 opcodes WIP: Optimize JitAsmCommon and Paired Singles, add support for some SSE4 opcodes Aug 23, 2014
@FioraAeterna FioraAeterna changed the title WIP: Optimize JitAsmCommon and Paired Singles, add support for some SSE4 opcodes JIT: Optimize JitAsmCommon, Float, and PS implementations Aug 28, 2014
@FioraAeterna
Contributor Author

I stripped out a bunch of parts of this patch because I do not trust any of this code I wrote weeks ago <_<;;;

@FioraAeterna FioraAeterna changed the title JIT: Optimize JitAsmCommon, Float, and PS implementations WIP: JIT: Optimize JitAsmCommon, Float, and PS implementations Aug 28, 2014
@FioraAeterna FioraAeterna changed the title WIP: JIT: Optimize JitAsmCommon, Float, and PS implementations JIT: Optimize JitAsmCommon, Float, and PS implementations Sep 14, 2014
Use some SSE4 instructions on CPUs that support them.
Use float instructions instead of int where appropriate (it's a cycle faster
on CPUs with arithmetic unit forwarding penalties).
Based on a patch by Tilka.
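To illustrate the float-versus-int point in the commit message: many bitwise operations exist in both a floating-point form (e.g. ANDPD) and an integer form (e.g. PAND) that produce identical bits. The sketch below shows both forms of a sign-bit clear using SSE2 intrinsics. The function names are made up for illustration, and which instructions the commit actually swaps is not shown here. A C++ compiler may pick either encoding for either function; a JIT emitter makes that choice explicitly.

```cpp
#include <emmintrin.h>  // SSE2 intrinsics

// Clear the sign bits using the floating-point form of AND (ANDPD).
inline __m128d AbsFloatDomain(__m128d x)
{
  const __m128d mask = _mm_castsi128_pd(_mm_set1_epi64x(0x7FFFFFFFFFFFFFFFLL));
  return _mm_and_pd(x, mask);
}

// Same result using the integer form of AND (PAND) on the same bits.
inline __m128d AbsIntDomain(__m128d x)
{
  const __m128i mask = _mm_set1_epi64x(0x7FFFFFFFFFFFFFFFLL);
  return _mm_castsi128_pd(_mm_and_si128(_mm_castpd_si128(x), mask));
}
```

Both functions return the same value for every input; the difference the commit message refers to is only in latency on CPUs that penalize moving data between execution domains.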
@@ -77,16 +77,7 @@ void Jit64::fp_tri_op(int d, int a, int b, bool reversible, bool single, void (X
if (single)
{
ForceSinglePrecisionS(fpr.RX(d));
if (cpu_info.bSSE3)

skidau added a commit that referenced this pull request Sep 17, 2014
JIT: Optimize JitAsmCommon, Float, and PS implementations
@skidau skidau merged commit 2c233c4 into dolphin-emu:master Sep 17, 2014
9 participants