Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize memory access on Haswell by using MOVBE when possible. #173

Merged
merged 5 commits into from Apr 11, 2014

Conversation

delroth
Copy link
Member

@delroth delroth commented Mar 16, 2014

Haven't done any performance testing, just throwing that PR out there for early reviews. It should be mostly done, but I want to get some user testing.

@delroth
Copy link
Member Author

delroth commented Mar 16, 2014

Oh, and MOVBE for RAM access is still missing, which means most MOV+BSWAP don't even use MOVBE yet.

@Sonicadvance1
Copy link
Contributor

I like this. Should we merge this as is or wait until the rest of the areas that can use it are updated to support it as well?

@delroth
Copy link
Member Author

delroth commented Mar 25, 2014

Could be merged as-is, but I'd prefer if someone could complete this work (busy with the shader compiler atm).

@Sonicadvance1
Copy link
Contributor

Should we merge this as-is since it is unlikely that anyone else will complete the rest of the work for at least quite a while?
(I won't of course since I really hate x86)

@delroth
Copy link
Member Author

delroth commented Apr 10, 2014

SGTM. Can anyone review this?

@Sonicadvance1
Copy link
Contributor

Reviewed it before and it was a 'LGTM' from me.
Probably should get someone else that cares more about x86 than me though.

@delroth
Copy link
Member Author

delroth commented Apr 10, 2014

@hrydgard want to review? :D

@hrydgard
Copy link
Contributor

Seems fine to me.

IIRC, though, MOVBE is only a significant speed up on Atom. It was added to Haswell mostly for easy binary compat, Haswell and other non-Atom Intel CPUs are very very fast at bswap. So don't expect miracles.

@delroth
Copy link
Member Author

delroth commented Apr 10, 2014

Looking at instruction latency tables it looked like this had potential for slight speedups compared to MOV+BSWAP, but I don't remember the details. Also smaller generated code == slightly less cache used.

@hrydgard
Copy link
Contributor

Well, yeah. Only benchmarking can say for sure, I just wanted to temper any too-high expectations :)

@delroth
Copy link
Member Author

delroth commented Apr 10, 2014

I don't think anyone had high expectations regarding that change. It doesn't include any fastmem code change anyway, and at least 90% of our accesses are via fastmem anyway.

@Sonicadvance1
Copy link
Contributor

Are you saying that this change won't make my Haswell CPU stupidly quicker than it already is?!

@hrydgard
Copy link
Contributor

Hehe right. Anyway, LGTM.

@delroth
Copy link
Member Author

delroth commented Apr 10, 2014

I'll do a few changes suggested by @Tilka tonight (there is a little refactoring to do in the MOVBE emitter code) and merge that.

delroth added a commit that referenced this pull request Apr 11, 2014
Optimize memory access on Haswell by using MOVBE when possible.
@delroth delroth merged commit a823edc into dolphin-emu:master Apr 11, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
3 participants