Optimize memory access on Haswell by using MOVBE when possible. #173
|
Oh, and MOVBE for RAM access is still missing, which means most MOV+BSWAP pairs aren't replaced by MOVBE yet. |
|
I like this. Should we merge this as is or wait until the rest of the areas that can use it are updated to support it as well? |
|
Could be merged as-is, but I'd prefer if someone could complete this work (busy with the shader compiler atm). |
|
Should we merge this as-is since it is unlikely that anyone else will complete the rest of the work for at least quite a while? |
|
SGTM. Can anyone review this? |
|
Reviewed it before and it was a 'LGTM' from me. |
|
@hrydgard want to review? :D |
|
Seems fine to me. IIRC, though, MOVBE is only a significant speedup on Atom. It was added to Haswell mostly for easy binary compat; Haswell and other non-Atom Intel CPUs are very, very fast at BSWAP. So don't expect miracles. |
|
Looking at instruction latency tables it looked like this had potential for slight speedups compared to MOV+BSWAP, but I don't remember the details. Also smaller generated code == slightly less cache used. |
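For readers unfamiliar with the two sequences being compared, here is a minimal C++ sketch of a big-endian 32-bit load. The helper names `bswap32` and `load_be32` are hypothetical and not taken from this PR, which changes the JIT emitter rather than C++ helpers. The point is only that the same operation can be expressed as a load followed by a byte swap (MOV+BSWAP) or as a single load-and-swap (MOVBE) on CPUs that support it.

```cpp
#include <cstdint>

// Hypothetical helper: reverse the byte order of a 32-bit value.
// This is the operation a BSWAP instruction performs on a register.
constexpr std::uint32_t bswap32(std::uint32_t v) {
    return (v >> 24) | ((v >> 8) & 0x0000FF00u) |
           ((v << 8) & 0x00FF0000u) | (v << 24);
}

// Hypothetical helper: read a big-endian 32-bit value from memory.
// Written byte by byte so it is correct regardless of host endianness.
// On x86, optimizing compilers typically recognize this pattern and emit
// MOV+BSWAP, or a single MOVBE when targeting a CPU that has it.
constexpr std::uint32_t load_be32(const unsigned char* p) {
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
           static_cast<std::uint32_t>(p[3]);
}
```

Whether the single-instruction form is actually faster depends on the microarchitecture, which is what the surrounding comments are debating; the code-size saving applies either way.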
|
Well, yeah. Only benchmarking can say for sure, I just wanted to temper any too-high expectations :) |
|
I don't think anyone had high expectations regarding that change. It doesn't include any fastmem code change, and at least 90% of our accesses go through fastmem anyway. |
|
Are you saying that this change won't make my Haswell CPU stupidly quicker than it already is?! |
|
Hehe right. Anyway, LGTM. |
|
I'll do a few changes suggested by @Tilka tonight (there is a little refactoring to do in the MOVBE emitter code) and merge that. |
…sing MOVBE internally when possible.
…PU support code).
Optimize memory access on Haswell by using MOVBE when possible.
Haven't done any performance testing yet, just throwing this PR out there for early review. It should be mostly done, but I want to get some user testing.