JIT: Carry optimizations! #1021
bool Jit64::CheckMergedBranch(int crf)
{
	const UGeckoInstruction& next = js.next_inst;
	if (((next.OPCD == 16 /* bcx */) ||
This PR is looking pretty hot.
It's starting to get a little messy. You are pushing the limit of what Jit64 should be able to do. If you plan to do much more instruction merging/reordering in the future, you might want to think hard about how you are arranging things, and maybe move towards grafting some kind of IL on top of Jit64. The feature for disabling branch merging is nice, but it would be good to have a flag to disable all forms of merging, such as the carry stuff. None of this is bad code; you are just pushing limits.
Shorter, plus should make future optimizations easier.
Again, shorter and should make future optimizations easier.
Looks good to me.
Please don't merge this right now; I'm looking into a strange possible bug in POV-Ray with this.
Omit carry calculations that get overwritten later in the block before they're used. Very common in the case of srawix and friends.
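A minimal sketch of how such a dead-carry pass could look, assuming a simplified instruction record. `CarryOp`, its fields, and `MarkDeadCarries` are hypothetical names for illustration and are not the PPCAnalyst code from this PR. The idea is a single backward walk over the block that tracks whether a later instruction still wants the carry:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction record (not Dolphin's actual types).
struct CarryOp
{
    bool reads_carry;   // e.g. adde, subfe
    bool writes_carry;  // e.g. addc, addic, srawi
    bool carry_needed;  // analysis result: must this op compute the carry?
};

// Walk the block backwards and mark which carry writes are actually observed.
void MarkDeadCarries(std::vector<CarryOp>& block)
{
    // Conservative assumption: code after the block may still read the carry.
    bool carry_live = true;
    for (std::size_t i = block.size(); i-- > 0;)
    {
        CarryOp& op = block[i];
        if (op.writes_carry)
        {
            // The write only matters if something later reads it first.
            op.carry_needed = carry_live;
            carry_live = false;
        }
        else
        {
            op.carry_needed = false;
        }
        // An op reads its carry input before writing its carry output,
        // so the read makes the carry live for everything earlier.
        if (op.reads_carry)
            carry_live = true;
    }
}
```

With two carry-writing shifts in a row followed by a carry reader, only the second shift has to compute the carry; the first one's result is overwritten before anything looks at it.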
Keep carry flags in the x86 flags register if used in the next instruction.
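The decision described here can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's implementation: `CarryUse`, `CarryPlan`, and `PlanCarry` are hypothetical names, and the model assumes every carry-reading op (adde, subfe, and similar) also writes the carry, which holds for the PowerPC carry arithmetic instructions.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical classification of an instruction's carry behaviour.
enum class CarryUse { None, Writes, ReadsAndWrites };

// Hypothetical per-instruction plan for moving carry between XER and host flags.
struct CarryPlan
{
    bool load_from_xer;  // carry input must be fetched from the emulated XER
    bool store_to_xer;   // carry output must be written back to the emulated XER
};

std::vector<CarryPlan> PlanCarry(const std::vector<CarryUse>& block)
{
    std::vector<CarryPlan> plan(block.size(), CarryPlan{false, false});
    bool carry_in_host_flags = false;
    for (std::size_t i = 0; i < block.size(); ++i)
    {
        if (block[i] == CarryUse::None)
        {
            // Unrelated ops may clobber the host flags, so never rely on them here.
            carry_in_host_flags = false;
            continue;
        }
        if (block[i] == CarryUse::ReadsAndWrites)
            plan[i].load_from_xer = !carry_in_host_flags;

        // Keep the carry in the host flags only if the very next op consumes it.
        const bool next_reads =
            i + 1 < block.size() && block[i + 1] == CarryUse::ReadsAndWrites;
        plan[i].store_to_xer = !next_reads;
        carry_in_host_flags = next_reads;
    }
    return plan;
}
```

For an addc immediately followed by adde, the addc skips the store and the adde skips the load, so the pair can map onto a plain add/adc sequence on the host.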
Try as hard as possible to push carry-using operations (like addc and adde) next to each other. Refactor the instruction reordering to be more flexible and allow multiple passes. Reduces a carry-heavy code block in Pokemon Puzzle from 353 to 192 x86 instructions. 12% faster overall in Pokemon Puzzle; probably less in typical games (Virtual Console games seem to be carry-heavy for some reason; maybe a different compiler?).
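One way to picture the reordering is a pass that looks for a carry producer and a carry consumer separated by a single unrelated instruction, and swaps the producer past that instruction when no register dependency forbids it. The sketch below assumes a simplified instruction model; `Inst`, `CanSwap`, and `PackCarryOps` are hypothetical names and the dependency check is far cruder than what a real JIT needs.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical, simplified instruction record (not Dolphin's actual types).
struct Inst
{
    int dest;               // destination GPR, or -1 if none
    std::vector<int> srcs;  // source GPRs
    bool reads_carry;
    bool writes_carry;
};

static bool ReadsReg(const Inst& inst, int reg)
{
    if (reg < 0)
        return false;
    for (int s : inst.srcs)
        if (s == reg)
            return true;
    return false;
}

// Two adjacent ops may trade places only if neither depends on the other.
bool CanSwap(const Inst& a, const Inst& b)
{
    const bool a_carry = a.reads_carry || a.writes_carry;
    const bool b_carry = b.reads_carry || b.writes_carry;
    if (a_carry && b_carry)
        return false;  // never reorder two carry ops relative to each other
    if (ReadsReg(b, a.dest) || ReadsReg(a, b.dest))
        return false;  // read-after-write or write-after-read hazard
    if (a.dest >= 0 && a.dest == b.dest)
        return false;  // write-after-write hazard
    return true;
}

// One pass: move carry producers down so they sit directly before their consumer.
// Returns the number of swaps, so a caller could repeat passes until it returns 0.
int PackCarryOps(std::vector<Inst>& block)
{
    int swaps = 0;
    for (std::size_t i = 0; i + 2 < block.size(); ++i)
    {
        const Inst& middle = block[i + 1];
        const bool middle_touches_carry = middle.reads_carry || middle.writes_carry;
        if (block[i].writes_carry && block[i + 2].reads_carry &&
            !middle_touches_carry && CanSwap(block[i], middle))
        {
            std::swap(block[i], block[i + 1]);
            ++swaps;
        }
    }
    return swaps;
}
```

Once the producer and consumer are adjacent, the carry no longer has to survive an unrelated instruction in between.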
Should be fixed now!
For future reference, the bug I wasted hours on was caused by the fact that "addi" treats an input of "r0" as 0, but "addic" doesn't. I made the code comment about this a bit clearer in the hope that nobody else suffers the same fate.
These commits basically:
1/2: Refactor the carry-affecting arithmetic ops to save ~130 lines of code and make the later patches easier.
3: Add PPCAnalyst optimization for not calculating carry flags that won't be used (especially useful for srawix).
4: Add the ability to keep carry flags in the x86 flags register between ops in the case that they're used in the next op, such as in super common idioms like subc/subfe and so on.
5: Reorder carry-affecting instructions to try to be next to each other whenever possible.
Seems to be ~12% faster overall in Pokemon Puzzle from a quick inaccurate benchmark. Probably a lot less in actual games, but I don't know.