Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Jit64: Minor divwx optimizations #9667

Merged
merged 3 commits into from May 20, 2021
Merged

Conversation

Sintendo
Copy link
Member

Some of the optimizations introduced in #9566 can be tweaked to generate ever so slightly better code under specific circumstances.


Division by 2 (with d == a)

Before:

8B C7                mov         eax,edi
8B F8                mov         edi,eax
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1

After:

8B C7                mov         eax,edi
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1
Division by power of 2 (with d == a)

Before:

8B C6                mov         eax,esi
85 C0                test        eax,eax
8D 70 03             lea         esi,[rax+3]
0F 49 F0             cmovns      esi,eax
C1 FE 02             sar         esi,2

After:

85 F6                test        esi,esi
8D 46 03             lea         eax,[rsi+3]
0F 48 F0             cmovs       esi,eax
C1 FE 02             sar         esi,2
Constant dividend (with d == b)

Before:

45 85 FF             test        r15d,r15d
75 05                jne         normal_path
45 33 FF             xor         r15d,r15d
EB 0C                jmp         done
normal_path:
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:

After:

45 85 FF             test        r15d,r15d
74 0C                je          done
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:

When destination and input registers match, a redundant MOV instruction
can be eliminated.

Before:
8B C7                mov         eax,edi
8B F8                mov         edi,eax
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1

After:
8B C7                mov         eax,edi
C1 EF 1F             shr         edi,1Fh
03 F8                add         edi,eax
D1 FF                sar         edi,1
Division by a power of two can be slightly improved when the
destination and dividend registers are the same.

Before:
8B C6                mov         eax,esi
85 C0                test        eax,eax
8D 70 03             lea         esi,[rax+3]
0F 49 F0             cmovns      esi,eax
C1 FE 02             sar         esi,2

After:
85 F6                test        esi,esi
8D 46 03             lea         eax,[rsi+3]
0F 48 F0             cmovs       esi,eax
C1 FE 02             sar         esi,2
We normally check for division by zero to know if we should set the
destination register to zero with a XOR. However, when the divisor and
destination registers are the same the explicit zeroing can be omitted.
In addition, some of the surrounding branching can be simplified as
well.

Before:
45 85 FF             test        r15d,r15d
75 05                jne         normal_path
45 33 FF             xor         r15d,r15d
EB 0C                jmp         done
normal_path:
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:

After:
45 85 FF             test        r15d,r15d
74 0C                je          done
B8 5A 00 00 00       mov         eax,5Ah
99                   cdq
41 F7 FF             idiv        eax,r15d
44 8B F8             mov         r15d,eax
done:
@lioncash lioncash merged commit 539c2cb into dolphin-emu:master May 20, 2021
10 checks passed
@Sintendo Sintendo deleted the jit64divwx2 branch May 28, 2021 09:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants