Jit64: divwux - Prefer three-operand IMUL #9690

Merged
merged 1 commit into from May 13, 2021

@Sintendo Sintendo commented May 5, 2021

Another tiny optimization.

By taking advantage of three-operand IMUL, we can eliminate a MOV instruction. This is a small code size win. However, because IMUL sign-extends its 32-bit immediate to 64 bits, we can only apply this when the magic number's most significant bit (bit 31) is zero.
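The sign-extension constraint can be illustrated with a small Python model. This is only a sketch: `sign_extend_imm32` and `imul_imm32` are hypothetical helper names that imitate the low 64 bits produced by `imul r64, r/m64, imm32`; they are not part of Dolphin. The constants are taken from the division-by-18 listing in this PR.

```python
MASK64 = (1 << 64) - 1

def sign_extend_imm32(imm):
    """Return the 64-bit pattern that a 32-bit immediate becomes after sign extension."""
    imm &= 0xFFFFFFFF
    return imm | 0xFFFFFFFF00000000 if imm & 0x80000000 else imm

def imul_imm32(x, imm):
    """Model the low 64 bits of x * sign_extend(imm), as three-operand IMUL would produce."""
    return (x * sign_extend_imm32(imm)) & MASK64

x = 1800  # zero-extended 32-bit dividend

# Bit 31 of the immediate is clear: the immediate form divides by 18 correctly.
good = imul_imm32(x, 0x38E38E39) >> 0x22

# Bit 31 is set: the immediate is treated as negative, so the result is wrong.
bad = imul_imm32(x, 0xE38E38E4) >> 0x24

print(good, x // 18, bad)
```

With the original magic number `0E38E38E4h`, the MOV into a 32-bit register zero-extends the value, which is why the two-instruction sequence is still needed whenever bit 31 is set.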

To ensure this can actually happen, we also minimize the magic number: each trailing zero bit is shifted out of the magic number and the final shift amount is reduced to match.
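A rough Python sketch of the minimization step follows. It assumes the common round-up formula `magic = ceil(2^(32+s) / d)` with `s = floor(log2(d))`, which reproduces the "Before" constants in this PR but is not necessarily how the JIT computes them; the function names `unsigned_magic`, `minimize`, and `fits_imm32` are hypothetical. Under this formula the unminimized magic number is always greater than 2^31, so bit 31 is always set before trailing zeroes are removed.

```python
def unsigned_magic(divisor):
    """Round-up magic for 32-bit unsigned division: x // d == (x * magic) >> shift."""
    if not 2 <= divisor < (1 << 32):
        raise ValueError("divisor out of range")
    s = divisor.bit_length() - 1            # floor(log2(divisor))
    shift = 32 + s
    magic = -(-(1 << shift) // divisor)     # ceil(2**shift / divisor)
    error = magic * divisor - (1 << shift)
    # The round-up method is only valid when the rounding error is small enough
    # and the magic number fits in 32 bits (this excludes powers of two).
    if magic >= (1 << 32) or error > (1 << s):
        raise ValueError("round-up method does not apply to this divisor")
    return magic, shift

def minimize(magic, shift):
    """Drop trailing zero bits from the magic number, reducing the shift to match."""
    while magic % 2 == 0 and shift > 0:
        magic >>= 1
        shift -= 1
    return magic, shift

def fits_imm32(magic):
    """True if the magic survives sign extension unchanged, i.e. bit 31 is clear."""
    return magic < (1 << 31)

for d in (18, 52, 125000, 3):
    magic, shift = minimize(*unsigned_magic(d))
    print(d, hex(magic), hex(shift), fits_imm32(magic))
```

For 18, 52 and 125000 the minimized pairs match the "After" listings in this PR. For a divisor such as 3 the magic number is odd (`0xAAAAAAAB`), nothing can be shifted out, and the MOV-based sequence remains necessary.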

Unsigned division by 18

Before:

41 BE E4 38 8E E3    mov         r14d,0E38E38E4h
4D 0F AF F5          imul        r14,r13
49 C1 EE 24          shr         r14,24h

After:

4D 69 F5 39 8E E3 38 imul        r14,r13,38E38E39h
49 C1 EE 22          shr         r14,22h

Unsigned division by 52

Before:

B8 9E D8 89 9D       mov         eax,9D89D89Eh
48 0F AF F8          imul        rdi,rax
48 C1 EF 25          shr         rdi,25h

After:

48 69 FF 4F EC C4 4E imul        rdi,rdi,4EC4EC4Fh
48 C1 EF 24          shr         rdi,24h

Unsigned division by 125000

Before:

BF 06 BD 37 86       mov         edi,8637BD06h
49 0F AF FF          imul        rdi,r15
48 C1 EF 30          shr         rdi,30h

After:

49 69 FF 83 DE 1B 43 imul        rdi,r15,431BDE83h
48 C1 EF 2F          shr         rdi,2Fh


@leoetlino leoetlino left a comment


looks fine to me, but would like a second opinion

@lioncash lioncash merged commit 24b9a64 into dolphin-emu:master May 13, 2021
10 checks passed
@Sintendo Sintendo deleted the jit64divwux branch May 13, 2021 11:52