SPU LLVM: Optimize GB/GBH/GBB with a GFNI path #14669

Merged
merged 3 commits into from Oct 1, 2023


Whatcookie
Member

This treats the first input of gf2p8affineqb as a constant and the second input as the variable. With only 1 bit set in the constant, gf2p8affineqb extracts that selected bit from each byte of the second operand.
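The bit extraction described above can be checked with a scalar model of the instruction. This is a minimal sketch, assuming the documented per-byte definition of gf2p8affineqb (result bit i is the parity of matrix byte 7-i ANDed with the input byte, XORed with immediate bit i). The helper names `affine_byte`, `gf2p8affineqb_lane`, and `gather_bit` are hypothetical and are not taken from this PR.

```cpp
#include <cassert>
#include <cstdint>

// Parity (XOR of all bits) of the low 8 bits of v.
static unsigned parity8(unsigned v)
{
    v ^= v >> 4;
    v ^= v >> 2;
    v ^= v >> 1;
    return v & 1u;
}

// One output byte: `matrix` is the 64-bit lane of the second operand,
// `x` is one byte of the first operand, `imm` is the 8-bit immediate.
static std::uint8_t affine_byte(std::uint64_t matrix, std::uint8_t x, std::uint8_t imm)
{
    std::uint8_t out = 0;
    for (int i = 0; i < 8; i++)
    {
        const std::uint8_t row = static_cast<std::uint8_t>(matrix >> (8 * (7 - i)));
        const unsigned bit = parity8(row & x) ^ ((imm >> i) & 1u);
        out |= static_cast<std::uint8_t>(bit << i);
    }
    return out;
}

// One full 64-bit lane: every byte of `x` is transformed by the same matrix.
static std::uint64_t gf2p8affineqb_lane(std::uint64_t x, std::uint64_t matrix, std::uint8_t imm)
{
    std::uint64_t result = 0;
    for (int b = 0; b < 8; b++)
    {
        const std::uint8_t xb = static_cast<std::uint8_t>(x >> (8 * b));
        result |= static_cast<std::uint64_t>(affine_byte(matrix, xb, imm)) << (8 * b);
    }
    return result;
}

// With a single bit set in the constant byte, the parity of (row & x) is just
// bit k of that row, so the result gathers bit k of each of the 8 data bytes.
static std::uint8_t gather_bit(std::uint64_t data, unsigned k)
{
    return affine_byte(data, static_cast<std::uint8_t>(1u << k), 0);
}

// Usage example: the LSBs of byte 0 and byte 7 are set,
// so bits 7 and 0 of the gathered byte are set.
static void gather_bit_example()
{
    assert(gather_bit(0x0100000000000001ULL, 0) == 0x81);
}
```

In this model the byte at index 0 of the lane lands in the top bit of the result, and the instruction only ever looks at one 64-bit lane at a time, so a 128-bit register yields two independent 8-bit gathers.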

This brings the number of instructions needed down to just 2, from a worst case of 6 (GBH), and avoids round trips between vector -> int -> vector registers.

CPUs that can take this path are 11th gen and newer for Intel, and Zen 4 (Ryzen 7000) and newer for AMD.
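Whether a given machine can take this path comes down to the GFNI feature flag. As a minimal sketch (not the project's own detection code), GFNI is reported in CPUID leaf 7, subleaf 0, ECX bit 8. The helper names below are hypothetical, and the GCC/Clang `<cpuid.h>` header is assumed; on other compilers or architectures the sketch simply reports no support.

```cpp
#include <cassert>

#if (defined(__x86_64__) || defined(__i386__)) && (defined(__GNUC__) || defined(__clang__))
#include <cpuid.h>
#define SKETCH_HAS_CPUID 1
#else
#define SKETCH_HAS_CPUID 0
#endif

// Pure helper: tests the GFNI bit (bit 8) in the ECX value of CPUID leaf 7, subleaf 0.
static bool leaf7_ecx_has_gfni(unsigned ecx)
{
    return ((ecx >> 8) & 1u) != 0;
}

// Queries the host CPU. Returns false when CPUID leaf 7 is unavailable
// or when this sketch does not know how to query the platform.
static bool cpu_has_gfni()
{
#if SKETCH_HAS_CPUID
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;
    return leaf7_ecx_has_gfni(ecx);
#else
    return false;
#endif
}

// Sanity check of the bit position only; it does not depend on the host CPU.
static void gfni_bit_example()
{
    assert(leaf7_ecx_has_gfni(1u << 8));
    assert(!leaf7_ecx_has_gfni(0u));
}
```

A real dispatcher would likely also confirm that the vector extension used to encode the instruction (for example AVX for the VEX form) is available and enabled by the OS; that part is left out here.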

Before:
image

After:
image

- Abuses GFNI to extract bits from bytes, from 5->2 instructions in most cases
@Megamouse added the CPU, Optimization (Optimizes existing code), and LLVM (Related to LLVM instruction decoders) labels on Sep 23, 2023
@elad335
Contributor

elad335 commented Sep 23, 2023

It also needs to load the constant into xmm1, right? It's not 2 instructions unless I'm missing something.

@Whatcookie
Member Author

2 instructions discounting constants; GBB needs 3 instructions.

@elad335 elad335 merged commit d1bea79 into RPCS3:master Oct 1, 2023
4 of 6 checks passed