SPU LLVM: Optimize GB/GBH/GBB with a GFNI path #14669

Whatcookie · 2023-09-23T00:31:19Z

By treating the first input as a constant and the second input as the variable, with only 1 bit set in our constant, gf2p8affineqb extracts that selected bit from each byte of the second operand.
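
To make the bit-selection argument concrete, here is a minimal scalar sketch of one 64-bit lane of gf2p8affineqb with an immediate of 0, written from the pseudocode in Intel's instruction reference. The function name `gf2p8affine_lane` is made up for illustration and is not part of RPCS3; the actual path uses the vector instruction, not this loop.

```cpp
#include <cstdint>

// Scalar model of one 64-bit lane of GF2P8AFFINEQB with imm8 = 0.
// x: the eight bytes of the first source operand (our one-hot constant).
// a: the eight bytes of the second source operand (the variable data).
// Bit i of each output byte is parity(a.byte[7 - i] & x.byte[b]).
static std::uint64_t gf2p8affine_lane(std::uint64_t x, std::uint64_t a)
{
    std::uint64_t out = 0;

    for (int b = 0; b < 8; b++)
    {
        const std::uint8_t xb = static_cast<std::uint8_t>(x >> (b * 8));
        std::uint8_t r = 0;

        for (int i = 0; i < 8; i++)
        {
            const std::uint8_t row = static_cast<std::uint8_t>(a >> ((7 - i) * 8));
            std::uint8_t v = row & xb;

            // Fold the byte down to its parity in bit 0.
            v ^= v >> 4;
            v ^= v >> 2;
            v ^= v >> 1;

            r |= static_cast<std::uint8_t>((v & 1) << i);
        }

        out |= static_cast<std::uint64_t>(r) << (b * 8);
    }

    return out;
}
```

When a byte of the constant has exactly one bit set, the AND leaves at most one bit, so the parity is simply that bit of the data byte. For example, with 0x01 in every constant byte, each output byte holds the low bits of the eight data bytes, with bit i taken from byte 7 - i.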

This brings the instruction count down to just 2, from a worst case of 6 (GBH), and avoids the vector -> int -> vector register round trip.
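
For reference, a scalar sketch of what the three instructions compute according to the SPU ISA: each gathers the least significant bit of every word (GB), halfword (GBH), or byte (GBB) of RA into the low bits of the preferred slot, leftmost element first. The type `spu_reg` and the helper `gather_lsb` are hypothetical names, and the register is modeled as 16 bytes in SPU (big-endian) order, which is an assumption of this sketch and not RPCS3's internal layout.

```cpp
#include <array>
#include <cstdint>

// Hypothetical model of a 128-bit SPU register: index 0 is the leftmost byte.
using spu_reg = std::array<std::uint8_t, 16>;

// Gather the least significant bit of every `stride`-byte element.
// In big-endian order the LSB of an element lives in its last byte.
static std::uint32_t gather_lsb(const spu_reg& ra, int stride)
{
    std::uint32_t out = 0;

    for (int i = stride - 1; i < 16; i += stride)
    {
        // Earlier (leftmost) elements end up in the higher result bits.
        out = (out << 1) | (ra[i] & 1u);
    }

    return out;
}

// Value written to the preferred slot; the other slots are zero.
static std::uint32_t gb(const spu_reg& ra)  { return gather_lsb(ra, 4); } // 4-bit result
static std::uint32_t gbh(const spu_reg& ra) { return gather_lsb(ra, 2); } // 8-bit result
static std::uint32_t gbb(const spu_reg& ra) { return gather_lsb(ra, 1); } // 16-bit result
```

A model like this can serve as a reference when checking an optimized path against expected results.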

CPUs that can take this path are 11th gen and newer for Intel, and Zen 4 (Ryzen 7000) and newer for AMD.

Before/After: (attachments not preserved)

- Abuses GFNI to extract bits from bytes, from 5->2 instructions in most cases

elad335 · 2023-09-23T05:59:05Z

It also needs to load the constant into xmm1, right? It's not 2 instructions unless I'm missing something.

Whatcookie · 2023-09-23T06:39:23Z

2 instructions discounting constants; GBB needs 3 instructions.

rpcs3/Emu/CPU/CPUTranslator.cpp

SPU LLVM: Optimize GB/GBH/GBB with a GFNI path

765421f


Megamouse added the CPU, Optimization (Optimizes existing code), and LLVM (Related to LLVM instruction decoders) labels Sep 23, 2023

Merge branch 'master' into SPU2

b8b7b68

Nekotekina reviewed Sep 23, 2023


rpcs3/Emu/CPU/CPUTranslator.cpp (resolved)

Merge branch 'master' into SPU2

b16e3e4

elad335 merged commit d1bea79 into RPCS3:master Oct 1, 2023
4 of 6 checks passed
