
SPU LLVM: Use 512bit xorsum for SPU verification #16642

Merged: elad335 merged 1 commit into RPCS3:master from Whatcookie:SPU2 on Jan 31, 2025

@Whatcookie (Member)

Use a 512-bit-wide "xorsum" hash (even on machines with only 128-bit or 256-bit SIMD) in place of a full comparison for SPU verification. In theory, a hash collision is nearly impossible here. A human could craft two matching hashes by hand pretty easily, but luckily for us, all PS3 programs were written a long time ago.
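As a rough illustration of the idea, here is a minimal scalar sketch of a 512-bit xorsum. It is not the code this PR emits; the `Hash512` type and the `xorsum512` name are hypothetical, and the input size is assumed to be a multiple of 64 bytes.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

// One 512-bit value, modelled as eight 64-bit lanes (hypothetical type).
using Hash512 = std::array<std::uint64_t, 8>;

// XOR every 64-byte block of `data` into a single 512-bit accumulator.
// Assumption: `size` is a multiple of 64; trailing bytes are ignored here.
inline Hash512 xorsum512(const unsigned char* data, std::size_t size)
{
    Hash512 acc{}; // zero-initialised accumulator
    for (std::size_t off = 0; off + 64 <= size; off += 64)
    {
        Hash512 block;
        std::memcpy(block.data(), data + off, 64); // sidesteps alignment and aliasing issues
        for (std::size_t lane = 0; lane < 8; ++lane)
            acc[lane] ^= block[lane];
    }
    return acc;
}
```

Because XOR is commutative, swapping two 64-byte blocks of the input leaves the result unchanged, which is one example of how easily a matching hash can be built by hand.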

With a truly random distribution of numbers, the chance of collision should be astronomically small (1 in 2^512?), but I'm not sure how that chance changes when the input goes from truly random data to valid SPU opcodes. Either way, I don't think collisions are likely, and the performance uplift is real. Just to be safe, the xorsum path is only taken if there are at least 3 64-byte blocks to hash.
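The block-count threshold can be pictured with the following sketch. Everything here is an assumption made for illustration: the `CompiledChunk` struct, the `verify` function, the `precise` flag (standing in for the precise verification option) and the `kMinXorsumBlocks` constant are hypothetical names, and the real implementation generates SIMD code rather than a scalar loop.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using Hash512 = std::array<std::uint64_t, 8>; // eight 64-bit lanes = 512 bits

// Minimum number of 64-byte blocks before the hash path is used (value from the PR text).
constexpr std::size_t kMinXorsumBlocks = 3;

// XOR all 64-byte blocks together; `size` is assumed to be a multiple of 64.
inline Hash512 xorsum512(const unsigned char* data, std::size_t size)
{
    Hash512 acc{};
    for (std::size_t off = 0; off + 64 <= size; off += 64)
    {
        Hash512 block;
        std::memcpy(block.data(), data + off, 64);
        for (std::size_t lane = 0; lane < 8; ++lane)
            acc[lane] ^= block[lane];
    }
    return acc;
}

// Hypothetical record of what was captured when the SPU code was compiled.
struct CompiledChunk
{
    const unsigned char* reference; // copy of the original code bytes
    std::size_t size;               // assumed to be a multiple of 64
    Hash512 expected;               // xorsum of `reference`, computed once
};

// Returns true if `current` still matches the code that was compiled.
inline bool verify(const CompiledChunk& c, const unsigned char* current, bool precise)
{
    const std::size_t blocks = c.size / 64;
    if (precise || blocks < kMinXorsumBlocks)
        return std::memcmp(c.reference, current, c.size) == 0; // full comparison
    return xorsum512(current, c.size) == c.expected;           // hash comparison
}
```

In this sketch, short chunks and the precise mode both fall back to a byte-for-byte comparison, so only chunks of three or more blocks rely on the hash.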

The full_width_avx512 option is also removed. Even on CPUs that suffered severe AVX-512 downclocking, 512-bit-wide SPU verification was never an issue, because it only uses simple bitwise instructions, which are very power efficient.

  • Provides a 2-3% uplift in SPU-limited titles
  • Removes the full_width_avx512 option
  • Adds a precise SPU verification option for debugging (config file only)

Before: (78.4 FPS)

After: (80.0 FPS)

And yes, the uplift is similar on both AVX2 and AVX-512 targets.

@elad335 merged commit 506d921 into RPCS3:master on Jan 31, 2025
@elad335 added the CPU, Optimization, LLVM, and ☘️ Power Saving labels on Jan 31, 2025