PPU LLVM: Optimize altivec FMA with 0 addend #8013

Whatcookie · 2020-04-11T19:50:43Z

One quirk of the AltiVec ISA is that it provides floating-point add and fused multiply-add (FMA) instructions, but no standalone floating-point multiply. To perform a multiply without an add, code has to execute an FMA with an addend of 0.

Let's detect this case and emit only a floating multiply when a constant addend of 0 is used.

On Skylake the gains are very small, since FMA and floating multiply ops execute with the same latency. On Ryzen, floating multiply has lower latency than FMA, so it may benefit more. Anything without native FMA support should also benefit considerably.

rpcs3/Emu/Cell/PPUTranslator.cpp

- When VMADDFP and VNMSUBFP are used with a constant addend of 0, they can be simplified into a single floating multiply
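The rewrite described above can be modeled outside the recompiler. The following is a minimal sketch, not the code from this PR: `Operand`, `Lowering`, `choose_lowering`, `emulate_vmaddfp` and `emulate_vnmsubfp` are hypothetical names, and plain 4-lane float arrays stand in for LLVM vector values. It assumes the usual AltiVec definitions, VMADDFP = a*c + b and VNMSUBFP = -(a*c - b), and takes the multiply-only path only when the addend is known to be a constant zero at translation time.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

using Vec4 = std::array<float, 4>;

// Hypothetical operand model: a 4-lane value plus a flag saying whether
// the translator knows it is a compile-time constant.
struct Operand {
    Vec4 value;
    bool is_constant;
};

enum class Lowering { FusedMultiplyAdd, MultiplyOnly };

// True only for constants whose lanes are all zero (+0.0 or -0.0).
bool is_constant_zero(const Operand& op) {
    if (!op.is_constant) return false;
    for (float lane : op.value) {
        if (lane != 0.0f) return false;
    }
    return true;
}

// The decision the PR describes: drop the add when the addend is constant 0.
Lowering choose_lowering(const Operand& addend) {
    return is_constant_zero(addend) ? Lowering::MultiplyOnly
                                    : Lowering::FusedMultiplyAdd;
}

// VMADDFP: a*c + b per lane.
Vec4 emulate_vmaddfp(const Vec4& a, const Vec4& c, const Operand& b) {
    const bool mul_only = choose_lowering(b) == Lowering::MultiplyOnly;
    Vec4 r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] = mul_only ? a[i] * c[i] : std::fma(a[i], c[i], b.value[i]);
    }
    return r;
}

// VNMSUBFP: -(a*c - b) per lane; with b == 0 this is a negated product.
Vec4 emulate_vnmsubfp(const Vec4& a, const Vec4& c, const Operand& b) {
    const bool mul_only = choose_lowering(b) == Lowering::MultiplyOnly;
    Vec4 r{};
    for (std::size_t i = 0; i < r.size(); ++i) {
        r[i] = mul_only ? -(a[i] * c[i]) : -std::fma(a[i], c[i], -b.value[i]);
    }
    return r;
}
```

A zero that is only known at run time (`is_constant == false`) still takes the FMA path in this sketch, which matches the "constant addend" wording of the description.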

Nekotekina reviewed Apr 11, 2020

Whatcookie force-pushed the ppu_vpu branch from ca4bd94 to b364b2c Compare April 11, 2020 20:17

elad335 requested changes Apr 11, 2020

Whatcookie force-pushed the ppu_vpu branch from b364b2c to 7c94fa1 Compare April 11, 2020 20:30

elad335 requested changes Apr 11, 2020

Whatcookie force-pushed the ppu_vpu branch 2 times, most recently from 5aa0f7b to a350dd3 Compare April 11, 2020 23:31

elad335 requested changes Apr 12, 2020

PPU LLVM: Optimize altivec FMA with 0 addend

1126402

- When VMADDFP and VNMSUBFP are used with a constant addend of 0, they can be simplified into a single floating multiply

Whatcookie force-pushed the ppu_vpu branch from a350dd3 to 1126402 Compare April 12, 2020 04:30

elad335 approved these changes Apr 12, 2020

AniLeo merged commit 6b0f7a8 into RPCS3:master Apr 12, 2020
