Optimise primitive_restart::upload_untouched() (#6881)
* rsx: Optimise primitive_restart::upload_untouched() with SSE4.1

This optimisation is only applied when skip_restart is false.

I’ve only tested the u16 codepath, as it is the one used in NieR.

In some very unscientific profiling at the save point of the port town,
this function used to take 2.76% of the total frame time; it now takes
about 0.40%.
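
For readers unfamiliar with this kind of loop, here is a rough sketch of what an SSE4.1 index copy can look like. It is not the code from this commit: `copy_indices_u16`, its signature, and its exact handling of the restart index are assumptions made for illustration. It assumes big-endian u16 indices are byte-swapped into host order while the minimum and maximum non-restart index are tracked, eight lanes at a time, on x86 with gcc or clang.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <immintrin.h>
#include <utility>

#if defined(__GNUC__) || defined(__clang__)
#define SSE4_1_FUNC __attribute__((target("sse4.1")))
#else
#define SSE4_1_FUNC
#endif

// Hypothetical stand-in, not the RPCS3 function. Copies byte-swapped u16
// indices to dst and returns {min, max} over all indices != restart_index.
// If every index is the restart index, the result is {0xFFFF, 0}.
SSE4_1_FUNC std::pair<std::uint16_t, std::uint16_t> copy_indices_u16(
    const std::uint16_t* src, std::uint16_t* dst, std::size_t count,
    std::uint16_t restart_index)
{
    const __m128i restart = _mm_set1_epi16(static_cast<short>(restart_index));
    __m128i vmin = _mm_set1_epi16(-1);  // 0xFFFF in every lane
    __m128i vmax = _mm_setzero_si128(); // 0 in every lane

    std::size_t i = 0;
    for (; i + 8 <= count; i += 8)
    {
        const __m128i raw = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        // Swap the two bytes of each 16-bit lane.
        const __m128i v = _mm_or_si128(_mm_slli_epi16(raw, 8), _mm_srli_epi16(raw, 8));
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), v);

        // Restart lanes become 0xFFFF for the min and 0 for the max,
        // so they can never win either comparison.
        const __m128i is_restart = _mm_cmpeq_epi16(v, restart);
        vmin = _mm_min_epu16(vmin, _mm_or_si128(v, is_restart));
        vmax = _mm_max_epu16(vmax, _mm_andnot_si128(is_restart, v));
    }

    // Plain (unoptimised) reduction of the eight lanes to scalars.
    alignas(16) std::uint16_t lo[8];
    alignas(16) std::uint16_t hi[8];
    _mm_store_si128(reinterpret_cast<__m128i*>(lo), vmin);
    _mm_store_si128(reinterpret_cast<__m128i*>(hi), vmax);
    std::uint16_t min_index = *std::min_element(lo, lo + 8);
    std::uint16_t max_index = *std::max_element(hi, hi + 8);

    // Scalar tail for counts that are not a multiple of eight.
    for (; i < count; ++i)
    {
        const std::uint16_t v = static_cast<std::uint16_t>((src[i] << 8) | (src[i] >> 8));
        dst[i] = v;
        if (v != restart_index)
        {
            min_index = std::min(min_index, v);
            max_index = std::max(max_index, v);
        }
    }
    return {min_index, max_index};
}
```

The masking step keeps the loop branch-free, which is the main reason a vectorised version of such a loop can beat a per-index scalar comparison.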

* rsx: Mark all SSE4.1 functions with attributes on gcc and clang

This assures the compiler that we only call these functions after
checking that the CPU supports these instructions.
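
As an illustration of the pattern described above, the sketch below marks one function with a `target("sse4.1")` attribute and guards the call with a runtime CPU check. The names `max_index`, `max_index_sse41`, and `max_index_scalar` are hypothetical, and `__builtin_cpu_supports` is the gcc/clang builtin used here as a stand-in for whatever CPU detection the project actually uses.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <immintrin.h>

#if defined(__GNUC__) || defined(__clang__)
#define SSE4_1_FUNC __attribute__((target("sse4.1")))
#else
#define SSE4_1_FUNC
#endif

// Portable fallback, safe on any CPU.
static std::uint16_t max_index_scalar(const std::uint16_t* src, std::size_t count)
{
    std::uint16_t result = 0;
    for (std::size_t i = 0; i < count; ++i)
        result = std::max(result, src[i]);
    return result;
}

// Only this function is compiled with SSE4.1 enabled; the rest of the
// translation unit keeps the baseline instruction set.
SSE4_1_FUNC static std::uint16_t max_index_sse41(const std::uint16_t* src, std::size_t count)
{
    __m128i vmax = _mm_setzero_si128();
    std::size_t i = 0;
    for (; i + 8 <= count; i += 8)
        vmax = _mm_max_epu16(vmax, _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i)));

    alignas(16) std::uint16_t lanes[8];
    _mm_store_si128(reinterpret_cast<__m128i*>(lanes), vmax);
    std::uint16_t result = *std::max_element(lanes, lanes + 8);
    for (; i < count; ++i) // leftover elements
        result = std::max(result, src[i]);
    return result;
}

// The caller, not the compiler, is responsible for the CPU check.
std::uint16_t max_index(const std::uint16_t* src, std::size_t count)
{
#if defined(__GNUC__) || defined(__clang__)
    if (__builtin_cpu_supports("sse4.1"))
        return max_index_sse41(src, count);
#endif
    return max_index_scalar(src, count);
}
```

Without the attribute, gcc and clang refuse to compile the `_mm_max_epu16` call unless the whole file is built with `-msse4.1`, which would let the compiler emit SSE4.1 instructions in code paths that run before the check.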

* rsx: Add an AVX2 implementation of primitive restart ibo upload

* rsx: Remove redefinition of SSE4.1 instructions

Now that clang is aware that our functions are compiled with SSE4.1, it
lets us generate this code using its intrinsics.

* rsx: Optimise vector to scalar conversion

This is done using minpos and srli intrinsics and generates less code
than before.

Thanks Nekotekina for the suggestion!
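
A minimal sketch of how these two intrinsics can reduce a vector to a scalar, assuming gcc or clang on x86. The helper names are hypothetical, and which element width uses which intrinsic in the actual commit is an assumption: `_mm_minpos_epu16` only exists for 16-bit lanes, so the sketch uses byte shifts via `_mm_srli_si128` for 32-bit lanes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <immintrin.h>

#if defined(__GNUC__) || defined(__clang__)
#define SSE4_1_FUNC __attribute__((target("sse4.1")))
#else
#define SSE4_1_FUNC
#endif

// Minimum of eight u16 values. minpos places the minimum in bits 15:0 of
// the result (and its lane number in bits 18:16, ignored here).
SSE4_1_FUNC std::uint16_t hmin_u16(const std::uint16_t* p, std::size_t count)
{
    assert(count == 8); // sketch handles exactly one vector
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
    return static_cast<std::uint16_t>(_mm_extract_epi16(_mm_minpos_epu16(v), 0));
}

// Maximum of eight u16 values: max(x) == ~min(~x), so the same minpos
// instruction works on the complemented lanes.
SSE4_1_FUNC std::uint16_t hmax_u16(const std::uint16_t* p, std::size_t count)
{
    assert(count == 8);
    const __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
    const __m128i inverted = _mm_xor_si128(v, _mm_set1_epi16(-1));
    const int m = _mm_extract_epi16(_mm_minpos_epu16(inverted), 0);
    return static_cast<std::uint16_t>(~m);
}

// Minimum of four u32 values: fold the upper half onto the lower half
// with byte shifts, then read lane 0.
SSE4_1_FUNC std::uint32_t hmin_u32(const std::uint32_t* p, std::size_t count)
{
    assert(count == 4);
    __m128i v = _mm_loadu_si128(reinterpret_cast<const __m128i*>(p));
    v = _mm_min_epu32(v, _mm_srli_si128(v, 8)); // lanes {0,1} vs {2,3}
    v = _mm_min_epu32(v, _mm_srli_si128(v, 4)); // lane 0 vs lane 1
    return static_cast<std::uint32_t>(_mm_cvtsi128_si32(v));
}
```

Each helper compiles to a handful of instructions, compared with storing the vector to memory and looping over its lanes.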
linkmauve authored and Nekotekina committed Oct 30, 2019
1 parent 35794dc commit cfd5cf6
Showing 1 changed file with 234 additions and 67 deletions.