Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SPU: Lock-line reservation optimizations + Savestates bugfix #12523

Merged
merged 11 commits into from Aug 21, 2022

Conversation

elad335
Copy link
Contributor

@elad335 elad335 commented Aug 18, 2022

  • "SPU GETLLAR polling detection" setting (which was yielding resources when it's set to true) has been transformed to "SPU Reservation Busy Waiting Percentage" and is off ("0") by default. (0 is yielding resources, 100 is always busy-waiting)
  • It's now responsible for SPU events waiting as well, so the yielding can be replaced with busy-waiting there. You can test if you get perf gains with it set to on if your CPU has many threads.
  • It's not allowed to be changed for TSX users because TSX is very sensitive to memory touching. It's always as if this setting is set to "0" for it.
  • Fix the detection of GETLLAR spin and optimize the actual yielding.
  • Add some more minor optimizations.
  • Fix PPU HLE injection.
  • Fix PPU function analysis for savestates.
  • Added an experimental SPU setting called "SPU Accurate Reservations" which may benefit CPUs, especially without TSX when set to "false" ("true" is the default and was the default before this pull request). Game config file only atm.
  • Detect PPU lwmutex updates, do not notify SPU threads when it's detected.
  • Implement specialized atomic waiting for SPU reservations - emphasizing on cheap thread sleep entering and exiting cost.
  • Disable notifications when no data was changed in PUTLLC.

This pull request is known to fix audio stutters and heavy framerate drops in Demon Souls for low threaded CPUs.
Fixes #12513

@elad335 elad335 changed the title SPU Enable thread yield on GETLLAR spinning by default SPU/Savestates: Enable thread yield on GETLLAR spinning by default Aug 18, 2022
@elad335 elad335 force-pushed the savestates branch 6 times, most recently from 8f59609 to 53fa746 Compare August 18, 2022 09:16
@elad335
Copy link
Contributor Author

elad335 commented Aug 18, 2022

Added experimental SPU setting called "SPU Inaccurate Reservations" which may benefit non-TSX users. Game config file only atm.

@Nerboruto
Copy link

Nerboruto commented Aug 18, 2022

generally it seems to keep 30 fps better and more fluidity on xeon e5450 quadcore 3 ghz, without changing the optimizations I have on at the moment, 3 spurs threads and 60fps patch with 30 vblank ...
the heaviest scene is the one in the picture only that the values ​​I get are discordant and perhaps depend on windows scheduler, between 20 and 23fps we dance 3fps. some times it starts at 20 others at 23...
it would take a more empirical method to test however it seems more fluid...

rpcs3 2022-08-18 12-30-29
rpcs3 2022-08-18 12-34-14

@elad335
Copy link
Contributor Author

elad335 commented Aug 18, 2022

@Nerboruto Tested with inaccurate SPU reservations?

@Nerboruto
Copy link

@Nerboruto Tested with inaccurate SPU reservations?

I have not tried to change anything in config, now I'm going to try again ..

@Nerboruto

This comment was marked as off-topic.

@elad335 elad335 force-pushed the savestates branch 2 times, most recently from 64445ea to 47110ac Compare August 18, 2022 11:31
@elad335
Copy link
Contributor Author

elad335 commented Aug 18, 2022

Fixed a bug which made the new setting randomly not optimize in different runs.

@Nerboruto
Copy link

i tried with spu inaccurate reservation false and true...
unfortunately I still have the margin for error ... i also read 24fps.
it seems to be smoother but I have no way of having actual proof.
the game fluctuates between 20 and 30 but is more on 30, the slowdowns are there in some places.

@elad335
Copy link
Contributor Author

elad335 commented Aug 18, 2022

Added another optimization, please test and report performance regressions.

@Nerboruto

This comment was marked as off-topic.

@elad335 elad335 force-pushed the savestates branch 3 times, most recently from 71ae3c7 to 8ccd92d Compare August 18, 2022 15:08
@elad335 elad335 changed the title SPU/Savestates: Enable thread yield on GETLLAR spinning by default SPU: Enable thread yield on GETLLAR spinning by default + Savestates bugfix Aug 18, 2022
@Nerboruto
Copy link

Nerboruto commented Aug 18, 2022

well first i attach my config to see if i'm wrong something, i set it to true..
I have not seen any regression.
I put the master back to compare and in the opening scene there are about + 2fps of difference for the pr.
I tried to play a bit but the difference is obvious, pr has much more consistent and high minimum fps. in short, it is a nice performance improvement.
UPDATE: God of war gos crash if i set to true, false work good and fast...

config_BLES00932.zip

@elad335 elad335 force-pushed the savestates branch 6 times, most recently from 0a79106 to c25b5f5 Compare August 18, 2022 23:35
* Make this setting guard all reservation waitings. (renamed)
* Revert atomic list usage: it's more expensive and not needed because it has a timeout and is not optimized for the rest of the waitables.
@elad335
Copy link
Contributor Author

elad335 commented Aug 20, 2022

@Hasster1 Thanks, seemingly fixed it and some performance miss-optimizations with it.

@Hasster1
Copy link

@Hasster1 Thanks, seemingly fixed it and some performance miss-optimizations with it.

Wonderful job, nothing is freezing/crashing anymore with the inaccurate SPU reservations. Thank you.

@Hasster1
Copy link

Looks like Inaccurate SPU Reservations breaks InFamous. Freezes right after the intro without any errors.
image
RPCS3.log


if (far_jump)
{
// Replace the function with ppu_far_jump
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it works as intended. Module-local jumps don't use the global mapping.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes but it updates its PPU hash and forces recompilation. That's how it was handled before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's good but how forcing recompilation would handle local branches? Especially with PPUTranslator having far jump part removed.

Copy link
Contributor Author

@elad335 elad335 Aug 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It uses the manually installed value because the function is removed from compilation.

Copy link
Contributor Author

@elad335 elad335 Aug 21, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code below is responsible for its compilation in llvm and that's skipped.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I see now.

@Hasster1
Copy link

Excuse me, what?
image
Also i can't find that the inaccurate reservations are enabled in the logs.
image

@Hasster1
Copy link

Oh, i see that it's been reversed to avoid confusion. Sorry.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Optimization Optimizes existing code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Trying to load a Demon's Souls savestate with the UNLOCK FPS patch enabled crashes the entire emulator
8 participants