SPU/PPU reservations: Optimizations #8175

Merged
merged 4 commits into from
May 13, 2020


elad335
Contributor

@elad335 elad335 commented May 8, 2020

  • Implement vm::reservation_trylock, optimized locking on reservation stores with no waiting. Always fail if reservation lock bits are set.
  • Make SPU accurate GET transfers on non-TSX not modify reservation lock bits.
  • Add some optimizations for reservation writes of unmodified data.
  • Use optimized PPU signaling to SPU on reservation pause.
  • Improve SPU LLVM MFC transfers recompilation for non-TSX.
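
To illustrate the first bullet, here is a minimal sketch of a non-waiting try-lock on a reservation counter. It is not the code from this PR: the function names, the use of `std::atomic`, and the layout (low 7 bits of a 64-bit counter as lock bits, the rest as a timestamp) are assumptions made for the example.

```cpp
#include <atomic>
#include <cstdint>

// Assumed layout (hypothetical): low 7 bits are lock bits, upper bits are a timestamp.
constexpr std::uint64_t lock_mask = 127;

// Try to lock the reservation without waiting.
// Fails immediately if the observed value has lock bits set, or if the
// counter no longer equals the value the caller observed earlier.
inline bool reservation_trylock(std::atomic<std::uint64_t>& res, std::uint64_t observed)
{
    if (observed & lock_mask)
    {
        return false; // lock bits are set: always fail, never spin
    }

    std::uint64_t expected = observed;
    return res.compare_exchange_strong(expected, observed | 1);
}

// Release the lock taken from `observed`.
// If the data was modified, advance the timestamp; otherwise restore it.
inline void reservation_unlock(std::atomic<std::uint64_t>& res, std::uint64_t observed, bool modified)
{
    res.store(modified ? observed + lock_mask + 1 : observed, std::memory_order_release);
}
```

A caller that loses the race simply treats the conditional store as failed instead of waiting for the lock holder, which is the "no waiting" behaviour the bullet describes.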

@BellezaEmporium

BellezaEmporium commented May 8, 2020

I do not know if it is related, but using your build on the (already broken) Gran Turismo 6, I got a semaphore timeout. It appeared on the first run but not on the second.

@legend800

Got 1-2 more fps in Motorstorm Apoc. vs. today's master. I'm up to 22fps in my "test spot" now. :)

@elad335
Contributor Author

elad335 commented May 9, 2020

I added a commit attempting to improve non-TSX performance. Testers are welcome.

@BellezaEmporium

Build failed, cannot test release.

@elad335
Contributor Author

elad335 commented May 9, 2020

@lanesh4d0w fixed.

@MsDarkLow
Contributor

MsDarkLow commented May 9, 2020

Saw a major improvement in Yakuza Ishin, accurate getllar was enabled.
8700k@ 5GHz

NO-TSX
Master: 21 to 24 fps
PR: 28 to 32 fps

NO-TSX @ 3.7GHz
Master: 18 to 21 fps
PR: 21 to 28 fps

TSX (NULL RENDERER)
Master: 44 to 46 fps
PR: 47 to 50 fps

Images of NO-TSX @ 5GHz

[Image: Master]

[Image: PR]

@elad335
Contributor Author

elad335 commented May 9, 2020

The goal here is to improve performance both on non-TSX path and TSX path.
So testing both of them is important.

@BellezaEmporium

BellezaEmporium commented May 9, 2020

Seems to have fixed Gran Turismo 6

image

image

When buying first vehicle : F {RSX [0x0bb6500]} SIG: Thread terminated due to fatal error: rsx::get_address(offset=0x346a100, location=0x1): RSXIO memory not mapped!
(in file d:\a\1\s\rpcs3\Emu\RSX\Common\texture_cache.h:1718)

Log file (also contains FW installation + config being changed to fit my processor and other things.)
RPCS3.log.gz

@xddxd
Contributor

xddxd commented May 9, 2020

Gran Turismo 6 is still not fixed here. Are you using TSX?
Screenshot_3
RPCS3.log

@elad335 elad335 force-pushed the dma branch 2 times, most recently from d2816b2 to 8d02d6d Compare May 9, 2020 11:26
@BellezaEmporium

BellezaEmporium commented May 9, 2020

@xddxd CPU not supported for TSX.

The first launch (v1.00, no savedata) seemed to work. However, on the second start it indeed does not.

@BellezaEmporium

@xddxd Works if you don't have any savedata. I suppose it's a savedata issue (since the savedata is based on the Japanese game release, the European one might not like it).
image

@xddxd
Contributor

xddxd commented May 9, 2020

Can confirm, it only works if there is no savedata.

@elad335
Contributor Author

elad335 commented May 9, 2020

Added another optimization for non-TSX.

@elad335 elad335 marked this pull request as ready for review May 9, 2020 18:09
@BellezaEmporium

No regressions so far when testing different games. No improvements either.

@elad335 elad335 changed the title [WIP] SPU/PPU reservations: Optimizations SPU/PPU reservations: Optimizations May 10, 2020
@AniLeo AniLeo requested a review from Nekotekina May 11, 2020 10:59
@elad335
Contributor Author

elad335 commented May 11, 2020

Actually wait a bit as I want to change a few lines.

@elad335 elad335 marked this pull request as draft May 11, 2020 23:16
elad335 added 3 commits May 12, 2020 17:57
- Implement vm::reservation_trylock, optimized locking on reservation stores with no waiting. Always fail if reservation lock bits are set.
- Make SPU accurate GET transfers on non-TSX not modify reservation lock bits.
- Add some optimizations for reservation writes of unmodified data.
@elad335 elad335 marked this pull request as ready for review May 12, 2020 14:57
@elad335
Contributor Author

elad335 commented May 12, 2020

Made the changes I wanted.

@AniLeo
Member

AniLeo commented May 12, 2020

Needs testing in a few games to ensure they didn't regress.

@elad335
Contributor Author

elad335 commented May 13, 2020

I tested them locally. I know what the changes are and what they affect.

@MsDarkLow
Contributor

MsDarkLow commented May 13, 2020

Update as per request.
8700k@ 3.7GHz
Persona 5 is tested in Central Street; this spot is somewhat RSX-bottlenecked but was tested anyway.
Yakuza Ishin (NULL RENDERER) is tested in Fushimi by the Singing bar.
Portal 2 is tested with SPU Threads set to Auto and to 2, at Test Chamber 5. With TSX the game already performs nicely, but non-TSX is another story.
With SPU Thread Auto I stand still at the start, where you descend from the elevator.
With SPU Thread 2 I'll be spinning around on a button with

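| Game | TSX, PR | TSX, Master | NO TSX, PR | NO TSX, Master |
|---|---|---|---|---|
| Yakuza Ishin (NULL) | 33 - 36 | 31 - 34 | 25 - 27 | 18 - 20 |
| Persona 5 | 42 - 49 | 42 - 47 | 39 - 44 | 39 - 42 |
| Portal 2 (Auto) | 30 | 30 | 30 | 19 - 22 |
| Portal 2 (2) | 30 | 30 | 28 - 30 | 20 - 30 |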

@AniLeo
Member

AniLeo commented May 13, 2020

Yakuza 0 Demo (OpenGL, Mesa, R9 280X)

Master: 25-29 FPS
PR: 30-34 FPS

RDR works better, didn't benchmark a single spot

@AniLeo AniLeo merged commit 12f0278 into RPCS3:master May 13, 2020
elad335 added a commit to elad335/rpcs3 that referenced this pull request May 13, 2020
possible broken signaling in rare occasions.
Nekotekina pushed a commit that referenced this pull request May 13, 2020
possible broken signaling in rare occasions.
elad335 added a commit to elad335/rpcs3 that referenced this pull request May 13, 2020
AniLeo pushed a commit that referenced this pull request May 14, 2020
Mask out RESULT cmd bit, do not create unbound branch blocks. (non-TSX)
@zminhquanz

GOW 3 performance has improved a little bit. Very good job, developers.

@legend800

@elad335 This PR also fixed Star Wars Force Unleashed 2 loading menus. However, it only works with TSX off. Can you look into fixing the TSX path too?

7 participants