rsx: PS3 Native frame limiter improvements, add Infinite frame limiter #12052
elad335
commented
May 21, 2022
- Do not wait on DEVICE 0x30 semaphore, it seems like it is something to do with queue command synchronization.
- This also fixes cellGcmSetFlipWithWaitLabel which is built specifically to enable accurate RSX flipping time, its waiting command is confirmed to be placed AFTER DEVICE 0x30 waiting.
- Fix default VSYNC state to be enabled (and set it to enabled in cellGcmSetVBlankFrequency and cellVideoOutConfigure as well).
- Add experimental "Infinite" frame limiter mode.
- Fix spurious enabling of second vblank.
I've added a new frame limiter called "Infinite" which imposes no limits whatsoever and sends an additional VBLANK each frame.
We already know games are using vblank count for timing. This makes things worse imo.
If the semaphore is not set and we have a frame to consume, go ahead and consume it. Hopefully by the time that frame is ready the old one will have been cleared.
The display queue already exists by the way, it just needs some tweaks to bind it to an RSX semaphore. I'd like to extend this in the future by using the native GPU's semaphore system, which does a very similar thing, so it's important we get the architecture right.
3d9b80f to e640eb4
I've switched the behavior of auto and vblank. Also, please test the new "infinite" frame limiter, which is supposed to be a replacement for upping the vblank rate for higher framerates in many games.
For clarification: off is the fastest option; infinite is supposed to be used with games that have an internal vblank-based frame limiter.
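As a rough illustration of the distinction drawn above, the modes can be thought of as a per-flip policy. This is only a sketch: `limiter_mode`, `flip_actions`, `on_flip` and the `timed` mode are hypothetical names invented here, not the identifiers used in RPCS3.

```cpp
// Hypothetical names for illustration only; not RPCS3's actual types.
enum class limiter_mode { off, timed, infinite };

struct flip_actions
{
    bool throttle_host;     // sleep on the host to hold a target frame time
    bool post_extra_vblank; // send one additional VBLANK event for this flip
};

// Decide what the emulator does after a flip completes.
inline flip_actions on_flip(limiter_mode mode)
{
    switch (mode)
    {
    case limiter_mode::off:      return {false, false}; // fastest: no limit, no extra events
    case limiter_mode::infinite: return {false, true};  // no limit, but feed the game's vblank counter
    case limiter_mode::timed:    return {true, false};  // conventional host-side limiter
    }
    return {false, false};
}
```

The idea, as described in the PR, is that a game which paces itself by counting vblanks sees one more vblank per frame in "infinite" mode, while "off" simply never waits.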
rpcs3/Emu/RSX/RSXThread.cpp
@@ -3258,6 +3275,11 @@ namespace rsx
flip_status = CELL_GCM_DISPLAY_FLIP_STATUS_DONE;
m_queued_flip.in_progress = false;

if (g_cfg.video.vblank_loop)
{
    post_vblank_event(rsx::uclock());
This feels like it should just be part of "infinite" vblank mode and does not need the extra loop option
I agree.
if (a5 == 1u)
{
    // This function resets vsync state to enabled
    render->requested_vsync = true;
Where is the counterpart to disable this? We know the PS3 had engines that showed tearing when the fps dropped below a threshold (e.g. CryEngine 2).
I see sometimes in logs that the games reset it afterward and after cellVideoOutConfigure, I will add to it as well.
Hmm, it is unlikely that they are calling videoOutConfigure during gameplay as that is a heavy operation. It is more likely to be toggled using this method. Is it known what happens if a5 is 0?
I'm assuming that games also react to the unused CellVideoOutCallback. (maybe it's called when the display is changed)
It's controlled by cellGcmSetFlipMode. You can choose 3 modes there: one for vsync, one for scanline sync, and the last mode for all-out anarchy. We only need to set vsync to engage when the first mode is chosen.
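A minimal sketch of that suggestion, assuming a three-way mode selector: vsync is requested only when the vsync mode is chosen. The `flip_mode` enumerators, `render_state` and `set_flip_mode` are hypothetical stand-ins; the real cellGcm constants and their numeric values are deliberately not reproduced here.

```cpp
// Hypothetical stand-ins for illustration; not the real cellGcm constants.
enum class flip_mode { vsync, scanline_sync, immediate };

struct render_state
{
    bool requested_vsync = true; // default enabled, as the PR description states
};

// Engage vsync only for the vsync mode; the other two modes disable it.
inline void set_flip_mode(render_state& render, flip_mode mode)
{
    render.requested_vsync = (mode == flip_mode::vsync);
}
```

This would give the "counterpart to disable" asked about earlier in the thread: the same call that enables vsync also clears it when another mode is selected.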
> I'm assuming that games also react to the unused CellVideoOutCallback. (maybe it's called when the display is changed)

I've checked and no game uses cellVideoOutRegisterCallback.
@@ -3251,18 +3259,25 @@ namespace rsx
}
}
}
else if (wait_for_flip_sema)
else if (frame_limit == frame_limit_type::_ps3)
If I'm reading this correctly the logic here is essentially "do not flip until a fresh vblank signal is received". You have the vblank signal as the flip release and this is an acquire operation here. Two problems:
- There is still the correct event sent to us which you are now short-circuiting in semaphore-acquire. I'm not sure why this approach is to be considered better as it is essentially a hackier approach. Note that this "PS3 native" sync is to be used with games that have trouble with the normal PC-friendly framelimiter system, so extra bells and whistles are unneeded for this path. Is it that it doesn't work properly with HLE? I'm scratching my head on why I should prefer this version over the existing one which more closely follows what the PS3 does.
- This whole set of variables and checks is not correctly labeled to match how display systems are architected, with clients acquiring the display and the server releasing it in a cycle. Not a blocker by itself, but it's not ideal and I would end up having to change it.
I'm quite confident DEVICE 0x30 is something to do with the queue command and not vblank, as it's inserted by gcm for all display sync options and occurs before semaphore wait by cellGcmSetFlipWithWaitLabel.
This function exists specifically to allow accurate timing of flipping and this is a usage example:
Here label 0 is used; notice that it resides after the DEVICE 0x30 wait.
Furthermore, flips are executed along with the vblank signal, not after it. I guess that's why TLoU regressed with the original PR.
From the trace above, we can see there are 2 signals:
The only difference is that there are 2 cond vars that I see, one to ack the enqueue and another to wait for vertical sync.
Server:
    label_0x30.wait_signal();
    handle_enqueued_head();
    label_0x30.reset();
Client:
    enqueue();
    label_0x30.signal();
    label_0x30.wait_ack();
    call(); // <- CB reset??
    label_0x0.wait_signal();
    flip();
My guess here is that 0x0 is signaled by the hardware vblank system, 0x30 just acknowledges that a head has been enqueued. So, your PR addresses this by having the system use vblank instead of enqueue signal.
While this is fine, the second critique still holds. The implementation is convoluted and not clear at all.
My suggestion to solve this once and for all is to have a special kind of cond var with waiting for known values. Simply replace my rushed solution with the wait_for_flip_sema with proper condvar.wait() semantics to make the flow clearer.
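One way such a "cond var that waits for known values" could look is sketched below using only the C++ standard library. `value_label` and its methods are hypothetical names made up for this example; this is not an existing RPCS3 type, and RPCS3's own synchronization primitives would likely be used instead.

```cpp
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical helper: a label that waiters block on until it holds a known value.
class value_label
{
public:
    explicit value_label(std::uint32_t initial = 0) : m_value(initial) {}

    // Store a new value and wake all waiters so each re-checks its own predicate.
    void signal(std::uint32_t value)
    {
        {
            std::lock_guard<std::mutex> lock(m_mutex);
            m_value = value;
        }
        m_cond.notify_all();
    }

    // Block until the label equals `expected`; returns at once if it already does.
    void wait_for(std::uint32_t expected)
    {
        std::unique_lock<std::mutex> lock(m_mutex);
        m_cond.wait(lock, [&] { return m_value == expected; });
    }

    // Read the current value under the lock.
    std::uint32_t load()
    {
        std::lock_guard<std::mutex> lock(m_mutex);
        return m_value;
    }

private:
    std::mutex m_mutex;
    std::condition_variable m_cond;
    std::uint32_t m_value;
};
```

With something like this, the pseudocode's `wait_signal()`, `wait_ack()` and `reset()` each become a `wait_for()` or `signal()` on an agreed value, which makes the acquire/release cycle between client and server explicit at the call sites.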
For simplicity I have omitted the 0x10 logic; I have no idea what that signal is for.
For 0x30 handling, we have the frame enqueue logic in sys_rsx and rsx_thread::end_frame() for that.
For 0x0 we can signal from vblank thread.
No, I have HLEd the game; this is a cellGcmSetFlipWithWaitLabel call with address 0. It doesn't appear in other games, which use other flip functions.
Which is named _cellGcmSetFlipCommandWithWaitLabel on firmware.
So for "normal" flip the 0x0 var is absent? That simplifies the design then and makes the 0x30_ack a wait for vblank.
Again, my biggest problem with the way it is now is that the "documentation is the code" philosophy doesn't hold up because the design is all over the place. This isn't a new problem, but if it is now well understood, we have to do it properly and this PR already modifies the whole thing anyway. I only understand what is going on because I'm familiar with most of the code there.
0x30 has nothing to do with vblank: it appears in all sync modes, even with VSYNC turned off. Besides, the whole purpose of _cellGcmSetFlipCommandWithWaitLabel is to time the flip very closely to what CELL decides. So if 0x30 were a vblank semaphore, it would defy all logic as to why the function exists in the first place.
It seems my point is getting lost in translation. That doesn't matter; it's just an ack, whether it's from vblank or not. I'm not happy with the implementation, but if we're being honest, you're unlikely to restructure it, so I'll merge and change it later when I have time.
* Do not wait on DEVICE 0x30 semaphore, it seems like it is something to do with queue command synchronization.
  - This also fixes cellGcmSetFlipWithWaitLabel which is built specifically to enable accurate RSX flipping time, its waiting command is confirmed to be placed **AFTER** DEVICE 0x30 waiting.
* Fix default vsync state to be enabled. (and set it to enabled in cellGcmSetVBlankFrequency as well)
* Add experimental "Infinite" frame limiter mode.
* Fix spurious enabling of second vblank.
23d662d to ffa4342