-
Notifications
You must be signed in to change notification settings - Fork 104
[testing only] Mutter frame timing rebase #533
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Closed
Closed
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This reverts commit f0d02aa.
This reverts commit 5403e9d.
…ntrol (linuxmint#425)" This reverts commit c32027f.
This reverts commit 0040039.
This reverts commit 00ccde6.
This reverts commit 42f9e08.
This reverts commit 7450dcb.
Externally setting the sync-to-vblank setting was a feature added as a workaround to old Intel and ATI graphic cards, and is not needed anymore. Furthermore, it doesn't make sense to change it on a compositor whatsoever. This commit removes all the ways to externally change this setting, as well as the now unused API. https://gitlab.gnome.org/GNOME/mutter/merge_requests/191
Since now we don't set the swap throttled value based on sync-to-vblank, we can effectively remove it from Cogl. Throttling swap buffers in Cogl is as much a historical artifact as sync-to-vblank. Furthermore, it doesn't make sense to disable it on a compositor, which is the case with the embedded Cogl. In addition to that, the winsys vfunc for updating whenever swap throttling changes could also be removed, since swap throttling is always enabled now. Removing it means less code, less branches when running, and one less config option to deal with. This also removes the micro-perf test, since it doesn't make sense for the case where Cogl is embedded into the compositor. https://gitlab.gnome.org/GNOME/mutter/merge_requests/191
This is to reflect what this flag is actually about. https://gitlab.gnome.org/GNOME/mutter/merge_requests/191
Just better wording for swap throttled.
The `last_presentation_time` is usually a little in the past (although
sometimes in the future depending on the driver). When it's over 2ms
(`sync_delay`) in the past that would trigger the while loop to count up so
that the next `update_time` is in the future.
The problem with that is for common values of `last_presentation_time`
which are only a few milliseconds ago, incrementing `update_time` by
`refresh_interval` also means counting past the next physical frame that
we haven't rendered yet. And so mutter would skip that frame.
**Example**
Given:
```
last_presentation_time = now - 3ms
sync_delay = 2ms
refresh_interval = 16ms
next_presentation_time = last_presentation_time + refresh_interval
= now + 13ms
-3ms now +13ms +29ms +45ms
----|--+------------|---------------|---------------|----
: :
last_presentation_time next_presentation_time
```
Old algorithm:
```
update_time = last_presentation_time + sync_delay
= now - 1ms
while (update_time < now)
(now - 1ms < now)
update_time = now - 1ms + 16ms
update_time = now + 15ms
next_presentation_time = now + 13ms
available_render_time = next_presentation_time - max(now, update_time)
= (now + 13ms) - (now + 15ms)
= -2ms so the next frame will be skipped.
-3ms now +13ms +29ms +45ms
----|--+------------|-+-------------|---------------|----
: : :
: : update_time (too late)
: :
last_presentation_time next_presentation_time (a missed frame)
```
New algorithm:
```
min_render_time_allowed = refresh_interval / 2
= 8ms
max_render_time_allowed = refresh_interval - sync_delay
= 14ms
target_presentation_time = last_presentation_time + refresh_interval
= now - 3ms + 16ms
= now + 13ms
while (target_presentation_time - min_render_time_allowed < now)
(now + 13ms - 8ms < now)
(5ms < 0ms)
# loop is never entered
update_time = target_presentation_time - max_render_time_allowed
= now + 13ms - 14ms
= now - 1ms
next_presentation_time = now + 13ms
available_render_time = next_presentation_time - max(now, update_time)
= (now + 13ms) - now
= 13ms which is plenty of render time.
-3ms now +13ms +29ms +45ms
----|-++------------|---------------|---------------|----
: : :
: update_time :
: :
last_presentation_time next_presentation_time
```
The reason nobody noticed these missed frames very often was because
mutter has some accidental workarounds built-in:
* Prior to 3.32, the offending code was only reachable in Xorg sessions.
It was never reached in Wayland sessions because it hadn't been
implemented yet (till e9e4b2b72).
* Even though Wayland support is now implemented the native backend
provides a `last_presentation_time` much faster than Xorg sessions
(being in the same process) and so is less likely to spuriously enter
the while loop to miss a frame.
* For Xorg sessions we are accidentally triple buffering (linuxmint#334). This
is a good way to avoid the missed frames, but is also an accident.
* `sync_delay` is presently just high enough (2ms by coincidence is very
close to common values of `now - last_presentation_time`) to push the
`update_time` into the future in some cases, which avoids entering the
while loop. This is why the same missed frames problem was also noticed
when experimenting with `sync_delay = 0`.
v2: adjust variable names and code style.
Fixes: https://bugzilla.gnome.org/show_bug.cgi?id=789186
and most of https://gitlab.gnome.org/GNOME/mutter/issues/571
https://gitlab.gnome.org/GNOME/mutter/merge_requests/520
If an update (new frame) had been scheduled already before `_clutter_stage_cogl_presented` was called then that means it was scheduled for the wrong time. Because the `last_presentation_time` has changed since then. And using an `update_time` based on an outdated presentation time results in scheduling frames too early, filling the buffer queue (triple buffering or worse) and high visual latency. So if we do receive a presentation event when an update is already scheduled, remember to reschedule the update based on the newer `last_presentation_time`. This way we avoid overfilling the buffer queue and limit ourselves to double buffering for less visible lag. Closes: https://gitlab.gnome.org/GNOME/mutter/issues/334 Prerequisite: https://gitlab.gnome.org/GNOME/mutter/merge_requests/520 https://gitlab.gnome.org/GNOME/mutter/merge_requests/281
One million is the number of microseconds in one second, which is also defined by `G_USEC_PER_SEC`. https://gitlab.gnome.org/GNOME/mutter/merge_requests/363
Instead of crazy refresh rates >1MHz falling back to 60Hz, just honour them by rendering unthrottled (same as `sync_delay < 0`). Although I wouldn't actually expect that path to ever be needed in reality, it just ensures an infinite `while` loop never happens. https://gitlab.gnome.org/GNOME/mutter/merge_requests/363
Instead of 0Hz falling back to 60Hz, use `CLUTTER_DEFAULT_FPS` which is also 60Hz by default. https://gitlab.gnome.org/GNOME/mutter/merge_requests/363
That could happen if the backend did not provide presentation timestamps,
or if the screen was not changing other than the hardware cursor:
if (stage_cogl->last_presentation_time == 0||
stage_cogl->last_presentation_time < now - 150000)
{
stage_cogl->update_time = now;
return;
}
By setting `update_time` to `now`, master_clock_get_swap_wait_time()
returns 0:
gint64 now = g_source_get_time (master_clock->source);
if (min_update_time < now)
{
return 0;
}
else
{
gint64 delay_us = min_update_time - now;
return (delay_us + 999) / 1000;
}
However, zero is a value unsupported by the default master clock
due to:
if (swap_delay != 0)
return swap_delay;
All cases are now handled by extrapolating when the next presentation
time would be and calculating an appropriate update time to meet that.
We also need to add a check for `update_time == last_update_time`, which
is a situation that just became possible since we support old (or zero)
values of `last_presentation_time`. This avoids getting more than one
stage update per frame interval when input events arrive without
triggering a stage redraw (e.g. moving the hardware cursor).
https://gitlab.gnome.org/GNOME/mutter/merge_requests/363
If `last_presentation_time` is zero (unsupported) or just very old (system was idle) then we would like to avoid that triggering a large number of loop interations. This will get us closer to the right answer without iterating. https://gitlab.gnome.org/GNOME/mutter/merge_requests/363
The presentation timing logic (via `master_clock_get_swap_wait_time`) now works unconditionally. By "works" we mean that a result of zero from `master_clock_get_swap_wait_time` actually means zero now. Previously zero could mean either a successful result of zero milliseconds or that the backend couldn't get an answer. And a non-zero result is the same as before. This works even if the screen is "idle" and even if the backend doesn't provide presentation timestamps. So now our two fallback throttling mechanisms of relying on `CLUTTER_FEATURE_SWAP_THROTTLE` and decimating to `clutter_get_default_frame_rate` can be deleted. Closes: https://gitlab.gnome.org/GNOME/mutter/issues/406 and https://bugzilla.gnome.org/show_bug.cgi?id=781835 https://gitlab.gnome.org/GNOME/mutter/merge_requests/363
After 4faeb12731b8, the maximum time allowed for an update to happen is calculated as: max_render_time_allowed = refresh_interval - 1000 * sync_delay; However, extremely small refresh intervals -- that come as consequence to extremely high refresh rates -- may fall into an odd numerical range when refresh_interval < 1000 * sync_delay. That would give us a negative time. To be extra cautious about it, add another sanity check for this case. Change suggested by Jasper St. Pierre. https://gitlab.gnome.org/GNOME/mutter/merge_requests/363
The single purpose of "threaded swap wait" was to provide the value:
`u.presentation_time = get_monotonic_time_ns ();` for use by
`clutter-stage-cogl`.
Until recently (before !363), all backends were required to provide
a nonzero value for `presentation_time` or else suffer falling back
to poor-performing throttling methods in `master_clock_next_frame_delay`.
So we needed "threaded swap wait" to support the Nvidia driver.
This is no longer true. The fallbacks don't exist any more and
`clutter_stage_cogl_schedule_update` now always succeeds even in the
absence of a `presentation_time` (since !363).
The drawbacks to keeping "threaded swap wait" are:
* `u.presentation_time = get_monotonic_time_ns ();` is a guess and not
an accurate hardware presentation time.
* It required blocking the main loop on every frame in
`_cogl_winsys_wait_for_gpu` due to `glFinish`. Any OpenGL programmer
will tell you calling `glFinish` is a bad idea because it kills CPU-GPU
parallelism. In my case, it was blocking the main loop for 1-3ms on
every mutter frame. It's easy to imagine slower (or higher resolution)
Nvidia systems would lose an even larger chunk of their frame interval
blocked in that function. This significantly crippled frame rates on
Nvidia systems.
The benefit to keeping "threaded swap wait" is:
* Its guess of `presentation_time` is likely a better guess by a few
milliseconds than the guess that `clutter_stage_cogl_schedule_update`
will make in its place.
So "threaded swap wait" provided better sub-frame phase accuracy, but at
the expense of frame rates. And as soon as it starts causing frame drops,
that one and only benefit is lost. There is no reason to keep it.
And in case you are wondering, the documentation for "threaded swap wait"
is now wrong (since !363):
> The advantage of enabling this is that it will allow your main loop
> to do other work while waiting for the system to be ready to draw
> the next frame, instead of blocking in glXSwapBuffers()."
At the time (before !363) it was true that "threaded swap wait" avoided
swap interval throttling that would occur as a result of
`master_clock_next_frame_delay` blindly returning zero and over-queuing
frames. That code no longer exists. And ironically the implementation of
"threaded swap wait" necessitates the same kind of blocking (to a lesser
extent) that it was designed to avoid. We can eliminate all blocking
however by deleting "threaded swap wait", which is now safe since !363.
https://gitlab.gnome.org/GNOME/mutter/merge_requests/602
This reverts commit 6824646.
As a protection against duplicate/early update times, it has already been replaced by commit 35aa2781 and commit 4faeb127. Removing it also prevents a common cause of frame skips: clutter_stage_cogl_schedule_update() clutter_stage_cogl_get_update_time() == -1 Closes: https://gitlab.gnome.org/GNOME/gnome-shell/issues/1411 https://gitlab.gnome.org/GNOME/mutter/merge_requests/719
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
This is just for anyone who wants to try the recent changes in mutter to frame timing. It also includes the proposed nvidia-specific threaded-swap removal which should improve performance.
This reverts a ton of local changes, and as a result will break cinnamon-settings while applied since some gsettings are removed, and I haven't created a complementary branch for cinnamon.
The gist of the changes here are: