perf: Improve animation performance and reduce frame drops#239
perf: Improve animation performance and reduce frame drops#239dawsers merged 1 commit intodawsers:masterfrom
Conversation
|
Thanks a lot for taking the time to do this. I have just started to look at it, but I notice you mention all the outputs are being animated no matter what. This shouldn't be the case unless there is some bug. Look at the following code in struct sway_output *output = root->outputs->items[i];
if (!output->enabled || !output->wlr_output->enabled ||
!root->filters.output_filter(output, root->filters.output_filter_data)) {
continue;
}
animate_output(output);The I am not saying there are no bugs there, but I think you haven't seen that code yet. |
|
You're right, I missed that animation_output_filter is already set before callback_step. So Fix 1 is a no-op for the common single-output case, the filter already handles it. It only matters when multiple outputs are in the animation list at once (moving windows between monitors), where each output's build_state still loops all N animating outputs. So yeah it's a minor benefit, but still correct I believe. The real fix was point 3 (vblank sync). The 16ms timer drifts against the display refresh, very noticeable on my 165Hz monitor. Driving the next frame from the commit callback keeps animations locked to vblank. This is much better with this for me. That being said, on my 60Hz monitor it still feels a bit choppy, probably because a fixed 16ms timer is indeed not an ideal solution. |
|
I think this is great. The only minor thing I am mildly against is in output_configure_scene(output, &root->root_scene->tree.node, 1.0f);by the new function output_configure_scene_for_output(output);unless you have found traversing the whole tree there causes an important delay, I would rather keep the simpler code inherited from sway. That traversal doesn't really do much, it just gathers opacity values from nodes, and updates their buffer opacity values. If the current opacity doesn't change, it is a no-op. So I don't think it is worth optimizing there for now. If you want to remove that function, I would love to merge this. After the merge, I will remove the timer used to schedule frames, because it will be unnecessary. That will involve some documentation and configuration changes, like My only concern is with people who have very high-refresh monitors and a poor CPU/GPU. They might see very bad performance. But we will see when we get there. |
|
Oh yeah it's fine, you have a better understanding of the code base than me, let's keep it simple! |
Fix O(N²) animation work on multi-monitor setups by scoping animate_root to only process the output currently being rendered, instead of re-animating all outputs from each output's render path. Sync animation frame scheduling to display vblank by requesting the next frame directly after a successful commit during animation, rather than relying solely on the independent 16ms timer whose phase drifts relative to the display refresh rate, causing periodic frame drops.
|
Thank you! |
1/ Fix O(N²) animation work on multi-monitor setups by scoping animate_root to only process the output currently being rendered, instead of re-animating all outputs from each output's render path.
2/ Reduce per-frame scene tree walk overhead by having output_configure_scene walk only the current output's layer subtrees and global trees, instead of the entire scene graph including all other outputs' nodes.
3/ Sync animation frame scheduling to display vblank by requesting the next frame directly after a successful commit during animation, rather than relying solely on the independent 16ms timer whose phase drifts relative to the display refresh rate, causing periodic frame drops.
Fix 3 in output_repaint_timer_handler() was the one that really made the difference for me. But I figured the other 2 were still worth including in that PR.