Track damage rect and only draw inside it #33939
Comments
Just talked with @chinmaygarde on this issue. There might be blockers in OpenGL backends such as I'll talk to @drWulf for more details on the Android side. |
Just talked with Derek (@drWulf) and got the following points:
It seems that we can get some significant performance gain for 80%+ Android devices (with |
I shouldn't be quoted on the 80% number. I'm aware of a variety of devices picking up EGL_EXT_buffer_age because of its benefits when running HWUI. I'm still trying to find accurate numbers for how many devices support that extension, but in any case on the newer devices that do the performance gains will be substantial. |
@liyuqian Can you provide some profiling that shows the problem? Here is a frame draw from the Metal backend: |
Due to #31865 (comment), this is unlikely to help #31865 much. Instead, we should come up with a test case with a complex page that has only a blinking cursor. We'll aim to improve the performance of that page significantly using damage rects. |
Any progress on tracking the damage rect and drawing inside it? |
I don't think there is a Metal API for partial updates, but partial updates can be approximated to a certain degree for both Metal and OpenGL using IOSurface and multiple CALayers. The current situation, where a single spinner redraws an entire 4K window, is problematic when it comes to power consumption. As far as I can tell, EGL_EXT_buffer_age is not going to help figure out which part of the layer tree has changed since the last frame and needs to be re-rendered; that still needs to be addressed, right? |
@liyuqian do you have a plan for this feature? |
Last year I wrote up an analysis of the problems to solve to get to the point of tracking damage areas in the Flutter ecosystem and I'm looking at cleaning it up for publication in the next few weeks. To answer some quick questions:
Getting that document fleshed out and published is next on my plate after I finish a couple of other smaller tasks. |
@flar, this has been bugging me for a while, and I had a bit of time to spend on it, so I pushed a proof of concept implementation in my fork. Right now it's limited to macOS, but it shows great potential with pathological cases such as the Progress Indicator example (MacBook Pro i7, iGPU). When implementing it I had 3 main use cases in mind:
Which pretty much means that content that is completely outside of the repaint boundary is not touched at all (in most cases not even having to be clipped by Skia). A couple of implementation remarks:
I can provide a design doc if this seems useful. Example screenshots showing actual screen updates using Quartz Debug |
@knopp that sounds great so far. You might want to create a WIP PR with the changes so we can start a discussion about the implementation and what more might be needed. I'd love to see that design document as well. How are you tracking changes (insertions, deletions, reorderings) to the children of a ContainerLayer? |
Good |
@flar, the tracking is quite straightforward. It does some additional work during preroll in exchange for not having to diff a tree, just a set:
So we end up with This has a nice benefit of easily being able to do diff between any frame in past for which we have Sidenote: If we knew that a layer (or at least a subrect) is fully opaque, during the first step we could easily determine occluded layers, which might be worth it for complex scenes. |
Thanks @knopp for implementing it! The overall direction looks right. Here are two suggestions that might be helpful:
1. Damage rect of two adjacent frames can derive the damage rect of multiple frames
Suppose there are frames F_0, F_1, F_2, ..., F_n. We only need to compute the damage rects D_1, D_2, ..., D_n, where D_i is the damage rect between F_i and F_{i-1}. If one needs to figure out the overall damage rect among multiple frames F_a, F_{a+1}, ..., F_b, we can simply use the union of D_{a+1}, D_{a+2}, ..., D_b. That's why we consider Hopefully this
2. Consider the two-frame damage rect as a longest sub-sequence matching problem
As @flar pointed out in flutter/engine#21009, order matters, so set comparison is insufficient. Hence we need to consider the sequence. Consider the following classic algorithmic problem: given two strings (sequences of characters), what is the longest sub-sequence that both contain? For example, "hello_world" and "hi_word" both contain the sub-sequence "h_wor". The unmatched characters are kind of like the damage "characters". Replacing "characters" with "layers" results in our problem. In our case, we can flatten a layer tree to a layer sequence by pre-order traversal. Each layer can be abstracted as a rect and an id (all content, mutator stacks, ..., must match to have the same id). Then we want to compute the best layer sub-sequence match to minimize the damage rect, which is the union of the rects of all unmatched layers. An O(n^2) dynamic programming algorithm can be used to compute the optimal solution, or maybe we can use some O(n) greedy algorithm to compute an approximately good solution. I believe the Flutter framework sometimes uses the following: match as many as possible from the beginning, match as many as possible from the end, and give up on the whole middle part. It would result in the linear reconciliation as described here. (CC @Hixie for confirmation.) I think we can start with that simple linear greedy algorithm, and test the more sophisticated algorithm later once we have set up some integration tests (macrobenchmarks) as well as unit test benchmarks (microbenchmarks). |
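To make suggestion 2 concrete, here is a minimal Python sketch of the O(n^2) dynamic-programming variant. It is an illustration only: `Layer`, `lcs_matches`, and `damage_rect` are hypothetical names, not engine APIs, and the sketch maximizes the number of matched layers, which is not necessarily the same as minimizing the damage area.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Rect = Tuple[float, float, float, float]  # (left, top, right, bottom)


@dataclass(frozen=True)
class Layer:
    id: str    # hypothetical identity: equal ids mean identical content
    rect: Rect


def union(a: Optional[Rect], b: Optional[Rect]) -> Optional[Rect]:
    """Bounding box of two rects; None stands for 'no damage'."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def lcs_matches(old: List[Layer], new: List[Layer]) -> List[Tuple[int, int]]:
    """Index pairs (i, j) of one longest common subsequence, compared by id."""
    n, m = len(old), len(new)
    # table[i][j] = length of the LCS of old[i:] and new[j:]
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if old[i].id == new[j].id:
                table[i][j] = table[i + 1][j + 1] + 1
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    pairs, i, j = [], 0, 0
    while i < n and j < m:
        if old[i].id == new[j].id:
            pairs.append((i, j))
            i, j = i + 1, j + 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def damage_rect(old: List[Layer], new: List[Layer]) -> Optional[Rect]:
    """Union of the rects of every layer left unmatched in either frame."""
    pairs = lcs_matches(old, new)
    matched_old = {i for i, _ in pairs}
    matched_new = {j for _, j in pairs}
    damage = None
    for i, layer in enumerate(old):
        if i not in matched_old:
            damage = union(damage, layer.rect)
    for j, layer in enumerate(new):
        if j not in matched_new:
            damage = union(damage, layer.rect)
    return damage
```

For a page where only a cursor layer disappears between two frames, `damage_rect` returns just the cursor's rect; for two identical sequences it returns `None`.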
@liyuqian, thank you for the feedback. Regarding 2) Any unmatched layer contributes to the damage rect. For calculating the damage rect it doesn't really matter in what order the unmatched layers were painted, though, so here set comparison is enough. The paint order is, however, relevant for layers that match between frames. If two layers intersect and match between frames, they both need to contribute to the damage area if they are painted in a different order. I addressed that here after @flar's comment. Can you give me any example of a layer tree change between frames where this approach is not sufficient to calculate the damage area? Regarding 1) I agree that knowing the age of the back buffer is important, but I'm not sure that simply keeping the damage rects of past frames is ideal. Suppose the following situation: You have 2 buffers, you add a layer, show it for one frame and then remove it:
At step 4, if you do diff with previous frame + union with frame before that, you will need to paint the area below the removed layer. Whereas if you keep the (this assumes situation where you flip entire framebuffers, with partial damage flip the situation is different) |
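The difference between the two strategies can be sketched in Python. The four-frame sequence below is an assumed reconstruction of the scenario (the original step list is not preserved here), and `diff` and `union` are hypothetical helpers, not engine code. Frames are modeled as dicts from a layer id to its rect.

```python
def union(a, b):
    """Bounding box of two (left, top, right, bottom) rects; None means empty."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def diff(old, new):
    """Union of the rects of layers added, removed, or changed between frames."""
    damage = None
    for layer_id in old.keys() | new.keys():
        before, after = old.get(layer_id), new.get(layer_id)
        if before != after:
            damage = union(damage, before)
            damage = union(damage, after)
    return damage


page = {"page": (0, 0, 800, 600)}
page_with_popup = {"page": (0, 0, 800, 600), "popup": (100, 100, 300, 200)}

# Assumed sequence with two buffers A and B:
#   frame 1 -> A, frame 2 -> B, frame 3 (popup shown) -> A, frame 4 (popup gone) -> B
frames = [page, page, page_with_popup, page]

# Frame 4 is drawn into buffer B, which still holds frame 2.
# Strategy 1: accumulate the adjacent-frame damage D_3 and D_4.
accumulated = union(diff(frames[1], frames[2]), diff(frames[2], frames[3]))
# Strategy 2: keep the description of frame 2 and diff against it directly.
direct = diff(frames[1], frames[3])

print(accumulated)  # (100, 100, 300, 200): the area under the removed popup
print(direct)       # None: buffer B never contained the popup
```

Accumulation is still correct here, it just repaints an area that the target buffer already shows correctly.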
Btw, I forgot to mention: in this branch, which is rebased on the WIP PR, the macOS embedder is modified to only repaint the damaged region and highlight the change (so it can be seen in practice, e.g. with Flutter Gallery). |
It is true that simply accumulating the per-frame damage areas is enough for correctness, but it isn't optimal. However, I don't believe there are many practical cases where it is sub-optimal. The "layer comes and then goes" example defines the problem, but in practice layers don't appear and disappear in adjacent frames. Most layers that appear animate in and out so the appearance and disappearance are spread over a few frames that are going to be longer than the history of back buffers. I am not 100% certain on whether small elements like text cursors fade in or out, but even if they did no fading, their area is so small that the accumulation of damage and repainting just that tiny area would be fine. Another technique that can be used is to render lightweight annotations like a text cursor on another layer and then we simply animate the appearance and disappearance of that additional item in the annotation layer without having to redraw the tree. Unfortunately, this constrains the app design as this technique doesn't work as well if the text cursor is on a layer that has other layers overlapping it. |
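As a rough Python sketch of the accumulation approach, the class below keeps a short history of per-frame damage and unions as many entries as the buffer age requires. It assumes the EGL_EXT_buffer_age convention that an age of 0 means the buffer contents are undefined and an age of n means the buffer holds the frame from n frames ago. `DamageHistory` and `repaint_rect` are hypothetical names, not engine APIs.

```python
from collections import deque


def union(a, b):
    """Bounding box of two (left, top, right, bottom) rects; None means empty."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


class DamageHistory:
    """Remembers the adjacent-frame damage of the last `max_frames` frames."""

    def __init__(self, max_frames, full_rect):
        self._past = deque(maxlen=max_frames)  # newest entry first
        self._full = full_rect

    def repaint_rect(self, frame_damage, buffer_age):
        """Area to repaint in a buffer of the given age.

        frame_damage is the damage between this frame and the previous one
        (None if nothing changed).
        """
        needed = buffer_age - 1  # older per-frame damages that must be replayed
        if buffer_age <= 0 or needed > len(self._past):
            result = self._full  # unknown contents or history too short
        else:
            result = frame_damage
            for past in list(self._past)[:needed]:
                result = union(result, past)
        self._past.appendleft(frame_damage)
        return result


history = DamageHistory(max_frames=3, full_rect=(0, 0, 800, 600))
first = history.repaint_rect((0, 0, 10, 10), buffer_age=0)     # full repaint
second = history.repaint_rect((20, 20, 30, 30), buffer_age=1)  # own damage only
third = history.repaint_rect((40, 40, 50, 50), buffer_age=2)   # own + previous
```

The history only needs to be as long as the number of buffers in flight, which is why the cost of this approach stays small.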
Storing the frame description from past frames shouldn't bring much overhead. It does retain layers, but judging from the gallery most of them are either retained anyway or are PhysicalShapeLayers, which don't have a lot of state. Diffing the tree is more or less a linear operation (detecting reordered frames would be O(n^2) in the worst case where all matching frames are painted in reverse order, which is unlikely to happen). But I don't feel too strongly about this. The major use cases are things like a progress indicator in an otherwise static page, scrolling a small ListView in a large window, or a blinking text cursor, and I agree that here it won't make any meaningful difference. |
One can treat re-ordered elements as "missing and new" flexibly in a single pass that only matches ordered elements. You have to use a "diff" algorithm because the first elements that you encounter that don't match could be due to a missing element in either tree and you can't determine that without some further traversal. For a prototype that I wrote I basically just did a "match from the beginning + match from the end" on each container and treated all children between those points as damage. This is precisely what you need for an added child (or a few children added together) and precisely what you need for a deleted child (or a few children deleted together). I'm not sure that shuffling children is a common enough app design that we need to do better than that, especially for a first pass. Here is a diff of the (massively incomplete) prototype I was working on last November (note that I believe that those changes to make ImageFilter comparable and therefore stable between frames are already in the tree pushed separately in preparation for this planned work)... |
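The "match from the beginning + match from the end" idea for a single container can be sketched in Python as follows. This is an illustration under assumptions, not the prototype's code: `Layer` and `container_damage` are hypothetical names, and two children are considered equal when both their id and rect match.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layer:
    id: str
    rect: Tuple[float, float, float, float]  # (left, top, right, bottom)


def union(a, b):
    """Bounding box of two rects; None means empty."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def container_damage(old_children, new_children):
    """Treat everything between the common prefix and common suffix as damage."""
    n_old, n_new = len(old_children), len(new_children)
    start = 0
    while (start < n_old and start < n_new
           and old_children[start] == new_children[start]):
        start += 1
    end = 0
    while (end < n_old - start and end < n_new - start
           and old_children[n_old - 1 - end] == new_children[n_new - 1 - end]):
        end += 1
    damage = None
    unmatched = old_children[start:n_old - end] + new_children[start:n_new - end]
    for child in unmatched:
        damage = union(damage, child.rect)
    return damage


a = Layer("a", (0, 0, 100, 50))
b = Layer("b", (0, 50, 100, 100))
c = Layer("c", (0, 100, 100, 150))
x = Layer("x", (10, 10, 20, 20))

inserted = container_damage([a, b, c], [a, x, b, c])  # only x is damaged
shuffled = container_damage([a, b, c], [a, c, b])     # b and c are both damaged
```

An added or deleted run of children yields exactly its own bounds, while a shuffle falls back to damaging the whole middle section, which matches the trade-off described above.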
@flar, in my implementation I first detect which layers were added, removed or otherwise changed (e.g. moved). Those all contribute to the dirty area. Those are layers for which we can't find a matching LayerEntry between frames. After that I'm left with two sets of layers which have the same elements; just the (paint) order is different. Determining which are reordered is simple, since at that point there are no added / removed layers. knopp/engine@b50df38#diff-32e5e35893c1cb045caca1c8fcbcff8eR139 As for image filter / backdrop layers, I serialize the filter and compare the serialized data to detect if the layer has changed. I also adjust the paint bounds of child layers (image filter) and layers below (backdrop filter) to account for the additional area (e.g. blur). |
@flar, I've added an initial PR for the wiring required for partial update on iOS and Android. There are a couple of issues to resolve:
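A simplified Python sketch of this two-step approach is shown below. It is not the code from the linked commit: `Layer`, `intersects`, and `frame_damage` are hypothetical names, layers are assumed to be unique within a frame, and the reorder check is a naive pairwise scan.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Layer:
    id: str
    rect: Tuple[float, float, float, float]  # (left, top, right, bottom)


def union(a, b):
    """Bounding box of two rects; None means empty."""
    if a is None:
        return b
    if b is None:
        return a
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def intersects(a, b):
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def frame_damage(old, new):
    """old and new are lists of Layer in paint order."""
    old_set, new_set = set(old), set(new)
    damage = None
    # Step 1: layers without a match (added, removed, moved, or changed).
    for layer in old_set ^ new_set:
        damage = union(damage, layer.rect)
    # Step 2: matched layers whose relative paint order flipped and that overlap.
    common = old_set & new_set
    old_order = [layer for layer in old if layer in common]
    new_rank = {layer: i
                for i, layer in enumerate(l for l in new if l in common)}
    for i, first in enumerate(old_order):
        for second in old_order[i + 1:]:
            flipped = new_rank[first] > new_rank[second]
            if flipped and intersects(first.rect, second.rect):
                damage = union(union(damage, first.rect), second.rect)
    return damage


bg = Layer("bg", (0, 0, 100, 100))
a = Layer("a", (10, 10, 50, 50))
b = Layer("b", (40, 40, 80, 80))
far = Layer("far", (90, 90, 100, 100))

swapped_overlapping = frame_damage([bg, a, b], [bg, b, a])
swapped_disjoint = frame_damage([bg, a, far], [bg, far, a])
```

Swapping two overlapping layers damages both of them, while swapping two layers that do not touch produces no damage at all.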
|
@knopp I would ask those questions in the PR. I've added some reviewers that I think might have some good answers for the various issues. What glitches do you see on 6.0 devices? |
I only have one 6.0 device. The glitch I'm seeing is that the content outside of the clip area (outside the area specified by EGL_KHR_partial_update) is not preserved and turns into garbage after a few frames. It doesn't happen on any of my other devices. I'll see if I can get another phone with Android 6 to test this. |
Escalating priority again so that the engine team can discuss when it meets next week per discussion with Google team interested in this work. |
Just an update on things: DiffContext landed (flutter/engine#21824) and the PR to wire it up is flutter/engine#25158, though I haven't had much time to test it after rebasing, and there are still unresolved issues, hence the WIP. |
Excellent news! Thanks! |
Thanks for the update. Ray will work on gathering additional traces from the customer as well. |
@knopp can we see this optimization with the next Flutter stable release? Also do you have any statistics for fps, jank, cpu usage and battery consumption of the Flutter sample app before and after this PR? I wonder if calculating the diff takes too much time or not. Anyhow thank you very much for trying to increase performance on Flutter. We desperately need it because we are a video feed app and our battery usage is unacceptable:/ |
@knopp @flar Why disable partial repaint for devices older than Android 9?
We can see in the Android source code that Android 6.0 started to use eglSwapBuffersWithDamageKHR to swap buffers. eglSetDamageRegionKHR was also introduced in Android 7.0. eglSwapBuffersWithDamageKHR: https://cs.android.com/android/platform/superproject/+/android-6.0.0_r1:frameworks/base/libs/hwui/renderthread/EglManager.cpp;l=282 |
I've had glitches with some devices on old Android versions, but those seem to have been resolved by adding support for quadruple buffering and specifying a larger clip alignment. Still, given the heterogeneity of Android devices, we decided to only enable it for newer API levels and possibly revisit the threshold later. |
@knopp Thanks for the reply. |
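One way to picture "clip alignment" is rounding the damage rect outward to a multiple of some alignment before handing it to the driver. The Python sketch below is only an illustration of that rounding: `align_rect` is a hypothetical helper, and the alignment value of 16 is an assumption, not a number taken from the engine or from any device.

```python
def align_rect(rect, alignment, surface_size):
    """Expand an integer (left, top, right, bottom) rect to alignment boundaries.

    The result always contains the input rect and is clamped to the surface.
    """
    left, top, right, bottom = rect
    width, height = surface_size
    left = (left // alignment) * alignment          # round down
    top = (top // alignment) * alignment            # round down
    right = -(-right // alignment) * alignment      # round up
    bottom = -(-bottom // alignment) * alignment    # round up
    return (max(left, 0), max(top, 0), min(right, width), min(bottom, height))


aligned = align_rect((13, 5, 70, 33), alignment=16, surface_size=(100, 100))
print(aligned)  # (0, 0, 80, 48)
```

A larger alignment repaints slightly more pixels per frame in exchange for damage rects that land on coarser boundaries.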
(As I understand it, the current status here is that this is implemented for Android and iOS but not yet the other platforms.) |
From new feature request scan: The missing pieces on desktop platforms are around wiring up through the embedder API. Since we have issues filed for those elsewhere, I will close this issue. |
@zanderso Can you please link those issues so we can follow? Thanks |
Currently, Flutter redraws every pixel even if there's a very tiny part of the screen that's animating (e.g., CircularProgressIndicator, caret in TextField).
We should compute the damage rect (which shouldn't be hard due to our repaint mechanism), reuse the buffer from previous frames, and draw the new frame with the clip set to the damage rect.
Hopefully this could have a significant impact on CPU/GPU usages for issues like #31865