Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upClip chain refactor for local picture raster support. #2951
Conversation
|
This still needs some more work, specifically:
Anecdotally it seems to be quite a bit faster on the sites I've tested against than previously, although I haven't done detailed performance numbers yet. It's probably not quite ready for review yet, but if anyone wants to start taking a look at it, go for it |
|
If anyone has any general comments on the approach taken, and anything obvious missing, please let me know :) |
9a56c52
to
a65f564
|
I added some poorly drawn ASCII diagrams to try and describe how the data structures fit together (a65f564#diff-97cd013eed8f4591bd7659faf4352fd2R23). I have fixed the test failues in wrench, and most of the test failures in gecko try. Tomorrow I will try to resolve the remaining failures in gecko try, which are mostly 3d transform related. |
patch. The overall goal is to make clip chain handling not depend on screen rectangle generation. This will make it much easier to implement local space picture rasterization for perspective elements. In addition to that, we unify how user / hierarchical and primitive clip chain nodes are handled. Clip chain instances are built on demand now, as required by primitives. Right now, none of the clip chain information is cached between primitives. In general, now that the calculations are done in local space, they are much faster - however we can easily start to cache the common clip-chain state (per raster coordinate system) if it ever shows up in a profile. Additionally, since we only build clip chains on demand, and clip chains for non- visible primitives are not built, which is a decent saving on many pages. Remove ClipNode altogether. Instead, during flattening we map directly from a ClipId to a ClipChainId as appropriate. Clip chains are now created during flattening and the links remain constant. It's only during construction of a clip chain instance that we include and exclude parts of the specified clip chain. This unifies how we handle user defined clip chains and legacy hierarchical clip chains. At the same time, unify how the optional primitive specific complex clip is handled with clip chains. Now, if a primitive has a complex clip, it's added as a normal clip chain node, and linked to the parent clip chain that this primitive would otherwise be attached to. Primitive runs now only depend on the spatial node. This results in far fewer primitive run breaks. ClipWorkItems are now stored as a single contiguous array in the ClipStore. To achieve this they are stored as an index buffer, and a clip node range is simply an (index, count). These are a lot cheaper to pass to clip masks, and also require fewer heap allocations. Port hit testing to use the same clip chain data structures that the main clip code uses, now that the hierarchical and user defined clip chains are handled in a uniform way. Remove the need for Arc in clip chain code, since clip chains are now stored in a contiguous index buffer style structure.
|
Try run is looking very green now - just a few orange items with unrelated intermittents. r? @kvark |
|
This is an amazing change. I like how the function signatures got lighter, and the forest of clip types is now easier to grasp.
webrender/src/clip.rs, line 29 at r1 (raw file):
maybe rename to webrender/src/clip.rs, line 48 at r1 (raw file):
maybe webrender/src/clip.rs, line 92 at r1 (raw file):
sounds more like webrender/src/clip.rs, line 140 at r1 (raw file):
we should eventually get rid of webrender/src/clip.rs, line 189 at r1 (raw file):
why not just webrender/src/clip.rs, line 201 at r1 (raw file):
does this variant give us anything over webrender/src/clip.rs, line 225 at r1 (raw file):
this is confusing, given that "clip space" has a much different semantics in computer graphics webrender/src/clip.rs, line 338 at r1 (raw file):
nit: let's list those explicitly webrender/src/clip.rs, line 356 at r1 (raw file):
would it make sense to use webrender/src/clip.rs, line 401 at r1 (raw file):
this collection is not needed. webrender/src/clip.rs, line 490 at r1 (raw file):
typo "wher" webrender/src/clip.rs, line 491 at r1 (raw file):
perspective transform? webrender/src/clip.rs, line 524 at r1 (raw file):
the second condition could shadow the first one (i.e the first one seems unnecessary) webrender/src/clip.rs, line 528 at r1 (raw file):
we'll need to route the chasing logic here (as a followup by me, presumably) webrender/src/clip.rs, line 587 at r1 (raw file):
interesting, so webrender/src/clip.rs, line 880 at r1 (raw file):
that function is so nice, can't get my eyes off it :) webrender/src/clip.rs, line 904 at r1 (raw file):
nit: could also contain the intersection with webrender/src/clip.rs, line 918 at r1 (raw file):
nit: might be tidier to do webrender/src/clip.rs, line 932 at r1 (raw file):
iirc, it's fairly fast webrender/src/display_list_flattener.rs, line 759 at r1 (raw file):
shouldn't it be called webrender/src/display_list_flattener.rs, line 1284 at r1 (raw file):
let me get this straight: so, webrender/src/display_list_flattener.rs, line 1291 at r1 (raw file):
uh, we need to completely split webrender/src/display_list_flattener.rs, line 1295 at r1 (raw file):
I'll clean this up webrender/src/hit_test.rs, line 78 at r1 (raw file):
does it make sense versus passing webrender/src/hit_test.rs, line 152 at r1 (raw file):
nit: |
webrender/src/clip.rs, line 29 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/clip.rs, line 48 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/clip.rs, line 92 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/clip.rs, line 140 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Agreed. webrender/src/clip.rs, line 189 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
I tried this locally - but it seems a bit unwieldy. The lack of is_empty() on stable and the way we use the count field here makes the code seem a bit tidier with a custom range, I think. webrender/src/clip.rs, line 201 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
A (perhaps small) performance optimization - no translate / offsetting required in these cases, which is probably quite a common use case. webrender/src/clip.rs, line 225 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/clip.rs, line 338 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done webrender/src/clip.rs, line 356 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
I hadn't for now, to minimize changes to the segmenting code in this patch. But it could make sense to do this as a follow up, yep. webrender/src/clip.rs, line 401 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/clip.rs, line 490 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/clip.rs, line 491 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/clip.rs, line 524 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/clip.rs, line 528 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Sure, if you're happy to do that webrender/src/clip.rs, line 587 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Yep - because Local means that the offsets between the clip and the primitive cannot change due to scrolling. webrender/src/clip.rs, line 880 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
webrender/src/clip.rs, line 904 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Yea - I was thinking that since contains_rect is cheaper, and perhaps the common path, it might be slightly faster. But we should profile as a follow up and see which is actually more common. webrender/src/clip.rs, line 918 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
I think this would probably make sense if we do the contains_rect suggestion above, based on some profiling. webrender/src/clip.rs, line 932 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Yea, I expect it will be fine too. webrender/src/display_list_flattener.rs, line 759 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done. webrender/src/display_list_flattener.rs, line 1284 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Yea - Martin and I had discussed that we would like to tidy up the public API here and make it explicit, but we agreed to leave this for now, until there's a better time for Gecko to make those API changes. webrender/src/display_list_flattener.rs, line 1291 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Yep - see above comment :) webrender/src/display_list_flattener.rs, line 1295 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
webrender/src/hit_test.rs, line 78 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Perhaps, I don't think it matters too much either way. webrender/src/hit_test.rs, line 152 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done |
|
@kvark Thanks for the review! Those naming suggestions are much better than I had |
|
|
@bors-servo r+ |
|
|
|
Once this lands we should be on the lookout for any regressions in number of clip masks we draw, or any hit-test issues. I wouldn't be surprised to have a few issues like that which need to be fixed up |
|
@bors-servo r=kvark (is bors stuck?) |
|
|
|
Clip chain refactor for local picture raster support. There's a large number of related changes in this patch. The overall goal is to make clip chain handling not depend on screen rectangle generation. This will make it much easier to implement local space picture rasterization for perspective elements. In addition to that, we unify how user / hierarchical and primitive clip chain nodes are handled. Clip chain instances are built on demand now, as required by primitives. Right now, none of the clip chain information is cached between primitives. In general, now that the calculations are done in local space, they are much faster - however we can easily start to cache the common clip-chain state (per raster coordinate system) if it ever shows up in a profile. Additionally, since we only build clip chains on demand, and clip chains for non- visible primitives are not built, which is a decent saving on many pages. Remove ClipNode altogether. Instead, during flattening we map directly from a ClipId to a ClipChainId as appropriate. Clip chains are now created during flattening and the links remain constant. It's only during construction of a clip chain instance that we include and exclude parts of the specified clip chain. This unifies how we handle user defined clip chains and legacy hierarchical clip chains. At the same time, unify how the optional primitive specific complex clip is handled with clip chains. Now, if a primitive has a complex clip, it's added as a normal clip chain node, and linked to the parent clip chain that this primitive would otherwise be attached to. Primitive runs now only depend on the spatial node. This results in far fewer primitive run breaks. ClipWorkItems are now stored as a single contiguous array in the ClipStore. To achieve this they are stored as an index buffer, and a clip node range is simply an (index, count). These are a lot cheaper to pass to clip masks, and also require fewer heap allocations. Port hit testing to use the same clip chain data structures that the main clip code uses, now that the hierarchical and user defined clip chains are handled in a uniform way. Remove the need for Arc in clip chain code, since clip chains are now stored in a contiguous index buffer style structure. <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/2951) <!-- Reviewable:end -->
|
|
|
Hmmm, on subsequent tries I'm seeing failures, but when I revert those back to this patch, I'm still seeing the same failures - but those same failures were not occurring on the try run above. |
|
Hmm, and it's actually a single test that is showing up in multiple places due to being run on each platform - |
|
Oh, I think I see what has happened. Ugh.
Investigating now to see why this patch breaks that test. |
|
By Bug 1483610 Comment 5, the change might cause huge gecko performance regression. |
gw3583 commentedAug 6, 2018
•
edited by larsbergstrom
There's a large number of related changes in this
patch. The overall goal is to make clip chain handling
not depend on screen rectangle generation. This
will make it much easier to implement local space
picture rasterization for perspective elements. In
addition to that, we unify how user / hierarchical and
primitive clip chain nodes are handled.
Clip chain instances are built on demand now, as
required by primitives. Right now, none of the
clip chain information is cached between primitives.
In general, now that the calculations are done in
local space, they are much faster - however we can
easily start to cache the common clip-chain state
(per raster coordinate system) if it ever shows up
in a profile. Additionally, since we only build
clip chains on demand, and clip chains for non-
visible primitives are not built, which is a
decent saving on many pages.
Remove ClipNode altogether. Instead, during flattening
we map directly from a ClipId to a ClipChainId
as appropriate. Clip chains are now created during
flattening and the links remain constant. It's only
during construction of a clip chain instance that we
include and exclude parts of the specified clip
chain. This unifies how we handle user defined clip
chains and legacy hierarchical clip chains. At the
same time, unify how the optional primitive specific
complex clip is handled with clip chains. Now, if
a primitive has a complex clip, it's added as a
normal clip chain node, and linked to the parent
clip chain that this primitive would otherwise
be attached to.
Primitive runs now only depend on the spatial node.
This results in far fewer primitive run breaks.
ClipWorkItems are now stored as a single contiguous
array in the ClipStore. To achieve this they are
stored as an index buffer, and a clip node range
is simply an (index, count). These are a lot
cheaper to pass to clip masks, and also require
fewer heap allocations.
Port hit testing to use the same clip chain data
structures that the main clip code uses, now that
the hierarchical and user defined clip chains are
handled in a uniform way.
Remove the need for Arc in clip chain code, since
clip chains are now stored in a contiguous index
buffer style structure.
This change is