Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upCalculate local clipping rectangle per-primitive run #2200
Conversation
|
Gecko try in progress here: https://treeherder.mozilla.org/#/jobs?repo=try&revision=181a770af0c959460218df6c7d0a481ada2c70d1 |
|
Looks like there are two unexpected passes and one test which also seems to be crashing in other, unrelated try pushes. It's WebRender-related though, so I'll do some more investigation into that particular panic. |
433cba6
to
b210343
|
The Talos panic does not seem to be related to this PR and is being tracked here: https://bugzilla.mozilla.org/show_bug.cgi?id=1423203#c10 |
|
Looks like a great change! I've got a few concerns, plus questions about the description:
Wouldn't this prevent us from culling the primitive runs earlier?
How did the old (hierarchical update) approach prevent the custom clip chains? |
| node.scroll_offset = offsets.zw; | ||
|
|
||
| vec4 misc = TEXEL_FETCH(sClipScrollNodes, uv0, 0, ivec2(6, 0)); | ||
| vec4 misc = TEXEL_FETCH(sClipScrollNodes, uv0, 0, ivec2(4, 0)); | ||
| node.is_axis_aligned = misc.x == 0.0; |
This comment has been minimized.
This comment has been minimized.
kvark
Dec 8, 2017
Member
looks like reference_frame_relative_scroll_offset and scroll_offset are left uninitialized?
This comment has been minimized.
This comment has been minimized.
| @@ -122,9 +122,12 @@ pub struct ClipScrollNode { | |||
| /// The axis-aligned coordinate system id of this node. | |||
| pub coordinate_system_id: CoordinateSystemId, | |||
|
|
|||
| pub coordinate_system_relative_transform: TransformOrOffset, | |||
This comment has been minimized.
This comment has been minimized.
kvark
Dec 8, 2017
Member
Hmm, this needs a comment.
If this is relative to the current coordinate system ID, then the transform could be axis-alinged rotation/scale? what else?
This comment has been minimized.
This comment has been minimized.
| /// A linear ID / index of this clip-scroll node. Used as a reference to | ||
| /// pass to shaders, to allow them to fetch a given clip-scroll node. | ||
| pub node_data_index: ClipScrollNodeIndex, | ||
|
|
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| source_transform: source_transform.unwrap_or(PropertyBinding::Value(identity)), | ||
| source_perspective: source_perspective.unwrap_or(identity), | ||
| origin_in_parent_reference_frame, | ||
| uninvertible: false, |
This comment has been minimized.
This comment has been minimized.
kvark
Dec 8, 2017
Member
nit: perhaps make it positive-meaning, like "can_invert" or "is_invertible"
This comment has been minimized.
This comment has been minimized.
| /// A transformation with an inverse. If the inverse isn't present, this isn't a 2D | ||
| /// transformation, which means we need to fall back to using inverse_rect_footprint. | ||
| /// Since this operation is so expensive, we avoid it for the 2D case. | ||
| Transform { |
This comment has been minimized.
This comment has been minimized.
| TransformOrOffset::Transform { inverse: Some(inverse), .. } => | ||
| inverse.transform_rect(&rect), | ||
| TransformOrOffset::Transform { transform, inverse: None } => | ||
| transform.inverse_rect_footprint(rect), |
This comment has been minimized.
This comment has been minimized.
|
Reviewed 27 of 27 files at r1. webrender/res/prim_shared.glsl, line 167 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
I think those fields are unused now, so can probably be removed from both the CPU and GPU versions of the ClipScrollNodeData. webrender/src/prim_store.rs, line 1240 at r1 (raw file):
Should this be in the PrimitiveContext? Not sure... webrender/src/prim_store.rs, line 1578 at r1 (raw file):
Could you expand on that comment? webrender/src/util.rs, line 372 at r1 (raw file):
nit: grammar Comments from Reviewable |
|
This is a great change! I left a few additional minor comments in reviewable. Apart from that, I just have one other concern. in the case of property animation on transforms, it would seem we can get in a situation where the local rect in the primitive changes. But since we don't invalidate the GPU cache request, we won't update the cache with the new local clip rect. So, we would need to detect if the local clip rect for a primitive run has changed since the last frame (which should be quite rare), and then invalidate the GPU blocks for the primitives that make up that run. Does that sound right? |
|
|
b210343
to
632345b
|
Okay. Sorry for the delay on the new patch. After talking with Glenn about this, I finally have a new version which adds an entirely new texture for the local clips rects (now called "clip chain rects"). The primitive rectangle is calculated early by intersecting the local clip rect with the appropriate clip chain rect when the primitive is loaded in the shader. This ensures that that they clip chain rectangles are up to date even with property binding updates, but prevents a lot of churn in the GPU cache. The addition of the new texture is unfortunate, but this seemed like the lesser hack of all the things considered. |
|
Here is the try run for this version: https://treeherder.mozilla.org/#/jobs?repo=try&revision=15c6957adf8edbd4d347c3bdbfe9665bd6633dcd |
I haven't reviewed the new code, but this sounds suspicious. We shouldn't be moving away from GPU cache back in the land of free-for-all textures just because some use cases are slow to update. I'd rather have us focused on getting the GPU cache performance solid, since it is simpler and has greater positive impact on the rest of the code. If the clip rect updates are slow, we may try the following things:
|
|
The issue it prevents in the GPU cache is that when we invalidate the item for a text run, that involves re-uploading the entire set of glyphs for a text run, which involves a lot of extra work (we found about an extra 0.5ms in compositor CPU time, but there is also the backend time in re-copying those glyphs across). We could have an extra entry in the GPU cache, but that would involve adding an extra It's not ideal, but I think it's probably the best solution for now. In the future, once we are able to compress the primitive instance structures further, it probably makes sense to remove this and use a normal GPU cache entry instead? |
|
This looks good to me. The try shows two new passes. @kvark Anything you want to check over on this or is it ready to go? |
|
|
|
Reviewed 4 of 11 files at r2. Comments from Reviewable |
|
@mrobinson This just needs a rebase, then r=glennw,kvark. |
632345b
to
9faaada
|
Okay. I've rebased this PR, but it seems that the newly added test (reftests/boxshadow/overlap1.yaml) is failing with my PR, so I'll investigate this after Christmas. |
|
|
Instead of using the hierarchical nature of the ClipScrollTree to calculate local clipping rectangles for primitives, do it when we are about to render a particular primitive run. This will allow us to handle custom clip chains while maintaining local clipping rectangle optimizations. It also allows us to greatly simplify a lot of shader code that used to deal with both clipping and scrolling nodes, as the local clipping rectangle is now bundled into the primitives local_clip_rect. TransformOrOffset is the main struct that handles the efficient transformation of clipping rectangles between compatible coordinate systems. It will avoid doing expensive matrix operations as much as possible, which should make these calculations cheaper -- important now that they are done per-primitive run.
It's unused now.
9faaada
to
7ad7225
|
Looks like the failing test is just a subpixel difference, so I've updated the png reference file and will push the rebased branch. |
|
@bors-servo r=glennw,kvark. |
|
|
…kvark. Calculate local clipping rectangle per-primitive run Instead of using the hierarchical nature of the ClipScrollTree to calculate local clipping rectangles for primitives, do it when we are about to render a particular primitive run. This will allow us to handle custom clip chains while maintaining local clipping rectangle optimizations. It also allows us to greatly simplify a lot of shader code that used to deal with both clipping and scrolling nodes, as the local clipping rectangle is now bundled into the primitives local_clip_rect. TransformOrOffset is the main struct that handles the efficient transformation of clipping rectangles between compatible coordinate systems. It will avoid doing expensive matrix operations as much as possible, which should make these calculations cheaper -- important now that they are done per-primitive run. <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/2200) <!-- Reviewable:end -->
|
|
|
@mrobinson This PR seems to have introduced or exposed a crash in WR, which manifests as the talos-g1 test failing. There's a backtrace from the log at https://bugzilla.mozilla.org/show_bug.cgi?id=1426116#c7 |
|
@staktrace I've posted a fix for this issue here: #2249 Sorry for the delay, it was due to the holidays. |
|
Confirmed that the fix worked, thanks! |
mrobinson commentedDec 8, 2017
•
edited by larsbergstrom
Instead of using the hierarchical nature of the ClipScrollTree to calculate local clipping rectangles for primitives, do it when we are about to render a particular primitive run. This will allow us to handle custom clip chains while maintaining local clipping rectangle optimizations. It also allows us to greatly simplify a lot of shader code that used to deal with both clipping and scrolling nodes, as the local clipping rectangle is now bundled into the primitives local_clip_rect.
TransformOrOffset is the main struct that handles the efficient transformation of clipping rectangles between compatible coordinate systems. It will avoid doing expensive matrix operations as much as possible, which should make these calculations cheaper -- important now that they are done per-primitive run.
This change is