Join GitHub today
GitHub is home to over 50 million developers working together to host and review code, manage projects, and build software together.
Sign upAdd framework for dynamic/cache render tasks, and use for box shadows. #485
Conversation
|
It's unfortunate that the PR has a single commit that combines two major changes (cached tasks and box shadows), but otherwise looks good! |
|
|
||
| void main(void) { | ||
| vec2 pos = vPos.xy; | ||
| vec2 p0Rect = vBoxShadowRect.xy, p1Rect = vBoxShadowRect.zw; |
This comment has been minimized.
This comment has been minimized.
kvark
Oct 27, 2016
Member
I'm getting confused about on how we pass/name rectangles in. This one vBoxShadowRect appears to have xy = left-top point, with zw - right bottom point. But the others (like local_clip_rect) have zw = (width, height).
This comment has been minimized.
This comment has been minimized.
pcwalton
Oct 27, 2016
Collaborator
Yeah, we didn't bother to establish a convention. This is my fault. :(
I think we should pick one and stick with it.
This comment has been minimized.
This comment has been minimized.
glennw
Oct 28, 2016
Author
Member
I think we need to be able to have both - depending on what the FS does, it may be far more efficient to have one or the other representation. But we should definitely at least have a naming convention for these cases.
| vBlurRadius = bs.border_radius_edge_size_blur_radius_inverted.z; | ||
| vInverted = bs.border_radius_edge_size_blur_radius_inverted.w; | ||
| vBoxShadowRect = vec4(bs.bs_rect.xy, bs.bs_rect.xy + bs.bs_rect.zw); | ||
| vPos = (pos - 1.0 - p0) / uDevicePixelRatio + bs.bs_rect.xy - vec2(2.0 * vBlurRadius); |
This comment has been minimized.
This comment has been minimized.
| vMirrorPoint = 0.5 * prim.local_rect.zw / patch_size; | ||
|
|
||
| vec2 texture_size = vec2(textureSize(sCache, 0)); | ||
| vCacheUvRectCoords = vec4(patch_origin, patch_origin + patch_size_device_pixels) / texture_size.xyxy; |
This comment has been minimized.
This comment has been minimized.
kvark
Oct 27, 2016
Member
wait, are our textures (namely, sCache) sized up to the CSS pixels or to the main framebuffer size?
(going to check it out)
This comment has been minimized.
This comment has been minimized.
| @@ -1397,6 +1414,22 @@ impl Renderer { | |||
| gl::clear(gl::COLOR_BUFFER_BIT); | |||
| } | |||
|
|
|||
| // Draw any cache primitives for this target. | |||
| if !target.box_shadow_cache_prims.is_empty() { | |||
| gl::disable(gl::BLEND); | |||
This comment has been minimized.
This comment has been minimized.
| } | ||
|
|
||
| for target in &mut self.targets { | ||
| target.build(ctx, render_tasks); | ||
| let child_pass_index = RenderPassIndex(self.pass_index.0 - 1); |
This comment has been minimized.
This comment has been minimized.
kvark
Oct 27, 2016
Member
is this new index guaranteed to exist?
perhaps, we can have let child_pass_index = self.pass_index.get_child().unwrap(); instead
This comment has been minimized.
This comment has been minimized.
glennw
Oct 28, 2016
Author
Member
Yes, this pass index is guaranteed to exist. If it doesn't it indicates a very serious bug! :)
|
|
||
| void main(void) { | ||
| vec2 pos = vPos.xy; | ||
| vec2 p0Rect = vBoxShadowRect.xy, p1Rect = vBoxShadowRect.zw; |
This comment has been minimized.
This comment has been minimized.
pcwalton
Oct 27, 2016
Collaborator
Yeah, we didn't bother to establish a convention. This is my fault. :(
I think we should pick one and stick with it.
| vBoxShadowRect = vec4(bs.bs_rect.xy, bs.bs_rect.xy + bs.bs_rect.zw); | ||
| vPos = (pos - 1.0 - p0) / uDevicePixelRatio + bs.bs_rect.xy - vec2(2.0 * vBlurRadius); | ||
|
|
||
| gl_Position = uTransform * vec4(pos, 0, 1); |
This comment has been minimized.
This comment has been minimized.
pcwalton
Oct 27, 2016
Collaborator
Shouldn't this be vec4(pos, 0.0, 1.0)? I always forget how strict shaders are but I remember getting burned by stuff like this…
This comment has been minimized.
This comment has been minimized.
| PrimitiveContainer::BoxShadow(box_shadow_gpu, instance_rects) => { | ||
| // TODO(gw): Account for zoom factor! | ||
| let edge_size = box_shadow_gpu.edge_size.ceil() * self.device_pixel_ratio; | ||
| let edge_size = edge_size as i32 + 2; // Account for bilinear filtering |
This comment has been minimized.
This comment has been minimized.
| RenderTaskId::Dynamic(key) => { | ||
| let index = RenderTaskIndex(self.render_task_data.len()); | ||
| let key = (key, pass); | ||
| debug_assert!(self.dynamic_tasks.contains_key(&key) == false); |
This comment has been minimized.
This comment has been minimized.
pcwalton
Oct 27, 2016
Collaborator
I assume you said == false here to avoid the double ! being confusing?
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
| self.add_primitive(&prim_rect, | ||
| clip_rect, | ||
| clip, | ||
| PrimitiveContainer::BoxShadow(prim_gpu, instance_rects)); |
This comment has been minimized.
This comment has been minimized.
This adds the framework for tiles and primitives to add dependent render tasks. The proof of concept in this patch uses them to speed up box shadow rendering, but the dynamic render tasks can be used for many things in the future. For example, we can use this system to render rotated clip masks, or cache text runs to a texture. The box shadows are rendered by adding a child task that draws the upper left corner of a box shadow. The primitive shader for box shadows then applies this rendered patch over the box shadow by clamping and mirroring the texture coordinates in the fragment shader. Dynamic render tasks are added to passes via a cache key that is defined based on the type of item being cached. This means that dynamic render tasks are du-duplicated during allocation and assignment to render passes and targets. This means that if different tiles require the same cache primitive, it will only get rendered once. It also means that if two primitive exist that require the same resource type (e.g. two box shadows with different colors but same size and blur radius) then the cache item will only be drawn once to the child render target. With this change, GPU time on transparent_rects demo drops from 16ms down to 9ms on my test setup.
|
@pcwalton Rebased, addressed comments and pushed updates. |
|
@bors-servo: r+ |
|
|
Add framework for dynamic/cache render tasks, and use for box shadows. This adds the framework for tiles and primitives to add dependent render tasks. The proof of concept in this patch uses them to speed up box shadow rendering, but the dynamic render tasks can be used for many things in the future. For example, we can use this system to render rotated clip masks, or cache text runs to a texture. The box shadows are rendered by adding a child task that draws the upper left corner of a box shadow. The primitive shader for box shadows then applies this rendered patch over the box shadow by clamping and mirroring the texture coordinates in the fragment shader. Dynamic render tasks are added to passes via a cache key that is defined based on the type of item being cached. This means that dynamic render tasks are du-duplicated during allocation and assignment to render passes and targets. This means that if different tiles require the same cache primitive, it will only get rendered once. It also means that if two primitive exist that require the same resource type (e.g. two box shadows with different colors but same size and blur radius) then the cache item will only be drawn once to the child render target. With this change, GPU time on transparent_rects demo drops from 16ms down to 9ms on my test setup. <!-- Reviewable:start --> --- This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/485) <!-- Reviewable:end -->
|
|
glennw commentedOct 27, 2016
•
edited by larsbergstrom
This adds the framework for tiles and primitives to add dependent
render tasks. The proof of concept in this patch uses them to
speed up box shadow rendering, but the dynamic render tasks can
be used for many things in the future. For example, we can use
this system to render rotated clip masks, or cache text runs
to a texture.
The box shadows are rendered by adding a child task that draws the
upper left corner of a box shadow. The primitive shader for box
shadows then applies this rendered patch over the box shadow by
clamping and mirroring the texture coordinates in the fragment
shader.
Dynamic render tasks are added to passes via a cache key that is
defined based on the type of item being cached. This means that
dynamic render tasks are du-duplicated during allocation and
assignment to render passes and targets. This means that if different
tiles require the same cache primitive, it will only get rendered
once. It also means that if two primitive exist that require the
same resource type (e.g. two box shadows with different colors but
same size and blur radius) then the cache item will only be drawn
once to the child render target.
With this change, GPU time on transparent_rects demo drops from
16ms down to 9ms on my test setup.
This change is