Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use picture caching for text shadows. #3270

Merged
merged 2 commits into from Nov 8, 2018
Merged

Use picture caching for text shadows. #3270

merged 2 commits into from Nov 8, 2018

Conversation

@gw3583
Copy link
Collaborator

gw3583 commented Nov 5, 2018

This patch allows text shadows to be cached and persisted both
between frames and new display lists, in the texture cache.

Previously, text shadows could be cached between frames, but would
be flushed and redrawn when a new display list was processed.

The primitive and clip interning system is used to determine when
the contents of a cached picture is valid.

For now, this is only applied in very limited scenarios, in order
to ensure the core functionality is correct (specifically, blur
filters that contain glyphs and line decorations, which covers
most text shadows). In the future, this will be expanded to cover
(a) more primitive types (b) more surface / filter effects and
(c) at a finer granularity to allow partial picture caching (tiles).

As part of this change, we can remove all references to the global
scene id, as this is no longer required.

This brings the GPU time in the cloud ad [1] from 43 ms down to
8 ms for me. This is still higher than we want, due to the very high
number of cached text shadows that get composited. This should be
resolved as we support nested picture caching.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1436193


This change is Reviewable

@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 5, 2018

r? @kvark

Initial try run looks good:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=037fa9053b15dfb9876f55378e9cc5921d8ce56d

I also kicked off a pending try on all platforms:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=e71266f050b3a2fb150073e6f99ac2f9a739f2ed

I'm not sure how good our CI testing is for things that cached and then invalidated. It seems to work fine to me from some manual browsing, but it'd be worth some sanity checks of the cache invalidation logic to see if it seems sound to you.

@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 5, 2018

Looks like there is a failing crash test on OSX, but I haven't been able to reproduce locally yet.

@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 5, 2018

Both of the crash test failures on OS X are explicit panics due to render task size. However, one of the tests doesn't appear to have any text shadows at all, so it may be something unrelated to this patch. Still investigating.

@gw3583 gw3583 force-pushed the gw3583:s8 branch from bb5440c to 5ac4a2c Nov 5, 2018
@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 5, 2018

Ah, it seems that Gecko sometimes provides an extremely large text shadow in the chrome on the first frame or two, and this patch was not catching that. Updated the patch and kicked off a new try:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=05153326984fe8dad10b663ee41829781789a1b1

Hopefully that should resolve those crash test failures.

@bors-servo
Copy link
Contributor

bors-servo commented Nov 5, 2018

The latest upstream changes (presumably #3230) made this pull request unmergeable. Please resolve the merge conflicts.

@gw3583 gw3583 force-pushed the gw3583:s8 branch from 5ac4a2c to 19d3c5e Nov 5, 2018
@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 5, 2018

Rebased. The latest try run looks good - just one new PASS in blur-cap-large-radius-on-software.html

@bors-servo
Copy link
Contributor

bors-servo commented Nov 6, 2018

The latest upstream changes (presumably #3276) made this pull request unmergeable. Please resolve the merge conflicts.

@gw3583 gw3583 force-pushed the gw3583:s8 branch from 19d3c5e to 1bbe627 Nov 6, 2018
@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 6, 2018

Rebased.

@bors-servo
Copy link
Contributor

bors-servo commented Nov 7, 2018

The latest upstream changes (presumably #3282) made this pull request unmergeable. Please resolve the merge conflicts.

This patch allows text shadows to be cached and persisted both
between frames and new display lists, in the texture cache.

Previously, text shadows could be cached between frames, but would
be flushed and redrawn when a new display list was processed.

The primitive and clip interning system is used to determine when
the contents of a cached picture is valid.

For now, this is only applied in very limited scenarios, in order
to ensure the core functionality is correct (specifically, blur
filters that contain glyphs and line decorations, which covers
most text shadows). In the future, this will be expanded to cover
(a) more primitive types (b) more surface / filter effects and
(c) at a finer granularity to allow partial picture caching (tiles).

As part of this change, we can remove all references to the global
scene id, as this is no longer required.

This brings the GPU time in the cloud ad [1] from 43 ms down to
8 ms for me. This is still higher than we want, due to the very high
number of cached text shadows that get composited. This should be
resolved as we support nested picture caching.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1436193
@gw3583 gw3583 force-pushed the gw3583:s8 branch from 1bbe627 to 1d0f068 Nov 7, 2018
@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 7, 2018

Rebased.

@kvark
Copy link
Member

kvark commented Nov 7, 2018

@gw3583 sorry about the delay, reviewing it now.

Copy link
Member

kvark left a comment

Looks great! Concerns below:

Reviewed 17 of 17 files at r1.
Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @gw3583)


webrender/src/freelist.rs, line 75 at r1 (raw file):

        FreeListHandle {
            index: u32::MAX,
            epoch: Epoch(u32::MAX),

this is confusing. There is Epoch::invalid which is defined differently. Which one is really invalid? :)


webrender/src/intern.rs, line 80 at r1 (raw file):

    index: u32,
    epoch: Epoch,
    uid: ItemUid<T>,

basic question: why is the UID needed in addition to index + epoch?


webrender/src/intern.rs, line 85 at r1 (raw file):

impl <T> Handle<T> where T: Copy {
    pub fn get_uid(&self) -> ItemUid<T> {

nit: just fn uid() would match the guidelines better


webrender/src/picture.rs, line 444 at r1 (raw file):

    /// A descriptor for this surface that can be used as a cache key.
    descriptor: Option<SurfaceDescriptor>,

maybe rename to surface or surface_desc?


webrender/src/prim_store.rs, line 126 at r1 (raw file):

        ref_spatial_node_index: SpatialNodeIndex,
        target_node_index: SpatialNodeIndex,
        clip_scroll_tree: &ClipScrollTree,

nit: I wonder if we should make it a method of the cst: clip_scroll_tree.map(ref_spatial_node_index, target_node_index)


webrender/src/render_task.rs, line 1156 at r1 (raw file):

            if !retain {
                let handle = mem::replace(handle, FreeListHandle::invalid());
                cache_entries.free(handle);

where does the new code call map.remove()?


webrender/src/surface.rs, line 70 at r1 (raw file):

    fn new(scale_offset: &ScaleOffset) -> Self {
        ScaleOffsetKey {
            scale_x: quantize(scale_offset.scale.x),

if we are quantifying, why not storing integers as the result? That would avoid the Hash problems


webrender/src/surface.rs, line 134 at r1 (raw file):

impl TransformKey {
    pub fn new() -> Self {

nit: rename to local? Although, there isn't much use to it anyway


webrender/src/surface.rs, line 164 at r1 (raw file):

    /// The list of primitives that are part of this surface.
    /// The uid uniquely identifies the content of the primitive.
    pub primitive_ids: Vec<PrimitiveUid>,

is it an ordered list or really just a set?


webrender/src/surface.rs, line 202 at r1 (raw file):

        clip_store: &ClipStore,
    ) -> Option<Self> {
        let mut transform_map = FastHashSet::default();

slightly confusing that it's a set that has _map suffix in the name


webrender/src/surface.rs, line 221 at r1 (raw file):

                // TODO(gw): This needs to be a bit more careful once we create
                //           descriptors for pictures that might be pass-through.
                if clip_chain_node.spatial_node_index < prim_instance.spatial_node_index &&

ok, this is really scary (to compare indices for ordering). Could you explain this part please?


webrender/src/surface.rs, line 296 at r1 (raw file):

                if raster_spatial_node.coordinate_system_id == surface_spatial_node.coordinate_system_id {
                    TransformKey::new()

if the coordinate systems match, there can still be offset/scale applied, right? this looks like we are ignoring it


wrench/reftests/filters/filter-blur.png, line 0 at r1 (raw file):
why are those changing?


wrench/reftests/filters/reftest.list, line 17 at r1 (raw file):

== filter-invert-2.yaml filter-invert-2-ref.yaml
platform(linux,mac) fuzzy(1,133) == filter-large-blur-radius.yaml filter-large-blur-radius.png
== draw_calls(4) color_targets(4) alpha_targets(0) filter-small-blur-radius.yaml filter-small-blur-radius.png

why do we have more targets now?

Copy link
Collaborator Author

gw3583 left a comment

Reviewable status: all files reviewed, 14 unresolved discussions (waiting on @kvark)


webrender/src/freelist.rs, line 75 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

this is confusing. There is Epoch::invalid which is defined differently. Which one is really invalid? :)

There were some other changes made to Freelist recently, that introduced the invalid epoch. I updated this PR to make use of that.


webrender/src/intern.rs, line 80 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

basic question: why is the UID needed in addition to index + epoch?

Epoch advances each time a new scene is built, but we want an ID that is stable across scene builds.


webrender/src/intern.rs, line 85 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

nit: just fn uid() would match the guidelines better

Fixed.


webrender/src/picture.rs, line 444 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

maybe rename to surface or surface_desc?

Fixed


webrender/src/prim_store.rs, line 126 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

nit: I wonder if we should make it a method of the cst: clip_scroll_tree.map(ref_spatial_node_index, target_node_index)

I think it's probably fine as is, but can make that change if you'd prefer? It might make sense to actually move the entire CoordinateSpaceMapping definition to be a separate file / module.


webrender/src/render_task.rs, line 1156 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

where does the new code call map.remove()?

These are handled by the return value of the retain closure?


webrender/src/surface.rs, line 70 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

if we are quantifying, why not storing integers as the result? That would avoid the Hash problems

We could, although I'm not sure if the casting to / from integers would be more costly than just quantizing and hashing floats. Perhaps something to consider as a follow up?


webrender/src/surface.rs, line 134 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

nit: rename to local? Although, there isn't much use to it anyway

Fixed


webrender/src/surface.rs, line 164 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

is it an ordered list or really just a set?

It's an ordered list. We want the cache key to be different if we receive a picture that has the same primitives but in a different ordering.


webrender/src/surface.rs, line 202 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

slightly confusing that it's a set that has _map suffix in the name

Renamed to relevant_spatial_nodes


webrender/src/surface.rs, line 221 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

ok, this is really scary (to compare indices for ordering). Could you explain this part please?

That first condition just seems wrong, I'm not sure why that was there. The second check is the same condition that the clip node collector uses (that is, any clip nodes positioned by a node earlier in the hierarchy than the picture itself get applied when the surface is composited). I'll kick off another try run with this change just to make sure it doesn't break anything.


webrender/src/surface.rs, line 296 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

if the coordinate systems match, there can still be offset/scale applied, right? this looks like we are ignoring it

That will still actually be handled - the CoordinateSpaceMapping constructor will create a ScaleOffset variant, which will be converted to the appropriate key.


wrench/reftests/filters/filter-blur.png, line at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

why are those changing?

We temporarily regress here, in that picture caching of blurs only works on text shadows, and not arbitrary pictures. This will be fixed as we port more primitives to be interned.


wrench/reftests/filters/reftest.list, line 17 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

why do we have more targets now?

These tests no longer get drawn into the texture cache, for now. They get drawn as normal render tasks, until we port solid primitives to be interned.

@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 7, 2018

The updated try run looks good.

Copy link
Member

kvark left a comment

Thanks for the updates!

Reviewed 4 of 4 files at r2.
Reviewable status: all files reviewed, 4 unresolved discussions (waiting on @gw3583)


webrender/src/render_task.rs, line 1156 at r1 (raw file):

Previously, gw3583 (Glenn Watson) wrote…

These are handled by the return value of the retain closure?

d'oh, indeed :)


webrender/src/surface.rs, line 70 at r1 (raw file):

Previously, gw3583 (Glenn Watson) wrote…

We could, although I'm not sure if the casting to / from integers would be more costly than just quantizing and hashing floats. Perhaps something to consider as a follow up?

sure! would be good to have a small comment about this in the code


webrender/src/surface.rs, line 221 at r1 (raw file):

Previously, gw3583 (Glenn Watson) wrote…

That first condition just seems wrong, I'm not sure why that was there. The second check is the same condition that the clip node collector uses (that is, any clip nodes positioned by a node earlier in the hierarchy than the picture itself get applied when the surface is composited). I'll kick off another try run with this change just to make sure it doesn't break anything.

I was looking at this node collector code earlier, and I'm still worried about this code. The fact that it was there for a while and the try pushes are green isn't enough to prove the it's correct :) What is the exact semantics that you expect from spatial_node_index1 < spatial_node_index2 comparison? If a node was added later than another node, I don't think this us much insight about how they are placed. We can definitely say that spatial_node_index1 is not a child of spatial_node_index2, but that's pretty much it?


webrender/src/surface.rs, line 296 at r1 (raw file):

Previously, gw3583 (Glenn Watson) wrote…

That will still actually be handled - the CoordinateSpaceMapping constructor will create a ScaleOffset variant, which will be converted to the appropriate key.

but we aren't invoking the CoordinateSpaceMapping constructor here, are we? And if there is an assumption that by this point we don't expect any scale or offset (and this is what TransformKey::Local is for), we should be asserting on that.

@gw3583 gw3583 force-pushed the gw3583:s8 branch from 0aeac1d to eaee681 Nov 8, 2018
@gw3583 gw3583 force-pushed the gw3583:s8 branch from eaee681 to 14a8a4a Nov 8, 2018
Copy link
Collaborator Author

gw3583 left a comment

Reviewable status: 16 of 17 files reviewed, 4 unresolved discussions (waiting on @kvark)


webrender/src/render_task.rs, line 1156 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

d'oh, indeed :)

Done.


webrender/src/surface.rs, line 70 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

sure! would be good to have a small comment about this in the code

Added a comment.


webrender/src/surface.rs, line 221 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

I was looking at this node collector code earlier, and I'm still worried about this code. The fact that it was there for a while and the try pushes are green isn't enough to prove the it's correct :) What is the exact semantics that you expect from spatial_node_index1 < spatial_node_index2 comparison? If a node was added later than another node, I don't think this us much insight about how they are placed. We can definitely say that spatial_node_index1 is not a child of spatial_node_index2, but that's pretty much it?

This code is definitely not nice - it's mostly an artifact of trying to make things work without changes to gecko / WR APIs. However, with your work on Gecko clip changes, we should actually discuss this and fix it properly so that the implementation details are not so hacky.

Having said that, let me try to describe what this and the clip node collector code assume. As you said above, what we can infer from the indices is that a smaller index is a parent (or an unrelated ancestor, but this doesn't happen in practice, even though the API doesn't enforce this). And if the clip node is positioned by a parent of the stacking context, then we apply the clip to the stacking context as a whole, rather than the individual items.

Happy to discuss on IRC if you'd prefer - it's a complex issue to try and describe.


webrender/src/surface.rs, line 296 at r1 (raw file):

Previously, kvark (Dzmitry Malyshau) wrote…

but we aren't invoking the CoordinateSpaceMapping constructor here, are we? And if there is an assumption that by this point we don't expect any scale or offset (and this is what TransformKey::Local is for), we should be asserting on that.

Oh, you're right - that code was totally bogus! Not only was the raster key wrong, I was storing it in the descriptor and not the cache key, so it wouldn't invalidate any caches anyway, oops! So, I've added it to the cache key, and now the logic says that if we end up with a scale-offset transform key, zero out the offset. This means that we don't care / invalidate a cached surface if the only difference is the offset between the two raster transforms.

@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 8, 2018

@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 8, 2018

@kvark Even if try passes and you're happy with the changes above, let's hold off on merging this for now, as it sounds like we might need some urgent fixes for a potential texture cache corruption issue.

@kvark
kvark approved these changes Nov 8, 2018
Copy link
Member

kvark left a comment

Looks good now! I still want the node indices comparison to be improved, but we already have it in node collector, so this PR isn't introducing a new flaw.

Reviewed 1 of 1 files at r3.
Reviewable status: all files reviewed, 1 unresolved discussion (waiting on @gw3583)


webrender/src/surface.rs, line 221 at r1 (raw file):

or an unrelated ancestor, but this doesn't happen in practice, even though the API doesn't enforce this

This is my worry. We should at the very minimum put a debug assertion in place to prevent this. Ideally, not comparing the node indices at all :)

@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 8, 2018

@kvark Sounds good, thanks! I'll get this merged tonight / tomorrow morning then 👍

@gw3583
Copy link
Collaborator Author

gw3583 commented Nov 8, 2018

We should be fine to land this now, since #3285 is in the queue and also in gecko inbound.

@bors-servo r=kvark

@bors-servo
Copy link
Contributor

bors-servo commented Nov 8, 2018

📌 Commit 14a8a4a has been approved by kvark

bors-servo added a commit that referenced this pull request Nov 8, 2018
Use picture caching for text shadows.

This patch allows text shadows to be cached and persisted both
between frames and new display lists, in the texture cache.

Previously, text shadows could be cached between frames, but would
be flushed and redrawn when a new display list was processed.

The primitive and clip interning system is used to determine when
the contents of a cached picture is valid.

For now, this is only applied in very limited scenarios, in order
to ensure the core functionality is correct (specifically, blur
filters that contain glyphs and line decorations, which covers
most text shadows). In the future, this will be expanded to cover
(a) more primitive types (b) more surface / filter effects and
(c) at a finer granularity to allow partial picture caching (tiles).

As part of this change, we can remove all references to the global
scene id, as this is no longer required.

This brings the GPU time in the cloud ad [1] from 43 ms down to
8 ms for me. This is still higher than we want, due to the very high
number of cached text shadows that get composited. This should be
resolved as we support nested picture caching.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1436193

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/3270)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Nov 8, 2018

Testing commit 14a8a4a with merge 7f83e1f...

@bors-servo
Copy link
Contributor

bors-servo commented Nov 8, 2018

💔 Test failed - status-taskcluster

@zptan
Copy link

zptan commented Nov 8, 2018

@bors-servo retry

@bors-servo
Copy link
Contributor

bors-servo commented Nov 8, 2018

@zptan: 🔑 Insufficient privileges: not in try users

@kvark
Copy link
Member

kvark commented Nov 8, 2018

@bors-servo retry

@bors-servo
Copy link
Contributor

bors-servo commented Nov 8, 2018

Testing commit 14a8a4a with merge 55be584...

bors-servo added a commit that referenced this pull request Nov 8, 2018
Use picture caching for text shadows.

This patch allows text shadows to be cached and persisted both
between frames and new display lists, in the texture cache.

Previously, text shadows could be cached between frames, but would
be flushed and redrawn when a new display list was processed.

The primitive and clip interning system is used to determine when
the contents of a cached picture is valid.

For now, this is only applied in very limited scenarios, in order
to ensure the core functionality is correct (specifically, blur
filters that contain glyphs and line decorations, which covers
most text shadows). In the future, this will be expanded to cover
(a) more primitive types (b) more surface / filter effects and
(c) at a finer granularity to allow partial picture caching (tiles).

As part of this change, we can remove all references to the global
scene id, as this is no longer required.

This brings the GPU time in the cloud ad [1] from 43 ms down to
8 ms for me. This is still higher than we want, due to the very high
number of cached text shadows that get composited. This should be
resolved as we support nested picture caching.

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1436193

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/webrender/3270)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Nov 8, 2018

☀️ Test successful - status-appveyor, status-taskcluster
Approved by: kvark
Pushing 55be584 to master...

@bors-servo bors-servo merged commit 14a8a4a into servo:master Nov 8, 2018
3 of 4 checks passed
3 of 4 checks passed
code-review/reviewable 1 discussion left (gw3583)
Details
Taskcluster (pull_request) TaskGroup: success
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
homu Test successful
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants
You can’t perform that action at this time.