Introduce structure interner, and use for clip interning. #3075
|
r? @kvark This does seem to give a noticeable reduction in the number of GPU cache rows on several pages that I tested on. Of course, the major wins wouldn't come until we use it for more than just clips. The main remaining bit of work (apart from review comments) is that I haven't hooked up the interner and data store to the capture / replay code yet. Try run looks good: (lots of blue and orange, but they are unrelated as far as I can tell - still some jobs pending though) |
|
And also r? @nical too, since this touches scene builder thread and such things. |
|
Looks good to me, just want to make sure this doesn't interfere with my world domination plans concerning the hit tester, see comment. |
| ); | ||
| self.hit_tester = Some(frame_builder.create_hit_tester(&self.clip_scroll_tree)); | ||
| self.hit_tester = Some(frame_builder.create_hit_tester( |
nical
Sep 18, 2018
Collaborator
My only worry is creating a dependency between frame building and building the hit tester. We really need to be able to build the latter independently. Here we are first mutably passing the clip data store to the frame builder and then using it to build the hit tester.
Upon inspecting the code that mutates the data store, it looks like only GPU thingies are modified so it should be safe to create the hit tester independently, but I'd feel better if we moved the line that creates the hit tester above the FrameBuilder::build call to make sure dependencies don't creep in.
| pub struct Interner<S : Eq + Hash + Clone + Debug, M> { | ||
| // Uniquely map an interning key to a handle | ||
| map: FastHashMap<S, Handle<M>>, | ||
| // List of free slots in the data store for re-use. |
webrender/src/batch.rs, line 1778 at r1 (raw file):
does it make sense to merge
webrender/src/clip.rs, line 743 at r1 (raw file):
perhaps, this would let us not have both Au and non-Au versions of the clip structs
webrender/src/clip.rs, line 748 at r1 (raw file):
I agree that names are missing here
webrender/src/clip.rs, line 835 at r1 (raw file):
I do wonder if we can make this more generic, i.e. just have
webrender/src/intern.rs, line 36 at r1 (raw file):
one way to use a freelist here is to have it built into
enum Item<T> {
    Valid { epoch: Epoch, data: T },
    Free { next: usize },
}
We could easily hide the enumeration tag by using NonZeroU64:
struct Epoch(NonZeroU64);
This way we'd also not need the Perhaps, this approach would improve data locality and make the code more idiomatic/typed?
webrender/src/intern.rs, line 45 at r1 (raw file):
let's document what the epoch applies to: scene builder, individual interned item, etc
webrender/src/intern.rs, line 84 at r1 (raw file):
let's have all those internal comments as doc comments (
webrender/src/intern.rs, line 100 at r1 (raw file):
what is the marker for?
webrender/src/intern.rs, line 126 at r1 (raw file):
what is guaranteeing that the index isn't greater than the length?
webrender/src/intern.rs, line 143 at r1 (raw file):
nit: consider
webrender/src/intern.rs, line 174 at r1 (raw file):
that's quite a few epochs, epochs everywhere!
webrender/src/intern.rs, line 195 at r1 (raw file):
if
webrender/src/intern.rs, line 254 at r1 (raw file):
do we really need the "used for some time" semantics here? I thought just removing ones not used for each incoming scene build would be sufficient.
webrender/src/intern.rs, line 284 at r1 (raw file):
the method itself does much more than just getting the updates, so we should probably rename it to something other than
webrender/src/intern.rs, line 284 at r1 (raw file):
hm, I am worried a bit that the interner keeps its own epoch. Why isn't the epoch really driven by the scene builder, so that one build = one epoch?
webrender/src/render_backend.rs, line 1305 at r1 (raw file):
isn't this done already?
webrender_api/src/units.rs, line 222 at r1 (raw file):
we have 4 traits that could pretty much be |
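The freelist suggestion for intern.rs line 36 could look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in the PR: the `Store` type and its `insert`, `remove` and `get` methods are hypothetical, and `Free` carries an `Option<usize>` (rather than the plain `usize` in the comment) so that `None` can mark the end of the list.

```rust
use std::num::NonZeroU64;

/// Epoch wrapper; NonZeroU64 leaves a niche value the compiler may use for the enum tag.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Epoch(NonZeroU64);

/// A slot either holds live data or links to the next free slot.
enum Item<T> {
    Valid { epoch: Epoch, data: T },
    // Assumption: `None` marks the end of the free list.
    Free { next: Option<usize> },
}

/// Hypothetical store with the free list threaded through the items themselves.
struct Store<T> {
    items: Vec<Item<T>>,
    free_head: Option<usize>,
}

impl<T> Store<T> {
    fn new() -> Self {
        Store { items: Vec::new(), free_head: None }
    }

    /// Reuse the first free slot if there is one, otherwise append.
    fn insert(&mut self, epoch: Epoch, data: T) -> usize {
        match self.free_head {
            Some(index) => {
                // Unlink the slot from the free list before overwriting it.
                if let Item::Free { next } = &self.items[index] {
                    self.free_head = *next;
                }
                self.items[index] = Item::Valid { epoch, data };
                index
            }
            None => {
                self.items.push(Item::Valid { epoch, data });
                self.items.len() - 1
            }
        }
    }

    /// Turn a live slot into a free-list node and make it the new head.
    fn remove(&mut self, index: usize) {
        if matches!(self.items.get(index), Some(Item::Valid { .. })) {
            self.items[index] = Item::Free { next: self.free_head };
            self.free_head = Some(index);
        }
    }

    /// Return the data only if the slot is currently live.
    fn get(&self, index: usize) -> Option<&T> {
        match self.items.get(index) {
            Some(Item::Valid { data, .. }) => Some(data),
            _ => None,
        }
    }
}
```

Note that this keeps the free list and the items in one place, which is exactly what the reply below points out is hard when the two live on different threads.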
webrender/src/batch.rs, line 1778 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Not currently, since the clip_store is recycled per frame, while the clip_data_store is persistent for the Document lifetime. It's likely we can tidy this up a bit in the future though.
webrender/src/clip.rs, line 743 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done.
webrender/src/clip.rs, line 748 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Would you prefer this is fixed up in this patch, or done at some later time?
webrender/src/clip.rs, line 835 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Perhaps - although in some cases (e.g. the box-shadow) it's quite different between the structs.
webrender/src/intern.rs, line 36 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
The complexity in this case is that the items live in the data store (frame builder thread), while the free-list itself lives in (and is managed by) the interner in the scene builder thread.
webrender/src/intern.rs, line 45 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done.
webrender/src/intern.rs, line 84 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done.
webrender/src/intern.rs, line 100 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
To ensure a Handle from an unrelated data store type cannot be used.
webrender/src/intern.rs, line 126 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
The next_index field on the scene builder side is exactly the length of the array on the data store side - the free-list management ensures we only ever insert into an existing slot or append at the end of the array.
webrender/src/intern.rs, line 143 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Done
webrender/src/intern.rs, line 174 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Yep - this could be cleaned up a bit when we move epoch into the scene builder, as you suggested above.
webrender/src/intern.rs, line 195 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
It's not the same - the S is the source data type, the T is the target data type that implements
webrender/src/intern.rs, line 254 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
That would be sufficient for correctness - what I'm trying to handle here is the case where JS is removing / adding elements, and so it's useful to have them exist for some time after they were removed in case they are re-added. It's debatable how common this actually is though!
webrender/src/intern.rs, line 284 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Naming is hard! Made it slightly better.
webrender/src/intern.rs, line 284 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
That's a good idea, we should do this. Are you ok with doing this as I add the next interned data structure type in a follow up, or would you prefer to do it now?
webrender/src/render_backend.rs, line 1305 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
I haven't worked out the saving of the scene-side structure yet - will implement that today.
webrender_api/src/units.rs, line 222 at r1 (raw file): Previously, kvark (Dzmitry Malyshau) wrote…
Indeed! |
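The reply about the marker (intern.rs line 100) describes a common Rust pattern: a zero-sized type parameter that ties a handle to one kind of data store. A minimal sketch of the idea follows, assuming hypothetical `ClipMarker` and `GradientMarker` types and a simplified `DataStore`; the real types in the PR may differ.

```rust
use std::marker::PhantomData;

/// Handle tagged with a marker type `M`; the marker takes no space at runtime.
struct Handle<M> {
    index: u32,
    _marker: PhantomData<M>,
}

/// Hypothetical marker types, one per kind of interned data.
struct ClipMarker;
struct GradientMarker;

/// Simplified data store that only accepts handles carrying its own marker.
struct DataStore<T, M> {
    items: Vec<T>,
    _marker: PhantomData<M>,
}

impl<T, M> DataStore<T, M> {
    fn new() -> Self {
        DataStore { items: Vec::new(), _marker: PhantomData }
    }

    /// Append an item and hand back a handle typed for this store.
    fn push(&mut self, item: T) -> Handle<M> {
        self.items.push(item);
        Handle { index: (self.items.len() - 1) as u32, _marker: PhantomData }
    }

    fn get(&self, handle: &Handle<M>) -> &T {
        &self.items[handle.index as usize]
    }
}

fn example() -> (String, u32) {
    let mut clips: DataStore<String, ClipMarker> = DataStore::new();
    let mut gradients: DataStore<u32, GradientMarker> = DataStore::new();
    let clip = clips.push("rounded-rect".to_string());
    let gradient = gradients.push(7);
    // `clips.get(&gradient)` would not compile: it expects `&Handle<ClipMarker>`.
    (clips.get(&clip).clone(), *gradients.get(&gradient))
}
```

The mix-up is caught at compile time, so no runtime check or extra field is needed.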
|
@kvark Thanks for the review! Still need to implement the capture functionality, and any follow ups from the review that remain above. |
webrender/src/clip.rs, line 748 at r1 (raw file): Previously, gw3583 (Glenn Watson) wrote…
fine to be left out of this PR :) |
|
@kvark Added one more commit with the capture / replay integration. I think it's possible to have races where the interner / data store get out of sync, but maybe it's really unlikely in practice. What do you think? (I'm going to do some testing locally to see if I can ever get them out of sync). |
|
@kvark I did some testing with the simple capture / replay integration and it does seem to work. I couldn't reproduce any out of sync issues, so it may be fine as is. I also rebased it, so I think this is ready to go once you check over the last commit. Kicked off a try run (pending) too, to make sure I didn't break anything with the review comment fixes: |
|
|
|
(rebased again) |
|
Try run still looks good |
|
|
This patch introduces a new data structure, that allows generic structures to be interned. This works similarly to how normal string interning works, however it is specialized to the thread model for WR (explained at the top of intern.rs).
The effect of this change is that clip nodes are de-duplicated and persisted between both frames and display lists.
This has three primary benefits:
* Since they are de-duplicated, the handle for an interned structure uniquely identifies it. This is very useful for future use where we want to be able to quickly and cheaply compare if the contents of a cached picture matches that of a new display list.
* Since they are persisted between display lists, the GPU cache handles for the nodes remain valid. This means far fewer GPU cache update patches for types that are interned.
* Since they are de-duplicated by content hash value, there are fewer clips overall used by the frame builder.
The plan in the future is to extend this to other primitive types, as well as gradient stops, text runs etc. This will allow us to very quickly check if a cached picture remains valid, even in the presence of a completely new display list.
This adds a small amount of overhead to the scene builder thread, (extra hashing) but reduces the CPU time in the render backend and compositor threads, which is also a good tradeoff.
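To make the split between the two sides concrete, here is a minimal sketch of the idea, not the implementation in intern.rs. It assumes hypothetical names (`Update`, `take_updates`, `apply_updates`), stores the key itself rather than a separate target type, and leaves out epochs and the delayed-removal policy discussed in the review.

```rust
use std::collections::HashMap;
use std::hash::Hash;

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct Handle(usize);

/// An update recorded by the interner and later applied by the data store.
enum Update<S> {
    Insert { index: usize, data: S },
    Remove { index: usize },
}

/// Scene-builder side: de-duplicates keys and owns the free list.
struct Interner<S: Eq + Hash + Clone> {
    map: HashMap<S, Handle>,
    free_list: Vec<usize>,
    next_index: usize,
    pending: Vec<Update<S>>,
}

impl<S: Eq + Hash + Clone> Interner<S> {
    fn new() -> Self {
        Interner { map: HashMap::new(), free_list: Vec::new(), next_index: 0, pending: Vec::new() }
    }

    /// Return the existing handle for `key`, or allocate a slot and queue an insert.
    fn intern(&mut self, key: &S) -> Handle {
        if let Some(&handle) = self.map.get(key) {
            return handle;
        }
        let index = match self.free_list.pop() {
            Some(index) => index,
            None => {
                let index = self.next_index;
                self.next_index += 1;
                index
            }
        };
        let handle = Handle(index);
        self.map.insert(key.clone(), handle);
        self.pending.push(Update::Insert { index, data: key.clone() });
        handle
    }

    /// Forget `key`, free its slot, and queue a removal for the data store.
    fn remove(&mut self, key: &S) {
        if let Some(handle) = self.map.remove(key) {
            self.free_list.push(handle.0);
            self.pending.push(Update::Remove { index: handle.0 });
        }
    }

    /// Hand the queued updates over, leaving the queue empty.
    fn take_updates(&mut self) -> Vec<Update<S>> {
        std::mem::take(&mut self.pending)
    }
}

/// Frame-builder side: a plain array indexed by handle.
struct DataStore<S> {
    items: Vec<Option<S>>,
}

impl<S> DataStore<S> {
    fn new() -> Self {
        DataStore { items: Vec::new() }
    }

    /// Apply updates in the order they were recorded.
    fn apply_updates(&mut self, updates: Vec<Update<S>>) {
        for update in updates {
            match update {
                // Inserts either append at the end or overwrite a freed slot.
                Update::Insert { index, data } => {
                    if index == self.items.len() {
                        self.items.push(Some(data));
                    } else {
                        self.items[index] = Some(data);
                    }
                }
                Update::Remove { index } => self.items[index] = None,
            }
        }
    }

    fn get(&self, handle: Handle) -> Option<&S> {
        self.items.get(handle.0).and_then(|item| item.as_ref())
    }
}
```

Because equal keys always map to the same handle, comparing two handles is enough to tell whether two interned structures have the same contents.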
|
Rebased again - just removed the modified box shadow reftest image, which is no longer different with this change after @pcwalton 's box shadow border radius fix. |
|
I'd like to hear more about a case where the interner gets out of sync with the data store. I think that, as long as our communication channel is one-way (going through WR pipeline of API -> Backend(A) -> SceneBuilder -> Backend(B) -> Renderer), and there is only one channel, there should be no data-races.
webrender/src/scene_builder.rs, line 257 at r3 (raw file):
note the naming is inconsistent with |
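The one-way-channel argument above can be illustrated with a small sketch. It assumes a single `std::sync::mpsc` channel stands in for the SceneBuilder -> Backend step, and the `Update` enum and `run_pipeline` function are hypothetical names used only for this example.

```rust
use std::sync::mpsc;
use std::thread;

/// Hypothetical update message sent from the scene builder to the backend.
enum Update {
    Insert { index: usize, data: String },
    Remove { index: usize },
}

fn run_pipeline() -> Vec<Option<String>> {
    let (tx, rx) = mpsc::channel::<Vec<Update>>();

    // Scene builder thread: the only producer of updates.
    let scene_builder = thread::spawn(move || {
        tx.send(vec![
            Update::Insert { index: 0, data: "a".to_string() },
            Update::Insert { index: 1, data: "b".to_string() },
        ])
        .unwrap();
        tx.send(vec![
            Update::Remove { index: 0 },
            Update::Insert { index: 0, data: "c".to_string() },
        ])
        .unwrap();
        // `tx` is dropped here, which closes the channel.
    });

    // Backend side: batches arrive in the order they were sent.
    let mut store: Vec<Option<String>> = Vec::new();
    for batch in rx {
        for update in batch {
            match update {
                Update::Insert { index, data } => {
                    if index == store.len() {
                        store.push(Some(data));
                    } else {
                        store[index] = Some(data);
                    }
                }
                Update::Remove { index } => store[index] = None,
            }
        }
    }
    scene_builder.join().unwrap();
    store
}
```

With one sender and one receiver the batches are applied in send order, so the store always ends up in the state the producer intended. Whether capture / replay preserves that ordering is a separate question this sketch does not cover.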
|
@bors-servo r+ |
|
|
|
|
gw3583 commented Sep 18, 2018 • edited by larsbergstrom