Added a bloom filter to CSS selector matching. #3212
Conversation
highfive
commented
Sep 5, 2014
hoppipolla-critic-bot
commented
Sep 5, 2014
|
Critic review: https://critic.hoppipolla.co.uk/r/2506 This is an external review system which you may optionally use for the code review of your pull request. In order to help critic track your changes, please do not make in-place history rewrites (e.g. via |
|
Can you give some numbers on sample pages (say, Wikipedia) which shows how many bloom filter invalidations we have in practice? |
| use std::mem; | ||
| use style::Stylist; | ||
| use style::{Stylist,SimpleSelector}; |
| @@ -638,8 +648,22 @@ impl LayoutTask { | |||
| } | |||
| self.screen_size = current_screen_size; | |||
|
|
|||
| // TODO(cgaebel): Instead of counting the max dom selectors, maybe just | |||
| // use a 1 MB bloom filter? Or use d-left hashing to allow one that grows | |||
| let (max_dom_node_selectors, descendant_simple_selectors) = | ||
| profile(time::LayoutMaxSelectorMatchesCategory, | ||
| self.time_profiler_chan.clone(), | ||
| // TODO(cgaebel): Parallelize this traversal. |
| @@ -213,6 +214,93 @@ impl<'a> ParallelPreorderFlowTraversal for AssignISizesTraversal<'a> { | |||
|
|
|||
| impl<'a> ParallelPostorderFlowTraversal for AssignBSizesAndStoreOverflowTraversal<'a> {} | |||
|
|
|||
| /// A pair of the bloom filter used for css selector matching, and the node to | |||
| /// which it applies. This is used to efficiently do `Descendant` selector | |||
| /// As we walk down the DOM tree a task-local bloom filter is built of all the | ||
| /// CSS `SimpleSelector`s which are part of a `Descendant` compound selector | ||
| /// (i.e. paired with a `Descendant` combinator, in the `next` field of a | ||
| /// `CompoundSelector`). |
pcwalton
Sep 5, 2014
Contributor
Huh. Is this how other browser engines do it? I would have naively thought that you would just keep track of all classes/tag names/IDs.
cgaebel
Sep 5, 2014
Author
Contributor
Ok. I changed it to do that instead. That seems a lot smarter than what I was doing! It saves a pass, and is a lot simpler. I'll update the comment in a minute.
pcwalton
Sep 5, 2014
Contributor
Here's what Gecko uses (it's an nsIAtom, which is the type used by class names/tag names/IDs): http://mxr.mozilla.org/mozilla-central/source/layout/style/nsRuleProcessorData.h#72
Looks like Gecko uses element tags and classes: http://mxr.mozilla.org/mozilla-central/source/layout/style/nsCSSRuleProcessor.cpp#3636
cgaebel
Sep 5, 2014
Author
Contributor
It looks like Gecko is also keeping a stack of atoms for "what's above me", in addition to the bloom filter. This wouldn't be too crazy to also implement.
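To make the suggestion above concrete, here is a minimal sketch of a bloom filter that tracks the tag names, classes, and IDs of a node's ancestors, so that a descendant selector can be rejected without climbing the tree. The names `AncestorFilter`, `might_contain`, and `can_reject_descendant`, the 4096-bit size, and the use of three probes are all assumptions made for illustration; this is not the code from this PR or from Gecko. `DefaultHasher` is used only to keep the sketch self-contained.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

const NUM_BITS: u64 = 4096; // assumed size, chosen only for the sketch
const NUM_HASHES: u64 = 3; // assumed number of probes per atom

/// Hypothetical filter over the atoms (tag names, classes, IDs) of all ancestors.
pub struct AncestorFilter {
    bits: [u64; 64], // 64 * 64 = 4096 bits
}

impl AncestorFilter {
    pub fn new() -> Self {
        AncestorFilter { bits: [0; 64] }
    }

    /// Derives NUM_HASHES bit positions from a single 64-bit hash (double hashing).
    fn indices(atom: &str) -> impl Iterator<Item = usize> {
        let mut hasher = DefaultHasher::new();
        atom.hash(&mut hasher);
        let hash = hasher.finish();
        let h1 = hash & 0xffff_ffff;
        let h2 = (hash >> 32) | 1; // force a non-zero step
        (0..NUM_HASHES).map(move |i| (h1.wrapping_add(i.wrapping_mul(h2)) % NUM_BITS) as usize)
    }

    pub fn insert(&mut self, atom: &str) {
        for i in Self::indices(atom) {
            self.bits[i / 64] |= 1u64 << (i % 64);
        }
    }

    /// May return a false positive, never a false negative.
    pub fn might_contain(&self, atom: &str) -> bool {
        Self::indices(atom).all(|i| self.bits[i / 64] & (1u64 << (i % 64)) != 0)
    }
}

/// True when some atom the selector requires on an ancestor is definitely absent.
pub fn can_reject_descendant(filter: &AncestorFilter, required: &[&str]) -> bool {
    required.iter().any(|atom| !filter.might_contain(atom))
}
```

A `false` result only means the selector still has to be matched the slow way, since the filter can report atoms that were never inserted.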
| (Some(p), None) => new_bloom(Some(p)), | ||
| // Found cached bloom filter. | ||
| (Some(p), Some((bf, old_node))) => { | ||
| // Heey, the cached parent is our parent! We can reuse the bloom |
| /// Quickly figures out whether or not the compound selector is worth doing more | ||
| /// work on. If the simple selectors don't match, or there's a child selector | ||
| /// that does not appear in the parent bloom filter, we can exit early. | ||
| fn can_fast_reject<E: TElement, N: TNode<E>>( |
pcwalton
Sep 5, 2014
Contributor
Does your Rust have where clauses? If so, you can write this as `fn can_fast_reject<E,N>(...) where E: TElement, N: TNode<E>`, which is easier to indent.
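For comparison, a small sketch of the two ways to write the bounds. The traits `TElement` and `TNode` here are simplified, hypothetical stand-ins with made-up methods, and `is_tag_inline` / `is_tag_where` are illustrative functions, not the real `can_fast_reject`.

```rust
// Simplified, hypothetical stand-ins for the traits named in the diff.
pub trait TElement {
    fn local_name(&self) -> &str;
}

pub trait TNode<E: TElement> {
    fn as_element(&self) -> Option<&E>;
}

// Bounds written inline in the generic parameter list.
pub fn is_tag_inline<E: TElement, N: TNode<E>>(node: &N, tag: &str) -> bool {
    node.as_element().map_or(false, |e| e.local_name() == tag)
}

// The same bounds moved into a `where` clause; the signature line stays short.
pub fn is_tag_where<E, N>(node: &N, tag: &str) -> bool
where
    E: TElement,
    N: TNode<E>,
{
    node.as_element().map_or(false, |e| e.local_name() == tag)
}

// Toy implementations so the functions can be exercised.
pub struct Elem(pub &'static str);

impl TElement for Elem {
    fn local_name(&self) -> &str {
        self.0
    }
}

pub struct Node(pub Option<Elem>);

impl TNode<Elem> for Node {
    fn as_element(&self) -> Option<&Elem> {
        self.0.as_ref()
    }
}
```

Both forms compile to the same thing; the difference is purely in how the signature reads and indents.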
| // Key Stretching | ||
| // ============== | ||
| // | ||
| // Siphash is expensive. Instead of running it `NUMBER_OF_HASHES` times, which would |
pcwalton
Sep 5, 2014
Contributor
You don't need a cryptographically strong hash at all here. Just use xxhash or something.
cgaebel
Sep 5, 2014
Author
Contributor
I have switched to FNV hashing, but now this PR is blocked on: servo/stringcache#11.
EDIT: Unblocked now.
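For reference, a minimal sketch of a non-cryptographic FNV hash of the kind mentioned above. The thread does not say which FNV variant was adopted, so the choice of 64-bit FNV-1a here is an assumption, and `fnv1a_64` is an illustrative name rather than anything from the PR or from string-cache.

```rust
// Standard 64-bit FNV parameters.
const FNV_OFFSET_BASIS: u64 = 0xcbf2_9ce4_8422_2325;
const FNV_PRIME: u64 = 0x0000_0100_0000_01b3;

/// 64-bit FNV-1a: XOR each byte into the state, then multiply by the prime.
pub fn fnv1a_64(bytes: &[u8]) -> u64 {
    bytes.iter().fold(FNV_OFFSET_BASIS, |hash, &byte| {
        (hash ^ u64::from(byte)).wrapping_mul(FNV_PRIME)
    })
}
```

The whole hash is one XOR and one multiply per byte, which is why it is attractive for short keys such as class names and tag names.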
| /// The return value of this function is extremely sketchy. | ||
| /// The 'static lifetime on SimpleSelector is so that lifetime parameters on | ||
| /// `SharedLayoutContext` can be avoided. | ||
| pub fn max_selector_matches<E: TElement, N: TNode<E>>(&self, node: &N) |
|
|
||
| /// For the css selection bloom filter, we need to get an estimate of the | ||
| /// matching descendant simple selectors in the tree. This function will | ||
| /// return a set of them. |
pcwalton
Sep 5, 2014
Contributor
I'd rather do what Chrome/Blink does than add a whole new pass. New passes are expensive.
|
| let mut ret = false; | ||
|
|
||
| for shash in stretch(&mut to_rng(hash)) { | ||
| ret |= self.definitely_excludes_shash(shash); |
huonw
Sep 5, 2014
Contributor
Isn't this doing more work than necessary, since it doesn't short-circuit? Could this whole function be `stretch(&mut to_rng(hash)).any(|shash| self.definitely_excludes_shash(shash))`?
cgaebel
Sep 5, 2014
Author
Contributor
I did that at first, but it was slower. Apparently branches are expensive.
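The two shapes being compared can be sketched side by side. The function names and the generic predicate below are assumptions for illustration; the real code iterates over stretched hashes and calls `definitely_excludes_shash`. No claim is made here about which version is faster in general, since that depends on the predicate's cost and how predictable the branch is.

```rust
/// ORs every result together; the predicate runs for all hashes.
pub fn any_without_short_circuit<I, F>(hashes: I, excludes: F) -> bool
where
    I: Iterator<Item = u64>,
    F: Fn(u64) -> bool,
{
    let mut ret = false;
    for hash in hashes {
        ret |= excludes(hash); // no early exit
    }
    ret
}

/// Stops at the first hash for which the predicate returns true.
pub fn any_with_short_circuit<I, F>(mut hashes: I, excludes: F) -> bool
where
    I: Iterator<Item = u64>,
    F: Fn(u64) -> bool,
{
    hashes.any(|hash| excludes(hash))
}
```

Both return the same boolean; they differ only in how many times the predicate is evaluated.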
|
|
||
| /// A bloom filter can tell you if an element /may/ be in it. It cannot be | ||
| /// certain. But, assuming correct usage, this query will have a low false | ||
| /// positive rate. |
|
For this Wikipedia page, the bloom filter success stats are:
|
|
I believe I've addressed all the code review comments. Does everyone want to take another look? :) |
|
Timings? |
|
This PR consistently has trouble with the unit-doc job on travis, and we end up having problems building string-cache with errors like |
|
@jdm: How do I run the unit-doc job on my local machine? |
|
A few things I'd like to see any or all of:
|
130fcb7
to
7dffa22
|
7dffa22
to
614aba2
|
Ok I have some more numbers to report now! All this data is for when I just opened
If you want to play with the profiling data yourself, you can get it here. Run 1 is without the bloom filter, Run 2 is with. This still isn't ready to merge because I haven't fixed the build on all platforms. I'll work on having that done by the end of today. |
|
That is a nice looking table. |
|
Not to diminish the awesomeness of this work, but I think the relative benefit of the bloom filter will be reduced a lot when we start using atoms to compare classes (PR incoming soon). |
|
That might change the absolute performance gain, but won't change the hit rate. Let's rerun numbers with your atom patch. |
|
Right. |
|
This is the number of bloom filter invalidations as a function of the number of threads used for layout. Getting data for 1 thread isn't fair, because it uses a different codepath.
Given that the filter allows us to quickly reject 100k selectors, unless the average depth of a selector is greater than 50 (for 128 threads), this seems like a win. |
b4f1f61
to
cad8329
|
Whaaaat the build is passing!? Cool. Can I get an r+? |
| @@ -278,16 +279,31 @@ pub enum StyleSharingResult<'ln> { | |||
| } | |||
|
|
|||
| pub trait MatchMethods { | |||
| /// Inserts and removes the matching `Child` selectors from a bloom filter. | |||
| /// This is used to speed up CSS selector matching to remove unnecessary | |||
| /// tree climbs for `Child` queries. | |||
pcwalton
Sep 15, 2014
Contributor
Also, you aren't putting selectors in the bloom filter: you're putting local names, namespaces, IDs, and classes.
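Since the doc comment speaks of inserting and removing entries as the traversal enters and leaves an element, one way to support removal is a counting bloom filter. The sketch below assumes 8-bit saturating counters and two slots per hash; `CountingBloom` and its method names are hypothetical, and the hash passed in would be that of a local name, namespace, ID, or class.

```rust
/// Hypothetical counting bloom filter: each slot is a small counter, not a bit.
pub struct CountingBloom {
    counters: Vec<u8>,
}

impl CountingBloom {
    pub fn new(slots: usize) -> Self {
        assert!(slots > 0, "filter needs at least one slot");
        CountingBloom { counters: vec![0; slots] }
    }

    /// Two slot indices taken from the low and high halves of the hash.
    fn slots(&self, hash: u64) -> [usize; 2] {
        let n = self.counters.len() as u64;
        [((hash & 0xffff_ffff) % n) as usize, ((hash >> 32) % n) as usize]
    }

    /// Called when the traversal enters an element carrying this atom.
    pub fn insert_hash(&mut self, hash: u64) {
        for i in self.slots(hash) {
            self.counters[i] = self.counters[i].saturating_add(1);
        }
    }

    /// Called when the traversal leaves that element again.
    pub fn remove_hash(&mut self, hash: u64) {
        for i in self.slots(hash) {
            // A saturated counter has lost its true count, so it stays pinned.
            if self.counters[i] != u8::MAX {
                self.counters[i] = self.counters[i].saturating_sub(1);
            }
        }
    }

    pub fn might_contain_hash(&self, hash: u64) -> bool {
        self.slots(hash).iter().all(|&i| self.counters[i] != 0)
    }
}
```

Counters let nested ancestors with the same tag or class be tracked correctly: the atom only disappears from the filter once every element that contributed it has been left.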
| /// | ||
| /// Since a work-stealing queue is used for styling, sometimes, the bloom filter | ||
| /// will no longer be the one for the parent of the node we're currently on. When | ||
| /// this happens, the task local bloom filter will be thrown away and rebuilt. |
| /// | ||
| /// If one does not exist, a new one will be made for you. If it is out of date, | ||
| /// it will be thrown out and a new one will be made for you. | ||
| fn take_task_local_bloom_filter<'ln>( |
| } | ||
| } | ||
|
|
||
| fn put_task_local_bloom_filter<'ln>(bf: BloomFilter, unsafe_node: &UnsafeLayoutNode) { |
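The take/put pattern described in these doc comments can be sketched with a thread-local slot. Everything below is a simplified assumption: `NodeId` stands in for `UnsafeLayoutNode`, `Filter` stands in for `BloomFilter`, and the `rebuild` callback stands in for whatever walks the ancestors to build a fresh filter.

```rust
use std::cell::RefCell;

pub type NodeId = usize; // hypothetical stand-in for UnsafeLayoutNode

#[derive(Default)]
pub struct Filter {
    pub atoms_inserted: usize, // placeholder state for the sketch
}

thread_local! {
    // The cached filter plus the node whose children it is valid for.
    static CACHED: RefCell<Option<(Filter, NodeId)>> = RefCell::new(None);
}

/// Reuses the cached filter if it belongs to `parent`; otherwise rebuilds it.
pub fn take_filter<F>(parent: NodeId, rebuild: F) -> Filter
where
    F: FnOnce(NodeId) -> Filter,
{
    match CACHED.with(|cell| cell.borrow_mut().take()) {
        Some((filter, owner)) if owner == parent => filter,
        // Missing, or stale because work was stolen from elsewhere in the tree.
        _ => rebuild(parent),
    }
}

/// Stores the filter for the next node processed on this thread.
pub fn put_filter(filter: Filter, node: NodeId) {
    CACHED.with(|cell| *cell.borrow_mut() = Some((filter, node)));
}
```

Each rebuild in this scheme corresponds to one of the bloom filter invalidations asked about earlier in the thread.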
|
r+ with nits addressed. We may want to move away from FNV in the future but this is fine for now. |
cad8329
to
acd83ff
Added a bloom filter to CSS selector matching.
|
@cgaebel btw, this works:
(See the tables section) :) |
|
Whoa. Very slick. Gotta change up my ascii table style, now! |
cgaebel commented Sep 5, 2014
Every other browser engine uses a bloom filter to quickly reject
`Descendant` selectors. This adds that feature to Servo, and it even works (mostly!) in
parallel!