3rd-gen HNTrie #326
Comments
Forgot to reference the issue in the commit message: gorhill/uBlock@1b6fea1
New benchmark page for the 3rd-gen hntrie, which is suitable for large sets of hostnames -- thus can be used in place of a
Transcribing results from my side, the dictionary creation benchmark:
Reminder: creation occurs at filter list load time, thus it is a one-time sort of event. Look-up operations (below) occur multiple times for every single network request being inspected.
Regarding "Trie-based unserialized": it is meant to be used in the selfie-loading code, which will be a big improvement compared to the current use of a
The "look-up" benchmark:
I expect memory usage benefits as well; I will look into this below as time allows.
I would like to keep the comments section for developer notes focused on noteworthy information. Unless the relative results in your benchmarks differ significantly from my posted relative results, there is no point in posting your results; this will just end up drowning key information for whoever comes here from the release notes to inquire about the details of the issue.
This looks to be a very good change, expanding the use of Hntrie from just
Are you also planning on using 3rd-gen tries for the many hostname rules of EasyList and EasyPrivacy? Almost all of them have some sort of field, including the large collection of
(And sorry for my off-topic posts last night. I now realize that the other benchmark test is not that relevant here, since it's only for buckets up to 1000 hostnames.)
There are many instances of
I patched your commits from yesterday into my fork, including the new Hntrie JS version (Wasm disabled). I decided to do a stress test in a throwaway profile to ensure it's working properly. And it does indeed. I imported several large Hosts lists resulting in about 130k total of these rules, all stored in a very large trie. (Plus another 70k rules from the Easys, uAssets, and Adguard.) I went to a number of request-heavy sites and everything worked well. No problems.
(Note to readers: My real static rulesets are only about 50k total. I don't use any Hosts lists. That was purely for testing this new 3rd-gen trie.)
I'm now running the new JS hntrie in both of my real Firefox profiles. Everything is good. You do excellent quality work, gorhill!
A placeholder reference issue to document the code change related to 3rd-generation HNTrie -- will flesh out as time allows.
The main goal of 3rd-generation HNTrie is to be usable as a replacement of Set() for large collections of hostnames -- the 1st- and 2nd-gen versions of HNTrie were not designed to replace the use of Set() in FilterHostnameDict.
Pure-hostname filters -- stored in FilterHostnameDict -- are the most common static network filters. For instance, all filters in a hosts file are pure hostname ones. With default filter lists, there are many instances of FilterHostnameDict, with the largest holding over 34,000 distinct hostnames.
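To make the Set()-versus-trie idea concrete, here is a minimal sketch of both look-up styles. It is not the actual HNTrie or FilterHostnameDict code: the names matchesWithSet and HostnameTrie are hypothetical, the trie is keyed by whole labels using plain Map objects, and it assumes a pure-hostname filter matches the hostname itself and any of its subdomains.

```javascript
// Hypothetical sketch only; NOT uBlock Origin's HNTrie implementation.

// Set-based look-up: test the hostname, then each parent domain in turn.
function matchesWithSet(hostnames, hostname) {
    let candidate = hostname;
    for (;;) {
        if (hostnames.has(candidate)) { return true; }
        const pos = candidate.indexOf('.');
        if (pos === -1) { return false; }
        candidate = candidate.slice(pos + 1);
    }
}

// Trie-based look-up: labels are stored right to left, so that
// "example.com" and "ads.example.com" share the "com" -> "example" path.
class HostnameTrie {
    constructor() {
        this.root = { children: new Map(), terminal: false };
    }

    add(hostname) {
        let node = this.root;
        const labels = hostname.split('.');
        for (let i = labels.length - 1; i >= 0; i--) {
            let next = node.children.get(labels[i]);
            if (next === undefined) {
                next = { children: new Map(), terminal: false };
                node.children.set(labels[i], next);
            }
            node = next;
        }
        node.terminal = true; // a stored hostname ends at this node
    }

    matches(hostname) {
        let node = this.root;
        const labels = hostname.split('.');
        for (let i = labels.length - 1; i >= 0; i--) {
            node = node.children.get(labels[i]);
            if (node === undefined) { return false; }
            // A stored hostname is a label-aligned suffix of the input.
            if (node.terminal) { return true; }
        }
        return false;
    }
}

// Usage: both structures answer the same question.
const blocked = new Set(['example.com', 'ads.example.org']);
const trie = new HostnameTrie();
for (const hn of blocked) { trie.add(hn); }

console.log(matchesWithSet(blocked, 'www.example.com')); // true
console.log(trie.matches('www.example.com'));            // true
console.log(trie.matches('example.org'));                // false
```

The Set version does one hash look-up per candidate suffix, while the trie walks the hostname once and stops at the first stored suffix. A Map-per-node trie like this one is only meant to show the look-up logic; the stress test in the comments refers to JS and Wasm versions of the real HNTrie, whose internal layout is not described in this thread.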