Implement hosts-only mode #2115

Merged (2 commits) on Aug 5, 2021
mjethani commented Aug 4, 2021

This PR introduces a hosts-only mode, in which the benchmark runs with only "host filters." An engine could be used in this mode for DNS-level or proxy-level blocking.

A hosts.txt file is included based on The Block List Project (see #2115 (comment)).

To run in hosts-only mode, pass the HOSTS_ONLY=1 option to make.
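As an illustration only, a benchmark script could pick its filter list from such an option roughly as follows. This is a sketch under assumptions, not the code from this PR: it assumes the option reaches the script as an environment variable, and the function name selectListFile is hypothetical. The file names hosts.txt and easylist.txt are the ones mentioned in this conversation.

```javascript
// Hypothetical sketch: choose the filter list for a benchmark run.
// Assumes HOSTS_ONLY is visible to the script as an environment variable;
// the PR itself may wire the option differently.
function selectListFile(env = process.env) {
  // Only the exact value '1' turns hosts-only mode on in this sketch.
  return env.HOSTS_ONLY === '1' ? 'hosts.txt' : 'easylist.txt';
}
```

Taking the environment as a parameter keeps the helper easy to test without touching the real process environment.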

@@ -80,10 +80,31 @@ function isSupportedUrl(url) {
);
}

function looksLikeHostFilter(raw) {
// https://en.wikipedia.org/wiki/Hostname#Syntax
mjethani commented on this line:

@gorhill this is my idea of what a host filter is. I hope I've got it right! Let me know if you think I've missed something.
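The body of looksLikeHostFilter is not shown in this excerpt. As a rough illustration of the idea, a check based on the hostname syntax rules referenced in the code comment might look like the sketch below. Everything here is an assumption rather than the PR's implementation: the names isValidHostname and looksLikeHostFilterSketch are hypothetical, and the sketch assumes a host filter is either a bare hostname or a filter of the form ||hostname^ with no options.

```javascript
// Hypothetical sketch, not the function from this PR.
// One hostname label: 1 to 63 letters, digits, or hyphens,
// with no hyphen at the start or the end.
const LABEL = /^[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?$/i;

function isValidHostname(hostname) {
  // The whole name is limited to 253 characters.
  if (hostname.length === 0 || hostname.length > 253) return false;
  const labels = hostname.split('.');
  // Assumption: require at least one dot, so single words are rejected.
  if (labels.length < 2) return false;
  return labels.every((label) => LABEL.test(label));
}

function looksLikeHostFilterSketch(raw) {
  const line = raw.trim();
  // Assumed filter form: ||hostname^ with nothing after the caret.
  if (line.startsWith('||') && line.endsWith('^')) {
    return isValidHostname(line.slice(2, -1));
  }
  // Otherwise accept only a bare hostname.
  return isValidHostname(line);
}
```

With these assumptions, a filter that carries options (for example one ending in $script) or a cosmetic filter is rejected, because the remaining text is not a valid hostname.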

remusao added the "PR: Internal 🏠" label (changes only affect internals) on Aug 5, 2021
gorhill commented Aug 5, 2021

The list that results from extracting filters with the looks-like-host-filter function is rather small, around 3,800 entries if I remember correctly. I would typically expect a DNS block list to be much larger. I went ahead and created a hosts.txt file combining the "Ads" and "Tracking" lists from https://github.com/blocklistproject/Lists#lists. The result is over 169K filters, which makes it a good stress test. I think the --hosts-only mode should use that typical DNS-based block list.

Tell me if you are interested.
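For illustration, combining two hosts-format lists into one deduplicated list could be sketched as below. This is not how the attached hosts.txt was actually produced (that is not described here). The sketch assumes each line has the form "address hostname", with # starting a comment, and the names parseHosts and combineHostLists are hypothetical.

```javascript
// Hypothetical sketch: extract hostnames from hosts-file text.
// Assumes lines look like "0.0.0.0 example.com" and that '#' starts a comment.
function parseHosts(text) {
  const hostnames = new Set();
  for (const rawLine of text.split(/\r?\n/)) {
    // Drop comments and surrounding whitespace.
    const line = rawLine.replace(/#.*$/, '').trim();
    if (line === '') continue;
    // First field is the address; every later field is a hostname.
    const fields = line.split(/\s+/);
    for (const name of fields.slice(1)) {
      hostnames.add(name.toLowerCase());
    }
  }
  return hostnames;
}

// Merge any number of hosts-file texts into one sorted, deduplicated array.
function combineHostLists(...texts) {
  const combined = new Set();
  for (const text of texts) {
    for (const name of parseHosts(text)) combined.add(name);
  }
  return [...combined].sort();
}
```

Using a Set means a hostname that appears in both lists is kept only once, so the combined count can be lower than the sum of the two inputs.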

mjethani commented Aug 5, 2021

> Tell me if you are interested.

Looks good to me! The lists are released under the Unlicense. We could include the generated hosts.txt file in this repo itself, right next to easylist.txt. If it's too big, we could zip it up.

@gorhill how do I get this file from you? Maybe you could submit it as a patch, or upload it somewhere so I can fetch it and include it in this patch.

gorhill commented Aug 5, 2021

Looks like I can attach it here: hosts.txt.zip

mjethani marked this pull request as ready for review on August 5, 2021, 17:32
mjethani requested a review from remusao (as a code owner) on August 5, 2021, 17:32
mjethani commented Aug 5, 2021

@remusao the patch is a lot simpler now and marked ready for review.

remusao left a review comment:

Looks great, thanks a lot!

remusao merged commit 5e0c162 into ghostery:master on Aug 5, 2021
mjethani deleted the hosts-only branch on August 7, 2021, 11:17