Use tsurlfilter's DNS engine #2136

Merged
merged 1 commit into from
Aug 9, 2021

@mjethani mjethani commented Aug 9, 2021

This patch uses the DNS engine in @adguard/tsurlfilter for hosts-only mode.

@mjethani mjethani requested a review from remusao as a code owner August 9, 2021 16:44

mjethani commented Aug 9, 2021

@ameshkov FYI

Hosts-only mode is for testing blocking at the host level only. The DnsEngine object is perfect for this.

In actual DNS mode there would be no URL from which to extract the hostname, but the extractHostname() function is extremely fast and I don't think it makes any significant difference to the benchmark results.

const result = TSUrlFilter.hostsOnly ?
  this.matchHostname(url) :
  this.matchRequest({ url, frameUrl, type });
return result !== null && !result.whitelist;
Contributor Author

Something worth noting is that the DNS engine also supports whitelisting rules. This is also true of other content blockers like uBlock Origin. But we have no whitelisting rules in our hosts.txt file here.
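
To illustrate why the `!result.whitelist` part of the check matters, here is a minimal standalone sketch. It does not use the tsurlfilter API: `HostsMatcher`, its `allowedHostnames` parameter, and `isBlocked` are hypothetical stand-ins, and the matcher only does exact hostname lookups.

```javascript
// Hypothetical stand-in for a host-level engine; NOT the tsurlfilter API.
class HostsMatcher {
  constructor(hostsText, allowedHostnames = []) {
    this.blocked = new Set();
    this.allowed = new Set(allowedHostnames.map((h) => h.toLowerCase()));

    for (const line of hostsText.split('\n')) {
      // Drop comments and surrounding whitespace.
      const entry = line.replace(/#.*$/, '').trim();
      if (entry === '') continue;

      // hosts format: "<ip> <hostname> [more hostnames...]"
      const fields = entry.split(/\s+/);
      for (const hostname of fields.slice(1)) {
        this.blocked.add(hostname.toLowerCase());
      }
    }
  }

  // Returns a result object on a match, or null when nothing matches.
  matchHostname(url) {
    // The URL parser lowercases the hostname for us.
    const hostname = new URL(url).hostname;
    if (this.allowed.has(hostname)) return { whitelist: true };
    if (this.blocked.has(hostname)) return { whitelist: false };
    return null;
  }
}

// Same shape as the check in the patch: blocked only if a rule matched
// and that rule is not a whitelisting rule.
function isBlocked(matcher, url) {
  const result = matcher.matchHostname(url);
  return result !== null && !result.whitelist;
}

const matcher = new HostsMatcher(
  '0.0.0.0 ads.example.com\n# a comment\n0.0.0.0 cdn.example.org',
  ['cdn.example.org'],
);
```

With this setup, a request to `ads.example.com` counts as blocked, while `cdn.example.org` matches a rule but is not blocked because the match is a whitelist match.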


@ameshkov ameshkov Aug 9, 2021

Well, it also supports a slightly different subset of rules and modifiers; the syntax is explained here:
https://github.com/AdguardTeam/AdGuardHome/wiki/Hosts-Blocklists

That said, tsurlfilter is only capable of basic matching, since it is used only for simple rule validation by various automation scripts. AdGuard DNS and AdGuard Home use a different content blocking library.

@remusao remusao added the PR: Internal 🏠 Changes only affect internals label Aug 9, 2021
@remusao remusao merged commit 8544c57 into ghostery:master Aug 9, 2021
@mjethani mjethani deleted the tsurlfilter-dnsengine branch August 9, 2021 20:36
@mjethani
Contributor Author

@ameshkov thanks for helping with the integration.

The only thing left now is serialization and deserialization. I could not find anything in the README.md file for this. I see there's a RuleStorage object, but I'm not sure how to use it.

Since the library is under development, it's not so important to implement this now.

In the future, when an API is available, we'll have to implement the following two functions in blockers/tsurlfilter.js in this repo:

class TSUrlFilter {
  async serialize() {
    // return data in serialized format
  }

  async deserialize(serialized) {
    // initialize engine with given data in serialized format
  }
}
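
Until such an API exists, one possible stopgap is to round-trip the raw rule text and rebuild the engine from it. The sketch below assumes exactly that and nothing about tsurlfilter itself: `SerializableBlocker` and the `buildEngine` callback are hypothetical names, and since the engine is rebuilt from scratch this would not measure real deserialization performance.

```javascript
// Hypothetical wrapper; `buildEngine` stands in for whatever function
// constructs an engine from raw rule text. It is NOT a tsurlfilter API.
class SerializableBlocker {
  constructor(rawLists, buildEngine) {
    this.rawLists = rawLists;
    this.buildEngine = buildEngine;
    this.engine = buildEngine(rawLists);
  }

  async serialize() {
    // Store only the input text plus a format version.
    return JSON.stringify({ version: 1, rawLists: this.rawLists });
  }

  async deserialize(serialized) {
    const data = JSON.parse(serialized);
    if (data.version !== 1) {
      throw new Error(`Unsupported serialization version: ${data.version}`);
    }
    // Rebuild the engine from the stored text.
    this.rawLists = data.rawLists;
    this.engine = this.buildEngine(this.rawLists);
  }
}
```

A real implementation would presumably serialize the engine's internal state instead, which is why it has to wait for library support.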
