Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply Chromium's built-in adblocker (subresource filter) to every page #42

Closed
wants to merge 1 commit into from

Conversation

Zoraver
Copy link
Contributor

@Zoraver Zoraver commented Aug 22, 2019

Addresses #10. Tested in the emulator on a number of webpages and behaved as described below.

The subresource filter does not seem to be applied unless Subresource Filter Rules has a version greater than 0.0.0.0 (visible and updatable on chrome://components). I do not know if Vanadium updates these components automatically; I never observed it in the emulator, but I also never left the emulator running for more than a few minutes at a time. On my own phone running GrapheneOS, I believe that the Subresource Filter Rules component had a valid version, but I manually updated it after and can't remember for sure.

When a site on which ads are being blocked is first loaded, a snackbar appears that states that ads are being blocked. When "more info" is pressed, it says that "this site shows intrusive or misleading ads" and gives the user the option to disable blocking. At first I considered this annoying, but upon further reflection, I decided that it was acceptable, as reimplementing a user interface to allow users to disable blocking for a specific site would be outside the scope of this project.

@csagan5
Copy link

csagan5 commented Aug 22, 2019

@Zoraver I find your approach interesting. Is cloud-based safebrowsing disabled in current Vanadium? Does this type of filtering work without the API calls (for which there is no valid key anyways)?

@thestinger
Copy link
Member

@Zoraver I don't think it will ever update it automatically without Play Services present, and it'll need to be switched to obtaining it from https://grapheneos.org/. I don't know exactly what this involves doing.

@thestinger
Copy link
Member

thestinger commented Aug 22, 2019

@csagan5 Safe Browsing isn't explicitly disabled but isn't available because the default mode depends on Play Services which isn't present where Vanadium is used, so the Safe Browsing code isn't used. Vanadium is currently only for use within GrapheneOS. I don't want to have Safe Browsing based on Play Services enabled, but I do want to have the option of using the local database implementation. I'm unsure if I would want to have it enabled by default, but I definitely want it as an optional feature. Unfortunately, I couldn't get it working, so I haven't touched this yet. The toggle is currently a no-op since there's no Play Services and it's always effectively disabled. In theory, it should be possible to add an API key for a cloud project with Safe Browsing enabled and use safe_browsing_mode=3 to support Safe Browsing via the local database in the browser. The requests it makes download databases for on truncated hashes (4 bytes) of the URL, but there are other options like simply downloading the entire database, which I don't think is really that large. These requests could also be made to a https://grapheneos.org/ subdomain rather than Google. The same applies to a couple other things like downloading the GPS almanacs in the OS.

@csagan5
Copy link

csagan5 commented Aug 22, 2019

Safe Browsing isn't explicitly disabled but isn't available because the default mode depends on Play Services which isn't present where Vanadium is used, so the Safe Browsing code isn't used.

So am I correct to conclude that the built-in adblocker wouldn't work, even with this patch, because of #35?

I have an interest in the built-in adblocker for the reasons explained in #10 (comment)

All things considered it would allow to stay closer to upstream.

@Zoraver
Copy link
Contributor Author

Zoraver commented Aug 22, 2019

@csagan5 It does work with this patch, as long as the user manually updates the Subresource Filter Rules component.

@thestinger I will look into the component installer / updater later this week to see if I can find a way to change where it obtains components.

@macchrome
Copy link

macchrome commented Aug 23, 2019

https://github.com/chromium/chromium/blob/master/components/subresource_filter/FILTER_LIST_GENERATION.md

Place unindexed filters in the respective place: Subresource Filter\Unindexed Rules;
Subresource Filter\Indexed Rules is created. Easy on Windows, Mac, Linux; probably not so under (unrooted) Android.

@thestinger
Copy link
Member

It doesn't really need to be filtered though. I'd rather just have the full EasyList / EasyPrivacy as the standard option and alternatives could be offered later.

@thestinger
Copy link
Member

i.e. I think only the last step for converting to the right format would be necessary. I'd like to use the full EasyList + EasyPrivacy as the default. Later on, it can be made into something more configurable, but the priority would just be getting that as the baseline.

I don't know how the component updater works, like whether it has signatures rather than just relying on HTTPS, and I'm curious how hard that's going to be to set up for https://releases.grapheneos.org/ or a new subdomain dedicated to this.

In terms of the infobar notice, I think it would be best to simply remove that for the time being. Most pages are going to have blocked content and I don't think it's important to say when it's happening but rather provide an easy way to disable it per site. Ideally, there would be a per-page toggle in the site settings menu opened from pressing the status icon on the navigation bar and then site settings. There's already a global site settings toggle, which is a good start.

@thestinger
Copy link
Member

@Zoraver I think they use Play Services to schedule the updates, and maybe they'd be open to a portable JobScheduler implementation upstream, but in the meantime, it could just trigger a check for component updates at each initial launch by hard-wiring triggering it. I'm sure the end result would look trivial, and the main difficulty is figuring out how to deal with such a large codebase. It would also make sense to bundle initial copies of the components with the browser, at least if that's already possible, and I'd be fine with starting out without automatic updates, especially if people can manually update via with the components page so I can at least suggest that to them if they run into an issue with a filter.

@Zoraver
Copy link
Contributor Author

Zoraver commented Aug 27, 2019

@thestinger I have removed the infobar, and Chromium already has a per-page toggle for ad blocking in the site settings menu that I failed to notice earlier.

According to the documentation for the component updater, components are delivered as signed CRX files via Omaha.

From this file, I found that the extension id for the Subresource Filter Rules component is gcmjkmgdlgnkkcocmoeiminaijmmjnii. Using the extension id, the CRX can be downloaded here. I am still working on finding a way to change where the component updater gets components.

@Zoraver
Copy link
Contributor Author

Zoraver commented Aug 27, 2019

The URL that the component updater queries appears to be set in this file. I changed the value kUpdaterJSONDefaultUrl to the URL of a server that I control and it received a POST when I manually checked for updates to the Subresource Filter Rules component.

@csagan5
Copy link

csagan5 commented Sep 5, 2019

The URL that the component updater queries appears to be set in this file.

In Bromite those URLs are already dummied out to about:blank; I will start trying your patches for the upcoming v77 release of Bromite.

@csagan5
Copy link

csagan5 commented Sep 5, 2019

I have been looking at the format of the subresource filters: https://cs.chromium.org/chromium/src/out/android-Debug/gen/components/url_pattern_index/flat/url_pattern_index_generated.h?l=213

 bool Verify(flatbuffers::Verifier &verifier) const {
    return VerifyTableStart(verifier) &&
           VerifyField<uint8_t>(verifier, VT_OPTIONS) &&
           VerifyField<uint16_t>(verifier, VT_ELEMENT_TYPES) &&
           VerifyField<uint8_t>(verifier, VT_ACTIVATION_TYPES) &&
           VerifyField<uint8_t>(verifier, VT_URL_PATTERN_TYPE) &&
           VerifyField<uint8_t>(verifier, VT_ANCHOR_LEFT) &&
           VerifyField<uint8_t>(verifier, VT_ANCHOR_RIGHT) &&
           VerifyOffset(verifier, VT_DOMAINS_INCLUDED) &&
           verifier.VerifyVector(domains_included()) &&
           verifier.VerifyVectorOfStrings(domains_included()) &&
           VerifyOffset(verifier, VT_DOMAINS_EXCLUDED) &&
           verifier.VerifyVector(domains_excluded()) &&
           verifier.VerifyVectorOfStrings(domains_excluded()) &&
           VerifyOffset(verifier, VT_URL_PATTERN) &&
           verifier.VerifyString(url_pattern()) &&
           VerifyField<uint32_t>(verifier, VT_ID) &&
           VerifyField<uint32_t>(verifier, VT_PRIORITY) &&
           verifier.EndTable();
  }
};

And I found it incredibly familiar. For comparison, this is the top of Bromite AdBlock engine adblock_entries.h:

#ifndef NET_URL_REQUEST_ADBLOCK_ENTRIES_H_
#define NET_URL_REQUEST_ADBLOCK_ENTRIES_H_

namespace net {

#define ADBLOCK_FLAG_EXCEPTION 1
#define ADBLOCK_FLAG_MATCH_DOMAIN 2
#define ADBLOCK_FLAG_MATCH_BEGIN 4
#define ADBLOCK_FLAG_MATCH_END 8
#define ADBLOCK_FLAG_HAS_WILDCARD 16
#define ADBLOCK_FLAG_HAS_SEPARATOR 32
#define ADBLOCK_FLAG_MATCH_CASE 64
#define ADBLOCK_FLAG_THIRD_PARTY 128
#define ADBLOCK_FLAG_FIRST_PARTY 256
#define ADBLOCK_FLAG_RESOURCE_TYPE_IN 512
#define ADBLOCK_FLAG_RESOURCE_TYPE_NOT_IN 1024
#define ADBLOCK_FLAG_RT_STYLESHEET 131072
#define ADBLOCK_FLAG_RT_SCRIPT 196608
#define ADBLOCK_FLAG_RT_IMAGE 262144
#define ADBLOCK_FLAG_RT_MEDIA 327680
#define ADBLOCK_FLAG_RT_XHR 851968
#define ADBLOCK_FLAG_RT_PING 917504
#define ADBLOCK_FLAG_RT_CSP_REPORT 1048576

struct adblock_entry {
  const char **matches;
  int flags;
  const char **domains, **domains_skip;
};

For those not familiar with the Bromite/NoChromo engine, it uses anchors and I personally added the domains whitelist/blacklist feature which was not present in the original NoChromo engine.

You can see how anchors work here: https://github.com/bromite/bromite/blob/76.0.3809.129/build/patches/Bromite-adblock-engine.patch#L596

To me it looks like a better/tidier implementation of the engine which is used in Bromite, but I cannot claim much: there might be prior art which I am not aware of.

Just the combination of anchors and domains whitelist/blacklist is something that I immediately recognised, together with the recently added resource-type-based flags (also a new feature I added on top of original NoChromo version).

This also means that the native engine used in Chromium is not that different from Bromite's one, and switching to it should be easier.

I will most likely start with a hard-coded block of filters, but I would not be surprised if converting Bromite adblock filters to the subresource filters is a 1:1 operation (given all the above considerations) and thus also easier to integrate with a custom URL download functionality.

@thestinger thestinger changed the base branch from pie to 10 September 5, 2019 17:32
@csagan5
Copy link

csagan5 commented Sep 7, 2019

@Zoraver does the enabled update in 89768a8 work?

I am currently experimenting with a dedicated AdBlockUpdater that does not use components.

@Zoraver
Copy link
Contributor Author

Zoraver commented Sep 7, 2019

@csagan5 It does, but it should be considered a temporary solution that can be replaced by a better one if or when one is developed. Essentially, the patch causes Vanadium to check for updates to the Subresource Filter Rules component each time the browser is started. When GrapheneOS starts to ship its own version of this component, the id being passed to OnDemandUpdate() will need to be updated.

@csagan5
Copy link

csagan5 commented Sep 7, 2019

I see; I am currently developing a dedicated filters update service, I will eventually release it as part of next Bromite release if I get it working.

@csagan5
Copy link

csagan5 commented Sep 9, 2019

@Zoraver the update service is almost complete, I am testing it now.

I am thinking about how to provide an UI option to specify the filters URL. Would you be interested in developing it? My best pick would be to copy https://cs.chromium.org/chromium/src/chrome/android/java/src/org/chromium/chrome/browser/preferences/HomepageEditor.java, make it accessible from settings and add a new profile preference for it.

@Zoraver
Copy link
Contributor Author

Zoraver commented Sep 12, 2019

@csagan5 I would be happy to, but I am not sure if I will have time to work on it until next week.

@csagan5
Copy link

csagan5 commented Sep 12, 2019

@Zoraver no problem; meanwhile I have verified that the AdBlockUpdate service works for Bromite, I am cleaning up the patches and preparing a release. I asked in case you would like to work on that, otherwise I will have to, eventually.

As for the patches conflict I see here: perhaps it would be best to use an approach with a list file like this? It would allow for less renumbering at least.

@csagan5
Copy link

csagan5 commented Sep 15, 2019

Quoting self for context:

So am I correct to conclude that the built-in adblocker wouldn't work, even with this patch, because of #35?

I have an interest in the built-in adblocker for the reasons explained in #10 (comment)

I have figured out that in current Chromium version (v77) enabled presets/configurations are only used in coordination with SafeBrowsing, see calls to ActivationStateComputingNavigationThrottle::NotifyPageActivationWithRuleset.

This method is called either via the full SafeBrowsing throttle chain, or when the kAdTagging feature is enabled by default. This is a feature enabled to measure the impact of the ad filter on 1% of traffic of all users, as if it were enabled on all sites.

They are clearly testing what would be the impact in performance; since in Bromite I have completely disabled SafeBrowsing, I must avoid going through the SafeBrowsing throttle chain. I understand how Vanadium might differ here, but I am reporting my findings as they might be useful anyways.

So for now I am experimenting with calling ActivationStateComputingNavigationThrottle::NotifyPageActivationWithRuleset as the ad tagging feature does, for all sites (but with real filtering consequences).

@csagan5
Copy link

csagan5 commented Sep 16, 2019

The patch for the Bromite AdBlockerUpdate service is released: https://github.com/bromite/bromite/blob/master/build/patches/Bromite-AdBlockUpdaterService.patch

@Zoraver I included your 2 patches from here as well, as separate patches, although they are unnecessary and might remove them in the future - they are only relevant if SafeBrowsing is being built in.

@csagan5
Copy link

csagan5 commented Sep 27, 2019

although they are unnecessary and might remove them in the future - they are only relevant if SafeBrowsing is being built in.

Correction: the patch to hide the ads popup works also without SafeBrowsing, so I will keep that one

@thestinger
Copy link
Member

So does https://github.com/bromite/bromite/blob/master/build/patches/Bromite-AdBlockUpdaterService.patch work as a standalone patch, or are there some other dependencies?

Do you know how this interacts with having the Safe Browsing feature built in? I'm curious about this part:

I included your 2 patches from here as well, as separate patches, although they are unnecessary and might remove them in the future - they are only relevant if SafeBrowsing is being built in.

Since it implies that they interact together and I'm not entirely clear on what happens in that case. You're a lot more familiar with all of this blocking and Safe Browsing code at this point.

@csagan5
Copy link

csagan5 commented Sep 28, 2019

You're a lot more familiar with all of this blocking and Safe Browsing code at this point.

With Safe Browsing my experience is of the "destructive" type, as I have tried to disable it at all costs over the major releases and currently the patch holds, although it is cumbersome so it is not a position I really like to be in regarding patch maintenance. Such patch was originally based on the ungoogled-chromium patch, but it might differ sensibly by now as I have always built it only for Android (it needed extra changes for such target since the very beginning).

I will explain a bit the interaction between the Subresource filter and Safe Browsing, as I understood it (although some drawing would be better):

  • in current Chromium subresource filtering happens through safe browsing, although the two at some point were more independent (as I have written before, they start to entangle further in v77)
  • in current Chromium there is however "AdTagging", which happens independently of safe browsing being even compiled in or not, and this ad tagging is basically a dry-run of the subresource filters on 1% of all pages for metrics collections reasons (see this comment)

The way subresource filtering happens in safe browsing is through "presets", so what I mentioned before in this comment, then corrected in this comment, is relevant only for the presets patches and not the ads popup patch.

Basically @Zoraver has efficiently enabled the filter for all pages by using the same mechanism used in Safe browsing, which is by adding a new preset.

However in Bromite Safe browsing has been historically always disabled because of the tight coupling with the server-side part, so I had to go another route (after testing around with presets) which is by copying the "ad tagging" feature and making it effective instead of a dry-run. This works fine because as I mentioned before ad tagging is independent of safe browsing.

The AdBlockUpdaterService patch covers two purposes:

  • provide a filters update mechanism that does not need download of a component/extension, and replace it for the subresource filter (it uses a simple HTTPS GET request and evaluates Last-Modified header to infer an incremental version)
  • activate subresource filter similarly to ad tagging, bypassing the need to go through Safe browsing and presets

For Vanadium if you want to continue building with Safe browsing (suggested, since the safe-browsing-disabling patch is a burden) you need only (1) and not (2), although the patch as-is would work if you avoid also adding the presets patches from @Zoraver (otherwise double filtering might happen, not sure).

The silver lining here as you can sense is that upstream is putting resource filtering in the same realm as security/safety (the scope of safe browsing), and very close to "report home X was blocked" functionality; I do not agree with this strategy.

I can assist you with the integration of this patch if you need to split (1) from (2); I also have a patch lying around that adds log lines to understand better how the presets work.

@thestinger
Copy link
Member

@Zoraver I'm still very interested in this feature but I lack the time / energy to work on advancing it myself. Let me know if you plan to work on this again in the near future and I can make some time available to discuss / review the approach.

@thestinger
Copy link
Member

Going to close this PR since it will need to be reopened but this feature is definitely welcome, and I just didn't want to land it in the form of using the upstream filter component.

@thestinger thestinger closed this Jan 27, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants