Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove regional Google domains to reduce size of entitylist #79

Closed

Conversation

say-yawn
Copy link
Contributor

About this PR

Due to the large changes on #78 we found that Shavar service was getting higher hits than expected since the Firefox client has a limit on the size of the file that can be downloaded. This causes the error on the client side and since the updated file has not been retrieved by the Firefox client Shavar gets constant hits to download the said file causing continuous massive hits. While we are figuring out what to do with the large change on the entity list we are removing the regional Google domains to unblock other changes from going into effect.

Acceptance Criteria

  • Regional Google domains are removed
  • Shavar service's hits are lowered
  • Follow-up with fix for regional Google domain additions to entity lists

@englehardt englehardt self-requested a review October 28, 2019 23:07
@englehardt
Copy link
Contributor

englehardt commented Oct 28, 2019

Thanks! To be clear this is reverting the changes in disconnectme/disconnect-tracking-protection@f2550bc that were added for the reasons described in disconnectme/disconnect-tracking-protection#100. These changes push the generated entity list over the url classifier's chunk size limit, which is causing all Firefox clients to fail to update any tracking protection list.

The next steps here are to:

  1. Create a new whitelist similar in configuration to mozstd-trackwhite-digest256, perhaps mozstd-trackwhite-google-digest256.
  2. Duplicate the Google entity over to that whitelist.
  3. Consume the new whitelist alongside the standard whitelist in all channels and for all classifier features.
  4. Exclude the Google entity from mozstd-trackwhite-digest256.
  5. Add the changes from this PR back into the main list

EDIT: I updated the steps above after thinking through our currently workflow. The new steps will continue to allow Disconnect to make PRs against a single json file, and we'll just split the file out into two shavar lists for delivery. We'll have to update the list creation script to do that, but we should be able to use inclusion and exclusion categories similar to how we slice up the blocklist.

Copy link
Contributor

@englehardt englehardt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@skim1102 You removed the domains from the "properties" section, but we should remove them from the "resources" section instead.

This reverts commit c61158b. Domains removed from the "properties" section when it should have been removed form the "resources".
@say-yawn
Copy link
Contributor Author

Closing and moving to #80 to maintain git history on properties.

@say-yawn say-yawn closed this Oct 29, 2019
@say-yawn say-yawn deleted the remove-regional-google-domains-from-entities-list branch October 29, 2019 20:02
@whimboo
Copy link

whimboo commented Nov 1, 2019

Please note that also the Firefox-UI tests for safebrowsing detected this broken behavior. For details see https://bugzilla.mozilla.org/show_bug.cgi?id=1571134.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants