Add ability to debounce URLs based on a regex applied to the path #23121

Closed
ShivanKaul opened this issue May 28, 2022 · 14 comments · Fixed by brave/brave-core#13551

Comments

@ShivanKaul
Collaborator

ShivanKaul commented May 28, 2022

Summary of feature

To expand the existing debounce capability in Brave, we'd like to be able to specify a generic regex pattern for picking out destination URLs from a given URL. Note that for security reasons, we will only apply this regex pattern to the URL's path for now.

The motivating use-case is to debounce AMP cache URLs to the canonical URLs. These URLs look something like https://www-theverge-com.cdn.ampproject.org/c/s/www.theverge.com/platform/amp/2018/9/20/17881766/bing-google-amp-support-mobile-news -- we would want to debounce that URL to https://www.theverge.com/platform/amp/2018/9/20/17881766/bing-google-amp-support-mobile-news. We also need the ability to predicate a debounce rule on a user preference, to support use-cases like De-AMP where a user turning off the preference should turn off the relevant debouncing rule.
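As a rough illustration of the AMP cache case, the sketch below applies a regex to the path only and prepends a scheme to the captured value. The pattern `^/c/s/(.*)$` and the helper name `canonical_from_amp_cache` are assumptions made for this example; they are not taken from any shipped rule.

```python
import re
from urllib.parse import urlsplit

# Hypothetical pattern for AMP cache paths; not taken from a shipped rule.
AMP_PATH_PATTERN = re.compile(r"^/c/s/(.*)$")


def canonical_from_amp_cache(url):
    """Return the canonical URL embedded in an AMP cache URL's path, or None."""
    # Only the path is inspected, never the host or the query string.
    match = AMP_PATH_PATTERN.match(urlsplit(url).path)
    if match is None:
        return None
    # The captured value has no scheme, so one is prepended (assumed https here).
    return "https://" + match.group(1)


amp_url = (
    "https://www-theverge-com.cdn.ampproject.org/c/s/www.theverge.com/"
    "platform/amp/2018/9/20/17881766/bing-google-amp-support-mobile-news"
)
print(canonical_from_amp_cache(amp_url))
```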

There are several other examples captured in this issue: #22429. Note that given that we're limited to applying the regex to the URL's path, we can capture only a subset of those.

Description of changes

  1. This would be a new action, regex-path, supported in debounce.json.
  2. The include and exclude pattern would function as before.
  3. The param would be a regular expression that has a capture group that should pick out one (and only one) well-formed URL on evaluation when applied on the original URL's path.
  4. The evaluation of the param expression will be done when applying the rule, because that's when we have the original URL.
  5. We should URL unescape the target URL.
  6. There will be a new key called prepend_scheme: http|https that, only if specified, will add the specified scheme (http or https) to the captured string value. Note that as a safety check, if the captured string value is already a valid GURL AND prepend_scheme is specified, then we error out. Once we've prepended the scheme, re-try parsing the captured string value as a GURL. prepend_scheme helps us capture the case in AMP cache URLs where the scheme is not specified in the original URL and would thus never be parsed into a valid GURL.
  7. NOTE: There should be support for a new, optional pref key in debounce.json. If it exists, its value should be a user preference supported by brave-core, e.g. "pref" : "brave.de_amp.enabled" (as specified in https://github.com/brave/brave-core/blob/master/components/de_amp/common/pref_names.cc#L12).
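The steps above can be sketched roughly as follows. This is a Python approximation, not the brave-core implementation: `apply_regex_path_rule` and `is_valid_url` are hypothetical names, `is_valid_url` is only a crude stand-in for GURL validity (http or https scheme plus a dotted host), and the use of `re.search` and the order of unescaping versus scheme prepending are assumptions.

```python
import re
from typing import Optional
from urllib.parse import unquote, urlsplit


def is_valid_url(candidate: str) -> bool:
    """Crude stand-in for a GURL validity check (assumption, not GURL)."""
    try:
        parts = urlsplit(candidate)
        host = parts.hostname or ""
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and "." in host


def apply_regex_path_rule(
    url: str, param: str, prepend_scheme: Optional[str] = None
) -> Optional[str]:
    """Return the debounced target URL, or None if the rule does not apply."""
    try:
        pattern = re.compile(param)
    except re.error:
        return None  # malformed regex: the rule fails
    if pattern.groups != 1:
        return None  # the param must have exactly one capture group
    # The regex is evaluated against the original URL's path only.
    match = pattern.search(urlsplit(url).path)
    if match is None or match.group(1) is None:
        return None
    target = unquote(match.group(1))  # URL-unescape the captured value
    if prepend_scheme is not None:
        # Safety check: error out if the capture is already a valid URL.
        if prepend_scheme not in ("http", "https") or is_valid_url(target):
            return None
        target = f"{prepend_scheme}://{target}"
    # Re-try validation after any scheme was prepended.
    return target if is_valid_url(target) else None
```

For example, under these assumptions `apply_regex_path_rule("https://test.com/brave.com/", "^/(.*)$", "https")` yields `https://brave.com/`, while the same call on `https://test.com/https://brave.com/` yields `None` because the captured value is already a valid URL.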

debounce rules file

debounce.json is the file that contains all the debounce rules. It's located locally at:

  1. ~/Library/Application Support/BraveSoftware/Brave-Browser-Nightly/afalakplffnnnlkncjhbmahjfjhmlkal/[latest version]/1/debounce.json if you're on a Mac,
  2. /c/Users/.../AppData/Local/BraveSoftware/Brave-Browser-Nightly/User Data/afalakplffnnnlkncjhbmahjfjhmlkal/[latest version]/1/debounce.json on Windows,
  3. ~/.config/BraveSoftware/Brave-Browser-Nightly/afalakplffnnnlkncjhbmahjfjhmlkal/[latest version]/1/debounce.json on Linux.
  4. Not sure where the file lives on iOS: /ios/debounce.json? @cuba
  5. Not sure if debouncing was even tested on Android: Incorporate "debouncing" lists as part of top-level domain blocking #15090
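
Since the path contains a [latest version] directory, a small helper can pick the newest copy of the file before editing it. The sketch below assumes that version directories have dotted numeric names (for example 1.0.10), which is an assumption about the component layout; `find_latest_debounce_json` and `version_key` are hypothetical helper names.

```python
from pathlib import Path
from typing import Optional, Tuple


def version_key(name: str) -> Tuple[int, ...]:
    """Turn a dotted version such as '1.0.10' into a sortable tuple."""
    return tuple(int(part) for part in name.split("."))


def find_latest_debounce_json(component_dir: Path) -> Optional[Path]:
    """Return <component_dir>/<latest version>/1/debounce.json, if present."""
    best_key = None
    best_path = None
    for child in component_dir.iterdir():
        rules = child / "1" / "debounce.json"
        if not rules.is_file():
            continue
        try:
            key = version_key(child.name)
        except ValueError:
            continue  # skip directories that are not dotted numeric versions
        if best_key is None or key > best_key:
            best_key, best_path = key, rules
    return best_path
```

The component directory (the one named afalakplffnnnlkncjhbmahjfjhmlkal in the paths above) is passed in explicitly, because its parent differs per platform and per user.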

The file is updated as part of the Brave Local Data Updater component. After editing it you need to restart your browser; make sure you're editing the latest version of the file. It looks something like this:

[
  {
    "include": [
      "*://out.reddit.com/*"
    ],
    "exclude": [
    ],
    "action": "redirect",
    "param": "url"
  },
...
]

After the changes mentioned above, the following rule would be supported:

[
  {
    "include": [
      "*://brave.com/*"
    ],
    "pref": "brave.de_amp.enabled",
    "exclude": [
    ],
    "prepend_scheme": "https",
    "action": "regex-path",
    "param": "^/(.*)$"
  },
...
]

With this rule, the URL https://brave.com/https://braveattentiontoken.com would be debounced to https://braveattentiontoken.com if the user preference brave.de_amp.enabled (the De-AMP pref) is switched on. Note that https://brave.com/xyz would not be debounced despite matching the param pattern because xyz is not a valid URL even after prepending https scheme to it.
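
The preference gating can be sketched separately from the regex evaluation. This is a minimal illustration, assuming user preferences are available as a plain name-to-boolean mapping; `rule_is_enabled` is a hypothetical helper, not a brave-core function. A rule without a pref key is always active, and a rule naming an unknown preference never applies.

```python
from typing import Mapping


def rule_is_enabled(rule: Mapping[str, object], prefs: Mapping[str, bool]) -> bool:
    """Decide whether a debounce rule is active given the user's preferences."""
    pref_name = rule.get("pref")
    if pref_name is None:
        return True  # no pref specified: the rule applies by default
    # An unknown preference is treated as off, so the rule does not apply.
    return bool(prefs.get(pref_name, False))


rule = {"action": "regex-path", "param": "^/(.*)$", "pref": "brave.de_amp.enabled"}
print(rule_is_enabled(rule, {"brave.de_amp.enabled": True}))
print(rule_is_enabled(rule, {"brave.de_amp.enabled": False}))
```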

Test Cases

Note: if the debouncing flag is off in brave://flags, none of these should work.

@fmarier added some test rules to the production debounce.json file, so the tests on https://dev-pages.brave.software/navigation-tracking/debouncing.html should work even without modifying the debounce.json file.


In order to comprehensively test this feature, we'll need to make edits to the local debounce.json file as mentioned above. Given that this is an enhancement of the debouncing feature, it would be good to add these test case rules to the existing debounce.json file and make sure to regression test the feature using https://dev-pages.brave.software/navigation-tracking/debouncing.html.

  1. If action is not regex-path, and param is a regex, fail i.e. don't debounce based on that rule but other rules work. Go to https://brave.com/blog/, should not get debounced. Sample rule.
  2. param is a malformed regex, fail. Go to https://brave.com/blog/, should not get debounced. Sample rule.
  3. param captures no strings, fail. Go to https://test.com/https://brave.com, should not load (because it doesn't exist). Sample rule. Now, replace the param with ^/(.*)$ - https://test.com/https://brave.com should now redirect to https://brave.com.
  4. param captures > 1 strings, fail. Go to https://test.com/https://brave.com, should not get debounced. Sample rule.
  5. param captures a string that is not a URL, fail. Go to https://test.com/brave.com, should not get debounced. Sample rule.
  6. pref is a pref that does not exist, fail. Go to https://test.com/https://brave.com/, should not load. Now, change pref to brave.de_amp.enabled and make sure De-AMP is enabled in brave://settings/shields. Now https://test.com/https://brave.com/ redirects to https://brave.com. Sample rule.
  7. pref is not specified (should work by default), pass.
  8. pref is a valid pref, should correctly gate rule application (both true and false). Have pref be brave.de_amp.enabled and toggle the De-AMP setting off and on and make sure we only debounce https://test.com/https://brave.com/ when it's on. Sample rule.
  9. prepend_scheme is specified and original URL is a valid URL without the scheme. Success. Go to https://test.com/brave.com/ - should redirect to https://brave.com. Sample rule.
  10. prepend_scheme is specified and original URL is a valid URL with the scheme. Failure. Go to https://test.com/https://brave.com/ - should not load. Sample rule.

ShivanKaul added the OS/Android and OS/Desktop labels May 28, 2022
ShivanKaul added this to the 1.41.x - Nightly milestone May 28, 2022
ShivanKaul self-assigned this May 28, 2022
ShivanKaul added the privacy/debounce label May 28, 2022
ShivanKaul changed the title from "[Debounce] Add ability to debounce based on a regex" to "Add ability to debounce URLs based on a regex applied to the path" Jun 9, 2022
ShivanKaul added the enhancement, OS/iOS, and OS/Android labels and removed the OS/Android label Jun 17, 2022

@LaurenWags
Member

Marking QA/Blocked pending additional information re: Test Plan since modifying files is not possible on mobile

cc @kjozwiak

@ShivanKaul
Collaborator Author

@pilgrim-brave does debouncing work on mobile?
@LaurenWags @kjozwiak could we proceed with Desktop QA first?

@pes10k
Contributor

pes10k commented Jul 12, 2022

@LaurenWags @kjozwiak you should be able to use the third test here: https://dev-pages.brave.software/navigation-tracking/debouncing.html

@LaurenWags
Member

@pes10k @ShivanKaul is the test that @pes10k mentions via #23121 (comment) sufficient?

There's a lot mentioned under Test Cases from #23121 (comment) - do these need to be run as well?

@pes10k
Contributor

pes10k commented Jul 13, 2022

@LaurenWags i think just the one in #23121 (comment) should be sufficient

@LaurenWags
Member

@pes10k hm, is this what's expected for the 3rd test (Regex) on https://dev-pages.brave.software/navigation-tracking/debouncing.html ?

Screen Shot 2022-07-14 at 10 42 29 AM

@pes10k
Contributor

pes10k commented Jul 14, 2022

@LaurenWags it looks like you have a cached, old version of the test page (maybe?). I can't think of how you'd get that error message otherwise, and I just tested and got the expected result.

Could you clear caches and try again?

Screen Shot 2022-07-14 at 12 20 06

@LaurenWags
Member

@pes10k I was using a fresh profile, however I hadn't restarted so I didn't pull griffin seed until next launch.

I assume we're controlling this via griffin, so what I saw makes sense? Meaning, what I saw was the "Disabled" or N/A (error) case?

@pes10k
Contributor

pes10k commented Jul 17, 2022

Yep! What you saw makes sense :)

@LaurenWags
Member

Verified using

Brave | 1.42.68 Chromium: 103.0.5060.114 (Official Build) beta (x86_64)
-- | --
Revision | a1c2360c5b02a6d4d6ab33796ad8a268a6128226-refs/branch-heads/5060@{#1124}
OS | macOS Version 12.4 (Build 21F79)

Disabled case

  1. With a fresh profile, launched above version
  2. Be sure not to restart so griffin info is not pulled
  3. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  4. Run the Regex test
  5. Get "error" condition as below:

disabled - error

Enabled case

  1. With a fresh profile, launched above version
  2. Close and relaunch so as to pull griffin seed
  3. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  4. Run the Regex test
  5. "Observed Referrer" matches the "Enabled" column value for Regex:

enabled

@kjozwiak
Member

kjozwiak commented Jul 18, 2022

@pes10k @ShivanKaul as per the above, it sounds like running through https://dev-pages.brave.software/navigation-tracking/debouncing.html should be good enough for Android as well?

As @LaurenWags mentioned above, we don't have the ability to edit files on Android due to the encryption enclaves unless the device has been rooted.

> @pes10k I was using a fresh profile, however I hadn't restarted so I didn't pull griffin seed until next launch.
>
> Assuming we're controlling this via griffin so what I saw makes sense? meaning, what I saw was the "Disabled" or N/A (error) case?

yup, controlled using BraveDebounceStudy:Enabled as per brave/brave-variations#195.

@pes10k
Contributor

pes10k commented Jul 18, 2022

yep! That should work just fine

@kjozwiak
Member

kjozwiak commented Jul 18, 2022

Verification PASSED on Win 11 x64 using the following build(s):

Brave | 1.42.68 Chromium: 103.0.5060.114 (Official Build) beta (64-bit)
-- | --
Revision | a1c2360c5b02a6d4d6ab33796ad8a268a6128226-refs/branch-heads/5060@{#1124}
OS | Windows 11 Version 21H2 (Build 22000.795)

Disabled case

  1. With a fresh profile, launched above version
  2. Be sure not to restart so griffin info is not pulled
  3. Check brave://version and ensure that BraveDebounceStudy:Enabled isn't present/used
  4. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  5. Run the Regex test
  6. Get "error" condition as below:

image

Enabled case

  1. With a fresh profile, launched above version
  2. Close and relaunch so as to pull griffin seed (ensure that BraveDebounceStudy:Enabled is visible)
  3. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  4. Run the Regex test
  5. "Observed Referrer" matches the "Enabled" column value for Regex:
Example Example
image image

Verification PASSED on PopOS 22.04 x64 using the following build(s):

Brave | 1.42.68 Chromium: 103.0.5060.114 (Official Build) beta (64-bit)
--- |---
Revision | a1c2360c5b02a6d4d6ab33796ad8a268a6128226-refs/branch-heads/5060@{#1124}
OS | Linux

Disabled case

  1. With a fresh profile, launched above version
  2. Be sure not to restart so griffin info is not pulled
  3. Check brave://version and ensure that BraveDebounceStudy:Enabled isn't present/used
  4. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  5. Run the Regex test
  6. Get "error" condition as below:

image

Enabled case

  1. With a fresh profile, launched above version
  2. Close and relaunch so as to pull griffin seed (ensure that BraveDebounceStudy:Enabled is visible)
  3. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  4. Run the Regex test
  5. "Observed Referrer" matches the "Enabled" column value for Regex:
Example Example
image image

@kjozwiak
Member

kjozwiak commented Jul 18, 2022

Verification PASSED on Pixel 6 using the following build(s):

Brave | 1.42.67 Chromium: 103.0.5060.114 (Official Build) beta (32-bit)
--- | ---
Revision | a1c2360c5b02a6d4d6ab33796ad8a268a6128226-refs/branch-heads/5060@{#1124}
OS | Android 13

Disabled case

  1. With a fresh profile, launched above version
  2. Be sure not to restart so griffin info is not pulled
  3. Check brave://version and ensure that BraveDebounceStudy:Enabled isn't present/used
  4. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  5. Run the Regex test
  6. Get "error" condition as below:

Screenshot_20220718-164255

Enabled case

  1. With a fresh profile, launched above version
  2. Close and relaunch so as to pull griffin seed (ensure that BraveDebounceStudy:Enabled is visible)
  3. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  4. Run the Regex test
  5. "Observed Referrer" matches the "Enabled" column value for Regex:
Example Example
Screenshot_20220718-164926 Screenshot_20220718-164309

Verification PASSED on Samsung S8 Tab Ultra using the following build(s):

Brave | 1.42.67 Chromium: 103.0.5060.114 (Official Build) beta (32-bit)
--- | ---
Revision | a1c2360c5b02a6d4d6ab33796ad8a268a6128226-refs/branch-heads/5060@{#1124}
OS | Android 12; Build/SP1A.210812.016

Disabled case

  1. With a fresh profile, launched above version
  2. Be sure not to restart so griffin info is not pulled
  3. Check brave://version and ensure that BraveDebounceStudy:Enabled isn't present/used
  4. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  5. Run the Regex test
  6. Get "error" condition as below:

Screenshot_20220718-171653_Brave - Beta

Enabled case

  1. With a fresh profile, launched above version
  2. Close and relaunch so as to pull griffin seed (ensure that BraveDebounceStudy:Enabled is visible)
  3. Visit https://dev-pages.brave.software/navigation-tracking/debouncing.html
  4. Run the Regex test
  5. "Observed Referrer" matches the "Enabled" column value for Regex:
Example Example
Screenshot_20220718-171732_Brave - Beta Screenshot_20220718-171706_Brave - Beta
