Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: replace safe browsing with web risk #2305

Merged
merged 2 commits into from
Mar 21, 2024
Merged

chore: replace safe browsing with web risk #2305

merged 2 commits into from
Mar 21, 2024

Conversation

gweiying
Copy link
Contributor

@gweiying gweiying commented Mar 21, 2024

Context

Replaces Google Safe Browsing with Google Web Risk API, which has no daily quotas on number of links scanned.

Notes

  • There seems to be minor differences in the threat types detected by Web Risk and Safe Browsing.
    • Web Risk: MALWARE SOCIAL_ENGINEERING UNWANTED_SOFTWARE SOCIAL_ENGINEERING_EXTENDED_COVERAGE
    • Safe Browsing: MALWARE SOCIAL_ENGINEERING UNWANTED_SOFTWARE POTENTIALLY_HARMFUL_APPLICATION
    • Checking w the Google team right now whether this will cause any adverse effect /regression
  • Required abit of refactoring as the return types for Safe Browsing is different from Web Risk
    Safe Browsing return:
    // pass in list of urls, 
    // if threats found, returns list of matched threats
    {"matches": [{  "threatType":  "MALWARE", "platformType":  "WINDOWS", "threatEntryType": "URL", "threat": {"url": "http://www.urltocheck1.org/"}, "threatEntryMetadata": { "entries": [{ "key": "malware_threat_type", "value": "landing" }]}, "cacheDuration": "300.000s"}]}
    // if no threats 
    {}
    
    Web Risk return:
    // only single lookup endpoint available
    // if threat found, returns threat as object
    { "threat": { "threatTypes": ["MALWARE"], "expireTime": "2024-03-20T05:29:41.898456500Z"} }
    // if no threats
    {}
    

Tests

Tests on staging

  • Try creating short link to malicious url, should see error "Link is likely malicious, please contact us for help"
  • Create short link to safe url, should be able to create short link
  • [API] Create short link to safe url, should be able to create short link
  • [API] Try creating short link to malicious url, should see error "Link is likely malicious, please contact us for help"
  • [Bulk] Try creating short links with bulk upload of 500 links, all safe, should be able to bulk create
  • [Bulk] Try creating short links with bulk upload of 500 links, at least one malicious link, should not be able to bulk create
  • See usage on Web Risk API on Google console

Deploy Notes

Before deploying,

  • Activate Web Risk API on edu prod
  • Activate Web Risk API on go prod
  • Activate Web Risk API on health prod

After deployment, tests on production, 3 environments

  • Try creating short link to malicious url, should see error "Link is likely malicious, please contact us for help"
  • Create short link to safe url, should be able to create short link
  • [API] Create short link to safe url, should be able to create short link
  • [API] Try creating short link to malicious url, should see error "Link is likely malicious, please contact us for help"
  • [Bulk] Try creating short links with bulk upload of 500 links, all safe, should be able to bulk create
  • [Bulk] Try creating short links with bulk upload of 500 links, at least one malicious link, should not be able to bulk

New environment variables:
None, reusing existing Safe Browsing API keys

@gweiying gweiying marked this pull request as ready for review March 21, 2024 06:56
Copy link
Contributor

@halfwhole halfwhole left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

  • agree that it looks ok to just set the cache time to 300s, no real need to follow the given expireTime so closely
  • minor question, non-blocking: SOCIAL_ENGINEERING_EXTENDED_COVERAGE wasn't there before and so it looks like we don't need to add it right away, but i'm wondering if it's good to add in eventually?
  • previously isThreatBulk could handle bulk requests all at once (up to 500), but now it sends individual requests for each URL, should we test that bulk uploads are okay too?
  • in the future, refactoring to use "web risk" instead of "safe browsing" would be nice!

@gweiying
Copy link
Contributor Author

LGTM!

* agree that it looks ok to just set the cache time to 300s, no real need to follow the given `expireTime` so closely

* minor question, non-blocking: `SOCIAL_ENGINEERING_EXTENDED_COVERAGE` wasn't there before and so it looks like we don't need to add it right away, but i'm wondering if it's good to add in eventually?

* previously `isThreatBulk` could handle bulk requests all at once (up to 500), but now it sends individual requests for each URL, should we test that bulk uploads are okay too?

* in the future, refactoring to use "web risk" instead of "safe browsing" would be nice!

Good points! Thanks for the quick review.

Added in SOCIAL_ENGINEERING_EXTENDED_COVERAGE as part of the threat types covered, I agree that it's good to add and err on the safe side!

Also ran tests on bulk upload and updated the test list above, thanks for flagging out.

@gweiying gweiying requested a review from halfwhole March 21, 2024 08:07
@gweiying gweiying merged commit 1e1f53b into develop Mar 21, 2024
20 checks passed
@gweiying gweiying deleted the chore/web-risk branch March 21, 2024 08:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants