Add links to company websites for breach resolution #2961

flozia · 2023-03-31T15:01:05Z

References:

Jira: MNTOR-1504

Description

Adds a link to the breached company's website to the breach resolution step when passwords or security questions were exposed.

Screenshot

How to test

Log in with an account, or add an email address, that was exposed in a breach.
Visit /user/breaches
If the breach includes passwords or security questions, the associated resolution step should contain a link to the company website.

Checklist (Definition of Done)

Localization strings (if needed) have been added.
Commits in this PR are minimal and have descriptive commit messages.
I've added or updated the relevant sections in readme and/or code comments.
I've added a unit test to test for potential regressions of this bug.
Product Owner accepted the User Story (demo of functionality completed) or waived the privilege.
All acceptance criteria are met.
Jira ticket has been updated (if needed) to match changes made during the development process.
Jira ticket has been updated (if needed) with suggestions for QA when this PR is deployed to stage.

Vinnl

Left one blocking comment, unfortunately, though probably fairly easy to resolve :)

src/utils/breach-resolution.js

pdehaan · 2023-03-31T22:22:45Z

Not sure if I mentioned this in another thread, but there is a small-ish risk of linking to external sites. A lot of the domains via HIBP are no longer resolving.

I looked at one point and think my scripts were guessing about 25% of the outbound domains were invalid.
I had a few minutes so I ended up rebuilding my tool: https://github.com/pdehaan/blurts-https-stats which conveniently pipes the report out to README.md if you just want a quick scroll. Or, the raw "alive"-vs-"dead" JSON can be found in https://github.com/pdehaan/blurts-https-stats/blob/main/checker.json.

Some very rough stats, if you like grepping JSON files:

"link": occurs 624× (which accounts for empty domains or duplicated domains)
"status": "dead" occurs 198× (31.7%)
"status": "alive" occurs 426× (68.3%)

BUT the misleading thing is that a lot of those ~31% of links reported as dead still work; the npm module is just reporting bad HTTPS certificates as broken (or getting caught by Cloudflare captchas).
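The counts above can also be reproduced without grep. The following is a minimal sketch, assuming the report is an array of objects with a `status` field set to "alive" or "dead"; the exact shape of checker.json is an assumption, the sample entries are made up, and `summarizeStatuses` is a hypothetical helper name.

```javascript
// Hypothetical sample in the ASSUMED shape of the checker.json entries.
const sampleReport = [
  { link: 'https://example.com', status: 'alive' },
  { link: 'https://example.org', status: 'dead' },
  { link: 'https://example.net', status: 'alive' },
  { link: 'https://example.edu', status: 'alive' }
]

// Count entries per status and compute each status's share of the total.
function summarizeStatuses (entries) {
  const counts = {}
  for (const { status } of entries) {
    counts[status] = (counts[status] ?? 0) + 1
  }
  const total = entries.length
  const percentages = Object.fromEntries(
    Object.entries(counts).map(([status, n]) => [status, (100 * n / total).toFixed(1)])
  )
  return { total, counts, percentages }
}

console.log(summarizeStatuses(sampleReport))
```

Note that this only tallies what the checker reported; it does not correct for the false positives described above.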

PROTIP: If you want to play a fun game, we could WHOIS all the broken domains and see which domains are available to register, then pick them all up and redirect them to https://monitor.firefox.com.

flodolo

Requesting changes to make sure the rationale is clarified.

locales/en/breaches.ftl

flozia · 2023-04-03T11:12:20Z

That’s interesting. Thank you so much for the tool, @pdehaan, and for listing the stats! As @flodolo mentioned, there has been some back and forth on whether we want to show those links. We communicated the risks to PM, but I’m raising this again to make sure we are aligned.

pdehaan · 2023-04-03T15:40:05Z

@flozia To be clear, the broken links are already on the site on the details page, e.g. https://monitor.firefox.com/breach-details/LeakedReality
If I click on the domain near the top I see a "Hmm. We’re having trouble finding that site." error page in Firefox Nightly. A WHOIS search shows the domain is registered, but maybe DNS just isn't set up anymore. Even curl shows an error trying to inspect the page:

curl -fL https://leakedreality.com # HTTPS
curl: (6) Could not resolve host: leakedreality.com

curl -fL http://leakedreality.com # try HTTP
curl: (6) Could not resolve host: leakedreality.com

echo $? # 6

… but I can see how adding a new [broken] CTA link would be frustrating and confusing to end users.

It took a few clicks, but https://monitor.firefox.com/breach-details/Abandonia2022 is another example. Clicking the domain link on the details page takes me to a "Unable to connect. An error occurred during a connection to abandonia.com." error page in Nightly. Changing https:// to vanilla http:// fixes the broken link, but that is probably not intuitive for users.

Or https://monitor.firefox.com/breach-details/GGCorp external domain link takes me to a scary error page (presumably because of a bad/invalid HTTPS cert):

Warning: Potential Security Risk Ahead
Nightly detected a potential security threat and did not continue to ggcorp.me. If you visit this site, attackers could try to steal information like your passwords, emails, or credit card details.

mansaj · 2023-04-03T15:55:16Z

@pdehaan Looking at the JSON generated by the script, I noticed that lots of the sites marked as “dead” / 403s are accessible. Are these false positives?

pdehaan · 2023-04-03T16:22:11Z

@mansaj #2961 (comment) "I noticed that lots of the sites marked as “dead” / 403s are accessible? Are these false positives?"

I think the link-check module I used might be treating certificate errors as "dead" links. So some sites were reported as dead, but clicking them showed me the website, with some console logging and/or certificate issues that most people wouldn't really notice. Per tcort/link-check#47 (comment), there might also be some 403 issues if sites are behind Cloudflare or other proxy services.

Per link-check README:

A link is said to be 'alive' if an HTTP HEAD or HTTP GET for the given URL eventually ends in a 200 OK response. To minimize bandwidth, an HTTP HEAD is performed. If that fails (e.g. with a 405 Method Not Allowed), an HTTP GET is performed. Redirects are followed.
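To make the quoted rule concrete, here is a rough sketch of a HEAD-then-GET liveness check. This is not link-check's actual implementation; `isAlive` and its injectable `fetchImpl` parameter are hypothetical names, and the default assumes the global `fetch` available in Node.js 18+.

```javascript
// Sketch of the HEAD-then-GET strategy described in the README excerpt.
// `fetchImpl` is injectable so the logic can be exercised without network
// access; it defaults to the global fetch (an assumption: Node.js 18+).
async function isAlive (url, fetchImpl = fetch) {
  for (const method of ['HEAD', 'GET']) {
    try {
      // Redirects are followed; only a final 200 counts as alive.
      const response = await fetchImpl(url, { method, redirect: 'follow' })
      if (response.status === 200) return true
    } catch {
      // DNS, connection, and TLS errors land here; try the next method.
    }
  }
  return false
}
```

Under this rule a site with an invalid certificate or a 403 from a proxy is reported as dead even when a browser would still render it, which is consistent with the false positives discussed in this thread.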

GitHub search showed this as one of the very few results for "dead" in their codebase: https://github.com/tcort/link-check/blob/430027f6be03b904db508042945bf2a75472d330/lib/LinkCheckResult.js#L10

I can't think of another great solution. I tried using curl or native fetch instead, but neither was great; both were super slow (these scripts take 10-20 minutes to scrape 640 domains w/ 10s timeouts). Another option might be using something like Playwright and trying to click the link from the /breach-details/* page and then taking a screenshot if it's a non-2xx response. If the screenshot shows something maybe we don't care. But if we get no response/bytes/pixels, maybe we consider that a bad link. Although I imagine that's still prone to Cloudflare sandboxes and other gotchas.
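On the speed issue: if every one of 640 domains hit a 10 s timeout and the checks ran one at a time, the worst case would be 640 × 10 s = 6400 s, or about 107 minutes. Running a bounded number of checks in parallel cuts that down; with 20 in flight the same worst case is 32 rounds × 10 s = 320 s. The sketch below shows one way to do that. `checkAll` and `checkDomain` are hypothetical names, the limit of 20 is arbitrary, and `AbortSignal.timeout` assumes Node.js 18+. It is not how the linked scripts work.

```javascript
// Run `check(item)` over all items with at most `limit` checks in flight.
// Results are returned in input order; a failed check yields { error }.
async function checkAll (items, check, limit = 20) {
  const results = new Array(items.length)
  let next = 0
  async function worker () {
    while (next < items.length) {
      const index = next++
      try {
        results[index] = await check(items[index])
      } catch (error) {
        results[index] = { error: String(error) }
      }
    }
  }
  const workers = Array.from({ length: Math.min(limit, items.length) }, worker)
  await Promise.all(workers)
  return results
}

// Hypothetical checker: a plain GET that gives up after 10 seconds.
async function checkDomain (domain) {
  const response = await fetch(`https://${domain}`, { signal: AbortSignal.timeout(10_000) })
  return { domain, status: response.status }
}
```

A call such as `checkAll(domains, checkDomain)` would then return one result per domain. Parallelism does nothing about Cloudflare challenges or certificate errors, so the accuracy caveats still apply.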

flozia · 2023-04-18T13:44:57Z

Thanks for your input, everyone! After this PR was on hold while aligning with PM and UX, I’m moving this out of “draft” mode.

1. We decided to compile a first blocklist from @pdehaan’s https://github.com/pdehaan/blurts-domains-playwright/blob/main/stats-https.json. We will use the env variable HIBP_BREACH_LINK_BLOCKLIST, which will list domains that do not respond with a 200 status.
2. @flodolo The strings were updated and include b tags and a marker breached-company-link that can be replaced by a link, or stripped if it is a link we do not want to show.
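As an illustration of the blocklist idea, here is a minimal sketch that parses a comma-separated value and checks a breach domain against it by exact match. `parseBlocklist` and `shouldShowLink` are hypothetical names, the sample domains are placeholders, and the real code in this PR may differ; the raw string is passed in as an argument rather than read from the environment.

```javascript
// Parse a comma-separated value into a set of lowercase domains.
function parseBlocklist (rawValue) {
  return new Set(
    (rawValue ?? '')
      .split(',')
      .map((domain) => domain.trim().toLowerCase())
      .filter((domain) => domain.length > 0)
  )
}

// Show a link only when the breach has a domain and that domain is not
// on the blocklist. Matching is exact, so subdomains are not covered.
function shouldShowLink (breachDomain, blocklist) {
  if (!breachDomain) return false
  return !blocklist.has(breachDomain.trim().toLowerCase())
}

const blocklist = parseBlocklist('a-blocked-domain.com, another-blocked-domain.org')
console.log(shouldShowLink('example.com', blocklist)) // true
console.log(shouldShowLink('a-blocked-domain.com', blocklist)) // false
```

Exact matching avoids accidentally blocking an unrelated domain that merely contains a blocked one as a substring.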

flodolo · 2023-04-18T13:54:34Z

2. @flodolo The strings were updated and include b tags and a marker breached-company-link that can be replaced by a link — or stripped if it is one we do not want to show.

When you say stripped, you mean that the markup is removed, but the text remains? I.e., it becomes inactive text?

flodolo

(never mind my previous comment, the code is clear enough)
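For readers without the diff at hand, the two outcomes discussed here (turning the marker into a link, or stripping the marker and keeping its text) can be illustrated with a short sketch. `applyCompanyLink` is a hypothetical helper, the example string is invented rather than taken from locales/en/breaches.ftl, and `href` is assumed to be a trusted, already-validated URL.

```javascript
// Replace the <breached-company-link> marker with an anchor, or strip the
// marker and keep its inner text when no link should be shown.
// ASSUMPTION: `href` is trusted; no HTML escaping is done in this sketch.
function applyCompanyLink (localizedString, href) {
  const open = href ? `<a href="${href}" target="_blank" rel="noopener noreferrer">` : ''
  const close = href ? '</a>' : ''
  return localizedString
    .replaceAll('<breached-company-link>', open)
    .replaceAll('</breached-company-link>', close)
}

// Invented example string, for illustration only.
const example = 'Change your password on <breached-company-link>the company website</breached-company-link>.'

console.log(applyCompanyLink(example, 'https://example.com'))
console.log(applyCompanyLink(example, ''))
```

With an empty `href`, the second call returns the sentence with plain, inactive text in place of the link.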

Vinnl

Only easy fixes or things that can be ignored, I think :)

src/utils/breachResolution.js

Vinnl · 2023-04-18T15:13:25Z

src/utils/breachResolution.js

+    case BreachDataTypes.Passwords:
+    case BreachDataTypes.SecurityQuestions: {


Personal preference, so feel free to ignore, but I'd keep the code simple here and just replace <breached-company-link> in every string, rather than just the passwords and security questions resolutions.

No hard preference from my side: I thought being a bit more specific here was OK since we were cautious about how and where we link out to. Less complex is also a good thing, though: d574bee.

Vinnl · 2023-04-18T15:19:36Z

src/utils/breachResolution.test.js

+  // There should be a resolution for `BreachDataTypes.Phone`,
+  // `BreachDataTypes.Passwords` and `BreachDataTypes.SecurityQuestions`.
+  // The last two should fallback to a more generic header string that does not
+  // include the breached company's domain, which we don't know:


I don't know how easy it is to mock AppConstants, but if it's easy (but only then - you've been working on this PR for long enough), maybe a test for the blocklist would be a good addition?

Good point! I created an issue for this and will address it in a follow-up so as not to block this PR.
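One way such a blocklist test could look, sketched under the assumption that the helper receives its constants as a parameter instead of importing a frozen module. `getBreachLink` and `check` are hypothetical, and the follow-up work referenced in this thread may take a different approach.

```javascript
// Hypothetical helper under test: the constants object is passed in, so a
// test can supply its own blocklist without mocking a module.
function getBreachLink (breach, constants) {
  const blocked = (constants.HIBP_BREACH_DOMAIN_BLOCKLIST ?? '')
    .split(',')
    .map((domain) => domain.trim())
  const showLink = Boolean(breach.Domain) && !blocked.includes(breach.Domain)
  return showLink ? `https://${breach.Domain}` : ''
}

// Plain checks; no particular test framework is assumed.
function check (condition, message) {
  if (!condition) throw new Error(message)
}

const testConstants = {
  HIBP_BREACH_DOMAIN_BLOCKLIST: 'a-blocked-domain.com,another-blocked-domain.org'
}

check(getBreachLink({ Domain: 'example.com' }, testConstants) === 'https://example.com', 'allowed domain gets a link')
check(getBreachLink({ Domain: 'a-blocked-domain.com' }, testConstants) === '', 'blocked domain gets no link')
check(getBreachLink({ Domain: '' }, testConstants) === '', 'missing domain gets no link')
```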


pdehaan · 2023-04-18T16:16:36Z

.env-dist

@@ -58,6 +58,8 @@ HIBP_THROTTLE_DELAY=2000
 HIBP_THROTTLE_MAX_TRIES=5
 # Authorization token for HIBP to present to /hibp/notify endpoint
 HIBP_NOTIFY_TOKEN=unsafe-default-token-for-dev
+# Domains we prefer to not link to
+HIBP_BREACH_DOMAIN_BLOCKLIST=a-blocked-domain.com,another-blocked-domain.org


Is there a limit on how long an ENV var can be?
It feels like this could get REALLY long if we end up blocking more than 20-30 domains.

According to SRE, there is no limit that we would hit.

I'm wondering if it's better as an ENV var that has to be coordinated w/ SRE, versus maybe some JSON file that lives in the repo that is a single source of truth that we can audit occasionally. (unless there are reasons to keep it as a secret/ENV). 🤷
Although I guess I could technically recreate the blocklist locally by scraping the 650 breaches on the Monitor site and see if the outbound link is a link or not.

I’m not sure about that either, but one good argument for keeping the list in the env is that we can make adjustments without a release, especially in the beginning, when we might need to test and audit the sites manually.

pdehaan · 2023-04-18T16:19:59Z

src/utils/breachResolution.js

      const args = {
        companyName: b.Name,
+        breachedCompanyLink: showLink ? `https://${b.Domain}` : '',


aside: in my personal testing, I think http:// had better results than https:// (somewhere between 5-10% more 2xx/3xx results).

Interesting, thanks for the note. Since we are trying to be cautious about where we link out to, I think I’d feel more comfortable linking out to https://.

flozia · 2023-04-19T08:14:35Z

(never mind my previous comment, the code is clear enough)

Thank you for your renewed review @flodolo. Unfortunately, there has been another change to the strings with two additions: A note on using 2FA and the link to Firefox Password Manager — sorry to burn your cycles on these.

flozia requested a review from flodolo as a code owner March 31, 2023 15:01

flozia force-pushed the MNTOR-1504-Add-links-to-websites branch from 6eaf2e5 to 858f19f Compare March 31, 2023 15:01

chore: Add links to company websites for breach resolution

fdac319

flozia force-pushed the MNTOR-1504-Add-links-to-websites branch from 858f19f to fdac319 Compare March 31, 2023 15:06

flozia requested review from Vinnl and toufali March 31, 2023 15:10

Vinnl reviewed Mar 31, 2023

src/utils/breach-resolution.js Outdated

flozia self-assigned this Mar 31, 2023

flodolo requested changes Apr 1, 2023

locales/en/breaches.ftl Outdated

flozia marked this pull request as draft April 3, 2023 16:09

flozia added the needs-PM label Apr 3, 2023

flozia added 3 commits April 4, 2023 15:41

chore: Add list of links with their status generated by https://github.com/pdehaan/blurts-https-stats

b075643

chore: Add variable breach recommendation strings for passwords and securtity questions

8944a11

chore: Only show breached company links if they are not on our block list

e930adb

flozia force-pushed the MNTOR-1504-Add-links-to-websites branch from d7a1f84 to e930adb Compare April 5, 2023 14:41

flozia added 2 commits April 5, 2023 16:53

merge: main -> MNTOR-1504-Add-links-to-websites

58f3d2a

chore: Update tsconfig module target

f5d2da1

flodolo requested changes Apr 5, 2023

locales/en/breaches.ftl Outdated

flozia removed the needs-PM label Apr 5, 2023

flozia force-pushed the MNTOR-1504-Add-links-to-websites branch 3 times, most recently from 3e51d02 to a0290e5 Compare April 5, 2023 17:14

fix: Breach resolution test

1b25b2c

flozia force-pushed the MNTOR-1504-Add-links-to-websites branch from a0290e5 to 1b25b2c Compare April 5, 2023 17:23

chore: Use different string id for headers with links

aa00a5a

flozia added 4 commits April 18, 2023 11:44

Merge branch 'main' into MNTOR-1504-Add-links-to-websites

c0a14d0

chore: Update breach resolution strings for password and security question

2d1f4ca

fix: Update breach resolution test

5a26bbb

chore: Update comment

5b539ab

flozia marked this pull request as ready for review April 18, 2023 13:45

flozia requested review from Vinnl, pdehaan and flodolo April 18, 2023 13:45

flodolo approved these changes Apr 18, 2023

Vinnl approved these changes Apr 18, 2023

flozia and others added 7 commits April 18, 2023 17:26

chore: Split domain list and check for exact matches

3f7097b

Co-authored-by: Vincent <Vinnl@users.noreply.github.com>

chore: Remove async

2724dca

Co-authored-by: Vincent <Vinnl@users.noreply.github.com>

chore: Revert changes in tsconfig.json

d88ab70

Merge branch 'MNTOR-1504-Add-links-to-websites' of github.com:mozilla/blurts-server into MNTOR-1504-Add-links-to-websites

91dfced

chore: Revert changes in tsconfig.json

32f95cf

fix: Remove JSDoc for hideBreachLink

54ce192

chore: Remove unused strings

95f2f3e

pdehaan reviewed Apr 18, 2023

chore: Replace not only in passwords and security questions

d574bee

flozia mentioned this pull request Apr 18, 2023

Don’t freeze AppConstants for tests #2998

Merged

flozia added 3 commits April 18, 2023 22:42

Merge branch 'main' into MNTOR-1504-Add-links-to-websites

c1eb5ec

chore: Add notes on 2FA and Firefox Password Manager back in

e05ea1a

chore: Update breach resolution test

94152f5

flodolo approved these changes Apr 19, 2023

flozia merged commit 268df94 into main Apr 19, 2023
10 checks passed

flozia deleted the MNTOR-1504-Add-links-to-websites branch April 19, 2023 08:41

flozia mentioned this pull request Apr 19, 2023

Additional tests for blocklisted breach resolution links #3000

Merged
