Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[PS-1123] Improve hostname and domain retrieval #3168

Merged
merged 18 commits into from
Oct 24, 2022

Conversation

djsmith85
Copy link
Contributor

@djsmith85 djsmith85 commented Jul 24, 2022

Type of change

- [X] Bug fix
- [ ] New feature development
- [X] Tech debt (refactoring, code cleanup, dependency upgrades, etc)
- [ ] Build/deploy pipeline (DevOps)
- [ ] Other

Objective

Finally got around finishing up bitwarden/jslib#547

Improve the retrieval and validation of hostnames and domains, which affects url matching/auto-fill

The main change is replacing tldjs with tldts

Newly supported scenarios:

  • IPv6
  • subdomains with an underscore
  • Fixes a scenario where IPv4 addresses would not be returned as they got prefixed with http:// (Regex failed)
  • Fully up-to-date public-suffix list instead of relying on a regex only including the most common tld's (Utils.tldEndingRegex)
  • Support for parsing urls with umlauts
  • Better detection if a url is a website and if it's launchable from the UI (Removed Utils.tldEndingRegex).

Asana task: https://app.asana.com/0/1153292148278596/1201431275009049/f

Could possibly improve or fix the following issues:

Code changes

Improving hostname and domain detection/validation

  • Uninstall tldjs and install tldts as replacement

    • package.json:, apps/cli/package.json:
    • package-lock.json: , apps/cli/package-lock.json: Re-built using npm i
  • libs/common/spec/misc/utils.spec.ts: Added unit tests for Utils.getDomain and Utils.getHostname

  • libs/common/src/misc/utils.ts:

    • Replaced tldjs with tldts
    • Removed tldEndingRegex
    • Added check for about: protocol
    • Removed no longer needed IP4 regex
    • getHostname no longer uses Utils.getUrl internally but relies on tldts.getHostname
  • libs/common/src/misc/tldjs.noop.ts: Deleted tldjs.noop implementation

  • apps/desktop/tsconfig.json: Removed usage of tldjs.noop

  • apps/web/tsconfig.json: Removed usage of tldjs.noop

isWebsite/launch detection

  • libs/common/spec/view/loginUriView.spec.ts: Added unit tests for getters (isWebsite, launchUri, canLaunch) of loginUriView
  • libs/common/src/models/view/loginUriView.ts: Replaced usage of Utils.tldEndingRegex with improved Utils.getDomain

Fix accessibility-cookie

  • apps/desktop/src/app/accounts/accessibility-cookie.component.ts: With the removal of the tldjs.noop-impl which previously always returned a hostname instead of the domain when calling tldjs.getDomain(), the call to Utils.getDomain needed to get changed to Utils.getHostName().

Before you submit

- [X] I have checked for **linting** errors (`npm run lint`) (required)
- [X] I have added **unit tests** where it makes sense to do so (encouraged but not required)
- [ ] This change requires a **documentation update** (notify the documentation team)
- [ ] This change has particular **deployment requirements** (notify the DevOps team)

@djsmith85 djsmith85 requested review from eliykat and a team July 24, 2022 11:28
Copy link
Member

@MGibson1 MGibson1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a straight improvement. One question before approving, though

@@ -1,7 +0,0 @@
export function getDomain(host: string): string | null {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this no longer required?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say why this was initially introduced, but with this in place on web and desktop it would actually always return the hostname instead of the domain.

I just tested this with the 2fa reports on web and it actually produces more hits with my changes. As this now grabs the domain and checks it against the list from 2fa.directory. A hostname would not match with an entry.

MGibson1
MGibson1 previously approved these changes Sep 26, 2022
Copy link
Member

@MGibson1 MGibson1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good. I'm going to tag this with needs-qa, but it seems like an amorphous "please test autofill" to me, do you have any better qa notes than that?

@MGibson1 MGibson1 added the needs-qa Marks a PR as requiring QA approval label Sep 26, 2022
@djsmith85
Copy link
Contributor Author

@MGibson1 Thanks for the review, I'll add some more detailed notes for QA on the ticket.

@eliykat eliykat removed their request for review September 28, 2022 00:16
@djsmith85
Copy link
Contributor Author

Received approval from QA, just need a quick review of the 3 last merge commits pulling in master, before merging.

@djsmith85 djsmith85 requested a review from a team October 22, 2022 07:47
@djsmith85 djsmith85 removed the needs-qa Marks a PR as requiring QA approval label Oct 22, 2022
Copy link
Member

@justindbaur justindbaur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fantastic, love the tests. I would love to have a test defining the behavior of a file protocol (i.e file:///C:/something/form.pdf) going through getDomain and getHostname I'd imagine it would be null but mostly just want to define our expectation with something like that.

expect(Utils.getDomain("https://my_vault.bitwarden.com/")).toBe("bitwarden.com");
});

it("should support urls containing umlauts", () => {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TIL what that is called.

@djsmith85
Copy link
Contributor Author

Fantastic, love the tests. I would love to have a test defining the behavior of a file protocol (i.e file:///C:/something/form.pdf) going through getDomain and getHostname I'd imagine it would be null but mostly just want to define our expectation with something like that.

Good catch with the file-schema links. I had to add some exceptions to getHostname as they were returning C as host.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants