Conversation
May also fix bitwarden/clients#2177. Would need to check what method the Excluded Domains page calls during validation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work @djsmith85. See comments below. In answer to your discussion questions:
- let's deprecate tldjs and the old tldjs logic. We seem to have lots of test case coverage here, that lets us be aggressive in our refactoring. There's no point keeping around old stuff "just in case".
- let's try to get rid of the noop implementation by moving this to a separate class and only using it in the clients that require it. Utils is supposed to be shared by all Typescript clients, stubbing out dependencies with noop implementation is a bad pattern imo.
common/src/misc/utils.ts
Outdated
@@ -1,4 +1,5 @@ | |||
import * as tldjs from 'tldjs'; | |||
import * as tldts from 'tldts'; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to import it this way when it's a native Typescript project? We should be able to use the usual import format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not necessarily, but I did do it this way, as the getHostname
method clashes with ours in utils.ts
common/src/misc/tldts.noop.ts
Outdated
@@ -0,0 +1,7 @@ | |||
export function parse(url: string, options?: object): null { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume we have a noop version so that other clients can use the Utils class without needing this dependency. But this seems like a bad pattern. Can we extract this out to a new class/service instead, so that it's only imported by the clients that need it, it doesn't interfere with Utils, and we don't need the noop?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See comment here: #547 (comment)
common/src/misc/utils.ts
Outdated
return null; | ||
} | ||
|
||
uriString = uriString.trim(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd include this in the previous if
block just for brevity:
`if (uriString == null || uriString.trim() == '')
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Calling trim()
was actually important before executing the old parsing logic, as it would throw an exception if it was not trimmed beforehand
common/src/misc/utils.ts
Outdated
if (tldts.getHostname != null) { | ||
// Does uriString contain invalid characters | ||
// TODO Needs to possibly be extended, although '!' is a reserved character | ||
if (uriString.indexOf('!') > 0) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can tldts do this kind of validation for us? Crafting regexes here seems like a losing battle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason I included is, a test would be failing as it's assumed to be an invalid url. The issue is the exclamation mark is actually a reserved character, it's just not commonly used. So I'd prefer to adjust the test and get rid of this check.
common/src/misc/utils.ts
Outdated
return null; | ||
} | ||
|
||
if (tldts.getHostname != null) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this is a holdover pattern from tldjs. I don't know if/why we need this, I suspect it's a holdover from using a js library and/or the noop implementation. If you move this into its own class then I think you can remove this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that is just a holdover in case we'd use the tldts.noop. Has been removed in a recent push.
common/src/misc/utils.ts
Outdated
return hostname; | ||
} | ||
} | ||
|
||
const url = Utils.getUrl(uriString); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably don't need this if we're removing the if
statement on line 201, right? I assume it's a fallback in case tldjs/ts isn't present
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also a fallback and I've removed it.
spec/common/misc/utils.spec.ts
Outdated
expect(Utils.getDomain('https://my_vault.bitwarden.com/')).toBe('bitwarden.com'); | ||
}); | ||
|
||
it('should support urls containing umlauts', () => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think these are valid test cases because domain names do not actually contain unicode characters. If we want the domain name from a string containing unicode, I think it should do the punycode translation in order to get the "real" domain name. e.g.
expect(Utils.getDomain('bütwarden.com')).toBe('xn--btwarden-65a.com');
However, this does depend on what components/services are using this method. There may be some situations where we want to display the unicode in the GUI. But for behind the scenes domain matching logic, we always need the punycode. I suggest doing some investigation here.
…ove-hostname-and-domain-retrieval
This reverts commit 6b527b7.
@eliykat Thank you for taking the time to go through this and also the great feedback. I've decided to ditch the noop's and also the fallback logic. Pulling this stuff into a separate class/service would be great but it is all static and it's also being used inside our models ( I'll need to have a look into for example the web repo and why we are overriding |
…hostname-and-domain-retrieval
…hostname-and-domain-retrieval
…ove-hostname-and-domain-retrieval
…ove-hostname-and-domain-retrieval
…ove-hostname-and-domain-retrieval
…ove-hostname-and-domain-retrieval
…ove-hostname-and-domain-retrieval
Hi, We are currently in the process of archiving this repository and moving the code and future development to the bitwarden/clients repository. This PR will be closed but feel free to re-open it again on bitwarden/clients. |
Type of change
Objective
Improve the retrieval and validation of hostnames and domains, which affects url matching/auto-fill
Newly supported scenarios:
http://
(Regex fails)If for any reason the new detection does not work, I've included a fallback to the original parsing logic.
Marking this as draft:
As this affects a core part of the product/clients, I'd like to discuss the points below with the team.
tldjs
as a dependencytldjs.noop.ts
tsconfig.json
in some repos that maptldjs
totldjs.noop.ts
tldjs.noop.ts
or for that mattertldts.noop.ts
Asana task: https://app.asana.com/0/1153292148278596/1201431275009049/f
Closes bitwarden/web#545
Could possibly improve or fix the following issues;
Code changes
tldts
and runningnpm i
tldts
and runningnpm i
tldjs.noop.ts
Testing requirements
For jslib itself there won't be much to test/verify other than the present unit tests.
The following PR's for the affected repos will have a lot of areas to test. Will detail them there once this is ready for merge.
Before you submit
npm run lint
) (required)