API addition proposal + remove redundancy #103
@remusao I think this would be a very good improvement to the library! Just one slight suggestion for more consistency: I think it should be Only then will the result of parse strictly map to the current API:
// tldjs.parse('https://example.co.uk')
{
  valid: true, // tldjs.isValid()
  hostname: 'foo.example.co.uk',
  publicSuffix: 'co.uk', // tldjs.getPublicSuffix()
  domain: 'example.co.uk', // tldjs.getDomain()
  subdomain: 'foo', // tldjs.getSubdomain()
}
I support the Although I was wondering whether it would be sensible to have lazy properties instead, so that the computation is done only for the properties that are actually used. That can be part of another pull request if it is relevant.
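The lazy-properties idea could be sketched roughly as follows. This is only an illustration: `lazyObject` and the stand-in computations are hypothetical names, not part of tld.js. Each property is a getter that runs its computation on first access and then replaces itself with the cached value.

```javascript
// Hypothetical sketch: build an object whose properties are computed
// on first access only, then cached. Not the tld.js implementation.
function lazyObject(compute) {
  var result = {};
  Object.keys(compute).forEach(function (key) {
    Object.defineProperty(result, key, {
      enumerable: true,
      configurable: true,
      get: function () {
        var value = compute[key]();
        // Replace the getter with a plain value so later reads are free.
        Object.defineProperty(result, key, {
          value: value,
          enumerable: true,
          writable: false,
          configurable: false
        });
        return value;
      }
    });
  });
  return result;
}

// Stand-in computations that count how often they run.
var calls = { domain: 0, subdomain: 0 };
var parsed = lazyObject({
  domain: function () { calls.domain += 1; return 'example.co.uk'; },
  subdomain: function () { calls.subdomain += 1; return 'foo'; }
});

parsed.domain; // triggers the domain computation
parsed.domain; // served from the cached value
```

With this shape, a caller who only reads `domain` never pays for `subdomain`. The trade-off is that the result is no longer a plain object literal, which may matter for serialization or deep-equality checks in tests.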
I just updated to rename
Well, that closes the case then 👍
test/tld.js
Outdated
@@ -56,7 +79,7 @@ describe('tld.js', function () {
       expect(tld.isValid('.com')).to.be(false);
     });

-    it('should be falsy on dotless hostname', function () {
+    it.skip('should be falsy on dotless hostname', function () {
The tests would not pass otherwise?
Actually, a hostname without dots seems to be valid (according to the Firefox parser). So localhost is a valid hostname, strictly speaking, but we won't find a valid public suffix for it, so it will be rejected, unless we specify it in validHosts. So I disabled these tests; should I remove them altogether?
We can double-check whether we have enough test cases to cover the behavior of all the functions for such corner cases (getPublicSuffix, getDomain, etc.) and make sure we offer consistent behavior.
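To make the corner case concrete, here is a minimal sketch of the behavior described above. It assumes a tiny hard-coded rule set in place of the real public suffix list, and `findPublicSuffix` and `isAccepted` are hypothetical helpers, not the tld.js functions. A dotless host has no matching suffix and is rejected unless it is listed explicitly.

```javascript
// Hypothetical sketch, not the tld.js implementation: a tiny rule set
// stands in for the real public suffix list.
var RULES = ['com', 'co.uk', 'uk'];

function findPublicSuffix(hostname) {
  var labels = hostname.split('.');
  // Try the longest candidate suffix first.
  for (var i = 0; i < labels.length; i += 1) {
    var candidate = labels.slice(i).join('.');
    if (RULES.indexOf(candidate) !== -1) {
      return candidate;
    }
  }
  return null;
}

function isAccepted(hostname, validHosts) {
  // Hosts listed explicitly are accepted even without a known suffix.
  if ((validHosts || []).indexOf(hostname) !== -1) {
    return true;
  }
  return findPublicSuffix(hostname) !== null;
}

var rejected = isAccepted('localhost', []);
var allowed = isAccepted('localhost', ['localhost']);
```

Test cases for the corner cases could then assert both outcomes for each public function, so the behavior stays consistent across them.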
index.js
Outdated
  return {
    valid,
    hostname,
Thinking about it, the only property I would add would be tldExists, derived from the same method. Especially as now, thanks to you, we can retrieve a public suffix while having an unknown TLD.
What do you think?
Yes, we can. This way we actually have an equivalent for each method of the API when using the parse function. I will add it.
@oncletom I updated the PR with the following changes:
Current timings:
tldjs#isValid x 524,277 ops/sec ±0.56% (92 runs sampled)
tldjs#extractHostname x 53,515 ops/sec ±0.25% (94 runs sampled)
tldjs#tldExists x 33,019 ops/sec ±0.65% (95 runs sampled)
tldjs#getPublicSuffix x 19,411 ops/sec ±1.32% (91 runs sampled)
tldjs#getDomain x 19,041 ops/sec ±0.95% (93 runs sampled)
tldjs#getSubdomain x 18,960 ops/sec ±0.61% (94 runs sampled)
tldjs#parse x 16,834 ops/sec ±1.21% (96 runs sampled)
Not sure what's going on with the installation on
Great work 👍
I don't know, it seems to happen only with
Issue seems to be related to Travis CI cache. Testling reports an issue which does not happen on the mocha side of things:
And it seems to hang randomly on
@oncletom Have you been able to reproduce the hanging? Is it while running If you can reproduce, can you try to isolate the test that is causing the issue? If that can help, we can isolate each test case in its own
I have not been able to reproduce the random hanging. The failing case with Trying to identify the issue at the moment.
It is related to the use of
Ok, if this is the case, we can just replace the
function repeat(str, n) {
  var res = '';
  for (var i = 0; i < n; i += 1) {
    res += str;
  }
  return res;
}
I pushed it to give it a try; we can revert if it does not work.
Well, looks like it was that issue?
Yeah, and thinking about it, it's weird, unless a different version of Phantom is in use depending on which version of Node is active…
Let's keep an eye on this! That's weird indeed... Thanks for merging.
I played a bit with the idea of slightly changing the API (#99) while maintaining full backward compatibility, and I think I found a compromise.
- parse method in the public API
- factory to provide a custom method to extract a hostname from a url (by default it is cleanHostValue)
- cleanHostValue to only pay the price of parsing if the argument is not already a valid hostname
- index.js has some extra logic in parse to only extract the hostname once

The result is a substantial increase in speed + no extra cost if the value given to any method of the public API is already a valid hostname.
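The "only pay the price of parsing if the argument is not already a valid hostname" idea can be sketched as below. This is an illustration under assumptions, not the code from the PR: `isValidHostname`, `parseUrlForHostname`, the regular expressions, and the `slowPathCalls` counter are all hypothetical.

```javascript
// Hypothetical sketch, not the tld.js implementation.
// Cheap check: lowercase letters, digits, hyphens and dots only,
// with labels that do not start or end with a hyphen.
var HOSTNAME_RE = /^[a-z0-9]([a-z0-9-]*[a-z0-9])?(\.[a-z0-9]([a-z0-9-]*[a-z0-9])?)*$/;

function isValidHostname(value) {
  return typeof value === 'string' &&
    value.length <= 255 &&
    HOSTNAME_RE.test(value);
}

// Counts how often the expensive path runs (for illustration only).
var slowPathCalls = 0;

// Stand-in for the expensive URL parsing step.
function parseUrlForHostname(value) {
  slowPathCalls += 1;
  var match = /^(?:[a-z][a-z0-9+.-]*:)?\/\/(?:[^\/?#@]*@)?([^\/?#:]+)/i.exec(value);
  return match ? match[1].toLowerCase() : null;
}

function extractHostname(value) {
  // Fast path: the input is already a clean hostname, return it untouched.
  if (isValidHostname(value)) {
    return value;
  }
  // Slow path: treat the input as a URL and parse it.
  return parseUrlForHostname(value);
}

var fromHostname = extractHostname('foo.example.co.uk');
var callsAfterHostname = slowPathCalls;
var fromUrl = extractHostname('https://foo.example.co.uk/path?x=1');
```

The validity check has to be much cheaper than the parsing it avoids, otherwise inputs that are real URLs pay for both steps without any gain.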
Some timings (if processing only clean hostnames):
That is roughly 1M hostnames processed per second (for getDomain, getSubdomain, getPublicSuffix).

Other timings (normal benchmark):
Which is about 430k hostnames processed per second (but extractHostname is now the bottleneck).

If you think this is a reasonable change, we should close #100 (as it already implements the more strict isValid hostname validation), and I think I should stop optimizing the code :P It's probably more than fast enough now and, if not, people can still plug in their own fancy extractHostname function.
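As a rough illustration of plugging in a custom extraction function, the sketch below shows a factory that accepts one as an option. The names `createParser`, `defaultExtract`, and `getHostname` are hypothetical and chosen for this example; they are not taken from the PR or from the tld.js API.

```javascript
// Hypothetical sketch: a factory that accepts a custom hostname extractor.
// None of these names come from tld.js.
function defaultExtract(value) {
  // Naive default: strip scheme, then path/query/fragment, then port.
  return String(value)
    .toLowerCase()
    .replace(/^[a-z][a-z0-9+.-]*:\/\//, '')
    .replace(/[\/?#].*$/, '')
    .replace(/:\d+$/, '');
}

function createParser(options) {
  // Fall back to the default extractor when none is supplied.
  var extract = (options && options.extractHostname) || defaultExtract;
  return {
    getHostname: function (value) {
      return extract(value);
    }
  };
}

// Callers that already hold clean hostnames can skip extraction entirely.
var passthrough = createParser({
  extractHostname: function (value) { return value; }
});
var standard = createParser();

var viaDefault = standard.getHostname('https://Foo.Example.co.uk:8080/a?b');
var viaPassthrough = passthrough.getHostname('foo.example.co.uk');
```

Keeping the extractor behind a single option means the rest of the pipeline does not need to know whether its input was a URL or a hostname.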