Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stripWWW strips domains too #38

Closed
stevenvachon opened this issue Feb 23, 2017 · 10 comments
Closed

stripWWW strips domains too #38

stevenvachon opened this issue Feb 23, 2017 · 10 comments

Comments

@stevenvachon
Copy link

normalize("www.com")
//-> http://com

This can be rectified with strip-www.

@sindresorhus
Copy link
Owner

Can you do a PR?

@Schavras
Copy link

Hello! If no one is working on it right now, I would like to try. It's my first attempt to contribute to another open source project :)

@stevenvachon
Copy link
Author

Feel free. If not, I'll try to get to it sometime next week.

@Schavras
Copy link

Can you give me some more information about the bug?
It only don't work with URL www.com ?
Because I have done some tests with

  • http://testwww.com
  • http://test.www.com

and they passed the tests.

@stevenvachon
Copy link
Author

The most important test case is listed above. Also, check out strip-www's test suite.

@Schavras
Copy link

I'm confused with the cases
t.is(m('http://www.', opts), 'http://www.');
t.is(m('http://www', opts), 'http://www');
from strip-www's test suite. Is those even valid domain names?

@stevenvachon
Copy link
Author

stevenvachon commented Mar 16, 2017

They don't have to be valid domains.. they test that invalid input isn't further invalidated. Also, "www" could be a local TLD.

@stevenvachon
Copy link
Author

stevenvachon commented Mar 27, 2017

There's an edge case with stripping www subdomains with regexp:

www.subdomain.domain.tld
www.www.domain.com

will produce:

subdomain.domain.tld
www.domain.com

Because the regexp doesn't know how far the subdomain extends because it can't know how long the TLD extends (.com vs .co.uk) without a large library like parse-domain. Such a lib is fine for the server, but not for a browser build.

@ethanbb
Copy link

ethanbb commented Mar 28, 2017

Interesting - so the edge case of URLs like www.www.domain.com would actually cause normalize-url to no longer be idempotent (calling it once would produce www.domain.com, and then calling it a second time on the result would produce domain.com). As a solution, perhaps just let the stripWWW option strip all leading "www" subdomains? It seems like using "www" as a subdomain in two levels of a URL is asking for trouble anyway.

@stevenvachon
Copy link
Author

stevenvachon commented Mar 28, 2017

It gets worse:

www.app.company.com

produces:

app.company.com

Technically, the subdomain is "www.app", since there is structurally no sub-subdomain. As a result, there could theroetically be no "app.company.com" configured.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants