Omitting www might be confusing for users #568

Open
outloudvi opened this issue Jan 2, 2021 · 5 comments
Labels
clarification Standard could be clearer

Comments

@outloudvi

I'm creating this issue as a comment on whatwg/url 4.8.1, "Simplify non-human-readable or irrelevant components".

The related part is introduced in 8809598, which is a part of "Guidelines for URL Display" added here.

Quote

This issue is a proposal to delete or modify the following part(s) in whatwg/url:

... For example, browsers may omit a leading www or m domain label to simplify the host, ...

Reasons

Omitting www (or m) does not help prevent spoofing

The goal of this part is to avoid spoofing or security-relevant distractions. "Spoofing" harms users only when they mistake one domain for another. In the case of examplecorp.com@attacker.example, users might mistake attacker.example for examplecorp.com, which is harmful and is something vendors may want to prevent. However, it doesn't seem to be a security problem whether the user is visiting www.example.com or example.com, since they are controlled by the same "registrant".

www.example.com is not the same as example.com, which creates confusion

www is a commonly used subdomain for a website. A subdomain is different from an apex domain, which means that it's doable (and simple) to host different content on the two domains. A consequence is that www.example.com being reachable does not mean example.com is reachable, and vice versa.
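To make the confusion concrete, here is a minimal sketch of the kind of elision the quoted guideline permits. The function name `elide_trivial_label` and the rule of keeping at least two labels are assumptions made for illustration; this is not taken from the spec or from any browser's code.

```python
def elide_trivial_label(host: str) -> str:
    """Drop one leading "www" or "m" label from a host, for display only.

    Hypothetical helper: the two-label minimum is an assumption so that a
    host such as "www.com" is not reduced to a bare suffix.
    """
    labels = host.split(".")
    if len(labels) > 2 and labels[0] in ("www", "m"):
        return ".".join(labels[1:])
    return host


# Two distinct hosts end up with the same displayed string.
shown_www = elide_trivial_label("www.example.com")
shown_apex = elide_trivial_label("example.com")
print(shown_www == shown_apex)
```

With a rule like this, a user cannot tell from the displayed host alone which of the two hosts the page was actually loaded from.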

I'm not sure if any standards imply that www.example.com and example.com should be seen as the same site. However, the confusion seems to be a practical problem. I tested against the list of The Majestic Million (Why not Alexa? Because the Alexa list is expensive.) and found that a lot of sites (at least about 6%, or 61,222 out of 1,000,000) don't treat www and the apex domain as the same.

Data

We only detect differences by trying to resolve the domains in The Majestic Million with and without www. Therefore, the lists do not include domains that host different content on www and the apex domain if both are resolvable.

All the lists shown below are filtered against the Public Suffix List to ensure that they are apex domains, rather than subdomains.
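The script used for this analysis is not included in the issue. As a rough sketch of the method described above, the following classifies each domain by whether the apex and the www name resolve. The names `resolves` and `classify` are hypothetical, the lookup uses the system resolver via `socket.getaddrinfo`, and the Public Suffix List filtering step is left out.

```python
import socket
from typing import Callable, Iterable, List, Tuple


def resolves(host: str) -> bool:
    """Return True if the system resolver finds any address for host."""
    try:
        socket.getaddrinfo(host, None)
        return True
    except (socket.gaierror, UnicodeError):
        return False


def classify(
    domains: Iterable[str],
    lookup: Callable[[str], bool] = resolves,
) -> Tuple[List[str], List[str]]:
    """Split domains into (apex resolves but www does not, www resolves but apex does not)."""
    apex_only: List[str] = []
    www_only: List[str] = []
    for domain in domains:
        apex_ok = lookup(domain)
        www_ok = lookup("www." + domain)
        if apex_ok and not www_ok:
            apex_only.append(domain)
        elif www_ok and not apex_ok:
            www_only.append(domain)
    return apex_only, www_only
```

Passing a custom `lookup` makes it possible to try the classification without network access, for example `classify(["a.example"], lookup={"a.example"}.__contains__)`. Domains where both names resolve are skipped, which matches the limitation noted above: differing content on two resolvable hosts is not detected.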

The list of domains that have resolvable DNS records on the apex domain, but not www, is here: WNoNYes.ps.txt. This list contains 37,027 domains (3.70%). The top 20 domains in this list:

youtu.be
www.gov.uk
wa.me
icio.us
www.gov.cn
www.nhs.uk
flic.kr
netdna-ssl.com
1drv.ms
pinimg.com
brightcove.net
campaign-archive1.com
campaign-archive2.com
hwg.org
bufferapp.com
campaign-archive.com
t.cn
lnkd.in
rapidshare.com
aliyuncs.com

The list of domains that have resolvable DNS records on the www subdomain, but not the apex domain, is here: WYesNNo.ps.txt. This list contains 24,196 domains (2.41%). The top 20 domains in this list:

wixsite.com
googleusercontent.com
fda.gov
miit.gov.cn
bbb.org
jiathis.com
army.mil
navy.mil
securityfocus.com
vatican.va
filesusr.com
nhk.or.jp
gwu.edu
af.mil
ec-lyon.fr
freetds.org
specbench.org
golux.com
clickbank.net
apachetutor.org

Any comments, problems, or suggestions are welcome.

@annevk
Member

annevk commented Jan 4, 2021

Thank you for doing this analysis. (The formal term we settled on is "registrable domain" by the way. And no, there's no standard that requires www and no-www to resolve to the same thing.)

@estark37 I suspect you have thoughts on this.

@estark37
Contributor

estark37 commented Jan 6, 2021

A Chrome colleague ran a similar analysis on an Alexa list in 2018 and found smaller numbers (2.6% didn't resolve on www and 0.6% didn't resolve on the bare registrable domain). But these are different datasets so I guess it's not surprising that the numbers are different.

Ultimately it's a product decision: simplifying information can cause confusion for some people and alleviate confusion for other people. While it's true that there is not a direct risk of www.example.com spoofing example.com or vice versa, making the URL simpler may generally help people notice and understand URLs better.

I don't really feel convinced that anything needs to change in the spec, since it's already a "may", allowing browsers to make their own product decisions. Maybe it would be reasonable to add a caveat at the end of the section to note that there are tradeoffs to these simplifications -- maybe something like this?

Simplifying components of the URL can cause confusion for some users in some contexts -- for example, users might interpret a URL with hidden components as the full URL. Browsers may consider using visual distinctions between full URLs and simplified URLs, and/or making full URLs visible in a secondary UI surface or with a setting.

@outloudvi
Author

Thanks for your reply, estark37!

I don't really feel convinced that anything needs to change in the spec, since it's already a "may", allowing browsers to make their own product decisions.

It is indeed a product decision whether to hide the "www" or similar labels in the address bar. However, I also think that the word "may" can never be an excuse for a spec to keep a (possibly?) unwelcome practice.

On whether omitting "www" (or, in canonical terms, displaying only the "registrable domain") is unpopular, I would suggest one perspective, described below:

Chromium started to hide "www" from the address bar in version 69 (Sept 2018, later reverted) and again in 76 (July 2019). Two flags, omnibox-ui-hide-steady-state-url-trivial-subdomains and omnibox-ui-hide-steady-state-url-scheme, could be used to revert this behavior from Chromium 76 to 79. After Chromium 79 (Dec 2019), seemingly the only way to un-elide "www" is to install the extension Suspicious Site Reporter (and also "agree to the Google Terms of Service and Privacy Policy"), since Chromium explicitly hard-coded the extension ID to disable elision. On the page of that extension, there are tons of comments complaining that Chrome users need to install an extension to un-elide "www". In my view, this might be a sign that the "www" elision is not popular.

I have been misled by the elision in my personal experience, and some web searches I did on this topic also turned up more negative reviews than positive ones, but this perspective might be more persuasive since Chrome/Chromium and the Chrome Web Store hold a majority of the web browser market.

Maybe it would be reasonable to add a caveat at the end of the section to note that there are tradeoffs to these simplifications -- maybe something like this?

I agree that it would be better to add a caveat if this is kept in the spec. It might also be good if the spec can encourage vendors to leave the choice to users (for example, as one configurable setting or flag).

@estark37
Contributor

estark37 commented Jan 6, 2021

After Chromium 79 (Dec 2019), seemingly the only way to un-elide "www" is to install the extension Suspicious Site Reporter (and also "agree to the Google Terms of Service and Privacy Policy"), since Chromium explicitly hard-coded the extension ID to disable elision. On the page of that extension, there are tons of comments complaining that Chrome users need to install an extension to un-elide "www". In my view, this might be a sign that the "www" elision is not popular.

You can right-click on the address bar and select "Always show full URLs" if you don't want to install the extension:

image

@outloudvi
Author

You can right-click on the address bar and select "Always show full URLs" if you don't want to install the extension:

Thank you for your idea! I tried it on Chromium 87 on desktop.

Although it remains to be seen whether mobile/laptop needs further consideration (since right-clicking on the omnibox would not work on these devices), I think the approach that Chromium desktop currently takes can solve the problem: it provides a default while allowing users to make their own choice.

Given that the users are entitled to change the behavior as they wish, the solution of adding a section describing the caveat of elision would be great.

@annevk annevk added the clarification Standard could be clearer label Oct 20, 2021