Brand info is being used to block clients #293

Closed
YngveNPettersen opened this issue Feb 16, 2022 · 22 comments

@YngveNPettersen

Vivaldi recently became aware that a Japanese site, www.keisan.nta.go.jp, is blocking Vivaldi users from accessing an e-Tax part of the site, demanding they use a different browser.

On Windows, the site uses JS and navigator.userAgentData.brands to decide whether to open the product or display its "Unsupported browser" dialog. Among Chromium-based browsers, the site accepts only the two identified by the brands "Microsoft Edge" and "Google Chrome"; no other Chromium-based browser is accepted. (On Mac, apparently only Safari is accepted.)

We have possible reports about a few other sites that may be doing the same.

The only scalable way for a client to work around this issue is to pretend to be one of the accepted browsers, by returning their brand instead of its own.

The scalable solution to the issue is that brand information is never made available to the sites, whether in JS or HTTP headers.

Major vendors are unlikely to be affected by such blocking policies; only smaller vendors will be, and such blocking will be a reason for many users to avoid using those browsers. The result is increased market share for the major browsers.

At most general engine information should be provided, and even that could be a serious obstacle for some vendor seeking to create (or fork) a new browser engine.

@miketaylr
Collaborator

@YngveNPettersen could you provide the relevant snippet from that site? It would be interesting to see how they're doing that.

@YngveNPettersen
Author

A bit cut down to the relevant parts:

function getBrowser(brands){
	
	const browser = brands.reduce((a, c) => c.brand.match(/Microsoft Edge|Google Chrome/) ? c : a);
	return browser;
}

const browser = getBrowser(navigator.userAgentData.brands);

if(browser === BROWSER_TYPE.GOOGLE_CHROME){
} else {
}

The real code is currently at https://www.keisan.nta.go.jp/kyoutu/ky/sm/top_web , starting with the function isRecommendedBrowserAsEtax() at line 514
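To make the effect of such an allow-list concrete, here is a minimal sketch that applies the same regular expression to two brand lists. It is not the site's actual code: `isAllowed` is a hypothetical helper, and the brand lists and version numbers are made-up examples.

```javascript
// Hypothetical helper: true if any entry matches the allow-list pattern
// quoted above. A simplified stand-in, not the site's real logic.
function isAllowed(brands) {
  return brands.some((b) => /Microsoft Edge|Google Chrome/.test(b.brand));
}

// Illustrative brand lists; real values vary by browser and version.
const chromeBrands = [
  { brand: "Chromium", version: "98" },
  { brand: "Google Chrome", version: "98" },
  { brand: " Not A;Brand", version: "99" },
];
const vivaldiBrands = [
  { brand: "Chromium", version: "98" },
  { brand: "Vivaldi", version: "5.1" },
  { brand: " Not A;Brand", version: "99" },
];

console.log(isAllowed(chromeBrands));  // true
console.log(isAllowed(vivaldiBrands)); // false
```

Both lists report the same engine, so the second browser is rejected on its brand name alone.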

@miketaylr
Collaborator

Thanks, documenting that at https://gist.github.com/miketaylr/f042c482cd2ee33abbfa0fca723c5d66#file-sadness2-js-L38 in case it changes.

@miketaylr
Collaborator

Part of the design of the brand list allows for a browser like Vivaldi to send its own brand name (which allows it to appear in usage analytics), in addition to something like "Google Chrome" for compatibility. But that's ultimately a product decision. Unfortunately there will always be ways to sniff and block browsers. :(
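As a rough illustration of that design, the sketch below uses a brand list that carries both a compatibility brand and the browser's own brand. The list contents, the `COMPAT_BRANDS` set, and the way an analytics script might pick the "own" brand are all assumptions made for this example; GREASE entries are left out for brevity.

```javascript
// Illustrative brand list: own brand plus a compatibility brand.
const dualBrandList = [
  { brand: "Chromium", version: "98" },
  { brand: "Google Chrome", version: "98" },
  { brand: "Vivaldi", version: "5.1" },
];

// A naive site-side allow-list check still passes.
const passesAllowList = dualBrandList.some((b) =>
  /Microsoft Edge|Google Chrome/.test(b.brand)
);

// Hypothetical analytics heuristic: report the first brand that is not a
// well-known engine or compatibility entry.
const COMPAT_BRANDS = new Set(["Chromium", "Google Chrome", "Microsoft Edge"]);
const ownBrand = dualBrandList.find((b) => !COMPAT_BRANDS.has(b.brand));
const reportedBrand = ownBrand ? ownBrand.brand : "unknown";

console.log(passesAllowList); // true
console.log(reportedBrand);   // Vivaldi
```

Whether real analytics tools resolve multi-brand lists this way is not something the thread establishes.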

@YngveNPettersen
Author

The problem with branding is that it WILL(!) be used to block clients, as this case demonstrates. Such blocking is the reason Vivaldi is no longer sending "Vivaldi" in the User Agent string to any sites, except to sites we trust to use it properly, usually just our own sites.

In fact, the above case is, at the very least, making me seriously consider hardcoding "Google Chrome" as our "brand", or alternatively "Edge". We haven't done it yet, but if we discover more cases (there is one case in our bug tracker that sounds like it might be another one, but I have not debugged it yet), particularly prominent ones, then we might be forced to do that.

Effectively, that will make this field as useless as the User Agent field already is. I wouldn't be surprised if the field eventually contains as many brand name entries as the User Agent field does at present.

Site-specific handling is not really an option: maintaining such a list will require a lot of work, and it will always be out of date. That is entirely aside from the fact that significant changes would probably have to be implemented in the relevant Chromium code to support that.

The only way to avoid the header becoming useless is to require all branded clients to not send their brand in a number of requests, or at intervals, like I suggested in #274; IMO it should be left out frequently enough that if a site depends on the brand, the site breaks for a significant fraction (5-10% would probably be necessary) of users of Major Browsers each week. I doubt that anyone would actually implement that, though, so the header will become useless sooner rather than later.
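A minimal sketch of what such intermittent omission could look like, assuming a per-request random decision. `brandsForRequest`, its parameters, and the 10% rate are hypothetical; no browser is known to implement this, and the proposal in #274 may differ in detail.

```javascript
// Hypothetical sketch: always send the engine entry, but leave out the
// product brand in a fraction of requests.
// `random` is injectable so the behaviour can be tested deterministically.
function brandsForRequest(engine, product, omitProbability, random = Math.random) {
  const list = [engine];
  if (random() >= omitProbability) {
    list.push(product);
  }
  return list;
}

// Illustrative values only.
const engineEntry = { brand: "Chromium", version: "98" };
const productEntry = { brand: "Google Chrome", version: "98" };

// Roughly 10% of calls return the engine entry alone.
console.log(brandsForRequest(engineEntry, productEntry, 0.1));
```

A site that hardcodes the product brand would then fail for some users of the major browsers too, which is the pressure the proposal relies on.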

@Sora2455

On the one hand, if major browsers pretend to be minor/no-name browsers at intermittent intervals, I can imagine furious site owners/advertisers lodging tickets demanding that be urgently reverted like they did with GREASE.

On the other hand, GREASE itself was trying to solve the same problem you are running into, which is hardcoding on "good" UA strings and forcing browser homogenization due to dev laziness, something this specification has spent much effort trying to avoid.

@YngveNPettersen
Author

The fundamental and inconvenient truth (IMO) is that the ONLY browsers that will benefit from truthful brand info headers are the Major Vendors, in this case, Google, Microsoft, Apple, and Mozilla Firefox (which were the only ones allowed by the site mentioned above, and even the non-Safari browsers were blocked on Macs). For everyone else, providing truthful brand info will allow sites to block them for no other reason than that they are not one of the major browsers. (Actually, I suspect that the only reason Edge avoids occasionally being blocked, too, is that it is the default browser on Windows.)

I think that, at most, the only information that should be available to the web site is the engine brand and version, and even that can be problematic. It would, for example, have made it possible to identify the Opera Presto (<=v12) engine, which was forced to use a lot of hacks, including UA spoofing and site-specific JavaScript overrides, to get past a lot of the blocking that targeted it.

And BTW, I am not sure we can call those devs "lazy". After all, adding the blocks and testing them actually requires extra effort to implement, especially when the blocked browsers use the same engine version as the "supported" ones.

@Sora2455

Sora2455 commented Apr 13, 2022

As a web developer, I agree that browser engine and version is the main piece of information I'm after - in fact, it would have simplified my tests for e.g. support for same-site cookies if I didn't have to research what engine UC Browser runs on. That lets me know what features I can reasonably rely on my users having.

I'm sure advertisers will be up in arms about losing precious entropy bits, but I'm not sure there are that many legitimate reasons for a web developer to need to know the browser's brand and version specifically, and not just its engine and engine version. The only one that springs to mind is needing to tell apart Safari WebViews from actual Safari on iOS (though even then, the main reason to tell them apart is that they have different feature support, and feature detection is your friend there). I'm sure some other reasons will be provided in this thread shortly.

As a personal anecdote, the "Browser outdated" banner on the website I maintain became much simpler and more reliable when we switched to feature testing instead of user-agent parsing:

let upToDate =
  // Module scripts
  "currentScript" in document && "noModule" in document.currentScript &&
  // CSS @supports
  typeof CSS !== "undefined" && typeof CSS.supports === "function" &&
  // CSS grid
  CSS.supports("display", "grid") &&
  // TextEncoder (needed for one of our features, pre-Chromium Edge doesn't support)
  typeof TextEncoder !== "undefined";

@YngveNPettersen
Author

A small update: We have recently seen several reports that might be caused by abusing Client Hints to block Vivaldi. We have not been able to confirm this possibility, as both sites that have been identified (nextdoor.co.uk and PNC Bank) seem to do it during a successful login.

As a result, in the most recent snapshot, we added preferences to set various Client Hint brands, including custom ones. The next stage will probably be to permanently configure a major brand in the header.

@Sora2455

Sora2455 commented Mar 7, 2023

And so UA client hints become just as untrustworthy as the user-agent string, just because some sites can't be stuffed to deal with minor browsers.

@YngveNPettersen
Author

Unfortunately, yes.

As I said in my article Client Hints, or Client Lies in January:

> if the last several decades of using User Agent string information have proven anything, it is that only the major vendors (OS or browser) are able to tell the truth. Everybody else will have to tell lies one way or the other – even Microsoft had to tell lies when they started distributing Internet Explorer.

@miketaylr
Collaborator

That's sort of by design - it's totally valid to send multiple brands.

> The next stage will probably be to permanently configure a major brand in the header.

Seems like a good idea (as well as the option to set custom ones).

@YngveNPettersen
Author

> That's sort of by design - it's totally valid to send multiple brands.

Until the (bad) sites start checking which brands are listed .....

@YngveNPettersen
Author

YngveNPettersen commented Apr 16, 2023

A further update: One of Vivaldi's users today confirmed that PNC Bank is performing Sec-CH-UA brand sniffing during login, and blocking clients that are not providing an approved Brand ID in the header or JS API.
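For readers unfamiliar with the header form, here is a rough sketch of how a server could read brands from a Sec-CH-UA value and apply an allow-list. The thread does not show PNC Bank's code, so this is only an illustration: `parseSecChUa` and `isApprovedBrand` are hypothetical helpers, the header values are made-up examples, and the regular expression is a simplification rather than a full Structured Fields parser.

```javascript
// Simplified parser for values such as:
//   "Chromium";v="112", "Google Chrome";v="112", "Not:A-Brand";v="99"
// Handles backslash escapes inside the quoted strings, nothing more.
function parseSecChUa(header) {
  const pattern = /"((?:[^"\\]|\\.)*)";\s*v="((?:[^"\\]|\\.)*)"/g;
  const unescape = (s) => s.replace(/\\(.)/g, "$1");
  const entries = [];
  for (const m of header.matchAll(pattern)) {
    entries.push({ brand: unescape(m[1]), version: unescape(m[2]) });
  }
  return entries;
}

// Hypothetical allow-list check of the kind described above.
function isApprovedBrand(header) {
  const approved = new Set(["Google Chrome", "Microsoft Edge"]);
  return parseSecChUa(header).some((e) => approved.has(e.brand));
}

// Illustrative header values.
const chromeHeader = '"Chromium";v="112", "Google Chrome";v="112", "Not:A-Brand";v="99"';
const vivaldiHeader = '"Chromium";v="112", "Vivaldi";v="6.0", "Not:A-Brand";v="99"';

console.log(isApprovedBrand(chromeHeader));  // true
console.log(isApprovedBrand(vivaldiHeader)); // false
```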

@YngveNPettersen
Author

FYI, we just released the first Vivaldi 6.1 build with "Google Chrome" as the default Client Hints Brand.

@Sora2455

And so the lying begins again...

@miketaylr
Collaborator

> And so the lying begins again...

This is how the brand list was designed, to allow for sending your own brand in addition to another brand for compat. It's not beautiful, but it's the reality of the web (at least for poorly tested/built sites).

@LonMcGregor

> This is how the brand list was designed, to allow for sending your own brand in addition to another brand for compat.

Which surely defeats the whole point of sending any brand information to begin with. This is yet another source of effort for browsers and web developers to maintain without bringing any benefit.

@miketaylr
Collaborator

> Which surely defeats the whole point of sending any brand information to begin with.

Not sending any brand info (or spoofing) is a choice that a UA can make, but then your users don't show up in analytics which can result in sites not testing in your browser... not a great outcome, in my experience.

See also https://wicg.github.io/ua-client-hints/#marketshare-analytics-use-case

@LonMcGregor

The notion of measuring market share to make decisions is what is causing the problems mentioned above in the first place. Why should a browser be blocked purely because it has a low market share?

Browsers such as Vivaldi and others are already based on Chrome. Feature detection is sufficient to decide whether or not a user agent should be able to interact with a site.

@YngveNPettersen
Author

If the User Agent and Sec-CH-UA information had ONLY been used for market share analysis we wouldn't be having this conversation.

Unfortunately, both methods are being used to BLOCK clients that do not provide "proper" identification.

With respect to providing multiple brands, that is essentially what the User Agent did for almost 30 years, and it was extensively used to block clients, to the extent that Vivaldi decided to remove its name from the header in order to avoid being blocked by sites. We are now seeing the same thing starting to happen with Client Hints, and multiple brands will only work until the sites "(un)wise up".

Besides, suggesting that clients should send multiple brands will utterly demolish the "It's about market share numbers" excuse into microscopic dust, since all market share data would be just so much garbage.

Frankly, IMNSHO CH-UA and the associated APIs should be removed from the standard and browsers. The Engine information should be sufficient, and even that will cause problems for any browser with a new engine.

It's never going to happen, but Chrome could (in a full Stable release to all its hundreds of millions of users, not beta, dev, canary etc.) try not identifying as Google Chrome and perhaps change its brand header and User Agent to something else, e.g. "Vivaldi", and see how many bug reports they get about "site X is broken in Chrome".

@miketaylr
Collaborator

Thanks for the report - please continue to let us know how things evolve. But for now I don't think we'll be changing anything in the spec.

4 participants