Defining goals, deprecating the User-Agent string, and how server implementations would actually work #14

Open
chrisgraham opened this issue Jan 14, 2020 · 3 comments

@chrisgraham chrisgraham commented Jan 14, 2020

This spec seems to be achieving a few things at once, which is great, but I think we could use some clarity on what these things are and how the spec should be implemented to achieve them. I see it as:

  1. Reducing fingerprinting (the main spec focus), especially in third-party contexts
  2. Improving performance by removing header overhead
  3. Better tooling to stop bad UA sniffing behaviors - e.g. letting users configure their UA to not give out a version number, or letting browser vendors more uniformly block certain things

Achieving the second goal seems like a longer-term thing, but I don't think it has to be. If the server sends an Accept-CH header, I think it would be fair for the client to omit the old-school User-Agent header entirely on future requests. If that is going to be the case, though, it should be decided early so that implementations are consistent.
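To make that proposal concrete, here is a minimal sketch of the client-side behavior being suggested. It is not how any browser works today. The class name `ClientHintPolicy`, the `LEGACY_UA` value, and the `Sec-CH-UA-*` hint names are assumptions for illustration, and the hint names may differ from the draft under discussion here.

```python
from urllib.parse import urlsplit

# Hypothetical legacy User-Agent value, for illustration only.
LEGACY_UA = "Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0"


class ClientHintPolicy:
    """Tracks which hosts have opted in via Accept-CH (illustrative only)."""

    def __init__(self):
        self._opted_in = {}  # host -> set of lowercased hint names

    def observe_response(self, url, response_headers):
        """Record the hints a host asked for in its Accept-CH header."""
        accept_ch = response_headers.get("Accept-CH")
        if accept_ch is None:
            return
        host = urlsplit(url).hostname
        hints = {h.strip().lower() for h in accept_ch.split(",") if h.strip()}
        self._opted_in.setdefault(host, set()).update(hints)

    def request_headers(self, url, available_hints):
        """Return UA-related headers for a request to `url`."""
        host = urlsplit(url).hostname
        if host not in self._opted_in:
            # Host never sent Accept-CH: keep sending the legacy header.
            return {"User-Agent": LEGACY_UA}
        wanted = self._opted_in[host]
        # Host opted in: omit User-Agent and send only the requested hints.
        return {k: v for k, v in available_hints.items() if k.lower() in wanted}
```

Note that this treats opt-in as binary per host, which is exactly the assumption that the questions below call into doubt when several webapps share one domain.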

Beyond the above, there is a major paradigm shift in this specification that brings up some important issues.

  1. What is a server, from the point of view of people implementing this spec server-side? Is the server like Apache, or is it the webapp? What are we expecting developers to be doing exactly?
  2. What is a website? Remember there may be entirely different webapps running on the same domain name and SSL certificate. What if one webapp downgrades the upgrade request of another webapp? Should UAs be sending a superset of what was requested?
  3. Are we shifting to gathering data per session rather than per view? If so, are we to some degree abandoning the stateless model of the web? What if different systems process different requests? For example, a front proxy may handle some requests and a full webapp stack may handle others.

I could suggest solutions to these problems, but the solutions will likely be limiting in terms of how well the defined goals would be achieved. For example, we could say that the UA sends the superset of everything requested with every future request, but this would go against a desire to improve performance. Or we could specify that the UAs continue to send a superset of everything, until it is acknowledged as received and handled by the webapp.
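As a rough illustration of the second option, the sketch below keeps a superset of everything requested on a domain and stops sending a hint once it has been acknowledged. The acknowledgment signal is purely hypothetical and is not part of any draft; the class `PendingHints` and the hint names are assumptions for illustration.

```python
class PendingHints:
    """Per-domain set of hints the client keeps sending (illustrative only)."""

    def __init__(self):
        self._pending = set()

    def requested(self, hints):
        # Any webapp on the domain may add to the superset.
        self._pending |= set(hints)

    def acknowledged(self, hints):
        # Hypothetical: the server confirms it has stored these hints.
        self._pending -= set(hints)

    def to_send(self):
        # Hints to attach to the next request, in a stable order.
        return sorted(self._pending)


domain = PendingHints()
domain.requested(["arch", "platform"])  # e.g. a webapp under /shop
domain.requested(["model"])             # e.g. a different webapp under /blog
superset = domain.to_send()             # union of both requests
domain.acknowledged(["arch"])
remaining = domain.to_send()            # "arch" is no longer sent
```

The trade-off is visible even at this size: without `acknowledged`, the set only ever grows, which works against the goal of reducing header overhead.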

So I think there's a lot of thought that needs to go into exactly how these issues are going to be handled. It's getting complicated!

@chrisgraham chrisgraham commented Jan 14, 2020

That was rather rambling, so maybe it would help if I now simplify it down to some individual questions...

  1. Should clients stop sending the old-school User-Agent header once they see Accept-CH on a domain? If so, we are acknowledging that each domain either supports this spec or does not (binary). The benefit would be pushing things forward faster and reducing overhead.

  2. Should we recommend the actual web servers take responsibility for gathering UA data, and provide a server-side API to webapps? That seems sensible given that there may be many webapps running on the same domain, each requesting different subsets of data. It would mean, however, that the actual web server would need to be tracking user sessions (identifying clients to sessions would be part of the aforementioned API).

  3. Should we have some way for a server to acknowledge it has received data, so it doesn't have to keep being sent? This probably would only work if the actual web server was gathering data, as otherwise there is (as things stand) no way to know which web app the client is talking to.

  4. If we don't have some system of acknowledgment, do we formally decide that clients should send a superset of what was requested with each request? And if so, over what kind of time span? Surely not forever!

  5. Should we formally clarify that UA data should now be considered a session-level thing rather than a request-level thing?
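For questions 2 and 5, a server-side API could look something like the sketch below. It assumes the web server layer sees every request, already has some session identifier, and records any `Sec-CH-` headers against it. The class `HintStore` and its methods are hypothetical and do not correspond to any existing web server API.

```python
class HintStore:
    """Session-level store of client hints kept by the web server layer."""

    def __init__(self):
        self._by_session = {}  # session id -> {lowercased header name: value}

    def record(self, session_id, request_headers):
        """Called by the web server for every request it handles."""
        hints = self._by_session.setdefault(session_id, {})
        for name, value in request_headers.items():
            if name.lower().startswith("sec-ch-"):
                hints[name.lower()] = value

    def get(self, session_id, name, default=None):
        """Called by any webapp on the domain, whichever one asked first."""
        return self._by_session.get(session_id, {}).get(name.lower(), default)


store = HintStore()
# First request carries the hint; a later one does not.
store.record("s1", {"Host": "foobar.com", "Sec-CH-UA-Arch": '"arm"'})
store.record("s1", {"Host": "foobar.com"})
arch = store.get("s1", "Sec-CH-UA-Arch")
```

With a store like this, a webapp can still read a hint on a request that did not carry it, which is what treating UA data as session-level would mean in practice.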

@jonarnes jonarnes commented Jan 14, 2020

Can you please elaborate on 2. (and 5.)? Why is UA data a session-level thing?

@chrisgraham chrisgraham commented Jan 14, 2020

It's difficult to answer that, as it really depends how things end up being implemented.

If the client ends up sending the superset of headers requested by anything on the domain, forever, then you don't need to track this stuff in sessions, unless you are writing some analytics system that needs to back-associate detailed user agent data with the first request.

However, if there's the possibility of the data being narrowed in the future, webapps are likely to want to hold onto the data in sessions, depending on what they are trying to do, and depending on what data is guaranteed to be available initially.

Here's an example...
Let's say a website foobar.com surfaces a different default download depending on platform. For example, on ARM64 they put the ARM64 build of their software first on the download page.
Now, they could add some complex interstitial page between the user clicking download and the downloads being listed, which would trigger a request for the platform data. More likely, though, they will just want to gather what they need in a session, especially if there's a chance of that data only being sent for some subset of future requests.

(Or they could use JavaScript and build up their page that way of course).
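The download page logic in this example might look roughly like the following. It assumes the architecture and bitness were captured earlier in the session; the build names, the mapping table, and both function names are made up for illustration.

```python
BUILDS = ["x86_64", "arm64", "x86"]  # hypothetical build names


def preferred_build(arch, bitness):
    """Map hint values saved in the session to a build name, if known."""
    table = {
        ("arm", "64"): "arm64",
        ("x86", "64"): "x86_64",
        ("x86", "32"): "x86",
    }
    return table.get((arch, bitness))


def order_builds(builds, preferred):
    """Put the preferred build first; keep the original order otherwise."""
    if preferred not in builds:
        return list(builds)
    return [preferred] + [b for b in builds if b != preferred]


# Hint values gathered on an earlier request and kept in the session.
listing = order_builds(BUILDS, preferred_build("arm", "64"))
# No usable platform data: fall back to the default order.
fallback = order_builds(BUILDS, preferred_build(None, None))
```

If the hints are only sent on some requests, the fallback branch is what users will see unless the data was stored when it first arrived.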

I feel I am getting stuck in a deep web of hypotheticals. I could talk for hours about all the different use cases and how things might turn out depending on implementation choices.
