Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

non-null body with GET/HEAD requests #1705

Closed
jasnell opened this issue Sep 26, 2023 · 11 comments
Closed

non-null body with GET/HEAD requests #1705

jasnell opened this issue Sep 26, 2023 · 11 comments

Comments

@jasnell
Copy link

jasnell commented Sep 26, 2023

@annevk @domenic ... I know that this has been discussed previously but I'd like to re-open the conversation about what happens when someone creates a Request object with a non-null body when method is equal to GET or HEAD. I am specifically referring to bullet point 35 at https://fetch.spec.whatwg.org/#dom-request

image

While the HTTP spec defines GET and HEAD such that request body payloads are meaningless, it does not forbid them, and there are production API in the wild (looking at you Elastic) that will allow specifying a body payload with a GET. Such APIs are currently not usable with fetch because of the restriction in the spec that requires that an error be thrown with non-null bodies.

This causes direct problems for us in the Workers runtime because we have cases where customers want to do something like...

const newRequest = new Request(request.url, { method: request.method, body: request.body });

Without having to introduce special case logic for when request.method happens to be GET or HEAD. In the implementation of Request on the server-side, we will allow receiving a GET with a content-length: 0 header that will result in the request.body being a zero-length stream rather than null.

The change in the spec that I would like to see is relaxing the requirement on bullet point 35 of https://fetch.spec.whatwg.org/#dom-request to say that a non-null body is always accepted by the Request constructor but that host implementations are free to ignore the body if the method used is known not to apply any semantics to the body payload. This would allow browser implementations to continue just ignoring the body payload for GET and HEAD but would allow other implementations to choose how to handle those.

As far as I can tell, such a relaxation would not be breaking for the web in any way, and would afford runtimes greater flexibility to choose how such methods are handled.

Specifically, the change would be that the following would no longer throw an error and implementations would be free to decide whether or not to consume the body based on it's understanding of the semantics of whatever method is used.

new Request('https://example.com', { body: new ReadableStream() });
new Request('https://example.com', { body: new Uint8Array(n));  // where n is any value
@kentonv
Copy link

kentonv commented Sep 26, 2023

we will allow receiving a GET with a content-length: 0

To clarify, it's more than that. Cloudflare Workers today supports receiving and proxying GET requests with bodies of any size, not just zero size. We had to, because these requests exist in the wild, and it's a basic requirement of our product that it can proxy any kind of request actually seen in the wild.

We still don't allow such requests to be constructed from scratch in JavaScript, but because they can arrive off the network, it creates some awkward situations where people have a Request object that's actually impossible to reproduce from its component parts.

@lucacasonato
Copy link
Member

All browsers also do not follow this bullet point in the spec. Chrome tried to ship a fix for this (aligning to spec) and found it was not web compatible.

@jasnell
Copy link
Author

jasnell commented Sep 27, 2023

@lucacasonato ... hmm, In chrome and edge the following throws an error, new Request("http://example.com', { method: 'GET', body: new Uint8Array(0) })

@lucacasonato
Copy link
Member

Oh sorry this is about request, I was thinking this was for response - all the browsers incorrectly return a body for HEAD fetch calls even though the spec says not to

@domenic
Copy link
Member

domenic commented Sep 27, 2023

Isn't this kind of spec-noncompliance what https://github.com/wintercg/fetch is for?

I don't think there's a reason to get rid of this error on the web, since such bodies are explicitly disallowed there.

@jasnell
Copy link
Author

jasnell commented Sep 27, 2023

Isn't this kind of spec-noncompliance what wintercg/fetch is for?

Yes and no. The goal is not really to maintain a forked spec that collects a bunch of non-compliant changes. Ideally if changes are needed an effort would be made to influence the direction of the core living spec first, thus this issue. If this effort to introduce this modification to the core fetch spec fails, then yes, my next step would be to seek consensus among wintercg participating runtimes to collectively agree that such a change is worthwhile.

I don't think there's a reason to get rid of this error on the web, since such bodies are explicitly disallowed there.

By "on the web" here I think you are specifically referring to Web browsers in particular, correct? The core specification of HTTP (which I would argue is generally more fundamentally "the web" but that's a debate for another time) specifically allows bodies for GET and HEAD even if those same specifications say they don't have any semantic meaning and should be avoided. The reality is that there are real world, in-production applications in use today on the web that do, in fact, attach a payload body to a GET request and that the fetch API as it is defined today is incapable of supporting those.

Also note that removing this error, the way I have frame it, would not change much of anything for existing Web browser implementations, which would still be allowed to entirely ignore the given non-null body when used with GET and HEAD, so if browsers wish to continue to disallow such bodies, they would simply continue to do so, albeit silently rather than with a throw.

@domenic
Copy link
Member

domenic commented Sep 27, 2023

Yes, by "the web" I mean it in the sense it's used in "WHATWG", i.e. web browsers.

@annevk
Copy link
Member

annevk commented Sep 27, 2023

At a minimum any "reopen" request should surface and study the older closed issue. I vaguely recall that even HTTP WG representatives didn't necessarily see the point of this.

@mnot
Copy link
Member

mnot commented Sep 27, 2023

Indeed. RFC9110 is very clear about this:

Although request message framing is independent of the method used, content received in a GET request has no generally defined semantics, cannot alter the meaning or target of the request, and might lead some implementations to reject the request and close the connection because of its potential as a request smuggling attack (Section 11.2 of [HTTP/1.1]). A client SHOULD NOT generate content in a GET request unless it is made directly to an origin server that has previously indicated, in or out of band, that such a request has a purpose and will be adequately supported. An origin server SHOULD NOT rely on private agreements to receive content, since participants in HTTP communication are often unaware of intermediaries along the request chain.

Previous HTTP specifications said that bodies were allowed on all methods, but that was only to allow new methods to be introduced without coordinating with every intermediary on the Web. It was not intended to retrofit bodies onto GET, and RFC7231 clarified that; when the message didn't get through, we (i.e., a consensus of HTTP implementers) made it more explicit in 9110.

Yes, we're well aware that some applications (e.g., Elasticsearch) do use bodies on GET. That is not interoperable on the open web; it only works in closed systems that, effectively, are not HTTP. While some might think that their deployment qualifies as such, it's rarely the case, because most people do not control every possible HTTP system and component that the messages might ever be handled by (eg load balancers, virus scanners, WAFs). Doing so also reduces the choice of tools and implementations that can be used in that deployment, reducing the value of using HTTP.

The HTTP WG is working on a better way to do this. That's not done yet, but in the meantime, POST is viable. I'm happy to help talk folks through the issues here if it will help.

@jasnell
Copy link
Author

jasnell commented Sep 27, 2023

Wow, I managed to completely miss that change in 9110!! Thank you @mnot and @annevk for that. Unfortunately, given that there are deployed applications out there that are still using payloads in GETs, runtimes are still going to need to deal with this case but given this I agree that changing the spec on this isn't worthwhile. I appreciate the quick turnaround on the discussion and will close this :-)

@jasnell jasnell closed this as completed Sep 27, 2023
@kentonv
Copy link

kentonv commented Sep 27, 2023

Yes, we're well aware that some applications (e.g., Elasticsearch) do use bodies on GET. That is not interoperable on the open web; it only works in closed systems that, effectively, are not HTTP. While some might think that their deployment qualifies as such, it's rarely the case, because most people do not control every possible HTTP system and component that the messages might ever be handled by (eg load balancers, virus scanners, WAFs). Doing so also reduces the choice of tools and implementations that can be used in that deployment, reducing the value of using HTTP.

FWIW, Cloudflare supports bodied GETs throughout its stack. In Workers specifically, it was very early on that we were forced to support this due to user complaints. So my expectation is that any HTTP proxy product is wide use is probably forced to support bodied GETs and therefore such support is a de facto standard.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

No branches or pull requests

6 participants