New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
wsgi: Don't break HTTP framing during 100-continue handling #578
Conversation
812fc43
to
ff61b94
Compare
This comment has been minimized.
This comment has been minimized.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is definitely more correct - existing behavior was wrong - great test example.
Unfortunately I have seen clients that try to incorrectly skip sending a body after they get a early error on 100-Continue - which may have been related to why the bug got introduced in the first place.
If any such clients still exist, they'll be surprised when the wsgi.server correctly starts interpreting their next request as the body of the already errored request! Still if we're going to wade into this confusing territory I'd rather be technically correct and confuse ill-behaved clients trying to cut corners rather than clients trying to do the right thing and keep using the socket with proper request framing when the server is slow to respond with the 100-Continue.
It may be safer in all cases to just close the socket after sending the 100-continue error response.
It would be nice to add some protection to how much we'll discard when the client and wsgi.app are having a hard time working out what they want to do.
That's a good idea! I wrote up tipabu@21a61d9 on top of this to implement it, but I'm still trying to decide whether I like my approach. Like, it feels weird that it would have to be set on every (rejected) request, but I'm not quite sure where to stuff it otherwise. It also feels awkward that the app would have to do the book-keeping if/when it does some reads before rejecting -- at that point, maybe it should just be a binary flag to say "trash this connection now" and skip the And that's even before I try to figure out where I ought to document such a feature 😛 |
It may not have been clear - I don't care about such clients; that was mostly a missive about how this corner of HTTP is wonky
This would probably best for everyone, does tipabu/eventlet@21a61d9 do that or something else? We can merge this in the meantime. |
ff61b94
to
3170078
Compare
I've cooled on this patch. Despite lots of resources making it clear that discarding the body is the right thing to do, the way that master works (guarding the discard so that it only happens when client did NOT send 100-Continue) actually seems to have the broader compatibility with how clients that send 100-Continue actually behave https://curl.se/mail/lib-2004-08/0002.html That is to say "clients that try to incorrectly skip sending a body after they get a early error on 100-Continue" are seemingly incredibly common (e.g. python boto, aws-sdk-go) and this change makes things worse. We should just close the connection in all cases; it's the only way to make sure the client and server agree on what to do next. Even if the body might be small, the cost of getting HTTP framing wrong is too high. There's no need to optimize; just close the connection. |
Agree with "just disconnect" solution. |
fa9408b
to
bded9f3
Compare
Updated to close the connection. |
Expect: 100-continue is a funny beast -- the client sends it to indicate that it's willing to wait for an early error, but - the client has no guarantee that the server supports 100 Continue, - the server gets no indication of how long the client's willing to wait for the go/no-go response, and - even if it did, the server has no way of knowing that the response it *emitted* within that time was actually *received* within that time - so the client may have started sending the body regardless of what the server's done. As a result, the server only has two options when it *does not* send the 100 Continue response: - close the socket - read and discard the request body Previously, we did neither of these things; as a result, a request body could be interpreted as a new request. Now, close out the connection, including sending a `Connection: close` header when practical.
bded9f3
to
579c498
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the tests are great, the finally block handling looks pretty good
The work to get the connection: close header on the wire raises the question for me how many apps were responding without reading the body and expect the server to discard it - outside of the corner case with 100-continue (and wonky clients that maybe won't send the body) it seems straight forward and might have been working well
not headers_sent and hasattr(result, '__len__') and \ | ||
'Content-Length' not in [h for h, _v in headers_set[1]]: | ||
headers_set[1].append(('Content-Length', str(sum(map(len, result))))) | ||
if headers_set and not headers_sent and hasattr(result, '__len__'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the boolean here is tricky "headers and not headers" but it's trying to say something like:
start_response called, but not sent to client AND we can do a cute len check on result
headers_set[1].append(('Content-Length', str(sum(map(len, result))))) | ||
if request_input.should_send_hundred_continue: | ||
# We've got a complete final response, and never sent a 100 Continue. | ||
# There's no chance we'll need to read the body as we stream out the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's no chance we'll need to read the body
because the response is a list of strings and not a generator or anything fancy; so we know it's the whole response.
# We've got a complete final response, and never sent a 100 Continue. | ||
# There's no chance we'll need to read the body as we stream out the | ||
# response, so we can be nice and send a Connection: close header. | ||
self.close_connection = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the point is it's entirely possible apps aren't reading the request body - that's why we had the discard
Either, we have to always read/discard the body FOR the app, or we have to close the connection.
I think this is an interesting case, but I'm curious if there's a more generic/maintainable way to express the same behavior. The "hey look we get to send the connection close header!" is a minor win over "rock solid http framing"
# connection and let the client retry. | ||
# https://curl.se/mail/lib-2004-08/0002.html | ||
# Note that we likely *won't* send a Connection: close header at this point | ||
self.close_connection = 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really the crux of it; we've handled this response - if we didn't send a 100-continue and the client was expecting one that's it for this connection
does it matter if it was a zero byte POST/PUT? can you even set expect-100 in that case?
so up above where we're being cute with hasattr __len__
and connection close headers; do we really want to close the connection, or do we want to set "should_send_hundred_continue" to False and let it discard?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it matter if it was a zero byte POST/PUT? can you even set expect-100 in that case?
Expect-100 with zero body is valid, with especially clear semantic for PUT (rewrite with empty). It's not economic to close connection in zero body request case, but probably it's also very rare.
so up above where we're being cute with hasattr len and connection close headers; do we really want to close the connection, or do we want to set "should_send_hundred_continue" to False and let it discard?
My vote for close as more robust option.
# Read and discard body if there was no pending 100-continue | ||
if not request_input.wfile and self.close_connection == 0: | ||
# Read and discard body if connection is going to be reused | ||
if self.close_connection == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this could probably be collapsed into a single expression:
there's body on the wire and we're still using this connection
and it's intirely independent from should_send_hundred_continue, which is nice
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Unsure about lot of comments. Waiting for hard do or don't merge for few days. |
We shipped it exactly like this and it rolled out yesterday - no fires yet 🤣 I like everything except for the tricky bit where we get fancy right after calling the applicaiton trying to see if we can send headers - it's such a ninche case [1] - marginal value at the expense of duplicated If we ever did want to improve our expect-100-continue connection handling, i.e.:
... we'd have to make sure we remember to do that in two places. I don't see an obvious way to DRY it out - it's necessary complexity if we want to handle the speical case.
|
@clayg right, the two places. |
Expect: 100-continue
is a funny beast -- the client sends it to indicate that it's willing to wait for an early error, but100 Continue
,As a result, the server only has two options when it does not send the
100 Continue
response:Previously, we did neither of these things; as a result, a request body could be interpreted as a new request. Now, close the connection.
read and discard the body (in part because we already have tests that exercise pipelined requests after an early response).Note that this is some pretty old bad behavior; however, prior to 5b5afd1 we actually did the right thing!