Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

wsgi: Don't break HTTP framing during 100-continue handling #578

Merged
merged 1 commit into from Sep 17, 2021

Conversation

tipabu
Copy link
Contributor

@tipabu tipabu commented Jun 26, 2019

Expect: 100-continue is a funny beast -- the client sends it to indicate that it's willing to wait for an early error, but

  • the client has no guarantee that the server supports 100 Continue,
  • the server gets no indication of how long the client's willing to wait for the go/no-go response, and
  • even if it did, the server has no way of knowing that the response it emitted within that time was actually received within that time
  • so the client may have started sending the body regardless of what the server's done.

As a result, the server only has two options when it does not send the 100 Continue response:

  • close the socket
  • read and discard the request body

Previously, we did neither of these things; as a result, a request body could be interpreted as a new request. Now, close the connection. read and discard the body (in part because we already have tests that exercise pipelined requests after an early response).

Note that this is some pretty old bad behavior; however, prior to 5b5afd1 we actually did the right thing!

@codecov-io

This comment has been minimized.

Copy link
Contributor

@clayg clayg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely more correct - existing behavior was wrong - great test example.

Unfortunately I have seen clients that try to incorrectly skip sending a body after they get a early error on 100-Continue - which may have been related to why the bug got introduced in the first place.

If any such clients still exist, they'll be surprised when the wsgi.server correctly starts interpreting their next request as the body of the already errored request! Still if we're going to wade into this confusing territory I'd rather be technically correct and confuse ill-behaved clients trying to cut corners rather than clients trying to do the right thing and keep using the socket with proper request framing when the server is slow to respond with the 100-Continue.

It may be safer in all cases to just close the socket after sending the 100-continue error response.

It would be nice to add some protection to how much we'll discard when the client and wsgi.app are having a hard time working out what they want to do.

@tipabu
Copy link
Contributor Author

tipabu commented Jun 27, 2019

It would be nice to add some protection to how much we'll discard when the client and wsgi.app are having a hard time working out what they want to do.

That's a good idea! I wrote up tipabu@21a61d9 on top of this to implement it, but I'm still trying to decide whether I like my approach. Like, it feels weird that it would have to be set on every (rejected) request, but I'm not quite sure where to stuff it otherwise. It also feels awkward that the app would have to do the book-keeping if/when it does some reads before rejecting -- at that point, maybe it should just be a binary flag to say "trash this connection now" and skip the read()s...

And that's even before I try to figure out where I ought to document such a feature 😛

@clayg
Copy link
Contributor

clayg commented Sep 23, 2020

clients that try to incorrectly skip sending a body after they get a early error on 100-Continue

It may not have been clear - I don't care about such clients; that was mostly a missive about how this corner of HTTP is wonky

It may be safer in all cases to just close the socket after sending the 100-continue error response.

This would probably best for everyone, does tipabu/eventlet@21a61d9 do that or something else?

We can merge this in the meantime.

@clayg
Copy link
Contributor

clayg commented Sep 9, 2021

I've cooled on this patch.

Despite lots of resources making it clear that discarding the body is the right thing to do, the way that master works (guarding the discard so that it only happens when client did NOT send 100-Continue) actually seems to have the broader compatibility with how clients that send 100-Continue actually behave

https://curl.se/mail/lib-2004-08/0002.html

That is to say "clients that try to incorrectly skip sending a body after they get a early error on 100-Continue" are seemingly incredibly common (e.g. python boto, aws-sdk-go) and this change makes things worse.

We should just close the connection in all cases; it's the only way to make sure the client and server agree on what to do next. Even if the body might be small, the cost of getting HTTP framing wrong is too high. There's no need to optimize; just close the connection.

@temoto
Copy link
Member

temoto commented Sep 10, 2021

Agree with "just disconnect" solution.

@tipabu tipabu force-pushed the 100-continue-framing branch 2 times, most recently from fa9408b to bded9f3 Compare September 10, 2021 17:57
@tipabu
Copy link
Contributor Author

tipabu commented Sep 10, 2021

Updated to close the connection.

Expect: 100-continue is a funny beast -- the client sends it to indicate
that it's willing to wait for an early error, but

- the client has no guarantee that the server supports 100 Continue,
- the server gets no indication of how long the client's willing to wait
  for the go/no-go response, and
- even if it did, the server has no way of knowing that the response it
  *emitted* within that time was actually *received* within that time
- so the client may have started sending the body regardless of what the
  server's done.

As a result, the server only has two options when it *does not* send the
100 Continue response:

- close the socket
- read and discard the request body

Previously, we did neither of these things; as a result, a request body
could be interpreted as a new request. Now, close out the connection,
including sending a `Connection: close` header when practical.
Copy link
Contributor

@clayg clayg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the tests are great, the finally block handling looks pretty good

The work to get the connection: close header on the wire raises the question for me how many apps were responding without reading the body and expect the server to discard it - outside of the corner case with 100-continue (and wonky clients that maybe won't send the body) it seems straight forward and might have been working well

not headers_sent and hasattr(result, '__len__') and \
'Content-Length' not in [h for h, _v in headers_set[1]]:
headers_set[1].append(('Content-Length', str(sum(map(len, result)))))
if headers_set and not headers_sent and hasattr(result, '__len__'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the boolean here is tricky "headers and not headers" but it's trying to say something like:

start_response called, but not sent to client AND we can do a cute len check on result

headers_set[1].append(('Content-Length', str(sum(map(len, result)))))
if request_input.should_send_hundred_continue:
# We've got a complete final response, and never sent a 100 Continue.
# There's no chance we'll need to read the body as we stream out the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's no chance we'll need to read the body

because the response is a list of strings and not a generator or anything fancy; so we know it's the whole response.

# We've got a complete final response, and never sent a 100 Continue.
# There's no chance we'll need to read the body as we stream out the
# response, so we can be nice and send a Connection: close header.
self.close_connection = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the point is it's entirely possible apps aren't reading the request body - that's why we had the discard

Either, we have to always read/discard the body FOR the app, or we have to close the connection.

I think this is an interesting case, but I'm curious if there's a more generic/maintainable way to express the same behavior. The "hey look we get to send the connection close header!" is a minor win over "rock solid http framing"

# connection and let the client retry.
# https://curl.se/mail/lib-2004-08/0002.html
# Note that we likely *won't* send a Connection: close header at this point
self.close_connection = 1
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is really the crux of it; we've handled this response - if we didn't send a 100-continue and the client was expecting one that's it for this connection

does it matter if it was a zero byte POST/PUT? can you even set expect-100 in that case?

so up above where we're being cute with hasattr __len__ and connection close headers; do we really want to close the connection, or do we want to set "should_send_hundred_continue" to False and let it discard?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does it matter if it was a zero byte POST/PUT? can you even set expect-100 in that case?

Expect-100 with zero body is valid, with especially clear semantic for PUT (rewrite with empty). It's not economic to close connection in zero body request case, but probably it's also very rare.

so up above where we're being cute with hasattr len and connection close headers; do we really want to close the connection, or do we want to set "should_send_hundred_continue" to False and let it discard?

My vote for close as more robust option.

# Read and discard body if there was no pending 100-continue
if not request_input.wfile and self.close_connection == 0:
# Read and discard body if connection is going to be reused
if self.close_connection == 0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this could probably be collapsed into a single expression:

there's body on the wire and we're still using this connection

and it's intirely independent from should_send_hundred_continue, which is nice

Copy link
Member

@temoto temoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@temoto
Copy link
Member

temoto commented Sep 17, 2021

Unsure about lot of comments. Waiting for hard do or don't merge for few days.

@clayg
Copy link
Contributor

clayg commented Sep 17, 2021

We shipped it exactly like this and it rolled out yesterday - no fires yet 🤣

I like everything except for the tricky bit where we get fancy right after calling the applicaiton trying to see if we can send headers - it's such a ninche case [1] - marginal value at the expense of duplicated if should send 100: close = 1 logic

If we ever did want to improve our expect-100-continue connection handling, i.e.:

  1. let 0-byte bodies stay open (impossible to get framing wrong)
  2. let chunked-transfer maybe send 0\r\n\r\n (impossible to confuse with next HTTP request)
  3. interpolate next bytes and attempt to catagorize as body or request, allowing small bodies to be discarded (kinda sketch)

... we'd have to make sure we remember to do that in two places. I don't see an obvious way to DRY it out - it's necessary complexity if we want to handle the speical case.

  1. maybe not so niche? When we return a 429 to an expect-100 PUT our resp probably passes in the "list-of-strings" test.

@temoto
Copy link
Member

temoto commented Sep 17, 2021

@clayg right, the two places. eventlet.wsgi code could use refactoring for easier understanding, including breaking the big request processing function into few parts. It could be a single close=1 then. Thanks for review and testing.

@temoto temoto merged commit 579c498 into eventlet:master Sep 17, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants