Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read remaining bytes when cleaning dropped payload #2764 #2772

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

squidpickles
Copy link

PR Type

Bug Fix

PR Checklist

  • Tests for the changes have been added / updated.
  • Documentation comments have been added / updated.
  • A changelog entry has been made for the appropriate packages.
  • Format code with the latest stable rustfmt.
  • (Team) Label with affected crates and semver status.

Overview

Existing behavior did not completely read client data if a Payload was dropped without consuming all of its data. This calls read_available() before each Payload chunk is discarded. This does add a read_available_projected() so it can be called when only a pinned reference to the dispatcher is available.

Closes #2764

@squidpickles
Copy link
Author

I noticed curl stops sending bytes when the request is over a certain size, and the server returns an error response. (curl message HTTP error before end of send, stop sending)

I can see that we're reading zero bytes in the added read here https://github.com/squidpickles/actix-web/blob/30004d5b9f7fb7f03ed98f5a6f0f26ee67c17186/actix-http/src/h1/dispatcher.rs#L710 but it's not clear whether the true return (called should_disconnect elsewhere) is correct in all cases. Investigating... (also not clear how to write a unit test that mimics this client behavior)

@squidpickles
Copy link
Author

Still working on this, partly because I realized I'm not totally clear what the desired behavior should be here.

Right now (in the released code), if the Payload is dropped before it has been read completely, the Dispatcher will attempt to "clean" the connection by consuming any bytes in the read buffer. The trouble is that if there are more than decoder::MAX_BUFFER_SIZE bytes sent by the client, the cleaning process will fail (because PayloadDecoder::decode() will be called with bytes remaining and an empty read buffer), and it will jump to the 500 error and connection close.

I added an additional read into the cleaning loop if the read buffer is empty, so now the bytes will be consumed even if there are more than a single read buffer's worth.

However, the situation is complicated because a handler that does not consume the whole Payload will return a response to the client before all bytes are read from the wire. Some clients, such as cURL, will stop sending data if they get an early error response.

I think the main thing I would change here is the 500 error response, and the error logging. It seems best not to log a server error when it's a relic of client behavior. In this case, the 500 is triggered by a client either disconnecting or stopping sending bytes before the cleaning process completes. The handler has already sent a response if it was ever going to, so nothing further really needs to be sent to the client. The only case where it should be logging an error is if there's some reason the server can't read, but the client is still sending bytes. I'm not sure that's something we'd catch here and not elsewhere, so probably not worth treating as a potential error in this situation.

So my proposal is to disconnect when client send terminates early (fewer bytes are sent than the content-length header suggests) as the server does now, but not to send an additional 500 response, and not to log an error. All while preserving the extra read added in this PR, so larger requests still get cleaned properly.

@squidpickles
Copy link
Author

Ok, well, it looks like that extra read hangs the worker. I haven't worked out the polling mechanism well enough to understand why. Going to leave this PR open and maybe someone with better understanding can comment, but for now, I'll work around it and explicitly consume the entire Payload before returning from my handlers.

@robjtede robjtede mentioned this pull request Jun 11, 2022
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure to flush Payload when it is dropped before handler consumes all bytes
1 participant