
rustls: only return CURLE_AGAIN when TLS session is fully drained #6894

Closed
wants to merge 1 commit

Conversation

@jblazquez
Contributor

@jblazquez jblazquez commented Apr 14, 2021

The code in cr_recv was returning prematurely as soon as the socket reported no more data to read. However, this could leave unread plaintext data in the rustls session from a previous call, causing the transfer to hang if the socket never receives further data. We need to ensure that the session is fully drained of plaintext before returning CURLE_AGAIN to the caller.

To illustrate the problem, consider the following sequence of events as an example:

  1. The server writes a 20KB plaintext response body.
  2. The response gets sent in two TLS records: one 16KB record, and one 8KB record.
  3. The TLS stream gets sent as two TCP segments: one 15KB segment, and one 9KB segment.
  4. The client receives the first TCP segment.
  5. cr_recv is called and sread returns 15KB of TLS data.
  6. rustls_client_session_read_tls is passed the 15KB of data and stores it.
  7. rustls_client_session_process_new_packets is called but since the first TLS record isn't complete it does nothing.
  8. rustls_client_session_read returns EOF because there is no plaintext data to read yet.
  9. cr_recv returns CURLE_AGAIN.
  10. The client receives the second TCP segment.
  11. cr_recv is called and sread returns 9KB of TLS data.
  12. rustls_client_session_read_tls is passed the 9KB of data and stores it.
  13. rustls_client_session_process_new_packets is called, and since both TLS records are complete, it processes them and extracts the 20KB of plaintext data.
  14. rustls_client_session_read returns 16KB of plaintext, because that's the size of the plainbuf passed to cr_recv.
  15. cr_recv returns 16KB.
  16. cr_recv is called again and sread returns EAGAIN since there are no more TCP segments to read.
  17. cr_recv returns CURLE_AGAIN prematurely, even though there are still 4KB of plaintext to be read from the session.

I don't have an easy repro for you, but the problem manifests very frequently in our internal application. These are the corresponding curl trace logs with some additional logs that I added for debugging:

STATE: PROTOCONNECTING => DO handle 0x1d05e7f8; line 1864 (connection #13)
STATE: DO => DID handle 0x1d05e7f8; line 1922 (connection #13)
STATE: DID => PERFORMING handle 0x1d05e7f8; line 2041 (connection #13)
sread: 14600 bytes
rustls_client_session_read_tls: 14600 bytes
EOF from rustls_client_session_read
cr_read: CURLE_AGAIN
sread: 14600 bytes
rustls_client_session_read_tls: 3837 bytes
rustls_client_session_read_tls: 10763 bytes
rustls_client_session_read: 16384 bytes
cr_read: 16384 bytes
Mark bundle as not supporting multiuse
HTTP 1.1 or later with persistent connection
sread: 14600 bytes
rustls_client_session_read_tls: 5650 bytes
rustls_client_session_read_tls: 8950 bytes
rustls_client_session_read: 14584 bytes
EOF from rustls_client_session_read
cr_read: 14584 bytes
sread: 16709 bytes
rustls_client_session_read_tls: 5663 bytes
rustls_client_session_read_tls: 11046 bytes
rustls_client_session_read: 16384 bytes
cr_read: 16384 bytes
sread: 16709 bytes
rustls_client_session_read_tls: 15904 bytes
rustls_client_session_read_tls: 805 bytes
rustls_client_session_read: 16384 bytes
cr_read: 16384 bytes
sread: 16709 bytes
rustls_client_session_read_tls: 16709 bytes
rustls_client_session_read: 16384 bytes
cr_read: 16384 bytes
sread: 8273 bytes
rustls_client_session_read_tls: 8273 bytes
rustls_client_session_read: 12009 bytes
EOF from rustls_client_session_read
cr_read: 12009 bytes
sread: 14600 bytes
rustls_client_session_read_tls: 8627 bytes
rustls_client_session_read_tls: 5973 bytes
rustls_client_session_read: 16384 bytes
cr_read: 16384 bytes
sread: 16709 bytes
rustls_client_session_read_tls: 10440 bytes
rustls_client_session_read_tls: 6269 bytes
rustls_client_session_read: 8981 bytes
EOF from rustls_client_session_read
cr_read: 8981 bytes
sread: 9902 bytes
rustls_client_session_read_tls: 2741 bytes
rustls_client_session_read_tls: 7161 bytes
rustls_client_session_read: 16384 bytes <- we read 16KB but there are still some plaintext bytes left in the session
cr_read: 16384 bytes <- we return the 16KB that we read
sread: EAGAIN or EWOULDBLOCK <- we return CURLE_AGAIN prematurely
... 10 seconds later ...
Operation timed out after 10006 milliseconds with 133660 bytes received
multi_done
The cache now contains 5 members
Closing connection 13
Expire cleared (transfer 0x1d05e7f8)
@jblazquez jblazquez force-pushed the rustls-session-drain branch from 6374c26 to 3733ec4 Apr 14, 2021
@bagder
Member

@bagder bagder commented Apr 15, 2021

/cc @jsha


@bagder bagder added the TLS label Apr 15, 2021
jsha approved these changes Apr 15, 2021
Contributor

@jsha jsha left a comment

Thanks for the patch, and the detailed breakdown of the issue @jblazquez. I agree with your assessment of the bug, and the patch looks good to me.


bagder approved these changes Apr 15, 2021
@bagder bagder closed this in 40d2d39 Apr 15, 2021
@bagder
Member

@bagder bagder commented Apr 15, 2021

Thanks, both of you!

