HTTP pipelining and reusing easy handles causes mismatched request/response #1380

Closed
ahatala opened this Issue Apr 3, 2017 · 7 comments

ahatala commented Apr 3, 2017

I've narrowed down an issue in our use of libcurl to this repro case. Getting the repro to trigger seems to be timing-dependent, so it might take a couple of runs to hit.

This is CURL verbose output for a failed test run: fail.txt

The user-visible problem is that the response data for a simple HTTP request occasionally gets associated with another HTTP request in flight (to the same host) when multiple requests are pipelined over the same connection. curl's pipelining maintains a queue of easy handles to feed the results to, and under some conditions it seems that the queue ends up in the wrong order. A necessary condition for triggering the error seems to be that there are requests both in sending and receiving states; the repro case accomplishes this by pushing in new requests in response to reading responses.
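
For reference, the driver loop described above looks roughly like the sketch below. This is not the reporter's actual repro: the URL and request counts are placeholders, fresh easy handles are created rather than reused, and error checking is omitted. It only exercises the same pattern of pipelining requests over one connection and adding a new request each time a response completes.

/* Minimal sketch only, not the reporter's repro.  Response bodies go to
 * stdout by default; error checking omitted for brevity. */
#include <curl/curl.h>

static void add_request(CURLM *multi, const char *url)
{
  CURL *easy = curl_easy_init();
  curl_easy_setopt(easy, CURLOPT_URL, url);
  curl_easy_setopt(easy, CURLOPT_PIPEWAIT, 1L);      /* prefer pipelining over opening a new connection */
  curl_multi_add_handle(multi, easy);
}

int main(void)
{
  const char *url = "http://example.com/";           /* placeholder */
  int running = 0, outstanding = 0, issued = 0;
  CURLM *multi;

  curl_global_init(CURL_GLOBAL_DEFAULT);
  multi = curl_multi_init();
  curl_multi_setopt(multi, CURLMOPT_PIPELINING, 1L); /* enable HTTP/1.1 pipelining */

  while(issued < 4) {                                /* seed a few requests */
    add_request(multi, url);
    issued++;
    outstanding++;
  }

  do {
    CURLMsg *msg;
    int queued, numfds;
    curl_multi_perform(multi, &running);
    curl_multi_wait(multi, NULL, 0, 100, &numfds);
    while((msg = curl_multi_info_read(multi, &queued))) {
      if(msg->msg == CURLMSG_DONE) {
        curl_multi_remove_handle(multi, msg->easy_handle);
        curl_easy_cleanup(msg->easy_handle);
        outstanding--;
        if(issued < 64) {                            /* push in a new request in
                                                        response to a completed one */
          add_request(multi, url);
          issued++;
          outstanding++;
        }
      }
    }
  } while(outstanding);

  curl_multi_cleanup(multi);
  curl_global_cleanup();
  return 0;
}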

I'm using the current GitHub master (commit 55f4aba), but the issue triggers with stable versions as well (I tried 7_39_0 and 7_53_1). The issue seems platform-independent; FWIW, I'm running on Win10 with the "LIB Debug - DLL Windows SSPI" configuration.

I can get the issue to repro against a number of different HTTP servers and I've verified that there's nothing suspicious in the data returned by the server on the wire.

I'm happy to provide any further data that may be useful for tracking down the root cause.

ahatala commented Apr 3, 2017

I spent some time root-causing this, and I believe I know what is happening.

The culprit seems to be USE_RECV_BEFORE_SEND_WORKAROUND. Received data may be buffered in connectdata.postponed, and it might get silently discarded on connection reuse in the call to conn_reset_all_postponed_data() at url.c:6232:

conn_reset_all_postponed_data(conn); /* reset unprocessed data */

This causes the next read to return the response of a more recent request. I suppose with larger payloads the buffered data might not be limited to full response boundaries either, so garbage would be read in place of valid HTTP headers.

Since this requires USE_RECV_BEFORE_SEND_WORKAROUND, my original statement about the issue being platform-independent is clearly wrong; the issue is Windows-specific.

I don't understand the motivation behind discarding buffered data upon connection reuse, but just commenting out that offending line does seem to resolve the issue.
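
To make the failure mode concrete, here is a self-contained sketch of what discarding the postponed buffer does to the byte stream. This is not curl's code: fake_conn and its fields are made up, standing in for connectdata and the recv-before-send workaround buffer.

/* NOT curl's code.  Response #1 was received early and postponed; resetting
 * the buffer on connection reuse makes the next read start later in the byte
 * stream, so the reader for request #1 gets response #2 (or, with larger
 * payloads, bytes from the middle of a response instead of valid headers). */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct fake_conn {
  char *postponed;        /* bytes received early, not yet handed to the reader */
  size_t postponed_len;
  const char *wire;       /* what would be read from the socket next */
};

static size_t fake_read(struct fake_conn *c, char *buf, size_t len)
{
  size_t n;
  if(c->postponed_len) {  /* drain postponed bytes first, as the workaround intends */
    n = c->postponed_len < len ? c->postponed_len : len;
    memcpy(buf, c->postponed, n);
    memmove(c->postponed, c->postponed + n, c->postponed_len - n);
    c->postponed_len -= n;
    return n;
  }
  n = strlen(c->wire) < len ? strlen(c->wire) : len;  /* otherwise "read from the network" */
  memcpy(buf, c->wire, n);
  return n;
}

int main(void)
{
  struct fake_conn c;
  char buf[256];
  size_t n;
  const char *resp1 = "HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nok";

  c.postponed = malloc(strlen(resp1));               /* response #1, postponed (checks omitted) */
  c.postponed_len = strlen(resp1);
  memcpy(c.postponed, resp1, c.postponed_len);
  c.wire = "HTTP/1.1 404 Not Found\r\nContent-Length: 0\r\n\r\n"; /* response #2 */

  /* what the reset on connection reuse effectively does;
   * without it, fake_read() would return the postponed 200 for request #1 */
  free(c.postponed);
  c.postponed = NULL;
  c.postponed_len = 0;

  n = fake_read(&c, buf, sizeof(buf) - 1);           /* read for request #1 */
  buf[n] = '\0';
  printf("request #1 got:\n%s\n", buf);              /* prints the 404 meant for request #2 */
  return 0;
}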

jay added the HTTP label Apr 3, 2017

jay commented Apr 3, 2017

Yeah, that looks strange; it probably needs a move/copy_all_postponed_data instead of freeing it.

> I suppose with larger payloads the buffered data might not be limited to full response boundaries either, so garbage would be read in place of valid HTTP headers.

Can you explain how that could happen?

bagder commented Apr 3, 2017

That code section comes from commit 72d5e14, issue #657 and PR #668 ... Maybe we just need to add a condition to not do that if pipelining is used?
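
For illustration only, the guard being suggested would look roughly like the fragment below; pipelining_in_use() is a hypothetical stand-in for whatever check curl would actually use, not a real libcurl function.

if(!pipelining_in_use(conn)) /* hypothetical pipelining check */
  conn_reset_all_postponed_data(conn); /* reset unprocessed data */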

jay commented Apr 3, 2017

I think it would be better to just keep it regardless, so it doesn't come back to bite us some other way. I don't understand why it's being freed there.

jay added a commit to jay/curl that referenced this issue Apr 4, 2017

url: fix conn reuse when postponed data. draft1
- Don't free the postponed struct on conn reuse; transfer it to the
  replacement conn instead.

- Encapsulate postponed struct initialization.

Bug: curl#1380
Reported-by: Antti Hätälä

ahatala commented Apr 4, 2017

@jay I can test later, but I don't think that patch is going to work. The original problem is that there is postponed data in "conn" that gets reset. There is no postponed data in "old_conn", and how could there be, since that connection was just created? Your change overwrites existing postponed data with no postponed data, leaking the buffer(s) in doing so. Right?
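
The leak being described can be shown with a tiny stand-alone example; struct postponed and its fields are hypothetical stand-ins, not curl's connectdata bookkeeping.

/* Hypothetical stand-in for the postponed bookkeeping, not curl's struct. */
#include <stdlib.h>

struct postponed {
  char *buf;              /* owned heap buffer */
  size_t len;
};

int main(void)
{
  struct postponed conn_pp = { malloc(128), 128 };  /* reused conn: holds buffered data */
  struct postponed old_conn_pp = { NULL, 0 };       /* just-created conn: empty */

  /* Overwriting the populated side with the empty side drops the only
   * pointer to the 128-byte buffer: the buffered response bytes are gone
   * and the memory is leaked. */
  conn_pp = old_conn_pp;

  (void)conn_pp;
  return 0;
}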

jay closed this in 36e604f Apr 4, 2017

jay commented Apr 4, 2017

> Your change overwrites existing postponed data with no postponed data, leaking the buffer(s) in doing so. Right?

Yes, I had it backwards: old_conn is a pointer to the temporary connection struct created after conn, and conn holds the connection, so there's no reason to transfer. Your fix is fine then. Thanks for the great analysis and repro case.

lock bot locked as resolved and limited conversation to collaborators May 6, 2018
