curl_easy_recv(): Handle end of stream (EOS) properly #1134

Closed
wants to merge 2 commits into
from

Projects

None yet

5 participants

@mkauf
Contributor
mkauf commented Nov 21, 2016

Return CURLE_OK and set *n to 0 on EOS. Fix the example program sendrecv.c
(handle CURLE_AGAIN, handle incomplete send) and improve the documentation for
curl_easy_recv() and curl_easy_send().

Reviewed-by: Frank Meier

@mention-bot

@mkauf, thanks for your PR! By analyzing the history of the files in this pull request, we identified @bagder, @yangtse and @dfandrich to be potential reviewers.

@jay
Member
jay commented Nov 21, 2016 edited

Ref #1129 (comment)

We could probably do for a better adjustment in curl_easy_recv, I think the existing published wording is misleading:

You must ensure that the socket has data to read before calling \fIcurl_easy_recv(3)\fP, otherwise the call will return \fBCURLE_AGAIN\fP - the socket is used in non-blocking mode internally.

The problem as we recently discussed is for example that we have SSL libraries that will receive data in records, so we decrypt the record and hold the result of the data (or the SSL lib does it for us) if it was more than what was asked for. I suspect (but I haven't reviewed the connect_only code paths to prove it) that in those cases waiting on the socket could just end up waiting forever until timeout (and possible abort) if there's no more data coming, depriving the user of the cached data.

I think it may be better to encourage the user to call curl_easy_recv until it returns CURLE_AGAIN, and repackage what you said about the SSL like this:

Wait until curl_easy_recv returns CURLE_AGAIN before waiting on the socket for
received data. The reason for this is libcurl or the SSL library may internally
cache some data, therefore you should call curl_easy_recv until all data is
read which would include any cached data.

Furthermore if you wait on the socket and it tells you there is data to read
curl_easy_recv may return CURLE_AGAIN if the only data was internal SSL
processing, and no other data is available.
@frenche
Contributor
frenche commented Nov 21, 2016

@mkauf, what's the benefit of this change? maybe some apps rely on current behaviour (return value).

The problem as we recently discussed is for example that we have SSL libraries that will receive data in records, so we decrypt the record and hold the result of the data (or the SSL lib does it for us) if it was more than what was asked for.

@jay, I think usually when you get all the bytes you asked for, then you consider the socket still readable.
While if you get less than what you asked for, then you consider it as EAGAIN and wait with epoll / select.

@mkauf
Contributor
mkauf commented Nov 22, 2016

@frenche According to the current documentation, the current implementation is wrong (or vice versa). I think the documentation makes more sense. Currently curl_easy_recv() returns CURLE_UNSUPPORTED_PROTOCOL when there are no more bytes to read.

I think usually when you get all the bytes you asked for, then you consider the socket still readable.

Expert programmers will probably do this, but it's easier to explain to "just call curl_easy_recv() until it returns CURLE_AGAIN."

@jay, thank you for the suggestion! I think that's a good idea. I will include it in an update of this pull request.

@frenche
Contributor
frenche commented Nov 22, 2016

@mkauf right, I didn't realize the doc says you get zero bytes when the connection is closed.

I wonder why we need to peek at the connection in Curl_getconnectinfo(), can't we let subsequent recv call to find out the status? it looks a bit strange, especially when it is invoked from curl_easy_getinfo().

@bagder
Member
bagder commented Nov 27, 2016

wonder why we need to peek at the connection in Curl_getconnectinfo()

Because we want to avoid sending back a socket to an already dead connection and that is one way to figure that out. We can of course debate the value in preventing that, because as you say the following recv() on that socket would return error anyway.

@mkauf mkauf curl_easy_recv(): Handle end of stream (EOS) properly
Return CURLE_OK and set *n to 0 on EOS. Fix the example program sendrecv.c
(handle CURLE_AGAIN, handle incomplete send) and improve the documentation for
curl_easy_recv() and curl_easy_send().

Reviewed-by: Frank Meier
Assisted-by: Jay Satiro
8b03a4e
@mkauf
Contributor
mkauf commented Nov 28, 2016

I have updated the pull request and included @jay's suggestion.

@frenche
Contributor
frenche commented Nov 29, 2016

Because we want to avoid sending back a socket to an already dead connection and that is one way to figure that out. We can of course debate the value in preventing that, because as you say the following recv() on that socket would return error anyway.

I see, but anyway we cannot guarantee the socket would still be valid when the app calls curl_easy_recv(), as the peer might reset it in between, no?

The reason I'm debating it here, is because I think @mkauf's patch could be greatly simplified if we get rid of this check (and also because I think curl_easy_getinfo() shouldn't do any socket operations).

I've looked up Curl_getconnectinfo() and it is invoked in four places.

  • Twice in curl_easy_getinfo(), where I deem the check to be unnecessary.
  • Once in Curl_rtsp_connisdead(), we could move the check to there.
  • Once in easy_connection(), invoked from curl_easy_recv/send(), where I think the check is unnecessary.

Also, in the case where it is invoked from curl_easy_send(), I think the check is conceptually wrong, because in theory (imaginary protocol, not HTTP) the peer might have called shutdown(SHUT_WR) to close its write-end (our read-end) but still expect to read data from us.

docs/libcurl/opts/CURLOPT_CONNECT_ONLY.3
@@ -49,3 +49,4 @@ Added in 7.15.2
Returns CURLE_OK if the option is supported, and CURLE_UNKNOWN_OPTION if not.
.SH "SEE ALSO"
.BR CURLOPT_VERBOSE "(3), " CURLOPT_HTTPPROXYTUNNEL "(3), "
+.BR curl_easy_receive "(3), " curl_easy_send "(3) "
@frenche
frenche Nov 29, 2016 Contributor

This should be curl_easy_recv not receive, it should fix test 1140 failure.

@mkauf mkauf fix documentation
b09eaab
@bagder
Member
bagder commented Dec 3, 2016

The reason I'm debating it here, is because I think @mkauf's patch could be greatly simplified if we get rid of this check

@frenche, I think you've made a really solid and convincing argument, thanks! We really don't gain anything with that check and we should remove it!

@frenche
Contributor
frenche commented Dec 6, 2016

Please take a look at: https://github.com/frenche/curl/tree/avoid_socket_check
It gets rid of the check, except for the RTSP case.
@mkauf I've tested your updated test 556 before applying this patch and it failed, and after applying the patch it succeeds, can you tell if it is alright? thanks,

@mkauf mkauf added a commit that closed this pull request Dec 18, 2016
@frenche @mkauf frenche + mkauf Curl_getconnectinfo: avoid checking if the connection is closed
It doesn't benefit us much as the connection could get closed at
any time, and also by checking we lose the ability to determine
if the socket was closed by reading zero bytes.

Reported-by: Michael Kaufmann

Closes #1134
82245ea
@mkauf mkauf closed this in 82245ea Dec 18, 2016
@mkauf mkauf added a commit that referenced this pull request Dec 18, 2016
@mkauf mkauf curl_easy_recv: Improve documentation and example program
Follow-up to 82245ea: Fix the example program sendrecv.c (handle
CURLE_AGAIN, handle incomplete send). Improve the documentation
for curl_easy_recv() and curl_easy_send().

Reviewed-by: Frank Meier
Assisted-by: Jay Satiro

See #1134
afff64d
@mkauf mkauf deleted the mkauf:easy_recv_send_fix_pr branch Dec 18, 2016
@mkauf
Contributor
mkauf commented Dec 18, 2016

@frenche Thank you, your commit is alright! I have made some cosmetic changes and pushed it, together with a second commit to improve the example program and the documentation.

@frenche
Contributor
frenche commented Dec 18, 2016

Great, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment