Operation timed out after 0 milliseconds with 0 out of 0 bytes received #619

Closed
dan-cristian opened this Issue Jan 27, 2016 · 13 comments

Projects

None yet

5 participants

@dan-cristian

I use CURL to perform upload/download operations via HTTP from an openstack Swift server (www.hubic.com).

I get random errors "Operation timed out after 0 milliseconds with 0 out of 0 bytes received" both on HTTP PUT and GET operations. After several retries it will eventually work.
The message seems to be incorrect as the timeout is set to 20 ms.
Can someone confirm if this timeout occurs in the SSL handshake? If yes, are there any other curl flags I can use to increase connection reliability?

The code excerpts and some debug data can be found below.

Code extras
curl_easy_setopt(curl, CURLOPT_URL, url);
curl_easy_setopt(curl, CURLOPT_HEADER, 0);
curl_easy_setopt(curl, CURLOPT_NOSIGNAL, 1);
curl_easy_setopt(curl, CURLOPT_NOPROGRESS, 0);
curl_easy_setopt(curl, CURLOPT_USERAGENT, USER_AGENT);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYPEER, 1);
curl_easy_setopt(curl, CURLOPT_SSL_VERIFYHOST, 1);
curl_easy_setopt(curl, CURLOPT_CONNECTTIMEOUT, 20);
curl_easy_setopt(curl, CURLOPT_VERBOSE, 1);
curl_easy_setopt(curl, CURLOPT_TCP_NODELAY, 1);
curl_easy_setopt(curl, CURLOPT_UPLOAD, 1);
curl_easy_perform(curl);

Debug output
CURL version=7.38.0 ssl=OpenSSL/1.0.1k
[2016-01-27 23:18:37.]:26085==adding extra header: X-Copy-From: /default_fuse_segments/default/test/t2_segments/medium.txt/1453929407.883449/00000003
[2016-01-27 23:18:37.]:26085==adding extra header: Content-Length: 0

  • Hostname was found in DNS cache
  • Trying 46.105.173.106...
  • TCP_NODELAY set
  • Connected to lb1.hubic.ovh.net (46.105.173.106) port 443 (#50)
  • successfully set certificate verify locations:
  • CAfile: none
    CApath: /etc/ssl/certs
  • SSL re-using session ID
  • Operation timed out after 0 milliseconds with 0 out of 0 bytes received
  • Closing connection 50
@jay
Member
jay commented Jan 27, 2016

Though this report seems to have some merit for the incorrect timeout message please note this is an issue tracker and in any other case your questions should be directed to the curl-library mailing list where you will get a wider audience.

I can confirm sometimes the connect timeouts may show as incorrect and show an error message of Operation timed out after 0 milliseconds or some small ms like 10ms. I'm using:

curl 7.47.0-DEV (i386-pc-win32) libcurl/7.47.0-DEV OpenSSL/1.0.2e nghttp2/1.6.0
Protocols: dict file ftp ftps gopher http https imap imaps ldap pop3 pop3s rtsp smb smbs smtp smtps telnet tftp
Features: AsynchDNS Debug Largefile NTLM SSL HTTP2

Can you give us your curl -V?
I don't know what is causing it yet I'm having trouble debugging it because when I add 'when hit/continue on execution' bps in visual studio on Curl_timeleft it slows down the timeout functions and I get a curl: (28) SSL connection timeout. data->mstate is always <= CURLM_STATE_WAITDO. But if I use a straight breakpoint right when it shows the warning I can see the now time is the same as the progress start time progress.t_startsingle. Not that helpful though.


Regarding the rest of what you wrote:

You should use exactly the types specified, for example most values must be long so you would use (long) or the integer literal suffix L as in CURLOPT_CONNECTTIMEOUT, 20L. Also you say 20ms but the connect timeout is actually specified in seconds so 20 seconds. Does the problem always occur with a content-length of 0 (why set the header to that?) and SSL session ID reuse?

When curl uses OpenSSL for its backend if there's session re-use you should see an abbreviated handshake. I don't know why the Client hello isn't sent. You may want to check in Wireshark to see if it actually is being sent. I tried session re-use on that server and it appears to work fine (using libcurl/7.47.0-DEV OpenSSL/1.0.2e).

* SSL re-using session ID
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* STATE: SENDPROTOCONNECT => PROTOCONNECT handle 0x1d21c68; line 1267 (connection #1)
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
@dan-cristian

Thanks for the response. Indeed the main reason for reporting this is the incorrect error message.

curl -V gives this:
curl 7.38.0 (x86_64-pc-linux-gnu) libcurl/7.38.0 OpenSSL/1.0.1k zlib/1.2.8 libidn/1.29 libssh2/1.4.3 librtmp/2.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API SPNEGO NTLM NTLM_WB SSL libz TLS-SRP

Timeout is indeed set to 20s. I changed it to 20L now.
SSL sesion ID reuse is enabled as I use a thread pool and recycle connections. I have not tried yet to test without session reuse.

Content length is set to 0 for some files that represent manifest objects in swift storage, but the issue does not occur only for 0 lenght files, as I mention I also get the error while performing GET requests.
I'l try to debug with Wireshark and comeback with the findings.

@bagder
Member
bagder commented Feb 6, 2016

Any news? We can't reproduce this problem based on this little data.

@dan-cristian

I was not able yet to reproduce, I think the issue appears when the target
server is very loaded, but in the last days it worked without any single
incident.
I will keep trying to reproduce.
On Feb 7, 2016 01:25, "Daniel Stenberg" notifications@github.com wrote:

Any news? We can't reproduce this problem based on this little data.


Reply to this email directly or view it on GitHub
#619 (comment).

@bagder
Member
bagder commented Feb 16, 2016

As its not reproducible and we have no clues on where to start looking for a problem that hit once, I close this. If we get more/new details we can always re-open or create a new bug then.

@bagder bagder closed this Feb 16, 2016
@jay
Member
jay commented Feb 17, 2016

I didn't have time to explore this further, here is more detail in case you have any ideas.

I haven't reproduced his specific use case because I would need more information on that. However I can reproduce that error message. I had some problem debugging because any wait affects the path so when I break at almost any point it messes up the reproduction.

The way I can reproduce is if https and a ClientHello is sent but no ServerHello is received and a connect timeout is set. But I don't see that same thing in the reporter's output.

To reproduce listen on a port using netcat or socat but don't use a ssl/tls layer, like nc -l localhost 8000. Then connect to that port via https and after five seconds it times out.

curl --connect-timeout 5 https://localhost:8000
curl: (28) Operation timed out after 0 milliseconds with 0 out of 0 bytes received

Sometimes it doesn't show that error message, and will instead just say curl: (28) SSL connection timeout. I'm not sure why it's inconsistent like that but I assume it's something to do with the timing and my debugger (Visual Studio).

A possible fix for the 0ms might be change k->now in this line in multi to just the actual now time.

diff --git a/lib/multi.c b/lib/multi.c
index 069412d..26a965f 100644
--- a/lib/multi.c
+++ b/lib/multi.c
@@ -1070,7 +1070,7 @@ static CURLMcode multi_runsingle(struct Curl_multi *multi,
             failf(data, "Operation timed out after %ld milliseconds with %"
                   CURL_FORMAT_CURL_OFF_T " out of %"
                   CURL_FORMAT_CURL_OFF_T " bytes received",
-                  Curl_tvdiff(k->now, data->progress.t_startsingle),
+                  Curl_tvdiff(Curl_tvnow(), data->progress.t_startsingle),
                   k->bytecount, k->size);
           }
           else {
* Operation timed out after 5054 milliseconds with 0 out of 0 bytes received

Unless there's something wrong with k->now, maybe it's supposed to be updated and wasn't. I didn't have more time to look into that.

@jay jay reopened this Feb 17, 2016
@jay
Member
jay commented Feb 18, 2016

I took another look at this and k->now (aka data->req.now) is set initially to the start time (data->req.start) during state CURLM_STATE_CONNECT by Curl_connect (create_conn --> Curl_init_do) and is not set again until state CURLM_STATE_PERFORM by Curl_readwrite. data->progress.t_startsingle is set initially in CURLM_STATE_CONNECT as well.

Sometimes there is a few milliseconds of difference between the initial data->req.now and data->progress.t_startsingle because they are set separately:

Curl_pgrsTime(data, TIMER_STARTSINGLE);    <-- data->progress.t_startsingle set here
result = Curl_connect(data, &data->easy_conn,
                      &async, &protocol_connect);    <-- data->req.now set here (create_conn calls Curl_init_do which sets it)

So when we get the difference between data->req.now and data->progress.t_startsingle before CURLM_STATE_PERFORM it is either 0ms or something short like 16ms.

We could either update data->req.now at each state, or at the very top of the loop, or maybe we could change to using the now time like below

diff --git a/lib/multi.c b/lib/multi.c
index 069412d..3d0c4b9 100644
--- a/lib/multi.c
+++ b/lib/multi.c
@@ -1064,6 +1064,9 @@ static CURLMcode multi_runsingle(struct Curl_multi *multi,
         else if(data->mstate == CURLM_STATE_WAITCONNECT)
           failf(data, "Connection timed out after %ld milliseconds",
                 Curl_tvdiff(now, data->progress.t_startsingle));
+        else if(data->mstate < CURLM_STATE_PERFORM)
+          failf(data, "Connection timed out after %ld milliseconds",
+                Curl_tvdiff(now, data->progress.t_startsingle));
         else {
           k = &data->req;
           if(k->size != -1) {
@hieuvo
hieuvo commented Feb 22, 2016

I'm suffering the same thing, that error just randomly happens:

    curl_setopt($ch, CURLOPT_HTTPHEADER,array('Content-Type: application/json'));
    curl_setopt($ch, CURLOPT_USERAGENT, 'Merchant');
    curl_setopt($ch, CURLOPT_USERPWD, "XXX:YYY");
    curl_setopt($ch, CURLOPT_VERBOSE, 0);
    curl_setopt($ch, CURLOPT_POST, 1);
    curl_setopt($ch, CURLOPT_POSTFIELDS, $data);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, 1);
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, 1);
    curl_setopt($ch, CURLOPT_NOPROGRESS, 1);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, 0);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10); 
    curl_setopt($ch, CURLOPT_TIMEOUT, 20);


$: curl -V
curl 7.40.0 (x86_64-redhat-linux-gnu) libcurl/7.40.0 NSS/3.19.1 Basic ECC zlib/1.2.7 libidn/1.28 libssh2/1.4.3
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz UnixSockets Metalink
@jay
Member
jay commented Feb 22, 2016

@hieuvo any information you can give us on when it's happening would be helpful. Can you enable verbose output and let us know what you see before the error occurs? Are the URLs it happens with all of one type? For example, are they all https..

@hieuvo
hieuvo commented Feb 22, 2016

yes, they are all https. It happens on our live production servers.
I don't have permissions to make changes and get verbose about on those ones.

@hieuvo
hieuvo commented Feb 22, 2016

The curl's results are mixed of:

Empty reply from server 
Operation timed out after 0 milliseconds with 0 out of 0 bytes 
and success results

Although, most of urls and data format are the same

@bagder bagder added a commit that closed this issue Mar 23, 2016
@bagder bagder multi: fix "Operation timed out after" timer
Use the local, reasonably updated, 'now' value when creating the message
string to output for the timeout condition.

Fixes #619
726ae07
@bagder bagder closed this in 726ae07 Mar 23, 2016
@bagder
Member
bagder commented Mar 23, 2016

I'm pretty sure this fixes the reported problem. If you experience it after this fix, please let us know and give us details on how to reproduce.

@incarosegit incarosegit pushed a commit to incarose/curl that referenced this issue Mar 25, 2016
@bagder bagder + Diego Bes multi: fix "Operation timed out after" timer
Use the local, reasonably updated, 'now' value when creating the message
string to output for the timeout condition.

Fixes #619
0a9b90f
@pfalcon
pfalcon commented Jul 20, 2016

For statistics, we experience this (intermittent) issue while accessing Amazon S3 servers. Thanks for fixing the error message, with the information in this ticket, it's clear we're hitting server overload issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment