New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Connection timeout on https streams #1174

Closed
Hummer12007 opened this Issue Dec 23, 2016 · 25 comments

Comments

Projects
None yet
7 participants
@Hummer12007

Hummer12007 commented Dec 23, 2016

A regression appeared in cb4e2be, which causes https connections to time out (found out by bisection).

To be precise, since the commit, mpd (0.19.21 as well as earlier versions) often (almost always) fails to play remote https streams due to connection timeouts: see

curl/lib/multi.c

Line 1378 in cb4e2be

failf(data, "Operation timed out after %ld milliseconds with %"
for the precise error which occurs. Such errors did not happen in prior commits.

This does not occur for http streams.

@MaxKellerman may be interested.

I'm running Gentoo.

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 23, 2016

@jay

This comment has been minimized.

Member

jay commented Dec 23, 2016

You didn't post the information from the template, please do that in the future. What's your curl -V ? Is this related to #1156?

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 23, 2016

@jay I've mentioned the relevant commit. And it doesn't seem so, as I don't use a proxy.
here's curl -V nonetheless (from the relevant commit):
curl 7.51.0 (x86_64-pc-linux-gnu) libcurl/7.52.0-DEV OpenSSL/1.0.2j zlib/1.2.8 libidn2/0.11 libssh2/1.8.0
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp
Features: Debug TrackMemory IDN IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP UnixSockets

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 23, 2016

To be clear, this still occurs on git master.

@bagder

This comment has been minimized.

Member

bagder commented Dec 23, 2016

Can you please clarify what "this" is? What still occurs in git master? I miss a detailed description of what you're doing and what happens when you do that in comparison to what you expect.

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 23, 2016

What I do:

  1. start mpd
  2. try to play a remote https stream

What I expect to happen (and happens before cb4e2be): it connects and starts playing.

What happens: an error in libcurl occurs: Operation timed out after 10010 milliseconds with 0 out of 0 bytes received. It is triggered at line 1378 of lib/multi.c (the branch if (timeout_ms <0)... if (k->size != -1). It does not start playing.

@bagder

This comment has been minimized.

Member

bagder commented Dec 23, 2016

What libcurl options does this use? Which TLS backend is this libcurl using? How can we reproduce this problem - without mpd?

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 23, 2016

It uses the openssl backend, it seems.
The libcurl options are specified here: https://github.com/MaxKellermann/MPD/blob/master/src/input/plugins/CurlInputPlugin.cxx#L724
I did not reproduce the problem without mpd yet.

The backtrace (if it is relevant): https://paste.pound-python.org/show/U8PF7y4wi20ol6UkWPpY/

@bagder

This comment has been minimized.

Member

bagder commented Dec 23, 2016

Can you tell as a public HTTPS URL that you use when you get the problem? Are you using a proxy?

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 24, 2016

https://radio.stew.moe/stream/stream256.opus (though it may rarely be down). I'm not using a proxy.

@bagder

This comment has been minimized.

Member

bagder commented Dec 24, 2016

I tried it like 20 times in a row. It usually gives back a 404 really fast, and once it did take 16 seconds to connect. I've not managed to get any hints that say that this is a curl problem and not the server's.

How frequently do you see the problem? Do you see the problem on all HTTPS URLs you try or just this? Can you change the timeout from 10 seconds to something larger, like 30 and see if that also is reached or if it then is likelier to succeed?

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 24, 2016

Ah, it dies several hours ago, again (the error occured when it was alive, to be clear). Will fix that (it's up now) and try with a larger timeout in the meantime. And I see it on both https streams I've found.

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 24, 2016

Still getting the timeout even after half an hour a minute.

@bagder

This comment has been minimized.

Member

bagder commented Dec 24, 2016

after half an hour

You meant half a minute, right?

So can you invoke curl -o /dev/null -v https://radio.stew.moe/stream/stream256.opus and show us the output when it doesn't work? I don't get it to hang in connect, but I do see very slow response at times when it doesn't seem to send any data for very long.

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 24, 2016

It didn't fail with the command line client even once for me though (and doesn't after a dozen tries).
The issue is not specific to the server though.

@jetomit

This comment has been minimized.

jetomit commented Dec 25, 2016

Running hiperfifo from docs/examples exhibits the same behavior, with every HTTPS server I've tried. I get the following output:

Adding easy 0x20a1150 to multi 0x206c8a0 (https://startpage.com)
multi_timer_cb: Setting timeout to 0 ms
* Rebuilt URL to: https://startpage.com/
*   Trying 37.0.87.15...
* TCP_NODELAY set
Progress: https://startpage.com (0/0)
Progress: https://startpage.com (0/0)
socket callback: s=7 e=0x20a1150 what=OUT Adding data: OUT
multi_timer_cb: Setting timeout to 0 ms
REMAINING: 1
Progress: https://startpage.com (0/0)
multi_timer_cb: Setting timeout to 199 ms
REMAINING: 1
* Connected to startpage.com (37.0.87.15) port 443 (#0)
Progress: https://startpage.com (0/0)
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/certs/ca-certificates.crt
  CApath: none
Progress: https://startpage.com (0/0)
socket callback: s=7 e=0x20a1150 what=REMOVE 
REMAINING: 1
Progress: https://startpage.com (0/0)
REMAINING: 1

and after that curl waits indefinitely (well, at least 15 minutes). If I add

curl_easy_setopt(conn->easy, CURLOPT_CONNECTTIMEOUT, 10L);

like mpd, then curl aborts the connection after 10 s:

* Operation timed out after 10010 milliseconds with 0 out of 0 bytes received
Progress: https://startpage.com (0/0)
* Closing connection 0
REMAINING: 0
DONE: https://startpage.com/ => (28) Operation timed out after 10010 milliseconds with 0 out of 0 bytes received

Tested on Debian Jessie (using OpenSSL/1.0.1t and zlib/1.2.8) and Gentoo (using OpenSSL/1.0.2j and zlib/1.2.8). The bug appears to be specific to openssl - the gnutls backend works. I tested using the latest git master, and git bisect points to cb4e2be as the first bad commit. The command-line client works in all cases.

@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 25, 2016

Hmm, indeed, everything works fine with curl 7.52.1 with gnutls backend. The issue probably lies in the openssl backend.

@bagder

This comment has been minimized.

Member

bagder commented Dec 25, 2016

Try this very simple patch:

From 3388920be62ea5ae7b3ca34336683ab7195cc3bb Mon Sep 17 00:00:00 2001
From: Daniel Stenberg <daniel@haxx.se>
Date: Mon, 26 Dec 2016 00:06:33 +0100
Subject: [PATCH] vtls: s/SSLEAY/OPENSSL

Fixed an old leftover use of the USE_SSLEAY define which would make a
socket get removed from the applications sockets to monitor when the
multi_socket API was used, leading to timeouts.
---
 lib/vtls/vtls.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/lib/vtls/vtls.c b/lib/vtls/vtls.c
index b808e1c5f..707f24b02 100644
--- a/lib/vtls/vtls.c
+++ b/lib/vtls/vtls.c
@@ -482,11 +482,11 @@ void Curl_ssl_close_all(struct Curl_easy *data)
   }
 
   curlssl_close_all(data);
 }
 
-#if defined(USE_SSLEAY) || defined(USE_GNUTLS) || defined(USE_SCHANNEL) || \
+#if defined(USE_OPENSSL) || defined(USE_GNUTLS) || defined(USE_SCHANNEL) || \
     defined(USE_DARWINSSL) || defined(USE_NSS)
 /* This function is for OpenSSL, GnuTLS, darwinssl, and schannel only. */
 int Curl_ssl_getsock(struct connectdata *conn, curl_socket_t *socks,
                      int numsocks)
 {
@@ -516,11 +516,11 @@ int Curl_ssl_getsock(struct connectdata *conn,
   (void)conn;
   (void)socks;
   (void)numsocks;
   return GETSOCK_BLANK;
 }
-/* USE_SSLEAY || USE_GNUTLS || USE_SCHANNEL || USE_DARWINSSL || USE_NSS */
+/* USE_OPENSSL || USE_GNUTLS || USE_SCHANNEL || USE_DARWINSSL || USE_NSS */
 #endif
 
 void Curl_ssl_close(struct connectdata *conn, int sockindex)
 {
   DEBUGASSERT((sockindex <= 1) && (sockindex >= -1));
-- 
2.11.0

bagder added a commit that referenced this issue Dec 25, 2016

vtls: s/SSLEAY/OPENSSL
Fixed an old leftover use of the USE_SSLEAY define which would make a
socket get removed from the applications sockets to monitor when the
multi_socket API was used, leading to timeouts.

Bug: #1174
@Hummer12007

This comment has been minimized.

Hummer12007 commented Dec 26, 2016

The issue's resolved after the commit. Thanks!

@KyleSanderson

This comment has been minimized.

KyleSanderson commented Dec 31, 2016

@bagder could you please tag a new stable release? I was 'owned' by the stabilization of 7.52.1 by my distro. This breaks libcurl consumers, like mpd mentioned above, and rtorrent.

@bagder

This comment has been minimized.

Member

bagder commented Dec 31, 2016

The next release is planned for February 2017. The mistake this commit fixes has been around for many releases. If you want to argue for an earlier release, I'd urge you to take that to the curl-library mailing list.

@mkauf

This comment has been minimized.

Contributor

mkauf commented Jan 20, 2017

@bagder :

The mistake this commit fixes has been around for many releases.

That's wrong. The mistake has been introduced with the "SSL proxy" commit cb4e2be . This commit has removed these lines:

#if defined(USE_OPENSSL) || defined(USE_GNUTLS) || defined(USE_SCHANNEL) || \
    defined(USE_DARWINSSL) || defined(USE_POLARSSL) || defined(USE_NSS) || \
    defined(USE_MBEDTLS)
/* This function is for OpenSSL, GnuTLS, darwinssl, schannel and polarssl only.
   It should be made to query the generic SSL layer instead. */

... and added these lines:

#if defined(USE_SSLEAY) || defined(USE_GNUTLS) || defined(USE_SCHANNEL) || \
    defined(USE_DARWINSSL) || defined(USE_NSS)
/* This function is for OpenSSL, GnuTLS, darwinssl, and schannel only. */

So this has introduced bugs for: OpenSSL (fixed with a7b38c9), mbed TLS (fixed with 06b1197 ) and PolarSSL (not fixed yet).

For people that use one of these TLS backends, that's an important regression in curl 7.52.1.

@bagder

This comment has been minimized.

Member

bagder commented Jan 20, 2017

Seems correct, yes. I'll bring it back to the working layout and fix PolarSSL. Thanks!

bagder added a commit that referenced this issue Jan 20, 2017

vtls: fix PolarSSL non-blocking handling
A regression brought in cb4e2be

Reported-by: Michael Kaufmann
Bug: #1174 (comment)

jkralik pushed a commit to jkralik/curl that referenced this issue Jan 23, 2017

vtls: fix PolarSSL non-blocking handling
A regression brought in cb4e2be

Reported-by: Michael Kaufmann
Bug: curl#1174 (comment)

edolstra added a commit to NixOS/nixpkgs that referenced this issue Jan 24, 2017

edolstra added a commit to NixOS/nixpkgs that referenced this issue Jan 24, 2017

peterpih pushed a commit to railsnewbie257/curl that referenced this issue Jan 24, 2017

vtls: fix PolarSSL non-blocking handling
A regression brought in cb4e2be

Reported-by: Michael Kaufmann
Bug: curl#1174 (comment)
@pschlan

This comment has been minimized.

Contributor

pschlan commented Feb 1, 2017

I'm really glad I found this patch after many hours of debugging my application. Thanks to all people involved in fixing this.

algitbot pushed a commit to alpinelinux/aports that referenced this issue Feb 19, 2017

main/curl: Fix socket removal when multi_socket API is used (timeouts).
It's a fix (curl's a7b38c9) for regression in v7.52.0 (curl's cb4e2be).

curl/curl#1174

Without this patch apps like rtorrent don't work with HTTPS stuff.

(edge commit 2626290)

@lock lock bot locked as resolved and limited conversation to collaborators May 6, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.