Intermittent failure in libcurl when git cloning through authenticated HTTP proxy #2088

Closed
zachomedia opened this Issue Nov 16, 2017 · 3 comments

Comments

Projects
None yet
2 participants
@zachomedia
Contributor

zachomedia commented Nov 16, 2017

I did this

Using an authenticated HTTP proxy, running git clone https://[git path] results in intermittent failures due to "Proxy CONNECT abort".

After a bunch of debugging (with GIT_CURL_VERBOSE=1), I traced the issue to libcurl. The following differences were observed in the output:

Working:

< HTTP/1.1 407 Proxy Authentication Required
< Proxy-Authenticate: BASIC realm="$REALM"
< Cache-Control: no-cache
< Pragma: no-cache
< Content-Type: text/html; charset=utf-8
< Proxy-Connection: close
< Connection: close
< Content-Length: 1424
< 
* Ignore 1424 bytes of response-body
* Connect me again please
* CONNECT phase completed!
* Connection #0 to host $PROXY left intact
...
* Connection #0 isn't open enough, can't reuse
* Hostname $PROXY was found in DNS cache
*   Trying $PROXY_IP...
* TCP_NODELAY set
* Connected to $PROXY ($PROXY_IP) port 80 (#1)
* allocate connect buffer!
* Establish HTTP proxy tunnel to $DEST_HOST:443
* Proxy auth using Basic with user '$USERNAME'
> CONNECT $DEST_HOST:443 HTTP/1.1
Host: $DEST_HOST:443
Proxy-Authorization: Basic $AUTH
User-Agent: git/2.8.6
Proxy-Connection: Keep-Alive

< HTTP/1.1 200 Connection established

Failing:

< HTTP/1.1 407 Proxy Authentication Required
< Proxy-Authenticate: BASIC realm="$REALM"
< Cache-Control: no-cache
< Pragma: no-cache
< Content-Type: text/html; charset=utf-8
< Proxy-Connection: close
< Connection: close
< Content-Length: 1424
<
* Ignore 1424 bytes of response-body
* Establish HTTP proxy tunnel to $DEST_HOST:443
* Proxy auth using Basic with user '$USERNAME'
> CONNECT $DEST_HOST:443 HTTP/1.1
Host: $DEST_HOST:443
Proxy-Authorization: Basic $AUTH
User-Agent: git/2.8.6
Proxy-Connection: Keep-Alive

* Proxy CONNECT aborted
* CONNECT phase completed!
* Connection #0 to host $PROXY left intact

Notice that in the failing case, there is no "Connect me again please" log and it does not create a new connection to the proxy (since the proxy previously said to close the connection).

Running git bisect to find the first commit where this was issue was first experienced, I traced it to commit 5113ad0

Further debugging led me to 5113ad0#diff-a609379a03db55c0a6fe3865f3b5cb46R347,
where one of the if blocks was removed. If I put the if block back (code diff against latest master (commit 6ce9845) below), I no longer get intermittent failures when git cloneing (It appears that state about closing the connection is lost when returning into the function at a later time). However, I'm not familiar with the curl codebase to know if there are other side effects from re-introducing this if block.

diff --git a/lib/http_proxy.c b/lib/http_proxy.c
index 0283c1f3f..d1ef47e31 100644
--- a/lib/http_proxy.c
+++ b/lib/http_proxy.c
@@ -340,9 +340,14 @@ static CURLcode CONNECT(struct connectdata *conn,
         /* Read one byte at a time to avoid a race condition. Wait at most one
            second before looping to ensure continuous pgrsUpdates. */
         result = Curl_read(conn, tunnelsocket, s->ptr, 1, &gotbytes);
-        if(result == CURLE_AGAIN)
-          /* socket buffer drained, return */
-          return CURLE_OK;
+        if(result == CURLE_AGAIN) {
+          if (SOCKET_READABLE(tunnelsocket, check<10000L?check:1000) == -1) {
+            /* socket buffer drained, return */
+            return CURLE_OK;
+          }
+
+          continue;
+        }
 
         if(Curl_pgrsUpdate(conn))
           return CURLE_ABORTED_BY_CALLBACK;

I expected the following

No intermittent failures when git cloneing.

curl/libcurl version

On Alpine 3.4:
[curl -V output]

curl 7.55.0 (x86_64-alpine-linux-musl) libcurl/7.55.0 OpenSSL/1.0.2k zlib/1.2.11 libssh2/1.7.0
Release-Date: 2017-08-09
Protocols: dict file ftp ftps gopher http https imap imaps pop3 pop3s rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: IPv6 Largefile NTLM NTLM_WB SSL libz TLS-SRP UnixSockets HTTPS-proxy

On Debian Buster:
[curl -V output]

curl 7.56.1 (x86_64-pc-linux-gnu) libcurl/7.56.1 OpenSSL/1.0.2m zlib/1.2.8 libidn2/2.0.2 libpsl/0.18.0 (+libidn2/2.0.2) libssh2/1.8.0 nghttp2/1.27.0 
librtmp/2.3
Release-Date: 2017-10-23
Protocols: dict file ftp ftps gopher http https imap imaps ldap ldaps pop3 pop3s rtmp rtsp scp sftp smb smbs smtp smtps telnet tftp 
Features: AsynchDNS IDN IPv6 Largefile GSS-API Kerberos SPNEGO NTLM NTLM_WB SSL libz TLS-SRP HTTP2 UnixSockets HTTPS-proxy PSL

@zachomedia zachomedia changed the title from Intermittent failure in libcurl when git cloneing through authenticated HTTP proxy to Intermittent failure in libcurl when git cloning through authenticated HTTP proxy Nov 16, 2017

@zachomedia

This comment has been minimized.

Show comment Hide comment
@zachomedia

zachomedia Dec 1, 2017

Contributor

If my fix looks reasonable, I can open a PR for it.

Contributor

zachomedia commented Dec 1, 2017

If my fix looks reasonable, I can open a PR for it.

@bagder

This comment has been minimized.

Show comment Hide comment
@bagder

bagder Dec 5, 2017

Member

Sorry for responding slowly.

The suggested change is unfortunately not good enough. That function call will block waiting for activity until it times out if there is none, and that's a non-blocking function (like most libcurl internals are or should be).

When EAGAIN is received, there's nothing more to read for the moment it should return back to wait for the socket activity the "normal" way and once data arrives it should be called again. Since you experience an issue in this area there's clearly something that prevents this concept from working correctly...

Member

bagder commented Dec 5, 2017

Sorry for responding slowly.

The suggested change is unfortunately not good enough. That function call will block waiting for activity until it times out if there is none, and that's a non-blocking function (like most libcurl internals are or should be).

When EAGAIN is received, there's nothing more to read for the moment it should return back to wait for the socket activity the "normal" way and once data arrives it should be called again. Since you experience an issue in this area there's clearly something that prevents this concept from working correctly...

@zachomedia

This comment has been minimized.

Show comment Hide comment
@zachomedia

zachomedia Dec 6, 2017

Contributor

@bagder I figured that may be the case. I just took another look at it, and because closeConnection is a local variable, when you leave the function and return, the value is restored to false. Perhaps that should instead be stored in the http_connect_state struct?

Contributor

zachomedia commented Dec 6, 2017

@bagder I figured that may be the case. I just took another look at it, and because closeConnection is a local variable, when you leave the function and return, the value is restored to false. Perhaps that should instead be stored in the http_connect_state struct?

zachomedia added a commit to zachomedia/curl that referenced this issue Dec 6, 2017

zachomedia added a commit to zachomedia/curl that referenced this issue Dec 6, 2017

zachomedia added a commit to zachomedia/curl that referenced this issue Dec 6, 2017

@bagder bagder closed this in 55e6098 Dec 7, 2017

JohnDeHelian pushed a commit to JohnDeHelian/curl that referenced this issue Dec 7, 2017

@lock lock bot locked as resolved and limited conversation to collaborators May 6, 2018

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.