libcurl reuses connections when proxy port changes #648

Closed
p opened this Issue Feb 13, 2016 · 17 comments

Projects

None yet

6 participants

@p
p commented Feb 13, 2016

This was originally reported to pycurl: pycurl/pycurl#392

Test program in C: https://github.com/p/test-repo/blob/master/libcurl/socksproxybug.c reproduces the bug against libcurl master.

@fuchaoqun can provide the public test IP.

Even though proxy port changes between requests, libcurl reuses the connection which means the second request is sent to the first proxy.

@p p referenced this issue in pycurl/pycurl Feb 13, 2016
Closed

bugs with socks5 proxy #392

@p p changed the title from libcurl reuses connections when proxy settings change to libcurl reuses connections when proxy port changes Feb 13, 2016
@jay
Member
jay commented Feb 13, 2016

Confirmed when the proxy host is the same but the port is different.

@frenche
Contributor
frenche commented Feb 13, 2016

Can't seem to reproduce with HTTP proxy, so I guess this is SOCKS only issue.
The code in ConnectionExists does compare proxy port though.

@jay
Member
jay commented Feb 13, 2016

Curl_conncache_find_bundle is called in ConnectionExists and it returns the existing bundle from the previous connection because the hashkey is made using only the proxy name and localport. Adding something like conn->port to the hash would definitely work but I don't know how unique it's supposed to be, in other words I don't know whether it's supposed to find bundles with different destination ports and return them from the cache. Curl_conncache_find_bundle (just a few lines down from hashkey) doesn't say anything about ports:

Look up the bundle with all the connections to the same host this connectdata struct is setup to use.

So the appropriate place may be to handle this in ConnectionExists and check the bundle after it's returned. The code I expected to be hit in this else block which checks the proxy port is actually never hit. I'll try redoing that statement, maybe combining it with the else block somehow.

      if(!needle->bits.httpproxy || needle->handler->flags&PROTOPT_SSL ||
         (needle->bits.httpproxy && check->bits.httpproxy &&
          needle->bits.tunnel_proxy && check->bits.tunnel_proxy &&
          Curl_raw_equal(needle->proxy.name, check->proxy.name) &&
          (needle->port == check->port))) {
        /* The requested connection does not use a HTTP proxy or it uses SSL or
           it is a non-SSL protocol tunneled over the same http proxy name and
           port number or it is a non-SSL protocol which is allowed to be
           upgraded via TLS */
@jay jay added a commit to jay/curl that referenced this issue Feb 14, 2016
@jay jay url: Stop connection reuse on proxy port change. draft1
If the proxy for an upcoming connection (needle) doesn't have the same
port as the the proxy for an existing connection (check) then do not
attempt to reuse that existing connection.

Bug: curl#648
Reported-by: fuchaoqun@users.noreply.github.com
Reported-by: Oleg Pudeyev
c14f9f8
@jay
Member
jay commented Feb 14, 2016

I started a branch here to fix this issue. I could use a set of eyes with some experience in the conn area.

The first draft I've changed it so that all of the scheme (or protocol) and host and proxy information must match to have a successful match. I think this is easier to understand and more conservative but I don't know if it's more correct. The old else block the proxy check did not check for the scheme or protocol or host and I don't know why that is. So maybe I'm missing something important here.

Also: I don't understand the protocol and scheme logic:

(Curl_raw_equal(needle->handler->scheme, check->handler->scheme) ||
 (needle->handler->protocol & check->handler->protocol))

If the scheme doesn't match, why is the protocol checked? Wouldn't it make sense to check one or the other? Also, why is the protocol comparison an & instead of ==, isn't that field supposed to be a single bit?

@bagder
Member
bagder commented Feb 14, 2016

AFAICS, the bundles should be per host+port, and that is actually what is documented for example the CURLMOPT_MAX_HOST_CONNECTIONS option (which is one of the reasons that bundle concept was invented). So adding the remote port to the hash should be fine for all I can think right now. Did you run any tests with such a change to see if anything breaks?

The protocol and '&' is probably a legacy from early code iterations when we ORed bits in the protocol field. I can think of a specific reason right now why & would be better than ==, or even a case where the scheme would differ from the protocol bits. But I think we should tread gently in that code and not change more than we have to, to fix actual problems we have identified as it has proven to cause us much head aches in the past when we've had subtle details wrong.

@bagder
Member
bagder commented Feb 14, 2016

Have you tried to write up a test case that repeats the flaw? It might be tricky if it really needs two functional local proxy ports, but maybe we can just let one of them fail?

@jay
Member
jay commented Feb 15, 2016

The thing is the old logic there doesn't make much sense to me, for example when can that else block be hit except if it's an http proxy whose proxy information for the new conn (needle) doesn't match the old conn (check). Ok so say we get to that point then it does another check in that else block to see if the proxy information matches... and it won't because how else does it get to that point. Unless I'm missing something.

As far as the protocol even if it's a legacy it's still confusing to do it like that, if a protocol should be a single protocol I think we should be following that pattern in the call, all something like that does is make me scratch my head!

Yes, I think we can let one of the calls fail rather than use two ports in the test. I know in some of the tests they use the port 60000 expecting that that port won't be open. But what if it is? Couldn't some other program on the machine take that port and act as a server (either inadvertently or maliciously). Assuming 60000 is ok the test command would probably look something like this:

<command>
--socks5 %HOSTIP:%SOCKSPORT http://%HOSTIP:%HTTPPORT/713 --next --socks5 %HOSTIP:60000 http://%HOSTIP:%HTTPPORT/713
</command>
@fuchaoqun

It's really useful local ports binding for multiple remote socks5 proxies, when will be the fix released? thanks for your great job

@fuchaoqun

the bug can be "fixed" by set opt FRESH_CONNECT or FORBID_REUSE true

@bagder
Member
bagder commented May 2, 2016

How about this simple patch?

From 4d795102898ec04380f1ff365b82566658593804 Mon Sep 17 00:00:00 2001
From: Daniel Stenberg <daniel@haxx.se>
Date: Mon, 2 May 2016 23:15:05 +0200
Subject: [PATCH] conncache: use proper port number as hash key for bundles!

Reported-by: Oleg Pudeyev

Fixes #648
---
 lib/conncache.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/conncache.c b/lib/conncache.c
index 930b932..d0c09c8 100644
--- a/lib/conncache.c
+++ b/lib/conncache.c
@@ -138,11 +138,11 @@ static char *hashkey(struct connectdata *conn)
   else if(conn->bits.conn_to_host)
     hostname = conn->conn_to_host.name;
   else
     hostname = conn->host.name;

-  return aprintf("%s:%d", hostname, conn->localport);
+  return aprintf("%s:%d", hostname, conn->port);
 }

 /* Look up the bundle with all the connections to the same host this
    connectdata struct is setup to use. */
 struct connectbundle *Curl_conncache_find_bundle(struct connectdata *conn,
-- 
2.8.1
@bagder bagder self-assigned this May 2, 2016
@bagder
Member
bagder commented May 3, 2016

When trying to write up a test case for this, I realized that this bug isn't reproducible using HTTP proxies which made me find another mistake that in fact also fixes the problem when applied stand-alone, but I think both changes should be done to limit the number of entries to check in the ConnectionExists function:

--- a/lib/url.c
+++ b/lib/url.c
@@ -3374,11 +3374,11 @@ ConnectionExists(struct SessionHandle *data,
           /* one of them was different */
           continue;
         }
       }

-      if(!needle->bits.httpproxy || (needle->handler->flags&PROTOPT_SSL) ||
+      if(!needle->bits.proxy || (needle->handler->flags&PROTOPT_SSL) ||
          (needle->bits.httpproxy && check->bits.httpproxy &&
           needle->bits.tunnel_proxy && check->bits.tunnel_proxy &&
           Curl_raw_equal(needle->proxy.name, check->proxy.name) &&
           (needle->port == check->port))) {
         /* The requested connection does not use a HTTP proxy or it uses SSL or
@bagder bagder added a commit that closed this issue May 3, 2016
@bagder bagder connections: non-HTTP proxies on different ports aren't reused either
Reported-by: Oleg Pudeyev and fuchaoqun

Fixes #648
5823179
@bagder bagder closed this in 5823179 May 3, 2016
@mkauf
Contributor
mkauf commented May 3, 2016

Commit 5823179 probably introduces a bug. For SOCKS the "else" branch will be taken (previously the "if" branch). So with SOCKS, connections may be reused when needle->host.name and check->host.name differ.

I think that the "else" branch should only be taken for HTTP proxies in normal (non-tunnel) mode.

There is also an existing bug that the proxy name/port/type is not compared for SSL protocols ( needle->handler->flags&PROTOPT_SSL).

@jay 's branch fixes some of these issues, but it is too strict for connections to HTTP proxies in normal (non-tunnel) mode.

@bagder
Member
bagder commented May 8, 2016

@mkauf ah, yeah, connections of the two different types should probably be dealt with differently. The second part of that patch should then probably be reverted. You agree?

@bagder bagder reopened this May 8, 2016
@bagder
Member
bagder commented May 8, 2016

There is also an existing bug that the proxy name/port/type is not compared for SSL protocols ( needle->handler->flags&PROTOPT_SSL).

Sorry I don't get it, can you explain your point with a patch?

@mkauf
Contributor
mkauf commented May 11, 2016

The second part of that patch should then probably be reverted. You agree?

Yes, but it's also possible to fix the bug (see proposed patch below)

Sorry I don't get it, can you explain your point with a patch?

I think that the needle connection will be reused in this situation with the current code:

  • needle: URL: https://www.example.com/; uses the HTTP proxy foo.example.com in tunnel mode
  • check: URL: https://www.example.com/; uses the HTTP proxy bar.example.com in tunnel mode

I propose to compare the proxy information in an earlier step, this also simplifies the code:

diff --git a/lib/url.c b/lib/url.c
index 605212e..4ab654d 100644
--- a/lib/url.c
+++ b/lib/url.c
@@ -3249,8 +3249,8 @@ ConnectionExists(struct SessionHandle *data,
       size_t pipeLen;

       /*
-       * Note that if we use a HTTP proxy, we check connections to that
-       * proxy and not to the actual remote server.
+       * Note that if we use a HTTP proxy in normal mode (no tunneling), we
+       * check connections to that proxy and not to the actual remote server.
        */
       check = curr->ptr;
       curr = curr->next;
@@ -3331,6 +3331,15 @@ ConnectionExists(struct SessionHandle *data,
         /* don't do mixed proxy and non-proxy connections */
         continue;

+      if(needle->bits.proxy &&
+          (needle->proxytype != check->proxytype ||
+              needle->bits.httpproxy != check->bits.httpproxy ||
+              needle->bits.tunnel_proxy != check->bits.tunnel_proxy ||
+              !Curl_raw_equal(needle->proxy.name, check->proxy.name) ||
+               needle->port != check->port))
+        /* don't mix connections that use different proxies */
+        continue;
+
       if(needle->bits.conn_to_host != check->bits.conn_to_host)
         /* don't mix connections that use the "connect to host" feature and
          * connections that don't use this feature */
@@ -3376,11 +3385,8 @@ ConnectionExists(struct SessionHandle *data,
         }
       }

-      if(!needle->bits.proxy || (needle->handler->flags&PROTOPT_SSL) ||
-         (needle->bits.httpproxy && check->bits.httpproxy &&
-          needle->bits.tunnel_proxy && check->bits.tunnel_proxy &&
-          Curl_raw_equal(needle->proxy.name, check->proxy.name) &&
-          (needle->port == check->port))) {
+      if(!needle->bits.httpproxy || (needle->handler->flags&PROTOPT_SSL) ||
+          (needle->bits.httpproxy && needle->bits.tunnel_proxy)) {
         /* The requested connection does not use a HTTP proxy or it uses SSL or
            it is a non-SSL protocol tunneled over the same HTTP proxy name and
            port number */
@@ -3419,16 +3425,11 @@ ConnectionExists(struct SessionHandle *data,
           match = TRUE;
         }
       }
-      else { /* The requested needle connection is using a proxy,
-                is the checked one using the same host, port and type? */
-        if(check->bits.proxy &&
-           (needle->proxytype == check->proxytype) &&
-           (needle->bits.tunnel_proxy == check->bits.tunnel_proxy) &&
-           Curl_raw_equal(needle->proxy.name, check->proxy.name) &&
-           needle->port == check->port) {
-          /* This is the same proxy connection, use it! */
-          match = TRUE;
-        }
+      else {
+        /* The requested connection is using the same HTTP proxy in normal
+           mode (no tunneling) */
+        /* This is the same proxy connection, use it! */
+        match = TRUE;
       }

       if(match) {
@bagder
Member
bagder commented May 12, 2016

This looks good to me! Runs through all tests fine and some added manual tests I could think of. Merge coming up. Thanks!

@bagder bagder added a commit that closed this issue May 12, 2016
@mkauf @bagder mkauf + bagder ConnectionExists: follow-up fix for proxy re-use
Follow-up commit to 5823179

Closes #648
117a0ff
@bagder bagder closed this in 117a0ff May 12, 2016
@p
p commented May 17, 2016

Thanks for fixing this folks.

@p p referenced this issue in lorien/grab Nov 22, 2016
Open

Report to pycurl project about a bug #206

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment