Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTTPS Proxy Connections don't play with with Connection: Close. #295

Closed
Lukasa opened this issue Dec 7, 2013 · 20 comments
Closed

HTTPS Proxy Connections don't play with with Connection: Close. #295

Lukasa opened this issue Dec 7, 2013 · 20 comments

Comments

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Dec 7, 2013

This was actually spotted on StackOverflow.

To reproduce, set up a proxy that can handle the CONNECT verb (e.g. mitmproxy), then do as follows:

pman = urllib3.proxy_from_url('http://127.0.0.1:8080')
res = pman.urlopen('GET', 'https://www.paypal.com/')

You hit a nice big fancy error:

  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/site-packages/urllib3/poolmanager.py", line 254, in urlopen
    return super(ProxyManager, self).urlopen(method, url, redirect, **kw)
  File "/usr/local/lib/python2.7/site-packages/urllib3/poolmanager.py", line 155, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/usr/local/lib/python2.7/site-packages/urllib3/connectionpool.py", line 520, in urlopen
    'Socket error: %s.' % e)
urllib3.exceptions.ProxyError: Cannot connect to proxy. Socket error: [Errno 54] Connection reset by peer.

This can be more clearly reproduced by turning off redirects:

pman = urllib3.proxy_from_url('http://127.0.0.1:8080')
res = pman.urlopen('GET', 'https://www.paypal.com/', redirect=False)
res = pman.urlopen('GET', 'https://www.paypal.com/', redirect=False) # Same error is hit here.

This is almost certainly because PayPal is sending back Connection: Close in the headers. This causes mitmproxy to close the upstream connection and our connection to it, but we don't remove the connection from the pool.

@Lukasa
Copy link
Sponsor Contributor Author

Lukasa commented Dec 7, 2013

Ok, so I think I've worked out the problem. We do a test to confirm that the connection is still up in urllib3.util.is_connection_dropped. The first thing this test does is this:

sock = getattr(conn, 'sock', False)
if not sock: # Platform-specific: AppEngine
    return False

However, that doesn't match with what httplib does. From Python 3.4's HTTPConnection class, check out the close() method:

def close(self):
    """Close the connection to the HTTP server."""
    if self.sock:
        self.sock.close()   # close it manually... there may be other refs
        self.sock = None
    if self.__response:
        self.__response.close()
        self.__response = None
    self.__state = _CS_IDLE

httplib, when the socket is closed, will actually set the socket to None. This means that first check is no good: if the connection got closed manually (either by httplib or a user of the library doing something crazy), we automatically say 'This socket is fine'.

That leads me to another question: this code looks wrong to me. Why isn't it wrong? From HTTPConnectionPool.get_conn():

# If this is a persistent connection, check if it got disconnected
if conn and is_connection_dropped(conn):
    log.info("Resetting dropped connection: %s" % self.host)
    conn.close()

return conn or self._new_conn()

If the connection is dropped, it looks like this code branch will simply return the now-closed connection. Which seems wrong to me. Is it wrong?

@sigmavirus24
Copy link
Contributor

Perhaps we should be setting conn = None in the if block or just returning self._new_conn() from there.

@shazow
Copy link
Member

shazow commented Dec 7, 2013

The code attempts to reuse existing socket() objects even if they are closed. IIRC the syscall to allocate a socket is comparatively expensive so it's a good idea to reuse them if you can.

Making a request with a closed socket should reopen it (this works because all the details remain the same—host, port, flags, etc).

@ferozed
Copy link

ferozed commented Jan 14, 2014

Really? I have never heard of this. So, you can do a socket.connect() on a closed socket, that was previously connected?

@shazow
Copy link
Member

shazow commented Jan 14, 2014

Correct. Should be easy to test if I'm wrong.

@ferozed
Copy link

ferozed commented Feb 4, 2014

This is affecting us very badly. Any idea when this will be fixed?

@shazow
Copy link
Member

shazow commented Feb 4, 2014

:( Sorry to hear that, @botouser. If you'd be interested in writing a fix/test for this, it would be much appreciated.

@Lukasa Do we have any updates on this issue?

@Lukasa
Copy link
Sponsor Contributor Author

Lukasa commented Feb 4, 2014

Ah, no, this slipped off my radar, sorry. =(

Based on my prior analysis (assuming it's right, no guarantees there), the simplest fix would probably be to update is_connection_dropped to spot when sock is None, and automatically return True. This probably has to happen fairly early: I have no idea what happens if you call select() or poll() on a None, but it's unlikely to be good.

@jschneier
Copy link
Contributor

It seems that Paypal is now sending back Connection: Keep Alive so I don't hit the fancy error. I did, however, configure mitmproxy to override that header and send back Connection: Close to urllib3 so while the connections might still work I was able to check the type/value of sock in util.is_connection_dropped and @Lukasa is correct, sock is None so we are reporting that the connection is not dropped when it in fact is.

The patch is very straightforward, I do wish we could have a server to test on though. Maybe proxy to another app running locally? Or use a mock...

@shazow
Copy link
Member

shazow commented Feb 11, 2014

Would be handy to do a socket-level test for this. @jschneier Could you share this patch? :) PR is appreciated.

@jschneier
Copy link
Contributor

A test would be handy, yeah. I haven't sent any PR become something still confuses me:

The interesting thing is that although strictly speaking our logic is wrong (the connection is dropped when we say it isn't) there wouldn't be much difference if it was right. Here's what I mean: if we add the check if sock is None: return True all that will happen is we will call conn.close() because util.is_connection_dropped returned True. However, the way you get conn.sock is None is if someone else (most likely httplib.HTTP(S)Connection) already called self.close()!

So I'm wondering if there is some weird edge case with proxies, redirects and SSL or if this was a bit of a wild goose chase where we found a different bug that luckily doesn't cause issues (maybe we just switch the name to should_close_connection or something) and there is another issue in there. Or it might be that it's almost impossible to find a webserver that issues Connection: Close and then closes the connection now so I can't reproduce.

Regardless, I'd like an actual reproducible scenario for this one, do you have one @botouser?

edit: I actually did manage to find a different host that does the same redirect and issues Connection: close (https://nytimes.com) but still no error.

@blueplane
Copy link

@shazow You write:

Making a request with a closed socket should reopen it (this works because all the details remain the same—host, port, flags, etc).

I believe this isn't true for tunneled connections. httplib.HTTPConnection._tunnel() overwrites the original host and port of a connection - that is those of the proxy - with the host and port of the target. This seems to lead to the CONNECT being send to the target instead of the proxy, when there is an attempt to revive a closed connection.

@shazow
Copy link
Member

shazow commented Mar 21, 2014

Mmm, that's no good. Would love any help on this. :)

@mliu7
Copy link

mliu7 commented Mar 27, 2014

Thanks everyone for all your hard work on this issue. Like @botouser, this issue is also affecting me pretty badly so I truly appreciate your efforts. Keep up the good work! If I see something that may help I'll be sure to chime in.

@jschneier
Copy link
Contributor

@mliu7 can you share with us the set up/configuration that reproduces this issue. I've had trouble reproducing it in the past and having one would be extremely helpful.

@mliu7
Copy link

mliu7 commented Mar 29, 2014

@jschneier, my colleague @kriwil will chime in since he's the guy who has been working on this particular feature for us.

@kriwil
Copy link

kriwil commented Mar 31, 2014

False alarm. We're using requests 2.2.1 which uses old urllib3 (vendored). Updating the requests to latest master fixes the issue. Sorry @mliu7 @jschneier.

@shazow
Copy link
Member

shazow commented Mar 31, 2014

Soooo is this safe to close?

@mliu7
Copy link

mliu7 commented Mar 31, 2014

@shazow - I can't totally say if it's safe to close since a couple other people did say they have an issue, but for at least @kriwil and I urllib3 works great with https. Sorry for throwing in my +1 too early!

@jschneier
Copy link
Contributor

also yes i think this is safe to close at this point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

8 participants