Connecting to HTTP server through HTTPS proxy fails, seems to send requests to the proxy itself #2137

Closed
phess opened this issue Jan 14, 2021 · 2 comments

Comments

@phess

phess commented Jan 14, 2021

Subject

Connecting to HTTPS servers through both HTTP and HTTPS proxies works fine.
Connecting to HTTP servers through HTTP proxies works fine.
Now, connecting to HTTP servers through HTTPS proxies fails every time on multiple environments I've tested.
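
For orientation, here is a rough sketch of how the four combinations above map onto the two proxy modes. It is a simplified model that assumes default settings in recent urllib3 releases, not code taken from urllib3; `proxy_mode` is a hypothetical helper, and any optional proxy settings are ignored.

```python
def proxy_mode(proxy_scheme, destination_scheme):
    """Return the proxy mode a client would typically use (simplified model).

    Hypothetical helper for illustration only. Under default settings the
    choice depends on the destination scheme, not on the proxy scheme:
    plain HTTP destinations are forwarded, HTTPS destinations are tunneled.
    """
    if destination_scheme == "http":
        return "forwarding"  # request line carries the absolute URL
    return "tunnel"          # CONNECT first, then TLS to the origin


if __name__ == "__main__":
    for proxy in ("http", "https"):
        for destination in ("http", "https"):
            print(proxy, "proxy ->", destination, "server:",
                  proxy_mode(proxy, destination))
```

Under this model the failing case (HTTP server through an HTTPS proxy) is a forwarding case, while the two HTTPS-server cases are tunneling cases.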

Environment

Multiple, see below:

OS Linux-4.18.0-193.el8.x86_64-x86_64-with-redhat-8.2-Ootpa
Python 3.6.8
urllib3 1.26.2

OS Linux-4.18.0-193.el8.x86_64-x86_64-with-redhat-8.2-Ootpa
Python 3.6.8
urllib3 1.24.2

OS Linux-3.10.0-1160.11.1.el7.x86_64-x86_64-with-redhat-7.9-Maipo
Python 2.7.5
urllib3 1.10.2

Steps to Reproduce

Run the script below on Python 2, or its equivalent on Python 3.

#!/usr/bin/env python2

import urllib3

HTTPS_PROXY = 'https://proxy.lab.local:3128'
HTTP_PROXY  = 'http://proxy.lab.local:3128'
URL = 'http://centos.mirror.nucleus.be/7/opstools/x86_64/'

http = urllib3.PoolManager()
https_proxy = urllib3.ProxyManager(HTTPS_PROXY)
http_proxy = urllib3.ProxyManager(HTTP_PROXY)

print "Trying through HTTPS_PROXY..."
try:
    req = https_proxy.request('GET', URL)
    print req.data
except Exception as e:
    print "HTTPS_PROXY failed with this exception:"
    print "   ", e
    print ""
    print "Trying through HTTP_PROXY instead..."
    req = http_proxy.request('GET', URL)
    if req:
        print "*** HTTP proxy works ***"
        print ""
        print "server response follows"
        print req.data

Expected Behavior

Connecting to HTTP server through HTTPS proxy should work just like it does for HTTP servers through HTTP proxies and HTTPS servers through HTTP/HTTPS proxies.

Actual Behavior

Script output shows:

Trying through HTTPS_PROXY...
HTTPS_PROXY failed with this exception:
    HTTPSConnectionPool(host='proxy.lab.local', port=3128): Max retries exceeded with url: http://centos.mirror.nucleus.be/7/opstools/x86_64/ (Caused by ProxyError('Cannot connect to proxy.', error('Tunnel connection failed: 403 Forbidden',)))

Trying through HTTP_PROXY instead...
*** HTTP proxy works ***

server response follows
<html>
<head><title>Index of /7/opstools/x86_64/</title></head>
<body>
<h1>Index of /7/opstools/x86_64/</h1><hr><pre><a href="../">../</a>
<a href="common/">common/</a>                                            28-Nov-2018 23:43       -
<a href="fluentd/">fluentd/</a>                                           13-Sep-2017 12:54       -
<a href="logging/">logging/</a>                                           26-Mar-2019 09:02       -
<a href="perfmon/">perfmon/</a>                                           26-Feb-2020 09:31       -
<a href="repodata/">repodata/</a>                                          26-Feb-2020 09:31       -
<a href="sensu/">sensu/</a>                                             25-Mar-2019 09:06       -
</pre><hr></body>
</html>

Proxy logs so far seem to point to this code attempting to GET / on the proxy itself.

@sethmlarson
Member

When urllib3 requests an HTTP resource through an HTTPS proxy, it uses the proxy in "forwarding" mode rather than CONNECT tunneling. I ran a test of this locally, and the request sent to the proxy was:

    GET http://centos.mirror.nucleus.be/7/opstools/x86_64/ HTTP/1.1\r\n
    Accept-Encoding: identity\r\n
    Accept: */*\r\n
    Host: centos.mirror.nucleus.be\r\n
    User-Agent: python-urllib3/2.0.0.dev0\r\n
    \r\n

Notice that the entire URL is used as the request target (absolute-form), as expected per RFC 7230 Section 5.3.2. The proxy should then fetch the resource and return the response transparently.
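
To make the difference between the two modes concrete, here is a minimal sketch that builds the first bytes a client would send to the proxy in each mode. It is illustrative only and not urllib3's implementation; the function names and the `example-client/0.0` User-Agent are made up for this example.

```python
from urllib.parse import urlsplit


def forwarding_request(url, user_agent="example-client/0.0"):
    """Build a forwarding-mode request: the request target is the full URL."""
    host = urlsplit(url).netloc  # e.g. "centos.mirror.nucleus.be"
    lines = [
        "GET {} HTTP/1.1".format(url),  # absolute-form request target
        "Host: {}".format(host),
        "Accept: */*",
        "User-Agent: {}".format(user_agent),
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def connect_request(host, port):
    """Build a tunneling-mode request: CONNECT to the origin's host:port."""
    authority = "{}:{}".format(host, port)
    lines = [
        "CONNECT {} HTTP/1.1".format(authority),  # authority-form target
        "Host: {}".format(authority),
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


if __name__ == "__main__":
    url = "http://centos.mirror.nucleus.be/7/opstools/x86_64/"
    print(forwarding_request(url).decode("ascii"))
    print(connect_request("centos.mirror.nucleus.be", 80).decode("ascii"))
```

A proxy that only accepts the second form would reject the forwarded GET even though the client is behaving as the RFC describes.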

I don't think this is a defect in urllib3; it has been implemented this way for a long time without other users reporting this issue. Perhaps your proxy expects tunneling only?

@phess
Author

phess commented Jan 15, 2021

> I don't think this is a defect in urllib3 and has been implemented this way for a long time without other users finding this issue. Perhaps your proxy is expecting tunneling only?

I was baffled to find such an issue this "late" in the process, so it may well be that the proxies I tested with expect tunnelling only. Thank you so much for running this test.

I'm going to close this issue now.

@phess phess closed this as completed Jan 15, 2021