Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cleaned-up python 3 proxy support #90

Merged
merged 17 commits into from
Mar 9, 2018

Conversation

cveilleux
Copy link
Contributor

@cveilleux cveilleux commented Feb 27, 2018

This is adapapted from #73

#73 introduced a lot of white-space changes and automated clean-ups which made the patch hard to review.

It also copy-pasted (!) a socks.py module from SocksiPy.

This version does not ship a socks module, users that require a proxy will need to install their own socks library (I used PySocks) in addition to httplib2.

Since the default value for proxy_info is proxy_info_from_environment, this means that without any other changes, upgrading to this version will cause httplib2 to respect HTTP_PROXY / HTTPS_PROXY environment variables.

Closes #53
Closes #25
Closes #22

Copy link
Member

@temoto temoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cveilleux thank you very much, it's much cleaner.


class CertificateHostnameMismatch(SSLHandshakeError):
def __init__(self, desc, host, cert):
HttpLib2Error.__init__(self, desc)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please super(). I know there is bad example above.

if ssl_version is None:
ssl_version = ssl.PROTOCOL_TLS

if hasattr(ssl, 'SSLContext'): # Python 3.2+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please put something like if not SSLContext -> raise RuntimeError("Python 3.2+ required with ssl.SSLContext")

Upon 2/3 merge we will not support 2.6- and 3.2- so SSLContext is automatically fulfilled.

self.sock.settimeout(self.timeout)
if self.debuglevel > 0:
print(
"connect: (%s, %s) ************" % (self.host, self.port))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please str.format() instead of %.

@@ -858,6 +958,112 @@ def __init__(self, host, port=None, key_file=None, cert_file=None,
cert_file=cert_file, timeout=timeout, context=context,
check_hostname=disable_ssl_certificate_validation ^ True)

def _GetValidHostsForCert(self, cert):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • (important) this ought to be redundant with modern ssl module that does proper SNI check, this code was in py2 tree to support 2.6-, AFAIK or please explain how it is needed ssl module host verification is not enough here
  • (minor but still) https://tldrlegal.com/license/apache-license-2.0-(apache-2.0) read section Must, this is not Android, we don't copy code and strip license information

@@ -858,6 +958,112 @@ def __init__(self, host, port=None, key_file=None, cert_file=None,
cert_file=cert_file, timeout=timeout, context=context,
check_hostname=disable_ssl_certificate_validation ^ True)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realized it's easier to understand as check=not disable. Change or ignore this comment since it doesn't affect proxy feature.

@cveilleux
Copy link
Contributor Author

@temoto thanks for the quick feedback. You are right, the modern ssl module can perform the cert validation for us. Fixed.

def ssl_wrap_socket(sock, key_file, cert_file, disable_validation,
ca_certs, ssl_version, hostname):

if not ssl.SSLContext:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry i gave misleading example, this will fail with AttributeError on old Pythons, hasattr is still needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry.. I'm a spoiled Python 3 only user.. Did not know that, will fix.

@@ -125,7 +146,7 @@ class CertificateValidationUnsupportedInPython31(HttpLib2Error): pass
HOP_BY_HOP = ['connection', 'keep-alive', 'proxy-authenticate', 'proxy-authorization', 'te', 'trailers', 'transfer-encoding', 'upgrade']

# Default CA certificates file bundled with httplib2.
CA_CERTS = os.path.join(
DEFAULT_CA_CERTS = os.path.join(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, this name is used by some projects as part of our API, so there has to be a good reason for change.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just wanted to make the code cleaner but you are right about the breaking change. Will revert.

self.disable_ssl_certificate_validation, self.ca_certs,
self.ssl_version, self.host)
if self.debuglevel > 0:
print("connect: (%s, %s)" % (self.host, self.port))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

% -> format

@cveilleux
Copy link
Contributor Author

@temoto Should be all good now.

@temoto
Copy link
Member

temoto commented Feb 27, 2018

@cveilleux please have a look at our new tests (as of today not merged to master) #89 it's in branch tests.

https://github.com/httplib2/httplib2/blob/tests/tests/test_proxy.py

These are test failures with your proxy patch:

_____________________________________________________________ test_applies_to ______________________________________________________________
def test_applies_to():
        os.environ['http_proxy'] = 'http://myproxy.example.com:80'
        os.environ['https_proxy'] = 'http://myproxy.example.com:81'
        os.environ['no_proxy'] = 'localhost,otherhost.domain.local,example.com'
        pi = httplib2.proxy_info_from_environment()
>       assert not pi.applies_to('localhost')
E       AttributeError: 'ProxyInfo' object has no attribute 'applies_to'

pi         = <httplib2.ProxyInfo object at 0x104849390>

tests/test_proxy.py:54: AttributeError
_______________________________________________________________ test_headers _______________________________________________________________
def test_headers():
        headers = {'key0': 'val0', 'key1': 'val1'}
>       pi = httplib2.ProxyInfo(httplib2.socks.PROXY_TYPE_HTTP, 'localhost', 1234, proxy_headers=headers)
E       AttributeError: 'NoneType' object has no attribute 'PROXY_TYPE_HTTP'

headers    = {'key0': 'val0', 'key1': 'val1'}

tests/test_proxy.py:71: AttributeError
____________________________________________________________ test_noproxy_star _____________________________________________________________
def test_noproxy_star():
        os.environ['http_proxy'] = 'http://myproxy.example.com:80'
        os.environ['NO_PROXY'] = '*'
        pi = httplib2.proxy_info_from_environment()
        for host in ('localhost', '169.254.38.192', 'www.google.com'):
>           assert not pi.applies_to(host)
E           AttributeError: 'ProxyInfo' object has no attribute 'applies_to'

host       = 'localhost'
pi         = <httplib2.ProxyInfo object at 0x104843da0>

tests/test_proxy.py:65: AttributeError

@cveilleux
Copy link
Contributor Author

Should be easy to fix.

It will require adding support for no_proxy env var which would probably be a good thing.

Will you be merging the tests to master soon?

@temoto
Copy link
Member

temoto commented Feb 28, 2018

@cveilleux tests will be merged soon, maybe guys will request changes.

@cveilleux
Copy link
Contributor Author

@temoto I merged the tests branch into an integration branch with mine and pushed to travis ci.

There was one failure related to a wrong exception being thrown which I've fixed in 56e31a2

I could not replicate the 2 other failures you pasted.. Are you sure these tests are in the tests branch? I used tests from 7d31716 (May 2017).

I had some more failures with the python 3.3 and 3.4 builds but they seem unrelated to the proxy code and probably already failing tests.

You can see Python 2.7, 3.5 and 3.6 passing tests here:
https://travis-ci.org/cveilleux/httplib2/builds/347083638

raise RuntimeError("httplib2 requires Python 3.2+ for ssl.SSLContext")

if ssl_version is None:
ssl_version = ssl.PROTOCOL_TLS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This name is not defined in 3.4- shown in https://travis-ci.org/cveilleux/httplib2/jobs/347083645 and https://docs.python.org/3.6/library/ssl.html#ssl.PROTOCOL_TLS

Please make it like this

DEFAULT_TLS_VERSION = getattr(ssl, 'PROTOCOL_TLS', None) or getattr(ssl, 'PROTOCOL_SSLv23')
...

  self.version = DEFAULT_TLS_VERSION

except (socket.timeout, socket.gaierror):
raise
except socket.error as error_msg:
self.error_msg = error_msg
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • (minor, naming) it is not error message, except statement binds exception object. Error message is, sometimes, ex.msg attribute of exception object, or more generally, it's str(ex). But we don't need error message here. We need actual exception object.
  • (major) assigning to instance leaks exception object (and associated traceback) after connect() call
  • (major) raise socket.error(self.error_msg) is creating another exception object wrapping previous one, so caller will not get .errno and maybe other useful attributes of original exception object
  • (major) self.error_msg leaks code flow, given how complicated code is here, it's possible that raise socket.error(self.error_msg) will happen with outdated, irrelevant object

Solution: define error object in scope of connect() function and raise it without wrapping into another level of socket.error
Something like this:

def connect(self):
  socket_err = None
  # ...
    try:
    # ...
    except socket.error as e:
      socket_err = e
      # if debug...
  if not self.sock:
    raise socket_err

I'm sorry for headaches for code that you didn't write just copied from other file, but it's really better to fix it earlier than later. We have outstanding issue with this except/socket.error in py2 code tree.

@cveilleux
Copy link
Contributor Author

@temoto I made the corrections.

I also fixed another compatibility issue with Python 3.3 (see c4ff3cc).

Tests are passing in all versions!

https://travis-ci.org/cveilleux/httplib2/builds/347429839

@temoto
Copy link
Member

temoto commented Feb 28, 2018

Cool. Congratulations with pass.

@cveilleux
Copy link
Contributor Author

FYI, Python 3.3 is EOL.. https://www.python.org/download/releases/3.3.0/
Might be better to drop suppport.

Copy link
Member

@temoto temoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I'm a pain in the ass

if cert_file:
context.load_cert_chain(cert_file, key_file)

context = _build_ssl_context(self.disable_ssl_certificate_validation, ca_certs, cert_file, key_file)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes it by default use system default certificates instead of CA_CERTS. Fix:

context = _build_ssl_context(self.disable_ssl_certificate_validation, self.ca_certs, cert_file, key_file)

I think it would be better to drop self.ca_certs altogether and just rewrite it like ca_certs = ca_certs or CA_CERTS but it breaks public API, so not today.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_build_ssl_context also falls back to CA_CERTS if the ca_certs param is None. So changing it to self.ca_certs would not change anything. See:

https://github.com/httplib2/httplib2/pull/90/files#diff-72a22fe2efc9771bbe023ba1ffa07173R151

@@ -969,10 +974,10 @@ def connect(self):
if has_timeout(self.timeout):
sock.settimeout(self.timeout)
sock.connect((self.host, self.port))
self.sock = ssl_wrap_socket(
self.sock = _ssl_wrap_socket(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're creating context in Connection.__init__ only to forget it and create again in _ssl_wrap_socket().

Intuitively, there should be only one ssl helper, something is wrong if we have two.

  def connect(self):
    # ...
    self.sock = self._context.wrap...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right here, there is a way to re-use the context in connect() instead of creating a new one. I will look into it.

@temoto
Copy link
Member

temoto commented Feb 28, 2018

According to this https://trends.builtwith.com/framework/Python
py3.3 is used much more than 3.4
but these guys report opposite https://w3techs.com/technologies/details/pl-python/3/all

tssubundu LTS ship 3.2 (12.04), then 3.4 and 3.5.

I'm perplexed.

We'd probably need to restore custom hostname checker, it's not a big deal, but only use it when native is not available.

@cveilleux
Copy link
Contributor Author

Regarding python 3.3 and the change in c4ff3cc :

SSLContext.check_hostname changes the behavior of SSLSocket.do_handshake(), and causes it to automatically perform call to match_hostname() during the handshake.

It should be possible in Python 3.3 to replicate that same behavior by manually calling match_hostname(), which exists since Python 3.2.

https://docs.python.org/3/library/ssl.html#ssl.SSLSocket.do_handshake

@temoto
Copy link
Member

temoto commented Feb 28, 2018

Oh thanks, I didn't know that.

@cveilleux
Copy link
Contributor Author

cveilleux commented Feb 28, 2018

@temoto Please see my latest commit ( d8db356 ) which fixes the 2 outstanding issues:

  • Restore full Python 3.3 compatibility by calling do_handshake() if SSLContext.check_hostname is not available.
  • Get rid of the _ssl_wrap_socket helper and simply re-use the SSLContext directly.

@cveilleux
Copy link
Contributor Author

Tests are still green: https://travis-ci.org/cveilleux/httplib2/builds/347488098

check_hostname=disable_ssl_certificate_validation ^ True)
self.proxy_info = proxy_info('https')

context = _build_ssl_context(self.disable_ssl_certificate_validation, ca_certs, cert_file, key_file)
Copy link
Member

@temoto temoto Feb 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use self.ca_certs here and make ca_certs required argument in build_context without if else.
Reason: there must be one place (if possible) in code which selects certs location.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in b72546d

@temoto
Copy link
Member

temoto commented Feb 28, 2018

There is not enough tests for all requested changes. But don't worry about it, I'll add some more tests later.

@cveilleux
Copy link
Contributor Author

Latest test run with the ca_certs fix: https://travis-ci.org/cveilleux/httplib2/builds/347549662

Copy link
Member

@temoto temoto left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@temoto
Copy link
Member

temoto commented Mar 1, 2018

@cdent @dims @sigmavirus24 @stinky2nine guys urgent help is needed, I'm out of changes to request.

@temoto
Copy link
Member

temoto commented Mar 1, 2018

@cveilleux thank you very much. We need 2 reviews before merge, please have patience.

@prouhard
Copy link

prouhard commented Mar 5, 2018

@cveilleux Nice work, sorry for the mess with my branch, I badly needed that functionality for a project and just started to use git at this time...

@bbattino
Copy link

bbattino commented Mar 6, 2018

Great work !

@temoto temoto merged commit 4f59e79 into httplib2:master Mar 9, 2018
temoto pushed a commit that referenced this pull request Mar 9, 2018
As requested by @temoto in #90

Wrapping socket.error looses debugging information. Raise the original error instead.
This will need porting to Python 2 version.
@temoto
Copy link
Member

temoto commented Mar 9, 2018

Dear fellow maintainers, I'm sorry for pushing matters, hope for understanding. There is still time to change something before pypi release.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants