Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Match_hostname: hostname 'github.com.' doesn't match 'github.com' or 'www.github.com' #1254

Closed
karteek opened this issue Aug 31, 2017 · 7 comments
Assignees

Comments

@karteek
Copy link

karteek commented Aug 31, 2017

I'm getting hostname mismatch error when I try to access a domain with a trailing dot.

Example:

Version of urllib3 I'm using

(venv) ➜  ~ pip list --format=columns | grep urllib3
urllib3                      1.22    
(venv) ➜  ~ python
Python 2.7.10 (default, Jul 14 2015, 19:46:27) 
[GCC 4.2.1 Compatible Apple LLVM 6.0 (clang-600.0.39)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib3
>>> http = urllib3.PoolManager(cert_reqs='CERT_REQUIRED')
>>> r = http.request('GET', 'https://github.com/')
>>> print r.status
200
>>> r = http.request('GET', 'https://github.com./')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/karteek/venv/lib/python2.7/site-packages/urllib3/request.py", line 66, in request
    **urlopen_kw)
  File "/Users/karteek/venv/lib/python2.7/site-packages/urllib3/request.py", line 87, in request_encode_url
    return self.urlopen(method, url, **extra_kw)
  File "/Users/karteek/venv/lib/python2.7/site-packages/urllib3/poolmanager.py", line 321, in urlopen
    response = conn.urlopen(method, u.request_uri, **kw)
  File "/Users/karteek/venv/lib/python2.7/site-packages/urllib3/connectionpool.py", line 668, in urlopen
    **response_kw)
  File "/Users/karteek/venv/lib/python2.7/site-packages/urllib3/connectionpool.py", line 668, in urlopen
    **response_kw)
  File "/Users/karteek/venv/lib/python2.7/site-packages/urllib3/connectionpool.py", line 668, in urlopen
    **response_kw)
  File "/Users/karteek/venv/lib/python2.7/site-packages/urllib3/connectionpool.py", line 639, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/Users/karteek/venv/lib/python2.7/site-packages/urllib3/util/retry.py", line 388, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='github.com.', port=443): Max retries exceeded with url: / (Caused by SSLError(CertificateError("hostname 'github.com.' doesn't match either of 'github.com', 'www.github.com'",),))

Similar request using curl seems to go fine.

curl -v https://github.com./ 
*   Trying 192.30.255.112...
* Connected to github.com (192.30.255.112) port 443 (#0)
* TLS 1.2 connection using TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256
* Server certificate: github.com
* Server certificate: DigiCert SHA2 Extended Validation Server CA
* Server certificate: DigiCert High Assurance EV Root CA
> GET / HTTP/1.1
> Host: github.com
> User-Agent: curl/7.43.0
> Accept: */*
> 
< HTTP/1.1 200 OK
< Date: Thu, 31 Aug 2017 21:06:01 GMT
< Content-Type: text/html; charset=utf-8
< Transfer-Encoding: chunked
< Server: GitHub.com
< Status: 200 OK
< Cache-Control: no-cache

Also, going to similar URL https://github.com./robots.txt on browsers Firefox and Chrome doesn't show any SSL errors (They get redirected to https://github.com/robots.txt)

@haikuginger
Copy link
Contributor

This is an interesting issue! Looking at how cURL does it, it seems like it's internally setting the Host header to remove the trailing dot, as well as removing the trailing dot when doing hostname validation on the cert.

I did some quick checking on getaddrinfo (which we use for DNS resolution), and it seems like it does both check my system's search domain list when the dot isn't present, and respect the trailing dot (indicating an FQDN) to indicate an absolute domain when it is present, so we definitely want to make any transformations post-resolution.

It's unclear to me whether we should just need to make the modification needed for SSL certification somewhere around here (that'll pass a hostname with trailing dot removed to whatever SSL backend is in use for the purposes of certificate verification) or if we should perform the dot-ectomy change in the Host header we send as well (which would need more careful consideration due to the DNS resolution stuff mentioned above).

As for algorithm, I believe simply doing host.rstrip('.') is the minimally-invasive operation we need to remove a trailing dot that indicates an FQDN; as far as I can tell, neither the IPv4 nor IPv6 syntax allows for trailing dots.

@Lukasa, thoughts? I'd be happy to work on a fix for this.

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Sep 1, 2017

Probably we should almost unconditionally do the dot-ectomy. It is likely to confuse most people to see it and also to cause latent bugs down the line, so probably after DNS resolution it should just be stripped.

@haikuginger
Copy link
Contributor

@Lukasa, sounds good. I'll get to work on that; the unfortunate part is that we'll probably need to store the domain twice (once for DNS purposes; once for everything else).

@Lukasa
Copy link
Sponsor Contributor

Lukasa commented Sep 1, 2017

Nah, let's just create a property that does the transformation for us when we need it. 😁

@haikuginger
Copy link
Contributor

LAZY EVALUATION FOR THE LAZY DEVELOPMENT GOD

@tiran
Copy link
Contributor

tiran commented Nov 29, 2017

The same issue came up on Python's bug tracker, https://bugs.python.org/issue31997 . TL;DR and IMHO urllib3 is the correct place to resolve the issue. You should use the FQDN + trailing dot for DNS lookup, then strip the trailing dot and use the clean FQDN for SNI, HTTP Host header, and hostname matching.

@haikuginger
Copy link
Contributor

@tiran yep! I've got an open PR to do that; time has just been a bit tight lately. It's on my list of things to get done very soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants