Can't use bucket names with dots #2836

Open
lewisdiamond opened this Issue Dec 22, 2014 · 98 comments


lewisdiamond commented Dec 22, 2014

Using boto with an S3 bucket whose name contains dots fails, because the request goes to a host like my.bucket.s3.amazonaws.com:

ssl.CertificateError: hostname 'my.bucket.s3.amazonaws.com' doesn't match either of '*.s3.amazonaws.com', 's3.amazonaws.com'
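
A minimal sketch of the failure (assuming default credentials and boto's default subdomain calling format; my.bucket is a placeholder name):

from boto.s3.connection import S3Connection

conn = S3Connection()                   # defaults to SubdomainCallingFormat
bucket = conn.get_bucket('my.bucket')   # raises ssl.CertificateError on Python 2.7.9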

krallin (Contributor) commented Dec 22, 2014

Having the same issue here, running on Python 2.7.9. Python 2.7.9 introduced strict certificate checking. This might be why the error is happening.

@lewisdiamond — are you also running Python 2.7.9?

krallin (Contributor) commented Dec 22, 2014

Note: setting the following configuration solves the issue for me:

[s3]
calling_format = boto.s3.connection.OrdinaryCallingFormat
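
The same workaround can also be applied per connection instead of via the .boto config file; a minimal sketch (the bucket name is a placeholder):

from boto.s3.connection import S3Connection, OrdinaryCallingFormat

# Equivalent to the [s3] calling_format setting above, scoped to one connection.
conn = S3Connection(calling_format=OrdinaryCallingFormat())
bucket = conn.get_bucket('my.bucket')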

lewisdiamond commented Dec 23, 2014

@krallin yes, 2.7.9


kmarekspartz commented Dec 23, 2014

Same here and @krallin's config change fixes it for me.


starrify commented Dec 23, 2014

Same issue experienced here. Solved by @krallin's fix.


jbmartin commented Dec 26, 2014

Thanks @krallin, your fix works for me on Python 2.7.9.


alex commented Jan 3, 2015

This looks like an AWS bug to me; as far as I can tell from the various RFCs, *.domain.com in a SAN should not match domain1.domain2.domain.com.
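
For illustration, the standard library's own matcher shows the single-label wildcard rule at work (a sketch using a simplified certificate dict, not code from this thread):

import ssl

# Simplified cert dict mirroring the SANs on S3's wildcard certificate.
cert = {'subjectAltName': (('DNS', '*.s3.amazonaws.com'), ('DNS', 's3.amazonaws.com'))}

ssl.match_hostname(cert, 'my-bucket.s3.amazonaws.com')    # OK: the wildcard covers exactly one label
try:
    ssl.match_hostname(cert, 'my.bucket.s3.amazonaws.com')
except ssl.CertificateError as e:
    print e    # the wildcard does not span the extra dot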


oberstet commented Jan 3, 2015

This workaround does not work for me.

Test program:

from boto.s3.connection import S3Connection
conn = S3Connection()
print conn.get_bucket("web-autobahn-ws")
print conn.get_bucket("autobahn.ws")

Without the workaround in .boto:

$ python test.py
<Bucket: web-autobahn-ws>
Traceback (most recent call last):
  File "test.py", line 4, in <module>
    print conn.get_bucket("autobahn.ws")
  File "c:\Python27\lib\site-packages\boto\s3\connection.py", line 502, in get_bucket
    return self.head_bucket(bucket_name, headers=headers)
  File "c:\Python27\lib\site-packages\boto\s3\connection.py", line 521, in head_bucket
    response = self.make_request('HEAD', bucket_name, headers=headers)
  File "c:\Python27\lib\site-packages\boto\s3\connection.py", line 664, in make_request
    retry_handler=retry_handler
  File "c:\Python27\lib\site-packages\boto\connection.py", line 1068, in make_request
    retry_handler=retry_handler)
  File "c:\Python27\lib\site-packages\boto\connection.py", line 942, in _mexe
    request.body, request.headers)
  File "c:\Python27\lib\httplib.py", line 1001, in request
    self._send_request(method, url, body, headers)
  File "c:\Python27\lib\httplib.py", line 1035, in _send_request
    self.endheaders(body)
  File "c:\Python27\lib\httplib.py", line 997, in endheaders
    self._send_output(message_body)
  File "c:\Python27\lib\httplib.py", line 850, in _send_output
    self.send(msg)
  File "c:\Python27\lib\httplib.py", line 812, in send
    self.connect()
  File "c:\Python27\lib\httplib.py", line 1216, in connect
    server_hostname=server_hostname)
  File "c:\Python27\lib\ssl.py", line 350, in wrap_socket
    _context=self)
  File "c:\Python27\lib\ssl.py", line 566, in __init__
    self.do_handshake()
  File "c:\Python27\lib\ssl.py", line 796, in do_handshake
    match_hostname(self.getpeercert(), self.server_hostname)
  File "c:\Python27\lib\ssl.py", line 269, in match_hostname
    % (hostname, ', '.join(map(repr, dnsnames))))
ssl.CertificateError: hostname 'autobahn.ws.s3.amazonaws.com' doesn't match either of '*.s3.amazonaws.com', 's3.amazonaws.com'

With the workaround:

$ python test.py
Traceback (most recent call last):
  File "test.py", line 3, in <module>
    print conn.get_bucket("web-autobahn-ws")
  File "c:\Python27\lib\site-packages\boto\s3\connection.py", line 502, in get_bucket
    return self.head_bucket(bucket_name, headers=headers)
  File "c:\Python27\lib\site-packages\boto\s3\connection.py", line 549, in head_bucket
    response.status, response.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 301 Moved Permanently

oberstet commented Jan 3, 2015

FWIW, here is how to monkey patch away hostname verification:

import ssl
if hasattr(ssl, '_create_unverified_context'):
   ssl._create_default_https_context = ssl._create_unverified_context

Other than that, it seems migrating to CloudFront (which doesn't require source S3 buckets to be dotted) might be an option.


oberstet commented Jan 5, 2015

Here is a more specific monkey patch:

import ssl

_old_match_hostname = ssl.match_hostname

def _new_match_hostname(cert, hostname):
   if hostname.endswith('.s3.amazonaws.com'):
      pos = hostname.find('.s3.amazonaws.com')
      hostname = hostname[:pos].replace('.', '') + hostname[pos:]
   return _old_match_hostname(cert, hostname)

ssl.match_hostname = _new_match_hostname

krallin (Contributor) commented Jan 5, 2015

@alex,

I imagine S3 can't (or just doesn't) generate certificates on the fly for buckets that don't match the generic certificate.

However, it might make sense for boto to default to the ordinary calling format (vs. the subdomain calling format)? At least for "dotted" buckets where the subdomain calling format will not work?

The default might have to do with #443, though.

@oberstet,

Is your bucket located outside of us-east-1? Your issue looks a lot like #443.


oberstet commented Jan 5, 2015

@krallin Yes, my bucket is in EU West. And yes, the workaround with the calling format triggers an error that looks very similar to #443. Currently, the only thing that works for me is monkey patching.

krallin (Contributor) commented Jan 5, 2015

@oberstet

Using a patched HTTP connection seems to work too. The following code uses the standard library for SSL cert validation, but submits a different hostname so that validation passes (one that matches S3's cert).

This works on Python 2.7.9 (though it will not work on earlier versions of Python, since those don't have ssl.SSLContext, so some conditionals would be required).

import logging
import socket, ssl
import re

import boto
from boto.https_connection import CertValidatingHTTPSConnection

logging.basicConfig(level=logging.WARNING)

class TestHttpConntection(CertValidatingHTTPSConnection):
    # !! Unsafe on Python < 2.7.9
    def __init__(self, *args, **kwargs):
        CertValidatingHTTPSConnection.__init__(self, *args, **kwargs)  # No super, it's an old-style class
        self.ssl_ctx = ssl.create_default_context(cafile=self.ca_certs)  # Defaults to cert validation
        if self.cert_file is not None:
            self.ssl_ctx.load_cert_chain(certfile=self.cert_file, keyfile=self.key_file)

    def connect(self):
        "Connect to a host on a given (SSL) port."
        if hasattr(self, "timeout"):
            sock = socket.create_connection((self.host, self.port), self.timeout)
        else:
            sock = socket.create_connection((self.host, self.port))

        if re.match(r".*\.s3.*\.amazonaws\.com", self.host):
            patched_host = ".".join(self.host.rsplit(".", 4)[1:])
        else:
            patched_host = self.host  # not an S3 virtual-hosted bucket host; validate as-is
        boto.log.warn("Connecting to '%s', validated as '%s'", self.host, patched_host)
        self.sock = self.ssl_ctx.wrap_socket(sock, server_hostname=patched_host)


def main():
    from boto.s3.connection import S3Connection
    conn = S3Connection(https_connection_factory=(TestHttpConntection, ()))

    print conn.get_bucket("boto2836.us-east-1.test")
    print "Standard OK"

    print conn.get_bucket("boto2836.eu-west-1.test")
    print "EU OK"

if __name__ == "__main__":
    main()

Output:

WARNING:boto:Connecting to 'boto2836.us-east-1.test.s3.amazonaws.com', validated as 'test.s3.amazonaws.com'
<Bucket: boto2836.us-east-1.test>
Standard OK
WARNING:boto:Connecting to 'boto2836.eu-west-1.test.s3.amazonaws.com', validated as 'test.s3.amazonaws.com'
WARNING:boto:Connecting to 'boto2836.eu-west-1.test.s3-eu-west-1.amazonaws.com', validated as 'test.s3-eu-west-1.amazonaws.com'
<Bucket: boto2836.eu-west-1.test>
EU OK

krallin (Contributor) commented Jan 5, 2015

Note that I'm still getting a 400 when connecting to S3 in Frankfurt, but I think that's because S3 in Frankfurt requires a different signature format.
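
For what it's worth, a possible workaround for the Frankfurt (eu-central-1) case is to enable SigV4 signing and point boto at the regional endpoint. This is an untested sketch; the use-sigv4 option and the explicit regional host are assumptions, not something confirmed in this thread:

# ~/.boto
[s3]
use-sigv4 = True

import boto.s3
from boto.s3.connection import OrdinaryCallingFormat

# SigV4-only regions must be addressed through their regional endpoint.
conn = boto.s3.connect_to_region(
    'eu-central-1',
    host='s3.eu-central-1.amazonaws.com',
    calling_format=OrdinaryCallingFormat()
)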


krallin (Contributor) commented Jan 5, 2015

I'm working on a patch here https://github.com/krallin/boto/compare/fix-2836

First, I'm making sure that cert validation is left up to Boto regardless of the Python version (which means that when validate_certs is None, as is the case in S3Connection, certs are indeed not validated). This is done, and I'm adding tests (I have a few done manually; I just need to convert those to integration tests).

Finally, I'll also try and add an option for boto to accept certs for "dotted" buckets on S3.

Cheers,


excieve commented Jan 5, 2015

Same issue here with 2.7.9. However aws-cli, which is based on the new botocore, works fine as long as the arguments are passed a certain way. There's a related issue in aws-cli.

krallin added a commit to krallin/boto that referenced this issue Jan 5, 2015

Add failing test for #2836
Although boto disables certificate hostname validation for S3, the
standard library still checks certificates in Python 2.7.9.


krallin referenced this issue Jan 18, 2015

Closed

Can't deploy #113

powdahound added a commit to powdahound/ec2instances.info that referenced this issue Jan 26, 2015


NamanJn commented Feb 26, 2015

Yeah same here, getting an error when accessing buckets with dots.


mattse commented Mar 5, 2015

I'm having similar issues.

With this code:

conn = S3Connection(awsAccessKeyID, awsSecretKey)

It works fine with a bucket name that has no periods in it, like matt-test.
But if a bucket name has a period in it, like matt.test, I'll get the following error:

InvalidCertificateException: Host matt.test.s3.amazonaws.com returned an invalid certificate (remote hostname \"matt.test.s3.amazonaws.com\" does not match certificate): {'notAfter': 'Apr  9 23:59:59 2015 GMT', 'subjectAltName': ((u'DNS', '*.s3.amazonaws.com'), (u'DNS', 's3.amazonaws.com')), 'subject': ((('countryName', u'US'),), (('stateOrProvinceName', u'Washington'),), (('localityName', u'Seattle'),), (('organizationName', u'Amazon.com Inc.'),), (('commonName', u'*.s3.amazonaws.com'),))}

And if I change the code to:

conn = S3Connection(awsAccessKeyID, awsSecretKey, calling_format=OrdinaryCallingFormat())

it now works when there are periods in the name, but fails when there are none. Here's the failure in that case:

<Error><Code>PermanentRedirect</Code><Message>The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.</Message><Bucket>matt-test</Bucket><Endpoint>matt-test.s3.amazonaws.com</Endpoint><RequestId>DB89A58C5FFB8B2E</RequestId><HostId>ODxMzw0brxB4PyqpmGD+Ecff8lak6DuULecHrt3S6PHcRclft8tFaDjUXRXd62dm</HostId></Error>"

The only solution I've found that works is to if/else based on the bucket name:

if '.' in bucketName:
    conn = S3Connection(awsAccessKeyID, awsSecretKey, calling_format=OrdinaryCallingFormat())
else:
    conn = S3Connection(awsAccessKeyID, awsSecretKey)

Hope that helps.

EDIT: It's still failing for international buckets (anything other than US Standard). I'm trying to figure that out next.
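
One possible way to handle buckets in any region without hard-coding a region per bucket is to ask S3 for the bucket's location first and then reconnect to the matching region. This is an untested sketch, not something from this thread, and it assumes GetBucketLocation is reachable through the default endpoint with the ordinary calling format:

import boto.s3
from boto.s3.connection import S3Connection, OrdinaryCallingFormat

def connect_for_bucket(bucket_name, access_key, secret_key):
    # Path-style probe against the default endpoint, just to read the bucket's region.
    probe = S3Connection(access_key, secret_key, calling_format=OrdinaryCallingFormat())
    region = probe.get_bucket(bucket_name, validate=False).get_location()
    if region == '':
        region = 'us-east-1'    # US Standard reports an empty location constraint
    elif region == 'EU':
        region = 'eu-west-1'    # legacy location constraint name
    return boto.s3.connect_to_region(
        region,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
        calling_format=OrdinaryCallingFormat()
    )

With that, connect_for_bucket('matt.test', awsAccessKeyID, awsSecretKey).get_bucket('matt.test') should in principle work for both dotted and undotted names, at the cost of one extra request per bucket.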


gholms (Contributor) commented Mar 5, 2015

I'll post some info here to save a little time: the ordinary calling format uses older, "path-style" URLs like https://s3-us-west-2.amazonaws.com/bukkit/key to address things, which means that when you create the S3Connection you have to point it at the right endpoint, or the service will reply with a redirect that is intentionally difficult to handle automatically. As long as you use the DNS name that matches the region of the bucket you want to work with, it should work just fine.


mattse commented Mar 6, 2015

@gholms thanks for the insight!

So I have buckets located in various regions with and without dots in their name that I have to upload to. I would like a clean solution to be able to handle them all. My current solution is now:

conn = boto.s3.connect_to_region(
    region,
    aws_access_key_id=awsAccessKeyID,
    aws_secret_access_key=awsSecretKey,
    calling_format=OrdinaryCallingFormat()
    )

But this requires me to map a region (us-east-1, us-west-1, etc.) to each bucket, which is something I haven't had to do before. Previously the default calling format worked fine for buckets with dots in their name. Looking at my logs, it seems that starting February 13th I began getting the ssl.CertificateError mentioned in the first post of this issue for buckets with a dot in their name. Nothing in the code changed, although it's possible that some software on the box got updated.

If my goal is to not require a region with each bucket, is my only option to wait for @krallin 's PR?


krallin (Contributor) commented Mar 6, 2015

@mattse ,

Unfortunately that PR doesn't really seem to be making much progress :( It's been lingering there for a while and I haven't heard anything back.

As explained a bit above, the change might have been caused by you upgrading to Python 2.7.9, which broke a lot of stuff that used to silently work (albeit insecurely!) as far as SSL is concerned.

Cheers,

dasein pushed a commit to brkt/brkt-cli that referenced this issue Feb 27, 2017

Herbert Pfennig
Migrate to boto3 for retrieving S3 objects for esx_service
There is inconsistent behavior with boto when downloading objects
from S3 where the bucket name contains dots:

boto/boto#2836

To address this issue, VMware operations for fetching images from S3
are now using boto3. In addition to the new dependency on boto3, the
following issues have been addressed.

- The hidden --bucket-name argument now supports s3 buckets with dots
  in the name

- The --metavisor-ovf-image argument can return the latest
  metavisor image with exact or partial matches (e.g. 0-0-964 or
  metavisor-0-0-964-g0cf0e0e62.ovf)

- If the --metavisor-ovf-image argument is _not_ provided, we
  default to retrieving the latest modified image with a *latest* or
  *release-* prefix


thieman added a commit to gamechanger/boto that referenced this issue Mar 22, 2017

Add failing test for #2836
Although boto disables certificate hostname validation for S3, the
standard library still checks certificates in Python 2.7.9.

shadle10 commented Apr 21, 2017

I know this isn't useful in solving the period issue but Amazon recommends that you do not use periods in bucket names. "When using virtual hosted–style buckets with SSL, the SSL wildcard certificate only matches buckets that do not contain periods. To work around this, use HTTP or write your own certificate verification logic. We recommend that you do not use periods (".") in bucket names."


xenithorb commented Apr 21, 2017

Note they're talking about their wildcard certificate. A bucket that you intend to host with a separate domain name is still required to match that virtual host: S3 literally uses the Host: header to look up the bucket, and if they don't match it can't find it. There is no other way for that to work, despite some of the documentation recommending against it. Yes, you can use CloudFront, but that is not always desirable.

https://docs.aws.amazon.com/AmazonS3/latest/dev/VirtualHosting.html (See examples)

vitorbaptista added a commit to opentrials/opentrials-airflow that referenced this issue Apr 28, 2017

[#763] Save Airflow logs to S3
This sets the remote logs URL to a S3 bucket, making sure our logs persist even
if Airflow's host machine is destroyed. There's a caveat, though: we can't use
buckets with dots in the name (e.g. "datastore.opentrials.net"). This is
because Airflow still uses the older boto (not boto3) that has this issue (see
boto/boto#2836 and
https://issues.apache.org/jira/browse/AIRFLOW-115).

Fixes opentrials/opentrials#763


RevolutionTech added a commit to infoscout/boto that referenced this issue May 9, 2017

Add failing test for #2836
Although boto disables certificate hostname validation for S3, the
standard library still checks certificates in Python 2.7.9.

Koff commented May 19, 2017

If you are experiencing this issue in Airflow, go to the Admin -> Connections settings for your S3 connection and add an extra key calling_format to your connection dictionary with boto.s3.connection.OrdinaryCallingFormat as its value.

Your Extra field should look like:

{"aws_access_key_id":"_your_key_", "aws_secret_access_key": "_your_secret_", "calling_format": "boto.s3.connection.OrdinaryCallingFormat"}

tolgahanuzun commented Jul 28, 2017

@YunanHu Thanks.
My problem is solved.

mrterry added a commit to mrterry/qds-sdk-py that referenced this issue Aug 30, 2017

Use OrdinaryCallingFormat for boto s3 connections.
For S3 buckets with dots in them (eg qubole.customer_name), boto cannot
verify ssl certs. The work-around is to use OrdinaryCallingFormat, which
puts the bucket name in the url rather than the domain.

More info in: boto/boto#2836

benlk commented Oct 24, 2017

If you're looking for a solution that does not depend upon global configs, here's the code for connecting to a bucket in a non-us-east-1 region that has periods in its name:

import boto.s3
from boto.s3.connection import OrdinaryCallingFormat

def get_bucket(bucket_name):
    """
    Establish a connection and get an S3 bucket
    """
    s3 = boto.s3.connect_to_region(
        'us-east-2',
        host='s3-us-east-2.amazonaws.com',  # endpoint name from https://docs.aws.amazon.com/general/latest/gr/rande.html#s3_region
        calling_format=OrdinaryCallingFormat()
    )

    return s3.get_bucket(bucket_name)

ykhrustalev commented Dec 13, 2017

Based on @oberstet's solution, but with support for regional endpoints:

import ssl

_old_match_hostname = ssl.match_hostname


def remove_dot(host):
    """
    >>> remove_dot('a.x.s3-eu-west-1.amazonaws.com')
    'ax.s3-eu-west-1.amazonaws.com'
    >>> remove_dot('a.s3-eu-west-1.amazonaws.com')
    'a.s3-eu-west-1.amazonaws.com'
    >>> remove_dot('s3-eu-west-1.amazonaws.com')
    's3-eu-west-1.amazonaws.com'
    >>> remove_dot('a.x.s3-eu-west-1.example.com')
    'a.x.s3-eu-west-1.example.com'
    """
    if not host.endswith('.amazonaws.com'):
        return host
    parts = host.split('.')
    h = ''.join(parts[:-3])
    if h:
        h += '.'
    return h + '.'.join(parts[-3:])


def _new_match_hostname(cert, hostname):
    return _old_match_hostname(cert, remove_dot(hostname))


ssl.match_hostname = _new_match_hostname