Description
Today the AWS CLI tries to guess the content type of a file even when the file is compressed (a VERY common case when uploading pre-compressed files to S3 instead of relying on CloudFront's automatic compression). This is done here:
aws-cli/awscli/customizations/s3/utils.py
Lines 288 to 307 in 5d34910
    def guess_content_type(filename):
        """Given a filename, guess it's content type.
        If the type cannot be guessed, a value of None is returned.
        """
        try:
            return mimetypes.guess_type(filename)[0]
        # This catches a bug in the mimetype libary where some MIME types
        # specifically on windows machines cause a UnicodeDecodeError
        # because the MIME type in the Windows registery has an encoding
        # that cannot be properly encoded using the default system encoding.
        # https://bugs.python.org/issue9291
        #
        # So instead of hard failing, just log the issue and fall back to the
        # default guessed content type of None.
        except UnicodeDecodeError:
            LOGGER.debug(
                'Unable to guess content type for %s due to '
                'UnicodeDecodeError: ', filename, exc_info=True
            )
This function internally uses Python's mimetypes library, whose suffix and type mappings are hard-coded: https://github.com/python/cpython/blob/cedc9b74202d8c1ae39bca261cbb45d42ed54d45/Lib/mimetypes.py#L402-L530
All of that is described here in a closed issue: #3303
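To illustrate the behaviour described above, here is a minimal sketch using only the standard library. It shows that mimetypes.guess_type returns a (type, encoding) pair and that a compression suffix is only recognised when it appears in mimetypes.encodings_map. The file name is made up for the example, and whether .br is present depends on the Python version in use, so that result is printed rather than assumed.

```python
import mimetypes

# guess_type returns a (content_type, content_encoding) tuple.
# aws-cli's guess_content_type keeps only element [0].
content_type, encoding = mimetypes.guess_type('style.css.gz')
print(content_type, encoding)

# Compression suffixes are looked up in a module-level dict.
# '.gz' is there; '.br' may or may not be, depending on the Python version.
print('.gz' in mimetypes.encodings_map)
print(mimetypes.encodings_map.get('.br'))
```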
However, Brotli (.br) files are becoming more and more popular, and all major browsers (excluding IE 11) request Brotli by default. Because .br is missing from that hard-coded list, the content type of .br files is not guessed at all and we have to set it manually.
I suggest either adding .br to the Python library or embedding the hard-coded list in aws-cli so that you have full control.
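As a rough sketch of the second option, the compression suffixes could live in a small local mapping that is consulted before falling back to mimetypes. The names COMPRESSION_SUFFIXES and guess_type_and_encoding are hypothetical and do not exist in aws-cli; this is only one possible shape, not a proposed patch.

```python
import mimetypes
import os

# Hypothetical local mapping from compression suffix to Content-Encoding.
# Owning this list means it does not depend on the Python version.
COMPRESSION_SUFFIXES = {'.gz': 'gzip', '.br': 'br'}


def guess_type_and_encoding(filename):
    """Return (content_type, content_encoding) for filename.

    If the last suffix is a known compression suffix, it is stripped
    and the content type is guessed from the remaining name.
    """
    root, ext = os.path.splitext(filename)
    encoding = COMPRESSION_SUFFIXES.get(ext.lower())
    if encoding is None:
        # Not a suffix we own: defer entirely to the standard library.
        return mimetypes.guess_type(filename)
    content_type, _ = mimetypes.guess_type(root)
    return content_type, encoding


print(guess_type_and_encoding('style.css.br'))
print(guess_type_and_encoding('index.html'))
```

With something like this, the guessed pair could feed the Content-Type and Content-Encoding of the upload, which today have to be passed by hand for .br files.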