No UTF-8 characters allowed in content-disposition? #1124

Closed
kvz opened this issue Feb 4, 2015 · 1 comment · Fixed by boto/botocore#457
Labels: bug, s3

Comments


kvz commented Feb 4, 2015

Hi,

We recently switched from Tim Kay's aws tool to this official AWS CLI, and we're running into problems with filenames that contain non-ASCII (UTF-8) characters.

$ aws s3api put-object \
  --bucket dev.s3.transload.it \
  --debug \
  --body ./failing-utf8.jpg \
  --key failing-utf8.jpg \
  --acl public-read \
  --content-type image/jpeg \
  --content-disposition attachment\; filename=5小時接力起跑.jpg;

2015-02-04 15:49:14,198 - MainThread - awscli.clidriver - DEBUG - CLI version: aws-cli/1.7.4 Python/2.7.8 Linux/3.13.0-39-generic, botocore version: 0.85.0
2015-02-04 15:49:14,198 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function add_scalar_parsers at 0x7f345653a410>
2015-02-04 15:49:14,198 - MainThread - botocore.hooks - DEBUG - Event session-initialized: calling handler <function inject_assume_role_provider at 0x7f345677dcf8>
2015-02-04 15:49:14,198 - MainThread - botocore.service - DEBUG - Creating service object for: s3
2015-02-04 15:49:14,228 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.s3: calling handler <function register_retries_for_service at 0x7f3456efee60>
2015-02-04 15:49:14,233 - MainThread - botocore.handlers - DEBUG - Registering retry handlers for service: s3
2015-02-04 15:49:14,234 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.s3: calling handler <function signature_overrides at 0x7f3456f00050>
2015-02-04 15:49:14,234 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.s3: calling handler <function register_retries_for_service at 0x7f3456efee60>
2015-02-04 15:49:14,235 - MainThread - botocore.handlers - DEBUG - Registering retry handlers for service: s3
2015-02-04 15:49:14,235 - MainThread - botocore.hooks - DEBUG - Event service-data-loaded.s3: calling handler <function signature_overrides at 0x7f3456f00050>
2015-02-04 15:49:14,235 - MainThread - botocore.service - DEBUG - Creating operation objects for: Service(s3)
2015-02-04 15:49:14,239 - MainThread - botocore.hooks - DEBUG - Event building-command-table.s3api: calling handler <function add_waiters at 0x7f34565036e0>
2015-02-04 15:49:14,253 - MainThread - awscli.clidriver - DEBUG - OrderedDict([(u'acl', <awscli.arguments.CLIArgument object at 0x7f34560be3d0>), (u'body', <awscli.arguments.CLIArgument object at 0x7f34560be410>), (u'bucket', <awscli.arguments.CLIArgument object at 0x7f34560be450>), (u'cache-control', <awscli.arguments.CLIArgument object at 0x7f34560be490>), (u'content-disposition', <awscli.arguments.CLIArgument object at 0x7f34560be4d0>), (u'content-encoding', <awscli.arguments.CLIArgument object at 0x7f34560be510>), (u'content-language', <awscli.arguments.CLIArgument object at 0x7f34560be550>), (u'content-length', <awscli.arguments.CLIArgument object at 0x7f34560be590>), (u'content-md5', <awscli.arguments.CLIArgument object at 0x7f34560be5d0>), (u'content-type', <awscli.arguments.CLIArgument object at 0x7f34560be610>), (u'expires', <awscli.arguments.CLIArgument object at 0x7f34560be650>), (u'grant-full-control', <awscli.arguments.CLIArgument object at 0x7f34560be690>), (u'grant-read', <awscli.arguments.CLIArgument object at 0x7f34560be6d0>), (u'grant-read-acp', <awscli.arguments.CLIArgument object at 0x7f34560be710>), (u'grant-write-acp', <awscli.arguments.CLIArgument object at 0x7f34560be750>), (u'key', <awscli.arguments.CLIArgument object at 0x7f34560be790>), (u'metadata', <awscli.arguments.CLIArgument object at 0x7f34560be7d0>), (u'server-side-encryption', <awscli.arguments.CLIArgument object at 0x7f34560be810>), (u'storage-class', <awscli.arguments.CLIArgument object at 0x7f34560be850>), (u'website-redirect-location', <awscli.arguments.CLIArgument object at 0x7f34560be890>), (u'sse-customer-algorithm', <awscli.arguments.CLIArgument object at 0x7f34560be8d0>), (u'sse-customer-key', <awscli.arguments.CLIArgument object at 0x7f34560be910>), (u'sse-customer-key-md5', <awscli.arguments.CLIArgument object at 0x7f34560be950>), (u'ssekms-key-id', <awscli.arguments.CLIArgument object at 0x7f34560be990>)])
2015-02-04 15:49:14,253 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.s3api.put-object: calling handler <function add_streaming_output_arg at 0x7f34568c45f0>
2015-02-04 15:49:14,253 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.s3api.put-object: calling handler <function add_cli_input_json at 0x7f3456770ed8>
2015-02-04 15:49:14,254 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.s3api.put-object: calling handler <function unify_paging_params at 0x7f345685c500>
2015-02-04 15:49:14,257 - MainThread - botocore.hooks - DEBUG - Event building-argument-table.s3api.put-object: calling handler <function add_generate_skeleton at 0x7f345677a488>
2015-02-04 15:49:14,258 - MainThread - botocore.hooks - DEBUG - Event before-building-argument-table-parser.s3api.put-object: calling handler <bound method CliInputJSONArgument.override_required_args of <awscli.customizations.cliinputjson.CliInputJSONArgument object at 0x7f34560be9d0>>
2015-02-04 15:49:14,258 - MainThread - botocore.hooks - DEBUG - Event before-building-argument-table-parser.s3api.put-object: calling handler <bound method GenerateCliSkeletonArgument.override_required_args of <awscli.customizations.generatecliskeleton.GenerateCliSkeletonArgument object at 0x7f34560bea10>>

Traceback (most recent call last):
  File "/srv/current/stack/bin/aws", line 23, in <module>
    sys.exit(main())
  File "/srv/current/stack/bin/aws", line 19, in main
    return awscli.clidriver.main()
  File "/srv/current/stack/lib/python2.7/site-packages/awscli/clidriver.py", line 50, in main
    return driver.main()
  File "/srv/current/stack/lib/python2.7/site-packages/awscli/clidriver.py", line 200, in main
    sys.stderr.write(str(e) + '\n')
UnicodeEncodeError: 'ascii' codec can't encode character u'\u8dd1' in position 26: ordinal not in range(128)

My environment:

$ bash -xc "lsb_release -ds; bash --version |head -1; aws --version; env |egrep '(LC_|LANG)'"
+ lsb_release -ds
Ubuntu 12.04.5 LTS
+ bash --version
+ head -1
GNU bash, version 4.2.25(1)-release (x86_64-pc-linux-gnu)
+ aws --version
aws-cli/1.7.4 Python/2.7.8 Linux/3.13.0-39-generic
+ env
+ egrep '(LC_|LANG)+'
LC_ALL=en_US.UTF-8
LANG=en_US.UTF-8
LANGUAGE=en_US.UTF-8
LC_CTYPE=en_US.UTF-8
LC_TIME=en_US.UTF-8

I suspect there might be a bug that causes this behavior and would like to know if you can shed some light on it.

Kind regards,
Kevin

Edit: At first this report also had bits about escaping bash chars, but I could boil it down to this.
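
The first traceback fails inside `sys.stderr.write(str(e) + '\n')`, which suggests the CLI was trying to report some other error and the message itself contained non-ASCII text. On Python 2, `str()` of a unicode message implicitly encodes it with the ASCII codec. The following is a minimal Python 3 sketch of that failure mode and one way to make the reporting path robust. The helper name `to_ascii_safe` and the sample message are hypothetical, and this is not necessarily how the CLI handles it.

```python
def to_ascii_safe(message):
    """Encode text for an ASCII-only stream, escaping anything outside ASCII."""
    return message.encode("ascii", errors="backslashreplace").decode("ascii")


# Hypothetical error text containing the filename from the report.
message = "problem with argument: filename=5小時接力起跑.jpg"

try:
    # Strict encoding, analogous to the implicit encode done by str() on Python 2.
    message.encode("ascii")
except UnicodeEncodeError as exc:
    print("strict ASCII encoding failed:", exc.reason)

# Escaped output is always writable, even to an ASCII-only stderr.
print(to_ascii_safe(message))
```

With escaping in place the underlying error would at least be readable, which matters here because the `UnicodeEncodeError` hides whatever the CLI originally wanted to say.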

kvz changed the title from "Can't escape characters in content-disposition" to "No UTF-8 characters allowed in content-disposition?" on Feb 4, 2015

jamesls (Member) commented Feb 6, 2015

Interesting. I'm getting a slightly different error: our HTTP library tries to encode the headers as latin-1, which fails for characters outside the latin-1 range:

Traceback (most recent call last):
  File "botocore/botocore/endpoint.py", line 165, in _get_response
    proxies=self.proxies, timeout=self.timeout)
  File "botocore/botocore/vendored/requests/sessions.py", line 464, in send
    r = adapter.send(request, **kwargs)
  File "botocore/botocore/vendored/requests/adapters.py", line 321, in send
    timeout=timeout
  File "botocore/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 471, in urlopen
    body=body, headers=headers)
  File "botocore/botocore/vendored/requests/packages/urllib3/connectionpool.py", line 285, in _make_request
    conn.request(method, url, **httplib_request_kw)
  File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1090, in request
    self._send_request(method, url, body, headers)
  File "botocore/botocore/awsrequest.py", line 120, in _send_request
    self, method, url, body, headers)
  File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1123, in _send_request
    self.putheader(hdr, value)
  File "/usr/local/Cellar/python3/3.4.2_1/Frameworks/Python.framework/Versions/3.4/lib/python3.4/http/client.py", line 1066, in putheader
    values[i] = one_value.encode('latin-1')
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 10-15: ordinal not in range(256)

Looking into a fix.
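
Until a fix lands, one possible client-side workaround is to keep the header value pure ASCII by using the `filename*` parameter defined in RFC 6266 (with RFC 5987 percent-encoding). The sketch below assumes Python 3.7+; the helper name `content_disposition` is hypothetical, and it is an assumption that S3 stores and returns the header unchanged and that downloading clients honor `filename*`. It is not the change made in boto/botocore#457.

```python
from urllib.parse import quote, unquote


def content_disposition(filename, disposition="attachment"):
    """Build an ASCII-only Content-Disposition value (RFC 6266 style)."""
    # Plain fallback for clients that ignore filename*: every character that is
    # not ASCII, plus quotes and backslashes, becomes "_".
    fallback = "".join(
        ch if ch.isascii() and ch not in '"\\' else "_" for ch in filename
    )
    # Percent-encode the UTF-8 bytes of the real name for the filename* parameter.
    encoded = quote(filename, safe="")
    return "%s; filename=\"%s\"; filename*=UTF-8''%s" % (disposition, fallback, encoded)


name = "5小時接力起跑.jpg"

# The raw value cannot be encoded as latin-1, as in the traceback above.
try:
    ("attachment; filename=" + name).encode("latin-1")
except UnicodeEncodeError as exc:
    print("raw value rejected:", exc.reason)

value = content_disposition(name)
value.encode("latin-1")  # succeeds because the value is pure ASCII
print(value)

# The original name can be recovered from the filename* parameter.
print(unquote(value.split("UTF-8''", 1)[1]) == name)
```

The resulting string could then be passed as a single quoted argument to `--content-disposition`, so the shell does not split it at the space after the semicolon.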
