s3boto3 en/decoding issue #216

Closed
g-as opened this issue Oct 10, 2016 · 7 comments

@g-as
Contributor

g-as commented Oct 10, 2016

I recently started testing the new s3boto3 backend and stumbled upon some unicode/str mix-ups (I'm still on Python 2.7).

I have a file field whose path contains non-ASCII characters, and I hit an error when loading the file. After further investigation, it seems that boto3 expects each name argument (the one given to each resource) to be a Python 2 unicode / Python 3 str, since it goes through this method. On the django-storages side, however, every path/filename goes through the internal method _encode_name, which converts it to bytes on Python 2 (with smart_str, an alias for smart_bytes on Python 2). Hence my UnicodeDecodeError.

I'll open a PR soon. Any feedback welcome.
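
To illustrate the mismatch described above, here is a minimal sketch of the failure mode. It does not call boto3 or django-storages; the file name is a made-up example, and the explicit `.decode("ascii")` is an assumption standing in for the implicit ASCII decode that Python 2 performs when a byte string is passed to `unicode()`.

```python
# -*- coding: utf-8 -*-
# Hypothetical file name with one non-ASCII character (not from the report).
name_text = u"caf\u00e9.png"

# On Python 2, smart_str returns UTF-8 encoded bytes like these.
name_bytes = name_text.encode("utf-8")

# Python 2's unicode(name_bytes) decodes with the ASCII codec implicitly;
# the explicit call below mimics that and also runs on Python 3.
try:
    name_bytes.decode("ascii")
    decode_error = None
except UnicodeDecodeError as exc:
    decode_error = exc

print(decode_error)
```

The first byte of the UTF-8 encoding of "é" is 0xc3, the same byte value that appears in the error reported in this issue.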

g-as referenced this issue in antonagestam/collectfast Dec 30, 2016
@jschneier
Owner

Thanks for bringing this up...a while ago :)

@jschneier
Owner

Hi. Does anyone watching this issue have a traceback? I'm eager to merge in this fix but cannot seem to cause the UnicodeDecodeError to happen.

@g-as
Contributor Author

g-as commented Jan 12, 2017

Here's a case where I'm getting it. The file name contains non-ASCII characters.

----> 1 my_model.my_image_field.file.read()

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/django/db/models/fields/files.pyc in _get_file(self)
     49         self._require_file()
     50         if not hasattr(self, '_file') or self._file is None:
---> 51             self._file = self.storage.open(self.name, 'rb')
     52         return self._file
     53

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/django/core/files/storage.pyc in open(self, name, mode)
     35         Retrieves the specified file from storage.
     36         """
---> 37         return self._open(name, mode)
     38
     39     def save(self, name, content, max_length=None):

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/storages/backends/s3boto3.pyc in _open(self, name, mode)
    422         name = self._normalize_name(self._clean_name(name))
    423         try:
--> 424             f = self.file_class(name, mode, self)
    425         except self.connection_response_error as err:
    426             if err.response['ResponseMetadata']['HTTPStatusCode'] == 404:

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/storages/backends/s3boto3.pyc in __init__(self, name, mode, storage, buffer_size)
     96         if 'w' not in mode:
     97             # Force early RAII-style exception if object does not exist
---> 98             self.obj.load()
     99         self._is_dirty = False
    100         self._file = None

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/boto3/resources/factory.pyc in do_action(self, *args, **kwargs)
    503             # instance via ``self``.
    504             def do_action(self, *args, **kwargs):
--> 505                 response = action(self, *args, **kwargs)
    506                 self.meta.data = response
    507             # Create the docstring for the load/reload mehtods.

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/boto3/resources/action.pyc in __call__(self, parent, *args, **kwargs)
     81                     operation_name, params)
     82
---> 83         response = getattr(parent.meta.client, operation_name)(**params)
     84
     85         logger.debug('Response: %r', response)

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/botocore/client.pyc in _api_call(self, *args, **kwargs)
    249                     "%s() only accepts keyword arguments." % py_operation_name)
    250             # The "self" in this scope is referring to the BaseClient.
--> 251             return self._make_api_call(operation_name, kwargs)
    252
    253         _api_call.__name__ = str(py_operation_name)

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/botocore/client.pyc in _make_api_call(self, operation_name, api_params)
    511         }
    512         request_dict = self._convert_to_request_dict(
--> 513             api_params, operation_model, context=request_context)
    514
    515         handler, event_response = self.meta.events.emit_until_response(

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/botocore/client.pyc in _convert_to_request_dict(self, api_params, operation_model, context)
    564
    565         request_dict = self._serializer.serialize_to_request(
--> 566             api_params, operation_model)
    567         prepare_request_dict(request_dict, endpoint_url=self._endpoint.host,
    568                              user_agent=self._client_config.user_agent,

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/botocore/validate.pyc in serialize_to_request(self, parameters, operation_model)
    270                 raise ParamValidationError(report=report.generate_report())
    271         return self._serializer.serialize_to_request(parameters,
--> 272                                                      operation_model)

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/botocore/serialize.pyc in serialize_to_request(self, parameters, operation_model)
    406         serialized['url_path'] = self._render_uri_template(
    407             operation_model.http['requestUri'],
--> 408             partitioned['uri_path_kwargs'])
    409         # Note that we lean on the http implementation to handle the case
    410         # where the requestUri path already has query parameters.

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/botocore/serialize.pyc in _render_uri_template(self, uri_template, params)
    428             if template_param.endswith('+'):
    429                 encoded_params[template_param] = percent_encode(
--> 430                     params[template_param[:-1]], safe='/~')
    431             else:
    432                 encoded_params[template_param] = percent_encode(

/home/vagrant/code/.virtualenvs/lite/lib/python2.7/site-packages/botocore/utils.pyc in percent_encode(input_str, safe)
    308     if not isinstance(input_str, string_types):
    309         input_str = text_type(input_str)
--> 310     return quote(text_type(input_str).encode('utf-8'), safe=safe)
    311
    312

UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 30: ordinal not in range(128)

traceback.txt
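
The last frame of the traceback shows the key reaching `percent_encode` as a byte string. As a rough sketch of how a caller could avoid that (not necessarily what the fix in django-storages does), a hypothetical helper `to_text_key` could decode byte strings as UTF-8 before the key is handed over. Both function names and the sample key are assumptions for illustration, and the `quote` call only approximates the percent-encoding step seen in the traceback. The sketch is written for Python 3.

```python
from urllib.parse import quote


def to_text_key(key, encoding="utf-8"):
    """Hypothetical helper: return the key as text, decoding bytes if needed."""
    if isinstance(key, bytes):
        return key.decode(encoding)
    return key


def percent_encode_key(key, safe="/~"):
    """Approximation of the percent-encoding step, applied to a text key."""
    return quote(to_text_key(key).encode("utf-8"), safe=safe)


# Made-up key containing the UTF-8 bytes for "é".
encoded = percent_encode_key(b"media/caf\xc3\xa9.png")
print(encoded)
```

With the key normalized to text first, byte strings and text strings produce the same encoded path instead of failing on the non-ASCII byte.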

@g-as
Contributor Author

g-as commented Jan 26, 2017

@jschneier ping!

@RailTracker

Just experienced this issue while trying to upload an image with a Chinese filename to S3. I'm also on 2.7.

@jschneier
Owner

Merged #217, will be in the next release this week.

@anuj9196

anuj9196 commented Aug 23, 2019

I'm using version 1.7.1.

When I try to upload a file from my Django application, the application server shuts down without any error or message.

I tried uploading from Django's shell:

file = File(open('/path/to/file'))
m = Media(file=file)
m.save()

But it gives an error:

File "/home/user/.virtualenvs/qcg-TqOLHEIu/lib/python3.7/site-packages/s3transfer/upload.py", line 86, in read
    return self._fileobj.read(amount)
  File "/home/user/.virtualenvs/qcg-TqOLHEIu/lib/python3.7/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x89 in position 0: invalid start byte
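
The byte 0x89 at position 0 matches the first byte of the PNG signature, so a likely cause (an assumption, since the file itself is not shown) is that a binary file was opened in text mode. The sketch below reproduces that situation with a temporary file and no Django or S3 involved; the names `PNG_SIGNATURE`, `text_mode_error` and `data` are made up for the example.

```python
import os
import tempfile

# The eight-byte PNG signature; its first byte is 0x89.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# Write a tiny stand-in for an image file.
fd, path = tempfile.mkstemp(suffix=".png")
with os.fdopen(fd, "wb") as handle:
    handle.write(PNG_SIGNATURE)

# Text mode: Python 3 tries to decode the bytes and fails on 0x89.
try:
    with open(path, encoding="utf-8") as handle:
        handle.read()
    text_mode_error = None
except UnicodeDecodeError as exc:
    text_mode_error = exc

# Binary mode: the bytes are returned untouched.
with open(path, "rb") as handle:
    data = handle.read()

os.remove(path)
print(text_mode_error)
```

If that is what happened here, the analogous change in the shell snippet above would be `File(open('/path/to/file', 'rb'))`.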
