Blob.download_as_text does not decode properly #319

Closed
kornholi opened this issue Nov 18, 2020 · 1 comment

Comments

@kornholi kornholi commented Nov 18, 2020

Blob.download_as_text tries to use the content-encoding header to decode the bytes. In most cases that value is gzip, even though the bytes were already decompressed at that point. In other cases, e.g. text/plain; charset=utf-8, the value does not make sense to Python's bytes.decode.

  File "/storage/bazel-cache/_bazel_kornholi/9f066b43468ef9bfd3c6a621a4515622/execroot/__main__/bazel-out/k8-opt/bin/foo.runfiles/pypi__google_cloud_storage_1_33_0/google/cloud/storage/blob.py", line 1424, in download_as_text
    return data.decode(self.content_encoding)
LookupError: unknown encoding: gzip

I don't think we can be smarter here than passing through the encoding kwarg which defaults to utf-8.
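To make the failure concrete, here is a minimal sketch that does not touch the library at all. `decode_with_header` only mimics the failing line from the traceback, and `decode_passthrough` illustrates the proposed behaviour; both function names are hypothetical and exist only for this illustration.

```python
def decode_with_header(data: bytes, content_encoding: str) -> str:
    # Mimics the failing line: the HTTP content-encoding value is handed
    # to bytes.decode, which expects a text codec name such as "utf-8".
    return data.decode(content_encoding)


def decode_passthrough(data: bytes, encoding: str = "utf-8") -> str:
    # Proposed behaviour (sketch): ignore content-encoding and decode with
    # the caller-supplied text encoding, defaulting to utf-8.
    return data.decode(encoding)


payload = "héllo".encode("utf-8")  # bytes as they look after decompression

try:
    decode_with_header(payload, "gzip")
except LookupError as exc:
    print("LookupError:", exc)

print(decode_passthrough(payload))
```

The key point is that gzip describes how the payload was compressed in transit, not how its characters map to bytes, so it can never be a valid argument to bytes.decode.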

@tseaver tseaver (Contributor) commented Nov 24, 2020

@kornholi Thanks for the report! I agree with your assessment that the content_encoding value is not appropriate. It might be possible to use the charset portion of the content_type value as a default, if no explicit encoding argument is passed.
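A rough sketch of that fallback order, assuming the content type is available as a plain string. This is not the code that was merged; `resolve_text_encoding` and `decode_blob_text` are hypothetical helpers, and parsing the header with the standard library's `email.message.Message` is just one way to pull out the charset parameter.

```python
from email.message import Message
from typing import Optional


def resolve_text_encoding(
    encoding: Optional[str],
    content_type: Optional[str],
    default: str = "utf-8",
) -> str:
    # 1. An explicit encoding argument always wins.
    if encoding is not None:
        return encoding
    # 2. Otherwise, fall back to the charset parameter of content_type.
    if content_type:
        msg = Message()
        msg["Content-Type"] = content_type
        charset = msg.get_content_charset()  # None when no charset is given
        if charset:
            return charset
    # 3. Finally, use the default text encoding.
    return default


def decode_blob_text(
    data: bytes,
    encoding: Optional[str] = None,
    content_type: Optional[str] = None,
) -> str:
    # content_encoding (e.g. gzip) is deliberately not consulted here.
    return data.decode(resolve_text_encoding(encoding, content_type))
```

For example, `decode_blob_text(data, content_type="text/plain; charset=ISO-8859-1")` would decode as Latin-1, while passing `encoding="utf-8"` alongside the same content type would override the charset.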

tseaver added a commit that referenced this issue Nov 24, 2020
Explicit 'encoding' overrides the fallback.

Use the 'charset' param of 'content_type', rather than 'content_encoding',
which isn't going to be a Unicode -> bytes encoding.

Closes #319.
@tseaver tseaver closed this in #326 Nov 30, 2020
tseaver added a commit that referenced this issue Nov 30, 2020
…326)

Explicit 'encoding' overrides the fallback.

Use the 'charset' param of 'content_type', rather than 'content_encoding',
which isn't going to be a Unicode -> bytes encoding.

Closes #319.

Also, rewrap long param descriptions for in-source readability.
shaffeeullah added a commit to shaffeeullah/python-storage that referenced this issue Jan 26, 2021
…oogleapis#326)

Explicit 'encoding' overrides the fallback.

Use the 'charset' param of 'content_type', rather than 'content_encoding',
which isn't going to be a Unicode -> bytes encoding.

Closes googleapis#319.

Also, rewrap long param descriptions for in-source readability.
cojenco added a commit to cojenco/python-storage that referenced this issue Oct 13, 2021
…oogleapis#326)

Explicit 'encoding' overrides the fallback.

Use the 'charset' param of 'content_type', rather than 'content_encoding',
which isn't going to be a Unicode -> bytes encoding.

Closes googleapis#319.

Also, rewrap long param descriptions for in-source readability.