Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

content_encoding != charset, bug #153 #157

Merged
merged 1 commit into from May 23, 2013
Merged

content_encoding != charset, bug #153 #157

merged 1 commit into from May 23, 2013

Conversation

@mdomsch
Copy link
Contributor

mdomsch commented Apr 24, 2013

#153

Thus spake Castedo Ellerman:
I believe the commit 64eb484 you did on 2012-07-31 04:04:24Z added
incorrect code that sets the HTTP Content-Encoding header to be a
character encoding (typically "UTF-8").

I just got bitten by this because I used a recent s3cmd to upload a
file and now a Perl script that uses the "get" function from
LWP::Simple to download that file fails because that file uploaded by
s3cmd got the meta-tagged to have "Content-Encoding" as "UTF-8" which
is an invalid value. This is the same root cause for this bug
affecting Richard Harrington:
https://sourceforge.net/mailarchive/message.php?msg_id=30653510

The problem is that the code you added is confounding the "charset"
with "content encoding". Here is the definition of HTTP "content
encoding":
http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.5
http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11
which is different and independent from charset (e.g. UTF-8).

The data member "self.config.encoding" of class S3 in S3.py is a
charset (looks like by default set to "utf-8").

In contrast the variable "content_encoding" originally added by
Karsten Sperling mail@ksperling.net is supposed to be an HTTP
content encoding.

It is with your merge that you incorrectly assign a charset
(self.config.encoding.upper()) to the variable content_encoding which
is intended to be used at an HTTP Contend-Encoding header and use
content_encoding as a charset for the
content_type. self.config.encoding.upper() and content_encoding should
remain completely independent of each other, they are two different
things.

This patch from Castedo fixes the mistake.

#153

Thus spake Castedo Ellerman:
I believe the commit 64eb484 you did on 2012-07-31 04:04:24Z added
incorrect code that sets the HTTP Content-Encoding header to be a
character encoding (typically "UTF-8").

I just got bitten by this because I used a recent s3cmd to upload a
file and now a Perl script that uses the "get" function from
LWP::Simple to download that file fails because that file uploaded by
s3cmd got the meta-tagged to have "Content-Encoding" as "UTF-8" which
is an invalid value. This is the same root cause for this bug
affecting Richard Harrington:
  https://sourceforge.net/mailarchive/message.php?msg_id=30653510

The problem is that the code you added is confounding the "charset"
with "content encoding". Here is the definition of HTTP "content
encoding":
  http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.5
  http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.11
which is different and independent from charset (e.g. UTF-8).

The data member "self.config.encoding" of class S3 in S3.py is a
charset (looks like by default set to "utf-8").

In contrast the variable "content_encoding" originally added by
Karsten Sperling <mail@ksperling.net> is supposed to be an HTTP
content encoding.

It is with your merge that you incorrectly assign a charset
(self.config.encoding.upper()) to the variable content_encoding which
is intended to be used at an HTTP Contend-Encoding header and use
content_encoding as a charset for the
content_type. self.config.encoding.upper() and content_encoding should
remain completely independent of each other, they are two different
things.

This patch from Castedo fixes the mistake.
mdomsch added a commit to mdomsch/s3cmd that referenced this pull request May 11, 2013
@mludvig mludvig merged commit 85bb3ed into s3tools:master May 23, 2013
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants
You can’t perform that action at this time.