Decoded Body String Has Incorrect Encoding #512

Closed
carsonreinke opened this Issue Feb 12, 2013 · 2 comments

@carsonreinke
Contributor

carsonreinke commented Feb 12, 2013

I am having an issue where the String returned for the decoded body does not have its encoding set.

1.9.3p125 :001 > test = "\u24B8\u24B6\u24C7\u24C8\u24C4\u24C3"
 => "ⒸⒶⓇⓈⓄⓃ"
1.9.3p125 :002 > test.encoding.name
 => "UTF-8"
1.9.3p125 :003 > test = Mail::Body.new(test)
 => ⒸⒶⓇⓈⓄⓃ
1.9.3p125 :004 > test.charset = 'UTF-8'
 => "UTF-8"
1.9.3p125 :005 > test.encoding = 'quoted-printable'
 => "quoted-printable"
1.9.3p125 :006 > test.raw_source.encoding.name
 => "UTF-8"
1.9.3p125 :007 > test.to_s.encoding.name
 => "ASCII-8BIT"
1.9.3p125 :008 > test.to_s
 => "\xE2\x92\xB8\xE2\x92\xB6\xE2\x93\x87\xE2\x93\x88\xE2\x93\x84\xE2\x93\x83"

Maybe the decoded method of Body should return the String tagged with the known charset on Ruby >= 1.9. If not, it seems strange to have to force the encoding myself.

@jeremy

Collaborator

jeremy commented Feb 12, 2013

Note that decoded/encoded refer to Content-Transfer-Encoding, not Ruby charset encoding.

Use mail.decoded to decode the body's transfer encoding and set the string's Ruby encoding to match the message's charset (pulled from the Content-Type header).
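To illustrate the distinction between the two kinds of "encoding", here is a minimal stdlib-only sketch. It does not use the mail gem or claim to show how the gem works internally; it only shows that undoing a quoted-printable transfer encoding yields raw bytes, and that tagging those bytes with a charset is a separate step. The `charset` variable is a stand-in for a value that would normally come from the Content-Type header.

```ruby
# Stdlib-only sketch; the mail gem is deliberately not used here.
original = "\u24B8\u24B6\u24C7\u24C8\u24C4\u24C3" # UTF-8 string from the report

# Step 1: apply the quoted-printable transfer encoding (7-bit safe output).
encoded = [original].pack('M')

# Step 2: undo the transfer encoding. The result is raw bytes, and Ruby tags
# it as binary (ASCII-8BIT) because it cannot know the charset.
decoded = encoded.unpack('M').first

# Step 3: tag the bytes with the charset. This is an assumption of the sketch:
# a real message would take the value from its Content-Type header.
charset = 'UTF-8'
text = decoded.dup.force_encoding(charset)

puts decoded.encoding # binary
puts text.encoding    # UTF-8
puts text == original # true
```

Step 2 corresponds to the ASCII-8BIT result in the report above; step 3 is the part the reporter expected to happen automatically.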

Agreed that we need a better, less surprising API here 👍

@jeremy jeremy closed this Feb 12, 2013

@carsonreinke

Contributor

carsonreinke commented Feb 12, 2013

Yes, that is strange. I assumed Message#decoded would act like encoded, but decoded gives me only the body, decoded (and with the Ruby charset assigned, as you stated). Maybe it should be named decoded_body instead.

Thanks for the help.
