Keep all bytes of one UTF-8 char on same line (no split) #92

Merged
merged 4 commits into from
Apr 20, 2017

Conversation

skorobkov
Contributor

mimemail:rfc2047_utf8_encode/1 can split a multibyte UTF-8 character across different lines.
Example:
For the Russian word "Информация", mimemail:rfc2047_utf8_encode/1 produces two encoded words:
=?UTF-8?Q?=D0=98=D0=BD=D1=84=D0=BE=D1=80=D0=BC=D0=B0=D1=86=D0=B8=D1?=
=?UTF-8?Q?=8F?=
The last character "я" (=D1=8F) is split between the two, so it is not displayed correctly in mail readers.
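
To make the failure concrete, here is a minimal Python sketch (not the gen_smtp Erlang code) that Q-decodes the payload of each encoded word from the example and tries to interpret it as UTF-8 on its own. RFC 2047 requires every encoded word to contain a whole number of characters, which is what a mail reader relies on when it decodes them one at a time. The helper names `q_payload_bytes` and `decodes_alone` are hypothetical, and the parser only handles the `=?UTF-8?Q?...?=` form shown above.

```python
import re


def q_payload_bytes(encoded_word: str) -> bytes:
    """Return the raw bytes carried by one =?UTF-8?Q?...?= encoded word."""
    match = re.fullmatch(r"=\?UTF-8\?Q\?(.*)\?=", encoded_word)
    if match is None:
        raise ValueError("not a UTF-8 Q-encoded word: %r" % encoded_word)
    payload = match.group(1)
    out = bytearray()
    i = 0
    while i < len(payload):
        ch = payload[i]
        if ch == "=":
            # "=XX" is one byte written as two hex digits
            out.append(int(payload[i + 1:i + 3], 16))
            i += 3
        elif ch == "_":
            # "_" stands for a space in the Q encoding
            out.append(0x20)
            i += 1
        else:
            out.append(ord(ch))
            i += 1
    return bytes(out)


def decodes_alone(encoded_word: str) -> bool:
    """True if the word's payload is valid UTF-8 without its neighbours."""
    try:
        q_payload_bytes(encoded_word).decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


words = [
    "=?UTF-8?Q?=D0=98=D0=BD=D1=84=D0=BE=D1=80=D0=BC=D0=B0=D1=86=D0=B8=D1?=",
    "=?UTF-8?Q?=8F?=",
]

# Neither word is valid UTF-8 by itself: the first ends with a lone lead
# byte (D1), the second starts with a lone continuation byte (8F).
print([decodes_alone(w) for w in words])  # [False, False]

# Only the concatenated bytes form the original word.
joined = b"".join(q_payload_bytes(w) for w in words)
print(joined.decode("utf-8"))  # Информация
```

A check like `decodes_alone` could also serve as an assertion in a test for any encoder that is supposed to avoid such splits.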

@seriyps
Collaborator

seriyps commented Dec 8, 2015

AFAIK, rfc2047_utf8_encode should produce strings no longer than 76 chars. If your version detects such a split, does it move the character to the next line, or does it break the 76-char limit?

Please also add unit tests!

@skorobkov
Contributor Author

Is "\r\n" counted in these 76 bytes?

@skorobkov
Contributor Author

mimemail:rfc2047_utf8_encode(unicode:characters_to_binary("€ € € € € 1234 € € € € 123 € € € € € 1234€")) produces these strings:

=?UTF-8?Q?=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20123?=
 =?UTF-8?Q?4=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20123=20?=
 =?UTF-8?Q?=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20123?=
 =?UTF-8?Q?4=E2=82=AC?=

Each encoded line is no longer than 75 bytes (76 including the leading space character).
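
The behaviour described here can be sketched in Python as a greedy packer that Q-encodes one character at a time and starts a new encoded word whenever the next character's full encoding would not fit. This is an illustration under stated assumptions, not the code from this pull request: the function names are hypothetical, and the rule "ASCII letters and digits stay literal, everything else becomes `=XX`" is an assumption chosen because it is consistent with the output above.

```python
def q_encode_char(ch: str) -> str:
    """Q-encode one character; all of its UTF-8 bytes stay in one token."""
    if ch.isascii() and ch.isalnum():
        return ch
    return "".join("=%02X" % byte for byte in ch.encode("utf-8"))


def encode_words(text: str, max_len: int = 75) -> list:
    """Split text into =?UTF-8?Q?...?= words of at most max_len characters."""
    prefix, suffix = "=?UTF-8?Q?", "?="
    budget = max_len - len(prefix) - len(suffix)  # room left for the payload
    words = []
    current = ""
    for ch in text:
        token = q_encode_char(ch)
        # Flush before adding a token that would overflow, so a character
        # is never divided between two encoded words.
        if current and len(current) + len(token) > budget:
            words.append(prefix + current + suffix)
            current = ""
        current += token
    if current:
        words.append(prefix + current + suffix)
    return words


def fold(words: list) -> str:
    """Join encoded words as header continuation lines (CRLF + space)."""
    return "\r\n ".join(words)


sample = "€ € € € € 1234 € € € € 123 € € € € € 1234€"
encoded = encode_words(sample)
for word in encoded:
    print(len(word), word)
```

Because the unit of packing is a whole character rather than a byte, the 75-character limit and the no-split requirement are satisfied together, at the cost of occasionally leaving a few unused characters at the end of a line.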

@mworrell
Collaborator

mworrell commented Apr 6, 2017

@seriyps Do you think the line length issue is resolved?

@seriyps
Collaborator

seriyps commented Apr 6, 2017

@mworrell I guess it is, but there are still no unit tests for this functionality

@mworrell
Collaborator

mworrell commented Apr 6, 2017

@skorobkov can you add your example (with all the €) as a unit test?

@skorobkov
Contributor Author

Sorry for the delay.
I have added a unit test.

++ " =?UTF-8?Q?4=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20123=20?=\r\n"
++ " =?UTF-8?Q?=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20=E2=82=AC=20123?=\r\n"
++ " =?UTF-8?Q?4=E2=82=AC?=",
?assertEqual(mimemail:rfc2047_utf8_encode(UnicodeString), Encoded).
Collaborator

According to http://erlang.org/doc/apps/eunit/chapter.html#Assert_macros, it's recommended to pass the 'expected' value as the first argument of assertEqual and the 'actual result' as the second.

@seriyps
Collaborator

seriyps commented Apr 20, 2017

Looks good

@mworrell
Collaborator

I also like it. Merging.

@skorobkov Thank you!

@mworrell mworrell merged commit 2c02d35 into gen-smtp:master Apr 20, 2017