◆? character appears in the subject when it has multiple MIME lines #5760
Comments
Works for me when copying the text from your post. It's possible the actual email contains additional invisible characters. Can you attach the unmodified source of such a message?
Sure, attached.
Apologies. Decoding the subject worked fine when running the code on my computer. However, when running on an Android device I can reproduce the issue you're seeing.

The problem seems to be the change we made to support improperly encoded subjects (PR #2725). With the change we strip the Q- or B-encoding from the segments and then perform the character set decoding on the concatenated bytes. For this subject, the concatenated data looks like this:

[switch to JIS X 0208:1983] [some characters] [switch to ASCII] [switch to JIS X 0208:1983] [some characters] [switch to ASCII]

The switch to ASCII and then back to JIS X 0208:1983 is unnecessary. The charset decoder on the JVM doesn't care and decodes the data as expected. However, the decoder on Android does mind and inserts a replacement character �.

The proper way of decoding the data is to completely decode the individual segments and then concatenate the decoded text. However, many email clients (including old versions of K-9 Mail) mess up the encoding and require the decoding we added with PR #2725.

We'll have to figure out a way to pick which decoding method to use. Maybe assume the text is properly encoded and use the "one segment at a time" approach. And when that leads to a replacement character being present in the output, try the "combine segments, then decode" method.

Side note: If you have any control over the creation of such a message, please use UTF-8 instead of ISO-2022-JP.
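The "one segment at a time first, combined as a fallback" idea can be sketched roughly as follows. This is only an illustration in Python, not K-9 Mail's code and not necessarily the fix that was eventually adopted. The names `decode_subject`, `decode_per_segment` and `decode_combined` are hypothetical, and the input is assumed to be a list of (charset, bytes) pairs whose Q- or B-encoding has already been stripped.

```python
from typing import List, Tuple

REPLACEMENT = "\ufffd"  # U+FFFD, emitted by the decoder for undecodable bytes

# Hypothetical input shape: (charset name, bytes after Q-/B-decoding)
Segment = Tuple[str, bytes]


def decode_per_segment(segments: List[Segment]) -> str:
    """Decode every segment on its own, then concatenate the text."""
    return "".join(raw.decode(charset, errors="replace") for charset, raw in segments)


def decode_combined(segments: List[Segment]) -> str:
    """Concatenate the raw bytes first, then run the charset decoder once."""
    charset = segments[0][0]
    return b"".join(raw for _, raw in segments).decode(charset, errors="replace")


def decode_subject(segments: List[Segment]) -> str:
    if not segments:
        return ""
    # Assume the sender encoded the segments properly.
    text = decode_per_segment(segments)
    if REPLACEMENT not in text:
        return text
    # Combining bytes only makes sense when all segments share one charset.
    if len({charset.lower() for charset, _ in segments}) != 1:
        return text
    # Fallback for senders that split a multi-byte character across segments.
    fallback = decode_combined(segments)
    return text if REPLACEMENT in fallback else fallback


# A sender that (incorrectly) split a UTF-8 character across two segments:
raw = "チェック".encode("utf-8")
print(decode_subject([("utf-8", raw[:4]), ("utf-8", raw[4:])]))
```

With properly encoded ISO-2022-JP segments the first pass already succeeds, so the redundant escape sequences between segments never reach a single decoder call.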
Thank you for your speedy observation!
Well, ISO-2022-JP is still a common encoding for Japanese e-mail.
Thunderbird ran into the same problem: https://bugzilla.mozilla.org/show_bug.cgi?id=1374149 I'm adopting their fix for K-9 Mail.
I confirmed this issue was fixed in v5.905. Thank you! 😄
Describe the bug
Graphical character (◆ + ?) appears in the subject of some mails written in Japanese.
To Reproduce
Post a mail with subject like:
Current K-9 shows
日本語と日本語と日本語のチェッ◆ク
in both tray list mode and mail content mode.
Expected behavior
Though the example subject contains a newline, it should be parsed as a single line
'日本語と日本語と日本語のチェック'
according to the MIME decoding rules for subjects. It looks like current K-9 simply tries to show the (unexpected) newline character.
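For reference, the rule in question (RFC 2047) says that whitespace separating two adjacent encoded-words, including a folded line break, is ignored when displaying the header. A minimal Python sketch of that behaviour, assuming only B-encoded (base64) words and using a hypothetical helper name `decode_folded_subject`:

```python
import base64
import re

# Matches one B-encoded word: =?charset?B?payload?=
ENCODED_WORD = re.compile(r"=\?([^?]+)\?[bB]\?([^?]*)\?=")


def decode_folded_subject(header: str) -> str:
    # Drop whitespace (including CRLF + space folding) that sits between
    # two adjacent encoded-words, as RFC 2047 requires.
    joined = re.sub(r"(\?=)\s+(=\?)", r"\1\2", header)

    def decode_word(match: "re.Match[str]") -> str:
        charset, payload = match.groups()
        # Each encoded-word is decoded completely on its own.
        return base64.b64decode(payload).decode(charset, errors="replace")

    return ENCODED_WORD.sub(decode_word, joined)
```

Q-encoded words and malformed input are deliberately left out to keep the sketch short.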
Screenshots
Tray mode
Content mode has the same problem
Environment (please complete the following information):
Additional context
This seems to be the same as #3622.
I'm not sure what was done by 'Fixed in master'. (I hadn't used K-9 Mail since 2019, so I didn't know it had been fixed at that time.)