Message is displayed incorrectly #351

Closed
tmalsburg opened this Issue Jan 20, 2014 · 8 comments

@tmalsburg

I received a message from a mailing list and I believe that it is displayed incorrectly in mu4e-view. This is the start of the message body, copied from mu4e-raw-view:

--===============1091113107==
Content-Type: text/plain
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
Content-length: 6111

Dear List,

I am wondering about the consequences of using non-orthogonal contrasts in =
an LMM.
I come from psychological experiments in which we often code factorial effe=

And below you see what is displayed by mu4e-view. Everything is on one line and each newline in the raw text is replaced by a period:

Dear List,..I am wondering about the consequences of using non-orthogonal contrasts in an LMM..I come from psychological experiments in which we often code factorial effe
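Two things happen between the raw and the displayed text: the quoted-printable soft line breaks (a trailing "=" at the end of a line) are removed, which is correct, and the real newlines are turned into periods, which is not. The sketch below illustrates only the first step. It is a minimal quoted-printable decoder written for this thread, not the code mu or GMime actually uses, and the name qp_decode_in_place is hypothetical.

```c
#include <stddef.h>

/* Value of a hexadecimal digit, or -1 if ch is not one. */
static int hexval(int ch)
{
    if (ch >= '0' && ch <= '9') return ch - '0';
    if (ch >= 'A' && ch <= 'F') return ch - 'A' + 10;
    if (ch >= 'a' && ch <= 'f') return ch - 'a' + 10;
    return -1;
}

/* Hypothetical helper: decode a NUL-terminated quoted-printable string
 * in place and return the decoded length.
 *   "=\n" or "=\r\n"  -> soft line break, dropped entirely
 *   "=XX"             -> the byte with hex value XX
 * Anything else, including real newlines, is copied unchanged. */
static size_t qp_decode_in_place(char *s)
{
    char *in = s, *out = s;

    while (*in) {
        if (in[0] == '=' && in[1] == '\n') {
            in += 2;
        } else if (in[0] == '=' && in[1] == '\r' && in[2] == '\n') {
            in += 3;
        } else if (in[0] == '=' &&
                   hexval((unsigned char)in[1]) >= 0 &&
                   hexval((unsigned char)in[2]) >= 0) {
            *out++ = (char)(hexval((unsigned char)in[1]) * 16 +
                            hexval((unsigned char)in[2]));
            in += 3;
        } else {
            *out++ = *in++;
        }
    }
    *out = '\0';
    return (size_t)(out - s);
}
```

Applied to "contrasts in =\nan LMM.\nI come", this joins the first two lines but keeps the newline after "LMM.", which is what a correct display of the message should show.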
@tmalsburg

Any ideas? I actually receive more and more of these mails. If you could offer some directions for debugging, I could try to solve it myself. I had a look at mu4e-view.el but haven't found the culprit yet. Thanks.

@djcb
Owner

Does this go through some html-renderer? Can you reproduce it with mu view?

@tmalsburg

Thanks for the response, Dirk-Jan. I don't think it goes through an html renderer. (When I set mu4e-html2text-command to a dummy function, I still get the same display of the message.) If I display the email using mu view on the command line, I also get the wrong display with all text on one line and newlines replaced by periods. BTW, I'm running the current development version of Emacs and updated mu/mu4e a few days ago.

@tmalsburg

My locale is en_US.UTF-8.

@tmalsburg

Ok, I found it. In mu_msg_mime_part_to_string, the mime part is converted to utf-8 (line 501, in mu-msg-file.c):

buffer = convert_to_utf8 (part, buffer);

This is where the problem is introduced. convert_to_utf8 doesn't recognize the charset of the part and then applies what the code comment describes as an "ugly hack: replace all non-ascii chars with '.'". What's strange is that the newlines should not be replaced, since they are in the ASCII range. The issue is that mu_str_asciify_in_place does more than its name says: it replaces not only non-ASCII characters but also every character below the space character (' ', 0x20), which includes newline, and that seems wrong. The relevant part is this:

if (!isascii(*c) || *c < ' ')
    *c = '.';

Is it enough to instead write

if (!isascii(*c))
    *c = '.';

or would this introduce other problems?
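For comparison, here is a self-contained sketch of both behaviours. It is not the mu source: asciify_original only reconstructs the condition quoted above, and asciify_keep_whitespace is a hypothetical middle ground that still replaces control characters such as 0x01 but leaves tab, newline and the other characters accepted by isspace() alone. Dropping the `*c < ' '` test entirely, as proposed, would additionally let those remaining control characters through.

```c
#include <ctype.h>
#include <stddef.h>

/* Reconstruction of the quoted condition: every byte that is not
 * ASCII, or that is below the space character, becomes '.'. */
static char *asciify_original(char *buf)
{
    char *c;

    if (!buf)
        return NULL;
    for (c = buf; *c; ++c) {
        unsigned char uc = (unsigned char)*c;
        if (uc > 127 || uc < ' ')
            *c = '.';
    }
    return buf;
}

/* Hypothetical variant: replace non-ASCII bytes and control
 * characters, except those that isspace() accepts (in the "C" locale:
 * tab, newline, vertical tab, form feed, carriage return). */
static char *asciify_keep_whitespace(char *buf)
{
    char *c;

    if (!buf)
        return NULL;
    for (c = buf; *c; ++c) {
        unsigned char uc = (unsigned char)*c;
        if (uc > 127 || (iscntrl(uc) && !isspace(uc)))
            *c = '.';
    }
    return buf;
}
```

With the input "Dear List,\n\nI am", the first function yields "Dear List,..I am", matching the broken display, while the second returns the text unchanged.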

@tmalsburg tmalsburg added a commit that referenced this issue Feb 12, 2014
* lib: Don't replace ascii characters < ' ' with '.' in mu_str_asciify_in_place.

Fix for issue #351.
92ce900
@tmalsburg

There may be another problem involved: the mime part is plain ASCII and should therefore pass g_utf8_validate. Thus, the replacement of non-ASCII characters shouldn't be triggered at all.

@djcb
Owner

I've changed the check slightly (git), does that fix things for you?

(I suspect there must be some non-ascii char somewhere in the decoded message)
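One way to check that suspicion is to scan the decoded part for the first byte outside the ASCII range. The helper below is a hypothetical debugging aid, not part of mu. It relies only on the fact that a buffer containing nothing but bytes 0 to 127 is valid UTF-8, so a result of -1 would mean the fallback should never have been reached for that buffer.

```c
#include <stddef.h>

/* Hypothetical debugging helper: return the offset of the first byte
 * above 127 in buf[0..len), or -1 if every byte is plain ASCII. */
static long first_non_ascii(const char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; ++i) {
        if ((unsigned char)buf[i] > 127)
            return (long)i;
    }
    return -1;
}
```

Note that the raw, still quoted-printable text is always ASCII, so the check is only meaningful on the part after transfer decoding.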

@tmalsburg

That solves my problem. Thanks!

I will open a new issue if I find out more about the other problem (which I think is real because there are no non-ascii characters in the message.)

@tmalsburg tmalsburg closed this Feb 15, 2014