don't raise an exception when processing emails with UTF-7 #592

Closed
Contributor

mreinsch commented Aug 4, 2013

When processing UTF-7 encoded emails (bounces from hotmail for instance), mail raises an exception.

This patch changes the behaviour to not raise an exception but simply keep the encoded string instead.
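
As a rough illustration of the "keep the encoded string" behaviour described above, the fallback might look something like the sketch below. `decode_or_keep` is a hypothetical helper invented for this example, not the code from the patch. It assumes Ruby's built-in transcoding, which knows the UTF-7 name but ships no converter for it.

```ruby
# Hypothetical helper for illustration; not taken from the patch or the mail gem.
# Tries to transcode raw header bytes to UTF-8 and falls back to the
# untouched input when the charset is unknown or cannot be converted.
def decode_or_keep(raw, charset)
  raw.dup.force_encoding(Encoding.find(charset)).encode(Encoding::UTF_8)
rescue ArgumentError, EncodingError
  # ArgumentError: Ruby does not recognise the charset name.
  # EncodingError: no converter exists (e.g. UTF-7) or the bytes do not convert.
  raw
end

puts decode_or_keep("caf\xE9".b, "iso-8859-1") # convertible charset
puts decode_or_keep("+AOk-", "utf-7")          # no converter, input is kept
```

The trade-off is that callers may receive a string that is still encoded, so they cannot assume the result is readable text.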

mreinsch added some commits Aug 4, 2013

@mreinsch mreinsch don't raise an exception when processing emails with UTF-7 or other encodings which can't be handled
1f3b075
@mreinsch mreinsch don't raise UndefinedConversionError
* skip unconvertable characters as we do with invalid characters
14a7c1e
Collaborator

ConradIrwin commented Aug 6, 2013

@mreinsch I like 14a7c1e a lot, but I'm a bit scared of 1f3b075. Given that Net::IMAP has a UTF-7 decoder in it, maybe we should just decode it properly instead?

Also, could you please add a fixture with an example UTF-7 email? I don't seem to have any.

Thanks!

Contributor

mreinsch commented Aug 7, 2013

@ConradIrwin thanks for the feedback. I did try the UTF-7 decoder in Net::IMAP, but couldn't get it to decode the subject line, which you can also find in the test (it's an actual subject line I encountered in the wild). I tried various approaches to decode that subject line, but none produced anything useful. That may well be my fault, but UTF-7 seems to be used mainly for auto-generated bounce replies from Hotmail, which probably explains why no one ever bothered to implement it.
Anyway, I can see that 1f3b075 is scary. On the other hand, mail should somehow handle headers it can't decode; IMHO it should do something other than throw an exception.
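
One possible explanation for the decoding trouble, not confirmed anywhere in this thread: the UTF-7 helper in Net::IMAP targets the modified UTF-7 that IMAP uses for mailbox names (shift character `&`), whereas mail headers use plain RFC 2152 UTF-7 (shift character `+`). A minimal sketch of an RFC 2152 decoder is shown below. `decode_utf7` is a hypothetical helper written for this example and is not part of mail or Net::IMAP; it skips validation that a production decoder would need.

```ruby
# Hypothetical helper for illustration; not part of mail or Net::IMAP.
# Decodes RFC 2152 UTF-7: "+<base64 of UTF-16BE>-" runs, with "+-" as a literal "+".
def decode_utf7(str)
  str.gsub(/\+([A-Za-z0-9+\/]*)-?/) do
    b64 = Regexp.last_match(1)
    next "+" if b64.empty?

    # UTF-7 omits base64 padding, so restore it before decoding.
    padded = b64 + "=" * ((4 - b64.length % 4) % 4)
    bytes = padded.unpack1("m")
    # Drop a trailing odd byte made up of leftover padding bits.
    bytes = bytes.byteslice(0, bytes.bytesize - bytes.bytesize % 2)
    bytes.force_encoding(Encoding::UTF_16BE)
         .encode(Encoding::UTF_8, invalid: :replace, undef: :replace)
  end
end

puts decode_utf7("Hi Mom -+Jjo--!") # example string from RFC 2152
```

Whether this would have handled the subject line from the test is untested here; it only shows the shape a proper decoder could take.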

👍 for 14a7c1e.

We had a message with the following subject line:

Subject: =?ks_c_5601-1987?B?uei03rXHwfYgvsrAvTogRSZFVFYgLS0gU29sYXIgOiBCcmlnaHRT?=
 =?ks_c_5601-1987?B?b3VyY2UncyBEZXNtb25kIHNheXMgSXZhbnBhaCBmYWNpbGl0eSBh?=
 =?ks_c_5601-1987?Q?_model_for_future_projects?=

When trying to retrieve the Mail object subject, we would then get Encoding::UndefinedConversionError: "\xEBn" from CP949 to UTF-8

After monkey-patching in 14a7c1e, the message parsed fine. It looks like :undef => :replace is a good change for more than one situation.
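
For readers unfamiliar with the option, here is a minimal sketch of what `:undef => :replace` does in Ruby's `String#encode`. `to_utf8_lossy` is a hypothetical helper for this example and is not the code from 14a7c1e; the sample bytes reuse the `"\xEBn"` sequence from the error message above.

```ruby
# Hypothetical helper for illustration; not the code from 14a7c1e.
# Transcodes to UTF-8 and substitutes a replacement character for anything
# that cannot be converted, instead of raising.
def to_utf8_lossy(raw, charset)
  raw.dup.force_encoding(charset).encode(
    Encoding::UTF_8,
    invalid: :replace, # byte sequences that are malformed in the source charset
    undef:   :replace  # characters with no mapping in the target encoding
  )
end

puts to_utf8_lossy("Solar \xEBn".b, "CP949")
```

The cost is silent data loss for the replaced characters, which is usually preferable to an exception when only a header is being displayed.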

Contributor

mreinsch commented Sep 27, 2014

superseded by #802 and the work @grosser did :)

mreinsch closed this Sep 27, 2014
