Bring back the alternatives array #74

Open
fabiob opened this Issue Nov 20, 2013 · 6 comments


@fabiob

First of all, congratulations on this great module!

I'm using mailparser to parse incoming mails with XML files attached. It works just fine in almost every case. However, some email clients are (wrongly) attaching the XML files as text/plain. Since the 0a36dc7 commit, I can't read those attachments anymore, as they get concatenated into the message body.

I would like to kindly ask if you could revert that commit and bring back the alternatives array, maybe as an option to the parser. Is that possible?

If you're too busy, just let me know, and I'll try to fork and work on it.

TIA!

@fabiob

I was checking the incoming message again, and I noticed something that might be helpful: in this case, aside from the Content-Type specifying text/plain, the mailer also included a filename for the attachment. So this could be used to mark this part as an attachment, instead of an alternative.

Here's the relevant part of the source:

------=_NextPart_000_0038_01CEE501.61A93450
Content-Type: text/plain;
    name="35131102916265010980550010010667321709662207.xml"
Content-Transfer-Encoding: quoted-printable
Content-ID: <{E3FD5FC1-009F-4D1F-9CDD-7F2FFC06F8CE}>

What do you think?
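
To make the idea concrete, here is a minimal sketch of how the name parameter could be read from a raw Content-Type header value. The helpers `parseHeaderValue` and `candidateFilename` are hypothetical and not part of mailparser; the sketch also assumes no quoted parameter value contains a semicolon.

```javascript
// Hypothetical helper, not part of mailparser: split a header value such as
// 'text/plain; name="file.xml"' into its main value and its parameters.
// Assumption: no parameter value contains a ';' inside quotes.
function parseHeaderValue(raw) {
  // Unfold continuation lines (a line break followed by whitespace).
  const unfolded = raw.replace(/\r?\n[ \t]+/g, ' ');
  const [value, ...rest] = unfolded.split(';');
  const params = {};
  for (const piece of rest) {
    const eq = piece.indexOf('=');
    if (eq === -1) continue; // skip empty or malformed pieces
    const key = piece.slice(0, eq).trim().toLowerCase();
    // Strip optional surrounding double quotes from the parameter value.
    params[key] = piece.slice(eq + 1).trim().replace(/^"(.*)"$/, '$1');
  }
  return { value: value.trim().toLowerCase(), params };
}

// Returns the name parameter of a text/plain part, or null if there is none.
function candidateFilename(contentTypeValue) {
  const { value, params } = parseHeaderValue(contentTypeValue);
  return value === 'text/plain' && params.name ? params.name : null;
}

// The Content-Type value from the part quoted above, including the folded line.
const sampleContentType =
  'text/plain;\n    name="35131102916265010980550010010667321709662207.xml"';
const sampleName = candidateFilename(sampleContentType);
console.log(sampleName);
```

A text/plain part for which `candidateFilename` returns a value could then be kept out of the concatenated body and treated as an attachment.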

@andris9
Owner

Well, this is a complicated situation; the concatenation is there on purpose. Are the name property and Content-ID values common in these e-mails? Could you also provide a sample message, so I can check how other e-mail clients handle this situation?

@fabiob

OFC, I'll send the sample e-mail to the address on your profile.

@fabiob

I'm currently using this hack, but it feels very, very dirty :)

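var inlines;
// Match an optional XML declaration followed by one root element.
// \s* allows whitespace after the declaration; [\s\S]*? lets the element span lines.
var xmlPattern = /(?:<\?xml[^>]*\?>\s*)?<(\w+)[^>]*>[\s\S]*?<\/\1>/g;
if (mail.text && (inlines = mail.text.match(xmlPattern))) {
  console.log('Found ' + inlines.length + ' inline attachments');
  for (var i = 0; i < inlines.length; i++) {
    var xml = inlines[i];
    var fn = 'inline-' + i + '.xml';
    // parse the XML
  }
}

@andris9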
Owner

I tested the e-mail source you sent with various modifications, and it seems that Gmail does not show the attached XML file even if I remove the name property and the Content-ID. I need to dig into it a bit more; maybe separate text/plain parts should be concatenated only when they are found in a multipart/related subpart. Not sure yet.

@jstedfast

GMail will only display the first text part in a multipart/mixed as the message body.

Some mail clients will render all parts which do not have a Content-Disposition header with a value of "attachment" inline as part of the body.

Even so, you shouldn't concatenate text parts into 1 blob. IOW, keep the model separate from the view.

Also, the name property on Content-Type isn't necessarily a file name; the file name is typically carried by the "filename" parameter on the Content-Disposition header.

(see http://tools.ietf.org/html/rfc2183 for more details)
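
As a rough illustration of keeping the model separate from the view, the sketch below stores every part on its own and leaves the choice of what to display to small view functions. The part objects and their field names (`contentType`, `disposition`, `filename`, `content`) are assumptions made for this example, not mailparser's actual output. The second view is restricted to text parts for simplicity.

```javascript
// Hypothetical part model: one entry per MIME part, never merged.
// Field names are illustrative only and not taken from mailparser.
const parts = [
  { contentType: 'text/plain', disposition: null, filename: null, content: 'Hello' },
  { contentType: 'text/plain', disposition: null, filename: null, content: '<doc>1</doc>' },
  { contentType: 'text/plain', disposition: 'attachment', filename: 'a.xml', content: '<doc>2</doc>' }
];

function isText(part) {
  return part.contentType.startsWith('text/');
}

function isAttachment(part) {
  return part.disposition === 'attachment';
}

// View A: show only the first text part as the message body.
function firstTextView(allParts) {
  const first = allParts.find(isText);
  return first ? first.content : '';
}

// View B: show every text part that is not marked as an attachment, in order.
function inlineView(allParts) {
  return allParts
    .filter((part) => isText(part) && !isAttachment(part))
    .map((part) => part.content)
    .join('\n');
}

console.log(firstTextView(parts));
console.log(inlineView(parts));
```

Because `parts` is never modified, a caller that needs the individual XML parts can still read them, whichever view is used for display.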
