Multipart text messages not returned fully by ->getMessageBody() #163
I've noticed that when a message has an inline attachment, only the initial text of that email is returned rather than all text parts concatenated together, and there is no straightforward way - that I can see - to get the concatenated body with the existing code.
For my purposes I've hacked Parser->getMessageBody() as follows:
I.e. I set it not to break after the first matching part, and to concatenate each matching part onto $body.
This works for me, for both the text parts and the HTML emails I've tested, but I'm not sure it is clean enough for a PR. It certainly passes the existing tests, but the change in behaviour may be a deal breaker for some. Any thoughts much appreciated.
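To make the difference concrete, here is a minimal sketch of "first matching part" versus "all matching parts concatenated". It is written in Python with the standard-library `email` package purely for illustration; it is not the library's code, and the function names `first_text_body` and `all_text_body` are hypothetical. It assumes that parts explicitly marked as `attachment` should be left out of the body.

```python
from email import message_from_bytes, policy
from email.mime.image import MIMEImage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_sample() -> bytes:
    """Build a message shaped like the one in the report: text, inline image, text."""
    msg = MIMEMultipart("mixed")
    msg["Subject"] = "Inline attachment example"
    msg.attach(MIMEText("Text before the image."))
    # Fake image bytes; the subtype is given explicitly so no type guessing happens.
    image = MIMEImage(b"\x89PNG\r\n\x1a\n", _subtype="png")
    image.add_header("Content-Disposition", "inline", filename="pic.png")
    msg.attach(image)
    msg.attach(MIMEText("Text after the image."))
    return msg.as_bytes()


def text_parts(msg, subtype="plain"):
    """Collect every text/<subtype> part that is not marked as an attachment."""
    parts = []
    for part in msg.walk():
        if part.get_content_type() != f"text/{subtype}":
            continue
        if part.get_content_disposition() == "attachment":
            continue
        parts.append(part.get_content().strip())
    return parts


def first_text_body(msg, subtype="plain"):
    """Behaviour described in the report: stop at the first matching part."""
    parts = text_parts(msg, subtype)
    return parts[0] if parts else ""


def all_text_body(msg, subtype="plain"):
    """Proposed behaviour: keep going and concatenate every matching part."""
    return "\n".join(text_parts(msg, subtype))


parsed = message_from_bytes(build_sample(), policy=policy.default)
print(first_text_body(parsed))
print(all_text_body(parsed))
```

With the sample message, the first function returns only the text before the image, while the second also includes the text that follows it.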
(PS. about to go on holiday for a fortnight so won't check back on this for a while, I'm not ignoring you)
I’m not aware of an RFC that explicitly says you need to concatenate the body, or one that says you should not concatenate the body - I'm not really familiar with the RFCs at all.
https://www.w3.org/Protocols/rfc1341/7_2_Multipart.html does say that "A body part is NOT to be interpreted as actually being an RFC 822 message" which, from my interpretation, implies that the whole idea of there being a
Looking at the source it seems that
If you think that's a better approach, I'll look at creating some test cases, try to implement it, and make a PR.
I've attached a raw email example: an actual sent email whose headers I've edited to hide personal information.
I don't know yet if it's a good idea. The purpose of this lib is to be easy to understand: when I parse an email I only have one text version and one HTML version. If you want the technical view of the email, you can use mailparse directly for that.
But I understand your reply and it's true that there is no sense for getMessageBody().
Maybe another approach would be to return all the other text or HTML parts as attachments of the email. The first text or HTML part is the main body and the others are attachments.
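As a rough sketch of this idea (first text part is the body, later ones are handed back separately), again in Python with the standard-library `email` package rather than the library's own code. The function name `split_body_and_extras` and the sample message are hypothetical, and skipping parts marked as `attachment` is an assumption.

```python
from email import message_from_bytes, policy

# Hypothetical sample: text, an inline image, then more text.
RAW = b"""\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="BOUND"

--BOUND
Content-Type: text/plain; charset="us-ascii"

Text before the image.
--BOUND
Content-Type: image/png
Content-Disposition: inline; filename="pic.png"
Content-Transfer-Encoding: base64

iVBORw0KGgo=
--BOUND
Content-Type: text/plain; charset="us-ascii"

Text after the image.
--BOUND--
"""


def split_body_and_extras(msg, subtype="plain"):
    """Return (main body, list of the remaining text parts of the same subtype)."""
    wanted = f"text/{subtype}"
    body = None
    extras = []
    for part in msg.walk():
        if part.get_content_type() != wanted:
            continue
        if part.get_content_disposition() == "attachment":
            continue
        text = part.get_content().strip()
        if body is None:
            body = text          # the first matching part is the main body
        else:
            extras.append(text)  # later matching parts are kept separately
    return (body or ""), extras


message = message_from_bytes(RAW, policy=policy.default)
body, extras = split_body_and_extras(message)
print(body)
print(extras)
```

This keeps the simple "one text body" view for callers who want it, while the text that follows an inline attachment is still reachable.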
Another solution could be to have
Thanks for the feedback, I totally understand the purpose as digging into MailParse can be a bit daunting and a bit of a chore, hence why I chose your library and initially suggested the kludge above without thinking it through.
I like either of the solutions you suggest, though would tend towards the latter,
As that would fit my use case better. I'll fork and give that a shot and you can see what you feel about merging it.