getMessageBody() changes the received content's character encoding but this has an undesired side effect.
When an (X)HTML body is blindly converted to another encoding, the actual encoding and the charset specified inline can end up not matching. An HTML (or XML) parser that honors the inline charset may then convert the text a second time.
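The double conversion can be illustrated with a minimal sketch. It assumes the mbstring extension is available and that the script file is saved as UTF-8. It does not use the library itself: the first `mb_convert_encoding()` call only stands in for whatever conversion `getMessageBody()` performs, and the second stands in for a parser that trusts the stale inline charset.

```php
<?php
// Sample text as a UTF-8 literal (assumes this file is saved as UTF-8).
$utf8 = 'Zwölf Boxkämpfer';

// Simulate the message as it arrived: the same text as ISO-8859-15 bytes.
$latin9 = mb_convert_encoding($utf8, 'ISO-8859-15', 'UTF-8');

// First conversion (stand-in for the body getter): ISO-8859-15 -> UTF-8.
$converted = mb_convert_encoding($latin9, 'UTF-8', 'ISO-8859-15');

// Second conversion (stand-in for a parser that still believes the inline
// charset declaration): the UTF-8 bytes are treated as ISO-8859-15 again.
$doubled = mb_convert_encoding($converted, 'UTF-8', 'ISO-8859-15');

var_dump($converted === $utf8); // bool(true): one conversion is correct
var_dump($doubled === $utf8);   // bool(false): the second one corrupts the text
```

Every non-ASCII character ends up as two unrelated characters after the second pass, which is why only the umlauts and the ß are affected while the ASCII text survives.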
For example, this entirely plausible email content:
<html>
<head>
<meta http-equiv=Content-Type content="text/html; charset=ISO-8859-15">
</head>
<body>
<p>Zwölf Boxkämpfer jagen Viktor quer über den großen Sylter Deich</p>
</body>
</html>
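Passed through getMessageBody() and then a parser that honors the inline charset, it prints the following:
'Zwölf BoxkÀmpfer jagen Viktor quer Ìber den gro�en Sylter Deich'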
I suppose there are a few possible ways to fix this:
- Incorporate DOMDocument into getMessageBody(). Dealing with its error reporting is not the best experience, and it does mean more dependencies/requirements.
- Do a naive swap of inline charset specifications. Probably a very bad idea, because it would not be real XML parsing.
- Have another function that returns the body untouched for "aware" processing.
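The last option could look roughly like the sketch below. Everything in it is hypothetical: `BodyPart`, `getRawBody()` and `getConvertedBody()` are illustrative names, not the library's actual API, and the sketch assumes PHP 8.0+ (for constructor property promotion) with the mbstring extension.

```php
<?php
// Hypothetical sketch; these are not the library's real class or method names.
final class BodyPart
{
    public function __construct(
        private string $rawBytes,        // body as received, after transfer decoding
        private string $declaredCharset  // charset taken from the MIME Content-Type header
    ) {
    }

    // Untouched bytes, for callers that run their own charset-aware parser.
    public function getRawBody(): string
    {
        return $this->rawBytes;
    }

    // Converted text, for callers that just want a string in $target.
    public function getConvertedBody(string $target = 'UTF-8'): string
    {
        return mb_convert_encoding($this->rawBytes, $target, $this->declaredCharset);
    }
}

// Usage: build ISO-8859-15 input from a UTF-8 literal (file assumed saved as UTF-8).
$raw  = mb_convert_encoding('<p>Zwölf</p>', 'ISO-8859-15', 'UTF-8');
$part = new BodyPart($raw, 'ISO-8859-15');

var_dump($part->getRawBody() === $raw);                // bool(true)
var_dump($part->getConvertedBody() === '<p>Zwölf</p>'); // bool(true)
```

Keeping both accessors would leave existing callers of the converting method unaffected, while a caller that feeds the body to a parser honoring the inline charset could use the raw bytes and avoid the second conversion.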