Thanks. It is indeed what one could expect from mime/quotedprintable; however, this leaves me with either bypassing quoted-printable decoding entirely or skipping the message altogether. Considering that the email displays fine in Gmail, Outlook, Thunderbird and all the popular mail clients, there is no reason I should not be able to read it using Go's stdlib as well.
The only workaround I see for now is removing Content-Transfer-Encoding: quoted-printable from the mail headers whenever the read fails, in order to bypass quotedprintable when reading the mail part body from mime/multipart's output. This is obviously not optimal. The other solution would be to fork mime/quotedprintable to add a "relaxed" flag of some sort, allowing unencoded characters to be let through.
But IMO this flag should be in the stdlib, especially considering that the issue doesn't seem to be a blocker for any other parser, mail client or mail service.
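For reference, a minimal sketch of a fallback along the lines of the workaround described above. Instead of editing the headers it swaps in a different technique: multipart.Reader.NextRawPart (Go 1.14 and later), which skips the automatic quoted-printable handling. The helper names decodeOrRaw and readBodies are made up for this example, and the sample message uses a raw control byte as the offending input, since which bytes the decoder rejects may depend on the Go version.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"mime/multipart"
	"mime/quotedprintable"
	"strings"
)

// decodeOrRaw (hypothetical helper) tries to decode raw as quoted-printable
// when cte says so. If the decoder rejects the input, the undecoded bytes
// are returned instead and ok is false.
func decodeOrRaw(cte string, raw []byte) (body string, ok bool) {
	if !strings.EqualFold(strings.TrimSpace(cte), "quoted-printable") {
		return string(raw), true
	}
	out, err := io.ReadAll(quotedprintable.NewReader(bytes.NewReader(raw)))
	if err != nil {
		return string(raw), false
	}
	return string(out), true
}

// readBodies (hypothetical helper) walks a multipart body using NextRawPart,
// so mime/multipart does not decode quoted-printable parts on its own.
func readBodies(msg, boundary string) ([]string, error) {
	mr := multipart.NewReader(strings.NewReader(msg), boundary)
	var bodies []string
	for {
		p, err := mr.NextRawPart()
		if err == io.EOF {
			return bodies, nil
		}
		if err != nil {
			return nil, err
		}
		raw, err := io.ReadAll(p)
		if err != nil {
			return nil, err
		}
		body, _ := decodeOrRaw(p.Header.Get("Content-Transfer-Encoding"), raw)
		bodies = append(bodies, body)
	}
}

func main() {
	// Two parts: one valid, one containing an unencoded control byte.
	msg := "--b\r\n" +
		"Content-Transfer-Encoding: quoted-printable\r\n\r\n" +
		"caf=C3=A9\r\n" +
		"--b\r\n" +
		"Content-Transfer-Encoding: quoted-printable\r\n\r\n" +
		"bad\x01byte\r\n" +
		"--b--\r\n"
	bodies, err := readBodies(msg, "b")
	fmt.Printf("%q %v\n", bodies, err)
}
```

The downside is the same as with the header-removal workaround: when decoding fails, any =XX escapes in that part stay undecoded, so this is a stopgap rather than a real "relaxed" mode.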
In the discussion of illegal substrings of quoted-printable data, RFC 2045 says:
Control characters other than TAB, or CR and LF as parts of CRLF pairs, must not appear. The same is true for octets with decimal values greater than 126. If found in incoming quoted-printable data by a decoder, a robust implementation might exclude them from the decoded data and warn the user that illegal characters were discovered.
So I suppose the question here is whether the mime/quotedprintable package should permit invalid bytes or not. Right now we return an error. One possibility would be:
upon encountering an invalid byte, if there are any bytes already read, unread the invalid byte and return the bytes read so far
if no bytes have been read so far, return the invalid byte with a new exported error
allow future reads to continue as usual
That would maintain the current behavior for most callers, while permitting callers who want to permit invalid bytes to simply ignore the new exported error and carry on as though it did not occur.
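To make the proposed Read contract concrete, here is a rough sketch. Everything in it is hypothetical: ErrInvalidByte, flaggingReader and readRelaxed are names invented for illustration and none of them exist in mime/quotedprintable. The reader also does no decoding at all; it passes bytes through unchanged and only demonstrates the three rules above.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// ErrInvalidByte is a hypothetical exported error; the stdlib does not define it.
var ErrInvalidByte = errors.New("invalid unescaped byte")

// legal reports whether b may appear unencoded. Simplification: CR and LF
// are accepted anywhere, not only as part of a CRLF pair.
func legal(b byte) bool {
	return b == '\t' || b == '\r' || b == '\n' || (b >= ' ' && b <= '~')
}

// flaggingReader passes bytes through without decoding them. It exists only
// to illustrate the proposed Read behavior.
type flaggingReader struct {
	src []byte // input not yet returned
}

func (r *flaggingReader) Read(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	if len(r.src) == 0 {
		return 0, io.EOF
	}
	n := 0
	for n < len(p) && n < len(r.src) {
		b := r.src[n]
		if !legal(b) {
			if n > 0 {
				break // "unread" the invalid byte; return what we have so far
			}
			// Nothing read yet: hand back the invalid byte with the error.
			p[0] = b
			r.src = r.src[1:]
			return 1, ErrInvalidByte
		}
		p[n] = b
		n++
	}
	r.src = r.src[n:]
	return n, nil
}

// readRelaxed shows a caller that ignores ErrInvalidByte and carries on,
// keeping the invalid bytes and counting them.
func readRelaxed(r io.Reader) ([]byte, int, error) {
	var data []byte
	invalid := 0
	buf := make([]byte, 64)
	for {
		n, err := r.Read(buf)
		data = append(data, buf[:n]...)
		switch {
		case errors.Is(err, ErrInvalidByte):
			invalid++
		case err == io.EOF:
			return data, invalid, nil
		case err != nil:
			return data, invalid, err
		}
	}
}

func main() {
	data, invalid, err := readRelaxed(&flaggingReader{src: []byte("ab\xe9cd")})
	fmt.Printf("%q %d %v\n", data, invalid, err)
}
```

A caller that does not know about the new error (for example one using io.ReadAll) would still stop at the first invalid byte, which is how existing behavior would be preserved.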
RFC 2045 doesn't permit non-ASCII bytes, but some systems send them anyhow, and it seems to do little harm to permit them.
The harm is that callers who could previously rely on the stdlib to check that the data was correct now can't.
There is a continuing pattern of stdlib slackening RFC compliance over time, promoting divergence, and increasing the pressure on other systems to also ignore errors: 'Gmail/PHP/Go doesn't complain'.
Callers that care now have to check for themselves before they pass the quoted-printable input downstream to non-stdlib code that will barf.
@RalphCorderoy It's certainly not ideal, but I think our priority has to be supporting the less sophisticated user. Since other systems are apparently doing the wrong thing, we need to support that. The different approach I outlined a few comments above seemed too complex. It's very simple for the sophisticated user to detect this problematic case by scanning the bytes returned from the Read method.
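As a rough illustration of such a check, here is one possible sketch. It is a variation on the idea: it scans the bytes returned by the Read method of the still-encoded source, on their way into the decoder, because decoded output can legitimately contain bytes above 126 that came from =XX escapes. The names highByteCounter and decodeAndCount are invented for this example.

```go
package main

import (
	"fmt"
	"io"
	"mime/quotedprintable"
	"strings"
)

// highByteCounter (hypothetical) wraps the still-encoded input and counts
// bytes >= 0x80 as they pass through on their way to the decoder.
type highByteCounter struct {
	r     io.Reader
	count int
}

func (c *highByteCounter) Read(p []byte) (int, error) {
	n, err := c.r.Read(p)
	for _, b := range p[:n] {
		if b >= 0x80 {
			c.count++
		}
	}
	return n, err
}

// decodeAndCount (hypothetical) decodes quoted-printable input and reports
// how many raw non-ASCII bytes the encoded input contained.
func decodeAndCount(encoded string) (string, int, error) {
	c := &highByteCounter{r: strings.NewReader(encoded)}
	out, err := io.ReadAll(quotedprintable.NewReader(c))
	return string(out), c.count, err
}

func main() {
	for _, in := range []string{"caf=C3=A9", "caf\xc3\xa9"} {
		out, n, err := decodeAndCount(in)
		fmt.Printf("%q: decoded=%q rawHighBytes=%d err=%v\n", in, out, n, err)
	}
}
```

A strict caller can treat a non-zero count as an error before handing the data to anything downstream; a lenient caller just ignores it.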