quopri module differences in quoted-printable text with whitespace #60677
Comments
New to python-dev; I grabbed the beginner task "increase test coverage" and decided to add coverage for this bit of code in the quopri module:

    # quopri.py

As far as I understand, to get into that while-loop the line to decode should end in " \t\r\n". So I added the following test:

    def test_decodestring_badly_enconded(self):
        e = b"hello \t\r\n"
        p = b"hello\n"
        s = self.module.decodestring(e)
        self.assertEqual(s, p)

but that only passes when the module doesn't use binascii. In fact, I changed test_quopri to use support.import_fresh_module to disable binascii, and removed a decorator that was used. The decoded text when binascii is used is:

    >>> quopri.decodestring("hello \t\r\n")
    'hello \t\r\n'

which differs from:

    >>> quopri.a2b_qp = None
    >>> quopri.b2a_qp = None
    >>> quopri.decodestring("hello \t\r\n")
    'hello\n'

And what's the deal with:

    >>> import quopri
    >>> quopri.encodestring("hello \t\r")
    'hello \t\r'
    >>> "hello \t\r".encode("quopri")
    'hello=20=09\r'
|
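The decoding discrepancy reported above can be reproduced on Python 3 without editing the module, by temporarily patching `quopri.a2b_qp` to `None` so that the pure-Python fallback runs. This is only a minimal sketch, not the test from the patch: `decode_both_ways` is a hypothetical helper name, and the output of the binascii-backed path is printed rather than asserted because it may depend on the Python version.

```python
import quopri
from unittest import mock


def decode_both_ways(data: bytes) -> tuple[bytes, bytes]:
    """Decode with the default code path and with the pure-Python fallback."""
    # Default path: uses binascii.a2b_qp when binascii is importable.
    default = quopri.decodestring(data)
    # Fallback path: quopri looks up its module-level a2b_qp at call time,
    # so patching it to None selects the pure-Python decoder.
    with mock.patch.object(quopri, "a2b_qp", None):
        pure = quopri.decodestring(data)
    return default, pure


if __name__ == "__main__":
    default, pure = decode_both_ways(b"hello \t\r\n")
    print("default path:", default)
    print("pure Python: ", pure)
```

Using `mock.patch.object` restores the original attribute afterwards, which avoids the side effects of assigning `quopri.a2b_qp = None` globally in an interactive session.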
I think I can answer your last question. There are two quopri algorithms, one where spaces are allowed (message body) and one where they aren't (email headers). For the rest, I'd have to take a closer look than I have time for right now. |
Ping. |
I'll take this on if I can. Is binascii available on all platforms? If it is, the quopri code could be simplified slightly, along with the test code. |
The first problem is determining the "best" error recovery algorithms by reading through the RFCs and considering use cases. |
Three slightly different points here:
|
Regarding decoding trailing whitespace, <https://tools.ietf.org/html/rfc1521.html#section-5.1> rule #3 says: “When decoding a Quoted-Printable body, any trailing white space on a line must be deleted, as it will necessarily have been added by intermediate transport agents.” |
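To make the quoted rule concrete, here is a small sketch of the whitespace-stripping step on its own. It is not what `quopri` or `binascii` actually do internally; `strip_transport_padding` is a hypothetical name, and unlike the pure-Python output shown earlier in this thread, the sketch keeps each line's original line ending instead of normalizing it to `\n`.

```python
def strip_transport_padding(data: bytes) -> bytes:
    """Delete trailing spaces and tabs from every line (RFC 1521 sec. 5.1, rule #3)."""
    cleaned = []
    for line in data.splitlines(keepends=True):
        # Split the line into its content and its line ending (if any).
        body = line.rstrip(b"\r\n")
        ending = line[len(body):]
        # Drop only the trailing white space; the line ending is preserved.
        cleaned.append(body.rstrip(b" \t") + ending)
    return b"".join(cleaned)


if __name__ == "__main__":
    print(strip_transport_padding(b"hello \t\r\n"))
```

A real decoder would apply this step before interpreting `=XX` escapes, so that white space encoded as `=20` or `=09` survives while literal padding does not.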
Will commit a slightly modified version of my doc patch to 3.4+, since mentioning the wrong functions is confusing. But I think we still need to fix the “binascii” decoding, and have a look at Alejandro’s test suite patch. |
New changeset de82f41d6669 by Martin Panter <vadmium> in branch '3.4':
New changeset 28cd11dc2915 by Martin Panter <vadmium> in branch '3.5':
New changeset 3ecb5766ba15 by Martin Panter <vadmium> in branch 'default': |
New changeset cfb0481c89d7 by Martin Panter <vadmium> in branch '2.7': |
The mentioned functions are not exact equivalents of the codecs. They are the preferable way to obtain similar output (apart from minor details). |
The list of functions was added in bpo-17844. I made the change today because, when investigating bpo-25075, I forgot that the listed functions weren’t exactly equivalent. Base64-codec encodes to multiple lines, but b64encode() returns the raw encoding without line breaks. I see that base64.encodebytes() is listed as a “legacy interface”, but as far as I can tell nothing outside the legacy interface does any line splitting. Hex-codec encodes to lowercase, but b16encode() returns uppercase, following RFC 4648. Quopri-codec encodes all whitespace, but quopri.encodestring() lets most whitespace through verbatim by default. In this case I think it would be reasonable to change back to encodestring() if we say that quotetabs=True is passed in. |
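The three differences listed above can be checked side by side on Python 3. This is an illustrative sketch only; the sample inputs are arbitrary, and the variable names are made up for the example.

```python
import base64
import codecs
import quopri

# 60 bytes encode to 80 base64 characters, more than one 76-character line.
data = bytes(range(60))

# base64: the codec splits its output into lines, b64encode() does not.
b64_codec = codecs.encode(data, "base64")
b64_func = base64.b64encode(data)

# hex: the codec emits lowercase digits, b16encode() emits uppercase.
hex_codec = codecs.encode(b"\xab\xcd", "hex")
hex_func = base64.b16encode(b"\xab\xcd")

# quopri: encodestring() leaves embedded whitespace alone by default,
# and quotes spaces and tabs only when quotetabs=True is passed.
text = b"hello world\tfoo"
qp_default = quopri.encodestring(text)
qp_quoted = quopri.encodestring(text, quotetabs=True)
qp_codec = codecs.encode(text, "quopri")

if __name__ == "__main__":
    print("base64 codec:", b64_codec)
    print("b64encode:   ", b64_func)
    print("hex codec:   ", hex_codec)
    print("b16encode:   ", hex_func)
    print("qp default:  ", qp_default)
    print("qp quotetabs:", qp_quoted)
    print("qp codec:    ", qp_codec)
```

Printing `qp_quoted` next to `qp_codec` shows whether passing `quotetabs=True` brings `encodestring()` in line with the codec for a given input.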