-
-
Notifications
You must be signed in to change notification settings - Fork 30.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
zlib.Decompress.decompress/flush do not raise any exceptions when given truncated input streams #56855
Comments
If a truncated input stream is given to the zlib.decompress function, it raises a zlib.error. However, if the same stream is fed to a zlib.Decompress object, no exception is raised during the entire lifetime of the object. Attached is an example demonstrating the discrepancy. zlib.decompress raises this exception when it gets a Z_BUF_ERROR from zlib, while the implementation of zlib.Decompress.decompress just skips every Z_BUF_ERROR it gets as described in this (apparently inaccurate) comment: |
I believe the attached patch fixes this problem, making zlib.Decompress.flush() raise the exception raised by zlib.decompress(). In the same patch, I also took the opportunity to correct a wrong comment in the implementation of flush() and change the error messages given by zlib.{De,C}ompress.flush() on {in,de}flateEnd() errors to the more end-user-friendly ones given in the same occasions by zlib.{de,}compress(). If this does not sound like a good thing to do, feel free (whoever ends up committing this) to remove these changes. One uncomfortable issue I see with the patch is that zlib.Decompress.flush() now potentially gives an error message with Z_OK as the error code, but unless I misunderstand the comments in the real zlib’s zlib.h and that never happens (I was unable to produce a situation that would cause this), the only other options are faking another error code and setting an exception message whose format is different from all other exceptions raised by the zlib module. |
Actually, I think another call to inflate(), which would be valid at that point, would just return the other error code, so it can as well be faked. |
Looking at Lib/test/test_zlib.py, it appears that this behaviour is intentional |
I'm not a zlib specialist, but I think what this means is that the stream is not finished, but it's valid anyway. For example, you get the same behaviour by doing: c = zlib.compressobj()
s = c.compress(b'This is just a test string.')
s += c.flush(zlib.Z_FULL_FLUSH) The resulting bytestring is a non-terminated zlib stream. It still decompresses to the original data fine. I think the appropriate fix is to add an argument to flush(). Here is a patch, I named the argument "strict" by lack of imagination :) |
I like the new patch, but shouldn’t the default be to behave the same way zlib.decompress() behaves, i. e. raise? (Or perhaps zlib.decompress() should be modified not to raise instead. I’m just aiming at consistency.) Of course this will break code that relies on the old behaviour but the fix (adding strict=False) is trivial. |
I have another proposition (as an alternative). The new _bz2.BZ2Decompressor objects have an attribute called eof which is False until the end of the stream is read. The same attribute could be added to zlib.Decompress objects. |
+1, I like the idea of being consistent across related modules. |
Here's the patch. |
New changeset bb6c2d5c811d by Nadeem Vawda in branch 'default': |
New changeset 65d61ed991d9 by Nadeem Vawda in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: