gzip.compress and gzip.decompress are sub-optimally implemented. #87779
When working on python-isal, which aims to provide faster drop-in replacements for the zlib and gzip modules, I found that gzip.compress and gzip.decompress are suboptimally implemented, which hurts performance. gzip.compress and gzip.decompress both do the following things:
That means there is far more Python code involved than strictly necessary. Also, the 'data' is already fully in memory, yet it is streamed anyway. That is quite a waste. I propose the following:
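For illustration only, here is a minimal sketch of what a one-shot, in-memory approach could look like. It is not the actual CPython patch. The function names `compress_in_memory` and `decompress_in_memory` are hypothetical, and the header choices (no filename, XFL of 0, OS byte 255 for "unknown") are assumptions made to keep the example short.

```python
import struct
import zlib


def compress_in_memory(data: bytes, level: int = 9, mtime: int = 0) -> bytes:
    """Hypothetical one-shot gzip compress: header + raw deflate + trailer."""
    # Fixed 10-byte gzip header: magic (1f 8b), CM=8 (deflate), FLG=0,
    # MTIME, XFL=0, OS=255 (unknown). No optional fields are written.
    header = struct.pack("<BBBBLBB", 0x1F, 0x8B, 8, 0, mtime, 0, 255)
    # Negative wbits asks zlib for a raw deflate stream (no zlib wrapper).
    compressor = zlib.compressobj(level, zlib.DEFLATED, -zlib.MAX_WBITS)
    body = compressor.compress(data) + compressor.flush()
    # Trailer: CRC32 of the uncompressed data and its length modulo 2**32.
    trailer = struct.pack("<LL", zlib.crc32(data) & 0xFFFFFFFF,
                          len(data) & 0xFFFFFFFF)
    return header + body + trailer


def decompress_in_memory(data: bytes) -> bytes:
    """Hypothetical one-shot gzip decompress that handles multiple members."""
    parts = []
    while data:
        # wbits = 16 + MAX_WBITS lets zlib parse the gzip header and
        # verify the CRC32/length trailer itself.
        decompressor = zlib.decompressobj(wbits=16 + zlib.MAX_WBITS)
        parts.append(decompressor.decompress(data))
        if not decompressor.eof:
            raise EOFError("Compressed data ended before the "
                           "end-of-stream marker was reached")
        # Whatever follows the finished member may be another member.
        data = decompressor.unused_data
    return b"".join(parts)
```

Both functions work on the whole buffer at once, so no file object or buffered reader is involved. Letting zlib parse the header in the decompress path is the shortest route, but it also means the error messages come from zlib and not from gzip.py.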
That brings me to another point. Should non-descriptive EOFErrors be raised when reading the gzip header, or should informative BadGzipFile errors be thrown when the header is parsed? I tend towards the latter: for example, BadGzipFile("Truncated header") instead of EOFError, or at least EOFError("Truncated gzip header"). I am aware that this confounds this issue with another one, but the two are coupled in the implementation, so both need to be solved at the same time. Given the headaches that gzip.decompress gives, it might be easier to solve gzip.compress in a first PR and do gzip.decompress later.
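To make the error question concrete, here is a rough sketch of header parsing over an in-memory buffer that raises BadGzipFile("Truncated header") wherever the data runs out. The helper name `gzip_header_end` is hypothetical and this is not the code in gzip.py; it only assumes the standard gzip header layout and `gzip.BadGzipFile` (available since Python 3.8).

```python
import gzip
import struct

# Flag bits in the FLG byte of a gzip header.
FHCRC, FEXTRA, FNAME, FCOMMENT = 2, 4, 8, 16


def gzip_header_end(data: bytes) -> int:
    """Return the offset where the deflate stream starts (hypothetical helper)."""
    if len(data) < 10:
        raise gzip.BadGzipFile("Truncated header")
    magic, method, flags = struct.unpack_from("<HBB", data, 0)
    if magic != 0x8B1F:  # bytes 1f 8b read as a little-endian uint16
        raise gzip.BadGzipFile(f"Not a gzipped file ({data[:2]!r})")
    if method != 8:
        raise gzip.BadGzipFile("Unknown compression method")
    pos = 10  # end of the fixed part of the header
    if flags & FEXTRA:
        if len(data) < pos + 2:
            raise gzip.BadGzipFile("Truncated header")
        (extra_len,) = struct.unpack_from("<H", data, pos)
        pos += 2 + extra_len
    for flag in (FNAME, FCOMMENT):
        if flags & flag:
            # Both fields are zero-terminated strings.
            terminator = data.find(b"\x00", pos)
            if terminator == -1:
                raise gzip.BadGzipFile("Truncated header")
            pos = terminator + 1
    if flags & FHCRC:
        pos += 2  # skip the 16-bit header CRC
    if pos > len(data):
        raise gzip.BadGzipFile("Truncated header")
    return pos
```

Because BadGzipFile subclasses OSError, callers that currently catch EOFError for truncated headers would need to be adjusted, which is part of why the choice of error type matters here.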
I created bpo-43621 for the error issue. There should only be BadGzipFile. Once that is fixed, having only one error type will make it easier to implement some functions that are shared across the gzip.py codebase.
The issue was solved by moving code from _GzipReader to separate functions and maintaining the same error structure.