-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bz2/lzma: Compressor/Decompressor objects are only initialized in __init__ #67413
Comments
Noticed in a patch review around LZMModules/lzmamodule.c:1055 that the C-level LZMADecompressor object is being initialized in an __init_() method. It crashes if you create the object with __new__() and never call __init__(): >>> from lzma import LZMADecompressor
>>> LZMADecompressor.__new__(LZMADecompressor).decompress(bytes())
Segmentation fault (core dumped)
[Exit 139] |
A patch for this might conflict with the LZMA patch for bpo-15955, so it would be simplest to wait for that issue to be resolved first |
Similar situation in the bzip module: >>> BZ2Decompressor.__new__(BZ2Decompressor).decompress(bytes())
Segmentation fault (core dumped) |
I've upload a patch which should address the issue in both the lzma and bz2 modules. |
This is not the only issue. Calling __init__ multiple times causes leaking locks, bz2/lzma internal buffers, etc. It looks to me that the most straightforward way is using __new__ instead of __init__. |
Calling directly __new__() is very unusual, so I'm not surprised that you found bugs. Calling __init__() twice is also an unusal practice. I concur with Serhiy: implementing new instead of init should prevent such bug. I recall that some objects decided instead to raise an exception if they are not fully initialized (if init has not been called), but using new instead of init seems to be safer approach if we can implement it. In the meanwhile, I blocked by the memory leak (handle leak) in bz2 and lzma: bpo-33916. I proposed a very simple fix: PR 7843. I propose to apply this one since it's easy to backport it to all branches, whereas replacing init with new would be more difficult and risky to backport: I would suggest to only make this change in the master. What do you think? |
bz2 in 2.7 is also affected. Victor, do we want to fix the crash at all in stable branches? If yes, IMHO taking the slight risk of __init__ -> __new__ change is preferable to taking the trouble to implement the alternative backwards-compatible fix (init checks in all methods) for stable branches. |
The issue is not solved yet. |
These can be done in .__new__() method:
In .__init__() method, only set (de)?compression parameters. And prevent .__init__() method from being called multiple times. This mode works fine in my pyzstd module (A Python bindings to zstd library). |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: