Corrupt input data when using lzma to decompress a file #92018
Comments
Does it work with FORMAT_RAW?
@serhiy-storchaka, no,
It seems you are hitting a problem with how xz (and its lzma library, which Python uses) handles the end-of-stream marker. Its behavior doesn't match Igor Pavlov's spec in DOC/lzma-specification.txt:
xz and Python can decompress your file after replacing the uncompressed size at the head of your file:
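The header patch described above can be sketched in Python. This is a minimal sketch, not the exact commands from the comment: it assumes the standard 13-byte .lzma header layout (1 properties byte, 4 bytes of dictionary size, 8 bytes of little-endian uncompressed size) and that the stream ends with an end-of-stream marker, as the file in this issue does. The helper names `mark_size_unknown` and `decompress_with_end_marker` are hypothetical.

```python
import lzma
import struct

HEADER_SIZE = 13  # 1 byte properties + 4 bytes dict size + 8 bytes uncompressed size
# An uncompressed size of all 0xFF bytes (i.e. -1) means "unknown size":
# the decoder then relies on the end-of-stream marker instead.
UNKNOWN_SIZE = struct.pack("<Q", 0xFFFFFFFFFFFFFFFF)


def mark_size_unknown(buf: bytes) -> bytes:
    """Return a copy of a .lzma buffer with the size field set to -1."""
    if len(buf) < HEADER_SIZE:
        raise ValueError("too short to be a .lzma (FORMAT_ALONE) stream")
    # Bytes 5..12 hold the uncompressed size; leave everything else untouched.
    return buf[:5] + UNKNOWN_SIZE + buf[HEADER_SIZE:]


def decompress_with_end_marker(buf: bytes) -> bytes:
    """Decompress a .lzma buffer that has an end-of-stream marker.

    Assumption: the stream really does contain the marker; without it,
    decoding a stream of unknown size cannot finish cleanly.
    """
    return lzma.decompress(mark_size_unknown(buf), format=lzma.FORMAT_ALONE)
```

For the file in this issue, usage would look like `decompress_with_end_marker(open("mat.lzma", "rb").read())`. The original buffer is not modified; only a patched copy is handed to the decoder.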
Thanks @kbeldan, very helpful! After replacing the size in the header with -1, I was indeed able to decompress the file/buffer (see fangq/pyjdata@61bb884#diff-d43e2c1f1c71d64169fcaa6b02fcdaee40d8c841335a61645029ceb263fe087fR168-R169). From reading the spec you quoted above, I still consider this a bug; see https://bugs.launchpad.net/ubuntu/+source/xz-utils/+bug/1970762
Bug report
The built-in lzma module failed to decompress a valid .lzma file with the following errors:
The .lzma file was compressed using the public-domain C library written by Igor Pavlov, see
https://github.com/fangq/zmat/tree/master/src/easylzma/pavlov
Before compression, the binary buffer has a length of 1966104 bytes; after compression, the file, `mat.lzma` (can be downloaded from this link), has a length of 1536957 bytes.
When running `file mat.lzma`, it prints
I was able to decompress this file using either the C library mentioned above, or using the below NodeJS/JavaScript script (with either the `lzma-purejs` or `lzma` npm modules)
The above script correctly decoded the buffer:
However, using the Python script below, I got an error
error message:
Because Igor Pavlov's C library implements the original lzma algorithm, I believe the FORMAT_ALONE flag was used correctly.
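One way to check that a file uses the legacy .lzma layout that FORMAT_ALONE expects is to inspect its 13-byte header. The sketch below assumes the usual layout (a properties byte encoding lc, lp and pb, then a 4-byte dictionary size, then an 8-byte uncompressed size, all little-endian); `LzmaAloneHeader` and `parse_lzma_alone_header` are hypothetical names introduced for illustration.

```python
import struct
from typing import NamedTuple, Optional


class LzmaAloneHeader(NamedTuple):
    lc: int  # literal context bits
    lp: int  # literal position bits
    pb: int  # position bits
    dict_size: int
    uncompressed_size: Optional[int]  # None when the header says "unknown" (-1)


def parse_lzma_alone_header(buf: bytes) -> LzmaAloneHeader:
    """Decode the 13-byte header of a legacy .lzma stream."""
    if len(buf) < 13:
        raise ValueError("too short to be a .lzma (FORMAT_ALONE) stream")
    # "<BIQ": 1-byte properties, 4-byte dict size, 8-byte size, no padding.
    props, dict_size, size = struct.unpack("<BIQ", buf[:13])
    if props >= 9 * 5 * 5:
        raise ValueError("invalid properties byte")
    # The properties byte is (pb * 5 + lp) * 9 + lc.
    lc = props % 9
    lp = (props // 9) % 5
    pb = props // 45
    unknown = size == 0xFFFFFFFFFFFFFFFF
    return LzmaAloneHeader(lc, lp, pb, dict_size, None if unknown else size)
```

A header that reports a concrete `uncompressed_size` while the stream also carries an end-of-stream marker is the combination discussed in this thread.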
I want to mention that the test file `mat.lzma` can be correctly decompressed using `lzma -d` on Ubuntu 20.04, but it gives an error on Ubuntu 18.04 and 22.04 (both use the xz-utils based lzma). I believe this is because the two lzma commands are different.
Your environment
Python 3.6 on Ubuntu 18.04
Python 3.8 on Ubuntu 20.04
Python 3.10 on Ubuntu 22.04