I have many, many files which were compressed as part of a long-standing real-time loop using pixz.
Many are big, so I want to decompress them individually. But others are small, and I expect efficiency gains from concatenating them and then passing the concatenation to pixz for decompression.
So at root I am looking to do something like:
cat small_files*.txt.xz > file.txt.xz
pixz -d -p 4 > file.txt < file.txt.xz
Even better would be something like
pixz -d -p 4 > file.txt < small_files*.txt.xz
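Though I realize most shells reject an input redirect that globs to more than one file (bash reports an "ambiguous redirect"), so a pipe may be the closest single command:
cat small_files*.txt.xz | pixz -d -p 4 > file.txt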
Reading the index + footer of the first file, and the header and first block of the second file, just consumes from the read buffer. That's fine.
But when we dispatch the first block of the second file to a decompressor thread, we pass along the entire read buffer, even though it contains more data than that block. Then the next block doesn't get its data and blows up.
So this bug only happens with small, concatenated files. Fun!
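A minimal reproduction sketch, assuming nothing about the original files (xz is used here only to create valid .xz members):
for i in 1 2 3; do printf 'hello %s\n' "$i" | xz > "small_$i.txt.xz"; done  # three tiny .xz streams
cat small_*.txt.xz > combined.txt.xz  # concatenate the members
pixz -d -p 4 < combined.txt.xz > combined.txt  # decompress in one pass
cat combined.txt  # should print all three lines once the fix is in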
Should be fixed; please give it a try. You can do cat f1.xz f2.xz f3.xz | pixz -d > output.txt.
Note that pixz isn't really better than xz for compressing/decompressing lots of individual small files. We can only really use parallelization with large files (including tarballs that contain lots of small files).
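If the goal is throughput on many independent small files, one alternative (a sketch assuming GNU xargs with -P) is to parallelize across files with plain xz instead of within one stream:
printf '%s\0' small_files*.txt.xz | xargs -0 -n1 -P4 xz -dk  # -d decompress, -k keep the .xz inputs
That keeps each stream sequential but runs up to four xz processes at once, which is where the win is for small inputs.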