pigz compiled with zlib-ng generates errors with level=1 #536

Closed
neurolabusc opened this issue Feb 19, 2020 · 32 comments

@neurolabusc

@reddydp discovered that pigz compiled with zlib-ng generates errors with compression level 1. This is specific to zlib-ng: versions of pigz compiled against the default, Intel, or CloudFlare zlib do not show this error. Since zlib-ng and the Intel zlib use a similar strategy for level-1 compression, I wonder if this is related to a fixed Intel zlib issue.

We have replicated the issue on Linux as well as MacOS. To see the issue run the following commands:

 git clone https://github.com/neurolabusc/pigz-bench
 cd pigz-bench
 ./1compile.sh 
 ./4verify.sh

The issue is only seen with pigzNG at compression level 1:

Segmentation fault: 11  $cmd
pigzNG	1	1	33
pigzNG: skipping: /Users/chris/src/pigz-bench/corpus/asl16.nii.gz: corrupted -- invalid deflate data (invalid stored block lengths)
@nmoinvaz
Member

nmoinvaz commented Feb 20, 2020

It happens due to the deflateParams call, which changes the level mid-stream. It starts out at level 6 and then switches to level 1.

I believe the max window size for deflate level 1 is 8192 (the size of quick_dist_codes). This is set as (1 << 13) here:

#ifdef X86_QUICK_STRATEGY

When the call to deflateParams happens, w_size is never updated. This can allow dist in deflate_quick to exceed the size of the quick_dist_codes array. See the check here:

if (dist > 0 && (dist-1) < (s->w_size - 1)) {

I think that since w_size depends on the compression level, we should set it during the deflateParams call. Also, perhaps configuration_level should have a max window size setting for each algorithm, and we should use that.

Removing the check for s->last_flush != 2 in deflateParams fixes the problem. It seems the change was made in 11413c0#diff-8940271ef2146523af486ca4408361da and appears not to be part of Intel zlib.

Perhaps when changing the level, if the window size changes, it should force the block to flush.
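
A minimal sketch of the first idea, assuming the clamp happens when deflateParams changes the level (the struct, the helper name, and its placement are illustrative, not the actual zlib-ng code):

/* Illustrative stand-in for zlib-ng's deflate state; only the field
   relevant here is shown. */
typedef struct {
    unsigned int w_size;   /* LZ77 window size, derived from windowBits */
} deflate_state;

#define QUICK_MAX_WINDOW (1 << 13)  /* 8192 == size of quick_dist_codes */

/* Hypothetical helper, called when deflateParams changes the level, so
   deflate_quick can never compute a dist that indexes past the end of
   quick_dist_codes[]. */
static void clamp_window_for_quick(deflate_state *s, int new_level) {
    if (new_level == 1 && s->w_size > QUICK_MAX_WINDOW)
        s->w_size = QUICK_MAX_WINDOW;
}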

@mtl1979
Collaborator

mtl1979 commented Feb 20, 2020

If deflate_quick can't safely handle a window size larger than 8192, shouldn't we limit dist - 1 to the smaller of s->w_size - 1 and 8191 in

if (dist > 0 && (dist-1) < (s->w_size - 1)) {

...? That would allow changing the level back and forth without changing the code too much.
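
In code, that bounded check might look like this (a sketch of the suggestion; the MIN macro and the helper wrapper are illustrative, not a committed change):

#include <stdbool.h>

#define QUICK_DIST_LIMIT 8191  /* quick_dist_codes has 8192 entries */
#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* A match distance is only usable by deflate_quick if it fits both the
   current window and the quick_dist_codes table. */
static bool quick_dist_ok(unsigned int dist, unsigned int w_size) {
    return dist > 0 && (dist - 1) < MIN(w_size - 1, QUICK_DIST_LIMIT);
}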

@nmoinvaz
Member

Yes, that is possible; however, there is a problem in pigz if we don't flush the block before switching to level 1.

pigz: ../pigz.c:2449: void single_compress(int): Assertion `strm->avail_in == 0' failed.

@nmoinvaz
Member

nmoinvaz commented Feb 20, 2020

That problem goes away if we change deflate_quick like so, adding an s->strm->avail_in == 0 check:

[screenshot of the proposed change to deflate_quick]
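
The screenshot has not survived in this copy; from the surrounding discussion, the change amounts to roughly the following (a reconstruction of deflate_quick's early exit, not the exact diff):

/* Only bail out for more input when the caller genuinely has none left
   (the added s->strm->avail_in == 0 condition); previously the early
   exit could fire while input was still pending. */
if (s->lookahead < MIN_LOOKAHEAD && s->strm->avail_in == 0) {
    if (flush != Z_FINISH)
        return need_more;
    /* otherwise fall through and finish the stream */
}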

The issue is a combination of madler's changes and deflate_quick, which I don't think Intel or Cloudflare have, at least.

@mtl1979
Collaborator

mtl1979 commented Feb 20, 2020

I'm not completely sure that adding s->strm->avail_in == 0 has no negative side effects...

@nmoinvaz
Member

I did a burn test against switchlevels using various inputs, and without s->strm->avail_in == 0 I get this:

Debug\switchlevels.exe 6 88933 1 195840 2 45761
deflate() did not consume 79973 bytes of input
Debug\switchlevels.exe 6 19196 1 123183 2 72888
deflate() did not consume 3115 bytes of input
Debug\switchlevels.exe 6 61854 1 222485 2 18297
deflate() did not consume 104115 bytes of input
Debug\switchlevels.exe 6 74302 1 94616 2 57517
Debug\switchlevels.exe 6 105968 1 227353 2 13465
deflate() did not consume 103945 bytes of input
Debug\switchlevels.exe 6 23044 1 113788 2 48375

That is because deflate_quick exits early even though there is more available input. If I add that check, those warnings go away. It could also be the fault of switchlevels doing this kind of check.

@mtl1979
Collaborator

mtl1979 commented Feb 21, 2020

Obviously when the check is there, it will consume more input than without the check, but deflate* is never supposed to consume more input than necessary to finish one block of output. When Z_FINISH is used, it doesn't wait for enough data to create a full block.

@nmoinvaz
Member

nmoinvaz commented Feb 21, 2020

In my test case, it needs to flush the pending buffer before it emits an end block. Is that expected? That would mean the block is larger than s->pending_buf_size, and that a single call to deflate() would not emit a complete block because it returns need_more.

Perhaps it should be:

if (s->strm->avail_in == 0 && flush != Z_FINISH) {
    /* No more input and the caller is not finishing: close the current
       static block and ask for more input. */
    static_emit_end_block(s, 0);
    return need_more;
}

@mtl1979
Collaborator

mtl1979 commented Feb 21, 2020

Because you can call deflate() with just 1 byte of input, one call doesn't have to emit at least one full block, but it shouldn't emit more than one full block, because that could overflow internal structures.
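
To make that contract concrete, here is a minimal, self-contained use of the public zlib API that feeds deflate() one byte per call; most calls emit no output because deflate() just buffers until it can form a block (input and buffer sizes are arbitrary):

#include <stdio.h>
#include <string.h>
#include <zlib.h>

int main(void) {
    unsigned char in[] = "hello, hello, hello!";
    unsigned char out[256];
    z_stream strm;
    memset(&strm, 0, sizeof(strm));  /* zalloc/zfree/opaque = Z_NULL */
    if (deflateInit(&strm, 6) != Z_OK)
        return 1;

    strm.next_out = out;
    strm.avail_out = sizeof(out);
    for (size_t i = 0; i < sizeof(in) - 1; i++) {
        strm.next_in = in + i;
        strm.avail_in = 1;
        deflate(&strm, Z_NO_FLUSH);  /* typically consumes the byte, emits nothing */
    }
    deflate(&strm, Z_FINISH);        /* forces the final (partial) block out */
    printf("%lu bytes in -> %lu bytes out\n", strm.total_in, strm.total_out);
    deflateEnd(&strm);
    return 0;
}

Compile with -lz; with this little input, the Z_NO_FLUSH calls typically write nothing and only the Z_FINISH call produces output.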

@nmoinvaz
Member

nmoinvaz commented Feb 21, 2020

I see. In that case I don't see a problem with s->strm->avail_in == 0 allowing it to continue processing input after a flush, so long as it hasn't output a complete block, which doesn't appear to be the case in that particular loop.

@mtl1979
Collaborator

mtl1979 commented Feb 21, 2020

Maybe you should make a pull request with all the modifications so it's easier for everyone to review...

@nmoinvaz
Member

nmoinvaz commented Feb 21, 2020

The reason I suggest calling static_emit_end_block during the s->strm->avail_in == 0 loop is that without it I ran across deflateParams returning Z_BUF_ERROR here:

return Z_BUF_ERROR;

This happens when deflateParams is called without a prior call to deflateReset. The pigz test app referenced in this issue seems to call it, while switchlevels does not. deflateReset sets last_flush == -2, which prevents deflateParams from flushing the last block.

But when deflateParams does flush the last block with Z_BLOCK, deflate_quick() never completely finishes the block because of the early exit with need_more. At the least, the end block should be emitted on early exit when flush is Z_SYNC_FLUSH || Z_FULL_FLUSH || Z_BLOCK.

Adding the s->strm->avail_in == 0 check only allows it to continue processing, but there is still an edge case where avail_in reaches 0 and a flush is requested at the same time, so we need to emit the end block even with the check (see the sketch below).
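
For reference, the two call patterns being compared look like this against the public API (a schematic of the pigz and switchlevels behavior described above, not their actual source):

#include <zlib.h>

/* pigz-style: reset the stream before retuning it. deflateReset puts
   last_flush back in its initial state, so deflateParams does not try
   to flush a block that was never started. */
void retune_with_reset(z_stream *strm, int level) {
    deflateReset(strm);
    deflateParams(strm, level, Z_DEFAULT_STRATEGY);
}

/* switchlevels-style: change the level mid-stream. Here deflateParams
   must flush the pending block itself, which is the path where
   deflate_quick's early exit left the block unfinished. */
void retune_midstream(z_stream *strm, int level) {
    deflateParams(strm, level, Z_DEFAULT_STRATEGY);
}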

I will try to submit a PR in the coming days.

@nmoinvaz
Member

Here are my test cases for now: nmoinvaz@fb0324b

@neurolabusc
Author

Not sure if this is related to issue 284 (which was also with level 1, but not with a parallel compressor). I note a couple of the pull requests remain open.

Here is a minimal adaptation of my Python scripts to illustrate the issue. These scripts compile pigz using zlib-ng and validate the compression/decompression.

a_compile.py
c_decompress.py

The crucial line is for lvl in range(1, 3): if compression level 1 is tested, pigz-ng will generate an error (tested just now on x86-64 MacOS).

python_script.zip

@nmoinvaz
Member

The changes have been merged.

@neurolabusc
Author

@nmoinvaz the error persists in the develop (default) branch of zlib-ng. You can see this by running pigz-bench-python. Use the script a_compile.py to compile pigz using zlib-ng. Next, set up the scripts to test level 1 (which was disabled due to this bug):

For c_decompress.py, edit line 172
for lvl in range(2, 10):
to read
for lvl in range(1, 10):

For f_speed_size_decompress.py, edit line 81
for lvl in range(2, max_level + 1):
to read
for lvl in range(1, max_level + 1):

When you run the modified scripts, they will generate errors:

pigzng: skipping: ./temp/pigzng1_asl16.nii.gz: corrupted -- invalid deflate data (invalid distance code)
...
pigzng: skipping: ./temp/pigzng1_asl32.nii.gz: corrupted -- crc32 mismatch

It seems the issue persists with files created by pigz at level 1 when pigz is compiled with zlib-ng.

@nmoinvaz nmoinvaz reopened this May 3, 2020
@nmoinvaz
Member

nmoinvaz commented May 3, 2020

I think this may still need #549 but I haven’t tested against it myself.

@nmoinvaz
Member

@neurolabusc can you check again against the latest develop branch? Thanks.

@nmoinvaz
Member

nmoinvaz commented May 14, 2020

It looks like it may still be occurring.

@nmoinvaz
Member

@neurolabusc this should be resolved now.

@neurolabusc
Author

neurolabusc commented May 26, 2020

@nmoinvaz not yet fixed on the default (develop) branch of zlib-ng. I tested on an Intel i5 using MacOS and an AMD Ryzen 3900x using Ubuntu 20.04. To replicate my test:

  1. Download pigz-bench-python
  2. Run a_compile.py to compile various versions of pigz.
  3. Use a text editor to change a line in c_decompress.py to read for lvl in range(1, 10): instead of for lvl in range(2, 10):
  4. Run c_decompress.py
  5. Notice that gzip and pigz (regardless of zlib variant) are unable to decompress the level 1 compressed files created by the zlib-ng variant of pigz:
gzip: ./temp/pigzng1_asl16.nii.gz: uncompress failed
gzip	9279	221.97
pigzSystem: skipping: ./temp/pigzng1_asl32.nii.gz: corrupted -- invalid deflate data (invalid distance code)
pigzSystem: skipping: ./temp/pigzng1_asl16.nii.gz: corrupted -- invalid deflate data (invalid distance code)
pigzSystem	7981	258.09
pigzCloudflare: skipping: ./temp/pigzng1_asl32.nii.gz: corrupted -- invalid deflate data (invalid distance code)
pigzCloudflare: skipping: ./temp/pigzng1_asl16.nii.gz: corrupted -- invalid deflate data (invalid distance code)
pigzCloudflare	7807	263.84
pigzng: skipping: ./temp/pigzng1_asl32.nii.gz: corrupted -- invalid deflate data (invalid distance code)

On the other hand, the latest zlib-ng does appear to have caught up with Cloudflare compression on both systems (Ryzen 3900x shown):

[chart: Ryzen 3900x compression benchmark]

If one runs the unedited version of c_decompress.py (which only tests levels 2..9), decompression is fine, and zlib-ng is ~20% faster than the other pigz variants and twice as fast as gzip.

@nmoinvaz
Member

nmoinvaz commented Jun 1, 2020

@neurolabusc can you confirm whether or not level 1 deflate_quick works when the -p 1 command-line argument is used? It sets pigz to single-threaded. For me there is no issue when one thread is used. It has been a battle to get an environment set up to debug pigz on Windows, but that is where I am now.

@neurolabusc
Author

neurolabusc commented Jun 1, 2020

@nmoinvaz at least some errors are still present in the default (develop) branch of zlib-ng, even if pigz is only run in single-threaded mode. Here is how I tested this on my Linux system:

  1. Download the benchmark:
    git clone https://github.com/neurolabusc/pigz-bench-python
  2. Open f_speed_size_decompress.py with a text editor:
  • Insert ' -p 1 ' into line 376 and optionally remove the quiet-mode decompression flag (-q) so it reads:
    exes.append({'exe': exe, 'uncompress': ' -f -k -d ', 'compress': ' -p 1 -q -f -k -', 'max_level': 9, 'ext': '.gz' })
  • Edit line 271 to test from level 1 instead of 2, so it reads:
    for lvl in range(1, max_level+1):
  • Edit line 81 to test from level 1 instead of 2, so it reads:
    for lvl in range(1, max_level + 1):
  3. Compile the pigz variants:
    a_compile.py
  4. Test the pigz variants:
    f_speed_size_decompress.py
  5. Notice gzip is unhappy with the level 1 compressed file created by the zlib-ng variant of pigz:
    gzip: ./temp/pigzng1_asl32.nii.gz: invalid compressed data--format violated

Likewise, the terminal reveals the zlib-ng variant of pigz is unable to extract the file it created:

../exe/pigzng -k -d pigzng1_asl32.nii.gz
pigzng: skipping: pigzng1_asl32.nii.gz: corrupted -- invalid deflate data (invalid distance code)
pigzng: abort: internal threads error

@nmoinvaz
Member

nmoinvaz commented Jun 2, 2020

Sometimes setting ZLIB_VERNUM to 0x1240 seems to make the problem disappear.

@nmoinvaz
Member

nmoinvaz commented Jun 2, 2020

I also noticed that single_compress passes Z_NO_FLUSH at the end of the ZLIB_VERNUM >= 0x1260 block, while parallel_compress passes Z_BLOCK. Perhaps a bug?

@nmoinvaz
Member

nmoinvaz commented Jun 2, 2020

It appears that with Z_BLOCK it shouldn't write a block unless there is s->strm->avail_in. This is the reason for the s->sym_next check at the bottom of deflate_fast (sketched below). @neurolabusc, can you test against my tests/def-v1 branch and see if the error happens?
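
A sketch of that guard (sym_next and FLUSH_BLOCK follow zlib's deflate.c conventions; this shows the shape of the fix, not the committed code):

/* At the bottom of a deflate_fast-style loop: when stopping at a block
   boundary (e.g. Z_BLOCK), only emit a block if there are buffered
   symbols to put in it. Flushing with s->sym_next == 0 is what wrote
   the empty blocks that corrupted the level-1 streams above. */
if (s->sym_next)
    FLUSH_BLOCK(s, 0);  /* 0 = not the final block */
return block_done;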

@neurolabusc
Author

@nmoinvaz this passes my tests for both neuroimaging data and the Silesia corpus (shown below; pigznm is your tests/def-v1 build).

DecompressMethod	ms	MB/s
gzip	25305	226.13
pigznm	19474	293.84
pigzCloudflare	24176	236.69

[chart: silesia_speed_size]

@neurolabusc
Author

neurolabusc commented Jun 2, 2020

@nmoinvaz while your branch worked on my MacOS 4-core, 8-thread MacBook, I was not so lucky with my Ubuntu 20.04 Ryzen 3900X. It handles my neuroimaging corpus without issue, but reliably generates a segmentation fault when asked to compress the x-ray file from the Silesia corpus at level 1. This can be seen from the command line, outside the script:

./pigznm -1 -k -f ./x-ray
Segmentation fault (core dumped)
./pigznm -1 -k -f -p 1 ./x-ray
munmap_chunk(): invalid pointer
Aborted (core dumped)
./pigznm -1 -k -f -p 2 ./x-ray
Segmentation fault (core dumped)

For my MacOS system, all values of -p work (including default) except -p 1:

./pigznm -1 -k -f ./x-ray
./pigznm -1 -k -f -p 1 ./x-ray
Segmentation fault: 11
./pigznm -1 -k -f -p 2 ./x-ray

CloudFlare and System zlib work in all these situations.

@nmoinvaz
Member

nmoinvaz commented Jun 3, 2020

@neurolabusc, can you please try again with tests/def-v1? I have made some additional changes. I don't really have the ability to track down some of these hardware-specific segfaults.

@nmoinvaz
Member

nmoinvaz commented Jun 3, 2020

I don't experience a crash with the Silesia x-ray file with the latest tests/def-v1, even with -p 1.

@neurolabusc
Author

Your recent commit works on all my computers with both the Silesia corpus and my neuroimaging data. If this can be merged as a pull request, I think this will close the issue. Thanks!

Below are the results for the 3900X

DecompressMethod	ms	MB/s
gzip	11045	186.49
pigzCloudflare	6378	322.92
pigzSystem	6388	322.41
pigzNM	5276	390.39

Silesia Compression:
[chart: silesia_speed_size]
Neuroimaging Compression:
[chart: corpus_speed_size]

@nmoinvaz nmoinvaz added the bug label Jun 3, 2020
@nmoinvaz
Member

Should be fixed now in latest develop.

neurolabusc added a commit to neurolabusc/pigz-bench-python that referenced this issue Jun 19, 2020
nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue Sep 19, 2020
…ginal because we are using switchlevels and only compressing certain parts of the original.
nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue Sep 19, 2020
nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue Sep 20, 2020
nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue Sep 23, 2020
nmoinvaz added a commit to nmoinvaz/zlib-ng that referenced this issue Sep 23, 2020