x86_64 deflate_quick fails on chromium-benchmark fireworks.jpeg #284
I still see it failing at level 1 with
|
It passes on aarch64 and fails on x86_64, which points to a bug in Intel's patches for |
When replacing |
It would be good to know why deflate_quick fails... That pretty much requires enabling debug output and possibly adding more sanity checks. |
The benchmark uses an estimate of the compress size to resize a buffer before decompressing.
The benchmark passes if we set the estimated compression size to double of the input:
When printing the original estimated size, we see far less than 5%:
The fix is to special case intel's deflate_quick to have a different function to estimate its output buffer size. |
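A benchmark-side special case could look roughly like this. It is only a sketch: the function name, the 2x factor for level 1, and the ~5% default headroom are assumptions based on the observations above, not the actual fix:

```c
#include <stddef.h>

/* Hypothetical benchmark-side output-buffer estimate: use a generous
 * bound at level 1, where deflate_quick may expand incompressible data
 * well beyond the usual ~5% headroom. */
static size_t estimate_compressed_size(size_t input_size, int level) {
    if (level == 1)
        return 2 * input_size + 64;            /* double the input, plus slack */
    return input_size + input_size / 20 + 64;  /* ~5% headroom (assumed) */
}
```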
There is at least one more bug: with lcet10.txt and other inputs the benchmark returns -3 (Z_DATA_ERROR), which is different from the jpeg case, where it was returning 0 instead of 1. |
Maybe we should adjust one of the shifts to make it return a slightly larger value... That way we don't completely butcher the logic inside compressBound...
This should allow a 6.25~6.28% increase. |
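For reference, a compressBound-style estimate that adds a `sourceLen >> 4` term (6.25%) on top of the smaller shift terms lands in the 6.25~6.28% range mentioned above. This is only a sketch to illustrate the arithmetic; the exact shifts zlib-ng ended up using are an assumption here, not its actual code:

```c
/* Sketch of a compressBound-style estimate with an extra sourceLen >> 4
 * (6.25%) headroom term. Together with the >> 12 and >> 14 terms this
 * yields roughly 6.25-6.28% of slack. Hypothetical, not zlib-ng's code. */
static unsigned long quick_bound(unsigned long sourceLen) {
    return sourceLen + (sourceLen >> 4) + (sourceLen >> 12) +
           (sourceLen >> 14) + (sourceLen >> 25) + 13;
}
```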
I did some further investigation and it's not entirely clear to me who's at fault. Here's my patch to alter the method that the chromium zlib benchmark compresses data that passes all tests and requires no changes to zlib-ng.
This patch was inspired by minigzip.c's method of decompressing in chunks of 3 * 16KB instead of attempting to do it all in one call. The patch passes for all the input files at level 1 and level 9 with all 3 wrapper types. It's also a bit unexpected that all other algorithm variants have no trouble here. @sebpop and I can try to push for changes to the benchmark, but I'm worried that may just be hiding a deeper set of bugs. |
I've made a small change to my previous patch and everything continues to pass:
The big difference is that I am now using zlib-ng's |
I think the change in behavior needs to be documented for the compression algorithm on x86 at level 1. |
For those interested, here is my latest patch to the Google zlib benchmark. It is less prone to failure, especially if you alter the
|
Why can't this be fixed instead by changing |
Hello @Myriachan, I think that's one option. My concern with such a patch is that it may mask a deeper bug or issue. I can't say with confidence whether x86 at level 1 is operating correctly, and I don't have deep knowledge of Intel's |
Does PR #382 affect this problem at all? It seems like a real bug in deflate_quick, but I am not clear on whether it fixes this problem or not. |
@bmrzycki Thank you for checking, I really appreciate it :) I am hoping that this bug is related to #390, since that seems to erroneously emit an extra end block each time it runs out of data in the in-buffer. |
@Dead2 you're welcome. :) I manually patched c0f0acf with the code from PR #390:
and I still see the same |
With fireworks.jpeg I am getting the following error using deflate level 1 and mem level 1:
When I comment out the following line, it works:
|
Hello @nmoinvaz, I applied the code in #396 on top of 5d4d630:
And the test still failed on x86_64 (zlib wrapper):
|
Worst case for
That is, 1 bit extra for each byte + 3 bits for block start + 10 bits for block end + 7 bits for round-up, converted to bytes, plus the input size and 18 bytes for the gzip header and footer. Here is a text file that produces this worst-case result: Buffer size must be about |
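That formula can be written as a small bound function (a sketch; the function name is hypothetical):

```c
#include <stddef.h>

/* Worst-case gzip output size per the formula above: 1 extra bit per
 * input byte + 3 bits for block start + 10 bits for block end + 7 bits
 * of round-up slack, converted to bytes, plus the input itself and an
 * 18-byte gzip header/footer. */
static size_t quick_worst_case(size_t len) {
    return len + (len + 3 + 10 + 7) / 8 + 18;
}
```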
@neurolabusc is still indicating this triggers a segfault in pigz at least: afni/afni#138 (comment) |
I think that might be a separate issue fixed in #541. This change made in |
Hello @nmoinvaz. I am unable to perform this testing now. I'm at a new company and do not have formal clearance from management to contribute to this project yet. In the meantime you should be able to test it yourself on any x86_64 Linux host. The code diff to fix the errors is from #284 (comment) on top of the Chromium zlib benchmark C++ code: https://github.com/chromium/chromium/blob/master/third_party/zlib/contrib/bench/zlib_bench.cc . Without the patch we see the failure. Linking to zlib.a is a simple compile. The zlib_bench is a standalone C++ program:
The source files for compression/decompression come from Google's snappy testdata directory: https://github.com/google/snappy/tree/master/testdata . You can execute the test with the following:
|
Thanks @sebpop. I have tested it on x86 myself but I was looking for independent verification since this issue has been open a long time. Congrats on your new position! Hopefully they will let you contribute. |
This should be resolved now that #541 has been merged. |
In the past I have seen some errors on x86_64 with the snappy chromium-benchmark at compression level 1. Check that these errors are fixed.