Case where brotli quality 9 compresses better than quality 11 #411
I found a test case where brotli with quality 9 compresses better than with quality 11. This seemed worth reporting in case there might be some convenient way to detect and handle this case (short of compressing the whole file both ways to check).
Test case and compressed versions: brotli-test.tar.gz
(This test case came from a locally modified version of bsdiff that doesn't use bzip2 internally, so that I could test it with brotli. The bsdiff patch goes from jquery-2.2.3.min.js to jquery-2.2.4.min.js.)
Compressing jquery.ubsdiff with brotli quality 11 produces 264 bytes; compressing with quality 9 produces 252 bytes.
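For anyone who wants to check other files for the same behavior, here is a minimal sketch of the brute-force comparison (compress once per quality level and flag any level that loses to a lower one). The helper names `sizes_by_level` and `non_monotonic_levels` are made up for illustration, and the compressor is passed in as a callable so the sketch does not depend on any particular binding.

```python
from typing import Callable, Dict, Iterable, List


def sizes_by_level(data: bytes,
                   compress: Callable[[bytes, int], bytes],
                   levels: Iterable[int]) -> Dict[int, int]:
    """Compress `data` once per level and return {level: compressed size}."""
    return {level: len(compress(data, level)) for level in levels}


def non_monotonic_levels(sizes: Dict[int, int]) -> List[int]:
    """Return the levels whose output is larger than that of some lower level."""
    flagged = []
    best = None  # smallest size seen so far among lower levels
    for level in sorted(sizes):
        if best is not None and sizes[level] > best:
            flagged.append(level)
        best = sizes[level] if best is None else min(best, sizes[level])
    return flagged
```

With the Python brotli bindings installed, this could be called as `sizes_by_level(data, lambda d, q: brotli.compress(d, quality=q), range(9, 12))`. For the sizes reported here, `non_monotonic_levels({9: 252, 11: 264})` flags quality 11.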
I looked into this a bit, and it seems that brotli's literal cost estimates are thrown off by the many zeros in the test file. In fact, brotli quality 10 performs ~10% worse than quality 9, whereas 9 and 11 are rather close. When inspecting the compressed files, I noticed that quality 10 produces fewer commands and more literal insertions than quality 11. Raising the estimated literal cost whenever it is smaller than 1.0 in BrotliEstimateBitCostsForLiterals mitigates the issue to some extent, and actually improves compression at quality 11 for the provided test file.
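To illustrate why long runs of zeros can make literals look almost free, here is a simplified model of a histogram-based literal cost estimate. It is not brotli's actual BrotliEstimateBitCostsForLiterals code: the function name, the window size, and the `min_cost` parameter are assumptions made for the sketch, and `min_cost` is just one possible reading of "raising the cost when it is below 1.0".

```python
import math
from collections import Counter
from typing import List


def estimate_literal_costs(data: bytes,
                           window: int = 2000,
                           min_cost: float = 0.0) -> List[float]:
    """Estimate a per-byte cost in bits from a trailing-window byte histogram.

    cost[i] = log2(bytes in window / occurrences of data[i] in window),
    raised to `min_cost` whenever it falls below that value.
    """
    costs = []
    histogram = Counter()
    for i, byte in enumerate(data):
        histogram[byte] += 1
        if i >= window:
            # Drop the byte that just slid out of the trailing window.
            histogram[data[i - window]] -= 1
        in_window = min(i + 1, window)
        cost = math.log2(in_window / histogram[byte])
        costs.append(max(cost, min_cost))
    return costs
```

In this model a run of zero bytes is estimated at 0 bits per literal, so inserting literals looks cheaper than any copy command; with `min_cost=1.0` each of those literals is charged at least one bit.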
The challenge is to adjust the cost estimates without regressing benchmarks on other corpora.