Case where brotli quality 9 compresses better than quality 11 #411

Open
joshtriplett opened this Issue Aug 14, 2016 · 5 comments

joshtriplett commented Aug 14, 2016

I found a test case where brotli with quality 9 compresses better than with quality 11. This seemed worth reporting in case there might be some convenient way to detect and handle this case (short of compressing the whole file both ways to check).

Test case and compressed versions: brotli-test.tar.gz

(This test case came from a locally modified version of bsdiff that doesn't use bzip2 internally, so that I could test it with brotli. The bsdiff goes from jquery-2.2.3.min.js to jquery-2.2.4.min.js.)

Compressing jquery.ubsdiff with brotli quality 11 produces 264 bytes; compressing with quality 9 produces 252 bytes.
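
For anyone who wants to reproduce the comparison, here is a minimal sketch of the brute-force check mentioned above (compress at several qualities and keep the smallest result). The helper names `smallest_by_quality` and `brotli_compress` are made up for illustration, and the wrapper assumes the third-party Python `brotli` binding is installed and accepts a `quality` keyword.

```python
from typing import Callable, Dict, Iterable, Tuple


def smallest_by_quality(
    data: bytes,
    compress: Callable[[bytes, int], bytes],
    qualities: Iterable[int],
) -> Tuple[int, Dict[int, int]]:
    """Compress `data` once per quality and report which output is smallest.

    Returns (best_quality, {quality: compressed_size}). On a tie, the
    lower quality wins, since it is normally the cheaper one to run.
    """
    sizes = {q: len(compress(data, q)) for q in qualities}
    best = min(sizes, key=lambda q: (sizes[q], q))
    return best, sizes


def brotli_compress(data: bytes, quality: int) -> bytes:
    """Hypothetical wrapper; assumes the Python `brotli` package is available."""
    import brotli  # imported lazily so the helper above works without it

    return brotli.compress(data, quality=quality)
```

Reading `jquery.ubsdiff` into `data` and calling `smallest_by_quality(data, brotli_compress, (9, 10, 11))` should pick quality 9 if the sizes reported above reproduce. This is exactly the expensive approach the report hopes to avoid, so it is only useful as a check.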

Contributor

eustas commented Aug 15, 2016

Thanks for the report. I'll investigate it later. Feel free to ping me in a month or so if I do not report back.

RajuSuranagi commented Jun 25, 2018

@eustas, since it has been more than a month and @joshtriplett didn't ping you, is there any update on this question? I am more curious about the number 11 than anything else. Why 11?

dikmax commented Jun 29, 2018

@RajuSuranagi Homage to Spın̈al Tap I suppose.

mraszyk commented Jul 10, 2018

I looked into this a bit, and it seems that brotli's literal cost estimates are skewed by the many zeros in the test file. In fact, brotli quality 10 performs ~10% worse than quality 9, whereas 9 and 11 are rather close. When inspecting the compressed files, I noticed that quality 10 produces fewer commands and more literal insertions than 11. Raising the estimated literal cost when it is smaller than 1.0 in BrotliEstimateBitCostsForLiterals mitigates the issue to some extent, and actually improves compression at quality 11 for the provided test file.

The challenge would be to not break benchmarks on other corpora by playing with the cost estimates.
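
To illustrate the effect, here is a simplified sketch of an entropy-style literal cost estimate with an optional lower bound. It is not brotli's actual code: it uses one histogram over the whole input, the function name `estimate_literal_costs` and the `min_cost` parameter are made up, and clamping to the bound is only one possible reading of "raising" the cost.

```python
import math
from collections import Counter
from typing import List, Optional


def estimate_literal_costs(data: bytes, min_cost: Optional[float] = None) -> List[float]:
    """Estimate the cost in bits of emitting each byte of `data` as a literal.

    Each byte value costs log2(total / count) bits, i.e. its Shannon
    information under the input's own byte histogram. If `min_cost` is
    given, estimates below it are clamped up to it (an assumption made
    for illustration).
    """
    if not data:
        return []
    total = len(data)
    table = {}
    for value, count in Counter(data).items():
        cost = math.log2(total / count)
        if min_cost is not None and cost < min_cost:
            cost = min_cost
        table[value] = cost
    return [table[value] for value in data]
```

With an input that is mostly zeros, a zero byte is estimated at well under one bit, so literal runs of zeros look almost free compared with a copy command. Passing `min_cost=1.0` removes that bias for the dominant byte while leaving the rarer bytes untouched.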

RajuSuranagi commented Jul 11, 2018

Thanks for the insight!