less aggressive lz4 estimated uncompressed size #2185

Merged
merged 2 commits into from Jan 15, 2019
mhowlett
Contributor

I believe 4x is a good constant here. Based on the measurements in the cited reference, it will cover the majority of cases. On the flip side, any upward adjustment results in meaningfully more memory usage in the scenario I've been testing.

Including contentSize in the LZ4 frame is marked as a TODO in the Java code. We should encourage them to add this, as it would be very beneficial for librdkafka.
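
For readers unfamiliar with the trade-off, here is a minimal sketch of the "estimate 4x, then reallocate if needed" strategy. It is not the librdkafka code and does not call the LZ4 library: a toy run-length decoder (pairs of count and byte) stands in for the real decompressor, and `EST_RATIO`, `outbuf_t`, `outbuf_reserve` and `toy_rle_decode` are hypothetical names chosen for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Assumed initial expansion factor, mirroring the 4x constant discussed here. */
#define EST_RATIO 4

/* Hypothetical growable output buffer. */
typedef struct {
    unsigned char *data;
    size_t len;   /* bytes written so far */
    size_t cap;   /* bytes allocated */
    int reallocs; /* how many times the initial estimate was too small */
} outbuf_t;

/* Make room for `need` more bytes, doubling the capacity until it fits.
 * Returns 0 on success, -1 on overflow or allocation failure. */
static int outbuf_reserve(outbuf_t *ob, size_t need) {
    size_t cap = ob->cap ? ob->cap : 1;
    unsigned char *p;

    if (ob->cap - ob->len >= need)
        return 0; /* the estimate was large enough */

    while (cap - ob->len < need) {
        if (cap > (size_t)-1 / 2)
            return -1;
        cap *= 2;
    }
    p = realloc(ob->data, cap);
    if (!p)
        return -1;
    ob->data = p;
    ob->cap = cap;
    ob->reallocs++;
    return 0;
}

/* Toy stand-in for a decompressor: input is (count, byte) pairs.
 * The output buffer starts at EST_RATIO times the input size and
 * grows only when a run does not fit. The caller frees ob->data. */
static int toy_rle_decode(const unsigned char *in, size_t inlen, outbuf_t *ob) {
    size_t i;

    assert(in != NULL && ob != NULL);
    if (inlen % 2 != 0 || inlen > (size_t)-1 / EST_RATIO)
        return -1;

    ob->len = 0;
    ob->reallocs = 0;
    ob->cap = inlen * EST_RATIO;
    if (ob->cap == 0)
        ob->cap = 1;
    ob->data = malloc(ob->cap);
    if (!ob->data)
        return -1;

    for (i = 0; i < inlen; i += 2) {
        size_t run = in[i];
        if (outbuf_reserve(ob, run) != 0) {
            free(ob->data);
            ob->data = NULL;
            return -1;
        }
        memset(ob->data + ob->len, in[i + 1], run);
        ob->len += run;
    }
    return 0;
}
```

With a small estimate, well-compressed input costs an occasional reallocation; with a very large one, every decompression pays for memory it rarely uses, which is the trade-off weighed above.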

* default (4x compression) and reallocate if needed
* More info on max size: http://stackoverflow.com/a/25751871/1821055
* More info on lz4 compression ratios seen for different data sets:
* http://dev.ti.com/tirex/content/simplelink_msp432p4_sdk_1_50_00_12/docs/lz4/users_guide/docguide.llQpgm/benchmarking.html
Contributor

That link suggests something closer to 2x as the "default" rate, but 4x is fine and is a lot better than 255x :)

@edenhill edenhill merged commit 3cf6848 into confluentinc:master Jan 15, 2019