T64 full bit-transpose variant #5742

Merged
merged 5 commits into from Jul 2, 2019

4ertus2
Contributor

@4ertus2 4ertus2 commented Jun 25, 2019

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Category (leave one):

  • Improvement

Short description (up to few sentences):
Full bit-transpose variant for the T64 codec. It could lead to better compression with zstd.

Detailed description (optional):
Continuation of #5557 and #5731. Adds an option to the T64 codec: T64('byte') or T64('bit'), which makes it generate byte-transposed or bit-transposed data respectively. The default is the 'byte' variant.
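
To illustrate the difference between the two layouts, here is a minimal Python sketch. It is not the codec's implementation: the function names, the fixed value width, and the little-endian, least-significant-bit-first packing are assumptions chosen for the example. Byte transposition groups the k-th byte of every value together, while bit transposition groups the k-th bit of every value together.

```python
def byte_transpose(values, width=4):
    """Group the k-th byte of every value together (little-endian byte order)."""
    raw = [v.to_bytes(width, "little") for v in values]
    return bytes(raw[i][k] for k in range(width) for i in range(len(values)))


def bit_transpose(values, width=4):
    """Group the k-th bit of every value together, packing 8 bits per output byte."""
    nbits = width * 8
    # Bit plane 0 of all values first, then bit plane 1, and so on.
    bits = [(v >> k) & 1 for k in range(nbits) for v in values]
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for j, b in enumerate(bits[i:i + 8]):
            byte |= b << j  # least significant bit first (an assumption)
        out.append(byte)
    return bytes(out)


# Hypothetical input: small values stored in 2 bytes each.
print(byte_transpose([1, 2, 3, 4], width=2).hex())
print(bit_transpose([1] * 8, width=1).hex())
```

In both layouts, values that share their high bytes or high bits turn into long runs of identical bytes, which is the kind of input a general-purpose compressor handles well.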

@4ertus2 4ertus2 changed the title from "T64" to "T64 full bit-transpose variant" Jun 25, 2019
@4ertus2
Contributor Author

4ertus2 commented Jun 25, 2019

I've compressed a DateTime column of real data (one month of data, column not in the key) with different variants. Compressed sizes:

lz4             833569732
zstd            515097904
t64(byte)+lz4   407128098
t64(bit)+lz4    472855596
t64(byte)+zstd  345330458
t64(bit)+zstd   318123008

delta+lz4       407962488
delta+zstd      241532181

It looks like full bit-transpose helps zstd and does not help lz4.
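
The relative effect can be read off the table with a little arithmetic. The sketch below only reuses the sizes posted above; the `relative` helper is a hypothetical name introduced for the example. On these numbers the bit variant is about 8% smaller than the byte variant under zstd and about 16% larger under lz4, while delta+zstd is the smallest entry in the table.

```python
# Sizes copied from the table above.
sizes = {
    "lz4": 833569732,
    "zstd": 515097904,
    "t64(byte)+lz4": 407128098,
    "t64(bit)+lz4": 472855596,
    "t64(byte)+zstd": 345330458,
    "t64(bit)+zstd": 318123008,
    "delta+lz4": 407962488,
    "delta+zstd": 241532181,
}


def relative(a, b):
    """Size of variant a as a fraction of the size of variant b."""
    return sizes[a] / sizes[b]


for codec in ("lz4", "zstd"):
    ratio = relative(f"t64(bit)+{codec}", f"t64(byte)+{codec}")
    print(f"{codec}: bit/byte = {ratio:.3f}")

print("smallest:", min(sizes, key=sizes.get))
```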

@alexey-milovidov
Member

alexey-milovidov commented Jun 26, 2019

You can also compare with different levels of ZSTD, because level 1 is used by default.
When comparing by compressed size, you should also mention compression and decompression speed.
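
A comparison along these lines could be scripted as below. This is only a sketch under stated assumptions: it uses Python's standard `zlib` module as a stand-in, because zstd and lz4 bindings are not part of the standard library, and the input is synthetic per-second timestamps rather than the real column. The `measure` helper is a hypothetical name, and the levels shown are zlib levels, not ZSTD levels.

```python
import time
import zlib


def measure(data: bytes, level: int):
    """Return (compressed_size, compress_seconds, decompress_seconds) for one level."""
    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    t1 = time.perf_counter()
    restored = zlib.decompress(packed)
    t2 = time.perf_counter()
    assert restored == data  # round-trip check
    return len(packed), t1 - t0, t2 - t1


# Hypothetical input: 86400 consecutive per-second timestamps,
# stored as 4-byte little-endian integers.
sample = b"".join((1561420800 + i).to_bytes(4, "little") for i in range(86400))

for level in (1, 6, 9):
    size, c_time, d_time = measure(sample, level)
    print(f"level={level} size={size} compress={c_time:.4f}s decompress={d_time:.4f}s")
```

Reporting size together with both timings for each level makes the trade-off visible rather than the size alone.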

@alexey-milovidov
Member

alexey-milovidov commented Jun 26, 2019

PS. What's in stress test?

@4ertus2
Contributor Author

4ertus2 commented Jun 26, 2019

> PS. What's in stress test?

Code: 202. DB::Exception: Received from localhost:9000, ::1. DB::Exception: Too many simultaneous queries. Maximum: 100.

@alexey-milovidov
Member

alexey-milovidov commented Jun 28, 2019

> Code: 202. DB::Exception: Received from localhost:9000, ::1. DB::Exception: Too many simultaneous queries. Maximum: 100.

FYI, this message cannot be related to the stress test failure. The stress test fails only in case of sanitizer reports or a server crash.

Member

@alexey-milovidov alexey-milovidov left a comment

.

@alexey-milovidov alexey-milovidov merged commit 7ea3320 into ClickHouse:master Jul 2, 2019
22 of 23 checks passed
@stavrolia stavrolia added the pr-improvement Pull request with some product improvements label Jul 3, 2019
3 participants