Bitpacking storage compression #2679

Merged
merged 57 commits into from
Dec 3, 2021

samansmink
Contributor

@samansmink samansmink commented Nov 26, 2021

Implemented bitpacking in the storage compression framework. The bitpacking itself is done with the FastPFor library, from which I moved the necessary code into a ./third_party folder. I ran some benchmarks for evaluation, which show that read performance suffers significantly for some queries. Q06 and Q11 in particular show a serious slowdown, though the overall times are not too bad. I'm interested to hear what you think!
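For readers unfamiliar with the technique, here is a minimal sketch of bitpacking for unsigned 32-bit values. It is not the FastPFor code used in this PR; the helper names (MinimumBitWidth, Pack, Unpack) are hypothetical and the bit-by-bit loops favor clarity over speed.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Smallest number of bits that can represent every value in `values`.
// Hypothetical helper, not a DuckDB or FastPFor function.
inline unsigned MinimumBitWidth(const std::vector<uint32_t> &values) {
    uint32_t max_value = 0;
    for (uint32_t v : values) {
        if (v > max_value) {
            max_value = v;
        }
    }
    unsigned width = 0;
    while (max_value != 0) {
        ++width;
        max_value >>= 1;
    }
    return width;
}

// Store each value in `width` bits, back to back, least significant bit first.
inline std::vector<uint8_t> Pack(const std::vector<uint32_t> &values, unsigned width) {
    std::vector<uint8_t> out((values.size() * width + 7) / 8, 0);
    std::size_t bit_pos = 0;
    for (uint32_t v : values) {
        for (unsigned b = 0; b < width; ++b, ++bit_pos) {
            if ((v >> b) & 1u) {
                out[bit_pos / 8] |= static_cast<uint8_t>(1u << (bit_pos % 8));
            }
        }
    }
    return out;
}

// Inverse of Pack: read `count` values of `width` bits each.
inline std::vector<uint32_t> Unpack(const std::vector<uint8_t> &packed, std::size_t count, unsigned width) {
    std::vector<uint32_t> values(count, 0);
    std::size_t bit_pos = 0;
    for (std::size_t i = 0; i < count; ++i) {
        for (unsigned b = 0; b < width; ++b, ++bit_pos) {
            if ((packed[bit_pos / 8] >> (bit_pos % 8)) & 1u) {
                values[i] |= (1u << b);
            }
        }
    }
    return values;
}
```

With this sketch, eight values in the range 0..7 need 3 bits each, so they occupy 3 bytes instead of the 32 bytes used by eight plain 32-bit integers.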

I think there's an optimization possibility for INT32 and INT64 which would allow skipping the decompression buffer and decompressing straight into the result vector in some cases. I could add that to this pull request or do it in a separate one; I wanted to get your opinion on the code first!
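The idea of bypassing the decompression buffer could look roughly like the sketch below. It assumes values are decoded in fixed-size groups and that a group can be written straight into the result only when the scan covers the whole group; both the group size and the DecodeGroup stand-in (which just copies, in place of a real unpacking routine) are assumptions for illustration, not the PR's code.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical group size; the real implementation may use a different one.
constexpr std::size_t GROUP_SIZE = 32;

// Stand-in for a real group decoder. Here a "compressed" group is simply
// GROUP_SIZE plain values, so decoding is a copy.
inline void DecodeGroup(const uint32_t *group, uint32_t *dst) {
    std::copy(group, group + GROUP_SIZE, dst);
}

// Scan `count` values starting at index `start` into `result`.
// `stored.size()` is assumed to be a multiple of GROUP_SIZE.
// Returns how many groups were decoded directly into `result`.
inline std::size_t Scan(const std::vector<uint32_t> &stored, std::size_t start, std::size_t count,
                        uint32_t *result) {
    uint32_t buffer[GROUP_SIZE];
    std::size_t direct = 0;
    std::size_t produced = 0;
    while (produced < count) {
        std::size_t pos = start + produced;
        std::size_t offset = pos % GROUP_SIZE;
        const uint32_t *group = stored.data() + (pos - offset);
        std::size_t remaining = count - produced;
        if (offset == 0 && remaining >= GROUP_SIZE) {
            // Whole group requested: decode straight into the result.
            DecodeGroup(group, result + produced);
            produced += GROUP_SIZE;
            ++direct;
        } else {
            // Partial group: decode into the buffer, then copy the needed slice.
            DecodeGroup(group, buffer);
            std::size_t n = std::min(GROUP_SIZE - offset, remaining);
            std::copy(buffer + offset, buffer + offset + n, result + produced);
            produced += n;
        }
    }
    return direct;
}
```

In this sketch only scans that start or end in the middle of a group pay for the extra copy; fully covered groups skip the buffer.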

Evaluation

TPC-H SF1 (DuckDB in persistent mode)

| Query | Without Bitpacking | With Bitpacking | Slowdown % |
|---|---|---|---|
| 1 | 0.0564074 | 0.062565 | 10.92% |
| 2 | 0.0105152 | 0.011236 | 6.85% |
| 3 | 0.0346146 | 0.0359086 | 3.74% |
| 4 | 0.0710824 | 0.0802214 | 12.86% |
| 5 | 0.0232784 | 0.0264334 | 13.55% |
| 6 | 0.008228 | 0.0123128 | 49.65% |
| 7 | 0.070887 | 0.0745774 | 5.21% |
| 8 | 0.0242554 | 0.0294072 | 21.24% |
| 9 | 0.2923278 | 0.28811 | -1.44% |
| 10 | 0.0687286 | 0.067365 | -1.98% |
| 11 | 0.0058776 | 0.010745 | 82.81% |
| 12 | 0.0668858 | 0.0737374 | 10.24% |
| 13 | 0.0408016 | 0.0405922 | -0.51% |
| 14 | 0.0146562 | 0.0181906 | 24.12% |
| 15 | 0.0307516 | 0.0389414 | 26.63% |
| 16 | 0.0480024 | 0.048662 | 1.37% |
| 17 | 0.104161 | 0.1041996 | 0.04% |
| 18 | 0.1079024 | 0.1099522 | 1.90% |
| 19 | 0.0501276 | 0.0540862 | 7.90% |
| 20 | 0.0473648 | 0.052386 | 10.60% |
| 21 | 0.1891522 | 0.182864 | -3.32% |
| 22 | 0.0359452 | 0.036143 | 0.55% |
| average TPC-H | 0.06372514545 | 0.06630165455 | 12.86% (mean of the per-query slowdowns) |
| total TPC-H | 1.4019532 | 1.4586364 | 4.04% |

To get an estimate of the real-world compression ratio, I ran the following query on the TPC-H SF1 lineitem table.

select count(distinct block_id) from pragma_storage_info('lineitem') where segment_type not in('VARCHAR', 'VALIDITY');

| Compression | Distinct blocks | Compression ratio |
|---|---|---|
| Uncompressed | 1369 | 1.00 |
| Only RLE | 1320 | 1.04 |
| RLE + BP (byte-aligned) | 538 | 2.54 |
| RLE + BP | 477 | 2.87 |
| Only BP (byte-aligned) | 564 | 2.43 |
| Only BP | 511 | 2.68 |

The overall size of the TPC-H SF1 file is:

| Implementation | Overall storage size |
|---|---|
| without BP | 1.1G |
| BP byte-aligned | 876M |
| BP | 855M |

@pdet
Contributor

pdet commented Nov 30, 2021

Hey @samansmink ! Cool stuff!
I had a quick look at your tests, and I think there are none with the SQL compression options?
I guess it would also be cool to support combinations of compressions in there as well!
#2473

@samansmink
Contributor Author

@pdet Ah I missed the SQL compression option completely, I added a bitpacking column to the existing test.
Multiple compression methods on the same column is interesting for sure! I'll give it some thought!

@samansmink
Contributor Author

Reran TPC-H at SF10 (filesize with/without bp 8.4G/11G)

benchmark no_bitpacking with_bitpacking with_bitpacking_diff
benchmark/tpch/sf10/q01.benchmark 0.51997950 0.54696925 5%
benchmark/tpch/sf10/q02.benchmark 0.08547680 0.08904300 4%
benchmark/tpch/sf10/q03.benchmark 0.43510460 0.45110340 4%
benchmark/tpch/sf10/q04.benchmark 0.85312320 0.88854640 4%
benchmark/tpch/sf10/q05.benchmark 0.26582940 0.27807340 5%
benchmark/tpch/sf10/q06.benchmark 0.08049180 0.09928340 23%
benchmark/tpch/sf10/q07.benchmark 0.98799560 0.98849440 0%
benchmark/tpch/sf10/q08.benchmark 0.29346280 0.30395680 4%
benchmark/tpch/sf10/q09.benchmark 4.63613540 4.62440660 -0%
benchmark/tpch/sf10/q10.benchmark 0.43968820 0.45437000 3%
benchmark/tpch/sf10/q11.benchmark 0.04535040 0.04977500 10%
benchmark/tpch/sf10/q12.benchmark 0.75616780 0.77537620 3%
benchmark/tpch/sf10/q13.benchmark 0.65411400 0.65292860 -0%
benchmark/tpch/sf10/q14.benchmark 0.14892700 0.16555840 11%
benchmark/tpch/sf10/q15.benchmark 0.33285100 0.37085260 11%
benchmark/tpch/sf10/q16.benchmark 1.20767760 1.23794460 3%
benchmark/tpch/sf10/q17.benchmark 1.68935180 1.68888780 -0%
benchmark/tpch/sf10/q18.benchmark 1.59503580 1.57809420 -1%
benchmark/tpch/sf10/q19.benchmark 0.43726780 0.45906720 5%
benchmark/tpch/sf10/q20.benchmark 0.61146240 0.63571480 4%
benchmark/tpch/sf10/q21.benchmark 1.88267740 1.90106120 1%
benchmark/tpch/sf10/q22.benchmark 0.47079560 0.47187200 0%
TPCH TOTAL 18.42896590 18.71137925 2%
TPCH AVG 0.83768027 0.85051724 2%
benchmark/micro/compression/bitpacking_read.benchmark 6.19573150 7.34437775 19%

Collaborator

@Mytherin Mytherin left a comment

Thanks for the updates! Looks excellent. Some more minor comments, then I think this is ready to merge:

PRAGMA force_compression = 'bitpacking'

statement ok
CREATE TABLE test (id INTEGER, l INTEGER[]);

Could we add one more extra test:

  • A few really long lists (>vector size, you can use the LIST aggregate function to create this, e.g. SELECT LIST(i) FROM range(10000) tbl(i), you can use GROUP BY to create multiple long lists)

@samansmink
Contributor Author

@Mytherin TPC-H SF10 latest results added as bitpacking_current:

benchmark no_bitpacking with_bitpacking_old with_bitpacking_current with_bitpacking_old_diff with_bitpacking_current_diff
benchmark/tpch/sf10/q01.benchmark 0.51997950 0.54696925 0.52500350 5% 1%
benchmark/tpch/sf10/q02.benchmark 0.08547680 0.08904300 0.08890440 4% 4%
benchmark/tpch/sf10/q03.benchmark 0.43510460 0.45110340 0.42735180 4% -2%
benchmark/tpch/sf10/q04.benchmark 0.85312320 0.88854640 0.86999780 4% 2%
benchmark/tpch/sf10/q05.benchmark 0.26582940 0.27807340 0.26286400 5% -1%
benchmark/tpch/sf10/q06.benchmark 0.08049180 0.09928340 0.08039640 23% -0%
benchmark/tpch/sf10/q07.benchmark 0.98799560 0.98849440 0.99694380 0% 1%
benchmark/tpch/sf10/q08.benchmark 0.29346280 0.30395680 0.28696540 4% -2%
benchmark/tpch/sf10/q09.benchmark 4.63613540 4.62440660 4.67923960 -0% 1%
benchmark/tpch/sf10/q10.benchmark 0.43968820 0.45437000 0.44265600 3% 1%
benchmark/tpch/sf10/q11.benchmark 0.04535040 0.04977500 0.04618320 10% 2%
benchmark/tpch/sf10/q12.benchmark 0.75616780 0.77537620 0.77153380 3% 2%
benchmark/tpch/sf10/q13.benchmark 0.65411400 0.65292860 0.65441340 -0% 0%
benchmark/tpch/sf10/q14.benchmark 0.14892700 0.16555840 0.14612360 11% -2%
benchmark/tpch/sf10/q15.benchmark 0.33285100 0.37085260 0.34588980 11% 4%
benchmark/tpch/sf10/q16.benchmark 1.20767760 1.23794460 1.27569320 3% 6%
benchmark/tpch/sf10/q17.benchmark 1.68935180 1.68888780 1.68836480 -0% -0%
benchmark/tpch/sf10/q18.benchmark 1.59503580 1.57809420 1.58604440 -1% -1%
benchmark/tpch/sf10/q19.benchmark 0.43726780 0.45906720 0.43757320 5% 0%
benchmark/tpch/sf10/q20.benchmark 0.61146240 0.63571480 0.62259700 4% 2%
benchmark/tpch/sf10/q21.benchmark 1.88267740 1.90106120 1.87933240 1% -0%
benchmark/tpch/sf10/q22.benchmark 0.47079560 0.47187200 0.47776080 0% 1%

@Mytherin
Collaborator

Mytherin commented Dec 3, 2021

Excellent results!

@Mytherin Mytherin merged commit 29e6c28 into duckdb:master Dec 3, 2021