Bitpacking storage compression #2679

samansmink · 2021-11-26T09:12:28Z

Implemented bitpacking in the storage compression framework. The bitpacking itself is done with the FastPFor library, from which I moved the necessary code into a ./third_party folder. I ran some benchmarks for evaluation, which show that read performance suffers significantly for some queries. Q06 and Q11 in particular slow down considerably (about 50% and 83%), although the overall times are not too bad. I'm interested to hear what you think!

I think there's an optimization opportunity for INT32 and INT64 that would, in some cases, allow skipping the decompression buffer and decompressing straight into the result vector. I could add that to this pull request or do it in a separate one; I wanted to get your opinion on the code first!
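For readers unfamiliar with the technique, here is a minimal Python sketch of bitpacking one group of signed integers. It is illustrative only: the function names are hypothetical, and the bit layout is a simple little-endian concatenation, not the layout FastPFor or this PR actually uses. The idea is to find the smallest width that fits every value in the group, store only that many bits per value, and sign-extend on the way back out.

```python
def signed_bit_width(values):
    """Smallest two's-complement width (in bits) that fits every value."""
    width = 1  # a signed value needs at least its sign bit
    for v in values:
        magnitude = v if v >= 0 else ~v  # ~v == -v - 1 for negatives
        width = max(width, magnitude.bit_length() + 1)
    return width


def pack(values, width):
    """Concatenate the low `width` bits of each value into a byte string."""
    mask = (1 << width) - 1
    acc = 0
    for i, v in enumerate(values):
        acc |= (v & mask) << (i * width)
    return acc.to_bytes((len(values) * width + 7) // 8, "little")


def unpack(data, width, count):
    """Extract `count` values of `width` bits each and sign-extend them."""
    acc = int.from_bytes(data, "little")
    mask = (1 << width) - 1
    sign_bit = 1 << (width - 1)
    out = []
    for i in range(count):
        raw = (acc >> (i * width)) & mask
        out.append((raw ^ sign_bit) - sign_bit)  # sign extension
    return out


group = [3, -4, 0, 2, -1]
width = signed_bit_width(group)
packed = pack(group, width)
assert unpack(packed, width, len(group)) == group
```

In this toy example the five values fit in 3 bits each, so the group takes 2 bytes instead of the 20 bytes five INT32 values would occupy. The width has to be stored alongside each group so the reader knows how to unpack it.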

Evaluation

TPC-H SF1 (DuckDB in persistent mode)

Query	Without Bitpacking	With Bitpacking	Slowdown %
1	0.0564074	0.062565	10.92%
2	0.0105152	0.011236	6.85%
3	0.0346146	0.0359086	3.74%
4	0.0710824	0.0802214	12.86%
5	0.0232784	0.0264334	13.55%
6	0.008228	0.0123128	49.65%
7	0.070887	0.0745774	5.21%
8	0.0242554	0.0294072	21.24%
9	0.2923278	0.28811	-1.44%
10	0.0687286	0.067365	-1.98%
11	0.0058776	0.010745	82.81%
12	0.0668858	0.0737374	10.24%
13	0.0408016	0.0405922	-0.51%
14	0.0146562	0.0181906	24.12%
15	0.0307516	0.0389414	26.63%
16	0.0480024	0.048662	1.37%
17	0.104161	0.1041996	0.04%
18	0.1079024	0.1099522	1.90%
19	0.0501276	0.0540862	7.90%
20	0.0473648	0.052386	10.60%
21	0.1891522	0.182864	-3.32%
22	0.0359452	0.036143	0.55%

Query	Without Bitpacking	With Bitpacking	Slowdown %
average TPC-H	0.06372514545	0.06630165455	12.86%
total TPC-H	1.4019532	1.4586364	4.04%
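The two summary slowdown figures disagree because they are computed differently: 12.86% matches the mean of the 22 per-query slowdown percentages, while 4.04% is the slowdown of the total runtime, which is dominated by the long-running queries. A short Python sketch that recomputes both from the per-query timings in the table above (variable names are mine, not from the benchmark tooling):

```python
# (without bitpacking, with bitpacking) in seconds, TPC-H SF1, Q1..Q22
timings = [
    (0.0564074, 0.062565), (0.0105152, 0.011236), (0.0346146, 0.0359086),
    (0.0710824, 0.0802214), (0.0232784, 0.0264334), (0.008228, 0.0123128),
    (0.070887, 0.0745774), (0.0242554, 0.0294072), (0.2923278, 0.28811),
    (0.0687286, 0.067365), (0.0058776, 0.010745), (0.0668858, 0.0737374),
    (0.0408016, 0.0405922), (0.0146562, 0.0181906), (0.0307516, 0.0389414),
    (0.0480024, 0.048662), (0.104161, 0.1041996), (0.1079024, 0.1099522),
    (0.0501276, 0.0540862), (0.0473648, 0.052386), (0.1891522, 0.182864),
    (0.0359452, 0.036143),
]


def slowdown_pct(before, after):
    """Relative slowdown of `after` versus `before`, in percent."""
    return (after / before - 1.0) * 100.0


# Mean of the per-query percentages: every query counts equally.
mean_of_slowdowns = sum(slowdown_pct(b, a) for b, a in timings) / len(timings)

# Slowdown of the summed runtime: long queries dominate.
total_before = sum(b for b, _ in timings)
total_after = sum(a for _, a in timings)
total_slowdown = slowdown_pct(total_before, total_after)

print(f"mean of per-query slowdowns: {mean_of_slowdowns:.2f}%")
print(f"slowdown of total runtime:   {total_slowdown:.2f}%")
```

Short queries such as Q06 and Q11 weigh heavily in the first number but contribute little to the second.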

To get an estimate of the real-world compression ratio, I ran the following query on the TPC-H SF1 lineitem table.

select count(distinct block_id) from pragma_storage_info('lineitem') where segment_type not in('VARCHAR', 'VALIDITY');

Compression	Distinct blocks	Compression ratio
Uncompressed	1369	1.00
Only RLE	1320	1.04
RLE + BP (byte-aligned)	538	2.54
RLE + BP	477	2.87
Only BP (byte-aligned)	564	2.43
Only BP	511	2.68
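The last column is the uncompressed block count divided by the block count of each configuration. As a quick check, a Python sketch that reproduces it from the distinct-block counts above (the dictionary and variable names are just for illustration):

```python
# Distinct blocks reported by the pragma_storage_info query per configuration.
distinct_blocks = {
    "Uncompressed": 1369,
    "Only RLE": 1320,
    "RLE + BP (byte-aligned)": 538,
    "RLE + BP": 477,
    "Only BP (byte-aligned)": 564,
    "Only BP": 511,
}

baseline = distinct_blocks["Uncompressed"]

# Ratio > 1 means fewer blocks than the uncompressed baseline.
ratios = {name: round(baseline / blocks, 2)
          for name, blocks in distinct_blocks.items()}

for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f}")
```

Note that this counts whole blocks of the non-VARCHAR, non-VALIDITY segments only, so it is an estimate of the ratio for those columns, not for the file as a whole.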

The overall size of the TPC-H SF1 file is:

Implementation	Overall storage size
without BP	1.1G
BP byte aligned	876M
BP	855M

pdet · 2021-11-30T12:28:56Z

Hey @samansmink ! Cool stuff!
I had a quick look at your tests, and I think there are none with the SQL compression options?
I guess it would also be cool to support combinations of compressions in there as well!
#2473

samansmink · 2021-11-30T17:51:47Z

@pdet Ah, I missed the SQL compression option completely. I added a bitpacking column to the existing test.
Multiple compression methods on the same column are interesting for sure! I'll give it some thought!

samansmink · 2021-12-01T16:09:44Z

Reran TPC-H at SF10 (file size with/without BP: 8.4G/11G)

benchmark	no_bitpacking	with_bitpacking	with_bitpacking_diff
benchmark/tpch/sf10/q01.benchmark	0.51997950	0.54696925	5%
benchmark/tpch/sf10/q02.benchmark	0.08547680	0.08904300	4%
benchmark/tpch/sf10/q03.benchmark	0.43510460	0.45110340	4%
benchmark/tpch/sf10/q04.benchmark	0.85312320	0.88854640	4%
benchmark/tpch/sf10/q05.benchmark	0.26582940	0.27807340	5%
benchmark/tpch/sf10/q06.benchmark	0.08049180	0.09928340	23%
benchmark/tpch/sf10/q07.benchmark	0.98799560	0.98849440	0%
benchmark/tpch/sf10/q08.benchmark	0.29346280	0.30395680	4%
benchmark/tpch/sf10/q09.benchmark	4.63613540	4.62440660	-0%
benchmark/tpch/sf10/q10.benchmark	0.43968820	0.45437000	3%
benchmark/tpch/sf10/q11.benchmark	0.04535040	0.04977500	10%
benchmark/tpch/sf10/q12.benchmark	0.75616780	0.77537620	3%
benchmark/tpch/sf10/q13.benchmark	0.65411400	0.65292860	-0%
benchmark/tpch/sf10/q14.benchmark	0.14892700	0.16555840	11%
benchmark/tpch/sf10/q15.benchmark	0.33285100	0.37085260	11%
benchmark/tpch/sf10/q16.benchmark	1.20767760	1.23794460	3%
benchmark/tpch/sf10/q17.benchmark	1.68935180	1.68888780	-0%
benchmark/tpch/sf10/q18.benchmark	1.59503580	1.57809420	-1%
benchmark/tpch/sf10/q19.benchmark	0.43726780	0.45906720	5%
benchmark/tpch/sf10/q20.benchmark	0.61146240	0.63571480	4%
benchmark/tpch/sf10/q21.benchmark	1.88267740	1.90106120	1%
benchmark/tpch/sf10/q22.benchmark	0.47079560	0.47187200	0%
TPCH TOTAL	18.42896590	18.71137925	2%
TPCH AVG	0.83768027	0.85051724	2%
benchmark/micro/compression/bitpacking_read.benchmark	6.19573150	7.34437775	19%

Mytherin

Thanks for the updates! Looks excellent. Some more minor comments, then I think this is ready to merge:

Mytherin · 2021-11-30T11:09:27Z

test/sql/storage/compression/bitpacking/bitpacking_lists.test_coverage

+PRAGMA force_compression = 'bitpacking'
+
+statement ok
+CREATE TABLE test (id INTEGER, l INTEGER[]);


Could we add one more extra test:

A few really long lists (>vector size, you can use the LIST aggregate function to create this, e.g. SELECT LIST(i) FROM range(10000) tbl(i), you can use GROUP BY to create multiple long lists)

src/storage/compression/bitpacking.cpp

samansmink · 2021-12-03T16:29:44Z

@Mytherin TPC-H SF10 latest results added as bitpacking_current:

benchmark	no_bitpacking	with_bitpacking_old	with_bitpacking_current	with_bitpacking_old_diff	with_bitpacking_current_diff
benchmark/tpch/sf10/q01.benchmark	0.51997950	0.54696925	0.52500350	5%	1%
benchmark/tpch/sf10/q02.benchmark	0.08547680	0.08904300	0.08890440	4%	4%
benchmark/tpch/sf10/q03.benchmark	0.43510460	0.45110340	0.42735180	4%	-2%
benchmark/tpch/sf10/q04.benchmark	0.85312320	0.88854640	0.86999780	4%	2%
benchmark/tpch/sf10/q05.benchmark	0.26582940	0.27807340	0.26286400	5%	-1%
benchmark/tpch/sf10/q06.benchmark	0.08049180	0.09928340	0.08039640	23%	-0%
benchmark/tpch/sf10/q07.benchmark	0.98799560	0.98849440	0.99694380	0%	1%
benchmark/tpch/sf10/q08.benchmark	0.29346280	0.30395680	0.28696540	4%	-2%
benchmark/tpch/sf10/q09.benchmark	4.63613540	4.62440660	4.67923960	-0%	1%
benchmark/tpch/sf10/q10.benchmark	0.43968820	0.45437000	0.44265600	3%	1%
benchmark/tpch/sf10/q11.benchmark	0.04535040	0.04977500	0.04618320	10%	2%
benchmark/tpch/sf10/q12.benchmark	0.75616780	0.77537620	0.77153380	3%	2%
benchmark/tpch/sf10/q13.benchmark	0.65411400	0.65292860	0.65441340	-0%	0%
benchmark/tpch/sf10/q14.benchmark	0.14892700	0.16555840	0.14612360	11%	-2%
benchmark/tpch/sf10/q15.benchmark	0.33285100	0.37085260	0.34588980	11%	4%
benchmark/tpch/sf10/q16.benchmark	1.20767760	1.23794460	1.27569320	3%	6%
benchmark/tpch/sf10/q17.benchmark	1.68935180	1.68888780	1.68836480	-0%	-0%
benchmark/tpch/sf10/q18.benchmark	1.59503580	1.57809420	1.58604440	-1%	-1%
benchmark/tpch/sf10/q19.benchmark	0.43726780	0.45906720	0.43757320	5%	0%
benchmark/tpch/sf10/q20.benchmark	0.61146240	0.63571480	0.62259700	4%	2%
benchmark/tpch/sf10/q21.benchmark	1.88267740	1.90106120	1.87933240	1%	-0%
benchmark/tpch/sf10/q22.benchmark	0.47079560	0.47187200	0.47776080	0%	1%

Mytherin · 2021-12-03T16:36:23Z

Excellent results!

samansmink added 30 commits November 9, 2021 17:49

set up some boiler plate code for bitpacking compression, does not actually compress yet

866b220

bitpacking compression somewhat complete, some tests fail though

3fbd216

added tests for bitpacking

9af385a

added missing types, fixed several issues

86d03a8

finished bitpacking tests

ebe6c0f

refactor code

5ae6ff7

refactored code to store bitpacking width for a vector_size amount of tuples

f111c9a

Bitpacking analyze estimate written, test added for compression selection

d07775b

wip rewriting to support lemire bitpacking algorithm

74500a1

WIP bitpacking, grouped compression now works with placeholder

03510a4

Wip adding lemire bitpacking lib, basic aligned bitpacking works now

99b0755

cleanup and fix of bitwidth detection

f629b17

first properly working version, still needs restructure and more tests

4453575

refactor

f524d0c

some refactoring plus optimization of bit width calculation

32bdda5

refactor plus some small fixes

9049a4a

dummy commit for ci

2a64a12

added fastpfor to amalgamation build

7f492e0

rewrote include

ad81b16

formatting fixes

2e393bd

Merge branch 'master' into adding-compression-functions

841f8ca

fixes for CI plus adding support for int16 and int8

2565fa8

fixed statistics updating bug plus some refactoring

f310156

more CI fixing

32a6da9

fixed incorrect bitpacking analysis result

813e273

added bitpacking tests for tpch and tpc-ds, fixed bitpacking selection test for VZ2

dd76ec0

typo

f688f34

more CI fixes

fdbad1a

Merge branch 'master' into bitpacking-storage-compression

c38e03e

added option to run interpreted benchmarks in disk storage mode

67f991d

Sam Ansmink and others added 10 commits November 28, 2021 13:10

added test testing bitpacking limits

32d3505

added missing list tests for bitpacking

d708754

removed without mask functions

a4bd897

updated tests, added decompression optimization, boolean types

80f3f2b

fixed wrong cast in assertion

164e170

Merge branch 'master' of github.com:duckdb/duckdb into bitpacking-storage-compression

26a3f4a

format fix

fd5a79e

Refactored, limited possible widths, added optimization for uncompressed data

34c476b

Missing header

3411ec7

Disabled optimization for non standard vector sizes

2f6718a

samansmink added 5 commits November 30, 2021 16:18

Added int16 bitpacking

68abedc

Added int8 bitpacking

bb0df14

Removed pre-casting code, no longer needed with bitpacking supporting int8 and int16

50a77df

Improved bitpacking tests, added header i missed from last commits

29698b8

Added bitpacking to create test, added an extra d_assert

807a50c

samansmink added 3 commits December 1, 2021 10:13

Fixed Incorrect test name, fixed unintended file rename

2d48b4f

Refactored bitpacking code to reduce code duplication, added some comments

452d4f6

Reverted some of the refactoring due to GCC not liking template parameters as argument

8aa8065

samansmink requested a review from Mytherin December 1, 2021 16:10

Mytherin reviewed Dec 1, 2021

samansmink added 4 commits December 1, 2021 20:52

Added sign extension skipping optimization

a0f79fc

Added large list test

b6902b7

Implemented faster sign extend

ad59fe6

Hopefully fixed sign extend for 32bit

6e13bfb

Mytherin merged commit 29e6c28 into duckdb:master Dec 3, 2021
