
Column-level compression block sizes #55201

Merged
merged 15 commits into ClickHouse:master on Jan 23, 2024

Conversation

canhld94
Contributor

@canhld94 canhld94 commented Oct 2, 2023

Closes #54821

Changelog category (leave one):

  • New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):

Certain settings (currently min_compress_block_size and max_compress_block_size) can now be specified at column-level where they take precedence over the corresponding table-level setting. Example: CREATE TABLE tab (col String SETTINGS (min_compress_block_size = 81920, max_compress_block_size = 163840)) ENGINE = MergeTree ORDER BY tuple();

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

@canhld94
Contributor Author

canhld94 commented Oct 6, 2023

@alexey-milovidov @nikitamikhaylov can I get some reviews for this feature :D
I think the syntax may not be the best one yet, but the feature itself is definitely useful.

We have tested it on our production table.

  • The table xxxx_html_local has one big column, xxxx_html, so we previously set the compress block size to 64MB to achieve a high compression ratio. But this results in high memory consumption during SELECT queries, even when we don't select the big column.
  • The new table xxxx_html_local2 has the same schema but no table-level compression block size; only the column xxxx_html has a column-level compress block size (see the sketch after this list).
  • Result: the compression ratio is almost the same, but memory usage is about 100x lower when querying xxxx_html_local2.
  • If we don't tune the table-level compress block size (i.e. use the default values), the compression ratio is only ~5.x.
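For illustration, a minimal hypothetical sketch of the column-level variant, following the per-column SETTINGS syntax from the changelog entry above (the url column is a placeholder and the real production schema is not shown here; 67108864 bytes = 64 MiB):

-- Hypothetical sketch: only the big column gets 64 MiB compress blocks
CREATE TABLE xxxx_html_local2
(
    url String,
    xxxx_html String SETTINGS (min_compress_block_size = 67108864, max_compress_block_size = 67108864)
)
ENGINE = MergeTree
ORDER BY tuple();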
-- Compression ratio
┌─table────────────┬─count()─┬─compressed_sz─┬─uncompressed_sz─┬──────────────ratio─┐
│ xxxx_html_local  │      14 │ 228.20 GiB    │ 3.43 TiB        │ 15.385512604598656 │
│ xxxx_html_local2 │      12 │ 226.07 GiB    │ 3.42 TiB        │ 15.504667251480628 │
└──────────────────┴─────────┴───────────────┴─────────────────┴────────────────────┘
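The query behind this table is not shown in the thread; a sketch of how such per-table numbers can be computed from system.parts (assuming both tables live in the current database) could look like:

-- Sketch: compressed/uncompressed sizes and compression ratio per table, active parts only
SELECT
    table,
    count(),
    formatReadableSize(sum(data_compressed_bytes)) AS compressed_sz,
    formatReadableSize(sum(data_uncompressed_bytes)) AS uncompressed_sz,
    sum(data_uncompressed_bytes) / sum(data_compressed_bytes) AS ratio
FROM system.parts
WHERE active AND table LIKE 'xxxx_html_local%'
GROUP BY table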

-- SELECT * on the original table with min_compress_block_size = 64MB and max_compress_block_size = 64MB
-- set on the table level
SELECT * EXCEPT xxxx_html
FROM xxxx_html_local
WHERE _partition_id = '9-4-0'
SETTINGS max_threads = 16
FORMAT `Null`

Query id: 4304bfcd-a3e4-4d95-b5fa-96becee33ad0

Ok.

0 rows in set. Elapsed: 1.105 sec. Processed 5.53 million rows, 725.27 MB (5.00 million rows/s., 656.11 MB/s.)
Peak memory usage: 7.68 GiB.

-- SELECT * on the new table with min_compress_block_size = 64MB and max_compress_block_size = 64MB
-- set on the column `xxxx_html` level
SELECT * EXCEPT xxxx_html
FROM xxxx_html_local2
WHERE _partition_id = '9-4-0'
SETTINGS max_threads = 16
FORMAT `Null`

Query id: 55e7290d-a6ef-4a96-badd-7569f30fb409

Ok.

0 rows in set. Elapsed: 0.172 sec. Processed 5.53 million rows, 719.40 MB (32.19 million rows/s., 4.19 GB/s.)
Peak memory usage: 33.01 MiB.


@alexey-milovidov
Member

@canhld94 The default compress block size is from 64 KB to 1 MB, and it is strange to see that it could lead to a difference of 7 GB in memory usage. Is it possible that you have also changed the defaults? I don't think it's ever needed to increase the compress block size. It is one of the "factory" settings that are not expected to be changed.

@canhld94
Contributor Author

canhld94 commented Oct 8, 2023

@canhld94 The default compress block size is from 64 KB to 1 MB, and it is strange to see that it could lead to a difference of 7 GB in memory usage. Is it possible that you have also changed the defaults?

@alexey-milovidov Maybe the example is not clear. With the default compress block size, memory consumption is normal, but the compression ratio is not good. Previously, we needed to increase the table-level min_compress_block_size and max_compress_block_size to 64MB, but that results in high memory consumption during SELECT queries. We tried to tune min_compress_block_size, max_compress_block_size and index_granularity_bytes accordingly, but none of those combinations achieved as good a compression ratio as a fixed block size of 64MB.

I don't think it's ever needed to increase the compress block size. It is one of the "factory" settings that are not expected to be changed.

In our use case, the table has a big string column (e.g. the whole HTML source of a website). If we use the default compression block size, the compression ratio is 5-6, which is too low for our needs.
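For reference, a sketch of that previous table-level approach (placeholder schema; 67108864 bytes = 64 MiB):

-- Sketch: raising the compress block size for the whole table (the old approach)
CREATE TABLE xxxx_html_local
(
    url String,
    xxxx_html String
)
ENGINE = MergeTree
ORDER BY tuple()
SETTINGS min_compress_block_size = 67108864, max_compress_block_size = 67108864;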

@canhld94
Contributor Author

@alexey-milovidov I've revised the example in my previous comment as well. Hopefully it is clearer now.

@alexey-milovidov alexey-milovidov added the can be tested Allows running workflows for external contributors label Oct 17, 2023
@robot-ch-test-poll robot-ch-test-poll added the pr-feature Pull request with new product feature label Oct 17, 2023
@robot-ch-test-poll
Contributor

robot-ch-test-poll commented Oct 17, 2023

This is an automated comment for commit 7246655 with description of existing statuses. It's updated for the latest CI running

❌ Click here to open a full report in a separate page

Successful checks
Check name | Description | Status
AST fuzzer | Runs randomly generated queries to catch program errors. The build type is optionally given in parenthesis. If it fails, ask a maintainer for help | ✅ success
ClickBench | Runs ClickBench (https://github.com/ClickHouse/ClickBench/) with an instant-attach table | ✅ success
ClickHouse build check | Builds ClickHouse in various configurations for use in further steps. You have to fix the builds that fail. Build logs often have enough information to fix the error, but you might have to reproduce the failure locally. The cmake options can be found in the build log, grepping for cmake. Use these options and follow the general build process | ✅ success
Compatibility check | Checks that the clickhouse binary runs on distributions with old libc versions. If it fails, ask a maintainer for help | ✅ success
Docker server and keeper images | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Docs check | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Fast tests | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Flaky tests | Checks if newly added or modified tests are flaky by running them repeatedly, in parallel, with more randomization. Functional tests are run 100 times with address sanitizer and additional randomization of thread scheduling. Integration tests are run up to 10 times. If at least once a new test has failed, or was too long, this check will be red. We don't allow flaky tests, read the doc | ✅ success
Install packages | Checks that the built packages are installable in a clear environment | ✅ success
Mergeable Check | Checks if all other necessary checks are successful | ✅ success
Performance Comparison | Measures changes in query performance. The performance test report is described in detail here. In square brackets are the optional part/total tests | ✅ success
SQLancer | Fuzzing tests that detect logical bugs with the SQLancer tool | ✅ success
Sqllogic | Runs clickhouse on the sqllogic test set against sqlite and checks that all statements pass | ✅ success
Stateful tests | Runs stateful functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc. | ✅ success
Stress test | Runs stateless functional tests concurrently from several clients to detect concurrency-related errors | ✅ success
Style check | There's no description for the check yet, please add it to tests/ci/ci_config.py:CHECK_DESCRIPTIONS | ✅ success
Unit tests | Runs the unit tests for different release types | ✅ success

Check name | Description | Status
CI running | A meta-check that indicates the running CI. Normally, it's in success or pending state. The failed status indicates some problems with the PR | ⏳ pending
Integration tests | The integration tests report. In parenthesis the package type is given, and in square brackets are the optional part/total tests | ❌ failure
Stateless tests | Runs stateless functional tests for ClickHouse binaries built in various configurations -- release, debug, with sanitizers, etc. | ❌ failure
Upgrade check | Runs stress tests on the server version from the last release and then tries to upgrade it to the version from the PR. It checks if the new server can successfully start up without any errors, crashes or sanitizer asserts | ❌ failure

@botsbreeder

@alexey-milovidov we have tables where one column is a big string (100KB on average) and the other columns don't have that much data. Setting min_compress_block_size to 64MB for the string column increases the compression ratio almost twofold. But if min_compress_block_size is applied to the whole table, all other columns also use a 64MB compress block, which slows down SELECT queries and makes them require more memory. The solution is to apply min_compress_block_size to one column only, and it works well in our fork (high compression ratio AND fast queries AND lower memory usage).

@canhld94
Contributor Author

The Upgrade check failure is #57893.
The Stateless failure in 00002_log_and_exception_messages_formatting looks unrelated.

@canhld94
Contributor Author

@alexey-milovidov we changed the syntax to declare the compress block size as parameters of CODEC to be more ClickHouse-friendly. The documentation is also updated.

CREATE TABLE t
(
        ...
        big_column String CODEC(16384, 16384)(ZSTD(9, 24)),
)
ENGINE = MergeTree ORDER BY tuple();

This change is backward incompatible, so I also added a query setting allow_declare_codec_with_parameters that defaults to false for now; we can enable it (or remove it) after a few stable releases.
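A minimal usage sketch of this proposal, assuming the gating setting is enabled per session as described (this CODEC-with-parameters syntax was later replaced by the per-column SETTINGS syntax, see below):

-- Sketch: enable the compatibility setting, then use the CODEC-with-parameters syntax
SET allow_declare_codec_with_parameters = 1;

CREATE TABLE t
(
    id UInt64,
    big_column String CODEC(16384, 16384)(ZSTD(9, 24))
)
ENGINE = MergeTree
ORDER BY tuple();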

@UnamedRus
Contributor

UnamedRus commented Dec 20, 2023

Honestly, I preferred the old syntax: it's more self-describing and also allows supporting more per-column settings (low_cardinality or potential dictionary support for ZSTD, I'm looking at you).

        big_column String CODEC(ZSTD(9, 24)) SETTINGS (min_compress_block_size = 16384, max_compress_block_size = 16384),

BTW, there is another potential syntax option (inspired by YDB):

CREATE TABLE series_with_families (
    series_id Uint64,
    title Utf8,
    series_info Utf8 FAMILY family_large,
    release_date Uint64,
    PRIMARY KEY (series_id),
    FAMILY default (
        DATA = "ssd",
        COMPRESSION = "off"
    ),
    FAMILY family_large (
        DATA = "rot",
        COMPRESSION = "lz4"
    )
);

@canhld94
Contributor Author

canhld94 commented Dec 21, 2023

@UnamedRus yes, the old syntax is more declarative and more generic, but its scope goes beyond the main purpose of this PR (to have an explicit compress block size for each column). For now we want to push this PR upstream first.

Re. column-level settings: it's definitely a needed feature, but different people will prefer different syntaxes and we may need a lot of discussion. I still advocate my previously proposed syntax and will try to push it upstream.

COLUMN TYPE ATTRIBUTES SETTINGS (<list of settings>),

But it'll be in another issue and PR.

@rschu1ze rschu1ze self-assigned this Jan 11, 2024
@rschu1ze
Member

rschu1ze commented Jan 11, 2024

@canhld94 @UnamedRus After reading this PR, #54821 and #36428, I think there is some value in per-column min/max block sizes when the columns have very different average byte sizes per value ("big string column" use case), and I'd like to help get this merged.

Settings in ClickHouse come in global form (configured via config file), session/query form (SET ... = ... or SELECT ... SETTINGS) or as MergeTree settings (CREATE TABLE ... SETTINGS ...). Some specific settings, e.g. min/max_compress_block_size, exist at both the session and the MergeTree level. This PR would make it possible for specific (not all) MergeTree settings to be overridden at the column level. But we should be careful which settings we really expose at column level; it is a balance between code complexity and benefit. E.g. low_cardinality_max_dictionary_size (#36428) sounds too obscure to me.
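To make those levels concrete, a small sketch (values are illustrative; the global level is configured in the server config/profiles rather than in SQL):

-- Session level
SET min_compress_block_size = 65536;

-- Query level
SELECT 1 SETTINGS min_compress_block_size = 65536;

-- MergeTree (table) level
CREATE TABLE t_levels
(
    x String
)
ENGINE = MergeTree
ORDER BY tuple()
SETTINGS min_compress_block_size = 65536;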

Re syntax: we should strive for maximum consistency. CODEC(min_compress_block_size, max_compress_block_size) isn't consistent with the SETTINGS clause as used elsewhere. I also agree with @UnamedRus that such syntax isn't easily extensible; it is also hard to decipher for casual users.

COLUMN TYPE ATTRIBUTES SETTINGS (<list of settings>)
But it'll be in another issue and PR.

I like that. Is there perhaps a PR already? I am afraid that if we implement CODEC(min_compress_block_size, max_compress_block_size), it will be orthogonal to the better SETTINGS syntax, so we had better start with the SETTINGS syntax right away.

@alexey-milovidov
Member

I like the syntax:

big_column String CODEC(ZSTD(9, 24)) SETTINGS (min_compress_block_size = 16384, max_compress_block_size = 16384),

Let's finish this PR and merge...
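Combining that line with the changelog example, a complete illustrative statement in the agreed syntax would look roughly like:

-- Sketch: per-column codec together with per-column compress block sizes
CREATE TABLE t_final
(
    id UInt64,
    big_column String CODEC(ZSTD(9, 24)) SETTINGS (min_compress_block_size = 16384, max_compress_block_size = 16384)
)
ENGINE = MergeTree
ORDER BY tuple();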

@rschu1ze rschu1ze changed the title [RFC] Column level compress block size Column-level compression block sizes Jan 18, 2024
@rschu1ze rschu1ze merged commit e67076e into ClickHouse:master Jan 23, 2024
240 of 250 checks passed
@robot-ch-test-poll2 robot-ch-test-poll2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Jan 24, 2024