DROP/MODIFY COLUMN memory usage for compact parts in tables with thousands of columns #27502

Closed
UnamedRus opened this issue Aug 9, 2021 · 11 comments · Fixed by #41122
Labels: memory (When memory usage is higher than expected), minor (Priority: minor), st-need-info (We need extra data to continue (waiting for response))

Comments

UnamedRus (Contributor) commented Aug 9, 2021:

If you have a really wide table with many thousands of columns and you want to drop or modify a column: for wide parts this works fine, but for compact parts ClickHouse tries to allocate a lot of memory and gets killed by the OOM killer.

Does it reproduce on a recent release?

Yes, 21.9

How to reproduce

clickhouse-client -mn --query="SELECT 'CREATE TABLE xxx_really_wide( ' || arrayStringConcat(groupArray('column_'|| toString(number) || ' Nullable(UInt32)'), ',') || ') ENGINE=MergeTree ORDER BY assumeNotNull(column_0)' FROM numbers(6000) FORMAT TSVRaw" | clickhouse-client -mn

clickhouse-client -mn --query="SELECT arrayStringConcat(replicate('1', range(6000)), ',') FROM numbers(300) FORMAT TSVRaw" | clickhouse-client -mn --query "INSERT INTO xxx_really_wide FORMAT CSV " --send_logs_level='trace'
SELECT *
FROM system.merges
FORMAT Vertical

Query id: d9751847-a689-49f0-aab3-f868d9c08e34

Row 1:
──────
database:                    default
table:                       xxx_really_wide
elapsed:                     84.8952157
progress:                    0
num_parts:                   1
source_part_names:           ['all_1_1_0']
result_part_name:            all_1_1_0_3
source_part_paths:           ['/var/lib/clickhouse/data/default/xxx_really_wide/all_1_1_0/']
result_part_path:            /var/lib/clickhouse/data/default/xxx_really_wide/all_1_1_0_3/
partition_id:                all
is_mutation:                 1
total_size_bytes_compressed: 456027
total_size_marks:            2
bytes_read_uncompressed:     0
rows_read:                   0
bytes_written_uncompressed:  0
rows_written:                0
columns_written:             0
memory_usage:                16836533272
thread_id:                   16171
merge_type:
merge_algorithm:

Row 2:
──────
database:                    default
table:                       xxx_really_wide
elapsed:                     84.8947852
progress:                    0
num_parts:                   1
source_part_names:           ['all_2_2_0']
result_part_name:            all_2_2_0_3
source_part_paths:           ['/var/lib/clickhouse/data/default/xxx_really_wide/all_2_2_0/']
result_part_path:            /var/lib/clickhouse/data/default/xxx_really_wide/all_2_2_0_3/
partition_id:                all
is_mutation:                 1
total_size_bytes_compressed: 468027
total_size_marks:            2
bytes_read_uncompressed:     0
rows_read:                   0
bytes_written_uncompressed:  0
rows_written:                0
columns_written:             0
memory_usage:                16706294896
thread_id:                   16174
merge_type:
merge_algorithm:
UnamedRus added the bug (Confirmed user-visible misbehaviour in official release) label on Aug 9, 2021
alexey-milovidov added the minor (Priority: minor) and memory (When memory usage is higher than expected) labels and removed the bug label on Aug 9, 2021
alexey-milovidov (Member) commented:

How much memory does it use before OOM?

alexey-milovidov added the st-need-info (We need extra data to continue (waiting for response)) label on Aug 9, 2021
UnamedRus (Contributor, Author) commented:

Around ~30 GB, as I understand.

[276308.624413] Out of memory: Kill process 16736 (clickhouse-serv) score 976 or sacrifice child
[276308.625091] Killed process 16736 (clickhouse-serv) total-vm:49429712kB, anon-rss:31671880kB, file-rss:196308kB, shmem-rss:0kB

alexey-milovidov (Member) commented Aug 9, 2021:

We can see that ~30 GB is not enough for 6000 columns.
But how many columns can this server process successfully? 3000? 1500?

UnamedRus (Contributor, Author) commented:

This also means the fix for #6943 does not fully address that problem: if ClickHouse tries to merge such compact parts, it again needs much more memory, so the problem just moves one step further, from INSERT to merges and ALTERs.

So with a few concurrent merges running, it could OOM even big servers.

But how many columns can this server process successfully? 3000? 1500?

For 3000 columns ClickHouse takes 23.21 GiB of memory.

alexey-milovidov (Member) commented:

I'm trying to improve it in #26929.

By the way, the issue should not exist if the vertical merge algorithm is selected.
Which algorithm is selected in your case?
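For reference, the algorithm chosen for a running merge or mutation is exposed in the merge_algorithm column of system.merges (empty in the output above, since those entries are mutations), and the MergeTree settings that govern the vertical algorithm can be inspected as well. A small sketch, assuming the standard setting names:

-- Which algorithm is used by currently running merges/mutations
SELECT table, result_part_name, is_mutation, merge_algorithm
FROM system.merges;

-- Settings controlling when the vertical merge algorithm is activated
SELECT name, value
FROM system.merge_tree_settings
WHERE name IN ('enable_vertical_merge_algorithm',
               'vertical_merge_algorithm_min_rows_to_activate',
               'vertical_merge_algorithm_min_columns_to_activate');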

alexey-milovidov (Member) commented:

Looks like "vertical" algorithm cannot be used for compact parts.

alexey-milovidov (Member) commented:

@CurtizJ It looks possible to enable vertical merge when doing Compact -> Wide parts.

filimonov (Contributor) commented Feb 16, 2022:

Any updates?

BTW: why an OOM kill and not a memory exception? And why does it happen with compact parts (or does it happen when they get converted to wide?)

UnamedRus (Contributor, Author) commented:

(or does it happen when they get converted to wide?)

In order to drop a column, ClickHouse tries to convert the compact part to a wide one, AFAIK.
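Since the problem is tied to compact parts, one possible (untested) mitigation sketch is to force new parts to be written in the wide format from the start, using the standard min_bytes_for_wide_part / min_rows_for_wide_part MergeTree settings, so that a DROP/MODIFY COLUMN never has to expand a compact part in memory:

-- Untested workaround sketch: write new parts in the wide format regardless of size.
ALTER TABLE xxx_really_wide
    MODIFY SETTING min_bytes_for_wide_part = 0, min_rows_for_wide_part = 0;

Existing compact parts stay compact until they are rewritten by a merge or mutation, so this only helps for newly written data.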

filimonov (Contributor) commented:

In order to drop a column, ClickHouse tries to convert the compact part to a wide one, AFAIK.

But that means the same issue will happen when a few simultaneous (horizontal) merges are triggered.

alexey-milovidov (Member) commented:

No updates.
