Is there a way to avoid cancelling merges when running ALTER TABLE DROP PARTITION? #63586

Closed
jaumecastell opened this issue May 10, 2024 · 6 comments
Labels
question Question?

Comments

jaumecastell commented May 10, 2024

I'm using version 24.2 (though this most probably happens on all versions), and I saw that running ALTER TABLE DROP PARTITION cancels all running merges, which then start again, even those that do not belong to the partition I'm dropping.
Is there any way to avoid cancelling merges for other partitions when performing this operation?

As additional information, an OPTIMIZE was also running and it was cancelled:

Code: 236. DB::Exception: Received from localhost:9000. DB::Exception: Cancelled merging parts. (ABORTED)
(query: OPTIMIZE TABLE <table_name> PARTITION 20240425 FINAL)

The dropped partition was 20240423.
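
One client-side way to cope with the error above is to retry the OPTIMIZE when it is aborted. This is a minimal sketch and not something proposed in this thread: `run_query` is a hypothetical callable standing in for whatever ClickHouse client is in use, and matching on the text "Code: 236" assumes the client surfaces the server error message as shown in the log. As described above, a cancelled merge starts again, so each retry repeats the work.

```python
import time
from typing import Callable

# Error code text from the ABORTED message quoted in the issue.
ABORTED_CODE = "Code: 236"


def optimize_with_retry(
    run_query: Callable[[str], object],  # hypothetical: executes one SQL statement
    table: str,
    partition: str,
    max_attempts: int = 3,
    delay_seconds: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Run OPTIMIZE ... FINAL, retrying when the merge is cancelled.

    Returns the number of attempts used. Any other error is re-raised
    immediately; the ABORTED error is re-raised after the last attempt.
    """
    query = f"OPTIMIZE TABLE {table} PARTITION {partition} FINAL"
    for attempt in range(1, max_attempts + 1):
        try:
            run_query(query)
            return attempt
        except Exception as exc:
            cancelled = ABORTED_CODE in str(exc)
            if not cancelled or attempt == max_attempts:
                raise
            # Wait before retrying so the DROP PARTITION can complete first.
            sleep(delay_seconds)
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter is injectable only so the function can be exercised without real waiting.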

jaumecastell added the question label May 10, 2024
@davenger
Member

Looks like in the ordinary MergeTree there is currently no way to stop merges only in the partition that is being dropped.
In the code I see it is not implemented: https://github.com/ClickHouse/clickhouse/blob/master/src/Storages/StorageMergeTree.cpp#L1655

@jaumecastell
Author

Hi @davenger, thanks for your response. The main problem I have is that the OPTIMIZE is cancelled too. Do you know of a workaround to avoid this?

@davenger
Member

@jaumecastell OPTIMIZE is essentially a merge.
From what you describe, it looks like you have daily partitions and you manually run OPTIMIZE ... FINAL on them. Would you mind sharing a bit more detail about your use case: why do you need to run those OPTIMIZE queries, and are you concerned about OPTIMIZE being cancelled because it takes a long time to complete?

@jaumecastell
Author

@davenger yes, I have a table partitioned by day, with roughly 300 GB per partition. I use this table to keep backups of my data, so I want to maximize the compression ratio of each partition; that's why I run OPTIMIZE FINAL before backing up a partition. Also, each day I delete the partitions that have already been backed up, and that kills the running OPTIMIZEs (and the merges involved). The OPTIMIZEs take between 12 and 20 hours to finish. For now, I have tried to adjust when the OPTIMIZEs and the deletes run, so that the merges finish before the previous partition is deleted, but it would be good if merges were not killed when their partition is not involved.
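
The timing workaround described above could be automated roughly as follows. This is a sketch under stated assumptions, not a ClickHouse feature: `run_query` is a hypothetical callable that executes one statement and returns rows as nested sequences, and the database and table names are interpolated without escaping, so they must be trusted values. It polls the `system.merges` table and issues the DROP PARTITION only once no merge is reported for the table.

```python
import time
from typing import Callable, Sequence

Rows = Sequence[Sequence[object]]


def drop_partition_when_idle(
    run_query: Callable[[str], Rows],  # hypothetical client call returning result rows
    database: str,
    table: str,
    partition: str,
    poll_seconds: float = 300.0,
    max_polls: int = 288,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Drop a partition only after running merges on the table have finished.

    Returns True if the partition was dropped, False if merges were still
    running after max_polls checks.
    """
    # system.merges lists merges (and mutations) currently in progress.
    count_sql = (
        "SELECT count() FROM system.merges "
        f"WHERE database = '{database}' AND table = '{table}'"
    )
    for _ in range(max_polls):
        rows = run_query(count_sql)
        if int(rows[0][0]) == 0:
            run_query(f"ALTER TABLE {database}.{table} DROP PARTITION {partition}")
            return True
        sleep(poll_seconds)
    return False
```

A merge can still start between the check and the DROP, so this narrows the window rather than closing it; it mainly protects a long-running OPTIMIZE that is already in progress.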

@den-crane
Contributor

den-crane commented May 13, 2024

@jaumecastell you can modify the table settings
max_bytes_to_merge_at_max_space_in_pool,
min_age_to_force_merge_seconds, min_age_to_force_merge_on_partition_only;
ClickHouse will then merge the parts by itself and retry the merge after a cancellation.
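
For illustration, the settings named above can be changed with an ALTER TABLE ... MODIFY SETTING statement. The helper below is a hypothetical sketch that only builds the statement text; the table name and the values in the usage line are placeholders, not recommendations, and the table name is interpolated without escaping.

```python
from typing import Mapping, Union

# The three MergeTree settings mentioned in the comment above.
KNOWN_SETTINGS = (
    "max_bytes_to_merge_at_max_space_in_pool",
    "min_age_to_force_merge_seconds",
    "min_age_to_force_merge_on_partition_only",
)


def modify_setting_sql(table: str, settings: Mapping[str, Union[int, bool]]) -> str:
    """Build an ALTER TABLE ... MODIFY SETTING statement for the known settings."""
    if not settings:
        raise ValueError("at least one setting is required")
    unknown = sorted(set(settings) - set(KNOWN_SETTINGS))
    if unknown:
        raise ValueError(f"unexpected settings: {unknown}")
    # Booleans are written as 1/0 rather than Python's True/False.
    assignments = ", ".join(f"{name} = {int(value)}" for name, value in settings.items())
    return f"ALTER TABLE {table} MODIFY SETTING {assignments}"


# Placeholder table name and values, for illustration only.
print(modify_setting_sql("backups", {
    "min_age_to_force_merge_seconds": 3600,
    "min_age_to_force_merge_on_partition_only": True,
}))
```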

@alexey-milovidov
Member

I strongly recommend against modifying any settings.

4 participants