Improve concurrent parts removal with zero copy replication #49630
Conversation
This is an automated comment for commit 9a824a0 with a description of existing statuses. It is updated for the latest CI run.
Force-pushed from 5a480a8 to 1b8b509
Force-pushed from 1b8b509 to 2a68bef
TBH I don't think it's simpler than the layered scheme that we discussed yesterday...
- size_t num_threads = std::min<size_t>(settings->max_part_removal_threads, parts_to_remove.size());
+ size_t num_threads = settings->max_part_removal_threads;
+ if (!num_threads)
+     num_threads = getNumberOfPhysicalCPUCores() * 2;
TBH I don't understand why the size of this pure IO thread pool depends on the number of cores... But maybe it makes sense.
It's for max_part_removal_threads = 0, which means "auto", and it's the default value. Previously it was just getNumberOfPhysicalCPUCores(), and also min(threads, parts) did not work correctly when threads = 0. We can change the default value to, for example, 32 or 128, and throw an exception if it's set to 0.
Ok, agree
std::vector<UInt64> split_level;
};

auto split_into_independent_ranges = [this](const DataPartsVector & parts_to_remove_, size_t split_level = 0) -> RemovalRanges
Please, let's make split_level without a default value.
Actually, it's not a level (collision with the term merge level). It's just how many times we tried to split the source range (split_times?).
It's a kind of "reversed level": each time we increase it, we get parts with lower merge levels. Okay, let's rename it.
std::move(subranges.split_level.begin(), subranges.split_level.end(), std::back_inserter(independent_ranges.split_level));
num_ranges += subranges.infos.size();
total_excluded += top_level_count;
continue;
Redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly is redundant and why?
It's just the last line of the loop; looks like we will continue without this continue :)
It's not the last line
};

size_t top_level_count = std::count_if(parts_in_range.begin(), parts_in_range.end(), top_level_parts_pred);
if (settings->zero_copy_concurrent_part_removal_max_postpone_ratio < static_cast<Float32>(top_level_count) / parts_in_range.size())
This condition is not obvious. So if we have a range with 100 parts where 1 is covering and 99 are covered (merge from [0, ..., 99] to [0_99]), it will be true (while it should be false, shouldn't it?). But I'm not sure about the right condition here...
Why should it be false? The purpose of this condition is to limit the percentage of excluded parts. For example, if we have 100 parts and 50 of them are top-level (it's possible when we have a long mutations chain or repeated merges of the same blocks range), then excluding 50 parts will probably make it worse.
Ok
I doubt the layered scheme will work well with mutations, and after #49619 we need it mostly for mutations. And it's still simpler than the combined scheme.
Ok, agree
Force-pushed from 3921f79 to 7298652
LGTM, but the failures are related to the changes.
Integration tests (asan) [3/6] - #48726
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
More parallelism on "Outdated" parts removal with "zero-copy replication". It should avoid cases like this (https://pastila.nl/?7fff376d/54299199b5aba5ed8c7a293d8ceb4b87), when some threads remove just a few parts and one thread removes 5000 parts.