Improve concurrent parts removal with zero copy replication #49630
Conversation
This is an automated comment for commit 9a824a0 with a description of existing statuses. It is updated for the latest CI run.
Force-pushed from 5a480a8 to 1b8b509
Force-pushed from 1b8b509 to 2a68bef
TBH I don't think it's simpler than the layered scheme that we discussed yesterday...
- size_t num_threads = std::min<size_t>(settings->max_part_removal_threads, parts_to_remove.size());
+ size_t num_threads = settings->max_part_removal_threads;
+ if (!num_threads)
+     num_threads = getNumberOfPhysicalCPUCores() * 2;
TBH I don't understand why the size of this pure IO thread pool depends on the number of cores... But maybe it makes sense.
It's for max_part_removal_threads = 0, which means "auto", and it's the default value. Previously it was just getNumberOfPhysicalCPUCores(), and also min(threads, parts) did not work correctly when threads = 0. We can change the default value to, for example, 32 or 128, and throw an exception if it's set to 0.
Ok, agree
std::vector<UInt64> split_level;
};

auto split_into_independent_ranges = [this](const DataPartsVector & parts_to_remove_, size_t split_level = 0) -> RemovalRanges
Please, let's make split_level without a default value.
Actually, it's not a level (collision with the term merge level). It's just how many times we tried to split the source range (split_times?).
It's a kind of "reversed level": each time we increase it, we get parts with lower merge levels. Okay, let's rename it.
std::move(subranges.split_level.begin(), subranges.split_level.end(), std::back_inserter(independent_ranges.split_level));
num_ranges += subranges.infos.size();
total_excluded += top_level_count;
continue;
Redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What exactly is redundant and why?
It's just the last line of the loop; looks like we will continue without this continue :)
It's not the last line
};

size_t top_level_count = std::count_if(parts_in_range.begin(), parts_in_range.end(), top_level_parts_pred);
if (settings->zero_copy_concurrent_part_removal_max_postpone_ratio < static_cast<Float32>(top_level_count) / parts_in_range.size())
This condition is not obvious. So if we have a range with 100 parts where 1 is covering and 99 are covered (merge from [0, ..., 99] to [0_99]), it will be true (while it should be false, shouldn't it?). But I'm not sure about the right condition here...
Why should it be false? The purpose of this condition is to limit the percentage of excluded parts. For example, if we have 100 parts and 50 of them are top-level (it's possible when we have a long mutations chain or repeated merges of the same blocks range), then excluding 50 parts will probably make it worse.
Ok
I doubt the layered scheme will work well with mutations, and after #49619 we need it mostly for mutations. And it's still simpler than the combined scheme.
Ok, agree
Force-pushed from 3921f79 to 7298652
LGTM, but the failures are related to the changes.
Integration tests (asan) [3/6] - #48726
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
More parallelism on "Outdated" parts removal with "zero-copy replication". It should avoid cases like this (https://pastila.nl/?7fff376d/54299199b5aba5ed8c7a293d8ceb4b87), when some threads remove just a few parts and one thread removes 5000 parts.