Parallel reading from replicas #29279
/// We are the first who wants to process parts in partition
if (partition_it == partitions.end())
{
    auto part_and_projection = request.part_name + "#" + request.projection_name;
Why don't we store part name and projection separately?
I want to emphasize that the two are tightly coupled, and the concatenation also simplifies the code a little: no additional checks are needed to verify that we read from the same projection. Maybe we can move them into a std::pair or something similar.
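For illustration, the std::pair alternative mentioned above might look roughly like the sketch below. It is not code from this PR: PartRequest, PartAndProjection, makeKey and makeConcatenatedKey are hypothetical names, and only the two fields quoted in the diff are modeled.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical stand-in for the request type; only the two fields
// that appear in the quoted diff are modeled here.
struct PartRequest
{
    std::string part_name;
    std::string projection_name;
};

// Assumed key type: keeps the part name and the projection name coupled
// in one value while leaving them as separate components.
using PartAndProjection = std::pair<std::string, std::string>;

PartAndProjection makeKey(const PartRequest & request)
{
    return {request.part_name, request.projection_name};
}

// The concatenated form from the diff, kept for comparison.
std::string makeConcatenatedKey(const PartRequest & request)
{
    return request.part_name + "#" + request.projection_name;
}
```

A pair still works as a single key in std::map or std::set, so the "same projection" check remains implicit in the key comparison. Unlike the concatenated string, it stays unambiguous even if one of the names were to contain '#', which is an assumption about possible inputs rather than something the PR states.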
result.intersect(intervals_to_do);

/// Update intervals_to_do
intervals_to_do.intersect(HalfIntervals::initializeFromMarkRanges(std::move(request.mark_ranges)).negate());
I think you should add some comments, from your paper :)
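As a rough aid for reading the two quoted lines, the sketch below shows one way the intersect/negate bookkeeping could work on half-open mark intervals. It is a simplified model under stated assumptions, not the HalfIntervals class from the PR: Interval, Intervals, intersect, negate and takeUnassigned are hypothetical names, and intervals are assumed to be sorted and disjoint.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Half-open interval [begin, end) over mark numbers (hypothetical model).
struct Interval
{
    size_t begin;
    size_t end;
};

bool operator==(const Interval & lhs, const Interval & rhs)
{
    return lhs.begin == rhs.begin && lhs.end == rhs.end;
}

// Assumed invariant: sorted by begin, pairwise disjoint.
using Intervals = std::vector<Interval>;

// Two-pointer intersection of two sorted, disjoint interval lists.
Intervals intersect(const Intervals & a, const Intervals & b)
{
    Intervals out;
    size_t i = 0;
    size_t j = 0;
    while (i < a.size() && j < b.size())
    {
        const size_t lo = std::max(a[i].begin, b[j].begin);
        const size_t hi = std::min(a[i].end, b[j].end);
        if (lo < hi)
            out.push_back({lo, hi});
        // Advance whichever interval ends first.
        if (a[i].end < b[j].end)
            ++i;
        else
            ++j;
    }
    return out;
}

// Complement within [0, max size_t).
Intervals negate(const Intervals & a)
{
    constexpr size_t limit = std::numeric_limits<size_t>::max();
    Intervals out;
    size_t cursor = 0;
    for (const auto & interval : a)
    {
        if (cursor < interval.begin)
            out.push_back({cursor, interval.begin});
        cursor = interval.end;
    }
    if (cursor < limit)
        out.push_back({cursor, limit});
    return out;
}

// Give the requester only the marks nobody has taken yet, then remove
// everything it asked for from the remaining work.
Intervals takeUnassigned(Intervals & intervals_to_do, const Intervals & requested)
{
    Intervals result = intersect(requested, intervals_to_do);
    intervals_to_do = intersect(intervals_to_do, negate(requested));
    return result;
}
```

In this model, if marks [0, 10) are still to do and one replica asks for [0, 4), it receives [0, 4) and [4, 10) remains. A later request for [2, 6) receives only [4, 6), so no mark is handed out twice.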
Generally ok
Ok, just need to disable for
…el-reading-from-replicas
@Mergifyio update
✅ Branch has been successfully updated
Internal documentation ticket: DOCSUP-19955
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added the ability to read from all replicas within a shard during a distributed query. To enable this, set allow_experimental_parallel_reading_from_replicas=true and max_parallel_replicas to any number. This closes #26748.

Detailed description / Documentation draft:
We already have such a mechanism, but it works only for tables with a SAMPLE BY key. This feature will work for any kind of MergeTree table. If max_parallel_replicas is set to a value greater than 1, the old algorithm is enabled (a backward-compatible change).