Parallel reading from replicas #29279

nikitamikhaylov · 2021-09-22T22:29:35Z

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

New Feature

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Added the ability to read from all replicas within a shard during a distributed query. To enable it, set allow_experimental_parallel_reading_from_replicas=true and max_parallel_replicas to any number. This closes #26748.

Detailed description / Documentation draft:
We already have such a mechanism, but it works only for tables with a SAMPLE BY key. This feature works for any kind of MergeTree table. If max_parallel_replicas is set to a value > 1 without enabling the new setting, the old algorithm is still used (a backward-compatible change).

CLAassistant · 2021-09-28T08:29:28Z

All committers have signed the CLA.

KochetovNicolai · 2021-11-26T10:12:38Z

src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp

+    /// We are the first who wants to process parts in partition
+    if (partition_it == partitions.end())
+    {
+        auto part_and_projection = request.part_name + "#" + request.projection_name;


Why don't we store part name and projection separately?

nikitamikhaylov · Nov 26, 2021

I want to emphasize that the two are tightly coupled, and the concatenation also simplifies the code a little (no additional checks are needed to confirm that we read from the same projection). Maybe we can move them into a std::pair or something similar.
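To make the std::pair alternative from this exchange concrete, here is a minimal sketch. It is not code from this PR: `PartAndProjection`, `PartitionState`, and `registerRequest` are hypothetical names chosen for illustration, and the counter is only a stand-in for whatever state the coordinator keeps per part.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>

// Hypothetical key type: part name and projection name stay separate fields,
// but are still compared and looked up together as one unit.
using PartAndProjection = std::pair<std::string, std::string>;

// Hypothetical per-partition bookkeeping (illustration only).
struct PartitionState
{
    // Number of read requests seen for each (part, projection) key.
    std::map<PartAndProjection, std::size_t> requests_per_part;

    // Returns true if this is the first request for the given (part, projection).
    bool registerRequest(const std::string & part_name, const std::string & projection_name)
    {
        assert(!part_name.empty());
        PartAndProjection key{part_name, projection_name};
        auto [it, inserted] = requests_per_part.try_emplace(std::move(key), 0);
        ++it->second;
        return inserted;
    }
};
```

A pair keeps the "same part, same projection" check implicit in the key comparison, as the concatenated string does, without relying on the separator never appearing in either name.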

KochetovNicolai · 2021-11-26T10:16:52Z

src/Storages/MergeTree/ParallelReplicasReadingCoordinator.cpp

+            result.intersect(intervals_to_do);
+
+            /// Update intervals_to_do
+            intervals_to_do.intersect(HalfIntervals::initializeFromMarkRanges(std::move(request.mark_ranges)).negate());


I think you should add some comments, from your paper :)
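As an illustration of the kind of comment being requested, the two diff lines can be read as set operations on half-open mark intervals: the replica is granted the part of its request that is still unclaimed, and the requested ranges are then removed from the remaining work. The sketch below is a simplified stand-in, not the `HalfIntervals` class from the PR. `Interval`, `IntervalSet`, `intersectSets`, `negateSet`, `takeUnclaimed`, and `MAX_MARK` are all hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Half-open interval [begin, end) over mark numbers.
struct Interval
{
    std::size_t begin;
    std::size_t end;
    bool operator==(const Interval & other) const { return begin == other.begin && end == other.end; }
};

// Sorted list of disjoint, non-empty intervals (hypothetical stand-in for HalfIntervals).
using IntervalSet = std::vector<Interval>;

constexpr std::size_t MAX_MARK = std::numeric_limits<std::size_t>::max();

// Intersection of two sorted disjoint sets using a two-pointer sweep.
IntervalSet intersectSets(const IntervalSet & lhs, const IntervalSet & rhs)
{
    IntervalSet result;
    std::size_t i = 0;
    std::size_t j = 0;
    while (i < lhs.size() && j < rhs.size())
    {
        const std::size_t begin = std::max(lhs[i].begin, rhs[j].begin);
        const std::size_t end = std::min(lhs[i].end, rhs[j].end);
        if (begin < end)
            result.push_back({begin, end});
        // Advance whichever interval ends first.
        if (lhs[i].end < rhs[j].end)
            ++i;
        else
            ++j;
    }
    return result;
}

// Complement of a set within [0, MAX_MARK).
IntervalSet negateSet(const IntervalSet & set)
{
    IntervalSet result;
    std::size_t cursor = 0;
    for (const auto & interval : set)
    {
        assert(cursor <= interval.begin && interval.begin < interval.end);
        if (cursor < interval.begin)
            result.push_back({cursor, interval.begin});
        cursor = interval.end;
    }
    if (cursor < MAX_MARK)
        result.push_back({cursor, MAX_MARK});
    return result;
}

// One coordination step, mirroring the two lines in the diff:
// grant the unclaimed part of the request, then drop the requested ranges from the to-do set.
IntervalSet takeUnclaimed(IntervalSet & intervals_to_do, const IntervalSet & requested)
{
    IntervalSet granted = intersectSets(requested, intervals_to_do);
    intervals_to_do = intersectSets(intervals_to_do, negateSet(requested));
    return granted;
}
```

Under these assumptions, two replicas that request overlapping mark ranges each receive only the marks nobody has claimed yet, so no range is read twice.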

src/Storages/MergeTree/ParallelReplicasReadingParticipator.h

KochetovNicolai

Generally ok

src/Storages/MergeTree/MergeTreeThreadSelectProcessor.cpp

KochetovNicolai · 2021-12-03T16:17:26Z

Ok, just need to disable for FINAL and we can merge.

nikitamikhaylov · 2021-12-08T20:31:22Z

@Mergifyio update

mergify · 2021-12-08T20:32:03Z

update

✅ Branch has been successfully updated

ka1bi4 · 2021-12-11T20:42:38Z

Internal documentation ticket: DOCSUP-19955

robot-clickhouse added the doc-alert and pr-feature (Pull request with new product feature) labels Sep 22, 2021

nikitamikhaylov added the force tests (The label does nothing, NOOP, None, nil) label Sep 22, 2021

nikitamikhaylov force-pushed the parallel-reading-from-replicas branch from 03bd5df to 5de301c Compare September 24, 2021 12:33

nikitamikhaylov force-pushed the parallel-reading-from-replicas branch from 5de301c to 5925c1c Compare September 28, 2021 20:05

nikitamikhaylov added 5 commits September 29, 2021 14:05

Implement parallel reading. Part 1

dd4f333

Disable parallel processing for plain MergeTree

1055fc7

Fix build

cd2536e

More logging + fixed some tests

1562aa3

Move coordinator to distributed

46f5412

nikitamikhaylov force-pushed the parallel-reading-from-replicas branch from 5925c1c to 46f5412 Compare September 29, 2021 14:05

nikitamikhaylov added 18 commits September 29, 2021 14:09

Delete prints

8cb839a

Some changes...

9b1cf87

Merge upstream/master into parallel-reading-from-replicas (using imerge)

28c62ef

Fixes after merging master

e10e361

Try to test in CI

54594a3

Fix test

b372e08

Merge upstream/master into parallel-reading-from-replicas (using imerge)

f1df990

After merge

bb59e40

Fix storage distributed

da1a224

Merge upstream/master into parallel-reading-from-replicas (using imerge)

c5963f1

Added setting collaborate_with_initiator

4d1b624

Fix unit tests

2bfe562

Better

80bee6e

Fix tests

b573f4c

show create table

0ddd66d

Merge upstream/master into parallel-reading-from-replicas (using imerge)

920f226

Merge upstream/master into parallel-reading-from-replicas (using imerge)

0074f82

Added separe setting for reading with sample/offset

d9050f4

KochetovNicolai reviewed Nov 26, 2021

src/Storages/MergeTree/ParallelReplicasReadingParticipator.h (outdated, resolved)

KochetovNicolai requested changes Nov 26, 2021

nikitamikhaylov added 7 commits November 27, 2021 13:01

Review fixes

d505bec

Merge branch 'master' of github.com:ClickHouse/ClickHouse into parall…

96e32b9

…el-reading-from-replicas

Review fixes [2]

076ae04

Review fixes [3]

a09a6c2

Review fixed [?]

29a01db

Review fixes

f2d2ee2

Merge branch 'master' into parallel-reading-from-replicas

2b9954d

KochetovNicolai reviewed Dec 3, 2021

src/Storages/MergeTree/MergeTreeThreadSelectProcessor.cpp (outdated, resolved)

nikitamikhaylov added 3 commits December 7, 2021 11:31

Merge branch 'master' of github.com:ClickHouse/ClickHouse into parall…

6d70297

…el-reading-from-replicas

Forbid final

b7608c9

Style

fb7a64e

KochetovNicolai approved these changes Dec 7, 2021

nikitamikhaylov added 2 commits December 8, 2021 12:33

Rename setting to experimental

4a45335

Update 00168_parallel_processing_on_replicas_part_1.sh

fcc55fc

Merge branch 'master' into parallel-reading-from-replicas

0a67dfd

nikitamikhaylov merged commit dbf5091 into ClickHouse:master Dec 9, 2021

alexey-milovidov added the 🎅 🎁 gift🎄 (To make people wonder) label Dec 23, 2021

azat mentioned this pull request Jan 7, 2022

Fix query cancellation in case of allow_experimental_parallel_reading_from_replicas #33456

Merged

azat mentioned this pull request Feb 19, 2022

Fix parallel_reading_from_replicas with clickhouse-bechmark #34751

Merged

LvLiangLiang-dev mentioned this pull request Mar 4, 2022

ClickHouse compute and storage separation research (Clickhouse计算与存储分离调研) cloudnativecube/octopus#48

Open

alexey-milovidov mentioned this pull request Oct 8, 2022

Roadmap 2022 (discussion) #32513

Closed

den-crane mentioned this pull request Jan 9, 2023

max_parallel_replicas does not apply sampling when query is run by non-default user #44877

Closed
