feat(batch): parallel table scan #3251
Conversation
Force-pushed from 7b0c556 to 9255aa6.
license-eye has totally checked 851 files.
| Valid | Invalid | Ignored | Fixed |
|---|---|---|---|
| 849 | 1 | 1 | 0 |

Invalid file list:
- src/common/src/consistent_hashing.rs
Force-pushed from 9255aa6 to 5abf456.
Just tried another representation: use vnode_ranges instead of vnode_bitmap, which may be easier for RowSeqScanExecutor to create vnode prefix range iterators.
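As a rough illustration of the trade-off between the two representations, here is a minimal sketch (not RisingWave's actual API) of converting a vnode bitmap into the contiguous inclusive ranges that a scan executor could turn into prefix range iterators:

```rust
/// Collect maximal runs of set bits as inclusive (start, end) vnode ranges.
/// The bitmap layout here is an illustrative assumption.
fn bitmap_to_ranges(bitmap: &[bool]) -> Vec<(usize, usize)> {
    let mut ranges = Vec::new();
    let mut start = None;
    for (i, &set) in bitmap.iter().enumerate() {
        match (set, start) {
            // A run of set bits begins.
            (true, None) => start = Some(i),
            // A run ends just before the current cleared bit.
            (false, Some(s)) => {
                ranges.push((s, i - 1));
                start = None;
            }
            _ => {}
        }
    }
    // Close a run that extends to the end of the bitmap.
    if let Some(s) = start {
        ranges.push((s, bitmap.len() - 1));
    }
    ranges
}

fn main() {
    let mut bitmap = vec![false; 8];
    for i in [0, 1, 2, 5, 6] {
        bitmap[i] = true;
    }
    // Two runs of set bits -> two inclusive ranges.
    assert_eq!(bitmap_to_ranges(&bitmap), vec![(0, 2), (5, 6)]);
}
```

The conversion is cheap and lossless in this direction, which is one reason the PR later settles on passing a bitmap and letting the table do the conversion.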
```diff
@@ -37,6 +37,7 @@ impl GrpcExchangeSource {
     let task_id = task_output_id.get_task_id()?.clone();
     let client = ComputeClient::new(addr).await?;
     let local_execute_plan = exchange_source.local_execute_plan;
+    let vnode_ranges = todo!();
```
The `vnode_range` should be determined by the optimizer/fragmenter.
I think it should be determined by the scheduler? Each scan task scans different ranges.
I think the vnodes of a table scan plan node should be determined by the optimizer. And the compute node where a task should be sent is determined by the fragmenter; this way we can reuse this logic in local execution mode. The scheduler should only care about sending tasks to the right compute node.
The optimizer and fragmenter only have one plan, e.g., scan the full table.
Then the scan is divided into non-overlapping vnode ranges, and each task of the fragment is assigned a range to scan. How can we do that in the optimizer/fragmenter?
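The division into non-overlapping ranges being discussed can be sketched as an even split of the vnode space across parallel tasks. This is a minimal illustration, assuming an even-split policy; it is not the PR's actual partitioning rule:

```rust
/// Split `vnode_count` vnodes into `num_tasks` non-overlapping inclusive
/// ranges, one per parallel scan task.
fn partition_vnodes(vnode_count: usize, num_tasks: usize) -> Vec<(usize, usize)> {
    let base = vnode_count / num_tasks;
    let rem = vnode_count % num_tasks;
    let mut ranges = Vec::with_capacity(num_tasks);
    let mut start = 0;
    for task in 0..num_tasks {
        // The first `rem` tasks take one extra vnode so the split stays balanced.
        let len = base + usize::from(task < rem);
        ranges.push((start, start + len - 1)); // inclusive range
        start += len;
    }
    ranges
}

fn main() {
    // 10 vnodes across 3 tasks: the ranges cover [0, 9] without overlap.
    assert_eq!(partition_vnodes(10, 3), vec![(0, 3), (4, 6), (7, 9)]);
}
```

Whoever runs this split (scheduler, fragmenter, or optimizer) is exactly the open question in this thread, since only a component that knows `num_parallelism` can produce one range per task.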
Oh, we can let the fragmenter store `num_parallelism` different plans if you want...
Here, after the filter has been pushed down to the table scan, it should be able to prune unnecessary vnodes.
The optimizer is responsible for pushing predicates (SARGs) down to the table scan, but is not responsible for deciding the specific vnode number.
And it cannot do so, because it doesn't know whether keys (1, 2) are in the same parallel unit.
For 1, it's not the target of this PR or of the `vnodes_range` introduced here.
This PR cares mainly about: given a range (e.g., a full table scan), divide (schedule) it into small ranges and assign them to tasks.
I think the `ScanRange` introduced earlier (in `BatchSeqScan`) is responsible for this part? And pruning vnodes can be done at the same place where we partition the vnode ranges, instead of in the optimizer.
Per my understanding, the parallel unit is just a partition, which should be maintained by the optimizer. It should not know about vnodes, but should understand parallel units.
> Per my understanding, the parallel unit is just a partition, which should be maintained by the optimizer. It should not know about vnodes, but should understand parallel units.

I don't think of it as a partition 😇 To me, it's exactly the same thing as the routing metadata of KV databases like Cassandra, HBase, or TiKV.
Anyway, we need to map filters in table scan to parallel units, and it can't be done in the scheduler since we need to reuse it in local mode. Maybe the fragmenter is a better place.
PTAL the partitioning logic in the distributed scheduler. And the remaining work is the executor part, which is blocked by the storage side's work (new encoding & vnode range scan API).
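The scheduler-side step mentioned here, assigning partitioned vnode ranges to the compute nodes that own them, could look roughly like the following sketch. The worker-id mapping and the assumption that each range is owner-aligned are illustrative, not the PR's actual scheduler code:

```rust
use std::collections::HashMap;

/// Group inclusive vnode ranges by owning worker, using `vnode_mapping[v]`
/// as the worker id that owns vnode `v`. Each range is assumed to lie
/// entirely within one worker's vnodes (an illustrative simplification).
fn assign_ranges(
    ranges: &[(usize, usize)],
    vnode_mapping: &[u32],
) -> HashMap<u32, Vec<(usize, usize)>> {
    let mut plan: HashMap<u32, Vec<(usize, usize)>> = HashMap::new();
    for &(start, end) in ranges {
        // Look up the owner of the range's first vnode.
        let worker = vnode_mapping[start];
        plan.entry(worker).or_default().push((start, end));
    }
    plan
}

fn main() {
    // Toy mapping: vnodes 0..=3 owned by worker 0, 4..=7 by worker 1.
    let mapping = [0, 0, 0, 0, 1, 1, 1, 1];
    let plan = assign_ranges(&[(0, 3), (4, 7)], &mapping);
    assert_eq!(plan[&0], vec![(0, 3)]);
    assert_eq!(plan[&1], vec![(4, 7)]);
}
```

This is the "send task to the right compute node" responsibility the thread assigns to the scheduler, kept separate from the optimizer's predicate pushdown.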
Force-pushed from 00b1a75 to a16945a.
Force-pushed from 948e8d2 to 7424af8.
https://github.com/singularity-data/risingwave/pull/3251/files/354869f6331d125b5b00c0cb6f65ad9cf6d4b483..d3107963513716c8b88591fe559de6a2ec0e6b2a Am I doing the right things about distinct agg? 🤡 cc @st1page
PTAL the final updates:
I'd like to fix the FIXME in a separate PR. Shall we merge this PR if the other fixes look good? Update: the fix has already been merged in #3599
Force-pushed from 3881b08 to e615829.
Force-pushed from e615829 to aa29d20.
So this does not enable real parallel table scan, according to the discussions in #3583? I've removed "close" in the PR body. 🤣
* add vnode_bitmap in row seq scan
* use vnode_ranges instead of vnode_bitmap
* todo!
* use vnode_mapping in table catalog
* style: change the representation of vnode_ranges
* update local mode
* update local mode workers
* move vnode ranges from tasks into RowSeqScan
* remove build_vnode_mapping
* add some comments
* trivial fix
* use vnode_bitmap instead of vnode_ranges (let table do the conversion instead)
* fix vnodes
* fix local mode (system table)
* buf format
* revert the change in table, ignore in executor if vnodes not set
* revert cell based table
* ignore get row
* fix distinct
* Revert "fix distinct" (this reverts commit 7424af8)
* Revert "Revert "fix distinct"" (this reverts commit 3cdab8f)
* let distinct agg be singleton
* single distribution for BatchTopN
* remove should_ignore
* fmt
What's changed and what's your intention?

Mainly added vnode_bitmap to ExecutorBuilder. Not finished yet... See the `todo!`s.

BTW, should we use a vnode bitmap or other representations, e.g., vnode ranges, maybe like `ParallelUnitMapping`? I used a bitmap here because of the previous storage implementation 😇

Checklist

- `./risedev check` (or alias, `./risedev c`)

Refer to a related PR or issue link (optional)

#3237