Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(streaming): enable consistent hash for hash agg #3454

Merged
merged 11 commits into from
Jun 29, 2022

Conversation

BugenZhao
Copy link
Member

@BugenZhao BugenZhao commented Jun 24, 2022

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR enables consistent hash for hash agg. The assertions in CellBasedTable shows that we've correctly sharded data with hash dispatcher.

Checklist

  • I have written necessary docs and comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Refer to a related PR or issue link (optional)

Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@BugenZhao BugenZhao marked this pull request as ready for review June 29, 2022 05:33
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@@ -129,21 +129,22 @@ impl<S: StateStore, SER: RowSerializer> CellBasedTableBase<S, SER, READ_ONLY> {
/// set of `column_ids`. The output will only contains columns with the given ids in the same
/// order.
/// This is parameterized on cell based row serializer.
// TODO: allow specifying the distribution keys and vnodes.
Copy link
Member Author

@BugenZhao BugenZhao Jun 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CellBasedTable is directly constructed only in batch query executor or seq scan. We leave them untouched, while specifying vnodes and dist keys for streaming-purpose StateTable is implemented in this PR.

@codecov
Copy link

codecov bot commented Jun 29, 2022

Codecov Report

Merging #3454 (16e0747) into main (9815936) will increase coverage by 0.01%.
The diff coverage is 66.98%.

@@            Coverage Diff             @@
##             main    #3454      +/-   ##
==========================================
+ Coverage   74.44%   74.45%   +0.01%     
==========================================
  Files         770      770              
  Lines      108409   108455      +46     
==========================================
+ Hits        80703    80749      +46     
  Misses      27706    27706              
Flag Coverage Δ
rust 74.45% <66.98%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/batch/src/executor/row_seq_scan.rs 17.68% <ø> (+0.11%) ⬆️
src/stream/src/executor/aggregation/mod.rs 76.60% <0.00%> (-1.78%) ⬇️
src/stream/src/from_proto/batch_query.rs 0.00% <ø> (ø)
src/stream/src/from_proto/global_simple_agg.rs 0.00% <0.00%> (ø)
src/stream/src/from_proto/hash_agg.rs 0.00% <0.00%> (ø)
src/storage/src/table/cell_based_table.rs 73.72% <81.63%> (+1.92%) ⬆️
src/storage/src/table/state_table.rs 96.49% <100.00%> (+0.62%) ⬆️
src/stream/src/executor/dispatch.rs 77.35% <100.00%> (+0.03%) ⬆️
src/stream/src/executor/test_utils.rs 95.06% <100.00%> (+0.03%) ⬆️
src/meta/src/hummock/mock_hummock_meta_client.rs 40.56% <0.00%> (-0.95%) ⬇️
... and 7 more

📣 Codecov can now indicate which changes are the most critical in Pull Requests. Learn more

Copy link
Contributor

@xx01cyx xx01cyx left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Copy link
Contributor

@BowenXiao1999 BowenXiao1999 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LSTM!

src/storage/src/table/cell_based_table.rs Show resolved Hide resolved
src/storage/src/table/cell_based_table.rs Show resolved Hide resolved
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@BugenZhao BugenZhao added the mergify/can-merge Indicates that the PR can be added to the merge queue label Jun 29, 2022
@BugenZhao BugenZhao mentioned this pull request Jun 24, 2022
5 tasks
@mergify
Copy link
Contributor

mergify bot commented Jun 29, 2022

Hey @BugenZhao, this pull request failed to merge and has been dequeued from the merge train. If you believe your PR failed in the merge train because of a flaky test, requeue it by commenting with @mergifyio requeue. More details can be found on the Queue: Embarked in merge train check-run.

@BugenZhao
Copy link
Member Author

@Mergifyio requeue

@mergify
Copy link
Contributor

mergify bot commented Jun 29, 2022

requeue

❌ This pull request head commit has not been previously disembarked from queue.

@mergify mergify bot merged commit e40327b into main Jun 29, 2022
@mergify mergify bot deleted the bz/vnode-in-key-part-5 branch June 29, 2022 08:17
huangjw806 pushed a commit that referenced this pull request Jul 5, 2022
* allow passing vnodes

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* enable distribution for hash agg

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* fix datum hash

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* debug

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* clean up

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* fix state table generation

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* remove logs and refine docs

Signed-off-by: Bugen Zhao <i@bugenzhao.com>

* refine fallback vnodes doc

Signed-off-by: Bugen Zhao <i@bugenzhao.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mergify/can-merge Indicates that the PR can be added to the merge queue type/feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Enable consistent hash for HashAgg
3 participants