feat(streaming): enable consistent hash for hash agg #3454
Signed-off-by: Bugen Zhao <i@bugenzhao.com>
@@ -129,21 +129,22 @@ impl<S: StateStore, SER: RowSerializer> CellBasedTableBase<S, SER, READ_ONLY> {
    /// set of `column_ids`. The output will only contains columns with the given ids in the same
    /// order.
    /// This is parameterized on cell based row serializer.
    // TODO: allow specifying the distribution keys and vnodes.
`CellBasedTable` is directly constructed only in batch query executor or seq scan. We leave them untouched, while specifying vnodes and dist keys for streaming-purpose `StateTable` is implemented in this PR.
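To make the vnode and dist key idea in this comment concrete, here is a minimal sketch of a table that checks every written row against the vnodes it owns. It is not the code from this PR: `VNODE_COUNT`, `vnode_of`, and `ShardedTable` are hypothetical names, the rows are plain `i64` columns, and the standard library's `DefaultHasher` stands in for whatever hash the hash dispatcher really uses.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical number of virtual nodes (assumption; not stated in this thread).
const VNODE_COUNT: usize = 256;

/// Map the distribution-key columns of a row to a virtual node.
/// Only the columns listed in `dist_key_indices` take part in the hash.
fn vnode_of(row: &[i64], dist_key_indices: &[usize]) -> usize {
    let mut hasher = DefaultHasher::new();
    for &i in dist_key_indices {
        row[i].hash(&mut hasher);
    }
    (hasher.finish() % VNODE_COUNT as u64) as usize
}

/// Hypothetical stand-in for a streaming state table that knows its shard.
struct ShardedTable {
    dist_key_indices: Vec<usize>,
    /// Bitmap of length `VNODE_COUNT`: `true` means this table owns the vnode.
    owned_vnodes: Vec<bool>,
    rows: Vec<Vec<i64>>,
}

impl ShardedTable {
    fn new(dist_key_indices: Vec<usize>, owned_vnodes: Vec<bool>) -> Self {
        assert_eq!(owned_vnodes.len(), VNODE_COUNT);
        Self { dist_key_indices, owned_vnodes, rows: Vec::new() }
    }

    /// Insert a row and return its vnode, or `Err(vnode)` if the row hashes
    /// to a vnode this table does not own (i.e. the upstream dispatcher
    /// routed it to the wrong shard).
    fn insert(&mut self, row: Vec<i64>) -> Result<usize, usize> {
        let vnode = vnode_of(&row, &self.dist_key_indices);
        if !self.owned_vnodes[vnode] {
            return Err(vnode);
        }
        self.rows.push(row);
        Ok(vnode)
    }
}
```

With the group key as the distribution key, two rows that share a group key always land on the same vnode, so a hash agg instance only ever sees state for the vnodes it owns.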
Codecov Report
@@ Coverage Diff @@
## main #3454 +/- ##
==========================================
+ Coverage 74.44% 74.45% +0.01%
==========================================
Files 770 770
Lines 108409 108455 +46
==========================================
+ Hits 80703 80749 +46
Misses 27706 27706
LGTM!
LSTM!
Hey @BugenZhao, this pull request failed to merge and has been dequeued from the merge train. If you believe your PR failed in the merge train because of a flaky test, requeue it by commenting with |
@Mergifyio requeue
❌ This pull request head commit has not been previously disembarked from queue.
* allow passing vnodes
* enable distribution for hash agg
* fix datum hash
* debug
* clean up
* fix state table generation
* remove logs and refine docs
* refine fallback vnodes doc

Signed-off-by: Bugen Zhao <i@bugenzhao.com>
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR enables consistent hash for hash agg. The assertions in `CellBasedTable` show that we've correctly sharded data with hash dispatcher.

Checklist
* `./risedev check` (or alias, `./risedev c`)

Refer to a related PR or issue link (optional)
* `StreamNode::HashAgg` should be group key #3456
* `Hash` implementation on `Datum` is not consistent with `Array::hash_at` #3457
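For context on why the second linked issue matters for consistent hashing: the dispatcher may hash a key column-wise while a table hashes it row-wise, and both paths have to yield the same value or rows fail the vnode check. The sketch below only illustrates that class of mismatch and is not the actual bug or fix from #3457. `Datum` is simplified to `Option<i32>`, and `I32Array`, `hash_datum_derived`, `hash_datum_consistent`, and `finish_with` are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Simplified stand-in for a nullable scalar value (assumption).
type Datum = Option<i32>;

/// Simplified stand-in for a columnar array with a validity bitmap (assumption).
struct I32Array {
    values: Vec<i32>,
    valid: Vec<bool>,
}

impl I32Array {
    /// Column-wise hashing: feed the raw value, or a single null marker byte.
    fn hash_at<H: Hasher>(&self, idx: usize, state: &mut H) {
        if self.valid[idx] {
            self.values[idx].hash(state);
        } else {
            state.write_u8(0);
        }
    }
}

/// Row-wise hashing through the derived `Hash` of `Option`. This also feeds
/// the enum discriminant into the hasher, so in general it does not match
/// `I32Array::hash_at` for the same logical value.
fn hash_datum_derived<H: Hasher>(datum: &Datum, state: &mut H) {
    datum.hash(state);
}

/// Row-wise hashing written to mirror `I32Array::hash_at` byte for byte.
fn hash_datum_consistent<H: Hasher>(datum: &Datum, state: &mut H) {
    match datum {
        Some(v) => v.hash(state),
        None => state.write_u8(0),
    }
}

/// Run `f` against a fresh hasher and return the finished hash value.
fn finish_with(f: impl FnOnce(&mut DefaultHasher)) -> u64 {
    let mut hasher = DefaultHasher::new();
    f(&mut hasher);
    hasher.finish()
}
```

Comparing `finish_with(|h| arr.hash_at(i, h))` with `finish_with(|h| hash_datum_consistent(&row[i], h))` gives equal values for every position, which is the property a vnode assertion relies on. Swapping in `hash_datum_derived` breaks that guarantee.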