feat(batch): support string_agg with and without order by clause #3952

Merged
merged 21 commits into from Jul 18, 2022

@stdrc stdrc commented Jul 18, 2022

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR adds support for string_agg in the batch backend. The ORDER BY clause is supported, but NULLS {FIRST|LAST} is ignored for now.
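To make the intended behaviour concrete, here is a minimal Rust sketch of what string_agg computes with and without an order key. It is not the code from this PR: the function name `string_agg_sketch`, the `(value, order_key)` row shape, and the `ordered` flag are hypothetical, and the NULL handling (skip NULL inputs, return NULL when nothing remains) is assumed to follow PostgreSQL's string_agg.

```rust
/// Hypothetical helper, not RisingWave's API: aggregates `(value, order_key)` rows.
/// Assumes PostgreSQL-like semantics for NULL inputs.
fn string_agg_sketch(
    rows: &[(Option<&str>, i64)],
    delimiter: &str,
    ordered: bool,
) -> Option<String> {
    // Skip NULL values and keep each remaining value with its order key.
    let mut kept: Vec<(&str, i64)> = rows
        .iter()
        .filter_map(|(value, key)| value.map(|s| (s, *key)))
        .collect();
    if ordered {
        // A stable sort keeps the input order among rows with equal keys.
        kept.sort_by_key(|(_, key)| *key);
    }
    if kept.is_empty() {
        // No non-NULL input: the aggregate is NULL.
        return None;
    }
    let parts: Vec<&str> = kept.iter().map(|(s, _)| *s).collect();
    Some(parts.join(delimiter))
}
```

Without an order key the result simply follows input order, which is why the ordered and unordered cases can be handled by separate state.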

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Documentation

If your pull request contains user-facing changes, please specify the types of the changes, and create a release note. Otherwise, please feel free to remove this section.

Types of user-facing changes

Please keep the types that apply to your changes, and remove those that do not apply.

  • SQL commands, functions, and operators

Release note

Please create a release note for your changes. In the release note, focus on the impact on users, and mention the environment or conditions where the impact may occur.

  • Support string_agg function in batch mode

Refer to a related PR or issue link (optional)

#3838

@stdrc stdrc linked an issue Jul 18, 2022 that may be closed by this pull request
@stdrc stdrc marked this pull request as draft July 18, 2022 06:26
codecov bot commented Jul 18, 2022

Codecov Report

Merging #3952 (2fca466) into main (c832057) will increase coverage by 0.04%.
The diff coverage is 87.27%.

@@            Coverage Diff             @@
##             main    #3952      +/-   ##
==========================================
+ Coverage   73.81%   73.85%   +0.04%     
==========================================
  Files         821      822       +1     
  Lines      116045   116447     +402     
==========================================
+ Hits        85662    86006     +344     
- Misses      30383    30441      +58     
Flag Coverage Δ
rust 73.85% <87.27%> (+0.04%) ⬆️


Impacted Files Coverage Δ
src/expr/src/lib.rs 100.00% <ø> (ø)
...c/expr/src/vector_op/agg/general_sorted_grouper.rs 86.29% <ø> (ø)
src/batch/src/executor/order_by.rs 82.75% <33.33%> (ø)
src/expr/src/vector_op/agg/aggregator.rs 66.82% <40.90%> (-3.12%) ⬇️
...rc/frontend/src/optimizer/plan_node/logical_agg.rs 91.88% <55.00%> (-0.77%) ⬇️
src/expr/src/vector_op/agg/string_agg.rs 82.72% <82.72%> (ø)
src/batch/src/executor/hash_agg.rs 90.04% <100.00%> (+0.08%) ⬆️
src/batch/src/executor/sort_agg.rs 93.12% <100.00%> (+0.05%) ⬆️
src/common/src/util/encoding_for_comparison.rs 99.00% <100.00%> (+2.85%) ⬆️
src/common/src/util/sort_util.rs 88.99% <100.00%> (+9.32%) ⬆️
... and 11 more


@stdrc stdrc marked this pull request as ready for review July 18, 2022 06:35

@TennyZhuang TennyZhuang left a comment

LGTM

@stdrc stdrc added the mergify/can-merge Indicates that the PR can be added to the merge queue label Jul 18, 2022
@mergify mergify bot merged commit 2923159 into main Jul 18, 2022
@mergify mergify bot deleted the rc/support-batch-string-agg branch July 18, 2022 09:07
nasnoisaac pushed a commit to nasnoisaac/risingwave that referenced this pull request Aug 9, 2022
…isingwavelabs#3952)

* pass order by clause to backend

* support basic string_agg without order by

* handle string_agg when generating 2-phase agg

* support order by in string_agg

* convert `Vec<OrderByField>` to `Vec<OrderPair>` while building `AggStateFactory`

* add StringAggState to handle string agg w/ and w/o order by separately

* little update to todo comment

* make functions in `encoding_for_comparison` consistent with `sort_util`

* don't care about how to store the encoded chunk in `encode_chunk`

* encode order keys to speed up comparison

* remove useless logs

* assert that index in range before accessing element

* remove unnecessary `EncodedColumn` struct

* add unittests for `encode_chunk` and `encode_row`

* adjust arguments order to make things consistent

* add unittests for `compare_rows` and `compare_rows_in_chunk`

* fix clippy check

* add unittest for string agg

* add e2e tests for string_agg

* replace `into_value_at` with `std::mem::take`

Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
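
The commit "encode order keys to speed up comparison" refers to a common technique: turn each row's sort key into bytes once, so that later comparisons are plain byte comparisons. The sketch below illustrates the idea for `i64` columns only. It is an assumption-laden illustration, not the code in `encoding_for_comparison`: the `Direction` enum and the `encode_order_key` function are hypothetical names, and NULL handling is left out.

```rust
/// Hypothetical sort direction for one order-by column.
#[derive(Clone, Copy)]
enum Direction {
    Asc,
    Desc,
}

/// Hypothetical encoder: turns one row of i64 order keys into bytes whose
/// lexicographic order matches the requested multi-column ordering.
fn encode_order_key(row: &[i64], dirs: &[Direction]) -> Vec<u8> {
    assert_eq!(row.len(), dirs.len());
    let mut out = Vec::with_capacity(row.len() * 8);
    for (value, dir) in row.iter().zip(dirs) {
        // Flipping the sign bit makes the big-endian bytes sort like signed integers.
        let mut bytes = ((*value as u64) ^ (1u64 << 63)).to_be_bytes();
        if matches!(dir, Direction::Desc) {
            // Inverting every bit reverses the order for a descending column.
            for b in bytes.iter_mut() {
                *b = !*b;
            }
        }
        out.extend_from_slice(&bytes);
    }
    out
}
```

Once rows are encoded this way, sorting them only needs `Vec<u8>` comparison, with no per-column type dispatch or direction check.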
Labels
mergify/can-merge Indicates that the PR can be added to the merge queue type/feature
Development

Successfully merging this pull request may close these issues.

batch: support string_agg with and without order by clause
3 participants