refactor(optimizer): refactor logicalAgg::create #3533
Codecov Report
@@            Coverage Diff             @@
##             main    #3533      +/-   ##
==========================================
+ Coverage   74.29%   74.30%   +0.01%
==========================================
  Files         773      773
  Lines      109399   109438      +39
==========================================
+ Hits        81273    81323      +50
+ Misses      28126    28115      -11
struct LogicalAggBuilder {
    /// the builder of the input Project
    input_proj_builder: LogicalProjectBuilder,
    /// the column [0..input_group_key_num) in the project's output is the group key
This comment may not be true when there are duplicates in group_exprs, and can cause a panic:
with t(v1) as (values (1), (2), (1)) select v1 from t group by v1, v1;
More details in a prior discussion: #1492 (comment)
(tbh I have not read the code yet. Just tested this case and got a panic. Maybe the root cause is another issue, as we should have added a regression test on duplicate group_exprs...)
That is because column pruning cannot handle a plan with duplicate group keys. I have fixed it and added some regression tests.
…r_logical_agg_prepare_for_distinct
LGTM
/// having clause.
struct LogicalAggBuilder {
    /// the builder of the input Project
    input_proj_builder: LogicalProjectBuilder,
Suggested change:
-    input_proj_builder: LogicalProjectBuilder,
+    proj_builder: LogicalProjectBuilder,
    /// the builder of the input Project
    input_proj_builder: LogicalProjectBuilder,
    /// the column [0..input_group_key_num) in the project's output is the group key
    input_group_key_num: usize,
Suggested change:
-    input_group_key_num: usize,
+    group_key_num: usize,
Co-authored-by: Kaige Li <55606560+likg227@users.noreply.github.com>
@mergify requeue
❌ This pull request head commit has not been previously disembarked from queue.
LogicalProject { exprs: [$1, $2, $0, $3] }
  LogicalAgg { group_keys: [0, 1, 1], agg_calls: [min($2), max($2)] }
Still incorrect. The proj exprs should be [$1, $3, $0, $4], or we need to dedup group_keys to [0, 1].
Repro:
create table t(v1 int, v2 int, v3 int) with ('appendonly' = false);
insert into t values (1, 2, 3);
select v2, min(v1) as min_v1, v3, max(v1) as max_v1 from t group by v3, v2, v2;
Actual incorrect output:
v2 | min_v1 | v3 | max_v1
----+--------+----+--------
2 | 2 | 3 | 1
(1 row)
Or even panic:
explain select v2, sum(v1) from t group by v3, v2, v2;
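The index arithmetic behind this review comment can be sketched in a few lines of Rust. This is only an illustration, not the planner's code: it assumes the aggregation's output schema is all group key columns (duplicates included) followed by the aggregate calls, which is what the expected `[$1, $3, $0, $4]` implies. `SelectItem`, `dedup_group_keys`, and `project_exprs` are hypothetical names introduced for this sketch.

```rust
/// One item of the SELECT list, expressed against the aggregation.
/// Hypothetical type used only in this sketch.
#[derive(Clone, Copy)]
enum SelectItem {
    /// The i-th expression of the GROUP BY list as written (duplicates included).
    GroupKey(usize),
    /// The i-th aggregate call.
    AggCall(usize),
}

/// Deduplicates group keys, keeping first-occurrence order.
/// Returns the deduplicated keys and, for each original position,
/// its position in the deduplicated list.
fn dedup_group_keys(group_keys: &[usize]) -> (Vec<usize>, Vec<usize>) {
    let mut deduped: Vec<usize> = Vec::new();
    let mut mapping = Vec::with_capacity(group_keys.len());
    for &key in group_keys {
        let pos = match deduped.iter().position(|&k| k == key) {
            Some(pos) => pos,
            None => {
                deduped.push(key);
                deduped.len() - 1
            }
        };
        mapping.push(pos);
    }
    (deduped, mapping)
}

/// Computes the projection on top of an aggregation whose output is assumed
/// to be `[group key columns..., agg calls...]`.
/// `key_positions[i]` is the output column holding the i-th GROUP BY expression,
/// and `num_key_cols` is how many group key columns the aggregation emits.
fn project_exprs(
    items: &[SelectItem],
    key_positions: &[usize],
    num_key_cols: usize,
) -> Vec<usize> {
    items
        .iter()
        .map(|item| match *item {
            SelectItem::GroupKey(i) => key_positions[i],
            SelectItem::AggCall(i) => num_key_cols + i,
        })
        .collect()
}
```

For the repro query, the select list `v2, min(v1), v3, max(v1)` over `group by v3, v2, v2` is `[GroupKey(1), AggCall(0), GroupKey(0), AggCall(1)]`. Keeping all three group keys (`key_positions = [0, 1, 2]`, `num_key_cols = 3`) gives `[1, 3, 0, 4]`. Deduplicating first gives keys `[0, 1]` with mapping `[0, 1, 1]`, and the projection becomes `[1, 2, 0, 3]`. The plan in question pairs the deduplicated projection with the non-deduplicated `group_keys`, which is the mismatch being reported.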
* add logical project builder for the expression dedup
* refactor logicalAgg::create with logical builder
* add clippy fix
* add comments
* add planner test for dup group key
* fix when exist duplicate group key
* fix prune col for logical agg with dup group key
* conflict
* clippy fix
* Update src/frontend/src/optimizer/plan_node/logical_project.rs
Co-authored-by: Kaige Li <55606560+likg227@users.noreply.github.com>
Co-authored-by: Alex Chi <iskyzh@gmail.com>
Co-authored-by: mergify[bot] <37929162+mergify[bot]@users.noreply.github.com>
I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.
What's changed and what's your intention?
* LogicalProjectBuilder: dedups expressions and builds a project
Checklist
* ./risedev check (or alias, ./risedev c)
btw, it was for distinct agg rewriting, but now I think maybe we can do the distinct rewriting as a rule?
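As a rough illustration of what an expression-deduplicating project builder does, here is a minimal sketch. It is not the `LogicalProjectBuilder` from this PR: the type `ProjectBuilderSketch` and its methods are hypothetical, and plain strings stand in for the planner's expression type.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an expression-deduplicating project builder.
/// Strings play the role of expressions in this sketch.
#[derive(Default)]
struct ProjectBuilderSketch {
    /// Distinct expressions, in the order they were first added.
    exprs: Vec<String>,
    /// Output column index of each expression already added.
    index_of: HashMap<String, usize>,
}

impl ProjectBuilderSketch {
    /// Adds an expression and returns its output column index.
    /// Adding an expression that is already present returns the existing index.
    fn add_expr(&mut self, expr: &str) -> usize {
        if let Some(&idx) = self.index_of.get(expr) {
            return idx;
        }
        let idx = self.exprs.len();
        self.exprs.push(expr.to_string());
        self.index_of.insert(expr.to_string(), idx);
        idx
    }

    /// Consumes the builder and returns the projected expressions.
    fn build(self) -> Vec<String> {
        self.exprs
    }
}
```

In this sketch, adding `v3`, `v2`, `v2`, `v1` in that order returns the indices 0, 1, 1, 2 and builds a three-column project. Those are the same column references that appear in the plan discussed in the review (`group_keys: [0, 1, 1]`, aggregates over `$2`), so any code consuming the returned indices has to tolerate repeated group keys or deduplicate them itself.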