-
Notifications
You must be signed in to change notification settings - Fork 1.8k
Add support for FIRST_VALUE, LAST_VALUE Aggregate Functions #6445
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
# Conflicts: # datafusion/core/tests/sqllogictests/test_files/aggregate.slt
# Conflicts: # datafusion/core/src/physical_plan/planner.rs # datafusion/core/tests/sqllogictests/test_files/aggregate.slt # datafusion/expr/src/expr.rs # datafusion/expr/src/tree_node/expr.rs # datafusion/expr/src/udaf.rs # datafusion/optimizer/src/analyzer/type_coercion.rs # datafusion/optimizer/src/common_subexpr_eliminate.rs # datafusion/proto/src/logical_plan/from_proto.rs # datafusion/proto/src/logical_plan/mod.rs # datafusion/proto/src/logical_plan/to_proto.rs # datafusion/sql/src/expr/function.rs # datafusion/sql/src/utils.rs
…rst_last_aggregate # Conflicts: # datafusion/expr/Cargo.toml
# Conflicts: # datafusion-cli/Cargo.lock # datafusion/common/Cargo.toml # datafusion/core/Cargo.toml # datafusion/expr/Cargo.toml # datafusion/sql/Cargo.toml
|
Note that this change makes Having said that, there are few minor issues here that we will fix shortly. I think the example in the PR body should read something like: SELECT a, FIRST_VALUE(c ORDER BY b DESC) as first_c
FROM table
GROUP BY aThis will return the first Also some tests are failing. We will fix the tests, make the PR body clearer and possibly add another test that looks like this. |
|
Marking as draft while the PR is worked on |
# Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit.
# Please enter a commit message to explain why this merge is necessary, # especially if it merges an updated upstream into a topic branch. # # Lines starting with '#' will be ignored, and an empty message aborts # the commit.
| "+----------------------+", | ||
| "| 10 |", | ||
| "+----------------------+", | ||
| "+-----------------------+", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As part of this PR, I have added convert_camel_to_upper_snake util for displaying Aggregate functions. Their display name is automatically, calculated from struct name. Hence display name of some of the existing aggregators is changed(such as APPROXMEDIAN became APPROX_MEDIAN, ARRAYAGG became ARRAY_AGG, etc).
I have resolved the CI issue, also I have updated Pr body to reflect intent better. This PR is ready for further review. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you @mustafasrepo -- I think the feature looks great and really nicely coded 👌 .
I think a different approach for display name that is not so tightly bound to the Rust struct name would be better, but that is a stylistic opinion and something we can fix in a follow on PR. I left some more specific suggestions inline
| "+-------------------------------------------+", | ||
| "| 10 |", | ||
| "+-------------------------------------------+", | ||
| "+---------------------------------------------+", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is nice -- it makes the aggregates consistent with the (very) recent change from @2010YOUY01 for scalar functions: #6448
|
Thanks again @mustafasrepo ! |
| BoolOr, | ||
| } | ||
|
|
||
| impl AggregateFunction { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Which issue does this PR close?
Closes #6444.
Rationale for this change
What changes are included in this PR?
This PR adds support for
FIRST_VALUE,LAST_VALUEordering sensitive aggregation functions. With the changes in this PR we now can run query belowThis query will return the first c value, according to the reverse b order, for every a group.
Are these changes tested?
Yes, new tests are added to the
groupby.sltfile for new aggregate functions.Are there any user-facing changes?