Skip to content

Conversation

@isidentical
Copy link
Contributor

@isidentical isidentical commented Aug 27, 2022

Which issue does this PR close?

Closes #2360.

Rationale for this change

It seems like we used to only collect column references from the actual aggregate expressions, not the grouped expressions. This PR changes the processing of rewrite_sort_col_by_aggs to also include that fact.

What changes are included in this PR?

Expression rewriter inside rewrite_sort_col_by_aggs now considers both distinct grouping expressions (e.g. GROUP BY <something>) as well as regular aggregate expressions when looking for a rewrite.

This PR also includes another change which corrects the behaviour of add_missing_columns when it is re-building the Projection. We used to ignore applying the alias to the field qualifiers in the schema, but with this patch that is also fixed (needed by the test sql::explain_analyze::explain_analyze_baseline_metrics).

Are there any user-facing changes?

This is a bug fix.

@github-actions github-actions bot added core Core DataFusion crate logical-expr Logical plan and expressions labels Aug 27, 2022
@codecov-commenter
Copy link

Codecov Report

Merging #3280 (6e6b674) into master (eedc787) will increase coverage by 0.07%.
The diff coverage is 96.87%.

@@            Coverage Diff             @@
##           master    #3280      +/-   ##
==========================================
+ Coverage   85.84%   85.92%   +0.07%     
==========================================
  Files         291      294       +3     
  Lines       52981    53470     +489     
==========================================
+ Hits        45484    45946     +462     
- Misses       7497     7524      +27     
Impacted Files Coverage Δ
datafusion/expr/src/expr_rewriter.rs 86.26% <80.00%> (+0.17%) ⬆️
datafusion/core/tests/sql/group_by.rs 96.91% <100.00%> (+0.27%) ⬆️
datafusion/expr/src/logical_plan/builder.rs 90.35% <100.00%> (ø)
datafusion/core/tests/provider_filter_pushdown.rs 70.45% <0.00%> (-14.48%) ⬇️
datafusion/core/src/physical_plan/metrics/value.rs 86.93% <0.00%> (-0.51%) ⬇️
datafusion/core/src/datasource/view.rs 86.54% <0.00%> (-0.08%) ⬇️
datafusion/expr/src/function.rs 97.75% <0.00%> (ø)
datafusion/core/src/datasource/file_format/csv.rs 98.91% <0.00%> (ø)
datafusion/physical-expr/src/expressions/column.rs 100.00% <0.00%> (ø)
...fusion/optimizer/src/pre_cast_lit_in_comparison.rs 91.95% <0.00%> (ø)
... and 16 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@isidentical isidentical marked this pull request as ready for review August 27, 2022 20:14
Copy link
Contributor

@Dandandan Dandandan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great to me

Copy link
Member

@andygrove andygrove left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@andygrove andygrove merged commit 256ea91 into apache:master Aug 30, 2022
@ursabot
Copy link

ursabot commented Aug 30, 2022

Benchmark runs are scheduled for baseline = 7dedcb1 and contender = 256ea91. 256ea91 is a master commit associated with this PR. Results will be available as each benchmark for each run completes.
Conbench compare runs links:
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ec2-t3-xlarge-us-east-2] ec2-t3-xlarge-us-east-2
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on test-mac-arm] test-mac-arm
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-i9-9960x] ursa-i9-9960x
[Skipped ⚠️ Benchmarking of arrow-datafusion-commits is not supported on ursa-thinkcentre-m75q] ursa-thinkcentre-m75q
Buildkite builds:
Supported benchmarks:
ec2-t3-xlarge-us-east-2: Supported benchmark langs: Python, R. Runs only benchmarks with cloud = True
test-mac-arm: Supported benchmark langs: C++, Python, R
ursa-i9-9960x: Supported benchmark langs: Python, R, JavaScript
ursa-thinkcentre-m75q: Supported benchmark langs: C++, Java

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

core Core DataFusion crate logical-expr Logical plan and expressions

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cannot have order by expression that references complex group by expression

5 participants