[WIP][SPARK-34847][SQL] Simplify ResolveAggregateFunctions #31947

Closed
cloud-fan wants to merge 1 commit into apache:master from cloud-fan:agg

@cloud-fan
Contributor

What changes were proposed in this pull request?

The current ResolveAggregateFunctions is very complicated. It recursively calls the entire analyzer, and has duplicated code for Filter and Sort.

This PR simplifies ResolveAggregateFunctions and just resolves the Filter condition/Sort ordering with Aggregate, instead of running the entire analyzer again. It also unifies the code for Filter and Sort.
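As an illustration of what "resolving the Filter condition/Sort ordering with Aggregate" can mean, here is a toy model in plain Python. It is not Spark code and does not reflect the actual implementation in this PR. The `Aggregate` class and the helpers `with_extra_aggregates`, `having`, and `order_by` are hypothetical names invented for this sketch. The idea shown: when a HAVING or ORDER BY clause needs an aggregate that is not in the select list, add it to the aggregate as a hidden output, evaluate the clause, then project the hidden column away. Both clauses share one helper, loosely mirroring the unification of the Filter and Sort paths.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Toy aggregate functions; each takes the list of values in one group.
AGG_FUNCS = {"sum": sum, "max": max, "min": min, "count": len}

AggExpr = Tuple[str, str]  # (function name, input column), e.g. ("sum", "b")


@dataclass
class Aggregate:
    """Hypothetical stand-in for an aggregate plan node (not Spark's class)."""
    group_by: str
    outputs: List[AggExpr]


def run_aggregate(agg: Aggregate, rows: List[Dict]) -> List[Dict]:
    """Group rows by agg.group_by and compute every aggregate output."""
    groups: Dict = {}
    for row in rows:
        groups.setdefault(row[agg.group_by], []).append(row)
    result = []
    for key, members in groups.items():
        out = {agg.group_by: key}
        for func, col in agg.outputs:
            out[f"{func}({col})"] = AGG_FUNCS[func]([m[col] for m in members])
        result.append(out)
    return result


def with_extra_aggregates(agg: Aggregate, needed: List[AggExpr]):
    """Shared step: add missing aggregates, remember the visible columns."""
    visible = [agg.group_by] + [f"{f}({c})" for f, c in agg.outputs]
    extra = [e for e in needed if e not in agg.outputs]
    return Aggregate(agg.group_by, agg.outputs + extra), visible


def having(agg, rows, needed: List[AggExpr], predicate: Callable[[Dict], bool]):
    """Filter on top of the aggregate, then drop the hidden columns."""
    extended, visible = with_extra_aggregates(agg, needed)
    kept = [r for r in run_aggregate(extended, rows) if predicate(r)]
    return [{k: r[k] for k in visible} for r in kept]


def order_by(agg, rows, needed: List[AggExpr], key: Callable[[Dict], object]):
    """Sort on top of the aggregate, then drop the hidden columns."""
    extended, visible = with_extra_aggregates(agg, needed)
    ordered = sorted(run_aggregate(extended, rows), key=key)
    return [{k: r[k] for k in visible} for r in ordered]


rows = [{"a": 1, "b": 10}, {"a": 1, "b": 5}, {"a": 2, "b": 1}, {"a": 2, "b": 2}]
query = Aggregate("a", [("sum", "b")])  # SELECT a, sum(b) FROM t GROUP BY a

# ... HAVING max(b) > 3
filtered = having(query, rows, [("max", "b")], lambda r: r["max(b)"] > 3)
# ... ORDER BY min(b)
ordered = order_by(query, rows, [("min", "b")], lambda r: r["min(b)"])
```

In this toy run, `filtered` keeps only the group `a = 1`, and `ordered` lists `a = 2` before `a = 1`; in both cases the result rows expose only `a` and `sum(b)`.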

Why are the changes needed?

Code cleanup and faster query compilation.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing tests, plus a local run of TPCDSQuerySuite.

Before this PR

=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs: 409582
Total time: 41.358926831 seconds

After this PR

=== Metrics of Analyzer/Optimizer Rules ===
Total number of runs: 370784
Total time: 35.875695094 seconds
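
For a quick sense of scale, the relative savings implied by the two metric dumps above can be worked out directly. The snippet below is only arithmetic on the reported numbers; the variable names are made up for this sketch.

```python
# Figures copied from the "Before this PR" and "After this PR" metrics above.
before_runs, before_secs = 409582, 41.358926831
after_runs, after_secs = 370784, 35.875695094

runs_saved = before_runs - after_runs
secs_saved = before_secs - after_secs

print(f"rule runs:  -{runs_saved} ({runs_saved / before_runs:.1%})")
print(f"total time: -{secs_saved:.3f} s ({secs_saved / before_secs:.1%})")
# rule runs:  -38798 (9.5%)
# total time: -5.483 s (13.3%)
```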

SparkQA commented Mar 24, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41019/

SparkQA commented Mar 24, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41019/

SparkQA commented Mar 24, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41021/

SparkQA commented Mar 24, 2021

Test build #136435 has finished for PR 31947 at commit f79410b.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 24, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41021/

SparkQA commented Mar 24, 2021

Test build #136437 has finished for PR 31947 at commit 603f1b4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

cloud-fan changed the title from "[SPARK-34847][SQL] Simplify ResolveAggregateFunctions" to "[WIP][SPARK-34847][SQL] Simplify ResolveAggregateFunctions" on Mar 24, 2021

@cloud-fan
Contributor Author

The current solution doesn't work with subqueries; I need more time to investigate.

SparkQA commented Mar 24, 2021

Test build #136469 has finished for PR 31947 at commit 297d067.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Mar 24, 2021

Kubernetes integration test unable to build dist.

exiting with code: 1
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/41053/

cloud-fan closed this on Jun 15, 2021