[SPARK-37538][SQL] Replace single projection expand#34800

Closed
tanelk wants to merge 6 commits into apache:master from tanelk:ReplaceSingleProjectionExpand

@tanelk tanelk commented Dec 3, 2021

What changes were proposed in this pull request?

In the optimizer, replace every Expand that has only one projection with a Project.

Why are the changes needed?

Both grouping sets and distinct aggregations can create an Expand with only one projection. Replacing such an Expand can improve performance in two ways:

  • It enables optimization rules that cannot work with Expand.
  • It avoids unnecessary copying, because ExpandExec has needCopyResult: Boolean = true.
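
The rewrite described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from this PR: `Plan`, `Relation`, `Project`, `Expand` and `ReplaceSingleProjectionExpandSketch` are toy stand-ins for the Catalyst classes, and expressions are modeled as plain strings. It assumes that each expression of the single projection is aliased to the matching output name of the Expand.

```scala
// Toy stand-ins for Catalyst plan nodes (hypothetical, not Spark's real classes).
sealed trait Plan
case class Relation(name: String) extends Plan
case class Project(projectList: Seq[String], child: Plan) extends Plan
case class Expand(projections: Seq[Seq[String]], output: Seq[String], child: Plan) extends Plan

object ReplaceSingleProjectionExpandSketch {
  def apply(plan: Plan): Plan = plan match {
    // Exactly one projection: every input row yields exactly one output row,
    // so the Expand behaves like a Project over the same expressions.
    case Expand(Seq(single), output, child) =>
      val aliased = single.zip(output).map { case (expr, name) => s"$expr AS $name" }
      Project(aliased, apply(child))
    // Two or more projections: keep the Expand and only rewrite its child.
    case Expand(projections, output, child) =>
      Expand(projections, output, apply(child))
    case Project(projectList, child) =>
      Project(projectList, apply(child))
    case relation: Relation =>
      relation
  }
}

object SketchDemo extends App {
  val single = Expand(Seq(Seq("a", "b")), Seq("x", "y"), Relation("t"))
  val multi = Expand(Seq(Seq("a", "null"), Seq("null", "b")), Seq("x", "y"), Relation("t"))
  println(ReplaceSingleProjectionExpandSketch(single))
  println(ReplaceSingleProjectionExpandSketch(multi))
}
```

In this sketch an Expand with two or more projections is left untouched, since it emits several rows per input row and has no Project equivalent.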

Does this PR introduce any user-facing change?

No

How was this patch tested?

New unit test.

@tanelk tanelk changed the title [SPARK-37538] Replace single projection expand [SPARK-37538][SQL] Replace single projection expand Dec 3, 2021
@github-actions github-actions bot added the SQL label Dec 3, 2021

SparkQA commented Dec 3, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50382/

SparkQA commented Dec 3, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50382/

SparkQA commented Dec 3, 2021

Test build #145907 has finished for PR 34800 at commit 5567766.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

ReplaceSingleProjectionExpand) :: Nil
}

test("Replace single projection expand with project") {

Maybe let's add a JIRA number as the prefix.

SparkQA commented Dec 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50471/

SparkQA commented Dec 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50471/

SparkQA commented Dec 8, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50477/

SparkQA commented Dec 8, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/50477/

SparkQA commented Dec 8, 2021

Test build #145995 has finished for PR 34800 at commit 0e166aa.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

SparkQA commented Dec 8, 2021

Test build #146001 has finished for PR 34800 at commit fa36637.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@tanelk tanelk requested a review from HyukjinKwon December 9, 2021 06:24
# Conflicts:
#	sql/core/src/test/scala/org/apache/spark/sql/SQLQuerySuite.scala
@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label May 21, 2022
@github-actions github-actions bot closed this May 22, 2022