[SPARK-23207][SQL][FOLLOW-UP] Don't perform local sort for DataFrame.repartition(1) #20426

Closed
wants to merge 2 commits into from


jiangxb1987
Contributor

What changes were proposed in this pull request?

In ShuffleExchangeExec, we don't need to insert an extra local sort before round-robin partitioning if the new partitioning has only one partition, because in that case all output rows go to the same partition.
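
To illustrate the reasoning, here is a minimal sketch in plain Python, not Spark's actual implementation. It assumes a simplified round-robin that always starts at partition 0, and the names `partition_contents` and `needs_local_sort` are hypothetical helpers invented for this example. It shows that the input row order changes which rows land in which partition only when there is more than one partition, which is the only situation where sorting first could matter.

```python
from typing import List, Sequence


def partition_contents(rows: Sequence[int], num_partitions: int) -> List[List[int]]:
    """Assign rows to partitions round-robin, in input order.

    Simplifying assumption: assignment starts at partition 0.
    Each partition's rows are sorted on return so that two results
    can be compared by content, ignoring order within a partition.
    """
    parts: List[List[int]] = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        parts[i % num_partitions].append(row)
    return [sorted(p) for p in parts]


def needs_local_sort(num_partitions: int) -> bool:
    """Hypothetical check: row order can only affect the result of
    round-robin assignment when there are at least two partitions."""
    return num_partitions > 1


# The same rows arriving in two different orders, e.g. from a re-run task.
first_attempt = [3, 1, 2]
second_attempt = [1, 2, 3]

# With two partitions, the contents of each partition depend on input order.
differs_with_two = (
    partition_contents(first_attempt, 2) != partition_contents(second_attempt, 2)
)

# With one partition, every row goes to partition 0 regardless of order.
same_with_one = (
    partition_contents(first_attempt, 1) == partition_contents(second_attempt, 1)
)
```

In this model, `differs_with_two` and `same_with_one` are both true, so skipping the sort when `needs_local_sort(1)` is false does not change which rows each output partition receives.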

How was this patch tested?

The existing test cases.

@SparkQA

SparkQA commented Jan 29, 2018

Test build #86778 has finished for PR 20426 at commit 2c9ee7f.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Jan 30, 2018

Test build #86785 has finished for PR 20426 at commit 5fa8de3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Contributor

LGTM, merging to master/2.3!

asfgit pushed a commit that referenced this pull request Jan 30, 2018
…repartition(1)

## What changes were proposed in this pull request?

In `ShuffleExchangeExec`, we don't need to insert an extra local sort before round-robin partitioning if the new partitioning has only one partition, because in that case all output rows go to the same partition.

## How was this patch tested?

The existing test cases.

Author: Xingbo Jiang <xingbo.jiang@databricks.com>

Closes #20426 from jiangxb1987/repartition1.

(cherry picked from commit b375397)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@asfgit asfgit closed this in b375397 Jan 30, 2018
bersprockets pushed a commit to bersprockets/spark that referenced this pull request Aug 13, 2018
…repartition(1)
