Skip to content

[SPARK-31134][SQL] optimize skew join after shuffle partitions are coalesced#27893

Closed
cloud-fan wants to merge 5 commits intoapache:masterfrom
cloud-fan:aqe
Closed

[SPARK-31134][SQL] optimize skew join after shuffle partitions are coalesced#27893
cloud-fan wants to merge 5 commits intoapache:masterfrom
cloud-fan:aqe

Conversation

@cloud-fan
Copy link
Contributor

What changes were proposed in this pull request?

Run the OptimizeSkewedJoin rule after the CoalesceShufflePartitions rule.

Why are the changes needed?

Remove duplicated coalescing code in OptimizeSkewedJoin.

Does this PR introduce any user-facing change?

No

How was this patch tested?

existing tests

@cloud-fan
Copy link
Contributor Author

cc @maryannxue @JkSelf

@SparkQA
Copy link

SparkQA commented Mar 12, 2020

Test build #119721 has finished for PR 27893 at commit 4b8a195.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val sizes = mapStats.bytesByPartitionId
val partitions = partitionSpecs.map {
case spec @ CoalescedPartitionSpec(start, end) =>
var sum = 0L
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: sizes.slice(start, end).sum?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

slice will create a new array, which is less efficient.

val mapStartIndices = getMapStartIndices(left, partitionIndex, leftTargetSize)
if (mapStartIndices.length > 1) {
val CoalescedPartitionSpec(start, end) = left.partitions(partitionIndex)._1
assert(start + 1 == end, "coalesced partition should never be skewed.")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First of all, our factor check is not strict enough as being "> 0", what happens here if it's set to "1"?
Second, the assert is usually disabled in production, which could lead to errors later in this code.
We should probably make it more robust by putting this condition into isSkew. And you can still add such an assertion in isSkew implementation.

val rightSize = rightStats.bytesByPartitionId(partitionIndex)
val rightSize = rightSizes(partitionIndex)
val isRightSkew = isSkewed(rightSize, rightMedSize) && canSplitRight
if (isLeftSkew || isRightSkew) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not related to this PR, but I think we can remove this outer if now.

val rightParts = if (isRightSkew) {
val mapStartIndices = getMapStartIndices(right, partitionIndex, rightTargetSize)
if (mapStartIndices.length > 1) {
val CoalescedPartitionSpec(start, end) = right.partitions(partitionIndex)._1
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code in the calculation of leftParts and rightParts is almost same. It is better to wrap the code in a method.

@cloud-fan cloud-fan force-pushed the aqe branch 2 times, most recently from 5671ce2 to d7e55e8 Compare March 16, 2020 06:48
@SparkQA
Copy link

SparkQA commented Mar 16, 2020

Test build #119840 has finished for PR 27893 at commit a66933b.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 16, 2020

Test build #119837 has finished for PR 27893 at commit fbf616c.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 16, 2020

Test build #119839 has finished for PR 27893 at commit d7e55e8.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented Mar 16, 2020

Test build #119841 has finished for PR 27893 at commit a66933b.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val isRightCoalesced = rightPartSpec.startReducerIndex + 1 < rightPartSpec.endReducerIndex

// Ideally a skewed partition won't get coalesced, but skip it here for safety.
val leftParts = if (isLeftSkew && !isLeftCoalesced) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JkSelf I tried to create a common method to handle both sides, but the method takes too many parameters so I give up. Besides, it's not much duplicated code here.

val rightPartSpec = right.partitionsWithSizes(partitionIndex)._1
val isRightCoalesced = rightPartSpec.startReducerIndex + 1 < rightPartSpec.endReducerIndex

// Ideally a skewed partition won't get coalesced, but skip it here for safety.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: A skewed partition should never be coalesced, but skip it here just to be safe.

Copy link
Contributor

@maryannxue maryannxue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@SparkQA
Copy link

SparkQA commented Mar 16, 2020

Test build #119878 has finished for PR 27893 at commit ba07313.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 16, 2020

Test build #119881 has finished for PR 27893 at commit 89ab703.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Mar 16, 2020

Test build #119883 has finished for PR 27893 at commit c67590a.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

val rightPartSpec = right.partitionsWithSizes(partitionIndex)._1
val isRightCoalesced = rightPartSpec.startReducerIndex + 1 < rightPartSpec.endReducerIndex

// A skewed partition should never be coalesced, but skip it here just to be safe.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Say we have original map output: 100, 10, 2000, and the coalesce target is 100. So, after CoalesceShufflePartitions, we shall have CoalescedPartitionSpec(0, 1) and CoalescedPartitionSpec(1, 3). Then, we start to apply OptimizeSkewedJoin where CoalescedPartitionSpec(1, 3) is obviously skewed but can be missed. Right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think the coalesce rule will coalesce 10 and 2000, can you double check?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh yeah, I checked, you're right!

@Ngone51
Copy link
Member

Ngone51 commented Mar 17, 2020

LGTM

@gatorsmile
Copy link
Member

Thanks! Merged to master/3.0

gatorsmile pushed a commit that referenced this pull request Mar 17, 2020
…alesced

### What changes were proposed in this pull request?

Run the `OptimizeSkewedJoin` rule after the `CoalesceShufflePartitions` rule.

### Why are the changes needed?

Remove duplicated coalescing code in `OptimizeSkewedJoin`.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

existing tests

Closes #27893 from cloud-fan/aqe.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
(cherry picked from commit 30d9535)
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
sjincho pushed a commit to sjincho/spark that referenced this pull request Apr 15, 2020
…alesced

### What changes were proposed in this pull request?

Run the `OptimizeSkewedJoin` rule after the `CoalesceShufflePartitions` rule.

### Why are the changes needed?

Remove duplicated coalescing code in `OptimizeSkewedJoin`.

### Does this PR introduce any user-facing change?

No

### How was this patch tested?

existing tests

Closes apache#27893 from cloud-fan/aqe.

Authored-by: Wenchen Fan <wenchen@databricks.com>
Signed-off-by: gatorsmile <gatorsmile@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants