Skip to content

[SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments#27253

Closed
JkSelf wants to merge 4 commits intoapache:masterfrom
JkSelf:followup-skewjoinoptimization2
Closed

[SPARK-30524] [SQL] follow up SPARK-30524 to resolve comments#27253
JkSelf wants to merge 4 commits intoapache:masterfrom
JkSelf:followup-skewjoinoptimization2

Conversation

@JkSelf
Copy link

@JkSelf JkSelf commented Jan 17, 2020

What changes were proposed in this pull request?

Resolve the remaining comments in PR#27226.

Why are the changes needed?

Resolve the comments.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Existing unit tests.

@JkSelf
Copy link
Author

JkSelf commented Jan 17, 2020

@cloud-fan @maryannxue help to review this PR. Thanks.

@SparkQA
Copy link

SparkQA commented Jan 17, 2020

Test build #116908 has finished for PR 27253 at commit 20e6e5e.

  • This patch fails Scala style tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 17, 2020

Test build #116910 has finished for PR 27253 at commit 93367a7.

  • This patch fails due to an unknown error code, -9.
  • This patch merges cleanly.
  • This patch adds no public classes.

left: SparkPlan,
right: SparkPlan) extends BinaryExecNode with CodegenSupport {
right: SparkPlan,
partialSMJ: Option[Boolean] = None) extends BinaryExecNode with CodegenSupport {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: isPartial: Boolean = false

PartialShuffleReaderExec(shuffleStage, skewedPartitions.toSet)
}
subJoins += optimizedSmj
subJoins += optimizedSmj.asInstanceOf[SortMergeJoinExec].copy(partialSMJ = Some(true))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the SMJ may not be the root node. Now I think it's better to match SMJ in the plan transformation above.

@@ -222,6 +203,8 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
case shuffleStage: ShuffleQueryStageExec if shuffleStage.id == left.id ||
Copy link
Contributor

@cloud-fan cloud-fan Jan 17, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I was totally wrong. We are updating the smj here, why not just

val optimizedSmj = smj.copy(
  left = s1.copy(child = PartialShuffleReaderExec(left, skewedPartitions.toSet)),
  right = s2.copy(child = PartialShuffleReaderExec(right, skewedPartitions.toSet)),
  isPartial = true)

Copy link
Contributor

@cloud-fan cloud-fan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM except one comment

@SparkQA
Copy link

SparkQA commented Jan 17, 2020

Test build #116932 has finished for PR 27253 at commit 32038e0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Jan 17, 2020

Test build #116951 has finished for PR 27253 at commit 0b6aae4.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@JkSelf
Copy link
Author

JkSelf commented Jan 18, 2020

retest this please

@SparkQA
Copy link

SparkQA commented Jan 18, 2020

Test build #116973 has finished for PR 27253 at commit 0b6aae4.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan cloud-fan closed this in 0d99d7e Jan 19, 2020
@cloud-fan
Copy link
Contributor

thanks, merging to master!

@@ -247,9 +225,9 @@ case class OptimizeSkewedJoin(conf: SQLConf) extends Rule[SparkPlan] {
if (shuffleStages.length == 2) {
// When multi table join, there will be too many complex combination to consider.
// Currently we only handle 2 table join like following two use cases.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor: only one case now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants