Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-36652][SQL] AQE dynamic join selection should not apply to non-equi join #33899

Closed
wants to merge 3 commits into from

Conversation

c21
Copy link
Contributor

@c21 c21 commented Sep 2, 2021

What changes were proposed in this pull request?

Currently DynamicJoinSelection has two features: 1.demote broadcast hash join, and 2.promote shuffled hash join. Both are achieved by adding join hint in query plan, and only works for equi join. However the rule is matching with Join operator now, so it would add hint for non-equi join by mistake (See added test query in JoinHintSuite.scala for an example).

This PR is to fix DynamicJoinSelection to only apply to equi-join, and improve checkHintNonEquiJoin to check we should not add PREFER_SHUFFLE_HASH for non-equi join.

Why are the changes needed?

Improve the logic of codebase to be better.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

Added unit test in JoinHintSuite.scala.

@c21
Copy link
Contributor Author

c21 commented Sep 2, 2021

@cloud-fan and @ulysses-you could you help take a look when you have time? Thanks.

@github-actions github-actions bot added the SQL label Sep 2, 2021
withLogAppender(hintAppender, level = Some(Level.WARN)) {
withSQLConf(
SQLConf.ADAPTIVE_EXECUTION_ENABLED.key -> "true",
SQLConf.ADAPTIVE_MAX_SHUFFLE_HASH_JOIN_LOCAL_MAP_THRESHOLD.key -> "67108865") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's this magic number?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's due to advisoryPartitionSize <= maxShuffledHashJoinLocalMapThreshold in https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/adaptive/DynamicJoinSelection.scala#L48. spark.sql.adaptive.advisoryPartitionSizeInBytes by default is 64MB, which is 67108864. +1 one here to trigger the bug. This is just an example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can change the number to 64MB if it's preferred.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yea '64MB' seems better

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cloud-fan - sure, updated.

@cloud-fan
Copy link
Contributor

is it only a log issue?

@c21
Copy link
Contributor Author

c21 commented Sep 2, 2021

is it only a log issue?

@cloud-fan - yes I think so. We just logged a warning for these wrongly-applied hint. So it should not be a blocker for release.

@ulysses-you
Copy link
Contributor

@c21 thank you for the fix, I think this issue only exists in master branch since the hint check #32355 is only merged into master after branch-3.2 cut.

The origin idea of DynamicJoinSelection is giving some hint which can be applied potentially after match the join strategy so it's no harm for non-eqai join. The only thing changed is we add the hint check. So I think it's safe to only match the equi join at DynamicJoinSelection side.

@@ -169,7 +169,8 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
}

private def checkHintNonEquiJoin(hint: JoinHint): Unit = {
if (hintToShuffleHashJoin(hint) || hintToSortMergeJoin(hint)) {
if (hintToShuffleHashJoin(hint) || hintToPreferShuffleHashJoin(hint) ||
hintToSortMergeJoin(hint)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PreferShuffleHashJoin is an internal hint so user cann't use it, we don't need do this check

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I am aware of this hint is internal only. I am more thinking towards to catch and expose the bug where we apply PREFER_SHUFFLE_HASH internally by mistake (e.g. for the added unit test query in JoinHintSuite.scala). I feel it's no harm to add here and help catch more bugs in the future. But if you guys think we should not add it, I can also remove it. cc @cloud-fan .

@dongjoon-hyun
Copy link
Member

cc @gengliangwang , too

@gengliangwang
Copy link
Member

@c21 thank you for the fix, I think this issue only exists in master branch since the hint check #32355 is only merged into master after branch-3.2 cut.

then we should update the affected version in jira SPARK-36652

@c21
Copy link
Contributor Author

c21 commented Sep 2, 2021

then we should update the affected version in jira SPARK-36652

Yep updated. Sorry for confusion @gengliangwang.

@dongjoon-hyun
Copy link
Member

Thank you for updating the affected version. :)

@SparkQA
Copy link

SparkQA commented Sep 2, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47440/

@SparkQA
Copy link

SparkQA commented Sep 2, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47440/

@@ -169,7 +169,8 @@ abstract class SparkStrategies extends QueryPlanner[SparkPlan] {
}

private def checkHintNonEquiJoin(hint: JoinHint): Unit = {
if (hintToShuffleHashJoin(hint) || hintToSortMergeJoin(hint)) {
if (hintToShuffleHashJoin(hint) || hintToPreferShuffleHashJoin(hint) ||
hintToSortMergeJoin(hint)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit. indentation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dongjoon-hyun - sure, updated.

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, LGTM (with one nit comment)

@SparkQA
Copy link

SparkQA commented Sep 3, 2021

Test build #142939 has finished for PR 33899 at commit f535fdf.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented Sep 3, 2021

Kubernetes integration test starting
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47449/

@SparkQA
Copy link

SparkQA commented Sep 3, 2021

Kubernetes integration test status failure
URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/47449/

@dongjoon-hyun
Copy link
Member

Thank you, @c21 , @cloud-fan , @ulysses-you , @gengliangwang .
The last commit is only updating indentation and the tests passed already at the previous commit.
Merged to master.

@c21
Copy link
Contributor Author

c21 commented Sep 3, 2021

Thank you all for review!

@SparkQA
Copy link

SparkQA commented Sep 3, 2021

Test build #142948 has finished for PR 33899 at commit b6002a1.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@AmplabJenkins
Copy link

Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/142948/

@c21 c21 deleted the aqe-test branch September 7, 2021 17:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants