Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-31719][SQL] Refactor JoinSelection #28540

Closed

Conversation

dbaliafroozeh
Copy link
Contributor

What changes were proposed in this pull request?

This PR extracts the logic for selecting the planned join type out of the JoinSelection rule and moves it to JoinSelectionHelper in Catalyst.

Why are the changes needed?

This change both cleans up the code in JoinSelection and allows the logic to be in one place and be used from other rules that need to make decision based on the join type before the planning time.

Does this PR introduce any user-facing change?

BuildSide, BuildLeft, and BuildRight are moved from org.apache.spark.sql.execution to Catalyst in org.apache.spark.sql.catalyst.optimizer.

How was this patch tested?

This is a refactoring, passes existing tests.

@hvanhovell
Copy link
Contributor

ok to test

@hvanhovell
Copy link
Contributor

add to whitelist

@SparkQA
Copy link

SparkQA commented May 15, 2020

Test build #122670 has finished for PR 28540 at commit 0b22913.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 18, 2020

Test build #122810 has finished for PR 28540 at commit 9db3a91.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 18, 2020

Test build #122816 has finished for PR 28540 at commit 42f3a29.

  • This patch fails PySpark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 18, 2020

Test build #122809 has finished for PR 28540 at commit a8f4e0b.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@maropu
Copy link
Member

maropu commented May 19, 2020

retest this please

@SparkQA
Copy link

SparkQA commented May 19, 2020

Test build #122820 has finished for PR 28540 at commit 8a2d0ee.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 19, 2020

Test build #122821 has finished for PR 28540 at commit 8a2d0ee.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 19, 2020

Test build #122843 has finished for PR 28540 at commit 10da76a.

  • This patch fails from timeout after a configured wait of 400m.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA
Copy link

SparkQA commented May 20, 2020

Test build #122901 has finished for PR 28540 at commit 8b30d51.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@dbaliafroozeh
Copy link
Contributor Author

retest this please

@SparkQA
Copy link

SparkQA commented May 24, 2020

Test build #123064 has finished for PR 28540 at commit f94fbd3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

right: LogicalPlan,
joinType: JoinType,
hint: JoinHint,
onlyLookingAtHint: Boolean,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: selectOnlyFromHint? I couldn't tell what this meant just by the name onlyLookingAtHint .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to be super clear, maybe "pickJoinSideByHintOnly"? or simply "hintOnly".

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, hintOnly looks nice to me.

Copy link
Member

@maropu maropu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM if the tests passed.

@SparkQA
Copy link

SparkQA commented May 27, 2020

Test build #123185 has finished for PR 28540 at commit b455b12.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
  • class JoinSelectionHelperSuite extends PlanTest with JoinSelectionHelper

@cloud-fan
Copy link
Contributor

thanks, merging to master!

@cloud-fan cloud-fan closed this in f6f1e51 May 27, 2020
@cloud-fan
Copy link
Contributor

oops, the latest commit hasn't passed tests yet. I'll monitor the jenkins status.

@SparkQA
Copy link

SparkQA commented May 27, 2020

Test build #123190 has finished for PR 28540 at commit 8fbe29c.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
5 participants