-
Notifications
You must be signed in to change notification settings - Fork 28.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[SQL]Extract the joinkeys from join condition #1190
Conversation
Merged build triggered. |
Merged build started. |
logger.debug(s"Considering join on: $condition") | ||
// Find equi-join predicates that can be evaluated before the join, and thus can be used | ||
// as join keys. | ||
val (joinPredicates, otherPredicates) = condition.map(splitConjunctivePredicates). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: dot should precede its operator immediately
Merged build finished. All automated tests passed. |
All automated tests passed. |
Merged build triggered. |
Merged build started. |
Merged build finished. |
Refer to this link for build results: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16051/ |
Jenkins, retest this please. |
@rxin , can you ask Jenkins to retest this? Seems he doesn't answer me. :) |
Merged build triggered. |
Merged build started. |
Oh, Jenkins is working. :) |
Merged build finished. All automated tests passed. |
All automated tests passed. |
I'm not sure what the point of this change is. It is only serving to make the planner more brittle and tied to the specifics of the current implementation of the optimizer. If the current pattern for hash joins is correct and more general, I think we should keep it. |
The |
BTW, if I followed the current implementation pattern, which means I have to handle predicate push down for the outer join as it's done for inner join, too, that may make the code duplicated(with the optimizer) and confusing. |
Okay, you've convinced me with the outer join argument. Remove HashFilteredJoin as its pretty redundant with your pattern. |
and please rebase to master. |
Merged build triggered. |
Merged build started. |
Thank you @marmbrus , updated, let's see the testing result. |
Merged build finished. All automated tests passed. |
All automated tests passed. |
@@ -65,7 +64,7 @@ private[sql] abstract class SparkStrategies extends QueryPlanner[SparkPlan] { | |||
def broadcastTables: Seq[String] = sqlContext.joinBroadcastTables.split(",").toBuffer | |||
|
|||
def apply(plan: LogicalPlan): Seq[SparkPlan] = plan match { | |||
case HashFilteredJoin( | |||
case ExtractEquiJoinKeys( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe annotation on line 48 can be modified.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch.
Thanks, updated. |
Merged build triggered. |
Merged build started. |
Merged build finished. All automated tests passed. |
All automated tests passed. |
Thanks, merged into master. |
Extract the join keys from equality conditions, that can be evaluated using equi-join. Author: Cheng Hao <hao.cheng@intel.com> Closes apache#1190 from chenghao-intel/extract_join_keys and squashes the following commits: 4a1060a [Cheng Hao] Fix some of the small issues ceb4924 [Cheng Hao] Remove the redundant pattern of join keys extraction cec34e8 [Cheng Hao] Update the code style issues dcc4584 [Cheng Hao] Extract the joinkeys from join condition
Extract the join keys from equality conditions, that can be evaluated using equi-join.