[SPARK-13354] [SQL] push filter throughout outer join #11234

Closed
davies wants to merge 1 commit

davies
Contributor

@davies davies commented Feb 17, 2016

For a query

select * from a left outer join b on a.a = b.a where b.b > 10

The condition b.b > 10 filters out every row whose b side is empty (all NULL), because comparing NULL with 10 never evaluates to true.

In this case, we should use an inner join and push the filter down into b.
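
A minimal sketch of the equivalence described above, run against SQLite from Python rather than against Spark. The table contents are made-up sample data, and the second query is only one hand-written way to express "inner join with the filter pushed into b"; it is not the plan Spark produces.

```python
import sqlite3

# In-memory database with hypothetical sample rows.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE a (a INTEGER);
    CREATE TABLE b (a INTEGER, b INTEGER);
    INSERT INTO a VALUES (1), (2), (3);   -- a = 3 has no match in b
    INSERT INTO b VALUES (1, 20), (2, 5); -- only (1, 20) satisfies b > 10
""")

# Original form: left outer join, then a filter on the null-supplying side.
outer_with_filter = conn.execute(
    "SELECT * FROM a LEFT OUTER JOIN b ON a.a = b.a "
    "WHERE b.b > 10 ORDER BY a.a"
).fetchall()

# Rewritten form: inner join, with the filter applied to b before joining.
inner_pushed = conn.execute(
    "SELECT * FROM a "
    "INNER JOIN (SELECT * FROM b WHERE b.b > 10) AS b ON a.a = b.a "
    "ORDER BY a.a"
).fetchall()

print(outer_with_filter)
print(inner_pushed)
```

With this data both queries return the same rows: the unmatched row a = 3 is padded with NULLs by the outer join and then rejected by the filter, so the outer join adds nothing that survives.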

@gatorsmile
Member

It sounds like this PR is related to the following two PRs: #10567 and #10566.

If we can convert the outer joins to inner joins, the push down will be done by the existing rules.

@SparkQA

SparkQA commented Feb 17, 2016

Test build #51410 has finished for PR 11234 at commit 7d87244.

  • This patch fails Spark unit tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@davies
Contributor Author

davies commented Feb 17, 2016

@gatorsmile I will review those. You can take canFilterOutNull into #10567.

@gatorsmile
Member

Thank you very much! @davies Will do it. : )

@@ -954,7 +980,7 @@ object PushPredicateThroughJoin extends Rule[LogicalPlan] with PredicateHelper {
       split(splitConjunctivePredicates(filterCondition), left, right)

     joinType match {
-      case Inner =>
+      case _ if isInnerJoin(joinType, leftFilterConditions, rightFilterConditions) =>
Contributor

I don't think we should be doing this kind of reasoning in this rule. If you are going to convert outer joins to inner joins that should be its own rule.

@marmbrus
Contributor

If we are going to reason about null intolerance we should do that in a principled way (and there is already a start in the form of constraints) and not just mix it randomly into existing rules.
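
To make "null intolerance" concrete, here is a toy Python model of the check, not Spark's implementation. The helper names (`null_safe`, `can_filter_out_null`) and the dictionary-based row are hypothetical, and the sketch assumes the predicate references only columns from the null-supplying side. The idea: evaluate the predicate on a row where every such column is NULL; if the result is NULL or false, the filter rejects all padded rows and the outer join can be treated as an inner join.

```python
import operator
from typing import Callable, Dict, Iterable, Optional

Row = Dict[str, Optional[int]]


def null_safe(op: Callable[[int, int], bool]) -> Callable:
    """Lift a binary operator to SQL-style semantics: NULL in, NULL out."""
    def lifted(x: Optional[int], y: Optional[int]) -> Optional[bool]:
        return None if x is None or y is None else op(x, y)
    return lifted


gt = null_safe(operator.gt)


def can_filter_out_null(predicate: Callable[[Row], Optional[bool]],
                        null_side_columns: Iterable[str]) -> bool:
    """True if the predicate rejects a row whose listed columns are all NULL."""
    null_row: Row = {column: None for column in null_side_columns}
    result = predicate(null_row)
    # A WHERE clause keeps a row only when the predicate is true,
    # so both NULL and false mean the row is dropped.
    return result is None or result is False


# b.b > 10 is null-intolerant: NULL > 10 evaluates to NULL.
rejects_padded = can_filter_out_null(lambda row: gt(row["b.b"], 10), ["b.a", "b.b"])

# b.b IS NULL accepts padded rows, so the outer join must stay.
keeps_padded = can_filter_out_null(lambda row: row["b.b"] is None, ["b.a", "b.b"])
```

Keeping this reasoning in one place, rather than inside a pushdown rule, is the separation the comment above argues for.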

@davies
Contributor Author

davies commented Feb 18, 2016

@marmbrus I think #10567 makes more sense; I will pull part of this out into that one.

@davies davies closed this Feb 18, 2016
4 participants