[SPARK-13668][SQL] Reorder filter/join predicates to short-circuit isNotNull checks #11511

Closed
sameeragarwal wants to merge 4 commits

sameeragarwal
Member

What changes were proposed in this pull request?

If a filter predicate or a join condition contains IsNotNull checks, we should reorder its conjuncts so that these non-nullability checks are evaluated before the rest of the predicates.

For example, if a filter predicate is of the form a > 5 && isNotNull(b), we should rewrite it as isNotNull(b) && a > 5 during physical plan generation.
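
To make the rewrite concrete, here is a minimal, self-contained sketch of the idea. It is not the code in this patch: the `Expr` types, `splitConjuncts`, and the `ReorderSketch` object are hypothetical stand-ins for illustration and are not Spark's Catalyst classes. It assumes the predicate is a pure conjunction and simply moves the null checks to the front.

```scala
// Toy expression tree. These types are illustrative only and are NOT Spark's Catalyst classes.
sealed trait Expr
case class Attr(name: String) extends Expr
case class GreaterThan(child: Expr, value: Int) extends Expr
case class IsNotNull(child: Expr) extends Expr
case class And(left: Expr, right: Expr) extends Expr

object ReorderSketch {
  // Flatten nested ANDs into a left-to-right list of conjuncts.
  def splitConjuncts(e: Expr): Seq[Expr] = e match {
    case And(l, r) => splitConjuncts(l) ++ splitConjuncts(r)
    case other     => Seq(other)
  }

  // Move IsNotNull conjuncts to the front. partition keeps the relative
  // order within each group, so the other predicates are not shuffled.
  def reorderPredicates(e: Expr): Expr = {
    val (nullChecks, rest) = splitConjuncts(e).partition(_.isInstanceOf[IsNotNull])
    (nullChecks ++ rest).reduceLeft[Expr]((l, r) => And(l, r))
  }
}
```

With these toy types, `ReorderSketch.reorderPredicates(And(GreaterThan(Attr("a"), 5), IsNotNull(Attr("b"))))` yields `And(IsNotNull(Attr("b")), GreaterThan(Attr("a"), 5))`, which mirrors the a > 5 && isNotNull(b) example above.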

How was this patch tested?

New unit tests in ReorderedPredicateSuite verify the physical plan for both filters and joins.

@SparkQA

SparkQA commented Mar 4, 2016

Test build #52447 has finished for PR 11511 at commit dfb33ec.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ReorderedPredicateSuite extends QueryTest with SharedSQLContext with PredicateHelper

@sameeragarwal
Member Author

cc @nongli @yhuai


// Attempts to re-order the individual conjunctive predicates in an expression to short circuit
// the evaluation of relatively cheaper checks (e.g., checking for nullability) before others.
protected def getReorderedExpression(expr: Expression): Expression = {
Contributor

nit: I think this is more aptly named "reorderPredicates"

@nongli
Contributor

nongli commented Mar 4, 2016

LGTM

@sameeragarwal
Member Author

Thanks, comments addressed.

@SparkQA

SparkQA commented Mar 4, 2016

Test build #52487 has finished for PR 11511 at commit d75d313.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@SparkQA

SparkQA commented Mar 4, 2016

Test build #52488 has finished for PR 11511 at commit 4a8a9a3.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.


// Attempts to re-order the individual conjunctive predicates in an expression to short circuit
// the evaluation of relatively cheaper checks (e.g., checking for nullability) before others.
protected def reorderPredicates(expr: Expression): Expression = {
Contributor

Can we have a test to make sure this reordering is stable?

Member Author

Thanks, I modified the two tests in ReorderedPredicateSuite to verify that the sort is stable.
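
For readers unfamiliar with what "stable" means here, the following is a rough sketch of such a check, not the test that was added to the suite. `Pred`, `reorder`, and `StabilityCheck` are hypothetical names, and predicates are modelled as labelled flags rather than Catalyst expressions. It relies on the Scala standard library's `sortBy` being a stable sort.

```scala
object StabilityCheck {
  // Hypothetical stand-in for a predicate: a label plus a flag marking IsNotNull checks.
  case class Pred(label: String, isNullCheck: Boolean)

  // Null checks sort first (false < true for Boolean keys). Because sortBy is
  // stable, predicates with equal keys keep their original relative order.
  def reorder(preds: Seq[Pred]): Seq[Pred] = preds.sortBy(p => !p.isNullCheck)

  def main(args: Array[String]): Unit = {
    val input = Seq(
      Pred("a > 5", isNullCheck = false),
      Pred("isNotNull(b)", isNullCheck = true),
      Pred("c < 10", isNullCheck = false),
      Pred("isNotNull(d)", isNullCheck = true))

    // Null checks move to the front; within each group the input order is preserved.
    val expected = Seq("isNotNull(b)", "isNotNull(d)", "a > 5", "c < 10")
    assert(reorder(input).map(_.label) == expected)
  }
}
```

A stable reordering matters because it keeps the generated plan deterministic and leaves any deliberate ordering among the remaining predicates untouched.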

@SparkQA

SparkQA commented Mar 8, 2016

Test build #52692 has finished for PR 11511 at commit 21d8897.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds the following public classes (experimental):
    • class ReorderedPredicateSuite extends SharedSQLContext with PredicateHelper

@yhuai
Contributor

yhuai commented Mar 8, 2016

LGTM. Merging to master.

@asfgit asfgit closed this in e430614 Mar 8, 2016
roygao94 pushed a commit to roygao94/spark that referenced this pull request Mar 22, 2016
…NotNull checks

Author: Sameer Agarwal <sameer@databricks.com>

Closes apache#11511 from sameeragarwal/reorder-isnotnull.