Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-20350] Add optimization rules to apply Complementation Laws. #17650

Closed
wants to merge 5 commits into from

Conversation

ptkool
Copy link
Contributor

@ptkool ptkool commented Apr 16, 2017

What changes were proposed in this pull request?

Apply Complementation Laws during boolean expression simplification.

How was this patch tested?

Tested using unit tests, integration tests, and manual tests.

@ptkool ptkool force-pushed the apply_complementation_laws branch from 5245d32 to 688b2f0 Compare April 16, 2017 12:50
@gatorsmile
Copy link
Member

ok to test

@@ -42,6 +42,12 @@ class BooleanSimplificationSuite extends PlanTest with PredicateHelper {

val testRelation = LocalRelation('a.int, 'b.int, 'c.int, 'd.string)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about adding another boolean column?

  val testRelation = LocalRelation('a.int, 'b.int, 'c.int, 'd.string, 'e.boolean)

Our filter conditions are not allowed to accept the non-boolean predicates

@gatorsmile
Copy link
Member

LGTM cc @cloud-fan @hvanhovell

@SparkQA
Copy link

SparkQA commented Apr 16, 2017

Test build #75839 has finished for PR 17650 at commit 688b2f0.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@@ -153,6 +153,11 @@ object BooleanSimplification extends Rule[LogicalPlan] with PredicateHelper {
case TrueLiteral Or _ => TrueLiteral
case _ Or TrueLiteral => TrueLiteral

case a And b if Not(a).semanticEquals(b) => FalseLiteral
case a Or b if Not(a).semanticEquals(b) => TrueLiteral
case a And b if a.semanticEquals(Not(b)) => FalseLiteral
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logically it feels like duplication of code from line 156 ... but unfortunately Not is not smart enough to realise that. I think if you override the semanticEquals in Not then you should be able to get rid of this line. The advantage being we would make the expression smart enough to figure this out by itself rather than handling this in outside code (which is possibly more places in the code).

Same applies for line 159.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant something like this for Not:

  override def semanticEquals(other: Expression): Boolean = other match {
    case Not(otherChild) => child.semanticEquals(otherChild)
    case _ => child match {
      case Not(innerChild) =>
        // eliminate double negation
        innerChild.semanticEquals(other)
      case _ =>
        super.semanticEquals(other)
    }
  }

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we need this? Not(Not(a)) will be simplified to a

@@ -160,4 +166,12 @@ class BooleanSimplificationSuite extends PlanTest with PredicateHelper {
testRelation.where('a > 2 || ('b > 3 && 'b < 5)))
comparePlans(actual, expected)
}

test("Complementation Laws") {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about double negation ? ie. 'a && !(!'a)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really required for this PR?

@@ -160,4 +166,12 @@ class BooleanSimplificationSuite extends PlanTest with PredicateHelper {
testRelation.where('a > 2 || ('b > 3 && 'b < 5)))
comparePlans(actual, expected)
}

test("Complementation Laws") {
checkCondition('a && !'a, LocalRelation(testRelation.output, Seq.empty))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just checkCondition('a && !'a, testRelation)?

Copy link
Contributor Author

@ptkool ptkool Apr 17, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that doing this would make the test a bit misleading. I wanted to make sure it was clear that the resulting plan node was a LocalRelation node that produced no data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah i see. Then we need to create a new test relation with data, and test if the result is a empty relation here

@cloud-fan
Copy link
Contributor

LGTM

@SparkQA
Copy link

SparkQA commented Apr 19, 2017

Test build #75953 has finished for PR 17650 at commit ade7255.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

@cloud-fan
Copy link
Contributor

thanks, merging to master/2.2!

asfgit pushed a commit that referenced this pull request Apr 20, 2017
## What changes were proposed in this pull request?

Apply Complementation Laws during boolean expression simplification.

## How was this patch tested?

Tested using unit tests, integration tests, and manual tests.

Author: ptkool <michael.styles@shopify.com>
Author: Michael Styles <michael.styles@shopify.com>

Closes #17650 from ptkool/apply_complementation_laws.

(cherry picked from commit 63824b2)
Signed-off-by: Wenchen Fan <wenchen@databricks.com>
@asfgit asfgit closed this in 63824b2 Apr 20, 2017
@ptkool ptkool deleted the apply_complementation_laws branch April 20, 2017 14:30
@gatorsmile
Copy link
Member

Just FYI, we found a bug in this rule, regarding NULL handling.

@ptkool ptkool restored the apply_complementation_laws branch September 10, 2018 18:30
@gatorsmile
Copy link
Member

I am fixing it now. Thanks!

peter-toth pushed a commit to peter-toth/spark that referenced this pull request Oct 6, 2018
## What changes were proposed in this pull request?

Apply Complementation Laws during boolean expression simplification.

## How was this patch tested?

Tested using unit tests, integration tests, and manual tests.

Author: ptkool <michael.styles@shopify.com>
Author: Michael Styles <michael.styles@shopify.com>

Closes apache#17650 from ptkool/apply_complementation_laws.
@ptkool ptkool deleted the apply_complementation_laws branch January 18, 2020 12:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants