SQL: Introduce IsNull node to simplify expressions #35206

Merged
merged 8 commits into from Nov 9, 2018

@matriv matriv commented Nov 2, 2018

Add an `IsNull` node in the parser to simplify expressions, so that `<value> IS NULL` is
no longer translated internally to `NOT(<value> IS NOT NULL)`.

Replace `IsNotNullProcessor` with `CheckNullProcessor` to encapsulate both the
isNull and isNotNull functionality.

Closes: #34876
Fixes: #35171
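
A rough illustration of the change described above, assuming nothing about the real parser classes: strings stand in for expression nodes, and `NullPredicateTreeSketch`, `treeBefore` and `treeAfter` are hypothetical names used only for this sketch.

```java
// Illustrative sketch only: strings stand in for the real expression nodes.
final class NullPredicateTreeSketch {

    // Before the change, as the description states: IS NULL was expressed
    // as a negation wrapped around an IS NOT NULL node.
    static String treeBefore(String value, boolean isNotNull) {
        String node = "IsNotNull(" + value + ")";
        return isNotNull ? node : "Not(" + node + ")";
    }

    // After the change: each predicate maps to its own node, no extra Not.
    static String treeAfter(String value, boolean isNotNull) {
        return (isNotNull ? "IsNotNull(" : "IsNull(") + value + ")";
    }

    public static void main(String[] args) {
        System.out.println(treeBefore("a", false));
        System.out.println(treeAfter("a", false));
    }
}
```

The second form leaves one node fewer for later optimizer rules to reason about.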

@elasticmachine

Pinging @elastic/es-search-aggs

matriv commented Nov 2, 2018

Integration tests for HAVING will be introduced together with the fix here: https://github.com/elastic/elasticsearch/pull/35164/files#diff-a3ba4284670bdc13a2ef764cff91a212

matriv commented Nov 2, 2018

retest this please

@matriv matriv added the v6.5.1 label Nov 3, 2018
@costin costin left a comment

Looks good; however, the change is too broad (see my comments on negatable), hence my request for changes.


import java.io.IOException;

public class IsNullProcessor implements Processor {
Member

As we have two very similar processors, it makes sense to aggregate the isnull/isnotnull processors into a single processor class which uses an enum to differentiate between the two (as we do with the rest of the operators).
This makes things consistent and keeps the serialization name explosion to a minimum.
It's also a good reason to keep this a 6.6 feature and not backport it to 6.5 (since it's an enhancement not really a bug).
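
A minimal sketch of the single-processor idea, under stated assumptions: `CheckNullProcessor` and `CheckNullOperation` are names that appear elsewhere in this PR, but the `Processor` interface below is a hypothetical one-method stand-in (serialization is left out entirely), and the bodies are illustrative rather than the merged code.

```java
import java.util.Objects;
import java.util.function.Function;

final class CheckNullSketch {

    // Hypothetical stand-in for the real Processor interface.
    interface Processor {
        Object process(Object input);
    }

    // One enum constant per operation instead of one processor class each.
    enum CheckNullOperation implements Function<Object, Boolean> {
        IS_NULL(Objects::isNull, "IS NULL"),
        IS_NOT_NULL(Objects::nonNull, "IS NOT NULL");

        private final Function<Object, Boolean> check;
        private final String symbol;

        CheckNullOperation(Function<Object, Boolean> check, String symbol) {
            this.check = check;
            this.symbol = symbol;
        }

        @Override
        public Boolean apply(Object o) {
            return check.apply(o);
        }

        String symbol() {
            return symbol;
        }
    }

    // A single processor; the enum decides which check is performed.
    static final class CheckNullProcessor implements Processor {
        private final CheckNullOperation operation;

        CheckNullProcessor(CheckNullOperation operation) {
            this.operation = operation;
        }

        @Override
        public Object process(Object input) {
            return operation.apply(input);
        }
    }
}
```

With this shape only one processor name would need to be registered for serialization, which is the consistency argument made in the comment.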

Contributor

I've tested this PR against bug #35171 and it's no longer reproducible... might be a good reason to backport it to 6.5, @costin?

Member

If that's the case and the change isn't too big, it's worth backporting, yes.

new ConstantFolding(),
// boolean
new BooleanSimplification(),
new BooleanLiteralsOnTheRight(),
new BinaryComparisonSimplification(),
// need to occur after simplification
Member

"need to occur after simplification"

Why? It should get picked up on the next cycle in the worst-case scenario.
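
To illustrate the "next cycle" argument, here is a toy sketch that assumes the optimizer reapplies a batch of rules until the plan stops changing. The executor, the rule names, and the string-based "plan" are all hypothetical and are not the project's actual rule executor.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.UnaryOperator;

final class RuleBatchSketch {

    // Applies every rule in order, then repeats until a full cycle changes nothing
    // (or the cycle limit is hit).
    static String runToFixpoint(String plan, List<UnaryOperator<String>> rules, int maxCycles) {
        String current = plan;
        for (int cycle = 0; cycle < maxCycles; cycle++) {
            String before = current;
            for (UnaryOperator<String> rule : rules) {
                current = rule.apply(current);
            }
            if (current.equals(before)) {
                break;
            }
        }
        return current;
    }

    // Toy rules operating on a textual "plan".
    static final UnaryOperator<String> FOLD_CONSTANTS = p -> p.replace("1 + 1", "2");
    static final UnaryOperator<String> SIMPLIFY_COMPARISON = p -> p.replace("2 = 2", "TRUE");

    public static void main(String[] args) {
        // The comparison rule runs first and misses; it fires on the second cycle.
        System.out.println(runToFixpoint("1 + 1 = 2",
                Arrays.asList(SIMPLIFY_COMPARISON, FOLD_CONSTANTS), 10));
    }
}
```

In this toy setup both rule orders reach the same result; the less favourable order only costs one extra cycle.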

@@ -58,8 +57,8 @@ public String processScript(String script) {
     @Override
     protected Expression canonicalize() {
         Expression canonicalChild = field().canonical();
-        if (canonicalChild instanceof Negateable) {
-            return ((Negateable) canonicalChild).negate();
+        if (canonicalChild instanceof UnaryNegateable) {
Member

This doesn't look good, as it essentially removes NOT over a binary logical negatable, e.g. NOT (x AND y).

Contributor Author

That's a mistake for sure, thx for catching!

Member

Can you add some tests please with NOT in HAVING over AND and IS NULL to check this automatically in the future?
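
A self-contained sketch of the kind of check requested here, under loud assumptions: the tiny expression tree below is hypothetical and is not the x-pack SQL class hierarchy, De Morgan's law is assumed as the way a binary AND/OR negates itself, and rows are plain maps. Since IS NULL and IS NOT NULL never evaluate to NULL themselves, two-valued booleans are enough for this case.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical mini expression tree; none of these are the real classes.
final class NotSimplificationSketch {

    interface Expr { boolean eval(Map<String, Object> row); }

    // Implemented by every expression that can produce its own negation.
    interface Negatable { Expr negate(); }

    static final class IsNull implements Expr, Negatable {
        final String field;
        IsNull(String field) { this.field = field; }
        public boolean eval(Map<String, Object> row) { return row.get(field) == null; }
        public Expr negate() { return new IsNotNull(field); }
    }

    static final class IsNotNull implements Expr, Negatable {
        final String field;
        IsNotNull(String field) { this.field = field; }
        public boolean eval(Map<String, Object> row) { return row.get(field) != null; }
        public Expr negate() { return new IsNull(field); }
    }

    static final class Not implements Expr {
        final Expr child;
        Not(Expr child) { this.child = child; }
        public boolean eval(Map<String, Object> row) { return !child.eval(row); }
    }

    static final class And implements Expr, Negatable {
        final Expr left, right;
        And(Expr left, Expr right) { this.left = left; this.right = right; }
        public boolean eval(Map<String, Object> row) { return left.eval(row) && right.eval(row); }
        // De Morgan: NOT (x AND y) == (NOT x) OR (NOT y)
        public Expr negate() { return new Or(new Not(left), new Not(right)); }
    }

    static final class Or implements Expr, Negatable {
        final Expr left, right;
        Or(Expr left, Expr right) { this.left = left; this.right = right; }
        public boolean eval(Map<String, Object> row) { return left.eval(row) || right.eval(row); }
        // De Morgan: NOT (x OR y) == (NOT x) AND (NOT y)
        public Expr negate() { return new And(new Not(left), new Not(right)); }
    }

    // Rewrites NOT(x) to x.negate() for unary and binary negatables alike.
    static Expr simplify(Expr e) {
        if (e instanceof Not) {
            Expr child = simplify(((Not) e).child);
            return child instanceof Negatable ? simplify(((Negatable) child).negate()) : new Not(child);
        }
        if (e instanceof And) { And a = (And) e; return new And(simplify(a.left), simplify(a.right)); }
        if (e instanceof Or) { Or o = (Or) e; return new Or(simplify(o.left), simplify(o.right)); }
        return e;
    }

    // Compares two expressions on every null / non-null combination of fields x and y.
    static boolean agreeOnAllRows(Expr a, Expr b) {
        Object[] values = { null, 1 };
        for (Object x : values) {
            for (Object y : values) {
                Map<String, Object> row = new HashMap<>();
                row.put("x", x);
                row.put("y", y);
                if (a.eval(row) != b.eval(row)) return false;
            }
        }
        return true;
    }
}
```

A regression check would then assert that `simplify(NOT (x IS NULL AND y IS NULL))` agrees with the original on every row, and that the version with the NOT silently dropped does not.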

@@ -19,6 +19,10 @@

 public abstract class UnaryScalarFunction extends ScalarFunction {

+    public interface UnaryNegateable {
Member

Regardless of the number of children, there should be a negatable (I believe that's the correct spelling) interface common to all. In other words, Negatable should be moved up from BinaryOperator to potentially a standalone interface and likely have its generic signature extended to bound its return type (Negatable<T extends ScalarFunction>).

@@ -1278,8 +1284,11 @@ private Expression simplifyNot(Not n) {
                 return TRUE;
             }

-            if (c instanceof Negateable) {
-                return ((Negateable) c).negate();
+            if (c instanceof BinaryNegateable) {
Member

Case in point: it doesn't make any difference internally whether an expression is BinaryNegateable or UnaryNegateable. The result in both cases is an Expression, so why differentiate between the two?

Contributor Author

It was just because of the two different interfaces, but as you suggested, we'd better merge them into one.

@costin costin left a comment

Made some comments. Thanks.

package org.elasticsearch.xpack.sql.expression.function.scalar;

public interface Negateable {

Member

No need to extract the interface so high up - just put it inside the predicate package since essentially it applies only to logical predicates. Also the correct name should be Negatable.

Also, it's worth looking into adding the generic aspect in place to have a better lockdown on the class hierarchy:
Negatable<T extends ScalarFunction> { T negate() }. IsNull/IsNotNull can lock it to UnaryScalarFunction while the BinaryLogic keeps the BinaryOperator in place.
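
One way the generic bound suggested above could look, as a sketch only: the interface signature comes from the comment, while the stripped-down `ScalarFunction`, `UnaryScalarFunction` and `BinaryOperator` classes and all method bodies below are hypothetical placeholders, not the real hierarchy.

```java
// Hypothetical, stripped-down hierarchy; only the names come from the review comment.
final class NegatableSketch {

    abstract static class ScalarFunction { }

    abstract static class UnaryScalarFunction extends ScalarFunction {
        final String field;
        UnaryScalarFunction(String field) { this.field = field; }
    }

    abstract static class BinaryOperator extends ScalarFunction {
        final ScalarFunction left, right;
        BinaryOperator(ScalarFunction left, ScalarFunction right) { this.left = left; this.right = right; }
    }

    // The type parameter bounds what negate() may return.
    interface Negatable<T extends ScalarFunction> {
        T negate();
    }

    static final class IsNull extends UnaryScalarFunction implements Negatable<UnaryScalarFunction> {
        IsNull(String field) { super(field); }
        @Override public UnaryScalarFunction negate() { return new IsNotNull(field); }
        @Override public String toString() { return field + " IS NULL"; }
    }

    static final class IsNotNull extends UnaryScalarFunction implements Negatable<UnaryScalarFunction> {
        IsNotNull(String field) { super(field); }
        @Override public UnaryScalarFunction negate() { return new IsNull(field); }
        @Override public String toString() { return field + " IS NOT NULL"; }
    }

    static final class Not extends ScalarFunction {
        final ScalarFunction child;
        Not(ScalarFunction child) { this.child = child; }
        @Override public String toString() { return "NOT (" + child + ")"; }
    }

    static final class And extends BinaryOperator implements Negatable<BinaryOperator> {
        And(ScalarFunction left, ScalarFunction right) { super(left, right); }
        // Assumed negation via De Morgan's law.
        @Override public BinaryOperator negate() { return new Or(new Not(left), new Not(right)); }
        @Override public String toString() { return "(" + left + " AND " + right + ")"; }
    }

    static final class Or extends BinaryOperator implements Negatable<BinaryOperator> {
        Or(ScalarFunction left, ScalarFunction right) { super(left, right); }
        @Override public BinaryOperator negate() { return new And(new Not(left), new Not(right)); }
        @Override public String toString() { return "(" + left + " OR " + right + ")"; }
    }
}
```

Callers that only care about negation can test `instanceof Negatable` once, while each implementation still advertises the narrower type it returns.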

@matriv matriv Nov 7, 2018

But I didn't change its name: Negate + able = Negateable :) The correct word is negatable, though, so I'll change it.

public enum CheckNullOperation implements Function<Object, Boolean> {

IS_NULL(Objects::isNull, "IS NULL"),
IS_NOT_NULL(o -> !Objects.isNull(o), "IS NOT NULL");
Member

Use Objects::nonNull (the main reason these methods exist on Objects).
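
For reference, a small sketch of the point being made: `Objects.nonNull` behaves like the negated `Objects.isNull` lambda and reads naturally as a method reference, for example in a stream filter. The class and method names below are hypothetical and exist only for this illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Objects;
import java.util.function.Predicate;
import java.util.stream.Collectors;

final class NonNullSketch {

    // The two forms under discussion; they behave identically.
    static final Predicate<Object> LAMBDA = o -> !Objects.isNull(o);
    static final Predicate<Object> METHOD_REF = Objects::nonNull;

    // Typical use of the method reference as a filter predicate.
    static List<String> dropNulls(List<String> values) {
        return values.stream().filter(Objects::nonNull).collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(dropNulls(Arrays.asList("a", null, "b")));
    }
}
```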

new ConstantFolding(),
new FoldNull(),
Member

Again, why was FoldNull moved?

Contributor Author

no reason, missed the initial place, will change.

matriv commented Nov 9, 2018

@costin @astefan Addressed comments and fixed things after syncing with #35236

matriv commented Nov 9, 2018

retest this please

@costin costin left a comment

LGTM

@astefan astefan left a comment

LGTM. Looking great

@matriv matriv merged commit 36da6e1 into elastic:master Nov 9, 2018
@matriv matriv deleted the mt/fix-34876 branch November 9, 2018 10:32
matriv pushed a commit that referenced this pull request Nov 9, 2018

matriv commented Nov 9, 2018

Backported to 6.x with 771a940

matriv pushed a commit that referenced this pull request Nov 9, 2018

matriv commented Nov 9, 2018

Backported to 6.5 with 97adb4b

@matriv matriv added the >bug label Nov 20, 2018
@jimczi jimczi added v7.0.0-beta1 and removed v7.0.0 labels Feb 7, 2019