-
Notifications
You must be signed in to change notification settings - Fork 24.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
QL: Simplify arithmetic operations in binary comps #66022
Conversation
This commit adds an optimizer rule to simplify the arithmetic operations in binary comparison expressions, which in turn will allow for further expression compounding by the optimiser. Only the negation and plus, minus, multiplication and division are currently considered and only when two of the operands are a literal. For instance `((a + 1) / 2 - 3) * 4 >= 14` becomes `a >= 12`.
Revert unintentional change of a source file.
Pinging @elastic/es-ql (Team:QL) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is changing one behavior that might not be so obvious at first: a query like SELECT salary FROM test_emp WHERE salary + 10 >= 'x'
returns no documents in master
, but throws Cannot compute [-] between [String] [Integer]
in this PR.
Apart from the fact that the error message is slightly incomplete - missing and
between [String]
and [Integer]
- it's a confusing error message for the user. The minus sign is not present in the query, but present in the error message.
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to see more tests including non-numeric values considering the binary operators inside SQL support intervals and dates.
The cast to double is problematic - what does 4 / 2
return vs 5 / 2
? And why is the double casting having to be applied manually when we do not do that for /
in the first place.
What happens in case of :
date - INTERVAL 1 MONTH > '2010-01-01T01:01:01` ?
...va/org/elasticsearch/xpack/ql/expression/predicate/operator/comparison/BinaryComparison.java
Show resolved
Hide resolved
...l/src/main/java/org/elasticsearch/xpack/ql/expression/predicate/operator/arithmetic/Mul.java
Outdated
Show resolved
Hide resolved
...l/src/main/java/org/elasticsearch/xpack/ql/expression/predicate/operator/arithmetic/Sub.java
Outdated
Show resolved
Hide resolved
...l/src/main/java/org/elasticsearch/xpack/ql/expression/predicate/operator/arithmetic/Mod.java
Outdated
Show resolved
Hide resolved
...l/src/main/java/org/elasticsearch/xpack/ql/expression/predicate/operator/arithmetic/Div.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/src/test/java/org/elasticsearch/xpack/sql/optimizer/OptimizerRunTests.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/src/test/java/org/elasticsearch/xpack/sql/optimizer/OptimizerRunTests.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have one major concern that I would like to discuss before moving ahead with the PR.
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/src/test/java/org/elasticsearch/xpack/sql/optimizer/OptimizerRunTests.java
Outdated
Show resolved
Hide resolved
The optimisation is now refactor splitting the logic by add-sub and mul-div operations. Further restictions are put in place, to prevent the optimiser from chaning the outcome of the filtering. More tests have been added.
@elasticmachine update branch |
Thanks, I was wrongly expecting that this check had already been in place (but #49636 is still open).
This hasn't been reported before, so I went ahead and fixed it part of this PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Halfway through, few comments and suggestions so far.
...org/elasticsearch/xpack/ql/expression/predicate/operator/arithmetic/ArithmeticOperation.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
...l/src/main/java/org/elasticsearch/xpack/ql/expression/predicate/operator/arithmetic/Mod.java
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks to me like this PR is made of two distinct parts.
- the 'bubbling' of negations (which I would rename to SimplifyDoubleNegations).
- The actual simplification of arithmetic in comparisons.
2 depends on 1 however 1 has value by itself and it would be easier to reason about stand-alone.
...org/elasticsearch/xpack/ql/expression/predicate/operator/arithmetic/ArithmeticOperation.java
Outdated
Show resolved
Hide resolved
...l/src/main/java/org/elasticsearch/xpack/ql/expression/predicate/operator/arithmetic/Add.java
Show resolved
Hide resolved
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/src/test/java/org/elasticsearch/xpack/sql/planner/QueryTranslatorTests.java
Show resolved
Hide resolved
- add isMulOrDiv and isAddOrSub methods; - add a new factory if'ace for the arithmetic operations invertible in relation to a binary comparison; - simplify NEG - NEG tree structures; - more tests.
I've converted the PR back to draft to extract the rule on negations. However, I assume the rule will mostly extract the negation up (i.e. bubbling them up: DIV/MUL - NEG - x => NEG - DIV/MUL - x), rather then reduce them (i.e. NEG - NEG - x => x), so not sure if "Simplify" would be correct. |
public static final class SimplifyComparisonsArithmetics extends OptimizerExpressionRule { | ||
BiFunction<DataType, DataType, Boolean> typesCompatible; | ||
|
||
public SimplifyComparisonsArithmetics(BiFunction<DataType, DataType, Boolean> typesCompatible) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typesCompatible
-> typeCheck
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before I change it, just to 2x check: the passed arguments are DataTypes::areCompatible
and SqlDataTypes::areCompatible
.
The code using the var also looks like return typesCompatible.apply([type1], [type2]) == false;
.
Not sure if the change would bring more clarity ("what's checked?"), so asking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments
return literalNeg == null ? bc : bc.reverse().replaceChildren(List.of(((Neg) bc.left()).field(), literalNeg)); | ||
} | ||
|
||
static Expression safeMaybeFold(Expression expression) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't particularly like this method and, imo, it doesn't help much understanding the code. I think the return null
inside the catch
block bothers me.
I would have kept the original (no separate method used) block of code, as I find it easier to read.
private static Expression reduceNegation(BinaryComparison bc) {
Literal bcLiteral = (Literal) bc.right();
Expression expression = new Neg(bcLiteral.source(), bcLiteral);
if (expression.foldable()) {
try {
expression = new Literal(expression.source(), expression.fold(), expression.dataType());
return bc.reverse().replaceChildren(List.of(((Neg) bc.left()).field(), expression));
} catch (ArithmeticException | DateTimeException e) {
// implicitly return the original BinaryComparison in case the folding results in scenarios where,
// for example, we have Long.MAX_VALUE+1 to fold. In this case, we skip the optimization.
}
}
return bc;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the return null inside the catch block bothers me.
The null
isn't returned from the catch block, it's just assigned to the var to return.
But this still being a detail, would you mind expanding what's troubling / how it could be improved?
I would have kept the original (no separate method used)
The code is extracted as a method since it's reused. It should be acceptably concise too: "if the expression is foldable, attempt to do so. Return either folded outcome, original (if not foldable) or null
(if foldable, but unsafely; i.e. exception wud be thrown)".
An alternative could be for the caller to save the input expression (as in your example), the function to return the original input, if folding is unsafe, and for caller to check returned value against the original input. But this would be more verbose (as in your example) and this pattern to be applied multiple times.
For now I've taken Costin's suggestion, but still open for change if needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be left as is I guess. My initial suggestion seems to be just a personal preference. If there will be other changes to this PR, I'd add a comment similar to the one I made in the catch (ArithmeticException | DateTimeException e) {
to explain the catch block "ignoring" the math failure.
if (bc.right() instanceof Literal) { | ||
if (bc.left() instanceof ArithmeticOperation) { | ||
return SimplifyOperation.simplify(bc, typesCompatible); | ||
} else if (bc.left() instanceof Neg) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think only if (bc.left()....
is enough. else if
doesn't necessarily help here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've changed it.
(Particularly here, the if[-else] is practically a switch, on a type that's not switchable in Java, so the change makes sense.
Generally I'm in favour of the if-else format when:
- a
return
is involved, whose later replacement by a non-jumping statement, could make the nextif
branch eligible; i.e. less error-prone. - is more compact.)
x-pack/plugin/ql/src/main/java/org/elasticsearch/xpack/ql/optimizer/OptimizerRules.java
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/qa/server/src/main/resources/arithmetic.csv-spec
Outdated
Show resolved
Hide resolved
x-pack/plugin/sql/qa/server/src/main/resources/arithmetic.csv-spec
Outdated
Show resolved
Hide resolved
...csearch/xpack/sql/expression/predicate/operator/arithmetic/SqlBinaryArithmeticOperation.java
Show resolved
Hide resolved
- rename methods for more clarify; - regroup simplification checks; - eliminate calling super class' methods; - use == for static string comps; - use Math.signum instead of inlining the calculation.
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
return literalNeg == null ? bc : bc.reverse().replaceChildren(List.of(((Neg) bc.left()).field(), literalNeg)); | ||
} | ||
|
||
static Expression safeMaybeFold(Expression expression) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can be left as is I guess. My initial suggestion seems to be just a personal preference. If there will be other changes to this PR, I'd add a comment similar to the one I made in the catch (ArithmeticException | DateTimeException e) {
to explain the catch block "ignoring" the math failure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Left some stylist comments, no need for another review from me.
return (simplification == null || simplification.isUnsafe(typesCompatible)) ? comparison : simplification.apply(); | ||
} | ||
|
||
private static boolean isMulOrDiv(String opSymbol) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this method necessary? I only see it used in one place and the check can check can be directly just life for ADD and SUB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's being used twice. I'm on the fence tbh - it makes things easier to read in sign()
, but looks unbalanced in the if-else explicit ADD/SUB check - but would leave it as is now.
static boolean isMulOrDiv(String opSymbol) { | ||
return MUL.symbol().equals(opSymbol) || DIV.symbol().equals(opSymbol); | ||
// - post adjustments | ||
Expression opSpecificPostApply(BinaryComparison binaryComparison) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
postApply or postProcess. opSpecific is somewhat meaningless.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went for postProcess
.
@@ -1221,7 +1245,7 @@ static Expression safeMaybeFold(Expression expression) { | |||
} | |||
|
|||
// can it be quickly fast-tracked that the operation can't be reduced? | |||
boolean cannotSimplify(BiFunction<DataType, DataType, Boolean> typesCompatible) { | |||
boolean isUnsafe(BiFunction<DataType, DataType, Boolean> typesCompatible) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't this to be subclass because of the dedicated method, consider making it final.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, thanks, applied! Applicable to both class' main methods.
return true; | ||
} | ||
|
||
return opSpecificIsUnsafe(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
how about check
, doCheck
without opSpecific
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated.
- rename methods; - add comments.
Both upstream and current branch updated line in the same way (incrementing a count), so merging left it unchanged. However the net result needs to reflect both modifications (i.e. inc by two).
@elasticmachine update branch |
* Simplify arithmetic operations in binary comps This commit adds an optimizer rule to simplify the arithmetic operations in binary comparison expressions, which in turn will allow for further expression compounding by the optimiser. Only the negation and plus, minus, multiplication and division are currently considered and only when two of the operands are a literal. For instance `((a + 1) / 2 - 3) * 4 >= 14` becomes `a >= 12`. (cherry picked from commit f5c2982)
… (#68331) * QL: Simplify arithmetic operations in binary comps (#66022) * Simplify arithmetic operations in binary comps This commit adds an optimizer rule to simplify the arithmetic operations in binary comparison expressions, which in turn will allow for further expression compounding by the optimiser. Only the negation and plus, minus, multiplication and division are currently considered and only when two of the operands are a literal. For instance `((a + 1) / 2 - 3) * 4 >= 14` becomes `a >= 12`. (cherry picked from commit f5c2982)
This PR adds an optimizer rule to simplify the arithmetic operations
in binary comparison expressions, which in turn will allow for further
expression compounding by the optimiser.
Only the negation and plus, minus, multiplication and division are
currently considered and only when two of the operands are a literal.
For instance
((a + 1) / 2 - 3) * 4 >= 14
becomesa >= 12
.Closes #65394.