bitwise math function expressions #10605

clintropolis · 2020-11-25T10:27:16Z

Description

This PR picks up the commit from #10230, which adds bitwise math functions (bitwiseAnd, bitwiseComplement, bitwiseOr, bitwiseShiftLeft, bitwiseShiftRight, bitwiseXor) to the Druid native expression system. It also adds bitwiseConvertDoubleToLongBits and bitwiseConvertLongBitsToDouble so that the bitwise functions can be used with double-typed columns.
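As a rough illustration of what the six bitwise functions compute, here is a minimal sketch using plain Java operators on long values. The class and method names are hypothetical and are not the classes changed in this PR; the use of a sign-preserving `>>` for the right shift is an assumption.

```java
// Illustrative only: plain Java operators that the new expression functions
// are described as mirroring. All names here are hypothetical.
final class BitwiseOpsSketch
{
  static long and(long x, long y)
  {
    return x & y;
  }

  static long or(long x, long y)
  {
    return x | y;
  }

  static long xor(long x, long y)
  {
    return x ^ y;
  }

  static long complement(long x)
  {
    return ~x;
  }

  static long shiftLeft(long x, long y)
  {
    return x << y;
  }

  // Assumption: an arithmetic (sign-preserving) right shift.
  static long shiftRight(long x, long y)
  {
    return x >> y;
  }

  public static void main(String[] args)
  {
    System.out.println(and(6, 3));        // 2
    System.out.println(or(6, 3));         // 7
    System.out.println(xor(6, 3));        // 5
    System.out.println(complement(6));    // -7
    System.out.println(shiftLeft(1, 4));  // 16
    System.out.println(shiftRight(16, 2)); // 4
  }
}
```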

Finally, I've added tests and vectorization support, so these expressions can be used in vectorized query engines.

I'll save adding SQL support as a follow-up PR.

Related #8560

This PR has:

been self-reviewed.
added documentation for new or modified features or behaviors.
added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
added integration tests.
been tested in a test Druid cluster.

Key changed/added classes in this PR

Function
VectorMathProcessors

gianm · 2021-01-08T07:04:51Z

core/src/main/java/org/apache/druid/math/expr/Evals.java

+
+  public static long doubleToLongBits(double x)
+  {
+    return Double.doubleToLongBits(x);


Why add this instead of calling the thing in Double?

Trying to remember... I think I was just trying to be consistent with the other value conversion functions and put it here, but it could probably just call it directly, since it seems unlikely we would change the function.

gianm · 2021-01-08T07:08:38Z

docs/misc/math-expr.md

@@ -119,6 +119,13 @@ See javadoc of java.lang.Math for detailed explanation for each function.
 |acos|acos(x) would return the arc cosine of x|
 |asin|asin(x) would return the arc sine of x|
 |atan|atan(x) would return the arc tangent of x|
+|bitwiseAnd|bitwiseAnd(x,y) would return the result of x & y. Double values will be converted to their bit representation|


Why convert doubles to their bit representations instead of casting them to longs? Casting to long would, I think, make more sense since we can think of it as an implicit cast of double-typed arguments to a function that only accepts longs.

Thinking back, I think I originally implemented this behavior before I added bitwiseConvertDouble, as a way to do bitwise operations on double values. After I added the explicit function it isn't really necessary anymore, so I will revert to casting and assuming long inputs.
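To make the difference between the two behaviors concrete, here is a small sketch in plain Java. The class and method names are hypothetical and this is not the code in the PR; it only contrasts operating on the IEEE 754 bits of a double with casting the double to a long first.

```java
// Hypothetical helpers contrasting the two ways a double argument could be
// handled by a bitwise "or". Not the actual Druid implementation.
final class CastVersusBitsSketch
{
  // Bit-representation behavior: reinterpret the double's IEEE 754 bits as a long.
  static long orUsingBits(double x, long y)
  {
    return Double.doubleToLongBits(x) | y;
  }

  // Cast behavior: treat the double as an implicitly cast long argument.
  static long orUsingCast(double x, long y)
  {
    return ((long) x) | y;
  }

  public static void main(String[] args)
  {
    System.out.println(orUsingBits(2.345, 1)); // 4612462889363109315
    System.out.println(orUsingCast(2.345, 1)); // 3
  }
}
```

The first value matches the `bitwiseOr(2.345, 1)` assertion in the FunctionTest diff from this review, which was written against the bit-representation behavior.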

gianm · 2021-01-08T07:14:00Z

docs/misc/math-expr.md

@@ -119,6 +119,13 @@ See javadoc of java.lang.Math for detailed explanation for each function.
 |acos|acos(x) would return the arc cosine of x|
 |asin|asin(x) would return the arc sine of x|
 |atan|atan(x) would return the arc tangent of x|
+|bitwiseAnd|bitwiseAnd(x,y) would return the result of x & y. Double values will be converted to their bit representation|
+|bitwiseComplement|bitwiseComplement(x) would return the result of ~x. Double values will be converted to their bit representation|
+|bitwiseConvertDouble|bitwiseConvertDouble(x) would convert the IEEE 754 floating-point "double" bits stored in a long into a double value if the input is a long, or the copy bits of a double value into a long if the input is a double.|


This function is kind of weird because it doesn't have a fixpoint. I'd think that bitwiseConvertDouble(bitwiseConvertDouble(x)) would be identical to bitwiseConvertDouble(x). The lack of fixpoint makes it hard to reason about what the result of this function is going to be. Is there a specific reason it's designed this way? If not, I'd suggest splitting into two functions for each direction of the conversion.

will split into bitwiseConvertDoubleToLongBits and bitwiseConvertLongBitsToDouble

I'd think that bitwiseConvertDouble(bitwiseConvertDouble(x)) would be identical to bitwiseConvertDouble(x).

hmm, should the conversion just pass through if the type is already the output type or should it implicitly cast similar to the other bitwise functions?

It should implicitly cast, I think.

Generally I think function behavior is easier to understand if the function implicitly casts its inputs to the type that it expects, vs. changing behavior based on its input type.
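A minimal sketch of the two designs under discussion, assuming inputs arrive as boxed `Long` or `Double` values. Every name below is hypothetical; it is meant only to show why a type-dependent function is harder to reason about than two functions that cast their input.

```java
// Hypothetical sketch, not the Druid implementation.
final class ConvertSketch
{
  // Type-dependent design: the direction of the conversion depends on the
  // runtime type of the input, so applying it twice undoes the first call.
  static Number convertEitherWay(Number x)
  {
    if (x instanceof Long) {
      return Double.longBitsToDouble(x.longValue());
    }
    return Double.doubleToLongBits(x.doubleValue());
  }

  // Split design: each function casts its input to the type it expects.
  static long doubleToLongBits(Number x)
  {
    return Double.doubleToLongBits(x.doubleValue());
  }

  static double longBitsToDouble(Number x)
  {
    return Double.longBitsToDouble(x.longValue());
  }

  public static void main(String[] args)
  {
    // Applying the type-dependent function twice returns the original input.
    System.out.println(convertEitherWay(convertEitherWay(2.345))); // 2.345

    // With implicit casting, a long 1 and a double 1.0 give the same bits.
    System.out.println(doubleToLongBits(1L) == doubleToLongBits(1.0)); // true
  }
}
```

With the split design the result depends only on the value after the cast, so a long input to the double-to-bits direction is first converted to a double and then has its bits copied out.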

By the way, testing bitwiseConvertLongBitsToDouble(bitwiseConvertLongBitsToDouble(..)) uncovered an issue with the parser when parsing the output of Expr.stringify. Unit tests cover this round-trip scenario, and when flatten is true the expression is turned into a constant. Large doubles with exponents, e.g. 1E10, could not be parsed correctly, so I expanded the grammar to allow them, roughly according to these rules.
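To show where exponent notation comes from, here is a small sketch that assumes a double constant is rendered as text in the same way as Java's `Double.toString`. That assumption, and the class name, are mine and are not confirmed by this thread.

```java
// Hypothetical sketch: how Java renders large and small doubles as text.
final class StringifyRoundTripSketch
{
  static String render(double value)
  {
    return Double.toString(value);
  }

  public static void main(String[] args)
  {
    System.out.println(render(1E10));    // 1.0E10
    System.out.println(render(1.0E-5));  // 1.0E-5
    // Java itself parses the rendered text back to the same value.
    System.out.println(Double.parseDouble(render(1E10)) == 1E10); // true
  }
}
```

Any grammar that has to re-read such text needs to accept an optional exponent, including a negative one.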

gianm · 2021-01-08T07:20:16Z

core/src/test/java/org/apache/druid/math/expr/FunctionTest.java

+    assertExpr("bitwiseAnd('2', '1')", null);
+    assertExpr("bitwiseAnd(2, '1')", 0L);
+
+    assertExpr("bitwiseOr(2.345, 1)", 4612462889363109315L);


Please include (double, double) and (long, double) in addition to (double, long) args.

More exhaustive coverage, which should include these combinations, is done in VectorExprSanityTest, where non-vectorized and vectorized evaluation results are asserted to be equal for a variety of input combinations. I can add explicit tests here too, since those checks don't necessarily confirm correctness, just self-consistency between the two evaluation modes.

I see, I missed that test. Sounds good.

gianm · 2021-01-15T09:13:42Z

core/src/main/antlr4/org/apache/druid/math/expr/antlr/Expr.g4

+EXP: [eE] [-]? LONG;
+// DOUBLE provides partial support for java double format
+// see: https://docs.oracle.com/javase/8/docs/api/java/lang/Double.html#valueOf-java.lang.String-
+DOUBLE : 'NaN' | 'Infinity' | (LONG '.' LONG) | (LONG EXP) | (LONG '.' LONG EXP);


This used to allow 10. as a double, but now it doesn't. I think we should add that back. (with tests 🙂)

good catch 👍

fixed, and added tests
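For illustration, the DOUBLE rule quoted in the diff above can be approximated with a Java regular expression to see why `10.` stopped matching. This is a sketch under assumptions: LONG is taken to be one or more digits, and the second pattern is only one hypothetical way to accept a trailing dot, not necessarily the fix that was committed.

```java
import java.util.regex.Pattern;

// Hypothetical regex approximation of the lexer rule, for experimentation only.
final class DoubleTokenSketch
{
  // Assumption: LONG is one or more decimal digits.
  private static final String LONG = "[0-9]+";
  private static final String EXP = "[eE][-]?" + LONG;

  // Transcription of the DOUBLE rule as quoted in the diff.
  static final Pattern AS_QUOTED = Pattern.compile(
      "NaN|Infinity|" + LONG + "\\." + LONG + "|" + LONG + EXP + "|" + LONG + "\\." + LONG + EXP
  );

  // Hypothetical variant in which the digits after the dot are optional.
  static final Pattern WITH_TRAILING_DOT = Pattern.compile(
      "NaN|Infinity|" + LONG + "\\.(" + LONG + ")?(" + EXP + ")?|" + LONG + EXP
  );

  static boolean matches(Pattern pattern, String text)
  {
    return pattern.matcher(text).matches();
  }

  public static void main(String[] args)
  {
    System.out.println(matches(AS_QUOTED, "1.0E10"));       // true
    System.out.println(matches(AS_QUOTED, "10."));          // false
    System.out.println(matches(WITH_TRAILING_DOT, "10."));  // true
    System.out.println(matches(WITH_TRAILING_DOT, "10"));   // false, still a long
  }
}
```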

gianm · 2021-01-15T09:17:26Z

docs/misc/math-expr.md

-|bitwiseXor|bitwiseXor(x,y) would return the result of x ^ y. Double values will be converted to their bit representation|
+|bitwiseAnd|bitwiseAnd(x,y) would return the result of x & y. Double values will be implicitly cast to longs, use `bitwiseConvertDoubleToLongBits` to perform bitwise operations directly with doubles|
+|bitwiseComplement|bitwiseComplement(x) would return the result of ~x. Double values will be implicitly cast to longs, use `bitwiseConvertDoubleToLongBits` to perform bitwise operations directly with doubles|
+|bitwiseConvertDoubleToLongBits|bitwiseConvertDoubleToLongBits(x) would convert the IEEE 754 floating-point "double" bits stored in a long into a double value if the input is a long, or implicitly cast the value to a long if the input is a double|


Are these docs for bitwiseConvertDoubleToLongBits correct? It doesn't sound like something that the function should do.

Heh, no. When I split the functions I deleted the wrong half of the description, and then rewrote the half I meant to delete in a different way for the function I split out 🙃

gianm · 2021-01-20T17:32:28Z

docs/misc/math-expr.md

+|atan|atan(x) returns the arc tangent of x|
+|bitwiseAnd|bitwiseAnd(x,y) returns the result of x & y. Double values will be implicitly cast to longs, use `bitwiseConvertDoubleToLongBits` to perform bitwise operations directly with doubles|
+|bitwiseComplement|bitwiseComplement(x) returns the result of ~x. Double values will be implicitly cast to longs, use `bitwiseConvertDoubleToLongBits` to perform bitwise operations directly with doubles|
+|bitwiseConvertDoubleToLongBits|bitwiseConvertDoubleToLongBits(x) will convert the IEEE 754 floating-point "double" bits of a double value into a long, or implicitly cast the value to a double if the input is a long.|


This still doesn't seem right; surely, if the input is a long, the function will cast the value to a double and then convert those double bits back to a long. The description doesn't make it sound like that's what happens.

kaplanmaxe and others added 5 commits August 1, 2020 19:10

expressions: adding bitwise expressions

b940467

Merge remote-tracking branch 'upstream/master' into bitwise-math-functions

a8e08e0

double handling and vectorization

88529c0

move conversion to Evals

ecc103a

revert unintended changes

1b5269b

clintropolis added the Area - Querying, Design Review, and Release Notes labels Nov 25, 2020

clintropolis mentioned this pull request Nov 26, 2020

add bitwise and, or, negate expressions #10084

Closed

5 tasks

clintropolis mentioned this pull request Jan 8, 2021

Adding bitwise expressions #10230

Closed

4 tasks

gianm reviewed Jan 8, 2021

clintropolis added 6 commits January 8, 2021 04:16

Merge remote-tracking branch 'upstream/master' into bitwise-math-functions

55e0a84

Merge remote-tracking branch 'upstream/master' into bitwise-math-functions

61d3118

less magic, split convert functions, fix parser for funny exponent doubles

85e3074

Merge remote-tracking branch 'upstream/master' into bitwise-math-functions

f16026a

fix spelling exceptions list

d988512

more spelling

72a452a

gianm reviewed Jan 15, 2021

fix grammar, add more test, fix docs

58fb617

gianm reviewed Jan 20, 2021

fix docs

623e0b7

gianm approved these changes Jan 28, 2021

gianm merged commit 2ce7b3d into apache:master Jan 28, 2021

clintropolis deleted the bitwise-math-functions branch January 28, 2021 19:42

clintropolis mentioned this pull request Feb 1, 2021

add SQL operators for bitwise expressions #10823

Merged

7 tasks

clintropolis mentioned this pull request May 21, 2021

bitwise aggregators, better null handling options for expression agg #11280

Merged

6 tasks

clintropolis added this to the 0.22.0 milestone Aug 12, 2021

clintropolis mentioned this pull request Sep 3, 2021

[Draft] 0.22.0 Release Notes #11657

Closed
