Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bitwise math function expressions #10605

Merged
merged 13 commits into from
Jan 28, 2021

Conversation

clintropolis
Copy link
Member

@clintropolis clintropolis commented Nov 25, 2020

Description

This PR picks up the commit from #10230, which adds bitwise math functions (bitwiseAnd, bitwiseComplement , bitwiseOr, bitwiseShiftLeft, bitwiseShiftRight, bitwiseXor) to the Druid native expression system, and adds bitwiseConvertDoubleToLongBits and bitwiseConvertLongBitsToDouble to allow use with double typed columns.

Finally, I've added vectorization support so these expressions can be utilized in vectorized query engines, as well as tests.

I'll save adding SQL support as a follow-up PR.

Related #8560


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Key changed/added classes in this PR
  • Function
  • VectorMathProcessors


public static long doubleToLongBits(double x)
{
return Double.doubleToLongBits(x);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add this instead of calling the thing in Double?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trying to remember.. I think i was just trying to be consistent with the other value conversion functions and put it here, but it probably could just call it directly as it seems unlike we would change the function

@@ -119,6 +119,13 @@ See javadoc of java.lang.Math for detailed explanation for each function.
|acos|acos(x) would return the arc cosine of x|
|asin|asin(x) would return the arc sine of x|
|atan|atan(x) would return the arc tangent of x|
|bitwiseAnd|bitwiseAnd(x,y) would return the result of x & y. Double values will be converted to their bit representation|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why convert doubles to their bit representations instead of casting them to longs? Casting to long would, I think, make more sense since we can think of it as an implicit cast of double-typed arguments to a function that only accepts longs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thinking back, I think i originally did this behavior before I added bitwiseConvertDouble, so it was done as a way to do bitwise operations on double values. After I added the explicit function, it isn't really necessary anymore, so will revert to the behavior of casting and assuming long inputs.

@@ -119,6 +119,13 @@ See javadoc of java.lang.Math for detailed explanation for each function.
|acos|acos(x) would return the arc cosine of x|
|asin|asin(x) would return the arc sine of x|
|atan|atan(x) would return the arc tangent of x|
|bitwiseAnd|bitwiseAnd(x,y) would return the result of x & y. Double values will be converted to their bit representation|
|bitwiseComplement|bitwiseComplement(x) would return the result of ~x. Double values will be converted to their bit representation|
|bitwiseConvertDouble|bitwiseConvertDouble(x) would convert the IEEE 754 floating-point "double" bits stored in a long into a double value if the input is a long, or the copy bits of a double value into a long if the input is a double.|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is kind of weird because it doesn't have a fixpoint. I'd think that bitwiseConvertDouble(bitwiseConvertDouble(x)) would be identical to bitwiseConvertDouble(x). The lack of fixpoint makes it hard to reason about what the result of this function is going to be. Is there a specific reason it's designed this way? If not, I'd suggest splitting into two functions for each direction of the conversion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will split into bitwiseConvertDoubleToLongBits and bitwiseConvertLongBitsToDouble

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd think that bitwiseConvertDouble(bitwiseConvertDouble(x)) would be identical to bitwiseConvertDouble(x).

hmm, should the conversion just pass through if the type is already the output type or should it implicitly cast similar to the other bitwise functions?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should implicitly cast, I think.

Generally I think function behavior is easier to understand if the function implicitly casts its inputs to the type that it expects, vs. changing behavior based on its input type.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

btw, testing bitwiseConvertLongBitsToDouble(bitwiseConvertLongBitsToDouble(..)) uncovered an issue with the parser when trying to parse the output of Expr.stringify (because unit tests cover this round trip scenario, and when flatten is true it turns it into a constant), where large doubles with exponents, e.g. 1E10 or whatever, could not be correctly parsed, so I expanded the grammar to allow it roughly according to these rules

assertExpr("bitwiseAnd('2', '1')", null);
assertExpr("bitwiseAnd(2, '1')", 0L);

assertExpr("bitwiseOr(2.345, 1)", 4612462889363109315L);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please include (double, double) and (long, double) in addition to (double, long) args.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

more exhaustive coverage which should include these combinations is done in VectorExprSanityTest, where non-vectorized and vectorized evaluation results are asserted to be equal with a variety of combinations of inputs, but I can add explicit tests here since those don't necessarily confirm correctness, just self consistency between the two evaluation modes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, I missed that test. Sounds good.

EXP: [eE] [-]? LONG;
// DOUBLE provides partial support for java double format
// see: https://docs.oracle.com/javase/8/docs/api/java/lang/Double.html#valueOf-java.lang.String-
DOUBLE : 'NaN' | 'Infinity' | (LONG '.' LONG) | (LONG EXP) | (LONG '.' LONG EXP);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This used to allow 10. as a double, but now it doesn't. I think we should add that back. (with tests 🙂)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch 👍

fixed, and added tests

|bitwiseXor|bitwiseXor(x,y) would return the result of x ^ y. Double values will be converted to their bit representation|
|bitwiseAnd|bitwiseAnd(x,y) would return the result of x & y. Double values will be implicitly cast to longs, use `bitwiseConvertDoubleToLongBits` to perform bitwise operations directly with doubles|
|bitwiseComplement|bitwiseComplement(x) would return the result of ~x. Double values will be implicitly cast to longs, use `bitwiseConvertDoubleToLongBits` to perform bitwise operations directly with doubles|
|bitwiseConvertDoubleToLongBits|bitwiseConvertDoubleToLongBits(x) would convert the IEEE 754 floating-point "double" bits stored in a long into a double value if the input is a long, or implicitly cast the value to a long if the input is a double|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are these docs for bitwiseConvertDoubleToLongBits correct? It doesn't sound like something that the function should do.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

heh no, when I split the functions i deleted the wrong half of the description .. and then wrote the one i meant to delete again a different way for the function i split out 🙃

|atan|atan(x) returns the arc tangent of x|
|bitwiseAnd|bitwiseAnd(x,y) returns the result of x & y. Double values will be implicitly cast to longs, use `bitwiseConvertDoubleToLongBits` to perform bitwise operations directly with doubles|
|bitwiseComplement|bitwiseComplement(x) returns the result of ~x. Double values will be implicitly cast to longs, use `bitwiseConvertDoubleToLongBits` to perform bitwise operations directly with doubles|
|bitwiseConvertDoubleToLongBits|bitwiseConvertDoubleToLongBits(x) will convert the IEEE 754 floating-point "double" bits of a double value into a long, or implicitly cast the value to a double if the input is a long.|
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This still doesn't seem right; surely, if the input is a long, the function will cast the value to a double and then convert those double bits back to a long. The description doesn't make it sound like that's what happens.

@gianm gianm merged commit 2ce7b3d into apache:master Jan 28, 2021
@clintropolis clintropolis deleted the bitwise-math-functions branch January 28, 2021 19:42
@clintropolis clintropolis added this to the 0.22.0 milestone Aug 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants