bitwise aggregators, better null handling options for expression agg #11280
@@ -121,6 +124,7 @@ public ExpressionLambdaAggregatorFactory(

this.initialValueExpressionString = initialValue;
this.initialCombineValueExpressionString = initialCombineValue == null ? initialValue : initialCombineValue;
+ this.initiallyNull = initiallyNull == null ? NullHandling.sqlCompatible() : initiallyNull;
This could be named as useNullInitially or isInitiallyNull to make it more clear in code. As above, I would also be fine with some other better name.
renamed to isNullUnlessAggregated to more clearly indicate that it is a boolean and hopefully indicate its main role in determining aggregator behavior. initiallyNull seemed confusing alongside initialValue.
@@ -48,7 +49,7 @@
public static ExprEval deserialize(ByteBuffer buffer, int position)
{
// | expression type (byte) | expression bytes |
- ExprType type = ExprType.fromByte(buffer.get(position));
+ ExprType type = ExprType.fromByte((byte) (buffer.get(position) & TYPE_MASK));
I think it would be better to add a comment noting that only BufferLambdaExpressionAggregator calls this, hence we are clearing the sign bit due to the implementation in the aggregator.
I reworked this to do the masking in the buffer aggregator instead of here
lgtm % comments
@@ -195,6 +199,12 @@ public String getInitialCombineValueExpressionString()
return initialCombineValueExpressionString;
}

+ @JsonProperty("initiallyNull")
this should also be changed to isNullUnlessAggregated
import javax.annotation.Nullable;
import java.nio.ByteBuffer;

public class ExpressionLambdaBufferAggregator implements BufferAggregator
{
private static final short NOT_AGGREGATED_BIT = 1 << 7;
private static final short IS_AGGREGATED_MASK = 0x3F;
private static final byte TYPE_MASK = 0x0F;
is it possible to drop either TYPE_MASK or IS_AGGREGATED_MASK and use a common mask whose value is 0x0F?
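For readers following the mask discussion, here is a minimal sketch of how a single "not aggregated" flag can share a header byte with a type id. It is not the PR's actual code: the class name, method names, and byte layout are hypothetical, and only the values of NOT_AGGREGATED_BIT (1 << 7) and TYPE_MASK (0x0F) are taken from the diff above.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration only; not the ExpressionLambdaBufferAggregator implementation.
final class AggregatedFlagSketch
{
  // Values copied from the diff; how they are combined here is an assumption.
  static final int NOT_AGGREGATED_BIT = 1 << 7; // high bit of the header byte
  static final int TYPE_MASK = 0x0F;            // low bits assumed to hold the type id

  // Write the header byte, setting the flag when the result should be null until aggregated.
  static void init(ByteBuffer buf, int position, byte typeId, boolean isNullUnlessAggregated)
  {
    int header = typeId & TYPE_MASK;
    if (isNullUnlessAggregated) {
      header |= NOT_AGGREGATED_BIT;
    }
    buf.put(position, (byte) header);
  }

  // Clear the flag; called once a row has been aggregated.
  static void markAggregated(ByteBuffer buf, int position)
  {
    buf.put(position, (byte) (buf.get(position) & ~NOT_AGGREGATED_BIT));
  }

  // True when the flag bit is clear.
  static boolean isAggregated(ByteBuffer buf, int position)
  {
    return (buf.get(position) & NOT_AGGREGATED_BIT) == 0;
  }

  // Recover the type id regardless of the flag, by masking off the high bits.
  static byte typeId(ByteBuffer buf, int position)
  {
    return (byte) (buf.get(position) & TYPE_MASK);
  }
}
```

Because the flag lives in the high bit, a reader that wants only the type id has to mask the byte before interpreting it, which is why a single 0x0F mask can serve both purposes in this sketch.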
LGTM
thanks for review @jihoonson and @rohangarg 🤘
…pache#11280)
* bitwise aggregators, better nulls for expression agg
* correct behavior
* rework deserialize, better names
* fix json, share mask
Description
Builds on top of #11104 and #10605 to add bitwise aggregator functions:
BIT_AND(expr): null if druid.generic.useDefaultValueForNull=false, otherwise 0
BIT_OR(expr): null if druid.generic.useDefaultValueForNull=false, otherwise 0
BIT_XOR(expr): null if druid.generic.useDefaultValueForNull=false, otherwise 0
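As a rough illustration of the semantics listed above, the following sketch folds a set of non-null long values with a bitwise operator. It assumes that the null or 0 value is what each function yields when no rows were aggregated; the class and method names are hypothetical, and the boolean parameter stands in for druid.generic.useDefaultValueForNull. It is not Druid's implementation.

```java
import java.util.function.LongBinaryOperator;

// Hypothetical illustration only; not how Druid implements BIT_AND / BIT_OR / BIT_XOR.
final class BitwiseAggSketch
{
  // Fold the values with op; with no input rows, return 0 in default-value mode, else null.
  static Long aggregate(long[] values, LongBinaryOperator op, boolean useDefaultValueForNull)
  {
    if (values.length == 0) {
      return useDefaultValueForNull ? Long.valueOf(0L) : null;
    }
    long acc = values[0];
    for (int i = 1; i < values.length; i++) {
      acc = op.applyAsLong(acc, values[i]);
    }
    return acc;
  }

  static Long bitAnd(long[] values, boolean useDefaultValueForNull)
  {
    return aggregate(values, (a, b) -> a & b, useDefaultValueForNull);
  }

  static Long bitOr(long[] values, boolean useDefaultValueForNull)
  {
    return aggregate(values, (a, b) -> a | b, useDefaultValueForNull);
  }

  static Long bitXor(long[] values, boolean useDefaultValueForNull)
  {
    return aggregate(values, (a, b) -> a ^ b, useDefaultValueForNull);
  }
}
```

For the inputs 0b1100 and 0b1010, the three folds give 0b1000, 0b1110, and 0b0110 respectively; an empty input gives null or 0 depending on the flag.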
In the process of adding this, I've also modified ExpressionLambdaAggregatorFactory to have an additional JSON property, initiallyNull, which determines if the aggregator will produce a null value or initialValue/initialCombineValue. For example, an SQL compatible count aggregator would have initiallyNull set to false and have initialValue set to 0, so that it would always return 0 even if no rows were aggregated, while a sum would have it set to true so that it would return null in the same case. For the buffer aggregator, this is tracked by setting a bit in the expression type byte which prefixes all of the serialized expressions, which is then cleared whenever the aggregate function is called. This change simplifies ARRAY_AGG since it was previously using a finalize expression to coerce empty results back to null, but now it can just naturally be initialized to null.
This PR has: