add output type information to ExpressionPostAggregator #11818
…aggs to compute their output type based on input types
}

@Override
public ExpressionPostAggregator decorate(final Map<String, AggregatorFactory> aggregators)
{
  final ColumnInspector aggInspector = AggregatorUtil.inspectorForAggregatorFactoryMap(aggregators);
This doesn't look necessary… how about dropping it and also the new function in AggregatorUtil?
if (type != null) {
  return type;
}
final ColumnCapabilities capabilities = signature.getColumnCapabilities(fieldName);
Offtopic: it'd be nice to have `signature.getColumnType(fieldName)`.
`RowSignature` already has a `getColumnType`, though it returns an optional instead of nullable, so we would have to match it (or I guess give it another name).
Also, I still have a dream that someday `ExpressionType` and `ColumnType` will be the same, so then `getType` would serve that purpose, but we're not quite there yet.
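To illustrate the optional versus nullable distinction raised here, a minimal sketch follows. `SimpleSignature` and `getColumnTypeOrNull` are hypothetical names invented for this example, and types are represented as plain strings; none of this is Druid's actual `RowSignature` API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical, simplified stand-in for a row signature (not Druid's RowSignature).
// Column types are plain strings here purely for illustration.
final class SimpleSignature
{
  private final Map<String, String> types = new LinkedHashMap<>();

  SimpleSignature add(String name, String type)
  {
    types.put(name, type);
    return this;
  }

  // Optional-returning accessor, the shape the comment above describes as already existing.
  Optional<String> getColumnType(String name)
  {
    return Optional.ofNullable(types.get(name));
  }

  // Nullable variant. Java cannot overload on return type alone, so it needs
  // a different (hypothetical) name, which is the naming issue noted above.
  String getColumnTypeOrNull(String name)
  {
    return types.get(name);
  }
}
```

A caller that wants the nullable form from the optional one can also write `getColumnType(name).orElse(null)`.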
@@ -277,7 +277,8 @@ public Builder addPostAggregators(final List<PostAggregator> postAggregators)
);

// unlike aggregators, the type we see here is what we get, no further finalization will occur
add(name, postAggregator.getType());
// feed it the existing RowSignature for PostAggregator implementations whose output is dependent on input types
Alternate comment:
// It's OK to call getType in the order that post-aggregators appear, because post-aggregators are only
// allowed to refer to *earlier* post-aggregators (not later ones; the order is meaningful).
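The ordering argument in the suggested comment can be sketched with a toy builder that passes itself as the inspector while adding post-aggregators one at a time. All types below (`TypeInspector`, `SketchPostAgg`, `CopyTypePostAgg`, `SignatureBuilder`) are hypothetical simplifications with string-valued types, not the classes touched by this PR.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical minimal inspector: returns a column's type name, or null if unknown.
interface TypeInspector
{
  String typeOf(String column);
}

// Hypothetical post-aggregator whose output type depends on the columns seen so far.
interface SketchPostAgg
{
  String getName();

  String getType(TypeInspector inspector);
}

// Takes the type of the column it reads, like a field-access style post-aggregator.
final class CopyTypePostAgg implements SketchPostAgg
{
  private final String name;
  private final String fieldName;

  CopyTypePostAgg(String name, String fieldName)
  {
    this.name = name;
    this.fieldName = fieldName;
  }

  @Override
  public String getName()
  {
    return name;
  }

  @Override
  public String getType(TypeInspector inspector)
  {
    return inspector.typeOf(fieldName);
  }
}

// The builder is its own inspector, so each post-aggregator sees every column
// added before it, including earlier post-aggregators.
final class SignatureBuilder implements TypeInspector
{
  private final Map<String, String> columns = new LinkedHashMap<>();

  SignatureBuilder add(String name, String type)
  {
    columns.put(name, type);
    return this;
  }

  @Override
  public String typeOf(String column)
  {
    return columns.get(column);
  }

  SignatureBuilder addPostAggregators(List<SketchPostAgg> postAggregators)
  {
    for (SketchPostAgg postAggregator : postAggregators) {
      add(postAggregator.getName(), postAggregator.getType(this));
    }
    return this;
  }
}
```

In this sketch, a post-aggregator that refers to a later one would resolve to an unknown (null) type, which is why processing in declaration order is sufficient only when references point backwards.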
All CI passed except for "(Compile=openjdk8, Run=openjdk8, Cluster Build On K8s) ITNestedQueryPushDownTest integration test", which seems broken (a python issue?) and unrelated. So I'll merge this.
Description
Currently, the `PostAggregator` interface defines a method `getType` which returns a `ColumnType` for use when populating a `RowSignature`. Many `PostAggregator` implementations have fixed return types, and so are independent of the underlying `RowSignature`, but some, such as `FieldAccessPostAggregator` and `ExpressionPostAggregator`, are sensitive to their inputs.
Currently these dynamic return type `PostAggregator` implementations are unable to produce an output type until they have been decorated, which provides access to a map of column names to `AggregatorFactory`, which provide their own type information. The only place using the output type of a `PostAggregator` is `RowSignature`, so if we modify the `getType` method to accept a `ColumnInspector`, we can compute the output type prior to decoration, if the `RowSignature` provides itself as the `ColumnInspector`.
The `ExpressionPostAggregator` was missing an output type completely, despite the fact that the Druid expression system has for some time now been able to infer the output type of an expression if the input types of all of its free variables are known, so it has been updated to support producing the output type from either the inspector, or from the `decorate` method, if it has been called.
I also updated `FieldAccessPostAggregator` to be able to compute its output type from the `ColumnInspector` if `decorate` has not been called, but did not yet do `FinalizingFieldAccessPostAggregator`, as I think we would need a clearer split between intermediary and finalized `RowSignature` that is not yet in place.
I left the 'decoration computes type' pattern in place, and the code prefers those types over the `ColumnInspector` computed types because I wasn't completely sure if there would be any consequences to this, but it might very well be possible to remove them from these aggregators in the future if they aren't needed.
Key changed/added classes in this PR
PostAggregator
ExpressionPostAggregator
FieldAccessPostAggregator
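As a rough illustration of the "prefer the decorated type, otherwise ask the inspector" behavior described above, here is a minimal sketch. `InputTypes` and `FieldAccessSketch` are hypothetical names with string-valued types; this is an assumption-laden simplification, not the code in this PR.

```java
import java.util.Map;

// Hypothetical stand-in for a column inspector: type name for a column, or null if unknown.
interface InputTypes
{
  String typeOf(String column);
}

// Sketch of a field-access style post-aggregator with a dynamic output type.
final class FieldAccessSketch
{
  private final String fieldName;
  private final String decoratedType; // stays null until decorate(...) has been called

  FieldAccessSketch(String fieldName)
  {
    this(fieldName, null);
  }

  private FieldAccessSketch(String fieldName, String decoratedType)
  {
    this.fieldName = fieldName;
    this.decoratedType = decoratedType;
  }

  // Returns a copy that remembers the type reported by the aggregator map.
  FieldAccessSketch decorate(Map<String, String> aggregatorTypes)
  {
    return new FieldAccessSketch(fieldName, aggregatorTypes.get(fieldName));
  }

  // Prefer the type captured during decoration; otherwise compute it from the inspector.
  String getType(InputTypes inspector)
  {
    if (decoratedType != null) {
      return decoratedType;
    }
    return inspector.typeOf(fieldName);
  }
}
```

Keeping the decorated type as the first choice mirrors the cautious approach in the description: the inspector path only fills the gap before decoration has happened.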
This PR has: