-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add api in AggregationFunction to get its compiled input expressions. #5339
Conversation
37dc52c
to
f02f5f3
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM otherwise
columns.addAll(brokerRequest.getGroupBy().getExpressions()); | ||
} | ||
} else { | ||
private void setMaxDocsForSelection(BrokerRequest brokerRequest) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This logic should also be handle in the upper level (SelectionPlanNode) and passed to this class
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah I thought so too. But upper levels can be aggregation as well, which will now have this field leaked. For now I tend to keep it here (as was the case before) until the full cleanup happens.
private int _maxDocPerNextCall = DocIdSetPlanNode.MAX_DOC_PER_CALL; | ||
|
||
public TransformPlanNode(IndexSegment indexSegment, BrokerRequest brokerRequest) { | ||
public TransformPlanNode(IndexSegment indexSegment, BrokerRequest brokerRequest, | ||
Set<TransformExpressionTree> expressionsToPlan) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(nit) expressionsToPlan
-> expressions
?
Also pass maxDocsPerBlock
from upper level?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
expressions
was too generic and being used in too many places for different purposes, so I thought to call it expressionsToPlan
.
Replied on maxDocsPerBlock
above.
...core/src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunction.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/pinot/core/query/aggregation/function/AggregationFunctionUtils.java
Outdated
Show resolved
Hide resolved
...src/main/java/org/apache/pinot/core/query/aggregation/function/CountAggregationFunction.java
Outdated
Show resolved
Hide resolved
With aggregation functions now taking multiple agruments, only the functions themselves have the knowledge on how to interpret these arguments. This poses a problem for the planning phase on what columns need to be projected and what expressions need to be computed. With this change, AggregationFunction's are now responsible for providing what inputs they need. 1. Added a new api in AggregationFunction interface `getInputExpressions()`, that returns a list of compiled TransformExpressionTrees that the aggregation function needs as input to compute. 2. Cleaned up the chaining data dependency during planning phase. Before this PR, all planning nodes receive the BrokerRequest (and pass to their child plan node) to extract out all information needed. With this change: - Aggregation plan nodes only specify the expression trees they need from Transform plan nodes, and Transform plan nodes use that to specify what columns they need from projection plan nodes. TODO: Ideally we should completely eliminate passing of BrokerRequest throughout the chain plan nodes, and only pass minimal information instead. This change only does so for projection columns. A TODO here is to extend it to FilterPlanNode and deeper.
With aggregation functions now taking multiple agruments, only the functions themselves
have the knowledge on how to interpret these arguments. This poses a problem for the planning
phase on what columns need to be projected and what expressions need to be computed.
With this change, AggregationFunction's are now responsible for providing what inputs they need.
Added a new api in AggregationFunction interface
getInputExpressions()
, that returns a listof compiled TransformExpressionTrees that the aggregation function needs as input to compute.
Cleaned up the chaining data dependency during planning phase. Before this PR, all planning nodes
receive the BrokerRequest (and pass to their child plan node) to extract out all information needed.
With this change:
Transform plan nodes use that to specify what columns they need from projection plan nodes.
TODO: Ideally we should completely eliminate passing of BrokerRequest throughout the chain plan nodes,
and only pass minimal information instead. This change only does so for projection columns. A TODO here
is to extend it to FilterPlanNode and deeper.