[FLINK-10976][table] Add Aggregate operator to Table API #8311
hequn8128 wants to merge 5 commits into apache:master
|
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community review pull requests.
Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process.
Details
The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands
The @flinkbot bot supports the following commands:
|
|
@flinkbot approve description |
| * AggregateFunction aggFunc = new MyAggregateFunction() | ||
| * tableEnv.registerFunction("aggFunc", aggFunc); | ||
| * table.aggregate("aggFunc(a, b) as (f0, f1, f2)") | ||
| * .select("key, f0, f1") |
| } | ||
|
|
||
| /** | ||
| * The implementation of a [[AggregatedTable]] that has been grouped on a set of grouping keys. |
a -> an
that has been performed on an aggregate function.
| * The implementation of a [[AggregatedTable]] that has been grouped on a set of grouping keys. | ||
| */ | ||
| class AggregatedTableImpl( | ||
| private[flink] val table: Table, |
| } | ||
|
|
||
| def rowBasedAggregate( | ||
| groupingExpressions: JList[Expression], |
| : TableOperation = { | ||
| // resolve for java string case, i.e., turn LookupCallExpression to CallExpression. | ||
| val resolver = resolverFor(tableCatalog, functionCatalog, child).build | ||
| val resolvedAggregate = resolveSingleExpression(aggregate, resolver) |
docs/dev/table/tableApi.md
Outdated
| tableEnv.registerFunction("myAggFunc", myAggFunc); | ||
| Table table = input | ||
| .groupBy("key") | ||
| .aggregate("myAggFunc(a, b) as (x, y, z)") |
| } | ||
| } | ||
|
|
||
| def resetAccumulator(acc: MyMinMaxAcc): Unit = { |
The Scala example has resetAccumulator, while the Java example does not.
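For readers unfamiliar with the method this comment refers to, here is a minimal plain-Java sketch of an accumulator with a reset method. It deliberately does not extend Flink's AggregateFunction so that it runs standalone. Only `MyMinMaxAcc` and `resetAccumulator` come from the Scala snippet above; the `int` fields, the `MyMinMax` class name, and the other method bodies are assumptions made for illustration, not the code in the PR.

```java
public class MinMaxSketch {

    /** Accumulator holding the running minimum and maximum (field types are assumed). */
    static final class MyMinMaxAcc {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
    }

    /** Hypothetical stand-in for an aggregate function; it does not extend any Flink class. */
    static final class MyMinMax {

        MyMinMaxAcc createAccumulator() {
            return new MyMinMaxAcc();
        }

        // Folds one input value into the accumulator.
        void accumulate(MyMinMaxAcc acc, int value) {
            acc.min = Math.min(acc.min, value);
            acc.max = Math.max(acc.max, value);
        }

        // Restores the initial state so the same accumulator object can be reused.
        void resetAccumulator(MyMinMaxAcc acc) {
            acc.min = Integer.MAX_VALUE;
            acc.max = Integer.MIN_VALUE;
        }

        // Returns the result as {min, max}.
        int[] getValue(MyMinMaxAcc acc) {
            return new int[] {acc.min, acc.max};
        }
    }

    public static void main(String[] args) {
        MyMinMax function = new MyMinMax();
        MyMinMaxAcc acc = function.createAccumulator();
        function.accumulate(acc, 3);
        function.accumulate(acc, 7);
        int[] result = function.getValue(acc);
        System.out.println("min=" + result[0] + ", max=" + result[1]);

        // After a reset, earlier values no longer influence the result.
        function.resetAccumulator(acc);
        function.accumulate(acc, 5);
        result = function.getValue(acc);
        System.out.println("min=" + result[0] + ", max=" + result[1]);
    }
}
```

If the Java documentation example is meant to mirror the Scala one, a method of this shape is presumably what the reviewer is asking for.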
| } | ||
|
|
||
| @Test(expected = classOf[ValidationException]) | ||
| def testTableFunctionInSelection(): Unit = { |
This test failed with the exception:
org.apache.flink.table.api.ValidationException: Given parameters of function 'func' do not match any signature.
Actual: (java.lang.Long)
Expected: (java.lang.String)
|
|
||
| table | ||
| .groupBy('a) | ||
| // must fail. Only AggregateFunction can be used in aggregate |
The exception message is not friendly for users:
org.apache.flink.table.api.ValidationException: Invalid arguments [log(b), 'd'] for function: as
| util.tableEnv.registerFunction("func", new TableFunc0) | ||
| table | ||
| .groupBy('a) | ||
| // must fail. Only AggregateFunction can be used in aggregate |
Only AggregateFunction -> Only one AggregateFunction
|
@hequn8128 Thanks a lot for the PR. LGTM overall with just a few comments. |
|
@flinkbot approve-until architecture |
2ccf79b to 09e3e73
|
@sunjincheng121 @dianfu Thanks a lot for your review. I have updated the PR and rebased it onto master. Best, Hequn |
|
@hequn8128 Thanks for the update. LGTM. +1 |
|
@flinkbot approve-until architecture |
sunjincheng121 left a comment
Thanks for the update @hequn8128
I only left some suggestions.
Best,
Jincheng
| /** | ||
| * Performs an aggregate operation with an aggregate function. Use this before a selection | ||
| * to perform the selection operation. The output will be flattened if the output type is a | ||
| * composite type. |
"Use this before a selection to perform the selection operation." -> "You have to close the aggregate with a select statement."?
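One way to read this suggestion: `aggregate` returns an intermediate `AggregatedTable` whose only operation is `select`, so a query cannot continue without closing the aggregate. The sketch below illustrates that idea in plain Java without any Flink dependency. The `Table` and `AggregatedTable` names mirror the PR, but `SimpleTable`, `explain`, the string-based plan, and the string-only `select` signature are hypothetical stand-ins, not the actual Table API.

```java
public class AggregateThenSelectSketch {

    /** Minimal stand-in for a table; it only tracks a textual plan. */
    interface Table {
        AggregatedTable aggregate(String aggregateFunction);

        String explain();
    }

    /** Intermediate result of aggregate(); select() is the only way back to a Table. */
    interface AggregatedTable {
        Table select(String... fields);
    }

    /** Hypothetical implementation that records each step as text. */
    static final class SimpleTable implements Table {
        private final String plan;

        SimpleTable(String plan) {
            this.plan = plan;
        }

        @Override
        public AggregatedTable aggregate(String aggregateFunction) {
            String aggregated = plan + " -> aggregate(" + aggregateFunction + ")";
            // The returned object exposes nothing but select(...).
            return fields ->
                new SimpleTable(aggregated + " -> select(" + String.join(", ", fields) + ")");
        }

        @Override
        public String explain() {
            return plan;
        }
    }

    public static void main(String[] args) {
        Table result = new SimpleTable("input")
            .aggregate("aggFunc(a, b) as (f0, f1, f2)")
            .select("f0", "f1");
        System.out.println(result.explain());
    }
}
```

With this shape, calling `aggregate(...)` a second time directly on the intermediate result does not compile, which is the constraint the reworded Javadoc would describe.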
|
|
||
| /** | ||
| * Performs a global aggregate operation with an aggregate function. Use this before a selection | ||
| * to perform the selection operation. The output will be flattened if the output type is a |
| aggregate(ExpressionParser.parseExpression(aggregateFunction)) | ||
| } | ||
|
|
||
| override def aggregate(aggregateFunction: Expression): AggregatedTable = { |
Can we make the naming consistent between aggregateFunction and tableAggFunction?
I suggest using the complete word, i.e., Agg -> Aggregate. What do you think?
|
|
||
| override def select(fields: Expression*): Table = { | ||
| new TableImpl(tableImpl.tableEnv, | ||
| tableImpl.operationTreeBuilder.project(fields, |
I think we can unify the aggregate and flatAggregate implementations, i.e., both AggregatedTableImpl#select and FlatAggregateTableImpl#select can use operationTreeBuilder.project or xxTable.select(fields: _*). What do you think?
| aggregateOperationFactory.createAggregate(resolvedGroupings, resolvedAggregates, child) | ||
| } | ||
|
|
||
| def rowBasedAggregate( |
I think we can rename rowBasedAggregate to aggregate and add some comments to illustrate the difference between the two aggregate methods. What do you think?
| verifyTableEquals(resJava, resScala) | ||
| } | ||
|
|
||
| def testNonGroupedAggregate2(): Unit = { |
testNonGroupedAggregate2 -> testRowBasedNonGroupedAggregate?
|
@sunjincheng121 Thanks a lot for your review and suggestions. I have updated the PR according to your comments. Best, Hequn |
|
LGTM. +1 to merge.
What is the purpose of the change
This pull request adds a row-based aggregate function to the Table API. Note: this PR is based on #7235. Thank you @dianfu for your excellent work.
Brief change log
Verifying this change
This change added tests and can be verified as follows:
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (no)
Documentation