feat: UDAFs with multiple/variadic args #9361
This reverts commit 392f618.
… for unused indices
…ove old single index methods
…type passed for variadic args
First pass at reviewing. I haven't reviewed the unit tests. This looks great so far!
ksqldb-engine/src/main/java/io/confluent/ksql/function/UdafFactoryInvoker.java
    this.right = right;
  }

  public T1 getLeft() {
The KLIP only mentions 10 because the `javatuples` library goes up to 10. If we're implementing our own classes, I don't think we really need to go up to 10. How were you planning to implement Quartet+?
I created classes for each tuple type (`Quadruple`, `Quintuple`, ...), so I was going to add these as cases in `BaseAggregateFunction#determineInputConverter()`. Ten column arguments does seem a bit excessive, though. Maybe stop at five?
Five is reasonable. My question was more, were you planning on following the same `getLeft`/`getRight` pattern?
For `Quadruple` and later, I've been using `getFirst()`, `getSecond()`, `getThird()`, and so on since `getLeft()` and similar didn't fit. I used `getLeft()` and `getRight()` for `Triple` since `Pair` uses these terms.
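To make the naming split concrete, here is a minimal sketch of the two accessor styles described in this thread. These are illustrative stand-ins, not the classes from this PR. In particular, `getMiddle()` is an assumed name, since the thread only mentions `getLeft()` and `getRight()` for `Triple`.

```java
// Illustrative stand-in, not the PR's class. Follows the Pair-style naming.
final class Triple<T1, T2, T3> {
  private final T1 left;
  private final T2 middle;
  private final T3 right;

  Triple(final T1 left, final T2 middle, final T3 right) {
    this.left = left;
    this.middle = middle;
    this.right = right;
  }

  T1 getLeft() {
    return left;
  }

  // Assumed accessor name; the review thread does not name the middle getter.
  T2 getMiddle() {
    return middle;
  }

  T3 getRight() {
    return right;
  }
}

// Illustrative stand-in. From four elements on, positional names are used
// because left/right no longer describe the slots.
final class Quadruple<T1, T2, T3, T4> {
  private final T1 first;
  private final T2 second;
  private final T3 third;
  private final T4 fourth;

  Quadruple(final T1 first, final T2 second, final T3 third, final T4 fourth) {
    this.first = first;
    this.second = second;
    this.third = third;
    this.fourth = fourth;
  }

  T1 getFirst() {
    return first;
  }

  T2 getSecond() {
    return second;
  }

  T3 getThird() {
    return third;
  }

  T4 getFourth() {
    return fourth;
  }
}
```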
@@ -73,8 +73,7 @@ public AggregateFunctionFactory(final UdfMetadata metadata) {
    this.metadata = Objects.requireNonNull(metadata, "metadata can't be null");
  }

  public abstract KsqlAggregateFunction<?, ?, ?> createAggregateFunction(
      List<SqlArgument> argTypeList, AggregateFunctionInitArguments initArgs);
  public abstract FunctionSource getFunction(List<SqlType> argTypeList);
This makes it impossible to write aggregate functions with time unit initial args.
Zara and I discussed this on Slack, and this is not a regression. In the old implementation and the current implementation, initial args must be `Literal`s, but time units are not `Literal`s (`IntervalUnit` extends `Expression` instead of `Literal`). This might be reported as a separate issue.
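A rough sketch of why a time unit fails the check, assuming a much-simplified version of the expression hierarchy. The class bodies and the `allLiterals` helper are hypothetical and only mirror the relationship stated above (`IntervalUnit` extends `Expression`, not `Literal`); the real ksqlDB classes carry more state.

```java
import java.util.List;
import java.util.concurrent.TimeUnit;

// Simplified stand-ins for the expression tree; not the real ksqlDB classes.
abstract class Expression {
}

abstract class Literal extends Expression {
  abstract Object getValue();
}

final class IntegerLiteral extends Literal {
  private final int value;

  IntegerLiteral(final int value) {
    this.value = value;
  }

  @Override
  Object getValue() {
    return value;
  }
}

// Extends Expression directly, so it is never an instance of Literal.
final class IntervalUnit extends Expression {
  private final TimeUnit unit;

  IntervalUnit(final TimeUnit unit) {
    this.unit = unit;
  }

  TimeUnit getUnit() {
    return unit;
  }
}

final class InitArgCheck {
  private InitArgCheck() {
  }

  // Hypothetical helper: true only if every initial argument is a Literal.
  static boolean allLiterals(final List<? extends Expression> initArgs) {
    return initArgs.stream().allMatch(arg -> arg instanceof Literal);
  }
}
```

Under these assumptions, an init-arg list containing `new IntervalUnit(TimeUnit.SECONDS)` is rejected both before and after this PR, which is why it is not a regression.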
ksqldb-execution/src/main/java/io/confluent/ksql/execution/function/UdafUtil.java
...ecution/src/test/java/io/confluent/ksql/execution/function/udaf/KudafUndoAggregatorTest.java
Description
This is the main PR to add support for multiple UDAF arguments. One argument can be variadic, either through wrapping the last column argument type in `VariadicArgs` or making the UDAF factory method have a variadic argument.

There are other changes that will be made as separate, smaller PRs:
- Support for up to five (originally ten) column arguments. This PR only adds support for two and three column arguments (though adding the rest is trivial).

This PR should not introduce breaking changes.
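As a rough illustration of what a two-column UDAF could look like, here is a self-contained sketch. The `Udaf` interface and `Pair` class below are local stand-ins so the example compiles without ksqlDB on the classpath, and `WeightedSumUdaf` is a hypothetical function, not one added by this PR. A real UDAF would also need the usual factory annotations, which are omitted here.

```java
// Stand-in mirroring the general shape of a UDAF interface
// (input I, aggregate A, output O); an assumption for this sketch.
interface Udaf<I, A, O> {
  A initialize();

  A aggregate(I current, A aggregate);

  A merge(A aggOne, A aggTwo);

  O map(A agg);
}

// Stand-in two-element tuple carrying one value per column argument.
final class Pair<T1, T2> {
  private final T1 left;
  private final T2 right;

  Pair(final T1 left, final T2 right) {
    this.left = left;
    this.right = right;
  }

  T1 getLeft() {
    return left;
  }

  T2 getRight() {
    return right;
  }
}

// Hypothetical UDAF over two columns: sums value * weight across rows.
final class WeightedSumUdaf implements Udaf<Pair<Long, Long>, Long, Long> {
  @Override
  public Long initialize() {
    return 0L;
  }

  @Override
  public Long aggregate(final Pair<Long, Long> current, final Long aggregate) {
    // Skip rows where either column is null.
    if (current.getLeft() == null || current.getRight() == null) {
      return aggregate;
    }
    return aggregate + current.getLeft() * current.getRight();
  }

  @Override
  public Long merge(final Long aggOne, final Long aggTwo) {
    return aggOne + aggTwo;
  }

  @Override
  public Long map(final Long agg) {
    return agg;
  }
}
```

The point of the sketch is that the input type parameter becomes a tuple, so each call to `aggregate` sees all column arguments for one row at once.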
The most important changes are in `ExpressionTypeManager`, `AggregateNode`, `UdafUtil`, `UdafAggregateFunctionFactory` (resolving multi-param UDAFs), `UdafTypes`, `UdafFactoryInvoker` (handling multi-param UDAF signatures), and `KudafAggregator` (applying multiple columns). Most of the other changes are tweaks to use lists for column indices or arrays for param schemas.

Testing done
Added unit tests to existing classes that were tweaked. Added QTTs to cover typical and edge cases with multi-param UDAFs.
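For readers skimming the diff, the move from a single column index to a list of indices, mentioned in the description, can be pictured roughly as follows. `ColumnExtractor` and `extractArgs` are hypothetical names used only for this sketch and do not correspond to code in the PR.

```java
import java.util.ArrayList;
import java.util.List;

final class ColumnExtractor {
  private ColumnExtractor() {
  }

  // Collects the row values at the given column indices, in argument order.
  // With a single-argument UDAF the list simply has one entry.
  static List<Object> extractArgs(final List<Object> row, final List<Integer> argIndices) {
    final List<Object> args = new ArrayList<>(argIndices.size());
    for (final int index : argIndices) {
      args.add(row.get(index));
    }
    return args;
  }
}
```

The extracted values would then be packed into the matching tuple type before being handed to the UDAF.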