-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GenericTransformFunction wrapper for simple ScalarFunctions #5440
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome! Change the description to say that this is a generic transform function that can call any scalarfunction that return a single value
...rc/main/java/org/apache/pinot/core/operator/transform/function/GenericTransformFunction.java
Outdated
Show resolved
Hide resolved
|
||
@Override | ||
public TransformResultMetadata getResultMetadata() { | ||
return STRING_SV_NO_DICTIONARY_METADATA; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can add another switch statement in init to set this based on the return type of the scalarfunction, see functionInvoker.getReturnType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For return type just a switch won't work
Different return types will need a different function overrides
e.g. transformToStringValuesSV for string, transformToIntValuesSV for int and so on
Or am I missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, we need to override methods for int, long, double, float. Shouldn't be hard, its mostly copy-paste. similar to _stringResultArray, create a resultArray for each type and instantiate them in the init
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does it look better now?
This is great! |
...rc/main/java/org/apache/pinot/core/operator/transform/function/TransformFunctionFactory.java
Show resolved
Hide resolved
|
||
public class GenericTransformFunctionTest extends BaseTransformFunctionTest { | ||
|
||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this exercises the compilation path. You should add tests in CalciteSqlCompiler test file as well.
Also, we should add tests for end to end query execution. See TransformQueriesTest class or consider adding execution tests to one of the existing query tests
@fx19880617 @siddharthteotia Should I add tests in CalciteSQL for all the functions? |
...rc/main/java/org/apache/pinot/core/operator/transform/function/GenericTransformFunction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/apache/pinot/core/operator/transform/function/GenericTransformFunction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/apache/pinot/core/operator/transform/function/GenericTransformFunction.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/apache/pinot/core/operator/transform/function/TransformFunctionFactory.java
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/function/StringFunctions.java
Show resolved
Hide resolved
pinot-common/src/main/java/org/apache/pinot/common/function/annotations/ScalarFunction.java
Show resolved
Hide resolved
@KKcorps , sorry missed this. Yes, the query compilation tests should be in CalciteSqlCompilerTest. Here we can verify that PinotQuery is being built correctly and that gets converted to BrokerRequest correctly. Most other tests in this file do this validation. The other suggestion was to also add unit tests for exercising end-to-end execution path. Please consider adding these tests to an appropriate file in |
pinot-common/src/main/java/org/apache/pinot/common/function/FunctionRegistry.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/pinot/core/operator/transform/function/ScalarTransformFunctionWrapper.java
Show resolved
Hide resolved
TransformFunction function = arguments.get(i); | ||
if (function instanceof LiteralTransformFunction) { | ||
String literal = ((LiteralTransformFunction) function).getLiteral(); | ||
Class paramType = _functionInvoker.getParameterTypes()[i]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest directly comparing class instead of String comparison for both performance and readability:
if (paramType == Integer.class) {
...
} else if (paramType == Long.class) {
...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would require changing switch case to a lot of else if since switch doesn't accept Class type. IMO, switch case looks neat so made this trade off
...n/java/org/apache/pinot/core/operator/transform/function/ScalarTransformFunctionWrapper.java
Outdated
Show resolved
Hide resolved
...rc/main/java/org/apache/pinot/core/operator/transform/function/TransformFunctionFactory.java
Show resolved
Hide resolved
...rc/main/java/org/apache/pinot/core/operator/transform/function/TransformFunctionFactory.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/pinot/core/operator/transform/function/ScalarTransformFunctionWrapper.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/pinot/core/operator/transform/function/ScalarTransformFunctionWrapper.java
Outdated
Show resolved
Hide resolved
...n/java/org/apache/pinot/core/operator/transform/function/ScalarTransformFunctionWrapper.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great feature! LGTM except for the unaddressed comments
Is the plan to use this wrapper solely for invoking scalar functions (like already done in this PR for StringFunctions) or are we expecting follow-ups to integrate it with rest of the transform functions. I think only the former? |
@sidd only for scalarfunctions. |
The new FunctionRegistry change of using reflection to plug in methods (introduced in #5440) is causing failure in query compilation. The problem is caused by Reflections library not being thread-safe when multiple threads are accessing the same jar file. We have multiple libraries (jersey, swagger) using reflection to load classes. The FunctionRegistry uses reflection in the static block, which happens when the class is loaded. If the first query arrives (first usage of FunctionRegistry which runs the static block) during the setup of the jersey server, Reflections will throw exception. Using reflection in the static block can also cause the first query to block on reflection scanning the classes, which can potentially take seconds. Read more about the thread-safety issue here: ronmamo/reflections#81 Users are reporting the same issue even with the latest Reflections version 0.9.12. A common solution is to downgrade Reflections to 0.9.9, but we have other dependencies rely on 0.9.11, so downgrading might cause other issues. The solution here is to introduce a no-op init() method to trigger the static block so that we can control when to scan the files to avoid the thread-safety issue as well as blocking the first query. Another benefit of using an init() method is that the exception won't be swallowed (the exception in static block will be swallowed and query engine will start getting ClassNotFoundException) This PR also introduces a convention for the plugin methods using reflection, where the class must includes ".function." in its class path. This can significantly reduce the time of class scanning (reduced from 4 seconds to 200 ms locally)
Today, adding new TransformFunction is quite involved. Supporting simple static java functions requires one to write a lot of boiler plate code. This PR adds a generic Transform Function that can invoke any of the registered functions in FunctionRegistry.
This PR tests this new feature with StringFunctions.