Adding support to execute functions during query compilation #5406

Merged
merged 2 commits into from May 23, 2020

@kishoreg kishoreg commented May 18, 2020

Adding support to execute functions during query compilation

  1. Moved FunctionRegistry from pinot-core to pinot-common.
  2. Added support for a compile-time function invoker.

The following queries are now supported:
SELECT * FROM T where ts < now()
SELECT * FROM T where date < toDateTime(now(), 'yyyy-MM-dd z')

now() should be evaluated at query compilation time. The new logic detects any function that satisfies either of the following:

  • it has no arguments
  • all of its arguments are literals, or functions whose arguments are all literals (i.e. no column identifiers)
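
To illustrate the rule above, here is a minimal sketch of such a check. The `Expr` class is a hypothetical, simplified stand-in for the real expression tree, and applying the rule recursively to nested functions is an assumption of this sketch rather than something the description states.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical, simplified expression node; not Pinot's actual Expression class.
final class Expr {
  enum Kind { LITERAL, IDENTIFIER, FUNCTION }

  final Kind kind;
  final String value;        // literal text, column name, or function name
  final List<Expr> operands; // empty unless kind == FUNCTION

  private Expr(Kind kind, String value, List<Expr> operands) {
    this.kind = kind;
    this.value = value;
    this.operands = operands;
  }

  static Expr literal(String text) {
    return new Expr(Kind.LITERAL, text, Collections.<Expr>emptyList());
  }

  static Expr identifier(String column) {
    return new Expr(Kind.IDENTIFIER, column, Collections.<Expr>emptyList());
  }

  static Expr function(String name, Expr... args) {
    return new Expr(Kind.FUNCTION, name, Arrays.asList(args));
  }
}

final class CompileTimeCheck {
  // True when expr is a function with no arguments, or one whose arguments are
  // all literals or functions that themselves pass this check.
  static boolean isCompileTimeEvaluationPossible(Expr expr) {
    if (expr.kind != Expr.Kind.FUNCTION) {
      return false;
    }
    for (Expr operand : expr.operands) {
      boolean literal = operand.kind == Expr.Kind.LITERAL;
      boolean foldableFunction =
          operand.kind == Expr.Kind.FUNCTION && isCompileTimeEvaluationPossible(operand);
      if (!literal && !foldableFunction) {
        return false; // a column identifier appears somewhere below this function
      }
    }
    return true;
  }
}
```

Under this sketch, `toDateTime(now(), 'yyyy-MM-dd z')` qualifies, while a comparison such as `ts < now()` does not qualify as a whole because `ts` is a column; only its `now()` operand does.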

This PR introduces now() and datetimeFormat(String, String) as sample functions. A follow-up PR will add support for many other DateTime functions, such as those listed at https://prestodb.io/docs/current/functions/datetime.html

for (ScalarFunctionType value : values) {
String upperCaseFunctionName = value.getName().toUpperCase();
_scalarFunctions.put(upperCaseFunctionName, value);
_scalarFunctions.put(upperCaseFunctionName.replace("_", ""), value);
Contributor

What's the purpose of this? Flexibility to support variations of function names?

Contributor

For example, Presto uses _ in function names. Accepting both spellings makes function lookup more robust from the user's perspective.
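
A minimal sketch of the aliasing idea discussed here: each function is stored under its upper-cased name and again with underscores removed, so both spellings resolve to the same entry. The class `FunctionNameIndex` and the registered name `to_date_time` are hypothetical and are not part of the PR.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper; it mirrors the two put() calls in the hunk above.
final class FunctionNameIndex {
  private final Map<String, String> canonicalByAlias = new HashMap<>();

  // Registers the name upper-cased, plus a second alias with underscores removed.
  void register(String canonicalName) {
    String upper = canonicalName.toUpperCase(Locale.ROOT);
    canonicalByAlias.put(upper, canonicalName);
    canonicalByAlias.put(upper.replace("_", ""), canonicalName);
  }

  // Case-insensitive lookup; returns null when no alias matches.
  String resolve(String userSuppliedName) {
    return canonicalByAlias.get(userSuppliedName.toUpperCase(Locale.ROOT));
  }
}
```

With only these two aliases, a partially underscored spelling such as `to_datetime` would not resolve unless the lookup side also stripped underscores.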


ScalarFunctionType scalarFunctionType = ScalarFunctionType.getScalarFunctionType(funcName);
switch (scalarFunctionType) {
case NOW:
Contributor

Will this approach scale as the number of scalar functions increases? For example, each one would need to be added here. What do you think about modelling this as a query rewrite phase that goes over all scalar functions and evaluates them?

Contributor

@xiangfu0 xiangfu0 May 19, 2020

I think we will need to move to a function registry/invoker model.

Member Author

Yes, this is just to verify that the logic for detecting whether a function can be evaluated at query compile time works.

@kishoreg
Member Author

For this PR, please focus on the logic that determines whether a function can be evaluated at compile time. The evaluation itself will move into the FunctionRegistry/Function Invoker.
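
As a rough illustration of what a registry/invoker model could look like, the sketch below registers public static methods by name and calls them through reflection, so adding a function does not require a new switch case. All names here (`SampleScalarFunctions`, `ReflectiveInvoker`, `concat`) are hypothetical, and this is not the FunctionRegistry implementation from the PR; it also ignores overloads and argument type conversion.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical holder of scalar functions, used only for this sketch.
final class SampleScalarFunctions {
  public static long now() {
    return System.currentTimeMillis();
  }

  public static String concat(String left, String right) {
    return left + right;
  }
}

// Hypothetical invoker: looks a function up by name and calls it reflectively.
final class ReflectiveInvoker {
  private final Map<String, Method> methodsByName = new HashMap<>();

  // Registers every public static method of the holder under its upper-cased name.
  void registerAll(Class<?> holder) {
    for (Method method : holder.getDeclaredMethods()) {
      int modifiers = method.getModifiers();
      if (Modifier.isPublic(modifiers) && Modifier.isStatic(modifiers)) {
        methodsByName.put(method.getName().toUpperCase(Locale.ROOT), method);
      }
    }
  }

  // Invokes the named function with already-evaluated literal arguments.
  Object invoke(String name, Object... args) {
    Method method = methodsByName.get(name.toUpperCase(Locale.ROOT));
    if (method == null) {
      throw new IllegalArgumentException("Unknown function: " + name);
    }
    try {
      return method.invoke(null, args);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Failed to invoke " + name, e);
    }
  }
}
```

A caller would register the holder class once, for example `invoker.registerAll(SampleScalarFunctions.class)`, and then call `invoker.invoke("now")` during query compilation.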

@xiangfu0 xiangfu0 force-pushed the time-functions-in-query branch 7 times, most recently from 55c9814 to 1368749 Compare May 19, 2020 19:31
* @param funcExpr
* @return true if all arguments are literals
*/
private static boolean isCompileTimeEvaluationPossible(Expression funcExpr) {
Contributor

Instead of this, why don't we just have a constant-value function registry where all compile-time-evaluated functions are registered? The check then becomes whether the function is part of that registry.

Contributor

One consideration is that the same functions could also be used as transform functions in queries and in ingestion field conversion. Ideally, we should be able to evaluate any transform function whose arguments are all literals here.

Member Author

+1
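
The approach agreed on in this thread, evaluating any function whose arguments are all literals rather than consulting a separate list of constant functions, could be sketched as a bottom-up fold over the expression tree. `Node`, `ConstantFolder`, and the toy `UPPER`/`CONCAT` evaluators are hypothetical stand-ins chosen to keep the example deterministic; a real implementation would delegate evaluation to the function registry.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical minimal expression node; not Pinot's actual Expression class.
final class Node {
  enum Kind { LITERAL, COLUMN, FUNCTION }

  final Kind kind;
  final String text;     // literal value, column name, or function name
  final List<Node> args; // empty unless kind == FUNCTION

  private Node(Kind kind, String text, List<Node> args) {
    this.kind = kind;
    this.text = text;
    this.args = args;
  }

  static Node literal(String value) {
    return new Node(Kind.LITERAL, value, new ArrayList<Node>());
  }

  static Node column(String name) {
    return new Node(Kind.COLUMN, name, new ArrayList<Node>());
  }

  static Node function(String name, Node... args) {
    return new Node(Kind.FUNCTION, name, new ArrayList<>(Arrays.asList(args)));
  }
}

final class ConstantFolder {
  // Folds children first, then replaces a function with a literal when every
  // folded argument is a literal and the function can be evaluated.
  static Node fold(Node node) {
    if (node.kind != Node.Kind.FUNCTION) {
      return node;
    }
    List<Node> foldedArgs = new ArrayList<>();
    boolean allLiterals = true;
    for (Node arg : node.args) {
      Node folded = fold(arg);
      foldedArgs.add(folded);
      allLiterals = allLiterals && folded.kind == Node.Kind.LITERAL;
    }
    if (allLiterals) {
      String value = evaluate(node.text, foldedArgs);
      if (value != null) {
        return Node.literal(value);
      }
    }
    return Node.function(node.text, foldedArgs.toArray(new Node[0]));
  }

  // Toy evaluators standing in for a registry; returns null for unknown functions.
  private static String evaluate(String name, List<Node> args) {
    switch (name.toUpperCase(Locale.ROOT)) {
      case "UPPER":
        return args.size() == 1 ? args.get(0).text.toUpperCase(Locale.ROOT) : null;
      case "CONCAT":
        return args.size() == 2 ? args.get(0).text + args.get(1).text : null;
      default:
        return null;
    }
  }
}
```

Because children are folded before their parent, a nested call with only literal arguments collapses to a single literal, while any function that has a column below it is left in place with its foldable operands already replaced.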

if (funcSqlNode.getOperator().getKind() == SqlKind.OTHER_FUNCTION) {
funcName = funcSqlNode.getOperator().getName();
}
if (funcName.equalsIgnoreCase(SqlKind.COUNT.toString()) && (funcSqlNode.getFunctionQuantifier() != null)
Contributor

I didn't quite follow why there is special casing for DISTINCTCOUNT.

Contributor

This is special handling for the case of COUNT(DISTINCT A).
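
One way to picture this special case: when the parser reports a COUNT call carrying a DISTINCT quantifier, the function name is mapped to DISTINCTCOUNT before further processing. The sketch below is an assumption about the intent of the truncated condition in the hunk above. It uses plain strings rather than Calcite's SqlKind and quantifier node, and the helper `AggregationNames.resolveFunctionName` is hypothetical.

```java
// Hypothetical helper illustrating the COUNT(DISTINCT col) special case.
final class AggregationNames {
  // parsedName is the function name from the parser; quantifier is the
  // optional keyword before the argument (null when absent).
  static String resolveFunctionName(String parsedName, String quantifier) {
    boolean isCount = "COUNT".equalsIgnoreCase(parsedName);
    boolean isDistinct = "DISTINCT".equalsIgnoreCase(quantifier);
    if (isCount && isDistinct) {
      return "DISTINCTCOUNT";
    }
    return parsedName;
  }
}
```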

Contributor

npawar commented May 23, 2020

Will we eventually move towards also allowing functions which have column names?

@xiangfu0
Contributor

> Will we eventually move towards also allowing functions which have column names?

Do you mean that the function name contains a column name?

@xiangfu0 xiangfu0 merged commit 3f8ba71 into master May 23, 2020
@xiangfu0 xiangfu0 deleted the time-functions-in-query branch May 23, 2020 05:18
5 participants