-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add date time transform functions #5313
Comments
One thing to keep in mind when designing: In the case of Simple Date Format, we don't want to create an instance of SDF formatter for every row. |
@reallocf does this look like something you can help out with? |
@npawar yes, would love to help with this! Will look into it later today and ask any questions I have here. Thanks! |
Chatted with @npawar a bit over Slack. She added a number of the functions here: https://github.com/apache/incubator-pinot/pull/5324/files so I'm focusing mostly on LocalDateTime conversions here. The interfaces should be something like:
The core complexity here is in not creating a new SimpleDateFormat for each row (as Neha already mentioned). Right now, all DateTimeFunctions are static - if we were to write similar static functions we would either need to create a new SimpleDateFormat for each row (too slow) or create a simple cache for SimpleDateFormat (the joys of cache invalidation...). So, the better approach is to refactor the FunctionRegistry to allow non-static functions to be processed so we can have some shared memory between rows. I'm exploring the best way to go about this now, using the usage of TransformFunctions as an example. |
Also update the FunctionRegistry to avoid requiring functions to be static in order to initialize them. Adds related tests.
Adds toDateTime and fromDateTime inbuilt functions (issue #5313 ). 1) toDateTime takes a long of millis since epoch and a pattern string and returns a string corresponding to the DateTime since epoch as the passed millis, formatted by the pattern. 2) fromDateTime takes in a DateTime string and a pattern that the DateTime string is formatted in and returns a long of millis since epoch corresponding to the DateTime string. Also renamed DefaultFunctionEvaluator to InbuiltFunctionEvaluator and FunctionRegistry to InbuiltFunctionRegistry. Adds a FunctionRegistryFactory to create InbuiltFunctionRegistrys with a specified set of inbuilt functions. In doing so, enables InbuiltFunctionRegistry to register both non-static and static functions. Adds a DateTimePatternHandler that converts DateTime strings to epoch longs and epoch longs to DateTime strings backed by a cache of joda DateTimeFormatters (each cache is specific to the particular upload). Adds new tests for toDateTime and fromDateTime and an InbuiltFunctionEvaluator test that makes sure the InbuiltFunctionRegistry internal state persists between rows for the lifetime of the InbuiltFunctionEvaluator.
With #5326 merging, can we close this? Or should we leave it open for some follow-up work? |
This is a followup to: #5312
We need to exhaustively add a lot of time transformation functions. This is part of the broader task: https://docs.google.com/document/d/1SU1jCjfsIDSA960fD5YWQbD72p8UdGF0c7CroFNt9Ho/edit#heading=h.y1084mvhfcft
The text was updated successfully, but these errors were encountered: