Support group by span over time based column with Span UDF #3421
Signed-off-by: Songkan Tang <songkant@amazon.com>
…ngine Signed-off-by: Songkan Tang <songkant@amazon.com>
| date.atStartOfDay().atZone(ZoneOffset.UTC).toInstant().toEpochMilli(), interval); | ||
| return SqlFunctions.timestampToDate(dateEpochValue); | ||
| case SqlTypeName.TIME: | ||
| if (dateTimeUnit.getId() > 4) { |
Could you explain why this limitation is needed?
Added an inline comment. The TIME type usually represents a time of day with no year, month, or day component, for example the field value '17:59:59.99'.
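To illustrate the distinction discussed here, the following is a minimal sketch of flooring DATE and TIME values to a span, written with plain java.time. It is not the code from this PR: the class and method names are hypothetical, it handles only fixed-width units, and the rule that TIME rejects units larger than an hour is an assumption standing in for the `dateTimeUnit.getId() > 4` check.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.LocalTime;
import java.time.ZoneOffset;
import java.time.temporal.ChronoUnit;

// Hypothetical helper, not part of the PR.
final class SpanRoundingSketch {

    /** Floors epoch milliseconds to the start of the enclosing fixed-width bucket. */
    static long floorToInterval(long epochMillis, long intervalMillis) {
        return Math.floorDiv(epochMillis, intervalMillis) * intervalMillis;
    }

    /** DATE: convert to UTC midnight, floor to the interval, convert back to a date. */
    static LocalDate spanDate(LocalDate date, int value, ChronoUnit unit) {
        long intervalMillis = unit.getDuration().toMillis() * value;
        long epochMillis =
            date.atStartOfDay().atZone(ZoneOffset.UTC).toInstant().toEpochMilli();
        long floored = floorToInterval(epochMillis, intervalMillis);
        return Instant.ofEpochMilli(floored).atZone(ZoneOffset.UTC).toLocalDate();
    }

    /** TIME: a time of day has no date part, so units above HOURS are rejected (assumption). */
    static LocalTime spanTime(LocalTime time, int value, ChronoUnit unit) {
        if (unit.compareTo(ChronoUnit.HOURS) > 0) {
            throw new IllegalArgumentException(
                "Unit " + unit + " is not supported for TIME values");
        }
        long intervalNanos = unit.getDuration().toNanos() * value;
        long floored = Math.floorDiv(time.toNanoOfDay(), intervalNanos) * intervalNanos;
        return LocalTime.ofNanoOfDay(floored);
    }
}
```

For example, `spanTime(LocalTime.parse("17:59:59.99"), 15, ChronoUnit.MINUTES)` yields 17:45, while asking for a span in days on a TIME value throws.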
| * day, 1 month, 1 hour | ||
| * </ol> | ||
| */ | ||
| public class SpanFunction implements UserDefinedFunction { |
Maybe leave a TODO here to refactor this in the future. Your previous idea of implementing a self-defined implementor seems like a better approach.
It looks like this could be implemented by replacing ScalarFunction (which we currently use) with ImplementableFunction and moving the logic for handling different field types into the implementor. That way, the span function itself would be simpler, since it would only handle the timestamp type.
Then we could have separate functions such as dateToTs, tsToDate, and span(ForTs), which would give us more flexibility to reuse or combine them.
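The decomposition suggested above could look roughly like the sketch below. The function names dateToTs, tsToDate, and spanForTs come from the comment; their signatures, the enclosing class, and the use of plain java.time instead of Calcite's ImplementableFunction are assumptions made only to show how the pieces would compose.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneOffset;
import java.util.function.LongUnaryOperator;

// Hypothetical composition sketch, not the PR's implementation.
final class SpanCompositionSketch {

    /** Converts a date to epoch milliseconds at UTC midnight. */
    static long dateToTs(LocalDate date) {
        return date.atStartOfDay(ZoneOffset.UTC).toInstant().toEpochMilli();
    }

    /** Converts epoch milliseconds back to a UTC calendar date. */
    static LocalDate tsToDate(long epochMillis) {
        return Instant.ofEpochMilli(epochMillis).atZone(ZoneOffset.UTC).toLocalDate();
    }

    /** The only span logic: floor a timestamp to a fixed-width interval. */
    static LongUnaryOperator spanForTs(long intervalMillis) {
        return ts -> Math.floorDiv(ts, intervalMillis) * intervalMillis;
    }

    /** Span over a DATE is then just a composition of the three pieces. */
    static LocalDate spanForDate(LocalDate date, long intervalMillis) {
        return tsToDate(spanForTs(intervalMillis).applyAsLong(dateToTs(date)));
    }
}
```

With this shape, the span function only ever sees timestamps, and each field type contributes a pair of conversions around it.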
Signed-off-by: Songkan Tang <songkant@amazon.com>
| return SqlLibraryOperators.DATEADD; | ||
| // UDF Functions | ||
| case "SPAN": | ||
| return TransferUserDefinedFunction(SpanFunction.class, "SPAN", ReturnTypes.ARG0); |
ReturnTypes.ARG0 or ReturnTypes.ARG0_NULLABLE? What if the timestamp is null in span(timestamp, ...)?
I changed it to ReturnTypes.ARG0_NULLABLE, but a null is then transformed to a default long or int value such as 0L or 0 in the linq4j-generated code. Still looking into it.
Resolved the group-by nulls issue, which was caused by SQL type creation conflicts in RexNodeBuilder.
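As a rough illustration of why nullability matters in this thread, the sketch below contrasts a boxed, null-preserving span with what happens when a null is coerced to a primitive default. It is plain Java with hypothetical names and does not reproduce Calcite's or linq4j's actual code generation.

```java
// Hypothetical illustration of null handling, not Calcite-generated code.
final class NullableSpanSketch {

    /** Null-preserving span: a missing timestamp stays missing. */
    static Long spanNullable(Long epochMillis, long intervalMillis) {
        if (epochMillis == null) {
            return null;
        }
        return Math.floorDiv(epochMillis, intervalMillis) * intervalMillis;
    }

    /** What a primitive return type forces: null collapses into the 0 bucket. */
    static long spanWithDefault(Long epochMillis, long intervalMillis) {
        long value = (epochMillis == null) ? 0L : epochMillis;
        return Math.floorDiv(value, intervalMillis) * intervalMillis;
    }
}
```

In the second variant, rows with a null timestamp become indistinguishable from rows at the epoch, so a group-by would merge them into the same bucket.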
| executeQuery( | ||
| String.format( | ||
| "source=%s | stats avg(balance) by span(birthdate, 1 day) as age_balance", | ||
| "source=%s | stats avg(balance) by span(birthdate, 1 month) as age_balance", |
We need some ITs for spans such as 15 minutes.
Added a 15-minute IT.
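For reference, the behavior such a test would exercise can be sketched in plain Java as below: rows are bucketed by flooring their timestamp to a 15-minute span and the balance is averaged per bucket. The Row class, method name, and sample values are hypothetical and are not taken from the PR's integration tests.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical sketch of "stats avg(balance) by span(ts, 15 minute)" semantics.
final class FifteenMinuteSpanSketch {

    /** Minimal row holder; the field names are illustrative. */
    static final class Row {
        final Instant timestamp;
        final double balance;

        Row(Instant timestamp, double balance) {
            this.timestamp = timestamp;
            this.balance = balance;
        }
    }

    /** Groups rows by the start of their span and averages the balance per group. */
    static Map<Instant, Double> avgBalanceBySpan(List<Row> rows, Duration span) {
        long spanMillis = span.toMillis();
        return rows.stream().collect(Collectors.groupingBy(
            row -> Instant.ofEpochMilli(
                Math.floorDiv(row.timestamp.toEpochMilli(), spanMillis) * spanMillis),
            TreeMap::new, // keep buckets in chronological order
            Collectors.averagingDouble(row -> row.balance)));
    }
}
```

Rows at 10:07 and 10:14 fall into the 10:00 bucket, while a row at 10:16 starts the 10:15 bucket.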
Signed-off-by: Songkan Tang <songkant@amazon.com>
Flaky test. Let me trigger a re-run. cc @yuancu
Signed-off-by: Songkan Tang <songkant@amazon.com>
Added the corresponding change and more ITs to cover span over different kinds of date formats.
Looks like we have some flaky tests, this time there is one in
Merged commit b99130a into opensearch-project:feature/calcite-engine
Failure due to flaky test, merged.
* Support group by span over time based column with Span UDF
* Minor fix UT and IT after merge
* Add custom minute span IT
* Remove print
* Fix spotless style
* Fix groupby nulls issue
* Remove unnecessary sql type creation in RexNodeBuilder
* Fix rounding for date and time type
* Add more tests
* Revert unrelated change in build.gradle
---------
Signed-off-by: Songkan Tang <songkant@amazon.com>
Signed-off-by: xinyual <xinyual@amazon.com>
Description
Support group by span over time based column with Span UDF
Related Issues
Resolves #3354
Check List
--signoff.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.