Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[multistage] Support TIMESTAMP type and date ops functions #11350

Merged
merged 6 commits into from Aug 17, 2023

Conversation

xiangfu0
Copy link
Contributor

@xiangfu0 xiangfu0 commented Aug 15, 2023

  • Convert TIMESTAMP type value at reduce phase.
  • Add TimeStampTest into IntegrationTests
  • Make Reinterpret an alias of cast
  • Support functions:
    • fromDateTime
    • toDateTime
    • timestampAdd
    • timestampDiff
    • year
    • yearOfWeek
    • monthOfYear
    • weekOfYear
    • dayOfYear
    • dayOfMonth
    • dayOfWeek
    • hour
    • minute
    • second
    • millisecond

Note quarter is not supported in v2 yet.

Before:
image

After:
image

image

@codecov-commenter
Copy link

codecov-commenter commented Aug 15, 2023

Codecov Report

Merging #11350 (284ede5) into master (b0c360a) will increase coverage by 0.00%.
The diff coverage is 41.30%.

@@            Coverage Diff            @@
##             master   #11350   +/-   ##
=========================================
  Coverage     61.46%   61.46%           
- Complexity     6512     6513    +1     
=========================================
  Files          2234     2234           
  Lines        120155   120191   +36     
  Branches      18235    18244    +9     
=========================================
+ Hits          73854    73880   +26     
- Misses        40888    40890    +2     
- Partials       5413     5421    +8     
Flag Coverage Δ
integration1 0.00% <0.00%> (ø)
integration2 0.00% <0.00%> (ø)
java-11 61.43% <41.30%> (-0.02%) ⬇️
java-17 61.32% <41.30%> (+47.08%) ⬆️
java-20 61.32% <41.30%> (-0.01%) ⬇️
temurin 61.46% <41.30%> (+<0.01%) ⬆️
unittests1 66.96% <44.18%> (-0.01%) ⬇️
unittests2 14.63% <0.00%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Changed Coverage Δ
...n/function/scalar/DataTypeConversionFunctions.java 71.42% <ø> (ø)
...sform/function/ScalarTransformFunctionWrapper.java 81.20% <0.00%> (-2.00%) ⬇️
...pinot/query/parser/CalciteRexExpressionParser.java 71.31% <0.00%> (-1.13%) ⬇️
.../query/planner/logical/RelToPlanNodeConverter.java 86.60% <ø> (ø)
...che/pinot/query/planner/logical/RexExpression.java 92.64% <0.00%> (-2.88%) ⬇️
...apache/pinot/query/planner/plannode/ValueNode.java 31.81% <0.00%> (-14.85%) ⬇️
.../pinot/query/service/dispatch/QueryDispatcher.java 0.00% <0.00%> (ø)
.../pinot/core/common/datablock/DataBlockBuilder.java 77.71% <33.33%> (-1.11%) ⬇️
...java/org/apache/pinot/common/utils/DataSchema.java 68.57% <71.42%> (-0.05%) ⬇️
...e/pinot/common/function/TransformFunctionType.java 88.38% <81.81%> (-0.45%) ⬇️
... and 1 more

... and 10 files with indirect coverage changes

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@xiangfu0 xiangfu0 added the multi-stage Related to the multi-stage query engine label Aug 15, 2023
@xiangfu0 xiangfu0 force-pushed the fixing-timstamp-type branch 3 times, most recently from cc9c0b9 to d94bc25 Compare August 16, 2023 23:51
Copy link
Contributor

@Jackie-Jiang Jackie-Jiang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add some tests

@@ -194,6 +196,10 @@ public static Expression toExpression(RexExpression rexNode, PinotQuery pinotQue
private static Expression rexLiteralToExpression(RexExpression.Literal rexLiteral) {
// TODO: currently literals are encoded as strings for V1, remove this and use directly literal type when it
// supports strong-type in V1.
if (rexLiteral.getDataType() == FieldSpec.DataType.TIMESTAMP
&& rexLiteral.getValue() instanceof GregorianCalendar) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(minor) Suggest removing the rexLiteral.getValue() instanceof GregorianCalendar so that if it returns unexpected value we can get exception instead of a completely wrong literal

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@@ -121,6 +121,10 @@ public void init(List<TransformFunction> arguments, Map<String, ColumnContext> c
parameterTypes[i].convert(literalTransformFunction.getBigDecimalLiteral(), PinotDataType.BIG_DECIMAL);
break;
case TIMESTAMP:
if (parameterTypes[i] == PinotDataType.STRING) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this applies to all data types

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For other data types, only BIG_DECIMAL is the inferable type. So just need to add support for that.

@@ -151,7 +151,11 @@ public static RowDataBlock buildFromRows(List<Object[]> rows, DataSchema dataSch
byteBuffer.putInt(((Boolean) value) ? 1 : 0);
break;
case TIMESTAMP:
byteBuffer.putLong(((Timestamp) value).getTime());
if (value instanceof Long) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we going to hit both? If so, can you add some java doc explaining when will we hit both? If we can hit both, then BOOLEAN will also break

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Certain non strong typed functions in v2 might return long value then in datablock build, we need to do the conversion.

@@ -353,6 +353,9 @@ public Serializable convert(Object value) {
case BOOLEAN:
return ((Number) value).intValue() == 1;
case TIMESTAMP:
if (value instanceof Timestamp) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this still apply? Ideally we should always have internal format value. If we break that contract, BOOLEAN will also break

@@ -124,6 +124,10 @@ public enum TransformFunctionType {
SqlTypeFamily.CHARACTER),
ordinal -> ordinal > 1)),

FROMDATETIME("fromDateTime", ReturnTypes.TIMESTAMP_NULLABLE,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a TODO to auto generate signature for scalar function. We should not manually add this for scalar function

@xiangfu0 xiangfu0 changed the title [multistage] Convert TIMESTAMP type value at reduce phase [multistage] Support TIMESTAMP type and date ops functions Aug 17, 2023
@xiangfu0 xiangfu0 force-pushed the fixing-timstamp-type branch 2 times, most recently from 0b5b0f7 to 284ede5 Compare August 17, 2023 09:37
@xiangfu0 xiangfu0 merged commit 3b83907 into apache:master Aug 17, 2023
22 checks passed
@xiangfu0 xiangfu0 deleted the fixing-timstamp-type branch August 17, 2023 15:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
multi-stage Related to the multi-stage query engine
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants