[FLINK-14599][table-planner-blink] Support precision of TimestampType in blink planner #10105
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community.
Automated Checks: last check on commit 212e234 (Wed Apr 15 11:35:06 UTC 2020). Warnings:
Mention the bot in a comment to re-run the automated checks.
Review Progress: Please see the Pull Request Review Guide for a full explanation of the review process. The bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer or PMC member is required.
Bot commands: The @flinkbot bot supports the following commands:
Haven't finished yet...
val STRING_UTIL: String = className[BinaryStringUtil]
val SQL_TIMESTAMP_TERM: String = className[SqlTimestamp]
Change to SQL_TIMESTAMP; "term" is normally used as a variable name.
OK
// declaration
reusableMemberStatements.add(s"private long $fieldTerm;")
reusableMemberStatements.add(s"private $SQL_TIMESTAMP_TERM $fieldTerm;")
Why use SqlTimestamp when we want a ReusableLocalDateTime?
All java.sql.Timestamp and java.time.LocalDateTime values in our engine should be represented as SqlTimestamp now.
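For readers following along, a minimal standalone sketch of what that looks like in practice. fromEpochMillis, getMillisecond and toLocalDateTime appear in the diffs below; fromLocalDateTime and the package path are assumptions on my part:

```java
import java.time.LocalDateTime;
import org.apache.flink.table.dataformat.SqlTimestamp;

public class SqlTimestampSketch {
    public static void main(String[] args) {
        // compact case: only milliseconds since epoch, enough for precision <= 3
        SqlTimestamp fromMillis = SqlTimestamp.fromEpochMillis(1234L);
        System.out.println(fromMillis.getMillisecond());

        // high-precision case: built from a LocalDateTime, keeping nanoseconds
        SqlTimestamp fromLdt = SqlTimestamp.fromLocalDateTime(
            LocalDateTime.of(2018, 7, 26, 17, 18, 19, 123_456_700));
        System.out.println(fromLdt.toLocalDateTime());
    }
}
```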
case SqlTypeName.TIMESTAMP =>
val reducedValue = reduced.getField(reducedIdx)
val value = if (reducedValue != null) {
Long.box(reducedValue.asInstanceOf[SqlTimestamp].getMillisecond)
Why do we reduce the timestamp to Long but not SqlTimestamp?
I just don't want to break the original SQL timestamp literal logic. I think a separate commit should address this and let the SQL timestamp literal support precision.
case TIMESTAMP_WITHOUT_TIME_ZONE =>
// TODO: support Timestamp(3) now
val fieldTerm = newName("timestamp")
val millis = literalValue.asInstanceOf[Long]
Could you explain why, when we generate a literal for the TIMESTAMP type, the value is passed in as Long?
ditto
| $contextTerm.timerService().currentProcessingTime());
|""".stripMargin.trim
// the proctime has been materialized, so it's TIMESTAMP now, not PROCTIME_INDICATOR
GeneratedExpression(resultTerm, NEVER_NULL, resultCode, new TimestampType(3))
new TimestampType(3) -> resultType
OK
Thanks for this nice work! It makes Timestamp closer to the SQL standard. I left some comments.
@JingsongLi Please also take a look at this PR, especially the code generation and blink-runtime parts.
datetime = valueLiteral.getValueAs(Timestamp.class)
.orElseThrow(() -> new TableException("Invalid literal.")).toLocalDateTime();
} else {
throw new TableException("Invalid literal.");
record the illegal class here?
OK
if (precision > 9 || precision < 0) {
throw new TableException(
s"TIMESTAMP precision is not supported: ${relDataType.getPrecision}")
s"TIMESTAMP precision is not supported: ${precision}")
don't need {}
OK
case TIMESTAMP =>
if (relDataType.getPrecision > 3) {
val precision = relDataType.getPrecision
if (precision > 9 || precision < 0) {
Change 9 to TimestampType.MAX_PRECISION?
OK
Int.MaxValue

// The maximal precision of TIMESTAMP is 3, change it to 9 to support nanoseconds precision
case SqlTypeName.TIMESTAMP => 9
Change 9 to TimestampType.MAX_PRECISION?
BTW, please also check line #46, I remember our default precision for TIMESTAMP is 6?
Yes. But the default precision of Timestamp in Calcite is 3.
Should we change this behavior?
What is the impact of changing it to 9?
Changed it to 6 and fixed failures in 90f3316.
}

// TODO: we copied the logic of TimestampString::getMillisSinceEpoch since the copied
// DateTimeUtils.ymdToJulian is wrong.
From the comment of DateTimeUtils, I think we are already aware of CALCITE-1884. Do you mean we didn't fix it in the copied DateTimeUtils?
Yes. CALCITE-1884 is not fixed in our copied DateTimeUtils, and it's the root cause of some delicate cases, such as:
SELECT TIMESTAMP '1500-04-30 00:00:00.123456789' FROM docs;
SELECT CAST('1500-04-30 00:00:00.123456789' AS DATETIME(9)) FROM docs;
should return 1500-05-10T00:00:00.123456789
So how about fixing it in our copied DateTimeUtils?
Two reasons for not fixing our copied DateTimeUtils in this PR:
- the two copies of DateTimeUtils should remain the same in the legacy planner and the blink planner
- the impact should be evaluated for both planners
FLINK-11935 should do this, and after that this copied code could be removed.
I would like to create another issue for this, since I don't think the author of FLINK-11935 will be aware of this code and remove it.
LocalTimeConverter.INSTANCE.toExternal(v)

case TIMESTAMP_WITHOUT_TIME_ZONE =>
val v = literal.getValueAs(classOf[java.lang.Long])
Why is the literal for TIMESTAMP_WITHOUT_TIME_ZONE a long? I remember somewhere else you presume the literal is a TimestampString.
Good catch. It should be TimestampString and preserve the nanosecond precision.
testSqlApi(
"ARRAY[TIMESTAMP '1985-04-11 14:15:16.1234567', TIMESTAMP '2018-07-26 17:18:19.1234567']",
"[1985-04-11T14:15:16.123456700, 2018-07-26T17:18:19.123456700]")
Why isn't the result 1985-04-11T14:15:16.1234567?
We use the LocalDateTime.toString style to represent string-format TimestampType when precision > 3. It follows one of the following ISO-8601 formats:
uuuu-MM-dd'T'HH:mm:ss.SSSSSS
uuuu-MM-dd'T'HH:mm:ss.SSSSSSSSS
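For illustration, a small standalone snippet (not from this PR) showing why java.time's default formatting pads the fraction to 6 or 9 digits, which is where the trailing zeros in the expected strings come from:

```java
import java.time.LocalDateTime;

public class LocalDateTimeToStringDemo {
    public static void main(String[] args) {
        // 123_456_700 ns is not a whole number of microseconds, so toString prints 9 digits
        System.out.println(LocalDateTime.of(1985, 4, 11, 14, 15, 16, 123_456_700));
        // -> 1985-04-11T14:15:16.123456700

        // 123_400_000 ns is a whole number of microseconds, so toString prints 6 digits
        System.out.println(LocalDateTime.of(1985, 4, 11, 14, 15, 16, 123_400_000));
        // -> 1985-04-11T14:15:16.123400
    }
}
```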
Can you check whether this fits the sql standard?
The interpreted string should conform to the definition of a timestamp literal (SQL:2011 Part 2 Section 6.13 General Rules (11) (d)). I will fix it in the next commit.
testSqlApi(
"ARRAY[TIMESTAMP '1985-04-11 14:15:16.1234', TIMESTAMP '2018-07-26 17:18:19.1234']",
"[1985-04-11T14:15:16.123400, 2018-07-26T17:18:19.123400]")
Why isn't the result 1985-04-11T14:15:16.1234?
ditto
import java.time.{Instant, ZoneId}
import java.util.{Locale, TimeZone}

class TemporalTypesTest extends ExpressionTestBase {
Could you extract all Timestamp-related tests to a dedicated class? TemporalTypesTest seems weird.
So divide it into TimestampTypeTest/DateTypeTest/TimeTypeTest?
That would be great.
Double-checked the TemporalTypesTest and found that some datetime functions can handle multiple datetime types (e.g. extract/floor etc.). Splitting the class would cause tests for a single function to live in different places. So I would rename this to DateTimeTypesTest to avoid confusion.
assertIndexIsValid(pos);

if (SqlTimestamp.isCompact(precision)) {
return SqlTimestamp.fromEpochMillis(segments[0].getLong(getElementOffset(pos, 8)));
I think always reading from segments[0] is wrong.
Good catch, it should be SegmentsUtil.getLong(segments, getElementOffset(pos, 8)).
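A suggestion-style sketch of the fix, reusing the names from the diff above; the surrounding method and fields are assumed context, not shown in this PR excerpt:

```java
// read through SegmentsUtil so the 8-byte value can live in any segment, not just segments[0]
if (SqlTimestamp.isCompact(precision)) {
    return SqlTimestamp.fromEpochMillis(
        SegmentsUtil.getLong(segments, getElementOffset(pos, 8)));
}
```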
Thanks @docete, I will review this over the next two days.
Can we put nanos of
OK. It can save 4 bytes per high precision timestamp object.
import java.io.File
import java.util.TimeZone

import org.apache.calcite.util.TimestampString
put this together with other calcite imports
import java.math.{BigDecimal => JBigDecimal}
import java.lang.{Integer => JInteger}

import org.apache.flink.table.runtime.functions.SqlDateTimeUtils
put this together with other flink imports
val v = literal.getValueAs(classOf[java.lang.Long])
LocalDateTimeConverter.INSTANCE.toExternal(v)
val timestampString = literal.getValueAs(classOf[TimestampString])
val millis = SqlDateTimeUtils.getMillisSinceEpoch(timestampString.toString)
Reuse the result of timestampString.toString
"CAST(f0 AS TIMESTAMP)", | ||
"1990-10-14 00:00:00.000") | ||
"CAST(f0 AS TIMESTAMP(3))", | ||
"1990-10-14 00:00:00") |
TIMESTAMP(3) should be printed as 1990-10-14 00:00:00.000?
SQL:2011 defines the interpreted string value as the shortest character string that conforms to the definition of literal. And I checked other DBMSs:
MySQL 5.6: SELECT CAST(TIMESTAMP '1970-01-01 11:22:33' AS DATETIME(3)) => '1970-01-01T11:22:33Z'
PG 9.6: SELECT CAST('1970-01-01 11:22:33' AS TIMESTAMP(3)) => '1970-01-01T11:22:33Z'
ORACLE 11g R2: SELECT CAST(TIMESTAMP '1970-01-01 11:22:33' AS TIMESTAMP(3)) FROM log_table => '1970-01-01 11:22:33.0'
MSSQL 2017: SELECT CAST({ts '1970-01-01 11:22:33.000'} AS DATETIME), CAST({ts '1970-01-01 11:22:33.000'} AS DATETIME2) => '1970-01-01T11:22:33Z', '1970-01-01 11:22:33.0000000'
Only MSSQL's DATETIME2 preserves trailing '0's, and its precision cannot be specified. IMO we shouldn't preserve the trailing '0's.
IMO 1990-10-14 00:00:00.000 is more intuitive, right?
Ignoring the trailing '0's is more standard-compliant since it's the shortest representation of the literal?
I don't know. Does the SQL standard say anything about this topic?
No. It just describes the cast specification from datetime types to char/varchar types:
If SD is a datetime data type or an interval data type then let Y be the shortest character string that conforms to the definition of <literal> in Subclause 5.3, “<literal>”, and such that the interpreted value of Y is SV and the interpreted precision of Y is the precision of SD. If SV is a negative interval, then <sign> shall be specified within <unquoted interval string> in the literal Y.
<!-- Use TimestampString to cover CAST TIMESTAMP to STRING -->
<dependency>
  <groupId>org.apache.calcite</groupId>
  <artifactId>calcite-core</artifactId>
I would prefer not to introduce a calcite dependency for the table-runtime module.
Is there anything harmful about introducing a calcite dependency? Calcite's TimestampString provides standard-compliant processing of timestamp literals which we can depend on.
I recall that depending on calcite in different places introduced some packaging problems during 1.9.
+1 to not introduce calcite to runtime.
+1 to not introducing calcite. We are aiming to be Calcite-free in the runtime to support different users' Calcite versions in the long term (shading Calcite is not easy).
case TIMESTAMP_WITHOUT_TIME_ZONE =>
generateOperatorIfNotNull(ctx, new TimestampType(), left, right) {
(l, r) => s"($l * ${MILLIS_PER_DAY}L) $op $r"
generateOperatorIfNotNull(ctx, new TimestampType(3), left, right) {
what do you think about this?
case (TIMESTAMP_WITHOUT_TIME_ZONE, BIGINT) =>
generateUnaryOperatorIfNotNull(ctx, targetType, operand) {
operandTerm => s"((long) ($operandTerm / 1000))"
operandTerm => s"((long) ($operandTerm.getMillisecond() / 1000))"
what do you think about this?
Please rebase to master; we should let Travis verify all the tests since you modified a lot of them.
@KurtYoung For the Date plus Day-Time Interval case and the Cast Timestamp to BigInt case, I prefer to leave them to the next PR. The semantics of the arithmetic and cast operators for date-time types need to be double-checked, and mixing the change of only one or two cases into this PR would be incomplete. By the way, the semantics of TIMESTAMPADD for DATE +/- INTERVAL_DAY_TIME (hours, minutes and seconds) seem OK to return a TIMESTAMP. What do you think?
…rface to TypeGetterSetters/VectorizedColumnBatch and writeTimestamp interface to BinaryWriter
…sentation of Timestamp type
…ng and vice versa
…on for Timestamp type
…-in function for Timestamp type
…sting Timestamp to Timestamp with local time zone
…n for Timestamp type
…p type in parameters or result
…kupTableFunction()
…and LegacyLocalDateTimeTypeInfo to hold precision on conversion
…e in table source
…e for timestamp literals
…stamp type as 6 which defined in SQL standard
…estamp type conforms to the definition of timestamp literal which defined in SQL standard
…er high precision timestamp
Rebased to master.
Thanks @docete. Haven't finished yet...
return relBuilder.getRexBuilder().makeTimestampLiteral(TimestampString.fromCalendarFields(
valueAsCalendar(extractValue(valueLiteral, java.sql.Timestamp.class))), 3);
TimestampType timestampType = (TimestampType) type;
Class<?> clazz = valueLiteral.getOutputDataType().getConversionClass();
- First, this logic should be in extractValue.
- Second, we should just use valueLiteral.getValueAs.
We can just use valueLiteral.getValueAs(LocalDateTime.class).
} else {
throw new TableException(String.format("Invalid literal of %s.", clazz.getCanonicalName()));
}
return relBuilder.getRexBuilder().makeTimestampLiteral(
We could have a util in SqlDateTimeUtils or another class: toTimestampString(LocalDateTime).
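A rough sketch of what such a helper could look like; the class and method names follow the suggestion above, and the exact TimestampString calls are my assumption rather than code from this PR:

```java
import java.time.LocalDateTime;
import org.apache.calcite.util.TimestampString;

public final class TimestampStringUtil {
    /** Builds a Calcite TimestampString from a LocalDateTime, keeping nanosecond precision. */
    public static TimestampString toTimestampString(LocalDateTime ldt) {
        return new TimestampString(
                ldt.getYear(), ldt.getMonthValue(), ldt.getDayOfMonth(),
                ldt.getHour(), ldt.getMinute(), ldt.getSecond())
            .withNanos(ldt.getNano());
    }
}
```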
val value = if (reducedValue != null) {
val ts = reducedValue.asInstanceOf[SqlTimestamp]
val milliseconds = ts.getMillisecond
val nanoseconds = ts.toLocalDateTime.getNano
Why not just convert the LocalDateTime to a TimestampString?
A little tricky here. Take CAST('1500-04-30 00:00:00' AS TIMESTAMP(3)) as an example:
we use SqlDateTimeUtils.toTimestamp [1] to cast the string to SqlTimestamp and use the ExpressionReducer [2] to make a timestamp literal.
Step [1] considers Gregorian cutovers, and step [2] uses TimestampString.fromMillisSinceEpoch, which calls our copied DateTimeUtils.julianToString and considers Gregorian cutovers too. Two wrongs make a correct result.
I will change toTimestamp and this ExpressionReducer to ignore Gregorian cutovers in the next PR, and that will be the final one.
I think you can fix it, and it is very worthwhile to fix. Otherwise it will bring a lot of strange code.
generateNonNullLiteral(literalType, millis + "L", millis)
val fieldTerm = newName("timestamp")
val timestampString = literalValue.asInstanceOf[TimestampString].toString
val millis = getMillisSinceEpoch(timestampString)
You should just convert the TimestampString to a LocalDateTime.
How can a TimestampString get a LocalDateTime?
In getMillisSinceEpoch, you already have the year, month, etc. Why not construct a LocalDateTime?
getMillisSinceEpoch is a util function of TimestampString; we copied it because our copied DateTimeUtils causes Gregorian cutover issues. We should use TimestampString.getMillisSinceEpoch directly once we remove our copied DateTimeUtils. Do you mean we should not depend on TimestampString and parse the timestamp string ourselves?
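For reference, a standalone sketch of the idea of going straight to LocalDateTime (proleptic Gregorian, so no cutover shift); the formatter pattern and the optional-fraction handling are my assumptions, not code from this PR:

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.temporal.ChronoField;

public class ParseTimestampStringDemo {
    // accepts 'uuuu-MM-dd HH:mm:ss' with an optional fraction of up to 9 digits
    private static final DateTimeFormatter FMT = new DateTimeFormatterBuilder()
            .appendPattern("uuuu-MM-dd HH:mm:ss")
            .appendFraction(ChronoField.NANO_OF_SECOND, 0, 9, true)
            .toFormatter();

    public static void main(String[] args) {
        LocalDateTime ldt = LocalDateTime.parse("1500-04-30 00:00:00.123456789", FMT);
        System.out.println(ldt); // 1500-04-30T00:00:00.123456789, no 10-day cutover shift
    }
}
```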
val resultTerm = ctx.addReusableLocalVariable(resultTypeTerm, "result")
val evalResult = s"$functionReference.eval(${parameters.map(_.resultTerm).mkString(", ")})"
val evalResult =
if (returnType.getTypeRoot == LogicalTypeRoot.TIMESTAMP_WITHOUT_TIME_ZONE) {
"Types.TIMESTAMP can be represented as long" is a very old document; now TIMESTAMP_WITHOUT_TIME_ZONE cannot be represented as long. You can take a look at TimestampType.INPUT_OUTPUT_CONVERSION.
I have no idea whether we should disallow a UDX with a long parameter from accepting a Timestamp. What do you think @KurtYoung?
parameterClasses.zipWithIndex.zip(operands).map { case ((paramClass, i), operandExpr) =>
var newOperandExpr = operandExpr
if (operandExpr.resultType.getTypeRoot == LogicalTypeRoot.TIMESTAMP_WITHOUT_TIME_ZONE
ditto
case TIMESTAMP_WITHOUT_TIME_ZONE =>
generateOperatorIfNotNull(ctx, new TimestampType(), left, right) {
(l, r) => s"($l * ${MILLIS_PER_DAY}L) $op $r"
generateOperatorIfNotNull(ctx, new TimestampType(3), left, right) {
Agree with @KurtYoung. It's not a clean way. BTW, you can split this PR in other ways, such as format-related changes and so on.
val paraInternalType = fromDataTypeToLogicalType(parameterType)
if (isAny(internal) && isAny(paraInternalType)) {
getDefaultExternalClassForType(internal) == getDefaultExternalClassForType(paraInternalType)
} else if ((isTimestamp(internal) && isLong(paraInternalType))
ditto, the logical type already defines the conversions.
assert cursor > 0 : "invalid cursor " + cursor;

// zero-out the bytes
SegmentsUtil.setLong(segments, offset + cursor, 0L);
Move this setLong into if (value == null). This is not decimal; timestamp is fixed-length. The other formats too.
Hi @docete, I think we should try our best to avoid merging the code in and fixing it in the next PR.
Thanks @docete. I left some comments but didn't go through the PR in detail.
} else if (typeInfo instanceof BigDecimalTypeInfo) {
BigDecimalTypeInfo decimalType = (BigDecimalTypeInfo) typeInfo;
return new DecimalType(decimalType.precision(), decimalType.scale());
} else if (typeInfo instanceof LegacyLocalDateTimeTypeInfo) {
The newly introduced LegacyLocalDateTimeTypeInfo is weird. In theory, we don't need such a thing unless somewhere converts DataType to TypeInformation and back again. I think we should find out the root cause of why we need this, create a JIRA to fix that root cause, and add comments on these classes with the JIRA id.
Otherwise, we don't know how to remove this temporary code in the future.
}

/**
 * Obtains an instance of {@code SqlTimestamp} from a millisecond.
Minor: I would suggest using {@link SqlTimestamp}.
case SqlTypeName.VARCHAR | SqlTypeName.CHAR | SqlTypeName.VARBINARY | SqlTypeName.BINARY =>
Int.MaxValue

// The maximal precision of TIMESTAMP is 3, change it to 9 to support nanoseconds precision
What does "the maximal precision of TIMESTAMP is 3" mean? Is that a typo?
case TIMESTAMP_WITHOUT_TIME_ZONE => DataTypes.BIGINT
case TIMESTAMP_WITHOUT_TIME_ZONE =>
val dt = argTypes(0).asInstanceOf[TimestampType]
DataTypes.TIMESTAMP(dt.getPrecision).bridgedTo(classOf[SqlTimestamp])
This is a little performance sensitive, because it relates to de/serialization.
If the precision is less than 3, we can use BIGINT for better performance.
Could you improve this a bit? Or create an issue and a TODO for it.
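One possible shape for that improvement, sketched under assumptions: the helper name is hypothetical, and using SqlTimestamp.isCompact (precision up to 3) as the branch condition is my reading of the comment above, not code from this PR:

```java
import org.apache.flink.table.api.DataTypes;
import org.apache.flink.table.dataformat.SqlTimestamp;
import org.apache.flink.table.types.DataType;
import org.apache.flink.table.types.logical.TimestampType;

public class TimestampArgTypeSketch {
    // hypothetical helper: keep the cheap long-based bridge for low precision,
    // and only pay for the SqlTimestamp bridge when the precision requires it
    static DataType externalTypeFor(TimestampType t) {
        return SqlTimestamp.isCompact(t.getPrecision())
            ? DataTypes.BIGINT()
            : DataTypes.TIMESTAMP(t.getPrecision()).bridgedTo(SqlTimestamp.class);
    }
}
```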
testSqlApi(
  "TIMESTAMP '1500-04-30 12:00:00.123456789'",
  "1500-04-30 12:00:00.123456789")
indent.
}

def generateRowtimeAccess(
def generateTimestampAccess(
Why change the method name? I think the original method name describes the logic more accurately. At first glance at the new method name, I thought it was used to access a timestamp field from a row or record.
Closing this since it's split into two PRs.
What is the purpose of the change
FLINK-14080 introduced an internal representation (SqlTimestamp) of TimestampType with precision. This subtask replaces the current long with SqlTimestamp and lets the blink planner support the precision of TimestampType.
Note:
Brief change log
Verifying this change
This change is already covered by existing tests and new tests.
Does this pull request potentially affect one of the following parts:
@Public(Evolving): (yes / no)
Documentation