[FLINK-24684][table-planner] Add to string cast rules using the new CastRule stack #17658
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot.
Automated checks: last check on commit aabf462 (Wed Nov 03 10:43:20 UTC 2021).
Thank you @slinkydeveloper, great stuff!
I left some comments. Could you also add the integration tests missing in `CastFunctionITCase` or `CastFunctionMiscITCase` for the structured type -> string casts?
@matriv regarding testing the rules in
Yes, of course, my mistake. I thought those casts already worked end to end.
@slinkydeveloper Thanks a lot for adding those generated code examples. I think they are really useful, both for debugging and for changing something in the future.
Thanks @slinkydeveloper. This is a great PR. Very readable code with good utilities that we can generalize to common code gen utils in Java. But this is future work. I added some feedback.
private static LogicalType sanitizeTargetType(
        ArrayType inputArrayType, ArrayType targetArrayType) {
    LogicalType innerTargetType = targetArrayType.getElementType();
    // TODO this seems rather a bug of the planner that generates/allows/doesn't sanitize
See `LogicalTypeCasts`:
private static boolean supportsCasting(
LogicalType sourceType, LogicalType targetType, boolean allowExplicit) {
// a NOT NULL type cannot store a NULL type
// but it might be useful to cast explicitly with knowledge about the data
if (sourceType.isNullable() && !targetType.isNullable() && !allowExplicit) {
return false;
}
// ignore nullability during compare
if (sourceType.copy(true).equals(targetType.copy(true))) {
return true;
}
For explicit casts, this is totally fine if the user knows the data.
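To make the nullability rule quoted above concrete, here is a minimal, self-contained sketch. `SimpleType` is a hypothetical stand-in for `LogicalType` (only a name and a nullability flag), and the final `return false` is an assumption, since the quoted excerpt stops before the remaining rules.

```java
import java.util.Objects;

class NullabilityCastSketch {

    /** Hypothetical stand-in for a logical type: a type name plus a nullability flag. */
    static final class SimpleType {
        final String name;
        final boolean nullable;

        SimpleType(String name, boolean nullable) {
            this.name = name;
            this.nullable = nullable;
        }

        /** Returns the same type with the given nullability. */
        SimpleType copy(boolean newNullable) {
            return new SimpleType(name, newNullable);
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SimpleType)) {
                return false;
            }
            SimpleType other = (SimpleType) o;
            return name.equals(other.name) && nullable == other.nullable;
        }

        @Override
        public int hashCode() {
            return Objects.hash(name, nullable);
        }
    }

    static boolean supportsCasting(SimpleType source, SimpleType target, boolean allowExplicit) {
        // A NOT NULL target cannot store NULLs, unless the user casts explicitly.
        if (source.nullable && !target.nullable && !allowExplicit) {
            return false;
        }
        // Ignore nullability when comparing the two types.
        if (source.copy(true).equals(target.copy(true))) {
            return true;
        }
        // Assumption: the excerpt ends here, so this sketch rejects every other pair.
        return false;
    }

    public static void main(String[] args) {
        SimpleType nullableInt = new SimpleType("INT", true);
        SimpleType notNullInt = new SimpleType("INT", false);
        System.out.println(supportsCasting(nullableInt, notNullInt, false)); // prints false
        System.out.println(supportsCasting(nullableInt, notNullInt, true));  // prints true
    }
}
```

The second call shows the case under discussion: a nullable source cast explicitly to a NOT NULL target is accepted, which is why the array rule has to handle such target types.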
Gotcha, I've modified the comment to explain why we need that logic.
/* Example generated code for ARRAY<INT>:

isNull$0 = _myInputIsNull;
Use `<pre>`?
Why? This is not a javadoc
// Write the comma
if (fieldIndex != 0) {
    writer.stmt(methodCall(builderTerm, "append", strLiteral(",")));
use a space after the comma
As for #17658 (comment), choosing which representation to use is out of scope of this PR. We have an issue open for that already: https://issues.apache.org/jira/browse/FLINK-17321
A row is not a collection type. But let me create a bunch of issues to not forget about it.
I opened FLINK-24802.
        "delete",
        0,
        methodCall(builderTerm, "length")))
.stmt(methodCall(builderTerm, "append", strLiteral("(")));
Maybe document why we chose this representation. I was wondering whether we should synchronize this with `Row.toString`, but maybe `( )` is OK to distinguish it from arrays and from rows printed as `+I[]`, where we also have a change flag.
Same as above
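For readers following the row representation discussed in this thread, below is a rough sketch of what the generated code does at runtime, as far as the excerpts show: reset the reused builder, open with `(`, and separate fields with a comma and no space. The class and method names are hypothetical, the closing `)` is an assumption, and null handling is deliberately left out.

```java
class RowToStringSketch {

    /** Renders the fields as "(f0,f1,...)" into a reused builder. */
    static String rowToString(StringBuilder builder, Object[] fields) {
        // Reset the reused builder, mirroring builder.delete(0, builder.length()).
        builder.delete(0, builder.length());
        builder.append("(");
        for (int fieldIndex = 0; fieldIndex < fields.length; fieldIndex++) {
            // Write the comma (no trailing space, as in the excerpt).
            if (fieldIndex != 0) {
                builder.append(",");
            }
            builder.append(fields[fieldIndex]);
        }
        // Assumption: the row is closed with a matching parenthesis.
        builder.append(")");
        return builder.toString();
    }

    public static void main(String[] args) {
        StringBuilder builder = new StringBuilder("stale content");
        System.out.println(rowToString(builder, new Object[] {1, "a", true})); // prints (1,a,true)
    }
}
```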
(inputLogicalType.is(LogicalTypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE))
        ? context.getSessionTimeZoneTerm()
        : className(DateTimeUtils.class) + ".UTC_ZONE";
Maybe use the old method for performance? I don't see a reason to declare and access a time zone for a time zone less type.
The old method invoked the same method, defaulting the zone to `UTC_ZONE`. Note that we don't declare any variable for `UTC_ZONE`; we just access it statically. The variable is only declared for `timestamp_ltz`.
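The point about not declaring a variable can be illustrated with a small sketch of the term selection. Everything here is a simplified stand-in: `TypeRoot` mimics the two relevant `LogicalTypeRoot` values, and the strings are the code terms that would be spliced into generated code, not real objects.

```java
class ZoneTermSketch {

    /** Hypothetical reduction of LogicalTypeRoot to the two cases discussed. */
    enum TypeRoot {
        TIMESTAMP_WITHOUT_TIME_ZONE,
        TIMESTAMP_WITH_LOCAL_TIME_ZONE
    }

    /**
     * Returns the code term used as the time zone argument in generated code.
     * Only TIMESTAMP_LTZ needs the session time zone member; any other type
     * refers to a static constant, so no extra member has to be declared.
     */
    static String zoneTerm(TypeRoot root, String sessionTimeZoneTerm) {
        return root == TypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE
                ? sessionTimeZoneTerm
                : "DateTimeUtils.UTC_ZONE"; // static access, nothing to declare
    }

    public static void main(String[] args) {
        // "timeZone$7" is a made-up member name for illustration.
        System.out.println(zoneTerm(TypeRoot.TIMESTAMP_WITH_LOCAL_TIME_ZONE, "timeZone$7"));
        System.out.println(zoneTerm(TypeRoot.TIMESTAMP_WITHOUT_TIME_ZONE, "timeZone$7"));
    }
}
```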
RAW(LocalDateTime.class, new LocalDateTimeSerializer()),
RawValueData.fromObject(
        LocalDateTime.parse("2020-11-11T18:08:01.123")),
StringData.fromString("2020-11-11T18:08:01.123")),
Please also add tests for the `NULL` type, `MULTISET`, and structured types. Structured types have the peculiarity that they can be backed either by a POJO (in which case we call `toString` on them) or by `Row`, in which case I would use the `RowData` representation.
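The RAW test case above is consistent with a cast that simply relies on the wrapped object's `toString`, which is also what is suggested here for POJO-backed structured types. A tiny sketch of that expectation, assuming plain delegation to `toString` (the `rawToString` helper is hypothetical, not part of the PR):

```java
import java.time.LocalDateTime;

class RawToStringSketch {

    /** Hypothetical helper: string representation of an arbitrary wrapped object. */
    static String rawToString(Object value) {
        return String.valueOf(value);
    }

    public static void main(String[] args) {
        LocalDateTime value = LocalDateTime.parse("2020-11-11T18:08:01.123");
        // LocalDateTime.toString() emits ISO-8601 with millisecond precision here.
        System.out.println(rawToString(value)); // prints 2020-11-11T18:08:01.123
    }
}
```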
Thanks for the update @slinkydeveloper. I think the PR should be good in the next iteration. I thought about the changes in this PR again, and I think we should use `null` as the string representation instead of `NULL` for now. The change from `null` to `NULL` should be hidden behind a feature flag that we can enable by default once all casting rules have been reworked. We should introduce a boolean option `use-old-casting`, which is set to false by default in 1.15.
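One way to read this proposal, sketched under assumptions: a single boolean decides which literal the to-string rules emit for null values. The class and method names are hypothetical, and the mapping (old casting yields `null`, new casting yields `NULL`) is an interpretation of the comment, not the merged implementation.

```java
class NullLiteralSketch {

    /**
     * Picks the string used to represent a null value.
     *
     * @param useOldCasting stands in for the proposed use-old-casting option
     */
    static String nullLiteral(boolean useOldCasting) {
        // Assumption: the legacy behavior prints lowercase, the reworked rules print uppercase.
        return useOldCasting ? "null" : "NULL";
    }

    public static void main(String[] args) {
        System.out.println(nullLiteral(true));  // prints null
        System.out.println(nullLiteral(false)); // prints NULL
    }
}
```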
LGTM waiting for a green build.
…generated the casting Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…astRule stack Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…o reset the builder Signed-off-by: slinkydeveloper <francescoguard@gmail.com>
…astRule stack Signed-off-by: slinkydeveloper <francescoguard@gmail.com> This closes apache#17658.
What is the purpose of the change

This PR ports all the "to string" casting logic to the new `CastRule` stack, as discussed in https://issues.apache.org/jira/browse/FLINK-24684.

Brief change log

Major PR changes
- … `CastRule` stack
- … `ScalarOperatorGens` for to string casting
- `CastRuleUtils` provides a `CodeWriter` to simplify and organize code generation using Java

Included minor changes
- … `OperatorCodeGen` to debug codegen issues
- `ExpressionTestBase` provides more verbose assertion error logging

Verifying this change

Each new rule is tested with new test cases in `CastRulesTest`, and the rules are then covered by the integration tests in `CastFunctionsITCase`.
In some tests I had to change `null` to `NULL`, because null values are now represented as an uppercase string.

Does this pull request potentially affect one of the following parts:
- … `@Public(Evolving)`: no

Documentation
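For context on the `CodeWriter` mentioned in the change log, the helper names visible in the review excerpts (`stmt`, `methodCall`, `strLiteral`) suggest a small fluent builder for Java source text. The sketch below is a guess at that shape, not the actual `CastRuleUtils` code: all signatures and the escaping logic are assumptions.

```java
import java.util.StringJoiner;

class CodeWriterSketch {

    /** Renders a Java string literal, escaping backslashes and double quotes. */
    static String strLiteral(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders "instance.method(arg0, arg1, ...)" from already generated terms. */
    static String methodCall(String instanceTerm, String methodName, Object... args) {
        StringJoiner joiner = new StringJoiner(", ", instanceTerm + "." + methodName + "(", ")");
        for (Object arg : args) {
            joiner.add(String.valueOf(arg));
        }
        return joiner.toString();
    }

    /** Minimal fluent writer that collects one statement per line. */
    static final class CodeWriter {
        private final StringBuilder code = new StringBuilder();

        CodeWriter stmt(String statement) {
            code.append(statement).append(";\n");
            return this;
        }

        @Override
        public String toString() {
            return code.toString();
        }
    }

    public static void main(String[] args) {
        String builderTerm = "builder$1"; // made-up term name for illustration
        String generated =
                new CodeWriter()
                        .stmt(methodCall(builderTerm, "delete", 0, methodCall(builderTerm, "length")))
                        .stmt(methodCall(builderTerm, "append", strLiteral("(")))
                        .toString();
        System.out.print(generated);
    }
}
```

Composing statements from small string helpers like these keeps the rule code close to the shape of the generated code, which is likely why the reviewers found the example comments and the utilities easy to follow.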