New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-25187][table-planner] Apply padding when CASTing to BINARY(<length>) #18162
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Automated ChecksLast check on commit 68c1818 (Tue Dec 21 14:53:50 UTC 2021) Warnings:
Mention the bot in a comment to re-run the automated checks. Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @matriv. I only had minor comments to improve readability.
|| !(couldTrim(length) || couldPad(targetLogicalType, length))) { | ||
|| ((!couldTrim(targetLength) | ||
// Assume input length is respected by the source | ||
|| (inputLength != null && inputLength <= targetLength)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this condition is already quite complex. maybe introduce another method that takes input from above and target length?
.fromCaseLegacy(CHAR(2), fromString("Apache"), fromString("Apache")) | ||
.fromCase(VARCHAR(2), fromString("Apache"), fromString("Apache")) | ||
.fromCaseLegacy(VARCHAR(2), fromString("Apache"), fromString("Apache")) | ||
// |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
empty comments are not very useful. fill them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I used it to mark the end of that section, since there are more tests for this target type.
if (context.legacyBehaviour() | ||
|| ((!couldTrim(targetLength) | ||
// Assume input length is respected by the source | ||
|| (inputLength <= targetLength)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
simplify the condition maybe with local variables? or separate branches for early out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (condition) {
return inputTerm;
}
if (condition) {
return inputTerm;
}
remaining code without nesting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I cannot do that apart from the check for the legacyBehaviour
, the other condition cannot be splitted, it has to be evaluated as a whole.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I could invert it maybe to avoid the wrapping not
but imho it's more readable to exit early or then apply the logic for trimming/padding.
@@ -656,13 +656,6 @@ | |||
.fromCaseLegacy(CHAR(6), fromString("Apache"), fromString("Apache")) | |||
.fromCase(VARCHAR(5), fromString("Flink"), fromString("Flink ")) | |||
.fromCaseLegacy(VARCHAR(5), fromString("Flink"), fromString("Flink")) | |||
// We assume that the input length is respected, therefore, no trimming is | |||
// applied |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update the hotfix commit if those tests can be removed? They are invalid anyways so +1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep I will do.
…d for CAST When casting to CHAR/VARCHAR/BINARY/VARBINARY, we assume that the length of the source type CHAR/VARCHAR/BINARY/VARBINARY is respected, to avoid performance overhead by applying checks and trimming at runtime. i.e. if input type is `VARCHAR(3)`, input value is 'foobar' and target type is `VARCHAR(4)`, no trimming is applied and the result value remains: `foobar`.
…ngth>) Similarly to `CHAR(<length>)` when casting to a `BINARY(<length>)` apply padding with 0 bytes to the right so that the resulting `byte[]` matches exaxctly the specified length.
@flinkbot run azure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
…ngth>) Similarly to `CHAR(<length>)` when casting to a `BINARY(<length>)` apply padding with 0 bytes to the right so that the resulting `byte[]` matches exaxctly the specified length. This closes apache#18162.
What is the purpose of the change
Similarly to
CHAR(<length>)
when casting to aBINARY(<length>)
apply padding with 0 bytes to the right so that the resulting
byte[]
matches exaxctly the specified length.
Verifying this change
This change added tests and can be verified as follows:
CastRulesTest
CastFunctionITCase
CastFunctionMiscITCase
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation