New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[FLINK-12253][table-common] Setup a class hierarchy for the new logical type system #8335
Conversation
Thanks a lot for your contribution to the Apache Flink project. I'm the @flinkbot. I help the community Review Progress
Please see the Pull Request Review Guide for a full explanation of the review process. The Bot is tracking the review progress through labels. Labels are applied according to the order of the review items. For consensus, approval by a Flink committer of PMC member is required Bot commandsThe @flinkbot bot supports the following commands:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
private static void testAll( | ||
LogicalType nullableType, | ||
String typeString, | ||
Class[] input, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
input
-> supportedInputClasses
output
-> supportedOutputClasses
or a Javadoc?
ANY( | ||
LogicalTypeFamily.EXTENSION); | ||
|
||
private final Set<LogicalTypeFamily> families; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to use EnumSet
here?
|
||
private static final int DEFAULT_LENGTH = 1; | ||
|
||
private static final String DEFAULT_FORMAT = "CHAR(%d)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: DEFAULT_FORMAT
-> FORMAT
?
* | ||
* @return a deep copy | ||
*/ | ||
public LogicalType copy() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make this method final
with a comment to implement LogicalType copy(boolean isNullable)
instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for d2fd35f
|
||
private static final int DEFAULT_LENGTH = 1; | ||
|
||
private static final String DEFAULT_FORMAT = "VARCHAR(%d)"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Is it a DEFAULT_format
or just a FORMAT
can we change the FORMAT
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It depends on the type. E.g. ROW
and ARRAY
could be represented in different ways. But I will correct it to FORMAT
while merging.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment for 2bfc47d
testAll( | ||
new BooleanType(), | ||
"BOOLEAN", | ||
new Class[]{Boolean.class}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't it be new Class[]{Boolean.class, boolean.class}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for 063398d
@flinkbot approve-until quality |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for c0652a4
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for db0e8cb
if (scale < MIN_SCALE || scale > precision) { | ||
throw new ValidationException( | ||
String.format( | ||
"Decimal scale must be between %d and %d (both inclusive).", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: "Decimal scale must be between %d and %d (both inclusive)."
-> "Decimal scale must be between %d and %d(precision) (both inclusive)."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment for 52229e1
testAll( | ||
new TinyIntType(), | ||
"TINYINT", | ||
new Class[]{Byte.class}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new Class[]{Byte.class, byte.class}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for a6fe3ff
testAll( | ||
new SmallIntType(), | ||
"SMALLINT", | ||
new Class[]{Short.class}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
new Class[]{Short.class, short.class}
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for 908aa48
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@twalthr Good job! It really looks concise. I also really enjoyed reading all the javadocs!
I put some minor comments, but generally +1 on all the changes.
import java.util.Set; | ||
|
||
/** | ||
* Logical type of a time WITHOUT timezone consisting of {@code hour:minute:second[.fractional]} with up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can Time
have a timezone in general? Doesn't timezone imply there has to be date component?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The SQL standard also defines a TIME WITH TIME ZONE
. I will explicitly mention that we won't support that. Java does also not provide a time class for such a type.
* Internal timestamp kind for time attribute metadata. | ||
*/ | ||
@Internal | ||
public enum TimestampKind { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we extract this and use it for all Timestamp
types?
* 9 (both inclusive). If no precision is specified, {@code p} is equal to 6. | ||
* | ||
* <p>Compared to {@link ZonedTimestampType}, the time zone offset information is not stored physically | ||
* in ever datum. Instead, the type assumes {@link java.time.Instant} semantics in UTC time zone at |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ever
-> every
* interval of months of 50 leads to {@code +04-02}). | ||
* | ||
* <p>The serialized string representation is {@code INTERVAL YEAR(p)}, {@code INTERVAL YEAR(p) TO MONTH}, | ||
* or {@code INTERVAL YEAR(p)} where {@code p} is the number of digits of years (=year precision). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
INTERVAL YEAR(p)
-> INTERVAL MONTH
* the following resolutions: interval of years, interval of years to months, or interval of months. | ||
* | ||
* <p>An interval of year-month consists of {@code +years-months} with values ranging from {@code -9999-11} | ||
* to {@code +9999-11}. The value is the same for all resolutions of this group (for example, an |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't get this example...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about?
An interval of year-month consists of {@code +years-months} with values ranging from {@code -9999-11}
* to {@code +9999-11}. A value representation is looks the same for all resolutions. For example, an
* interval of months of 50 is always represented in an interval of year to month format: {@code +04-02}.
@@ -285,6 +286,27 @@ public void testDayTimeIntervalType() { | |||
); | |||
} | |||
|
|||
@Test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about we add a check or two for different supported array conversions?
E.g. to verify that new ArrayType(new ArrayType(new TimestampType()))
does not support java.sql.Timestamp[].class
.
|
||
/** | ||
* Logical type of a user-defined distinct type. A distinct type specifies an identifier and is backed | ||
* by a source type. A distinct types has the same internal representation as a source type, but is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
types
-> type
* Logical type of an arbitrary serialized type. This type is a black box within the table ecosystem | ||
* and is only deserialized at the edges. The any type is an extension to the SQL standard. | ||
* | ||
* <p>The serialized string representation is {@code ANY(c, s)} where {@code s} is the originating |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
{@code s}
-> {@code c}
} catch (Exception e) { | ||
throw new TableException(String.format( | ||
"Unable to generate a string representation of the serializer snapshot of '%s' " + | ||
"describing the class '%s' for the any type.", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: for the any type
-> for the ANY type
testAll( | ||
new AnyType<>(Human.class, new KryoSerializer<>(Human.class, new ExecutionConfig())), | ||
"ANY(org.apache.flink.table.types.LogicalTypesTest$Human, " + | ||
"ADNvcmcuYXBhY2hlLmZsaW5rLnRhYmxlLnR5cGVzLkxvZ2ljYWxUeXBlc1Rlc3QkSHVtYW4AAATyxpo9cAA" + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could we not hardcode the snapshot? Changes to the KryoSerializer will break this test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forget this comment. It should work. We should make sure anyway that we support old versions of internal serializer snapshots.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think that hardcoding is ok in this case to indicate changes in the serialization format.
Thanks for the review @dawidwys. I merged all commits separately. I will close this PR now. |
What is the purpose of the change
This PR implements all logical types that Flink's type system should support in the near future. As the name indicates, these type classes are purely logical to allow declarations and intentions across modules and in the API. Planners/runtime can still decide which type and precision is supported physically.
An exact listing can be found in FLIP-37: https://docs.google.com/document/d/1a9HUb6OaBIoj9IRfbILcMFPrOL7ALeZ3rVI66dvA2_U/edit#
The most important content of this PR are the JavaDoc comments that clearly define each type and should avoid ambiguity. Catalogs/connectors/planners should adapt to those definitions.
Brief change log
Verifying this change
org.apache.flink.table.types.LogicalTypesTest
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: noDocumentation