-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[FLINK-37923][sql] Introduce VARIANT type and PARSE_JSON to Flink SQL #26655
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Awesome work @Sxnan! I left a couple of comments that should improve consistency when integrating this type into the existing type system.
flink-core/src/main/java/org/apache/flink/types/variant/BinaryVariant.java
Outdated
Show resolved
Hide resolved
flink-core/src/main/java/org/apache/flink/types/variant/BinaryVariant.java
Outdated
Show resolved
Hide resolved
flink-core/src/main/java/org/apache/flink/types/variant/BinaryVariant.java
Outdated
Show resolved
Hide resolved
flink-core/src/main/java/org/apache/flink/types/variant/BinaryVariantUtil.java
Outdated
Show resolved
Hide resolved
public static final DateTimeFormatter TIMESTAMP_FORMATTER = | ||
new DateTimeFormatterBuilder() | ||
.append(DateTimeFormatter.ISO_LOCAL_DATE) | ||
.appendLiteral(' ') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to format timestamps according to JSON we should actually use T
in-between and Z
at the end for TIMESTAMP_LTZ.
https://stackoverflow.com/questions/10286204/what-is-the-right-json-date-format
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See also https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/formats/json/#json-timestamp-format-standard.
Or is the the intention to use SQL format, which is also fine for me. I just wanted to raise awareness here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point! I think we can follow the specification of the Variant type here, which uses T
in-between for TIMESTAMP but no Z
at the end for TIMESTAMP_LTZ.
...-planner/src/main/scala/org/apache/flink/table/planner/codegen/calls/FunctionGenerator.scala
Outdated
Show resolved
Hide resolved
...anner/src/test/java/org/apache/flink/table/planner/runtime/stream/sql/VariantJavaITCase.java
Outdated
Show resolved
Hide resolved
...anner/src/test/scala/org/apache/flink/table/planner/runtime/stream/sql/AggregateITCase.scala
Outdated
Show resolved
Hide resolved
...er/src/test/scala/org/apache/flink/table/planner/runtime/stream/sql/VariantScalaITCase.scala
Outdated
Show resolved
Hide resolved
...flink-table-runtime/src/main/java/org/apache/flink/table/runtime/functions/VariantUtils.java
Outdated
Show resolved
Hide resolved
a5df846
to
316415e
Compare
@twalthr Thanks for the detailed review! I updated the PR accordingly, and all the comments should be addressed. Please take another look. |
Thanks @Sxnan. I added it for my list for tomorrow. I'm sure it can still make it before the feature freeze. |
What is the purpose of the change
This pull request introduces the Variant data structure to represent semi-structured data, a new SQL type variant, and a builtin method to parse json string to the variant type.
Brief change log
Verifying this change
This change added tests and can be verified.
Does this pull request potentially affect one of the following parts:
@Public(Evolving)
: yesDocumentation