Skip to content

[FLINK-37923][sql] Introduce VARIANT type and PARSE_JSON to Flink SQL #26655

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

Sxnan
Copy link
Contributor

@Sxnan Sxnan commented Jun 9, 2025

What is the purpose of the change

This pull request introduces the Variant data structure to represent semi-structured data, a new SQL type variant, and a builtin method to parse json string to the variant type.

Brief change log

  • Patch calcite to support variant type
  • Introduce Variant and BinaryVariant
  • Introduce variant type and PARSE_JSON to Flink SQL

Verifying this change

This change added tests and can be verified.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): yes
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? yes
  • If yes, how is the feature documented? JavaDocs

@flinkbot
Copy link
Collaborator

flinkbot commented Jun 9, 2025

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@Sxnan Sxnan changed the title [FLINK-37923][sql] Patch calcite to support variant type [FLINK-37923][sql] Introduce VARIANT type and PARSE_JSON to Flink SQL Jun 10, 2025
@Sxnan Sxnan marked this pull request as ready for review June 10, 2025 03:18
Copy link
Contributor

@twalthr twalthr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome work @Sxnan! I left a couple of comments that should improve consistency when integrating this type into the existing type system.

public static final DateTimeFormatter TIMESTAMP_FORMATTER =
new DateTimeFormatterBuilder()
.append(DateTimeFormatter.ISO_LOCAL_DATE)
.appendLiteral(' ')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to format timestamps according to JSON we should actually use T in-between and Z at the end for TIMESTAMP_LTZ.
https://stackoverflow.com/questions/10286204/what-is-the-right-json-date-format

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See also https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/formats/json/#json-timestamp-format-standard.
Or is the the intention to use SQL format, which is also fine for me. I just wanted to raise awareness here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point! I think we can follow the specification of the Variant type here, which uses T in-between for TIMESTAMP but no Z at the end for TIMESTAMP_LTZ.

@Sxnan Sxnan force-pushed the intro-variant branch 4 times, most recently from a5df846 to 316415e Compare June 12, 2025 14:39
@Sxnan
Copy link
Contributor Author

Sxnan commented Jun 12, 2025

@twalthr Thanks for the detailed review! I updated the PR accordingly, and all the comments should be addressed. Please take another look.

@Sxnan Sxnan requested a review from twalthr June 12, 2025 15:04
@twalthr
Copy link
Contributor

twalthr commented Jun 18, 2025

Thanks @Sxnan. I added it for my list for tomorrow. I'm sure it can still make it before the feature freeze.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants