Skip to content

[FLINK-39801][table] Skip serializing empty partition/order keys for PTF table args#28283

Merged
snuyanzin merged 3 commits into
apache:masterfrom
raminqaf:FLINK-39801
Jun 1, 2026
Merged

[FLINK-39801][table] Skip serializing empty partition/order keys for PTF table args#28283
snuyanzin merged 3 commits into
apache:masterfrom
raminqaf:FLINK-39801

Conversation

@raminqaf
Copy link
Copy Markdown
Contributor

@raminqaf raminqaf commented May 30, 2026

What is the purpose of the change

TABLE_ARG_CALL nodes always wrote partitionKeys, orderKeys, and orderDirections into the compiled plan JSON, even when empty. Table arguments without PARTITION BY and ORDER BYare common, so these empty arrays bloat the plan for no benefit. This change omits the arrays when empty, matching how serializeCall already skips empty operands.

Brief change log

  • RexNodeJsonSerializer skips partitionKeys / orderKeys / orderDirections when they are empty.
  • RexNodeJsonDeserializer treats the three fields as optional and defaults them to empty arrays when absent, so plans written by older versions still load.
  • Regenerated the affected PTF / TO_CHANGELOG / FROM_CHANGELOG restore plan files to drop the now-omitted arrays.

Verifying this change

This change is already covered by existing tests, such as the stream-exec-process-table-function, TO_CHANGELOG, and FROM_CHANGELOG restore tests. They load the regenerated plans through the deserializer and pass. The deserializer change keeps older plans that still contain the empty arrays loadable, preserving backward compatibility.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: yes. The compiled-plan JSON format changes, but it is backward compatible: the deserializer reads plans both with and without the arrays.
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

Was generative AI tooling used to co-author this PR?
  • Yes (please specify the tool below)

@flinkbot
Copy link
Copy Markdown
Collaborator

flinkbot commented May 30, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

}
} else {
order = new SortOrder[0];
private static <T> List<T> deserializeArrayOrEmpty(
Copy link
Copy Markdown
Contributor

@snuyanzin snuyanzin May 30, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is it called deserializeArrayOrEmpty however returns a list?
can we make it returning array in order to encapsulate

.mapToInt(Integer::intValue)
                        .toArray();

and probably just call it deserializeArray

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed to deserializeListOrEmpty and used it for objects like SortOrder. Introduced a new method called deserializeToIntArray that uses deserializeListOrEmpty.

Copy link
Copy Markdown
Contributor

@snuyanzin snuyanzin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thanks for addressing feedback

let's wait for CI

@snuyanzin snuyanzin merged commit cc348d4 into apache:master Jun 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants