[FLINK-39537][table] Apply conditional SET_SEMANTIC_TABLE trait to FROM_CHANGELOG#28025
raminqaf wants to merge 6 commits into apache:master
2232a48 to
cb18d7a
Compare
gustavodemorais
left a comment
Thanks for the clean PR, @raminqaf. I've added some suggestions so that the code is aligned and the wording of the documentation is right.
gustavodemorais
left a comment
Thanks for the quick updates, @raminqaf! Code is looking good. Could you align this with #28120, where I moved our shared utility to ChangelogTypeStrategyUtils.java (flink-table/flink-table-common/src/main/java/org/apache/flink/table/types/inference/strategies/ChangelogTypeStrategyUtils.java)?
Thanks @gustavodemorais for the feedback. I have aligned the latest changes. If the code looks good enough, we can proceed with the merge and continue the work on the Utils in your PR or a follow-up PR so the changes are not blocked here.
final Optional<Field> opField =
        inputFields.stream().filter(f -> f.getName().equals(opColumnName)).findFirst();
if (opField.isEmpty()) {
final OptionalInt opIndex = resolveOpColumnIndex(inputFields, opColumnName);
If you only compare names in this method, why not use DataType#getFieldNames then?
Good point, but further down we also use the type:
final Field opField = inputFields.get(opIndex.getAsInt());
final LogicalType opFieldType = opField.getDataType().getLogicalType();
How about composite types, which are covered by DataType#getFieldDataTypes? I'm not sure they are covered here.
I played around with it, but tbh I like this approach of fetching the fields once and gathering the info from them.
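For readers following this thread, here is a minimal sketch of the index-based lookup being discussed. The actual body of resolveOpColumnIndex is not shown in this conversation, so the implementation below is an assumption, and the Field class is a simplified, hypothetical stand-in for the planner's field type (a name plus a type label instead of a DataType).

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;
import java.util.stream.IntStream;

public final class OpColumnLookup {

    /** Hypothetical stand-in for a resolved field: a name plus a type label. */
    public static final class Field {
        private final String name;
        private final String type;

        public Field(String name, String type) {
            this.name = name;
            this.type = type;
        }

        public String getName() {
            return name;
        }

        public String getType() {
            return type;
        }
    }

    /**
     * Assumed behavior: returns the index of the first field whose name equals
     * opColumnName, or an empty OptionalInt if no field matches.
     */
    public static OptionalInt resolveOpColumnIndex(List<Field> inputFields, String opColumnName) {
        return IntStream.range(0, inputFields.size())
                .filter(i -> inputFields.get(i).getName().equals(opColumnName))
                .findFirst();
    }

    public static void main(String[] args) {
        List<Field> inputFields = Arrays.asList(
                new Field("id", "INT"),
                new Field("op", "STRING"),
                new Field("name", "STRING"));

        OptionalInt opIndex = resolveOpColumnIndex(inputFields, "op");
        if (!opIndex.isPresent()) {
            throw new IllegalStateException("op column not found");
        }
        // The index found by name also gives access to the field's type,
        // so the field list only has to be fetched once.
        Field opField = inputFields.get(opIndex.getAsInt());
        System.out.println(opIndex.getAsInt() + " " + opField.getType());
    }
}
```

Returning an index rather than the field itself keeps both the position and the type reachable from a single lookup, which is the trade-off weighed in the comments above.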
gustavodemorais
left a comment
LGTM, thanks for the contribution @raminqaf
 * <pre>{@code
 * Table result = cdcStream
 *     .partitionBy($("id"))
 *     .process("FROM_CHANGELOG");
This is not so nice. Shall we add a method to PartitionedTable?
-  *     .process("FROM_CHANGELOG");
+  *     .fromChangelog();
We can also do this as a follow-up and include to_changelog there.
@raminqaf, it seems your PR is conflicting with the one from @gustavodemorais.
What is the purpose of the change
FROM_CHANGELOG is currently locked to row semantics: each row is processed independently, with no way to co-locate rows for the same key in the same parallel operator instance. This is fine for stateless downstreams but limits use cases where the resulting changelog feeds into a stateful operator keyed on the same column.
This PR uses the conditional-traits machinery introduced for TO_CHANGELOG in FLINK-39392 to switch FROM_CHANGELOG to set semantics when the call provides PARTITION BY. Behavior without PARTITION BY is unchanged.
Brief change log
- BuiltInFunctionDefinitions.FROM_CHANGELOG: input table argument adds withConditionalTrait(SET_SEMANTIC_TABLE, hasPartitionBy())
- FromChangelogTest#testSetSemanticsWithPartitionBy: verifying the Exchange(hash[...]) propagation
- FromChangelogTestPrograms#SET_SEMANTICS_PARTITION_BY: verifying end-to-end output equivalence
- docs/.../changelog.md and Table#fromChangelog JavaDoc
Verifying this change
This change added tests and can be verified as follows:
- PARTITION BY id is specified, and the output changelogMode propagates correctly
- SET_SEMANTICS_PARTITION_BY program: end-to-end run verifying that adding PARTITION BY changes the parallel execution layout but does NOT alter row-level output (same +I, -U, +U, -D sequence as the row-semantics tests)
- FromChangelogTest and FromChangelogSemanticTests cases continue to pass without modification, confirming the conditional trait does not regress the row-semantics path
Does this pull request potentially affect one of the following parts:
- @Public(Evolving): no (only FROM_CHANGELOG's declared traits change; the function signature and Table#fromChangelog API are unchanged)
Documentation
- docs/content/docs/sql/reference/queries/changelog.md and JavaDocs (Table#fromChangelog)
Was generative AI tooling used to co-author this PR?