Skip to content

[FLINK-39700][table-planner] Fix column-order sensitivity in validateAndExtractColumnChanges#28194

Merged
AHeise merged 2 commits into
apache:masterfrom
AHeise:FLINK-39700-mt-column-order
May 20, 2026
Merged

[FLINK-39700][table-planner] Fix column-order sensitivity in validateAndExtractColumnChanges#28194
AHeise merged 2 commits into
apache:masterfrom
AHeise:FLINK-39700-mt-column-order

Conversation

@AHeise
Copy link
Copy Markdown
Contributor

@AHeise AHeise commented May 18, 2026

What is the purpose of the change

Physical DDL columns in CREATE OR ALTER MATERIALIZED TABLE could not be declared in a different order than the AS SELECT projection, because validateAndExtractColumnChanges compared old and new schemas positionally. This PR switches to name-based matching so that type changes, computed-column definition changes, drops, and additions each generate the appropriate TableChange entry.

It also consolidates query-change and persisted-column-drop validation — previously scattered across MaterializedTableChangeHandler and SqlAlterMaterializedTableDropSchemaConverter — into AlterMaterializedTableChangeOperation.validateChanges(). The handler becomes a pure applier.

CREATE OR ALTER now uses declarative semantics: non-persisted columns absent from the explicit DDL are dropped instead of kept.

Brief change log

  • MaterializedTableUtils.validateAndExtractColumnChanges: replace positional comparison with name-based lookup over all column kinds (physical, computed, metadata):
    • Physical type change → ModifyPhysicalColumnType (instead of throwing)
    • Computed/metadata definition change → ModifyColumn
    • Comment-only change → ModifyColumnComment (also emitted alongside ModifyPhysicalColumnType when both change)
    • Column absent in new schema → DropColumn; non-persisted columns are skipped when schemaDefinedInQuery=false (schema derived from query projection, so absent non-persisted columns are assumed to still exist rather than being intentionally removed)
    • New column → AddColumn; nullable coercion applied when schemaDefinedInQuery=false
    • Removes the apache-commons-lang3 StringUtils dependency from this class
  • AlterMaterializedTableChangeOperation: new validateChanges() method called from execute() rejects:
    • Column reordering (ModifyColumnPosition) when the change set includes ModifyDefinitionQuery
    • Physical type changes (ModifyPhysicalColumnType) when the change set includes ModifyDefinitionQuery
    • Persisted-column drops (DropColumn) unconditionally
      Also gains computeNewTable() (protected hook), getOldTable(), and setOldTable() (invalidates derived caches)
  • MaterializedTableChangeHandler: remove isQueryChange, droppedPersistedCnt, validationErrors, checkForChangedPositionByQuery, and all positionChangeError helpers; dropColumn is now unconditional; unknown changes throw immediately
  • SqlAlterMaterializedTableDropSchemaConverter: remove the per-column persisted-column-drop guard (now handled by validateChanges)
  • FullAlterMaterializedTableOperation: override computeNewTable() with a newTableBuilder lambda supplied by SqlCreateOrAlterMaterializedTableConverter, so the caller controls the new-table construction without shadowing the cached newTable field on the parent
  • SqlCreateOrAlterMaterializedTableConverter: extract buildNewTable() that builds the new table definition with non-persisted columns absent from the explicit DDL dropped (declarative semantics); unfold buildTableChanges from a returned lambda into a direct method; add getMergedLogicalRefreshMode, getMergedComment, getMergedFreshness to MergeContext

Verifying this change

This change added tests and can be verified as follows:

  • AlterMaterializedTableChangeOperationValidationTest: rejects position changes, type changes, and persisted-column drops via validateChanges()
  • AlterMaterializedTableAsQueryOperationValidationTest: end-to-end validation of CREATE OR ALTER and ALTER ... AS SELECT through execute()
  • MaterializedTableChangeHandlerTest: handler applies changes without validation side effects
  • ValidateAndExtractColumnChangesTest: parametrized suite covering identical schemas, comment add/remove, column appended (with/without schemaDefinedInQuery), reordered persisted columns (no-op), type change, drop, rename, computed add/drop/modify, metadata add/drop
  • SqlMaterializedTableNodeToOperationConverterTest: updated expected error messages and schema assertions

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): no
  • The public API, i.e., is any changed class annotated with @Public(Evolving): no
  • The serializers: no
  • The runtime per-record code paths (performance sensitive): no
  • Anything that affects deployment or recovery: no
  • The S3 file system connector: no

Documentation

  • Does this pull request introduce a new feature? no (correctness fix and refactor for materialized table schema evolution)
  • If yes, how is the feature documented? not applicable

Was generative AI tooling used to co-author this PR?
  • Yes (Claude Code)

Generated-by: Claude Code (claude-sonnet-4-6)

@flinkbot
Copy link
Copy Markdown
Collaborator

flinkbot commented May 18, 2026

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

@AHeise AHeise force-pushed the FLINK-39700-mt-column-order branch 2 times, most recently from d9aacab to 98cba84 Compare May 19, 2026 15:16
…ation into AlterMaterializedTableChangeOperation

Validation that was scattered across MaterializedTableChangeHandler (checkForChangedPositionByQuery,
droppedPersistedCnt, validationErrors) and SqlAlterMaterializedTableDropSchemaConverter is consolidated
into AlterMaterializedTableChangeOperation.validateChanges(). The handler becomes a pure applier: it
throws immediately on unknown changes and applies DropColumn unconditionally.

Same errors are thrown, just from the right layer. Tests for both layers
(AlterMaterializedTableChangeOperationValidationTest, AlterMaterializedTableAsQueryOperationValidationTest,
MaterializedTableChangeHandlerTest) are added alongside the prod change.
@AHeise AHeise force-pushed the FLINK-39700-mt-column-order branch from 98cba84 to c30a346 Compare May 19, 2026 15:29
@AHeise
Copy link
Copy Markdown
Contributor Author

AHeise commented May 19, 2026

@flinkbot run azure

@AHeise AHeise force-pushed the FLINK-39700-mt-column-order branch from b0aec94 to c30a346 Compare May 19, 2026 16:46
@AHeise
Copy link
Copy Markdown
Contributor Author

AHeise commented May 19, 2026

@flinkbot run azure

@AHeise AHeise force-pushed the FLINK-39700-mt-column-order branch from f0b1bbe to c30a346 Compare May 19, 2026 19:03
@AHeise
Copy link
Copy Markdown
Contributor Author

AHeise commented May 19, 2026

@flinkbot run azure

@AHeise AHeise force-pushed the FLINK-39700-mt-column-order branch from d2d729e to c30a346 Compare May 19, 2026 19:32
List<Column> oldColumns,
Map<String, Integer> columnIndex,
List<String> errors) {
final Integer idx = columnIndex.get(change.getColumnName());
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
final Integer idx = columnIndex.get(change.getColumnName());
final int idx = columnIndex.getOrDefault(change.getColumnName(), -1);

and then have logic similar to checkPositionChange
with

if (index < 0) {
            return;
        }
        errors.add(
...

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same for checkPhysicalTypeChange

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@AHeise AHeise force-pushed the FLINK-39700-mt-column-order branch from c30a346 to 92c6b21 Compare May 19, 2026 20:29
}

@VisibleForTesting
public void validateChanges() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

WDYT about movement public methods up, private to the bottom?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, moved public methods up and private to the bottom.

…TER drops absent non-persisted columns

validateAndExtractColumnChanges switches from position-based throws to name-based diff: columns are
matched by name; new columns emit AddColumn; matched columns with type/class/comment differences emit
ModifyPhysicalColumnType/ModifyColumn/ModifyColumnComment; absent columns emit DropColumn (skipping
non-persisted when schemaDefinedInQuery=false). This allows physical DDL columns to be declared in a
different order than the AS SELECT projection.

AlterMaterializedTableChangeOperation gains a protected computeNewTable() hook so subclasses can plug
in a different builder without shadowing the cached newTable field. FullAlterMaterializedTableOperation
overrides the hook with a newTableBuilder lambda supplied by SqlCreateOrAlterMaterializedTableConverter
so non-persisted columns absent from the new DDL are dropped (declarative CREATE OR ALTER semantics).
Tests for validateAndExtractColumnChanges are added alongside.
@AHeise AHeise force-pushed the FLINK-39700-mt-column-order branch from 92c6b21 to f750378 Compare May 19, 2026 20:33
@AHeise
Copy link
Copy Markdown
Contributor Author

AHeise commented May 19, 2026

@flinkbot run azure

Copy link
Copy Markdown
Contributor

@snuyanzin snuyanzin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM
Thanks for addressing feedback

@AHeise AHeise merged commit 4ac4c4a into apache:master May 20, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants