shuiqiangchen
Contributor

@shuiqiangchen shuiqiangchen commented Feb 9, 2023

What is the purpose of the change

Fix a bug where preValidateRewrite excludes all metadata and computed columns when rewriting a partial insert.

Brief change log

  • Added a utility to get the persisted schema.
  • Use the persisted columns as the target row type in the pre-validate rewrite for partial inserts.
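
To make the change log concrete, here is a minimal Java sketch of the idea, not the planner's actual code. The Col descriptor and the helpers targetColumns and padWithNulls are hypothetical names introduced only for illustration. The sketch assumes that a partial insert is rewritten by taking the columns the sink actually writes as the target row type and padding every column missing from the column list with NULL.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified, hypothetical model of the rewrite; not Flink planner code. */
final class PartialInsertSketch {

    /** Hypothetical column descriptor: a name plus whether the sink writes it. */
    static final class Col {
        final String name;
        final boolean persisted;

        Col(String name, boolean persisted) {
            this.name = name;
            this.persisted = persisted;
        }
    }

    /** Target row type: only the columns that are written to the sink. */
    static List<String> targetColumns(List<Col> schema) {
        List<String> names = new ArrayList<>();
        for (Col c : schema) {
            if (c.persisted) {
                names.add(c.name);
            }
        }
        return names;
    }

    /** Keep the columns named in the insert list and pad the rest with NULL. */
    static List<String> padWithNulls(List<String> target, List<String> insertColumns) {
        for (String name : insertColumns) {
            if (!target.contains(name)) {
                throw new IllegalArgumentException("Unknown target column: " + name);
            }
        }
        List<String> projection = new ArrayList<>();
        for (String name : target) {
            projection.add(insertColumns.contains(name) ? name : "NULL");
        }
        return projection;
    }
}
```

In this model, a persisted metadata column stays in the target row type, so it can appear in the column list of a partial insert, while a computed column is rejected because the sink never writes it.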

Verifying this change

This change is covered by test cases in PartialInsertTest.

Does this pull request potentially affect one of the following parts:

  • Dependencies (does it add or upgrade a dependency): (no)
  • The public API, i.e., is any changed class annotated with @Public(Evolving): (no)
  • The serializers: (no)
  • The runtime per-record code paths (performance sensitive): (no)
  • Anything that affects deployment or recovery: JobManager (and its components), Checkpointing, Kubernetes/Yarn, ZooKeeper: (no)
  • The S3 file system connector: (no)

Documentation

  • Does this pull request introduce a new feature? (no)
  • If yes, how is the feature documented? (not documented)

@flinkbot
Collaborator

flinkbot commented Feb 9, 2023

CI report:

Bot commands

The @flinkbot bot supports the following commands:
  • @flinkbot run azure: re-run the last Azure build

@shuiqiangchen
Contributor Author

@flinkbot run azure

@shuiqiangchen
Contributor Author

@PatrickRen could you please help figure out why the Azure run was not triggered after the @flinkbot run azure command?

@MartijnVisser
Contributor

@shuiqiangchen You need to rebase your PR on the latest changes in master

@shuiqiangchen shuiqiangchen force-pushed the FLINK-30922-write-metadata-column branch from 996be99 to ca74da7 Compare February 11, 2023 02:45
@shuiqiangchen
Contributor Author

@MartijnVisser Thank you for the reminder. It worked after rebasing.

@shuiqiangchen shuiqiangchen changed the title from "[FLINK-30922][table-planner] Apply persisted columns when doing appendPartitionAndNu…" to "[FLINK-30922][table-planner] Apply persisted columns when doing appendPartitionAndNullsProjects" on Feb 22, 2023
Contributor

@lincoln-lil lincoln-lil left a comment

@shuiqiangchen thanks for fixing this! I've left some comments. We should also add tests for a sink table that has computed columns (virtual and non-virtual).


/**
* Return {@link TableSchema} which consists of all persisted columns. That means, the virtual
* computed columns and metadata columns are filterd out.
Contributor

nit: -> filtered

* computed columns and metadata columns are filterd out.
*
* <p>Readers(or writers) such as {@link TableSource} and {@link TableSink} should use this
* persisted schema to generate {@link TableSource#getProducedDataType()} and {@link
Contributor

Users should know the difference between this method and getPhysicalSchema

* TableSource#getTableSchema()} rather than using the raw TableSchema which may contains
* additional columns.
*/
public static TableSchema getPersistedSchema(TableSchema tableSchema) {
Contributor

We should extract a common method shared by this and getPhysicalSchema, since only one condition differs.
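
One way to read this suggestion is a single private helper that takes the filter condition as a parameter. The following is a minimal Java sketch under stated assumptions: Col, filterColumns, physicalColumns and persistedColumns are hypothetical stand-ins for illustration, not the names used in the PR. The boolean flags model the assumption that a physical column is always persisted, a non-virtual metadata column is persisted but not physical, and virtual metadata and computed columns are neither.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

/** Hypothetical sketch of sharing one filter between two schema getters. */
final class SchemaFilterSketch {

    /** Hypothetical stand-in for a table column. */
    static final class Col {
        final String name;
        final boolean physical;  // regular column stored in the table
        final boolean persisted; // written by the sink

        Col(String name, boolean physical, boolean persisted) {
            this.name = name;
            this.physical = physical;
            this.persisted = persisted;
        }
    }

    /** Shared helper: the two public methods differ only in the predicate. */
    private static List<String> filterColumns(List<Col> columns, Predicate<Col> keep) {
        List<String> names = new ArrayList<>();
        for (Col c : columns) {
            if (keep.test(c)) {
                names.add(c.name);
            }
        }
        return names;
    }

    /** Keeps physical columns only. */
    static List<String> physicalColumns(List<Col> columns) {
        return filterColumns(columns, c -> c.physical);
    }

    /** Keeps every column the sink writes, including persisted metadata columns. */
    static List<String> persistedColumns(List<Col> columns) {
        return filterColumns(columns, c -> c.persisted);
    }
}
```

Passing the condition as a Predicate keeps the iteration and schema-building logic in one place, so a future change to how the filtered schema is built applies to both getters.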

@shuiqiangchen shuiqiangchen force-pushed the FLINK-30922-write-metadata-column branch from ca74da7 to d6ddc18 Compare March 6, 2023 11:15
@shuiqiangchen
Contributor Author

@lincoln-lil Thank you for your comments. I have added a test case for partialInsert with computed columns and updated the Javadoc comments as you suggested.

Contributor

@lincoln-lil lincoln-lil left a comment

LGTM!

@lincoln-lil lincoln-lil merged commit a2d78e6 into apache:master Mar 6, 2023
@lincoln-lil
Contributor

@shuiqiangchen could you also create a backport PR for the release-1.17 branch?

shuiqiangchen added a commit to shuiqiangchen/flink that referenced this pull request Mar 6, 2023
…ns in column list of a partial-insert statement

This closes apache#21897

(cherry picked from commit a2d78e6)
@shuiqiangchen
Contributor Author

@lincoln-lil Thank you, I will do it.
