Skip to content

Conversation

@lincoln-lil
Copy link
Contributor

What is the purpose of the change

This is an addition to the documentation for the newly added interface in FLINK-31487. It may be beneficial for connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates.

Brief change log

Update the current sql doc dev/table/sql/insert.md

@flinkbot
Copy link
Collaborator

flinkbot commented Jul 26, 2023

CI report:

Bot commands The @flinkbot bot supports the following commands:
  • @flinkbot run azure re-run the last Azure build

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lincoln-lil Thanks for the pr. I left minor comments. PTAL.

that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
For the connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates,
this user specified target column list can be found from the {{< gh_link file="flink-table/flink-table-common/src/main/java/org/apache/flink/table/connector/sink/DynamicTableSink.java" name="DynamicTableSink$Context.getTargetColumns()" >}}.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
this user specified target column list can be found from the {{< gh_link file="flink-table/flink-table-common/src/main/java/org/apache/flink/table/connector/sink/DynamicTableSink.java" name="DynamicTableSink$Context.getTargetColumns()" >}}.
this user-specified target column list can be found from the {{< gh_link file="flink-table/flink-table-common/src/main/java/org/apache/flink/table/connector/sink/DynamicTableSink.java" name="DynamicTableSink$Context.getTargetColumns()" >}}.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the above new version is ok, then this dash will not be added

Given a table T(a INT, b INT, c INT), Flink supports INSERT INTO T(c, b) SELECT x, y FROM S. The expectation is
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
For the connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like we miss how the connector devs can avoid it? may be some like:

For the connector .... when processing partial column updates, they can find the user-specified target column list and only update the target columns.

?

Copy link
Contributor Author

@lincoln-lil lincoln-lil Jul 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates, you can get the information about the target columns specified by the user's insert statement from 'getTargetColumns(link...)' and decide how to process the partial updates.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The corresponding chinese version: 连接器开发人员在处理部分列更新时,如果希望避免用空值覆盖非目标列,可以从 "getTargetColumns(link...) "中获取用户插入语句指定的目标列信息,然后决定如何处理部分更新。

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I like the tweaked description.

Given a table T(a INT, b INT, c INT), Flink supports INSERT INTO T(c, b) SELECT x, y FROM S. The expectation is
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
For the connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I'm wondering whether it's more clear to break the line in here since these lines are more likely to say other things.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current version: image
a new line seems to be better, will update it.

Copy link
Contributor Author

@lincoln-lil lincoln-lil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@luoyuxia thanks for reviewing this! What do you think of the tweaked description?

Given a table T(a INT, b INT, c INT), Flink supports INSERT INTO T(c, b) SELECT x, y FROM S. The expectation is
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
For the connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates,
Copy link
Contributor Author

@lincoln-lil lincoln-lil Jul 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates, you can get the information about the target columns specified by the user's insert statement from 'getTargetColumns(link...)' and decide how to process the partial updates.

Given a table T(a INT, b INT, c INT), Flink supports INSERT INTO T(c, b) SELECT x, y FROM S. The expectation is
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
For the connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

current version: image
a new line seems to be better, will update it.

that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
For the connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates,
this user specified target column list can be found from the {{< gh_link file="flink-table/flink-table-common/src/main/java/org/apache/flink/table/connector/sink/DynamicTableSink.java" name="DynamicTableSink$Context.getTargetColumns()" >}}.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if the above new version is ok, then this dash will not be added

Given a table T(a INT, b INT, c INT), Flink supports INSERT INTO T(c, b) SELECT x, y FROM S. The expectation is
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
that 'x' is written to column 'c' and 'y' is written to column 'b' and 'a' is set to NULL (assuming column 'a' is nullable).
For the connector developers who want to avoid overwriting non-target columns with null values when processing partial column updates,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The corresponding chinese version: 连接器开发人员在处理部分列更新时,如果希望避免用空值覆盖非目标列,可以从 "getTargetColumns(link...) "中获取用户插入语句指定的目标列信息,然后决定如何处理部分更新。

Copy link
Contributor

@luoyuxia luoyuxia left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lincoln-lil Thanks for updating. LGTM.

@lincoln-lil lincoln-lil merged commit 65f200b into apache:master Jul 27, 2023
@lincoln-lil lincoln-lil deleted the FLINK-32674 branch July 27, 2023 05:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants