[#3136] fix(spark-connector): support replace column with same column name for spark connector #3461

Merged
merged 5 commits into from
Jul 29, 2024

Conversation

FANNG1
Contributor

@FANNG1 FANNG1 commented May 20, 2024

What changes were proposed in this pull request?

To support replacing a column:

  • allow deleting and re-adding a column with the same name in the column-change validation logic of the Hive catalog
  • remove the column-position lookup in the Iceberg catalog when no column position is specified
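
As a rough illustration of the first bullet, the sketch below shows one way a validator can accept a delete followed by an add of the same column name: it applies the changes in order to a working copy of the column names instead of checking each change against the original schema. The `ColumnChange`, `AddColumn`, and `DeleteColumn` types and the `validate` method are hypothetical stand-ins, not the Hive catalog's actual classes.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class ColumnChangeValidationSketch {
  // Hypothetical stand-ins for table-change types; not the catalog's real classes.
  interface ColumnChange {
    String name();
  }

  static final class AddColumn implements ColumnChange {
    private final String name;
    AddColumn(String name) { this.name = name; }
    @Override public String name() { return name; }
  }

  static final class DeleteColumn implements ColumnChange {
    private final String name;
    DeleteColumn(String name) { this.name = name; }
    @Override public String name() { return name; }
  }

  // Applies the changes in order to a working copy of the column names, so a
  // delete followed by an add of the same name is valid, while adding a name
  // that is still present is rejected.
  static Set<String> validate(List<String> existing, List<ColumnChange> changes) {
    Set<String> names = new LinkedHashSet<>(existing);
    for (ColumnChange change : changes) {
      if (change instanceof DeleteColumn) {
        if (!names.remove(change.name())) {
          throw new IllegalArgumentException("Column does not exist: " + change.name());
        }
      } else if (change instanceof AddColumn) {
        if (!names.add(change.name())) {
          throw new IllegalArgumentException("Column already exists: " + change.name());
        }
      }
    }
    return names;
  }

  public static void main(String[] args) {
    // A "replace column" expressed as delete + add of the same name.
    List<ColumnChange> replace = List.of(new DeleteColumn("name"), new AddColumn("name"));
    System.out.println(validate(List.of("id", "name"), replace));
  }
}
```

The ordering matters in this sketch: validating every change against the original column list would reject the add, because the name still appears there.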

Why are the changes needed?

Fix: #3136

Does this PR introduce any user-facing change?

no

How was this patch tested?

Added an integration test (IT).

@FANNG1 FANNG1 marked this pull request as draft May 20, 2024 03:55
@FANNG1 FANNG1 marked this pull request as ready for review May 20, 2024 04:31
@FANNG1 FANNG1 changed the title from "[#3136] fix(spark-connector): support replace column for spark connector" to "[#3136] fix(spark-connector): support replace column with same column name for spark connector" May 20, 2024
@FANNG1 FANNG1 marked this pull request as draft May 20, 2024 08:26
@FANNG1
Contributor Author

FANNG1 commented May 20, 2024

@jerryshao @mchades @qqqttt123 please help to review this PR when you are free, thanks

@FANNG1 FANNG1 marked this pull request as ready for review May 20, 2024 13:16
@@ -171,40 +171,7 @@ private void doUpdateColumnType(
     icebergUpdateSchema.updateColumn(fieldName, (PrimitiveType) type);
   }

-  private ColumnPosition getAddColumnPosition(StructType parent, ColumnPosition columnPosition) {
Contributor Author

Iceberg lib will handle this internally

@FANNG1
Contributor Author

FANNG1 commented Jul 9, 2024

@mchades @jerryshao please help to review when you are free, thanks.

@@ -143,7 +142,8 @@ private void doMoveColumn(
     } else if (columnPosition instanceof TableChange.First) {
       icebergUpdateSchema.moveFirst(DOT.join(fieldName));
     } else {
-      throw new NotSupportedException(
+      Preconditions.checkArgument(
+          columnPosition instanceof TableChange.Default,
Contributor

Does this change mean that if the user does not specify the column position, nothing will happen?

Contributor Author

no, Iceberg will treat it as last position.

Contributor

But this is the end of the block; I don't see any action after this code.

Contributor Author

Iceberg will add the column to the last position, so no need to move the column

Contributor

I noticed that this method is called in two places, one is doAddColumn and the other is doUpdateColumnPosition. If it's called by doUpdateColumnPosition, no changes will be made either. Are you sure this meets expectations?

Contributor Author

yes, this seems buggy, let me think about it

Contributor Author

@mchades, fixed it and added a test; please review again

@@ -156,7 +160,9 @@ private void doUpdateColumnType(
icebergUpdateSchema.updateColumn(fieldName, (PrimitiveType) type);
}

private ColumnPosition getAddColumnPosition(StructType parent, ColumnPosition columnPosition) {
// Iceberg doesn't support LAST
Contributor

Iceberg doesn't support LAST, so what do you do in the method?

Contributor Author

this method transforms a Last position into After, or into First
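
To make this reply concrete, here is a minimal sketch of that kind of translation: an unspecified (default, i.e. last) position becomes "after the current last column", or "first" when the parent struct has no columns yet. The `Position` types and the `resolve` method are hypothetical and only mirror the idea; they are not Gravitino's or Iceberg's actual API.

```java
import java.util.List;

class AddColumnPositionSketch {
  // Hypothetical position types, loosely modelled on First / After / Default.
  interface Position {}

  static final class First implements Position {}

  static final class Default implements Position {}

  static final class After implements Position {
    final String column;
    After(String column) { this.column = column; }
  }

  // Resolves an unspecified position into an explicit one, for a target that
  // only understands "first" and "after <column>".
  static Position resolve(List<String> parentColumns, Position requested) {
    if (!(requested instanceof Default)) {
      return requested; // Explicit positions pass through unchanged.
    }
    if (parentColumns.isEmpty()) {
      return new First(); // Nothing to place the column after.
    }
    return new After(parentColumns.get(parentColumns.size() - 1));
  }

  public static void main(String[] args) {
    Position p = resolve(List.of("id", "name"), new Default());
    if (p instanceof After) {
      System.out.println("after " + ((After) p).column);
    }
  }
}
```

The empty-parent case is why the translation needs a First branch at all: there is no last column to anchor an After position to.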

Contributor Author

update the comment

@FANNG1
Contributor Author

FANNG1 commented Jul 25, 2024

@mchades all comments are addressed, please review again

Contributor

@mchades mchades left a comment

LGTM

@mchades mchades merged commit 2c0eb6a into apache:main Jul 29, 2024
33 checks passed
Successfully merging this pull request may close these issues.

[Bug report] [spark-connector] replace column failed for hive table
2 participants