Spark: Parameterize backup suffix in migrate procedure #7121
I made my comments in the spark3.3 folder. Could you also make the corresponding changes for spark3.2 and spark3.1?
...-extensions/src/test/java/org/apache/iceberg/spark/extensions/TestMigrateTableProcedure.java
spark/v3.3/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java
@sririshindra Thanks for the review. I believe I've made the changes requested, PTAL when you have a chance.
LGTM. Please take a look at the comment I made about potentially removing Nullable and see if it makes sense. That is up to you as I am not really familiar with its standard usage.
PTAL @rdblue @RussellSpitzer - Thanks!
I'd rather not add another option. Could we just change the suffix to backup? I just can't imagine a situation where folks need to change this dynamically.
@RussellSpitzer I think
I think you are just making the case for changing it to something like iceberg_migration_backup. I still don't see why a user would want it to be anything other than a more descriptive suffix.
Each user has a different migration strategy, and they could use the suffix as a tagging mechanism in the old table name for different scenarios (this is what we do). A fixed suffix, even a more descriptive one, still does not fill that gap. Other issues arise with long table names that fail to migrate because of table name length limits, where changing to a one-character suffix helps.
I suppose we were okay with changing the backup name on
This PR adds the ability to use a custom backup suffix during the migrate process. By default, the suffix __BACKUP__ is used, which sometimes gets represented as __backup__ in engines that lower-case table names, leading to confusion. This change enables users to optionally set a suffix for those backup tables. A follow-up PR in docs would update the parameter list once merged.
PTAL Thanks.
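
To make the motivation concrete, here is a minimal sketch of how a backup table name could be derived from an optional suffix. It is not the code in this PR: the class BackupNameSketch, its methods, and the length limit passed to fitsLimit are all hypothetical names chosen for illustration. Only the default suffix __BACKUP__ comes from the description above.

```java
import java.util.Locale;

/** Illustrative only: not the Iceberg implementation. All names here are hypothetical. */
final class BackupNameSketch {
  // Default suffix named in the PR description.
  static final String DEFAULT_SUFFIX = "__BACKUP__";

  private BackupNameSketch() {}

  /** Appends the given suffix, falling back to the default when none is supplied. */
  static String backupName(String tableName, String suffix) {
    String effective = (suffix == null || suffix.isEmpty()) ? DEFAULT_SUFFIX : suffix;
    return tableName + effective;
  }

  /** Mimics an engine that lower-cases identifiers before storing them. */
  static String asStoredByLowerCasingEngine(String name) {
    return name.toLowerCase(Locale.ROOT);
  }

  /** True when the name fits within an assumed identifier length limit. */
  static boolean fitsLimit(String name, int maxLength) {
    return name.length() <= maxLength;
  }
}
```

With the default, a table named sales gets the backup name sales__BACKUP__, which a lower-casing engine would store as sales__backup__. Passing a short suffix such as x shows how a long table name could stay under a catalog's length limit, which is the second scenario raised in the discussion.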