Using rewrite_table_path iceberg procedure on a backup

### Query engine

Spark 3.4.3

### Question

Hi Iceberg team,

I was wondering how best to use the [rewrite_table_path](https://iceberg.apache.org/docs/latest/spark-procedures/#rewrite_table_path) procedure on a backup.

My situation is the following:

- I have an S3 bucket in which Iceberg stores the data and metadata files
- My catalog is a Hive metastore backed by a Postgres DB on RDS
- I have a backup of that S3 bucket in another S3 bucket in another region, possibly in another account
- I also have a backup of the RDS instance in the other account
- Let's say my original S3 bucket gets corrupted or becomes unreachable, so I need to switch to the backup bucket and the backup RDS
- I then want to use `rewrite_table_path` and `register_table` to recreate the tables so that I can use them
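For concreteness, the intended sequence could be sketched as the two Spark SQL `CALL` statements below, built here as plain strings in Python. This is only a sketch: the argument names (`table`, `source_prefix`, `target_prefix`, `metadata_file`) follow the linked procedure documentation, while the catalog name `my_catalog`, the namespace `db`, and the helper `build_recovery_statements` are hypothetical placeholders. The metadata file passed to `register_table` would be whichever rewritten metadata JSON ends up under the backup prefix.

```python
def build_recovery_statements(catalog, table, source_prefix, target_prefix, metadata_file):
    """Return the two Spark SQL CALL statements for the intended recovery flow.

    Hypothetical helper for illustration only; it builds strings and does not
    talk to Spark or to any catalog.
    """
    # Step 1: rewrite all absolute paths from the original prefix to the backup prefix.
    rewrite = (
        f"CALL {catalog}.system.rewrite_table_path("
        f"table => '{table}', "
        f"source_prefix => '{source_prefix}', "
        f"target_prefix => '{target_prefix}')"
    )
    # Step 2: register the table from the rewritten metadata file.
    register = (
        f"CALL {catalog}.system.register_table("
        f"table => '{table}', "
        f"metadata_file => '{metadata_file}')"
    )
    return rewrite, register


rewrite_sql, register_sql = build_recovery_statements(
    catalog="my_catalog",        # assumption: placeholder catalog name
    table="db.test_table",       # assumption: placeholder namespace
    source_prefix="s3a://original-bucket/test_table/",
    target_prefix="s3a://backup-bucket/test_table/",
    metadata_file="s3a://backup-bucket/test_table/metadata/v1.metadata.json",
)
print(rewrite_sql)
print(register_sql)
```

Each string would then be passed to `spark.sql(...)` in a session that has the Iceberg catalog configured.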

What I gather from the documentation:

- `rewrite_table_path` needs a registered table to work on, because you specify the table name in the `CALL` command
- on the other hand, the documentation says to run `register_table` with the new metadata.json only after `rewrite_table_path` has run, which makes total sense to me

My problem is: how can I run `rewrite_table_path` without registering the table first? If I try, Spark returns a `Couldn't load table` error, which makes sense because the table does not exist in the catalog.

If I register the table first, Spark returns a different error: `Path s3a://backup-bucket/test_table/metadata/v1.metadata.json does not start with s3a://original-bucket/test_table/`.
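The second error reads like the result of a plain prefix check: the table registered from the backup has its metadata file under the backup bucket, while `source_prefix` points at the original bucket. The snippet below is a minimal model of that kind of check, written only from the error message. It is a guess at the behaviour, not Iceberg's actual code, and `rewrite_prefix` is a hypothetical helper.

```python
def rewrite_prefix(path, source_prefix, target_prefix):
    """Swap source_prefix for target_prefix, rejecting paths outside source_prefix.

    Illustrative model only; the error text mirrors the message quoted above.
    """
    if not path.startswith(source_prefix):
        raise ValueError(f"Path {path} does not start with {source_prefix}")
    return target_prefix + path[len(source_prefix):]


source = "s3a://original-bucket/test_table/"
target = "s3a://backup-bucket/test_table/"

# A path that still lives under the original prefix is rewritten as expected.
rewritten = rewrite_prefix(source + "metadata/v1.metadata.json", source, target)

# A path that already lives under the backup prefix fails the check.
try:
    rewrite_prefix(target + "metadata/v1.metadata.json", source, target)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(rewritten)
print(error_message)
```

Under this model, the registered table's current metadata location has to sit under `source_prefix`, while the paths written inside the backed-up metadata files still point at the original bucket.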

I understand how `rewrite_table_path` would work if I could run it against the existing table on my original bucket, then move the data and metadata files to the new bucket and run `register_table` there. But that may not be possible if the old bucket has been destroyed, corrupted, or is otherwise unreachable.

[This blog post](https://www.dremio.com/blog/disaster-recovery-for-apache-iceberg-tables-restoring-from-backup-and-getting-back-online/) states that my approach should work, but I cannot execute its step `3. Check for File Path Changes Before Recovery` because of the problem described above.

I feel that I'm missing something very obvious. Please advise!

Using rewrite_table_path iceberg procedure on a backup #14606

Query engine

Question

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Using rewrite_table_path iceberg procedure on a backup #14606

Description

Query engine

Question

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions