API, Spark 4.1: Add ignore_missing_files to migrate procedure #16643

Merged: huaxingao merged 2 commits into apache:main from drexler-sky:migrate on Jun 5, 2026
@drexler-sky drexler-sky commented Jun 1, 2026

Adds a new ignore_missing_files parameter to the Spark migrate procedure, allowing migrations to skip over source files that are no longer present instead of failing.
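
As a rough illustration of how the new option might be passed, the sketch below renders a `CALL` statement with named arguments. The argument name `ignore_missing_files` is taken from the PR title; the helper class, the catalog name, and the table name are hypothetical and not part of Iceberg.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Iceberg: renders a CALL statement
// for the migrate procedure using named arguments.
final class MigrateCallSketch {
  private MigrateCallSketch() {}

  static String migrateCall(String catalog, String table, boolean ignoreMissingFiles) {
    // LinkedHashMap keeps the arguments in insertion order.
    Map<String, String> args = new LinkedHashMap<>();
    // Escape single quotes so the table name stays a valid SQL string literal.
    args.put("table", "'" + table.replace("'", "''") + "'");
    // Assumed argument name, based on the PR title.
    args.put("ignore_missing_files", String.valueOf(ignoreMissingFiles));

    String rendered = args.entrySet().stream()
        .map(e -> e.getKey() + " => " + e.getValue())
        .collect(Collectors.joining(", "));
    return "CALL " + catalog + ".system.migrate(" + rendered + ")";
  }
}
```

The resulting string could then be handed to `SparkSession.sql(...)` in a session where the Iceberg SQL extensions are enabled.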

@huaxingao changed the title from "API, SPark 4.1: Add ignore_missing_files to migrate procedure" to "API, Spark 4.1: Add ignore_missing_files to migrate procedure" on Jun 1, 2026
sql("SELECT * FROM %s", tableName));
}

private static void deleteDirectory(Path dir) throws IOException {

Member

We could use the FileUtils.deleteDirectory method instead.

Contributor Author

Done
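
For context, a hand-rolled recursive delete of the kind the reviewer suggests replacing often looks like the sketch below. The body of the PR's original helper is not shown in this thread, so this is only an assumption about its shape, written against the JDK alone. The reviewer's `FileUtils.deleteDirectory` presumably refers to the Apache Commons IO utility, which covers the same job in one call.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Illustrative only: a JDK-only recursive delete, not the code from the PR.
final class DeleteDirectorySketch {
  private DeleteDirectorySketch() {}

  static void deleteDirectory(Path dir) throws IOException {
    if (!Files.exists(dir)) {
      return; // nothing to delete
    }
    try (Stream<Path> paths = Files.walk(dir)) {
      // Reverse order visits children before their parent directory.
      paths.sorted(Comparator.reverseOrder()).forEach(path -> {
        try {
          Files.delete(path);
        } catch (IOException e) {
          throw new UncheckedIOException(e);
        }
      });
    }
  }
}
```

Swapping a helper like this for a library call removes test-only code that would otherwise need its own maintenance.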

* @param ignore whether to ignore missing source files
* @return this for method chaining
*/
default MigrateTable ignoreMissingFiles(boolean ignore) {

Member

I'm wondering if we really need the boolean ignore argument here. We could remove it and modify MigrateTableProcedure to something like:

      boolean ignoreMissingFiles = input.asBoolean(IGNORE_MISSING_FILES_PARAM, false);
      if (ignoreMissingFiles) {
        migrateTableSparkAction = migrateTableSparkAction.ignoreMissingFiles();
      }

The existing drop_backup parameter employs this style.

Contributor Author

Done
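
The no-argument style suggested above can be sketched end to end as follows. All type names here are simplified stand-ins rather than the real Iceberg classes: the `skipsMissingFiles()` accessor exists only so the example can be checked, a plain `Map` stands in for the procedure input, and the throwing default method is an assumption about how an unsupported option might be signalled.

```java
import java.util.Map;

// Simplified stand-in for the action interface; not the real Iceberg API.
interface MigrateTableSketch {
  // No boolean argument: calling the method is what turns the option on.
  default MigrateTableSketch ignoreMissingFiles() {
    throw new UnsupportedOperationException(
        getClass().getName() + " does not support ignoreMissingFiles");
  }

  // Hypothetical accessor, present only to make the sketch observable.
  boolean skipsMissingFiles();
}

// Stand-in for an engine-specific action that supports the option.
final class SparkMigrateSketch implements MigrateTableSketch {
  private boolean ignoreMissingFiles = false;

  @Override
  public MigrateTableSketch ignoreMissingFiles() {
    this.ignoreMissingFiles = true;
    return this;
  }

  @Override
  public boolean skipsMissingFiles() {
    return ignoreMissingFiles;
  }
}

// Stand-in for the procedure: it reads the flag and opts in only when true.
final class MigrateProcedureSketch {
  static final String IGNORE_MISSING_FILES_PARAM = "ignore_missing_files";

  private MigrateProcedureSketch() {}

  static MigrateTableSketch configure(MigrateTableSketch action, Map<String, Boolean> input) {
    boolean ignoreMissingFiles = input.getOrDefault(IGNORE_MISSING_FILES_PARAM, false);
    if (ignoreMissingFiles) {
      action = action.ignoreMissingFiles();
    }
    return action;
  }
}
```

With this shape the default behavior needs no call at all, and the interface avoids a `false` argument that would only restate the default.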

@github-actions github-actions Bot added the docs label Jun 3, 2026

@huaxingao huaxingao left a comment

LGTM

@huaxingao huaxingao merged commit c00669f into apache:main Jun 5, 2026
57 checks passed
@huaxingao
Contributor

Thanks @drexler-sky for the PR! Thanks @ebyhr for the review!
