Delete destination directory when exception occurred during ingestion #50

Merged
kevinwallimann merged 1 commit into develop from feature/remove-destination-dir-on-failure
Oct 4, 2019

Conversation

@kevinwallimann
Collaborator

Problem: If, for example, a single task fails while writing, the ingestion is incomplete and an IngestionException is thrown. However, the output directory is not cleaned up, so partially written files remain. For streaming queries, Spark does not yet delete the written files when a task is aborted; this is only planned for Spark v3.0.0 (https://issues.apache.org/jira/browse/SPARK-27254, https://issues.apache.org/jira/browse/SPARK-27210).

This PR: If an exception occurs during ingestion, the destination directory is deleted, but only if it was empty before the ingestion started. If the destination directory was not empty at the start, it is left untouched and has to be cleaned up manually.
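
For illustration only, the cleanup rule described above could be sketched roughly as follows. This is not the code from this PR: it is a minimal Python sketch on a local filesystem, and the names `IngestionError`, `is_empty_or_missing`, `ingest_with_cleanup`, and `failing_write` are hypothetical stand-ins. It also assumes that a missing destination directory counts as empty.

```python
import shutil
import tempfile
from pathlib import Path


class IngestionError(Exception):
    """Hypothetical stand-in for the IngestionException mentioned above."""


def is_empty_or_missing(directory: Path) -> bool:
    # Assumption: a destination that does not exist yet is treated as empty.
    return not directory.exists() or not any(directory.iterdir())


def ingest_with_cleanup(destination: Path, write) -> None:
    # Record the state of the destination BEFORE anything is written.
    was_empty = is_empty_or_missing(destination)
    try:
        write(destination)
    except Exception as exc:
        # Only remove the directory if this run is the sole source of its content.
        if was_empty and destination.exists():
            shutil.rmtree(destination)
        raise IngestionError(f"Ingestion to {destination} failed") from exc


def failing_write(destination: Path) -> None:
    # Simulates a task that writes a partial file and is then aborted.
    destination.mkdir(parents=True, exist_ok=True)
    (destination / "part-00000").write_text("partial")
    raise RuntimeError("task aborted")


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    try:
        ingest_with_cleanup(root / "fresh", failing_write)
    except IngestionError:
        print("destination still exists:", (root / "fresh").exists())
    shutil.rmtree(root)
```

Checking emptiness before the write is what makes the deletion safe: if the directory already held data, there is no cheap way to tell old files from partial new ones, so the sketch leaves everything in place for manual cleanup.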

Contributor

@felipemmelo felipemmelo left a comment

Spot on.

kevinwallimann marked this pull request as ready for review October 4, 2019 08:16
kevinwallimann merged commit 01419bd into develop Oct 4, 2019
kevinwallimann deleted the feature/remove-destination-dir-on-failure branch October 8, 2019 14:14
kevinwallimann modified the milestones: v1.1.0, v2.0.0 Jan 22, 2020
kevinwallimann added the bug (Something isn't working) label Jan 22, 2020