Skip to content
This repository has been archived by the owner on Dec 16, 2022. It is now read-only.

improve robustness of cached_path when extracting archives #4622

Merged
merged 2 commits into from
Sep 3, 2020

Conversation

epwalsh
Copy link
Member

@epwalsh epwalsh commented Sep 1, 2020

Closes #4501.

@epwalsh epwalsh requested a review from dirkgr September 1, 2020 20:48
@epwalsh epwalsh changed the title improve robustness of cache_path when extracting archives improve robustness of cached_path when extracting archives Sep 1, 2020

# We extract first to a temporary directory in case something goes wrong
# during the extraction process so we don't end up with a corrupted cache.
tmp_extraction_dir = tempfile.mkdtemp(dir=os.path.split(extraction_path)[0])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is fine, but why didn't you use with tempfile.TemporaryDirectory(...) as d:?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When used as a context manager, it will try to remove the temp directory when we exit the context (there is no option to disable that). But at that point - if the extraction was successful - we've already renamed the temp directory to it's final location, so the context manager will raise an exception when it tries to remove the temp directory that no longer exists.

Comment on lines 205 to 206
os.makedirs(extraction_path)
os.replace(tmp_extraction_dir, extraction_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One more comment: Doesn't the first line create all the directories, including the target one? If we fail between these two lines, I think it'll conclude that the archive has been extracted and won't do it again?

In summary, I think that first one has to be os.makedirs(os.path.split(extraction_path)[0]).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes, good point.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just fixed. After taking another look, I realized we can just remove that first line altogether because at that point the base directory should already exist.

@epwalsh epwalsh merged commit e1aa57c into master Sep 3, 2020
@epwalsh epwalsh deleted the file-utils branch September 3, 2020 16:09
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

When extracting an archive fails half-way through, it corrupts the cache
2 participants