New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DM-36412: Use atomic writes for copy or move of local files #33
Conversation
Codecov ReportBase: 86.21% // Head: 86.03% // Decreases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## main #33 +/- ##
==========================================
- Coverage 86.21% 86.03% -0.18%
==========================================
Files 22 22
Lines 2727 2736 +9
Branches 542 505 -37
==========================================
+ Hits 2351 2354 +3
- Misses 297 303 +6
Partials 79 79
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
If multiple processes are copying to the same destination we want the content of the file to be correct even if we don't know which file we end up with. Note that os.rename does not care if the file already exists so the overwrite parameter is ignored if the target file appeared between the first check and the final rename.
try: | ||
with transaction.undoWith(f"move from {local_src}", os.rename, newFullPath, local_src): | ||
os.rename(local_src, newFullPath) | ||
except OSError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not easy to trigger this code path because the test environment does not have multiple disks where rename would fail.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, though I still have questions about some corner cases, but maybe we'll never see anything that odd.
with transaction.undoWith(f"move from {local_src}", os.rename, newFullPath, local_src): | ||
os.rename(local_src, newFullPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If rename fails do we actually need "undo"? I'm thinking of a case, for example, whennewFullPath
already exists for some reason but you can't overwrite it, so rename fill fail, does it make sense to move existing newFullPath
to local_src
? Maybe if renme fails, it's safe to assume that destination did not change (or does not exist)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If rename fails the undo won't ever happen. undoWith
only registers the undo callback if the code block succeeds. This is for the case where the rename works but some time later the transaction has to be rolled back.
with transaction.undoWith(f"copy from {local_src}", os.remove, newFullPath): | ||
# os.rename works even if the file exists. | ||
# It's possible that another process has copied a file | ||
# in whilst this one was copying. If overwrite | ||
# protection is needed then another stat() call should | ||
# happen here. | ||
os.rename(temp_copy.ospath, newFullPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similar issue here, if newFullPath
exists but cannot be overwritten, is it worth trying to delete it? You probably cannot delete it if you cannot overwrite it, but some odd conditions may still exist.
If multiple processes are copying to the same destination we want the content of the file to be correct even if we don't know which file we end up with.
Note that os.rename does not care if the file already exists so the overwrite parameter is ignored if the target file appeared between the first check and the final rename.
Checklist
doc/changes