Fix: resolve issue #4478 by using a temporary file for non-append writes#4624
Fix: resolve issue #4478 by using a temporary file for non-append writes#4624milosgajdos merged 3 commits intodistribution:mainfrom
Conversation
|
Please sign your DCO @onporat |
…n-append writes To address the issue where a failed write operation results in an empty file, we can use a temporary file for non-append writes. This ensures that the original file is only replaced once the new content is fully written and committed. **Key Changes:** 1. **Temporary File Handling:** - For non-append writes, a temporary file is created in the same directory as the target file. - All write operations are performed on the temporary file first. 2. **Atomic Commit:** - The temporary file is only renamed to the target path during `Commit()`, ensuring atomic replacement. - If `Commit()` fails, the temporary file is cleaned up. 3. **Error Handling:** - `Cancel()` properly removes temporary files if the operation is aborted. - `Close()` is made idempotent to handle multiple calls safely. 4. **Data Integrity:** - Directory sync after rename ensures metadata persistence. - Proper file flushing and syncing before rename operations. Signed-off-by: Oded Porat <onporat@gmail.com>
|
@milosgajdos Could you please approve the workflows? Thank you. |
…cess is interrupted, the solution involves writing to a temporary file first and then atomically renaming it to the target file. This ensures that the target file is only updated if the write completes successfully, preventing empty or partially written files. **Explanation:** 1. **Temporary File Creation:** The content is first written to a temporary file (appending `.tmp` to the original path). This ensures that the original file remains intact until the write is complete. 2. **Write to Temporary File:** Using the existing `Writer` with truncation (`false`), the content is written to the temporary file. If the write fails, the temporary file is closed and deleted. 3. **Commit and Rename:** After successfully writing to the temporary file, it is committed. Then, the temporary file is atomically renamed to the target path using `Move`, which is handled by the filesystem's rename operation (atomic on most systems). 4. **Cleanup on Failure:** If any step fails, the temporary file is cleaned up to avoid leaving orphaned files. Signed-off-by: Oded Porat <onporat@gmail.com>
|
@milosgajdos I updated the PR, could you please approve the workflows? Thank you. |
|
@milosgajdos Hello, could you please let me know whom I should ask to review this PR? Thanks. |
There was a problem hiding this comment.
Pull Request Overview
This pull request addresses issue #4478 by introducing a safer file write mechanism that writes data to a temporary file before atomically replacing the target file to prevent empty or partially written files.
- Use a temporary file (subPath + ".tmp") for data writes
- Atomically rename the temporary file to the final target upon successful write
- Clean up the temporary file if any write or rename errors occur
| // Write to a temporary file to prevent partial writes. | ||
| writer, err := d.Writer(ctx, tempPath, false) |
There was a problem hiding this comment.
Is there anything in the Writer and/or driver that's created that handles concurrency? (Mostly considering here that before this code, the destination would be a deterministic path, which (could be? need to check the implementation) used to prevent concurrent writes to the same file.
With this change, the content is written to a temporary file (no means to check concurrency), then a os.Rename()
There was a problem hiding this comment.
The Writer is not concurrency-safe; it does not implement locking based on the filename. Concurrency handling is left to the caller, specifically the scheduler in this context.
|
Hi @thaJeztah , thank you for your comment on my PR. I noticed that this project requires two approvals before merging, and @milosgajdos has already approved the changes. Since you've already taken the time to look at the code and provide feedback, would you be able to formally review the PR when you have a moment? I'd appreciate your help in moving this forward. Thanks for your time and consideration! |
| return errors.Join(err, dErr) | ||
| } | ||
|
|
||
| if err := writer.Commit(ctx); err != nil { |
There was a problem hiding this comment.
-
I wonder why we can
Commitbefored.Move? IIUCCommitshould be the last step to confirm that the write has completed. -
I also wonder why we don't attempt to clean up the temporary file on Commit error?
To address the issue where empty files are created when the write process is interrupted, the solution involves writing to a temporary file first and then atomically renaming it to the target file. This ensures that the target file is only updated if the write completes successfully, preventing empty or partially written files.
Explanation:
Temporary File Creation: The content is first written to a temporary file (appending
.tmpto the original path). This ensures that the original file remains intact until the write is complete.Write to Temporary File: Using the existing
Writerwith truncation (false), the content is written to the temporary file. If the write fails, the temporary file is closed and deleted.Commit and Rename: After successfully writing to the temporary file, it is committed. Then, the temporary file is atomically renamed to the target path using
Move, which is handled by the filesystem's rename operation (atomic on most systems).Cleanup on Failure: If any step fails, the temporary file is cleaned up to avoid leaving orphaned files.