WIP AFTP backend data upload corruption fix #3866
Merged
This is a proof-of-concept fix, since there is likely a more elegant solution. But it is a working fix, and a similar change may be needed for the ftp backend.
Currently the AFTP backend will seemingly complete a file transfer via async FTP and return success even though the transfer is still in progress. The symptom is an error stating that the hashes do not match.
I can reproduce the problem every time when not running in debug mode. Debug mode seems to introduce enough delay that the error never occurs, at least not for me.
There might be another timing issue between processes; I'm not sure. The simplistic solution is to poll the remote file size in a loop and proceed only once the size is correct or the allotted checking time has expired.
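To make the discussion concrete, here is a minimal sketch of the polling idea in Python (Duplicati's actual backend is C#; the names `wait_for_remote_size`, `get_size`, and the timeout/interval values are hypothetical illustrations, not the code in this PR):

```python
import time

def wait_for_remote_size(get_size, expected_size, timeout=30.0, interval=0.25):
    """Poll a remote-size callback until it reports the expected size
    or the timeout expires. Returns True if the size matched in time."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if get_size() == expected_size:
            return True
        time.sleep(interval)
    # One last check in case the final write landed right at the deadline.
    return get_size() == expected_size

# Fake server whose reported size grows on each query, mimicking the
# still-in-flight upload observed in the test data below.
class FakeServer:
    def __init__(self, final_size, chunk=1024):
        self.size, self.final, self.chunk = 0, final_size, chunk
    def get_size(self):
        self.size = min(self.size + self.chunk, self.final)
        return self.size
```

A fixed `sleep()` interval is the simplest choice; an exponential backoff on the interval, or the server's transfer-complete reply (if the async library exposes it), might be cleaner alternatives worth discussing.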
Can this solution serve as a stop-gap fix even if there is a more involved issue deeper in Duplicati?
Since this is a WIP and a discussion of the fix, what modifications to the check would others suggest? Is there a better approach than sleep()?
Real data is shown below to demonstrate how, after the async upload, the remote file size is initially smaller and continues to grow until it reaches the correct size. This was tested against a FileZilla Server installed on the same machine, with the data sorted by filename. At 250 ms intervals you'll see that, for the blocks, a number of rechecks generally occur.
aftp-fix-test.csv.txt