The company I work for ran into a problem where a sync failure ends up deleting all destination files. I looked into the issue and found that the same behavior was reported in issue #695.
While testing different failure cases, I found the following by running with the trace log level and adding an extra log to the source object retrieval section.
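For reference, the extra log is nothing more than a trace-style print on the error path of source object retrieval, roughly along these lines (the names below are placeholders, and s5cmd uses its own log package rather than the standard library logger):

```go
package command

import "log"

// listSourceObjects stands in for the source object retrieval step of sync.
// The only addition is the extra trace-style log on the error path, which
// makes listing failures visible before the sync decides what to delete.
func listSourceObjects(list func() ([]string, error)) ([]string, error) {
	objects, err := list()
	if err != nil {
		// Extra log added while debugging the reported sync failures.
		log.Printf("trace: source object retrieval failed: %v", err)
		return nil, err
	}
	return objects, nil
}
```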
The test environment is a Ceph Object Gateway containing some test files and folders, so I can quickly sync a test data set. In addition, I have a PHP script on my web server that returns the same error response I saw when the Ceph Object Store was in a bad state.
Test PHP file:
First, I tested network-related issues by rebooting my web server while s5cmd was in a retry loop, and I got the following error:
I added `RequestError` to the list of awsErr codes in `shouldStopSync` to make this error stop the sync.
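Roughly, the change amounts to something like the sketch below. This is illustrative only: the existing error codes and the exact shape of `shouldStopSync` in s5cmd may differ (the real function also handles other error types); the point is that `"RequestError"` is now treated as fatal.

```go
package command

import (
	"errors"

	"github.com/aws/aws-sdk-go/aws/awserr"
)

// shouldStopSync reports whether an error is severe enough to abort the whole
// sync instead of being treated as a transient, per-object failure.
// Sketch only; not the exact s5cmd implementation.
func shouldStopSync(err error) bool {
	if err == nil {
		return false
	}
	var awsErr awserr.Error
	if errors.As(err, &awsErr) {
		switch awsErr.Code() {
		case "AccessDenied", // placeholder for the codes already in the list
			"RequestError": // added: low-level request/network failures must stop the sync
			return true
		}
	}
	return false
}
```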
My next test is with the above PHP script, to reproduce the following error we got when Ceph was in a bad state:
Testing with my debug code, I get the following:
With the above, I have added `SerializationError` to the list in `shouldStopSync`. To ensure that cancelling the sync also cancels the plan run, I added the context as a parameter and added a context error check that cancels the plan when the context is cancelled.
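`SerializationError` goes into the same code list as `RequestError` in the earlier sketch. The plan-cancellation part can be sketched like this, again only as an illustration with made-up names (`planRun`, `job`); in the actual change the sync's context is threaded into the existing planning code and checked with `ctx.Err()`:

```go
package command

import "context"

// job stands in for one planned sync operation (an upload, download or delete).
type job struct{ name string }

// planRun is a simplified stand-in for the sync planning loop. It now takes
// the sync's context and stops emitting further operations (in particular
// destination deletions) as soon as that context is cancelled, for example
// because shouldStopSync decided to abort the sync.
func planRun(ctx context.Context, planned []job, out chan<- job) error {
	defer close(out)
	for _, j := range planned {
		if err := ctx.Err(); err != nil {
			// The sync was cancelled; do not plan any more work.
			return err
		}
		out <- j
	}
	return nil
}
```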
Now to test that the changes above actually fix the problem. Testing the first failure case (network-related issue):
Finally, the second failure case (S3 backend failure):