400 RequestTimeout - s3cmd is not re-trying #737

Closed

jblachly opened this issue May 8, 2016 · 6 comments

Comments


jblachly commented May 8, 2016

s3cmd version 1.6.1
SunOS cloudbackup 5.11 joyent_20160428T170316Z i86pc i386 i86pc Solaris

When doing an initial sync of about 100k files to S3:

s3cmd sync /path/to/photos s3://photo-archive-bucket/

.. plenty of successes for > 12 hours (averaging 600 kB/sec) ...
then:

WARNING: photos/Photos/20 - Mexico - Group1.jpg: Owner groupname not known. Storing GID=100 instead.
upload: 'photos/Photos/20 - Mexico - Group1.jpg' -> 's3://photo-archive-bucket/photos/Photos/20 - Mexico - Group1.jpg'  [15078 of 53083]
313221 of 313221   100% in   33s     9.00 kB/s  done
ERROR: S3 error: 400 (RequestTimeout): Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed.

I found the following (very old) issue in AWS' official repository:
aws/aws-sdk-js#281
It seems that, in most cases, the S3 error "400 RequestTimeout: Your socket connection to the server was not read from or written to within the timeout period. Idle connections will be closed." is related to an incorrect Content-Length header. However, I had not run with --debug, so I cannot confirm this.
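
As I understand that theory, the failure mode would be a request whose Content-Length header promises more body bytes than the client actually writes, so the server sits waiting for the rest of the body until its idle timeout fires. A minimal sketch of that mismatch (purely illustrative; the host and path are placeholders, and this is not how s3cmd builds its requests):

import http.client

# Purely illustrative: placeholder host/path, not an s3cmd code path.
conn = http.client.HTTPConnection('httpbin.org', 80, timeout=30)
conn.putrequest('PUT', '/put')
conn.putheader('Content-Length', '313221')  # promise 313221 bytes of body...
conn.endheaders()
conn.send(b'x' * 1000)                      # ...but write only 1000 of them
# The server is now waiting for the remaining bytes; S3 reportedly answers
# this situation with 400 RequestTimeout once its idle timeout expires.
try:
    response = conn.getresponse()
    print(response.status, response.reason)
except Exception as exc:
    print('request failed:', exc)
finally:
    conn.close()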

Importantly (and this is the reason I am filing this issue), I also found the following on the s3tools mailing list:
https://sourceforge.net/p/s3tools/mailman/message/25713745/
This indicates that the code had been fixed to retry on 400 status codes.

Indeed, looking at the s3cmd source code:
https://github.com/s3tools/s3cmd/blob/master/S3/S3.py#L1289
It should be retrying (up to 5 times in total, since the S3 class initializes _max_retries to 5), but it is not. Why not?

The 9.00 kB/s rate (my other files all transmitted at > 600 kB/sec) suggests that perhaps I had an intermittent internet connectivity issue. Even so, s3cmd should have retried several more times.

Looking at the code path, this could happen if:

  1. retries is initialized as 0 -- I don't see this happening, but I could have missed it
  2. if err.code in [ 'BadDigest', 'OperationAborted', 'TokenRefreshRequired', 'RequestTimeout' ]: evaluates to false, which would then skip the if try_retry: block and fall through to raise S3Error(response).

Condition 2 could happen if the response did not have a 'data' key, or if 'data' was empty (a rough sketch of this path follows the link below):
https://github.com/s3tools/s3cmd/blob/master/S3/Exceptions.py#L52
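
To make condition 2 concrete, here is a rough sketch of the path as I read it. This is not the actual s3cmd source (extract_code() and the response dict shape are stand-ins for illustration), but it shows how an empty 400 response body would leave err.code unset, make the membership test false, and skip the retry:

# Simplified sketch, NOT the real s3cmd code.
RETRYABLE_CODES = ['BadDigest', 'OperationAborted',
                   'TokenRefreshRequired', 'RequestTimeout']

class S3Error(Exception):
    def __init__(self, response):
        self.status = response.get('status')
        self.code = None
        data = response.get('data') or b''
        if data:
            # stand-in for the real XML error parsing in Exceptions.py
            self.code = extract_code(data)

def extract_code(data):
    # crude stand-in: pull the text between <Code> and </Code>, if present
    text = data.decode('utf-8', errors='replace')
    start, end = text.find('<Code>'), text.find('</Code>')
    return text[start + len('<Code>'):end] if 0 <= start < end else None

def send_with_retries(do_request, retries=5):
    response = do_request()
    if response.get('status', 0) >= 400:
        err = S3Error(response)
        try_retry = retries > 0 and err.code in RETRYABLE_CODES
        if try_retry:
            return send_with_retries(do_request, retries - 1)
        # condition 2: empty body -> err.code stays None -> try_retry is
        # False, so we fall straight through to the raise without retrying
        raise err
    return response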

Any other suggestions for debugging?


jblachly commented May 8, 2016

Oh, one additional clue.
After re-running the same sync command, s3cmd scans all files, local and remote, and then resumes. Interestingly, even though the last line of the previous run indicated "100%" had been uploaded, the first file uploaded in the new run is that same file.

WARNING: photos/Photos/20 - Mexico - Group1.jpg: Owner groupname not known. Storing GID=100 instead.
WARNING: Module python-magic is not available. Guessing MIME types based on file extensions.
upload: 'photos/Photos/20 - Mexico - Group1.jpg' -> 's3://photo-archive-bucket/photos/Photos/20 - Mexico - Group1.jpg'  [1 of 37540]
313221 of 313221   100% in    0s   605.93 kB/s  done

This time the re-run only lasted about 30 minutes; when it aborted again, I went to my S3 bucket and found that the last file (an entirely different ~250k photograph) had not been stored in the bucket.

Next time I'll run with --debug


jblachly commented May 9, 2016

OK, I captured a failure with --debug. Is there any security risk in pasting it here (since it contains all the x-amz headers, etc.) if I scrub out my bucket name?


fviard commented May 9, 2016

Thank you for your investigation of the issue.
The only things you need to take care of are your access key, secret key, and token, which are printed with the configuration at the beginning of the log.
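
For example, something along these lines should be enough to redact those values before pasting (assuming they appear as access_key / secret_key / access_token lines in the configuration dump; adjust the pattern to whatever your log actually shows):

import re
import sys

# Redact lines like "access_key = ..." in an s3cmd --debug log.
# The key names here are an assumption; extend the pattern if your log
# shows the credentials under other names.
SENSITIVE = re.compile(r'^(\s*(access_key|secret_key|access_token)\s*=\s*).*$',
                       re.IGNORECASE)

with open(sys.argv[1]) as log:
    for line in log:
        sys.stdout.write(SENSITIVE.sub(r'\1<redacted>', line))

Run it as: python scrub.py s3cmd-debug.log > scrubbed.log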


fviard commented Sep 2, 2016

@jblachly Were you able to fix your issue? Otherwise don't hesitate to send the log here so we can have a look at it.


jblachly commented Sep 4, 2016

@fviard This only seems to happen during long uploads; I wasn't able (back in May) to re-capture the failure in a debug log, and my scheduled s3cmd syncs don't write debug logs.

I will try to do another large-scale sync by hand and capture it in a log. Thanks for your patience.
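
Something like this is what I have in mind for the manual run (assuming the --debug output goes to stderr, so it can be captured without mixing it into the normal output):

s3cmd sync --debug /path/to/photos s3://photo-archive-bucket/ 2> s3cmd-debug.log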

@jblachly

stale; will re-test in future
