Fails to copy 80 GB file from local to S3 #386

Open
zioproto opened this Issue Mar 10, 2016 · 33 comments

@zioproto

zioproto commented Mar 10, 2016

Uploading an 80 GB file from local to S3 fails with the following error.

2016/03/10 07:08:08 eng/googlebooks-eng-all-3gram-20120701-punctuation.gz: Failed to copy: MultipartUpload: upload multipart failed
        upload id: 2~ByQE9lp8oV7WI-L8MQq9igV1DuiHSXa
caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit

What is your rclone version (eg output from rclone -V)

rclone v1.28

Which OS you are using and how many bits (eg Windows 7, 64 bit)

  • Linux Ubuntu 14.04.4
  • Kernel Linux rclonevm 3.13.0-77-generic #121-Ubuntu SMP Wed Jan 20 10:50:42 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

Which cloud storage system are you using? (eg Google Drive)

S3 Interface, RadosGW (Ceph) v0.94.6

The command you were trying to run (eg rclone copy /tmp remote:tmp)

rclone sync /home/ubuntu/dataset/ unil:googlebooks-ngrams-gz/

A log from the command with the -v flag (eg output from rclone -v copy /tmp remote:tmp)

Transferring:
 * ...ooks-eng-all-3gram-20120701-punctuation.gz: 36% done. avg: 14501.5, cur: 14250.2 kByte/s. ETA: 1h42m41s

2016/03/10 07:08:08 eng/googlebooks-eng-all-3gram-20120701-punctuation.gz: Failed to copy: MultipartUpload: upload multipart failed
        upload id: 2~ByQE9lp8oV7WI-L8MQq9igV1DuiHSXa
caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit
@zioproto

zioproto commented Mar 10, 2016

ubuntu@rclonevm:~$ rclone -v --stats 15m copy /home/ubuntu/epflsftp/eng/googlebooks-eng-all-3gram-20120701-th.gz unil:googlebooks-ngrams-gz/eng/
2016/03/10 14:13:36 unil: Using v2 auth
2016/03/10 14:13:36 S3 bucket googlebooks-ngrams-gz path eng/: Modify window is 1ns
2016/03/10 14:13:36 S3 bucket googlebooks-ngrams-gz path eng/: Building file list
2016/03/10 14:13:38 S3 bucket googlebooks-ngrams-gz path eng/: Waiting for checks to finish
2016/03/10 14:13:38 S3 bucket googlebooks-ngrams-gz path eng/: Waiting for transfers to finish
2016/03/10 14:28:36
Transferred:   15812526080 Bytes (17150.49 kByte/s)
Errors:                 0
Checks:                 0
Transferred:            0
Elapsed time:     15m0.3s
Transferring:
 *      googlebooks-eng-all-3gram-20120701-th.gz: 18% done. avg: 17185.6, cur: 16638.7 kByte/s. ETA: 1h8m27s

2016/03/10 14:43:36
Transferred:   30796677120 Bytes (16704.76 kByte/s)
Errors:                 0
Checks:                 0
Transferred:            0
Elapsed time:     30m0.3s
Transferring:
 *      googlebooks-eng-all-3gram-20120701-th.gz: 35% done. avg: 16721.9, cur: 16428.0 kByte/s. ETA: 54m29s

2016/03/10 14:58:36
Transferred:   46446673920 Bytes (16796.94 kByte/s)
Errors:                 0
Checks:                 0
Transferred:            0
Elapsed time:     45m0.3s
Transferring:
 *      googlebooks-eng-all-3gram-20120701-th.gz: 54% done. avg: 16808.4, cur: 17594.5 kByte/s. ETA: 36m24s

2016/03/10 15:04:25 googlebooks-eng-all-3gram-20120701-th.gz: Failed to copy: MultipartUpload: upload multipart failed
        upload id: 2~QRAlRG23WGKesZJCgY2p4DrHmWvRcJT
caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit
2016/03/10 15:04:25 googlebooks-eng-all-3gram-20120701-th.gz: Removing failed copy
2016/03/10 15:04:28 Attempt 1/3 failed with 1 errors
@ncw

Owner

ncw commented Mar 10, 2016

That is interesting! The aws library is supposed to take care of that, but it obviously didn't...

What it looks like is that the file was uploaded with the default part size of 5 MB. According to my calculations, an 80 GB file would reach 10,000 parts at about 61%, which looks plausible from the log above. I make the ETA 4993s but it failed after 3047s, which is 61%, so that checks out!

I see the problem - the aws library only does its magic calculations if I'm passing an io.Seeker, but I'm not.

I'll upload a fix for you to try in a few minutes.
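:
For context, here is a minimal sketch of the behaviour described above, assuming aws-sdk-go v1's s3manager package. This is not rclone's actual code, and the path, bucket and key are illustrative: the point is that the uploader only grows PartSize automatically when the body it is given is seekable, because only then can it discover the total size up front.

    package main

    import (
        "fmt"
        "os"

        "github.com/aws/aws-sdk-go/aws"
        "github.com/aws/aws-sdk-go/aws/session"
        "github.com/aws/aws-sdk-go/service/s3/s3manager"
    )

    func main() {
        sess := session.Must(session.NewSession())

        // An *os.File is an io.Seeker, so the uploader can measure the body and,
        // if size/PartSize would exceed MaxUploadParts, grow PartSize to fit.
        // A bare io.Reader gives it no size, so it streams 5 MB parts and an
        // 80 GB file hits the 10,000-part ceiling around the 61% mark.
        f, err := os.Open("/home/ubuntu/dataset/eng/bigfile.gz") // illustrative path
        if err != nil {
            panic(err)
        }
        defer f.Close()

        uploader := s3manager.NewUploader(sess, func(u *s3manager.Uploader) {
            u.PartSize = s3manager.DefaultUploadPartSize // 5 MB default
            u.MaxUploadParts = s3manager.MaxUploadParts  // 10000
        })

        out, err := uploader.Upload(&s3manager.UploadInput{
            Bucket: aws.String("googlebooks-ngrams-gz"),
            Key:    aws.String("eng/bigfile.gz"), // illustrative key
            Body:   f,                            // seekable: total size is known
        })
        if err != nil {
            panic(err)
        }
        fmt.Println("uploaded to", out.Location)
    }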

@ncw ncw added this to the v1.29 milestone Mar 10, 2016

@ncw ncw closed this in a1323eb Mar 10, 2016

@ncw

Owner

ncw commented Mar 10, 2016

That should be fixed now. Here is a beta for you to try.

http://pub.rclone.org/v1.28-8-ga1323eb%CE%B2/

Thanks for reporting the problem

Nick

@zioproto

zioproto commented Mar 11, 2016

I used the beta version and I was able to upload the 80 GB file. Everything is OK.
I also tried a 130 GB file and everything is OK.
I tested both the copy and sync commands.

I think the issue can be closed.

Thank you

@ncw

Owner

ncw commented Mar 11, 2016

Thanks for reporting the bug and testing the fix

-- Nick

@dkorunic

dkorunic commented Mar 29, 2016

Hi,

This issue is still present for large files with the S3 storage backend:

2016/03/29 13:00:36 xxxxx.mp4: Failed to copy: MultipartUpload: upload multipart failed
upload id: 2~_5wSYsHuZhHJj251cD5qZkTMLJUnIb9
caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit

Is there any way to get around this limit and/or change partsize?
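:
As an illustration of what "Adjust PartSize to fit in this limit" means at the aws-sdk-go level (a sketch under stated assumptions, not rclone's code; rclone at the time exposed no user-facing part-size option, although much later releases added an --s3-chunk-size flag), the part size can be derived from the object size before the upload starts:

    package main

    import (
        "fmt"

        "github.com/aws/aws-sdk-go/service/s3/s3manager"
    )

    // partSizeFor picks a part size that keeps totalSize under maxParts parts,
    // never dropping below the 5 MB S3 minimum.
    func partSizeFor(totalSize, maxParts int64) int64 {
        ps := totalSize/maxParts + 1
        if ps < s3manager.MinUploadPartSize {
            ps = s3manager.MinUploadPartSize
        }
        return ps
    }

    func main() {
        // 80 GiB, as in the original report: needs roughly 8.2 MB parts rather
        // than the 5 MB default to stay under 10,000 parts.
        fmt.Println(partSizeFor(80<<30, s3manager.MaxUploadParts))
    }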

@zioproto

zioproto commented Mar 29, 2016

@dkorunic what version of the client are you using? This fix will be available in the 1.29 release, which is not out yet.

@dkorunic

dkorunic commented Mar 29, 2016

@zioproto I've tried the latest 1.28 betas (v1.28-8-ga1323ebβ and v1.28-9-g9dccf91β). Unfortunately neither works with S3 and huge uploads. Any suggestion?

@zioproto

zioproto commented Mar 29, 2016

What file size are you trying?

@dkorunic

dkorunic commented Mar 29, 2016

@zioproto ~2.64 GB. However, from what I see it might be related to the fact that I'm doing S3-to-S3 uploads, from one Ceph to another Ceph.

@zioproto

zioproto commented Mar 31, 2016

Your file is very small. The problem I had was with 80 GB. I think you are seeing a different problem here. Please file a new bug.

@mistur

mistur commented Feb 8, 2017

@dkorunic it seems I have the same bug in the same situation: copying a big file from S3 Ceph to S3 Ceph reaches the MaxUploadParts (10000) limit when I copy an 8 GB file:

caused by: TotalPartsExceeded: exceeded total allowed configured MaxUploadParts (10000). Adjust PartSize to fit in this limit

@zioproto

zioproto commented Feb 8, 2017

@ncw I was able to reproduce the bug. I think we should reopen this issue. @dkorunic was right! It looks like MaxUploadParts is exceeded even with a fairly small file, like 8 GB, when both source and remote are S3. The value of 5242880 is used for the part size.

@ncw

Owner

ncw commented Feb 8, 2017

I've just tested this from S3 to S3 (I disabled server side copy) by copying an 8 GB file from one bucket to another. I didn't see any problems.

An 8 GB file gives: PartSize = 5242880, size = 8590983168, MaxUploadParts = 10000

So this should take 8590983168/5242880 ≈ 1639 parts, well below 10,000.

@zioproto are you trying from Ceph to Ceph too? Is it possible this is a Ceph bug?

If you do -v --dump-headers you can see all the parts as they go by and count them, to see if they really do exceed 10,000.

And are you both using a recent version of rclone? Versions before v1.29 don't contain the fix to the aws s3manager library, which causes exactly this problem (see #415).
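:
A quick check of the arithmetic above, in plain Go, using the sizes mentioned in this thread: the 8 GB test file fits comfortably into 5 MB parts, while the original 80 GB file does not.

    package main

    import "fmt"

    func main() {
        const (
            partSize       = 5 * 1024 * 1024 // aws-sdk-go default part size
            maxUploadParts = 10000
        )
        for _, size := range []int64{8590983168, 80 << 30} { // 8 GB test file, ~80 GiB file
            parts := (size + partSize - 1) / partSize // round up
            fmt.Printf("size=%d -> %d parts at 5 MB (limit %d)\n", size, parts, maxUploadParts)
        }
        // Prints 1639 parts for the 8 GB file and 16384 parts for the 80 GiB file.
    }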

@mistur

mistur commented Feb 8, 2017

I have an nginx reverse proxy set up in front of the radosgw on one side; maybe there is something missing in the nginx conf?

@ncw

Owner

ncw commented Feb 8, 2017

@mistur Try the transfer with -v --dump-headers - that should tell you what is going on.

@zioproto

zioproto commented Feb 8, 2017

I see in the apache log of the destination radosgw that parts 1 through 9999 are written, and then the client gives up and issues a DELETE for the object.

192.26.29.0 - - [08/Feb/2017:12:31:55 +0100] "PUT /million-song/I.tar.gz?partNumber=9990&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 254 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:31:56 +0100] "PUT /million-song/I.tar.gz?partNumber=9991&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 254 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:31:56 +0100] "PUT /million-song/I.tar.gz?partNumber=9992&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 273 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:31:56 +0100] "PUT /million-song/I.tar.gz?partNumber=9993&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 254 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:31:57 +0100] "PUT /million-song/I.tar.gz?partNumber=9994&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 3405 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:31:57 +0100] "PUT /million-song/I.tar.gz?partNumber=9995&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 254 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:31:58 +0100] "PUT /million-song/I.tar.gz?partNumber=9996&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 254 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:31:58 +0100] "PUT /million-song/I.tar.gz?partNumber=9997&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 254 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:31:59 +0100] "PUT /million-song/I.tar.gz?partNumber=9998&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 254 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:32:00 +0100] "PUT /million-song/I.tar.gz?partNumber=9999&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 200 254 "-" "rclone/v1.34"
192.26.29.0 - - [08/Feb/2017:12:32:00 +0100] "DELETE /million-song/I.tar.gz?uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1" 204 198 "-" "rclone/v1.34"

@ncw

Owner

ncw commented Feb 8, 2017

@zioproto OK, so definitely 10,000 parts, which is interesting... What does rclone say if you do the transfer with -v --dump-headers? I think you should get a Content-Length header on each part which will show you how big the parts are.

How big is I.tar.gz? What do you get if you run rclone ls source_remote:million-song/I.tar.gz? Is that the number you expect?

@mistur

mistur commented Feb 8, 2017

Here is some debug log:

rclone_debug.log.zip

2017/02/08 11:09:18 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2017/02/08 11:09:18 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2017/02/08 11:09:18 HTTP RESPONSE (req 0xc420208000)
2017/02/08 11:09:18 HTTP/1.1 200 OK
Content-Length: 8165019608
Accept-Ranges: bytes
Connection: keep-alive
Content-Type: application/octet-stream
Date: Wed, 08 Feb 2017 10:09:18 GMT
Etag: "01bb5f0c5e8e3387f6b6afca7740c171-1558"
Last-Modified: Sun, 11 Sep 2016 21:00:19 GMT
Server: nginx/1.10.0 (Ubuntu)
X-Amz-Meta-Mtime: 1296240919
X-Amz-Request-Id: tx0000000000000000f1d64-00589aee4e-abfcbd-default

2017/02/08 11:09:21 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2017/02/08 11:09:21 >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
2017/02/08 11:09:21 HTTP REQUEST (req 0xc4202a84b0)
2017/02/08 11:09:21 PUT /bucket/I.tar.gz?partNumber=3&uploadId=2~afSgOhGMTMzK8NfpRahbnj8HVtI3AGr HTTP/1.1
Host: datasets.domain2.tld
User-Agent: rclone/v1.34
Content-Length: 5242880
Authorization: XXXX
Date: Wed, 08 Feb 2017 10:09:21 UTC
Accept-Encoding: gzip

@mistur

mistur commented Feb 8, 2017

In the log file, I read from datasets.domain.ch and write to datasets.domain2.tld.

@ncw

Owner

ncw commented Feb 8, 2017

This looks like a repeat of #415. In fact it looks possible that the AWS library maintainer has broken my fix...

Can you try this binary where I've reverted the S3 library to a version which I know was good please?

rclone-v1.35-76-gd091d4a-aws-sdk-revert.zip

@ncw ncw reopened this Feb 8, 2017

@mistur

mistur commented Feb 9, 2017

Hello,

The binary is dynamically linked, not statically:

rclone.v1.35 : ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, stripped
rclone.v1.35-76: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked (uses shared libs), not stripped

I get this kind of error:

Failed to copy: SerializationError: failed to decode S3 XML error response

or

2017/02/09 15:17:30 <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
2017/02/09 15:17:30 millionsongsubset.tar.gz: Failed to copy: SerializationError: failed to decode S3 XML error response
2017/02/09 15:17:30 Attempt 3/3 failed with 27 errors and: SerializationError: failed to decode S3 XML error response
2017/02/09 15:17:30 Failed to copy: SerializationError: failed to decode S3 XML error response

@mistur

mistur commented Feb 9, 2017

Some additional information: the transfer starts well, but for files bigger than 1 GB, after the first 1 GB the transfer stops or slows down so much that it becomes impossible to finish.

I think I found a workaround by disabling buffering in nginx. I don't know whether buffering really has to be disabled, or whether there is a bug somewhere and disabling buffering just avoids it.

proxy_buffering off;
proxy_request_buffering off;

Yoann

@ncw

Owner

ncw commented Feb 9, 2017

The binary is dynamically linked, not statically

It is a dev build - it shouldn't make any difference.

I'm expecting it to be producing errors - these were being masked before.

However I wasn't expecting Failed to copy: SerializationError: failed to decode S3 XML error response - does it do that consistently?

Did you see any other errors in the log? Can you run with -v and check please?

Thanks

Nick

@mistur

mistur commented Feb 9, 2017

Here is the log:

rclone_debug_2.txt

@ncw

Owner

ncw commented Feb 9, 2017

Thanks for that... It looks like I reverted too much.

Here is another attempt, with just the s3 uploader changes reverted:

rclone-v1.35-79-g50e190f-revert-s3-changes.zip

@mistur

mistur commented Feb 10, 2017

The problem is still present, but now I have another error:

2017/02/09 17:18:47 D.tar.gz: corrupted on transfer: sizes differ 8223584977 vs 1090510786
2017/02/09 17:18:47 D.tar.gz: Removing failed copy

rclone_debug_3.txt

This confirms that something stops the transfer after the first 1 GB. It could be the nginx reverse proxy config, because with these two options in the server section:

proxy_buffering off;
proxy_request_buffering off;

I don't have the problem anymore.

The strange thing is that with the official 1.35 I didn't have the error message below, but with 1.35-DEV I do. Also, with 1.35-DEV rclone is able to use round-robin DNS, while with the official binary it always goes to the same IP.

@ncw

Owner

ncw commented Feb 10, 2017

The problem is still present, but now I have another error:
2017/02/09 17:18:47 D.tar.gz: corrupted on transfer: sizes differ 8223584977 vs 1090510786
2017/02/09 17:18:47 D.tar.gz: Removing failed copy

That is what I was expecting to see.

According to your first log, the read side of the transfer stopped before the end. However, due to a bug in the s3 library it didn't notice and kept sending 0-sized parts until it hit the part number limit.

So that is a bug (in the s3 library) which needs fixing.

If you've managed to fix the underlying problem with the nginx settings then that is good too!

The strange thing is that with the official 1.35 I didn't have the error message below, but with 1.35-DEV I do. Also, with 1.35-DEV rclone is able to use round-robin DNS, while with the official binary it always goes to the same IP.

That is probably the difference between the dynamically linked version, which will use your system resolver, and the static one, which won't. It might also be down to the difference between Go 1.7 and 1.8.
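:
To illustrate the failure mode just described (a sketch only, not the aws library's actual code): if the part-splitting loop does not stop cleanly at end of input, a source that dies early can yield empty part after empty part until the 10,000-part limit is hit. A guard like the one below surfaces the truncation instead of padding it out.

    package main

    import (
        "errors"
        "fmt"
        "io"
        "strings"
    )

    const (
        partSize       = 5 * 1024 * 1024
        maxUploadParts = 10000
    )

    // splitParts chunks r into partSize pieces; it stops at EOF and never
    // emits an empty part.
    func splitParts(r io.Reader) (int, error) {
        buf := make([]byte, partSize)
        parts := 0
        for {
            n, err := io.ReadFull(r, buf)
            if err == io.EOF {
                return parts, nil // clean end of input, nothing left to send
            }
            if err != nil && err != io.ErrUnexpectedEOF {
                return parts, err // read failure: report it instead of padding with empty parts
            }
            if n == 0 {
                return parts, errors.New("zero-byte part: source truncated")
            }
            parts++
            if parts > maxUploadParts {
                return parts, errors.New("TotalPartsExceeded")
            }
            // a real uploader would PUT buf[:n] as part number `parts` here
            if err == io.ErrUnexpectedEOF {
                return parts, nil // short final part
            }
        }
    }

    func main() {
        // 12 MB of input produces exactly 3 parts and no error.
        parts, err := splitParts(strings.NewReader(strings.Repeat("x", 12*1024*1024)))
        fmt.Println(parts, err)
    }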

@mistur

mistur commented Feb 10, 2017

OK,

Is it possible to get an official dynamically linked binary, or must I build it from source?

@ncw

Owner

ncw commented Feb 10, 2017

Is it possible to get an official dynamically linked binary, or must I build it from source?

You'll need to build it from source.

The official build should do round robin DNS too - if it isn't then likely you have a local IP in the range of the destination I would guess...

...

Would you be able to try the transfer with this version of rclone, without your nginx fixes?

It should go wrong in the original way with the maxparts error, but it outputs lots of extra logging which will hopefully point at what exactly is going on. Then I can make a test case for the aws lib.

rclone-v1.35-80-g5b9f06d-386-s3-upload.zip

Thanks

Nick

@mistur

mistur commented Feb 10, 2017

You'll need to build it from source.

ok

The official build should do round robin DNS too - if it isn't then likely you have a local IP in the range of the destination I would guess...

yes.

Would you be able to try the transfer with this version of rclone, without your nginx fixes?

It should go wrong in the original way with the maxparts error, but it outputs lots of extra logging which will hopefully point at what exactly is going on. Then I can make a test case for the aws lib.

rclone-v1.35-80-g5b9f06d-386-s3-upload.zip

I have a transfer running right now; I will test this next week.

Thanks

Yoann

@ncw

Owner

ncw commented Feb 10, 2017

Thanks!

@zioproto

zioproto commented Apr 5, 2017

@mistur do you have any update on this? Thank you
