box: multipart upload problems #2054
I think there is more than one thing going on here...

In rclone.log.1.txt we see an error which means that rclone didn't wait long enough for the file to be finalised. It is likely that if rclone had waited longer it would have worked. On the retry we see a different error, which I think must be the token expiring while multiple threads refreshed it. The box refresh token is only valid once though, so the remaining threads got this error. That is my guess as to what is going on, although this explanation doesn't quite add up.

In rclone.log.2.txt we see the same thing, with rclone not waiting long enough. In rclone.log.3.txt we see the token expiring and the refresh not working.

rclone.log.4.beta.txt is the not-waiting-long-enough problem again.

Conclusion: so, I think for big files transferred to box, increasing --low-level-retries should help. It might also be that rclone needs to retry itself when that happens. If you can send a log with that happening that would be very helpful :-)
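The single-use refresh token explains the race described above: whichever thread redeems the token first wins, and every other in-flight refresh fails. A minimal sketch of the usual fix, serialising refreshes behind a lock so only one thread performs the exchange (the `refresh` callable is a hypothetical stand-in for the OAuth2 token-exchange request, not rclone's actual implementation):

```python
import threading

class TokenSource:
    """Serialise token refreshes: only one thread redeems the single-use
    refresh token; all other threads reuse the result."""

    def __init__(self, refresh):
        self._refresh = refresh          # exchanges refresh token for a new pair
        self._lock = threading.Lock()
        self._token = None
        self._expired = True

    def token(self):
        with self._lock:
            if self._expired:
                # Only the first thread to get here actually refreshes;
                # later threads see _expired == False and skip the exchange.
                self._token = self._refresh()
                self._expired = False
            return self._token
```

Without the lock, several goroutines/threads hitting an expired token simultaneously would each try the exchange, and all but one would get exactly the "token is only valid once" failure described above.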
Can you experiment with different values of --low-level-retries?
Just an FYI, I've seen the same issue. It happens more often when Box is either busy or perhaps scaling down for the night - generally some time in the evening, around 8pm. I've had retries go up to 15.

I have ~6TB in Box right now, streaming at ~400Mbps; night time slows it down to ~160Mbps (we have 10Gbps to Box.com, so there seems to be some single-client limit). I'm uploading 50TB in the next couple of days and I'll try to log what I can, but upping retries does work thus far. The number of threads doesn't seem to matter. For large numbers of large files you can use transfers=16; for small files (4kB), transfers above 12 results in throttling issues (probably because you're hammering them with many requests).
Hi Nick, my name is Johannes; I work for Box. I found this entry via Google after not being able to find any errors on the Box side, although the transfers still failed for certain larger files. I followed your recommendation and increased the low level retries to 100, and that seems to have fixed the issue.
Nice to hear from you Johannes @jmfrank63. The default for --low-level-retries is 10, which isn't quite enough time to be sure box has re-assembled the parts. I'd like to find the minimum value of --low-level-retries which produces a reliable result - could you help with that? Then I can tune rclone's retries accordingly.
Hi Nick, no problem, I'll find it - I guess it's the perfect case for a binary search approach ;-)
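Finding the minimum reliable retry count really is a binary-search problem, assuming success is monotone in the retry count (too few retries fails, enough retries succeeds). A minimal sketch, where the `succeeds` predicate is a hypothetical stand-in for running a real upload at a given --low-level-retries value:

```python
def min_reliable_retries(low, high, succeeds):
    """Smallest value in [low, high] for which succeeds(n) is True,
    assuming succeeds is monotone: False below some threshold, True above."""
    while low < high:
        mid = (low + high) // 2
        if succeeds(mid):
            high = mid        # mid works; a smaller value might too
        else:
            low = mid + 1     # mid fails; we need more retries
    return low
```

With the bounds from this thread (10 known to fail, 100 known to work) this needs at most seven trial uploads to pin down the threshold.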
Here is a spreadsheet with the number of retries for each file and the time intervals it says it will retry at. You can try making some histograms and predictions. The maximum number I got was 13, with no failures. This was a 5.85 TB transfer of 456 items.
@guruevi that is really useful, thank you. At the moment the number of retries is --low-level-retries, which is 10 by default. I was thinking I'd multiply that by 2 so the default number of retries is 20, but the user still has some control over it by adjusting --low-level-retries. 20 seems like a safe margin over 14. What does everybody think about this approach? I don't really want to make yet another parameter for rclone if I don't have to!
This is what I found at my first tries as well. With 20 I got no failures, but I haven't had a chance to go between 10 and 20 yet. However, the failure with 10 was very reliably reproducible. In my experience the number only matters in case of a failure: if the upload is successful there won't be any retries, regardless of the parameter setting, so 20 should be fine.
Empirically it was discovered that 20 tries is enough, so make this the minimum value, with larger values settable with --low-level-retries. Fixes #2054
OK, what I've decided to do is make the minimum value 20; if you set --low-level-retries larger than that, it will use that value instead. I've done this in https://beta.rclone.org/branch/v1.42-029-g74ab37f7-fix-2054-box-upload/ (uploaded in 15-30 mins). Can you give it a quick test, and if it looks good I'll merge it.
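The decision above amounts to putting a floor on the retry count for the commit step. A one-line sketch of the rule (the function name is illustrative, not rclone's actual code):

```python
def commit_retries(low_level_retries, floor=20):
    """Use at least `floor` tries for the multipart commit, but honour a
    larger user-supplied --low-level-retries value."""
    return max(low_level_retries, floor)
```

So the default of 10 becomes 20 for the commit step, while an explicit --low-level-retries 100 is respected as-is.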
Hi Nick, I just bumped into a file of a little over 4GB that did not upload with 20 retries. To make sure it is 20 that is failing, I added the parameter and set the value explicitly to 20, and it still failed; with 100 it worked. I would like to do a little more investigation into why this is taking so long and what the possible problem might be. As I work in User Services I would like to open a ticket on your behalf. May I use your support email address given for rclone?
@jmfrank63 Thanks for testing.

Hmm...

The server sends a time the client should retry at, so it depends... If the server doesn't send a time then each retry takes 10s. I haven't changed the value for that.

Yes, by all means use nick@craig-wood.com to open a ticket. I'm wondering whether we need something similar to the flag we put in for Amazon Drive.

I think this is the same problem - the assembly of all the parts of a big file takes some time; the bigger the file, the more time. It might be that I should set the number of retries really large (say 100). The max retries protects against things getting in a loop, but if box is still responding sensibly to the requests then there is an argument that rclone should carry on trying.
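The retry behaviour being discussed can be sketched as a loop that keeps polling the commit endpoint, sleeping for the server-suggested interval when one is given and a fixed fallback otherwise. This is an illustration of the idea, not rclone's actual code; `commit` is a hypothetical stand-in for the multipart-commit request:

```python
import time

def commit_with_retries(commit, max_tries=100, default_wait=10.0):
    """Poll until the commit succeeds or the try budget is exhausted.

    `commit` returns ("ok", result) on success, or ("retry", seconds)
    where seconds is the server's suggested wait (None if it sent none)."""
    for _ in range(max_tries):
        status, value = commit()
        if status == "ok":
            return value
        # Honour the server's hint; otherwise fall back to a fixed wait.
        time.sleep(value if value is not None else default_wait)
    raise RuntimeError("multipart commit failed after %d tries" % max_tries)
```

A large max_tries is relatively safe here because the loop only keeps going while the server is still answering sensibly with a retry hint; a hard failure would surface as an error from the commit call itself.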
I think what I'll do is make a new variable that can be set independently of --low-level-retries.
I've made the new flag and defaulted it to 100 here: https://beta.rclone.org/v1.42-099-g751bfd45/ (uploaded in 15-30 mins). This will be in the latest beta and in v1.43.
rclone fails to upload some of the larger files to Box. My observation is that it typically happens for the exact same files (1-13GB in size), but seems to succeed eventually when restarted over and over. I am unsure if that is because a certain amount of time (days/weeks) has passed.

I ran four synchronisations in debugging mode today and attached all the log files. The errors differ slightly, but all mention "refresh token" or "multipart upload" errors. After many

commit multipart upload failed

errors, the following errors occur:

ERROR : box root 'Jos/0.85': Token refresh failed: couldn't list files: Get https://api.box.com/2.0/folders/450***84/items?fields=type%2Cid%2Csequence_id%2Cetag%2Csha1%2Cname%2Csize%2Ccreated_at%2Cmodified_at%2Ccontent_created_at%2Ccontent_modified_at%2Citem_status&limit=1000&offset=0: oauth2: cannot fetch token: 400 Bad Request
Response: {"error":"invalid_grant","error_description":"Refresh token has expired"}

I used to encounter the "range_overlaps_existing_part" error, but I didn't manage to reproduce it today in debugging mode:

ERROR : ******.trr: Failed to copy: multipart upload failed to upload part: Error "range_overlaps_existing_part" (416): Part overlaps with previously uploaded part: {id: "15AE83DA", "offset": 6106906624, "size": 8388608}
rclone.log.1.txt
rclone.log.2.txt
rclone.log.3.txt
rclone.log.4.beta.txt
My details: