drive: increase pacer to 100ms by default #2880

Open
ncw opened this Issue Dec 30, 2018 · 17 comments

@ncw
Owner

ncw commented Dec 30, 2018

Rclone currently uses a minimum pacer time of 10ms for the Google Drive backend, which equates to 100 transactions per second.

However, the maximum allowed quota per user is 1000 transactions per 100 seconds. Even rclone's own key can't exceed this.

This means we are potentially doing more work than necessary, receiving more 403 errors, and increasing the chance of a ban.

I propose reducing the minimum pacer timeout to 100ms, which is equivalent to 10 transactions per second.

In support of this, here are some tests made with an otherwise unused client_id with the user quota set to 1000 transactions per 100s. They show uploading 100 small files and listing 2,400 files (using rclone size), each in a reasonably normal file system hierarchy.

These strongly support the idea that --tpslimit 10 is the correct limit.

[four screenshots of the upload and listing timings]

I haven't tested exactly how equivalent --tpslimit and the pacer limits are, though.

So, questions:

  1. Is this a good idea?
  2. Should the pacer be configurable by the user?
  • For drive, min-sleep is the only pacer parameter, but the other backends have more.

Thoughts @animosity22 and @B4dM4n ?

I'll do a test implementation with the default pacer set to 100ms, make it configurable, and post it here in a moment so we can experiment.

@ncw ncw added this to the v1.46 milestone Dec 30, 2018

@ncw ncw referenced this issue Dec 30, 2018

Merged

Update drive.md #2856


ncw added a commit that referenced this issue Dec 30, 2018

@ncw

Owner

ncw commented Dec 30, 2018

I've made a quick test here

https://beta.rclone.org/branch/v1.45-057-g02c84fce-fix-2880-drive-pacer-beta/ (uploaded in 15-30 mins)

This has a new flag, --drive-pacer-min-sleep; the default is 100ms.

Experimenting with this and comparing it to 10ms, I see that for small operations that don't normally return a 403, 100ms is slower than 10ms; as soon as the 403s kick in, the two are the same.

It looks like the threshold is about 100 transactions, so you can get ~100 transactions for "free" before the rate limiting kicks in. I wonder if we should attempt to model that. This is something https://godoc.org/golang.org/x/time/rate can already do, so maybe we should be using it in the pacer.

@animosity22

Contributor

animosity22 commented Dec 31, 2018

I think that's a great idea, as having more of the defaults set properly for Google Drive makes the 'out of the box' setup much easier.

From what I've seen, a lot of programs keep popping up with odd bugs that break things because of how they are set up and how they make requests. If we can prevent those issues, that helps everyone out.

@ajkis

ajkis commented Dec 31, 2018

Great, I will test it after New Year (can't access the server now and don't want to risk something going wrong).

@calisro

calisro commented Jan 2, 2019

Looks good on the surface. I'll have a go. I have services that I've kept at the default (smaller batches) and others where I've already raised tps to 10, so I'll see what impact this has. It's always nice to have things user-configurable, but that also means more questions and more people setting it incorrectly.

@ajkis

ajkis commented Jan 2, 2019

@ncw it seems one of the flags is broken in this version

Error: invalid argument "128MB" for "--drive-chunk-size" flag: parsing "128MB" as fs.SizeSuffix failed: strconv.ParseFloat: parsing "128M": invalid syntax

Results after a minute

2019/01/02 17:25:09 DEBUG : rclone: Version "v1.45-057-g02c84fce-fix-2880-drive-pacer-beta" starting with parameters ["/usr/bin/rclone" "sync" "dropbox:" "gdrive:" "--retries=5" "--transfers=2" "--checkers=2" "--tpslimit=2" "--drive-pacer-min-sleep=100ms" "--log-level=DEBUG" "--log-file=/home/plex/logs/syncgdrive.log"]
2019/01/02 17:25:09 DEBUG : Using config file from "/home/plex/.config/rclone/rclone.conf"
2019/01/02 17:25:09 INFO  : Starting HTTP transaction limiter: max 2 transactions/s with burst 1
2019/01/02 17:25:11 DEBUG : cloud.check: Size and modification time the same (differ by -333ms, within tolerance 1s)
2019/01/02 17:25:11 DEBUG : cloud.check: Unchanged skipping
2019/01/02 17:25:11 DEBUG : dbupload-file: Modification times differ by -3m41s: 2019-01-02 16:25:07 +0000 UTC, 2019-01-02 16:21:26 +0000 UTC
2019/01/02 17:25:11 DEBUG : wd: Size and modification time the same (differ by 0s, within tolerance 1s)
2019/01/02 17:25:11 DEBUG : wd: Unchanged skipping
2019/01/02 17:25:12 DEBUG : crypt/5mrsfc1j0uv1k431nbvsunre2c: Size and modification time the same (differ by 0s, within tolerance 1s)
2019/01/02 17:25:12 DEBUG : crypt/5mrsfc1j0uv1k431nbvsunre2c: Unchanged skipping
2019/01/02 17:25:12 DEBUG : crypt/sgb71ldpa5vv1npmdmbgmmkog0: Size and modification time the same (differ by 0s, within tolerance 1s)
2019/01/02 17:25:12 DEBUG : crypt/sgb71ldpa5vv1npmdmbgmmkog0: Unchanged skipping
2019/01/02 17:25:12 DEBUG : crypt/h118c1t3j6c4j3c5akht76nvgg: Size and modification time the same (differ by -282ms, within tolerance 1s)
2019/01/02 17:25:12 DEBUG : crypt/h118c1t3j6c4j3c5akht76nvgg: Unchanged skipping
2019/01/02 17:25:14 DEBUG : pacer: Rate limited, sleeping for 1.945344173s (1 consecutive low level retries)
2019/01/02 17:25:14 DEBUG : pacer: low level retry 1/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:14 DEBUG : pacer: Resetting sleep to minimum 100ms on success
2019/01/02 17:25:15 DEBUG : pacer: Rate limited, sleeping for 1.355006674s (1 consecutive low level retries)
2019/01/02 17:25:15 DEBUG : pacer: low level retry 1/1 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:15 DEBUG : dbupload-file: Received error: googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded - low level retry 1/10
2019/01/02 17:25:15 DEBUG : pacer: Rate limited, sleeping for 2.620814069s (2 consecutive low level retries)
2019/01/02 17:25:15 DEBUG : pacer: low level retry 2/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:18 DEBUG : pacer: Resetting sleep to minimum 100ms on success
2019/01/02 17:25:20 DEBUG : pacer: Rate limited, sleeping for 1.785896162s (1 consecutive low level retries)
2019/01/02 17:25:20 DEBUG : pacer: low level retry 3/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:20 DEBUG : pacer: Rate limited, sleeping for 2.693314168s (2 consecutive low level retries)
2019/01/02 17:25:20 DEBUG : pacer: low level retry 1/1 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:20 DEBUG : dbupload-file: Received error: googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded - low level retry 2/10
2019/01/02 17:25:20 DEBUG : pacer: Rate limited, sleeping for 4.661133794s (3 consecutive low level retries)
2019/01/02 17:25:20 DEBUG : pacer: low level retry 1/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:20 DEBUG : pacer: Rate limited, sleeping for 8.338948761s (4 consecutive low level retries)
2019/01/02 17:25:20 DEBUG : pacer: low level retry 4/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:22 DEBUG : pacer: Rate limited, sleeping for 16.010951265s (5 consecutive low level retries)
2019/01/02 17:25:22 DEBUG : pacer: low level retry 2/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:30 DEBUG : pacer: Resetting sleep to minimum 100ms on success
2019/01/02 17:25:47 DEBUG : crypt/mo3m3ugaa2kpcllh3kqkqclcs4/7fmc6u7cdkhg5g2q4b13qtg7dg: Size and modification time the same (differ by -184ms, within tolerance 1s)
2019/01/02 17:25:47 DEBUG : crypt/mo3m3ugaa2kpcllh3kqkqclcs4/7fmc6u7cdkhg5g2q4b13qtg7dg: Unchanged skipping
2019/01/02 17:25:47 INFO  : dbupload-file: Copied (replaced existing)
2019/01/02 17:25:54 DEBUG : pacer: Rate limited, sleeping for 1.926747841s (1 consecutive low level retries)
2019/01/02 17:25:54 DEBUG : pacer: low level retry 1/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:54 DEBUG : pacer: Rate limited, sleeping for 2.015651772s (2 consecutive low level retries)
2019/01/02 17:25:54 DEBUG : pacer: low level retry 2/10 (error googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded)
2019/01/02 17:25:55 DEBUG : pacer: Resetting sleep to minimum 100ms on success
@B4dM4n

Collaborator

B4dM4n commented Jan 3, 2019

@ncw it seems one of the flags is broken in this version

Error: invalid argument "128MB" for "--drive-chunk-size" flag: parsing "128MB" as fs.SizeSuffix failed: strconv.ParseFloat: parsing "128M": invalid syntax

@ajkis The unit symbol in fs.SizeSuffix is only one char, like 2b, 128m, 5G.

Experimenting with this and comparing it to 10ms, I see that for small operations that don't normally return a 403, 100ms is slower than 10ms; as soon as the 403s kick in, the two are the same.

It looks like the threshold is about 100 transactions, so you can get ~100 transactions for "free" before the rate limiting kicks in. I wonder if we should attempt to model that. This is something https://godoc.org/golang.org/x/time/rate can already do, so maybe we should be using it in the pacer.

Being able to burst requests is probably a really good feature when using rclone mount, as a default sleep time of 100ms would slow it down substantially when requests are only made sporadically.

The https://godoc.org/golang.org/x/time/rate package looks really promising for this.

@animosity22

Contributor

animosity22 commented Jan 3, 2019

@B4dM4n - I'm not quite sure how to test this, but can you actually burst against Google Drive, or do you get limited for that as well? I wasn't sure how to check whether you can do, say, 900 transactions in 1-2 seconds and none in the next 88-89 to stay within the 1000 per 100 seconds, or whether you just get limited anyway so that bursting doesn't work either.

@ajkis

ajkis commented Jan 4, 2019

Never mind - I did not have bwlimit set.

@ajkis

ajkis commented Jan 6, 2019

Ver: rclone: Version "v1.45-057-g02c84fce-fix-2880-drive-pacer-beta"

rclone sync dropbox: gdrive: --retries=5 --transfers=10 --checkers=4 --tpslimit=2 --drive-pacer-min-sleep=100ms --bwlimit=35M --drive-chunk-size=128M --log-level=INFO --log-file=$LogFile

Still getting

2019/01/06 08:46:32 ERROR : crypt/hm2ka8or5qp1pefhtc1e62jn50/05bcvetfol5rvlssrun8rlsacg: error reading destination directory: couldn't list directory: googleapi: Error 403: Rate Limit Exceeded, rateLimitExceeded
2019/01/06 08:47:30 INFO  :
Transferred:       50.817G / 1016.531 GBytes, 5%, 29.898 MBytes/s, ETA 9h11m15s
Errors:                 1 (retrying may help)
Checks:              6448 / 6448, 100%
Transferred:            8 / 76, 11%
Elapsed time:      29m0.4s
Transferring:
 *     crypt/0ibcad98vheuku…7lqjtei3kqgavdt2hkkg:  8% /69.469G, 3.294M/s, 5h31m7s
 *     crypt/0ibcad98vheuku…bku4sdjjnk61lubuelo8: 12% /47.144G, 3.362M/s, 3h30m25s
 *     crypt/0ibcad98vheuku…tgfh873a192i7hboog08: 11% /48.363G, 3.867M/s, 3h9m14s
 *     crypt/qbu2chuhnvlm4g…18d3dsln8lgmj5p6545u: 83% /2.649G, 3.446M/s, 2m10s
 *     crypt/qbu2chuhnvlm4g…b1knj85n0r0gcs9o08q8: 85% /2.650G, 3.338M/s, 2m0s
 *     crypt/qbu2chuhnvlm4g…f2fbefsfur20tg8939ku: 87% /2.646G, 3.400M/s, 1m36s
 *     crypt/qbu2chuhnvlm4g…o842gaamnge76pbom3im: 87% /2.649G, 3.509M/s, 1m36s
 *     crypt/qbu2chuhnvlm4g…p10621m4tbetrfticmqc: 45% /4.372G, 3.284M/s, 12m19s
 *     crypt/qbu2chuhnvlm4g…rtfadq11g82n9sqs6p8c: 83% /2.650G, 3.330M/s, 2m18s
 *     crypt/qbu2chuhnvlm4g…vlvmm8gbs28g8l3o407i: 84% /2.649G, 3.716M/s, 1m56s

@ajkis

ajkis commented Jan 7, 2019

It's funny that Google has not blocked me yet.
cmd:
/usr/bin/rclone sync dropbox: gdrive: --retries=5 --transfers=10 --checkers=3 --tpslimit=2 --drive-pacer-min-sleep=100ms --bwlimit=35M --drive-chunk-size=256M --log-level=INFO --log-file=$LogFile

[screenshot]

@calisro

calisro commented Jan 7, 2019

I wonder if that is reproducible or a fluke.

@ajkis

ajkis commented Jan 8, 2019

It seems some of the files are getting copied over and over again.
Full log: https://paste.ubuntu.com/p/9JrKjZBZjp/
Last log entry:

2019/01/08 10:10:30 INFO  : 
Transferred:   	    5.225T / 5.619 TBytes, 93%, 30.522 MBytes/s, ETA 3h45m6s
Errors:                28 (retrying may help)
Checks:            223176 / 223176, 100%
Transferred:          194 / 216, 90%
Elapsed time:  49h52m0.4s
Transferring:
 *     crypt/0ibcad98vheuku…1niqjke6df4of3ap9oh0: 19% /50.926G, 3.510M/s, 3h18m44s
 *     crypt/0ibcad98vheuku…1v9bdn1loi4mfdt77ou2: 20% /48.405G, 3.476M/s, 3h9m41s
 *     crypt/0ibcad98vheuku…2dpfmbm5m77f5fmm7psk: 21% /56.085G, 3.481M/s, 3h35m58s
 *     crypt/0ibcad98vheuku…7lqjtei3kqgavdt2hkkg: 17% /69.469G, 3.546M/s, 4h35m12s
 *     crypt/0ibcad98vheuku…bku4sdjjnk61lubuelo8: 26% /47.144G, 3.484M/s, 2h50m12s
 *     crypt/0ibcad98vheuku…fdpc49eorf4ccmm80da8: 21% /52.072G, 3.502M/s, 3h19m55s
 *     crypt/0ibcad98vheuku…op0aap34r59ehtli4sgg: 27% /41.127G, 3.525M/s, 2h25m3s
 *     crypt/0ibcad98vheuku…rmh25clh54bq54316uqg: 21% /53.052G, 3.494M/s, 3h24m31s
 *     crypt/0ibcad98vheuku…tgfh873a192i7hboog08: 25% /48.363G, 3.520M/s, 2h55m15s
 *     crypt/hm2ka8or5qp1pe…pfghhn1mgvulklkf8gi0: 80% /9.237G, 3.461M/s, 8m51s
@ajkis

ajkis commented Jan 9, 2019

I've made a quick test here

https://beta.rclone.org/branch/v1.45-057-g02c84fce-fix-2880-drive-pacer-beta/ (uploaded in 15-30 mins)

This has a new flag, --drive-pacer-min-sleep; the default is 100ms.

Experimenting with this and comparing it to 10ms, I see that for small operations that don't normally return a 403, 100ms is slower than 10ms; as soon as the 403s kick in, the two are the same.

It looks like the threshold is about 100 transactions, so you can get ~100 transactions for "free" before the rate limiting kicks in. I wonder if we should attempt to model that. This is something https://godoc.org/golang.org/x/time/rate can already do, so maybe we should be using it in the pacer.

Any chance you could merge this into beta so I can test future updates as well?

@calisro

calisro commented Jan 9, 2019

I think before the default of 100ms is merged, we should consider the bursting. It will have less impact on mounts and other infrequently used remotes if bursting was handled like @B4dM4n mentioned. Thoughts?

@ajkis

ajkis commented Jan 10, 2019

I think before the default of 100ms is merged, we should consider the bursting. It will have less impact on mounts and other infrequently used remotes if bursting was handled like @B4dM4n mentioned. Thoughts?

I agree, bursting could be a good way to get the most out of it.

P.S. How does bursting work with Dropbox?

@ncw

Owner

ncw commented Jan 22, 2019

https://beta.rclone.org/branch/v1.45-059-gf935acb2-fix-2880-drive-pacer-beta/ (uploaded in 15-30 mins)

This lets the pacer burst when it hasn’t used any calls recently - this should be what rclone mount needs.

The binary has a --drive-pacer-burst flag to control this.
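For example, a mount using both new flags might look like this (an illustrative invocation; the remote name and burst value of 100 are placeholders chosen to match the ~100-transaction threshold observed above):

```shell
# Allow ~100 "free" calls, then pace at 10 tps (100ms between calls).
rclone mount gdrive: /mnt/gdrive \
  --drive-pacer-min-sleep 100ms \
  --drive-pacer-burst 100
```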

@ncw

Owner

ncw commented Jan 23, 2019

I messed up the build, try

https://beta.rclone.org/branch/v1.45-103-gc9e50b32-fix-2880-drive-pacer-beta/ (uploaded in 15-30 mins)
