Expensive daily s3 to disk replication #2889
The first thing to do is to use `--size-only` or `--checksum` so rclone doesn't need to read the modification time of every object (which takes a HEAD request each). If you want to reduce the ListBucket calls then try `--fast-list`. If you want to see what operations rclone is doing then run with `-vv --dump headers`. There are some hints to these things in the rclone docs, but maybe there should be an optimizing section in the s3 docs?
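To make those suggestions concrete, here is a sketch of the invocations (`s3:bucket` and `/s3/bucket` are placeholder names):

```sh
# Compare by size only: avoids the per-object HEAD request that
# reading S3 modification times from metadata requires
rclone sync s3:bucket /s3/bucket --size-only

# Or compare by checksum (MD5) instead, also avoiding the HEAD calls
rclone sync s3:bucket /s3/bucket --checksum

# Read the listing in as few ListBucket transactions as possible
rclone sync s3:bucket /s3/bucket --checksum --fast-list

# Inspect exactly which operations rclone performs
rclone sync s3:bucket /s3/bucket -vv --dump headers
```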
Thank you for the answer. I'll be testing the effects of this optimisation in the next few weeks. I think this is important information that should be present in the documentation, as the bill personally took me by surprise. There was also feedback on the Slack channel from someone who had a similar experience with Google Storage. Is the behaviour the same for all providers?
Great - let us know how it goes
Sorry :-( Let's try to draft some more words for the documents - do you want to have a go?
No, annoyingly the providers are all slightly different! A constant is that `--fast-list` will reduce the number of listing transactions on any bucket-based remote that supports it.
@kardaj did this save you money? I'm experiencing similarly high costs for requests and I use rclone for daily S3 backups.
@dertel Yes! Using the flags suggested above (`--size-only`/`--checksum` plus `--fast-list`) cut the cost of my daily syncs dramatically.
I'd quite like to put a section on reducing costs in the s3 docs. What do you think of this?

**Reducing costs**

By default rclone will use the modification time of objects stored in S3 for syncing. This is stored in object metadata, which unfortunately takes an extra HEAD request to read, and that can be expensive. To avoid this when using `rclone sync`, use the `--size-only` or `--checksum` flag so rclone can compare objects from the listing alone.

Rclone's default directory traversal is to traverse each directory individually. This takes one transaction per directory. Using the `--fast-list` flag will read all the object information in one (or more) large listing transactions instead, at the cost of using more memory. Note that if you are only copying a small number of files into a big repository then `--no-traverse` is a good idea, as it looks up the destination objects directly rather than listing everything.
If using `rclone mount` or `rclone serve`, consider the `--no-modtime` flag, which stops rclone reading the modification time for every object.
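Purely as an illustration of the flags mentioned in that draft (bucket and path names are placeholders):

```sh
# Cheap daily sync: size-only comparison plus one big listing pass
rclone sync s3:bucket /s3/bucket --size-only --fast-list

# Copying a few files into a huge bucket: look up the destination
# objects directly instead of listing every directory
rclone copy /local/new-files s3:bucket/incoming --no-traverse
```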
I think there should be an additional warning in a visible place in the documentation that points to the "reducing costs" section. The thing is, since the defaults are different from those of the `aws s3` CLI, people migrating from it can be caught out by the extra requests.
What are the defaults for the `aws s3` CLI?
Defaults according to the aws s3 cli docs: `aws s3 sync` decides whether to copy a file by comparing its size and last modified time, both of which come back in the bucket listing, so no per-object HEAD requests are needed.
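For anyone migrating, a rough rclone equivalent of those defaults might look like this (an approximation, not an exact match; see the s3 docs for the caveats around `--use-server-modtime`):

```sh
# Compare sizes, and treat the upload time returned in the listing
# as the modification time so no per-object HEAD request is needed
rclone sync s3:bucket /s3/bucket --update --use-server-modtime
```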
@kardaj thanks for the tip, worked for me.
I have added a section to the s3 docs all about reducing costs in 5063423. That solves the immediate problem here. In future, keeping a local db of changes or implementing change detection might help too, but neither of those is S3 specific.
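Until something like that lands, a do-it-yourself version of the "local db of changes" idea can be sketched in shell: keep the previous listing on disk and only sync when it changes (bucket and file names are placeholders):

```sh
# One recursive listing per run; sync only if anything changed.
# On the first run listing.old won't exist, so the sync always fires.
rclone lsf s3:bucket -R --format "ps" --fast-list > listing.new
if ! cmp -s listing.new listing.old; then
    rclone sync s3:bucket /s3/bucket --size-only --fast-list
    mv listing.new listing.old
fi
```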
Can I ask why `--fast-list` isn't on by default?
Hey there! This is an updated post from Slack a couple of weeks ago.

I have a couple of S3 buckets with hundreds of thousands of objects each, and I'm syncing them to a local server on a daily basis. I have been using `aws s3 sync` for a couple of years, and in the last few months I moved to `rclone` to improve sync reliability. This switch came with a noticeable bump in my bill: after breaking down the costs, each daily sync session costs an average of $10. From what I noticed, `rclone` is quite aggressive on ListBucket and HeadObject operations. Is there a way to tune down this behaviour through the configuration?

Here's some setup information:
```
$ rclone version
rclone v1.44
- os/arch: linux/amd64
- go version: go1.11.1
```
The commands I'm using look like this:
```
rclone sync s3://bucket /s3/bucket --verbose
```
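An aside for anyone else debugging this: the `--dump headers` output mentioned above can be filtered to count request types (a sketch only; the grep pattern assumes the default log format):

```sh
# Count the HEAD (HeadObject) requests a dry-run sync would issue
rclone sync s3:bucket /s3/bucket -vv --dump headers --dry-run 2>&1 \
  | grep -c 'HEAD /'
```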