Syncing from S3 to S3 or from S3 to local directory #344

Closed
webmat opened this Issue Jul 21, 2012 · 12 comments

webmat commented Jul 21, 2012

Is it possible to sync from one S3 bucket to another?

The way I understand syncers is that they let you specify a cloud destination, then in the 'directories' block, you specify directories local to your server.

This works if your data is on your server and you only back up to S3.

But there are other scenarios where syncing in the other direction would be very useful:

  • you need to restore from S3 to your server (and s3sync won't cut it, for the same reason this gem stopped using it)
  • your prod environment stores directly to S3 (e.g. with Paperclip or Carrierwave) and you want to back up the content of your prod bucket elsewhere (to another S3 bucket in another zone, to Rackspace Cloud Files, or to the server running the backup operation)

Has anyone else encountered this need? Is it already possible and I've simply missed how to do it?

Thanks!

I'm also curious about this. My data goes directly to S3 and I want to create a copy, not to handle an S3 failure (which really shouldn't happen), but to handle the situation where a bug deletes the original files.

I am in the exact same situation as @ryanstout.

dlackty commented Jan 5, 2013

@ryanstout @eric-smartlove I believe AWS's new Data Pipeline service is the solution for this situation.

Thank you @dlackty for pointing out this recent AWS feature; it is really interesting.

Although it looks overcomplicated at first sight for simply copying data, this tutorial can help a lot.

However, according to this message, AWS Data Pipeline does not seem to support copying a full bucket or directory, so it isn't yet a proper tool for easily backing up an application's data (unless all files are in one or a few directories).

For now, I will stick with copy/paste in the AWS console (!). For everyone else, see this thread, which gives some alternatives.

I really think the S3 syncer should provide the same functionality as Rsync::Pull. I'm not sure whether there is something that makes this technically hard to do.

I'd like to be able not only to mirror local directories on S3, but also to mirror an S3 directory on my local drive (i.e. update my local copy to reflect changes on S3). Or am I missing something and this is already possible?

webmat commented Feb 20, 2013

@brandonparsons have you tried s3cmd for this (the Python tool, not the old Ruby gem)? I've been using it successfully on many occasions. It does directory syncing, individual uploads/downloads (duh), creating and destroying buckets, and a bunch of CloudFront features too.

I've stretched it enormously with a big-ish number of files (860,000+, ~10 GB), but on a smaller scale it works great.

I also have a proof of concept for a command-line tool that can parallelize this and work at a greater scale, and also do bucket-to-bucket syncing. Since I don't need it anymore, though, it's been on the back burner for 6+ months... Anybody interested in that?
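
For readers wondering what a one-way bucket-to-bucket (or bucket-to-local) sync has to decide, here is a minimal sketch in Python. It is not webmat's tool and not how s3cmd works internally. It assumes both sides have already been listed into plain dictionaries mapping each key to a fingerprint such as an ETag or checksum. The name `plan_sync`, the `SyncPlan` type and the `delete_extra` flag are hypothetical, chosen for illustration.

```python
from typing import Dict, List, NamedTuple


class SyncPlan(NamedTuple):
    to_copy: List[str]    # keys missing from, or different in, the destination
    to_delete: List[str]  # keys that exist only in the destination


def plan_sync(source: Dict[str, str],
              dest: Dict[str, str],
              delete_extra: bool = False) -> SyncPlan:
    """Compare two listings (key -> fingerprint) and plan a one-way sync.

    Assumption: a differing fingerprint means the object changed.
    """
    # Copy anything the destination lacks or holds in a different version.
    to_copy = sorted(k for k, fp in source.items() if dest.get(k) != fp)
    # Only plan deletions when explicitly asked to mirror the source.
    to_delete = sorted(k for k in dest if k not in source) if delete_extra else []
    return SyncPlan(to_copy, to_delete)


if __name__ == "__main__":
    prod = {"a.jpg": "e1", "b.jpg": "e2", "c.jpg": "e3"}
    backup = {"a.jpg": "e1", "b.jpg": "old", "z.jpg": "e9"}
    print(plan_sync(prod, backup))
```

Deletions are off by default on purpose: for the backup use case raised earlier in this thread (a bug deleting the original files), mirroring deletions into the copy would defeat the point. The actual listing and copying would be done with whatever S3 client you use; each planned copy is independent of the others, which is what makes parallelizing them possible.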

I would be interested - I'm just trying to figure out the best way to give this a shot. I'll try the Python version and see if I have more luck!

@tombruijn tombruijn added the Question label Jul 24, 2014

Owner

tombruijn commented Jul 31, 2014

Hi all, I know this is quite an old issue, but I'm cleaning house in the Backup issue tracker.

I feel like this issue is not necessarily within the Backup gem's scope. If you're able to use Backup for this, great! Feel free to discuss your experience here, but I will close the issue.

@tombruijn tombruijn closed this Jul 31, 2014

So you mean the Backup gem is meant to back up local data (files) but not online data?
I understand that this feature is not implemented because of a lack of developers motivated to do so (and I plead guilty), but it seems a bit strange to me to say it's outside the scope.
In the future, there will probably be more and more online data and less local data.

Owner

tombruijn commented Jul 31, 2014

I see I read too quickly: this is not just about syncing data between S3 buckets or from S3 to other servers, but about projects that host certain data in another location such as S3.

In that case I have to say it's not a planned feature for v4, but I'll ping @meskyanichi to see if he finds it interesting for v5 and can maybe take it into consideration while designing it.

@tombruijn tombruijn reopened this Jul 31, 2014

@tombruijn tombruijn added this to the Version 5 milestone Aug 18, 2014

+1 for syncing from S3 to a local directory. Backup fits almost every need I have for a backup project I'm working on except that one.

@tombruijn tombruijn added Suggestion and removed Question labels Sep 2, 2014

@tombruijn tombruijn referenced this issue in backup/backup-features Sep 20, 2014

Open

Backing up S3 #9

Owner

tombruijn commented Sep 20, 2014

Issue moved to the features repository: meskyanichi/backup-features#9

@tombruijn tombruijn closed this Sep 20, 2014

@tombruijn tombruijn removed this from the Version 5 milestone Sep 20, 2014
