
Support parallel streaming upload to s3 #105

Closed · gebi opened this issue Nov 9, 2016 · 4 comments

gebi commented Nov 9, 2016

E.g. as implemented in https://github.com/rlmcpherson/s3gof3r:

The other feature that isn’t available in most other S3 clients is pipeline support, which is made easy with Go’s reader and writer interfaces. This allows usage like
$ tar -czf - <my_dir/> | gof3r put -b <s3_bucket> -k <s3_object>
$ gof3r get -b <s3_bucket> -k <s3_object> | tar -zx
We use the command line tool at CodeGuard to transfer many terabytes into and out of S3 every day, tarring directories in parallel with the uploads and downloads.

Besides parallel upload in general, streaming upload is a really handy feature for large file transfers.
That project also has the added benefit of much more robust error handling and parallel upload/download.

I'm posting this here in the spirit of providing feedback, since e.g. parallel multipart uploading might influence the API and the configuration it would require if added later on.
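
For reference, here is a minimal sketch of the streaming upload described above, following s3gof3r's documented PutWriter API (bucket and key names are placeholders):

package main

import (
	"io"
	"log"
	"os"

	"github.com/rlmcpherson/s3gof3r"
)

func main() {
	// Read AWS credentials from the environment.
	k, err := s3gof3r.EnvKeys()
	if err != nil {
		log.Fatal(err)
	}
	b := s3gof3r.New("", k).Bucket("my-bucket")

	// PutWriter returns an io.WriteCloser that uploads parts in
	// parallel as they are written, so the full object is never
	// buffered in memory.
	w, err := b.PutWriter("my-key", nil, nil)
	if err != nil {
		log.Fatal(err)
	}
	// Stream stdin to S3, e.g.: tar -czf - my_dir/ | ./streamput
	if _, err := io.Copy(w, os.Stdin); err != nil {
		log.Fatal(err)
	}
	// Close completes the multipart upload.
	if err := w.Close(); err != nil {
		log.Fatal(err)
	}
}

This is the library equivalent of the gof3r put pipeline quoted above; any Stow support for streaming would presumably expose a similar io.WriteCloser, or accept an io.Reader of unknown length.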

matryer (Contributor) commented Nov 9, 2016

That looks great. Perhaps https://github.com/rlmcpherson/s3gof3r is a better client to use?

gebi (Author) commented Nov 10, 2016

s3gof3r is only for object transfer; e.g. it has no implementation for listing.

dt commented Dec 8, 2016

This would be pretty useful for my use case. I'm currently maintaining a little abstraction layer for writing database backups to S3, Google Cloud Storage, HTTP, or local disk, and was looking at moving to Stow. However, we have hundreds of ~100MB files, so it'd be rough to reserve gigabytes of RAM to buffer them for upload on a running production DB server (where that RAM is usually otherwise in use); streaming is pretty important to us.
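
To illustrate the shape of API that use case needs: the backup can be produced into one end of an io.Pipe while a streaming upload consumes the other end, so memory stays bounded regardless of file size. A minimal sketch; PutStream and the stand-in backend are hypothetical, not part of Stow's current API:

package main

import (
	"fmt"
	"io"
)

// streamer models the hypothetical streaming upload this issue asks
// for: it consumes an io.Reader of unknown length, uploading parts
// in parallel, so callers never buffer a whole file in memory.
type streamer interface {
	PutStream(key string, r io.Reader) error
}

// backupTo streams a dump straight into storage via io.Pipe: the
// producer goroutine writes while PutStream reads, with only the
// pipe's small in-flight buffer held in RAM.
func backupTo(s streamer, key string, dump func(io.Writer) error) error {
	pr, pw := io.Pipe()
	go func() {
		// Propagate any producer error to the reader side;
		// CloseWithError(nil) just signals EOF.
		pw.CloseWithError(dump(pw))
	}()
	return s.PutStream(key, pr)
}

// discardStreamer is a stand-in backend so the sketch compiles and runs.
type discardStreamer struct{}

func (discardStreamer) PutStream(key string, r io.Reader) error {
	n, err := io.Copy(io.Discard, r)
	fmt.Printf("uploaded %d bytes to %s\n", n, key)
	return err
}

func main() {
	err := backupTo(discardStreamer{}, "backups/db.tar.gz", func(w io.Writer) error {
		_, err := io.WriteString(w, "pretend this is a 100MB dump\n")
		return err
	})
	if err != nil {
		fmt.Println("backup failed:", err)
	}
}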

matryer (Contributor) commented Dec 9, 2016 via email
