S3 streaming with s3 cp uses several GB of memory on upload #923

Closed
rlmcpherson opened this issue Sep 29, 2014 · 8 comments · Fixed by #924
Labels
bug, s3

Comments

@rlmcpherson

In testing the streaming upload feature implemented in #903, I found that it reads the entire stream into memory, causing very high memory usage for the tool. On an Ubuntu EC2 instance running the latest master branch, uploading a 9 GB file resulted in 6.5-6.9 GB of real memory usage.

Test command:

cat <large_file> | aws s3 cp - s3://bucket/key
@kyleknap
Contributor

Interesting. I will look into it. Also, on a side note, make sure you use --expected-size with its value in bytes. This ensures that the number of parts when uploading is less than 1000 (which is required for S3 uploads). We default to 5 MB chunks, so the threshold for using this parameter is about 5 GB.
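For reference, a hedged example of the kind of invocation described above, with --expected-size given in bytes (the 10 GB value is only an illustrative estimate, not a figure from this thread):

cat <large_file> | aws s3 cp - s3://bucket/key --expected-size 10737418240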

@rlmcpherson
Author

It's 10,000 parts max according to the documentation (http://docs.aws.amazon.com/AmazonS3/latest/API/mpUploadUploadPart.html), so that's a limit of ~50 GB at the minimum part size.
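Spelling out the arithmetic behind that ~50 GB figure (assuming the 5 MB default chunk means 5 MiB, which is an assumption on my part):

# 10,000 parts x 5 MiB per part
echo $((10000 * 5 * 1024 * 1024))   # 52428800000 bytes, roughly 48.8 GiB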

@kyleknap
Contributor

Yep that's right. The good news is that I have confirmed the bug, and it is a very easy fix. The wrong constant was being used to limit the amount of data in memory. It must have been changed when I rebased off develop to merge the original pull request.

When streaming a file for upload, the maximum memory usage you should expect is around 90 MB. For fast producers like cat, you will tend to see it reach that ceiling; for slower producers, memory usage will be noticeably lower. Memory usage does increase, though, if the file is larger than 50 GB, due to a bump in the chunk size.

Thanks for the catch! I will send a pull request out soon.
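One hedged way to sanity-check the memory ceiling described above on a Linux host is GNU time's maximum-resident-set-size report (the file and bucket names here are illustrative):

cat large_file | /usr/bin/time -v aws s3 cp - s3://bucket/key
# then read the "Maximum resident set size (kbytes)" line in the report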

@smboy

smboy commented Feb 13, 2015

It's a closed issue, but I'm still commenting. For some reason, this is not working on the EMR instance I'm using. Could you please let me know what might be wrong?

cat filename.csv | aws s3 cp - s3://test-store/test-bucket/folder/filename.csv

@jamesls
Member

jamesls commented Feb 13, 2015

What version of the CLI are you using? In what way is it not working? Do you have more information you can share?

@smboy

smboy commented Feb 13, 2015

The version of the CLI is:
aws --version
aws-cli/1.3.9 Python/2.6.9 Linux/3.14.20-20.44.amzn1.x86_64

Here is the error:
cat ins.csv | aws s3 cp - s3://test-store/test-bucket/folder/ins.csv
[Errno 2] No such file or directory: '/home/hadoop/testuser/-'
Completed 1 part(s) with ... file(s) remaining

@smboy

smboy commented Feb 16, 2015

I just spun up a new EMR instance and upgraded the AWS CLI to 1.7. This feature is working as expected. Sorry for the false alarm.

thanks!

@RRAlex

RRAlex commented Jun 19, 2019

--expected-size is only used to segment the upload and doesn't have to be the exact file size, right?
Because otherwise, doing tar ... | aws s3 cp - s3://... would become very difficult without writing to disk.
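For context, a hedged sketch of the kind of pipeline being asked about, with --expected-size passed as a rough upper-bound estimate in bytes (the path and the 100 GB figure are purely illustrative; whether an estimate is good enough is exactly the question above):

tar cf - /path/to/dir | aws s3 cp - s3://bucket/archive.tar --expected-size 107374182400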
