S3: Should put_object automatically split files larger than 5GB? #1123

Closed
Juboo opened this issue Jun 12, 2017 · 4 comments


Juboo commented Jun 12, 2017

S3 doesn't allow you to PUT an object larger than 5 GB in a single request. However, boto3 will let me run something like:

import boto3
s3 = boto3.resource('s3')
data = open('/6gbfile', 'rb')
s3.Bucket('myTestBucket').put_object(Key='6gbfile', Body=data)

After a minute or so it throws:
botocore.exceptions.ClientError: An error occurred (EntityTooLarge) when calling the PutObject operation: Your proposed upload exceeds the maximum allowed size

I must use split in order to upload my large files! Should boto3 handle this for the user?

Member

jamesls commented Jun 12, 2017

The put_object method maps directly to the PutObject API request in S3. boto3 offers higher-level abstractions that automatically manage multipart uploads for you. Docs here: http://boto3.readthedocs.io/en/latest/guide/s3.html#uploads. You can use either the upload_file or the upload_fileobj method. Let me know if you have any more questions.
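
As a rough illustration of the suggestion above, the snippet from the original report could be rewritten around upload_fileobj. This is only a sketch: the helper name upload_stream is made up for this example, and s3_client is assumed to be a regular boto3 S3 client such as boto3.client('s3').

```python
def upload_stream(s3_client, path, bucket, key):
    """Upload the file at `path` through boto3's managed transfer layer.

    `s3_client` is assumed to be a boto3 S3 client, e.g. boto3.client("s3").
    `upload_stream` itself is a hypothetical helper, not part of boto3.
    """
    # upload_fileobj expects a file-like object opened in binary mode.
    with open(path, "rb") as data:
        # Unlike put_object, upload_fileobj switches to a multipart upload
        # for large inputs, so no single request hits the 5 GB PutObject limit.
        s3_client.upload_fileobj(data, bucket, key)


# Usage (needs AWS credentials; names taken from the report above):
#   import boto3
#   upload_stream(boto3.client("s3"), "/6gbfile", "myTestBucket", "6gbfile")
```

The client is passed in rather than created inside the helper so the function can be exercised with a stub object and no network access.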

jamesls closed this as completed Jun 12, 2017

Author

Juboo commented Jun 13, 2017

I love you, I love this tool, you make my life easy


AndresUrregoAngel commented Mar 27, 2018

Guys, so according to you, the multipart upload is driven by boto3 automatically? Isn't an upload ID required to assemble the uploaded parts of the file in AWS S3? @jamesls

I was following the documentation, and it seems like multipart in the Python API is only used for Glacier? @kopertop
Thanks
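
For context on the upload ID question: at the S3 API level a multipart upload does carry an upload ID, and the low-level client exposes it. The managed upload_file and upload_fileobj methods are understood to run this sequence internally, which is why callers never see the ID. A minimal sketch of the manual sequence follows; multipart_upload and PART_SIZE are hypothetical names for this example, s3_client is assumed to be boto3.client('s3'), and the file is assumed to be non-empty.

```python
PART_SIZE = 8 * 1024 * 1024  # every part except the last must be at least 5 MiB


def multipart_upload(s3_client, path, bucket, key, part_size=PART_SIZE):
    """Upload `path` in parts, tracking the upload ID by hand.

    Hypothetical helper; `s3_client` is assumed to be boto3.client("s3").
    Returns the number of parts uploaded.
    """
    # Step 1: start the upload; S3 returns the ID that ties the parts together.
    upload_id = s3_client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    parts = []
    try:
        with open(path, "rb") as f:
            part_number = 1  # part numbers start at 1
            while True:
                chunk = f.read(part_size)
                if not chunk:
                    break
                # Step 2: send each chunk, remembering the ETag S3 hands back.
                resp = s3_client.upload_part(
                    Bucket=bucket, Key=key, PartNumber=part_number,
                    UploadId=upload_id, Body=chunk,
                )
                parts.append({"ETag": resp["ETag"], "PartNumber": part_number})
                part_number += 1
        # Step 3: ask S3 to stitch the parts into one object.
        s3_client.complete_multipart_upload(
            Bucket=bucket, Key=key, UploadId=upload_id,
            MultipartUpload={"Parts": parts},
        )
    except Exception:
        # Abort so the already-uploaded parts are discarded, then re-raise.
        s3_client.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
    return len(parts)
```

In most cases the managed methods are the simpler choice; the manual form is mainly useful when you need control over individual parts.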

@vyviansomaya

ClientError: An error occurred (EntityTooLarge) when calling the PutObject operation: Your proposed upload exceeds the maximum allowed size

I am trying to download the COCO dataset and upload it to an S3 bucket, and I get this error. How do I solve it? I believe the code I've written is correct.

`%%time
import boto3
import re
from sagemaker import get_execution_role
from sagemaker.amazon.amazon_estimator import get_image_uri

role = get_execution_role()

bucket='masterthesisvyvian' # customize to your bucket

containers = {'us-west-2' : '433757028032.dkr.us-west-2.amazonaws.com/image-classification:latest',
'us-east-1' : '811284229777.dkr.us-east-1.amazonaws.com/image-classification:latest',
'us-east-2' : '825641698319.dkr.us-east-2.amazonaws.com/image-classification:latest',
'eu-west-1' : '685385470294.dkr.eu-west-1.amazonaws.com/image-classification:latest'}

training_image = containers[boto3.Session().region_name]

print(training_image)

import os
import urllib.request
import boto3

def download(url):
    filename = url.split("/")[-1]
    if not os.path.exists(filename):
        urllib.request.urlretrieve(url, filename)

def upload_to_s3(channel, file):
    s3 = boto3.resource('s3')
    data = open(file, "rb")
    key = channel + '/' + file
    s3.Bucket(bucket).put_object(Key=key, Body=data)

download('http://images.cocodataset.org/zips/train2017.zip')
download('http://images.cocodataset.org/zips/test2017.zip')
download('http://images.cocodataset.org/zips/val2017.zip')
download('http://images.cocodataset.org/annotations/annotations_trainval2017.zip')
upload_to_s3('s3_train_key', 'train2017.zip')
upload_to_s3('s3_test_key', 'test2017.zip')
upload_to_s3('s3_val_key', 'val2017.zip')
upload_to_s3('s3_annotation_key', 'annotations_trainval2017.zip')`
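
The error in this script most likely comes from the same cause as the original report: upload_to_s3 calls put_object, and at least the COCO training archive is presumably larger than the 5 GB single-request limit. One possible fix, sketched below, is to call the bucket resource's upload_file instead. The extra s3_resource and bucket parameters are an assumption made here so the function stands alone; in the notebook above they would be boto3.resource('s3') and the bucket variable.

```python
def upload_to_s3(s3_resource, bucket, channel, file):
    """Upload a local file under the `channel/` prefix using the managed transfer.

    `s3_resource` is assumed to be boto3.resource("s3"); passing it and
    `bucket` explicitly is a change from the script above, made for this sketch.
    """
    key = channel + '/' + file
    # Bucket.upload_file takes the local filename and the destination key,
    # and splits large files into a multipart upload automatically.
    s3_resource.Bucket(bucket).upload_file(file, key)
    return key


# Usage with the names from the script above (needs AWS credentials):
#   s3 = boto3.resource('s3')
#   upload_to_s3(s3, bucket, 's3_train_key', 'train2017.zip')
```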
