completing multipart upload #50

Closed
owenrumney opened this issue Jan 29, 2015 · 7 comments

@owenrumney

I'm having trouble completing a multipart upload.

Given the following test code:

mp = s.create_multipart_upload(Bucket='datalake.primary', Key='test1')
uid = mp['UploadId']
p1 = s.upload_part(Bucket='datalake.primary', Key='test1', PartNumber=1, UploadId=uid, Body='part_0')
s.complete_multipart_upload(Bucket='datalake.primary', Key='test1', UploadId=uid, MultipartUpload=???)

I don't know what I'm supposed to set MultipartUpload to, and I can't work it out from the docs. I see it needs to be a dict, but I'm not sure what it should contain.

Without it, I get this error: ClientError: An error occurred (InvalidRequest) when calling the CompleteMultipartUpload operation: You must specify at least one part

@danielgtaylor danielgtaylor added question documentation This is a problem with documentation. labels Jan 29, 2015
@danielgtaylor danielgtaylor self-assigned this Jan 29, 2015
@danielgtaylor
Member

@owenrumney this is really not obvious from the documentation, so it took me a few tries to get right. Multipart uploads require information about each part when you try to complete the upload. This is how you can accomplish it:

import boto3

bucket = 'my-bucket'
key = 'mp-test.txt'

s3 = boto3.client('s3')

# Initiate the multipart upload and send the part(s)
mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
part1 = s3.upload_part(Bucket=bucket, Key=key, PartNumber=1,
                       UploadId=mpu['UploadId'], Body='Hello, world!')

# Next, we need to gather information about each part to complete
# the upload. Needed are the part number and ETag.
part_info = {
    'Parts': [
        {
            'PartNumber': 1,
            'ETag': part['ETag']
        }
    ]
}

# Now the upload works!
s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=mpu['UploadId'],
                             MultipartUpload=part_info)

I'll see what can be done about updating the documentation upstream. Let me know if you have any other questions!

Also, you can enable low-level logging at any time with this:

boto3.set_stream_logger(name='botocore')
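
With its default arguments, that call logs at debug level, which can be noisy. A minimal sketch of a similar setup using only the standard logging module, with a configurable level and stream; the helper name enable_botocore_logging is hypothetical and not part of boto3:

```python
import logging
import sys


def enable_botocore_logging(level=logging.DEBUG, stream=sys.stderr):
    """Attach a stream handler to the 'botocore' logger and return the logger.

    Hypothetical helper: it only uses the standard logging module and
    assumes botocore logs under the 'botocore' logger name, as in the
    set_stream_logger call above.
    """
    logger = logging.getLogger('botocore')
    logger.setLevel(level)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(name)s [%(levelname)s] %(message)s')
    )
    logger.addHandler(handler)
    return logger
```

Passing level=logging.INFO keeps the debug-level messages out of the output.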

@owenrumney
Author

@danielgtaylor thanks, that's much better. I'd seen from the API docs that this was the general form, but it wasn't completely clear. If the documentation just detailed the structure of the dict, that would probably have been enough.

@blehman

blehman commented Dec 16, 2015

What is the ETag? The dict named part is not defined in this example.

@zWaR

zWaR commented Dec 18, 2015

The ETag is part of the response returned by s3.upload_part(). See the response structure in the docs: https://boto3.readthedocs.org/en/latest/reference/services/s3.html#S3.Client.upload_part

I guess the typo in the example is confusing you. part should be renamed to part1:

part_info = {
    'Parts': [
        {
            'PartNumber': 1,
            'ETag': part1['ETag']
        }
    ]
}
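
For more than one part, the same structure can be built from the list of upload_part responses. A minimal sketch, assuming the responses are kept in upload order and part numbers start at 1; build_part_info is a hypothetical helper name, not something boto3 provides:

```python
def build_part_info(responses):
    """Build the MultipartUpload dict from upload_part() responses.

    Hypothetical helper. `responses` is assumed to be a list of the
    dicts returned by s3.upload_part(), in upload order, so the first
    entry is part 1, the second is part 2, and so on.
    """
    return {
        'Parts': [
            {'PartNumber': number, 'ETag': response['ETag']}
            for number, response in enumerate(responses, start=1)
        ]
    }
```

The result can be passed directly as the MultipartUpload argument of s3.complete_multipart_upload().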

@yar06

yar06 commented Dec 5, 2017

Hi,
With the same code, if I add a for loop, it does not work.

import boto3

bucket = 'my-bucket'
key = 'mp-test.txt'

s3 = boto3.client('s3')

mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)

for i in range(1, 3):
    part = s3.upload_part(Bucket=bucket, Key=key, PartNumber=i,
                          UploadId=mpu['UploadId'], Body='Hello, world!')
    part_info = {
        'Parts': [
            {
                'PartNumber': i,
                'ETag': part['ETag']
            }
        ]
    }

s3.complete_multipart_upload(Bucket=bucket, Key=key, UploadId=mpu['UploadId'],
                             MultipartUpload=part_info)

Now it throws this error:
botocore.exceptions.ClientError: An error occurred (InvalidPart) when calling the CompleteMultipartUpload operation: Unknown

Can anyone help solve this issue?

@zWaR

zWaR commented Dec 15, 2017

You are overwriting the part_info['Parts'] list on every iteration. Create part_info = {'Parts': []} once before the loop, then do this inside the loop:

parts = {
    'PartNumber': i,
    'ETag': part['ETag']
}
part_info['Parts'].append(parts)

Also, it might be worth reading from an actual file instead of using the static Hello, world! body for each part.
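
Putting both suggestions together, here is a minimal sketch of a loop that reads a file in chunks and appends each part to a single list. The function name upload_in_parts and the part_size parameter are hypothetical. The s3 argument is assumed to be a boto3 S3 client, and the file is assumed to be opened in binary mode. Note that S3 requires every part except the last to be at least 5 MiB.

```python
PART_SIZE = 5 * 1024 * 1024  # S3 minimum part size (the last part may be smaller)


def upload_in_parts(s3, bucket, key, fileobj, part_size=PART_SIZE):
    """Upload fileobj as a multipart upload and return the completion response.

    Hypothetical helper: `s3` is assumed to be a boto3 S3 client and
    `fileobj` a file opened in binary mode.
    """
    mpu = s3.create_multipart_upload(Bucket=bucket, Key=key)
    upload_id = mpu['UploadId']
    part_info = {'Parts': []}  # created once, outside the loop
    try:
        part_number = 1
        while True:
            chunk = fileobj.read(part_size)
            if not chunk:
                break
            part = s3.upload_part(Bucket=bucket, Key=key,
                                  PartNumber=part_number,
                                  UploadId=upload_id, Body=chunk)
            # Append instead of reassigning, so earlier parts are kept.
            part_info['Parts'].append(
                {'PartNumber': part_number, 'ETag': part['ETag']}
            )
            part_number += 1
        return s3.complete_multipart_upload(Bucket=bucket, Key=key,
                                            UploadId=upload_id,
                                            MultipartUpload=part_info)
    except Exception:
        # Abort so the parts uploaded so far are discarded, then re-raise.
        s3.abort_multipart_upload(Bucket=bucket, Key=key, UploadId=upload_id)
        raise
```

It could be called as upload_in_parts(boto3.client('s3'), bucket, key, f) inside a with open(path, 'rb') as f: block, where path is whatever local file you want to send.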

@a523

a523 commented Jul 25, 2018

Is "MultipartUpload" REQUIRED?
I can't find "REQUIRED" next to the "MultipartUpload" argument in the boto3 docs, but when I run this code:

rep = s3.complete_multipart_upload(Bucket='bucket',
                                   Key='wentao.mp4',
                                   UploadId='2~in_WUwt5z4g7ri1yfT_MiaRqAs8MRXG')

it raises botocore.exceptions.ClientError: An error occurred (MalformedXML) when calling the CompleteMultipartUpload operation: Unknown
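
The errors reported in this thread suggest that the service rejects a completion request that carries no parts list, whatever the docs mark as required. If the ETags from the upload_part calls were not kept, they can be recovered with list_parts. A minimal sketch, assuming s3 is a boto3 S3 client and that all parts fit in a single list_parts response (larger uploads would need pagination); complete_from_listed_parts is a hypothetical helper name:

```python
def complete_from_listed_parts(s3, bucket, key, upload_id):
    """Complete an upload using the parts the service already knows about.

    Hypothetical helper: `s3` is assumed to be a boto3 S3 client, and
    only the first page of list_parts results is used.
    """
    listed = s3.list_parts(Bucket=bucket, Key=key, UploadId=upload_id)
    # Keep only the two fields the completion request needs,
    # in ascending part-number order.
    parts = sorted(listed.get('Parts', []), key=lambda p: p['PartNumber'])
    part_info = {
        'Parts': [
            {'PartNumber': p['PartNumber'], 'ETag': p['ETag']}
            for p in parts
        ]
    }
    return s3.complete_multipart_upload(Bucket=bucket, Key=key,
                                        UploadId=upload_id,
                                        MultipartUpload=part_info)
```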

6 participants