How to use Pre-signed URLs for multipart upload. #2305
@harshit196 - Thank you for your post. A multipart upload requires three API calls: `create_multipart_upload`, `upload_part` (once for each part), and `complete_multipart_upload`. You can use a pre-signed URL with any of these operations. For example, in the code below I have used one with the `upload_part` API call:

```python
import boto3
import requests

s3 = boto3.client('s3')
max_size = 5 * 1024 * 1024  # part size; you can define your own

res = s3.create_multipart_upload(Bucket=bucket_name, Key=key)
upload_id = res['UploadId']

# Please note this handles only one part of the file. You have to do it
# for all parts and store each ETag and PartNumber in the `parts` list.
parts = []
signed_url = s3.generate_presigned_url(
    ClientMethod='upload_part',
    Params={'Bucket': bucket_name, 'Key': key,
            'UploadId': upload_id, 'PartNumber': part_no})

with target_file.open('rb') as f:
    file_data = f.read(max_size)  # here reading the content of only one part

res = requests.put(signed_url, data=file_data)
etag = res.headers['ETag']
parts.append({'ETag': etag, 'PartNumber': part_no})  # append the ETag and PartNumber of each part

# After completing all parts, call the complete_multipart_upload API,
# which requires that parts list.
res = s3.complete_multipart_upload(
    Bucket=bucket_name, Key=key,
    MultipartUpload={'Parts': parts}, UploadId=upload_id)
```

In the example above I have used a pre-signed URL only for the `upload_part` call. Hope it helps, and please let me know if you have any questions.
Thanks for the detailed response.

Yes, as per your question: if you have 15 parts, then you have to generate 15 signed URLs and then use those URLs with the `requests.put()` operation to upload each part to S3.

Thanks for the great help.
Reading your code sample @swetashre, I was wondering: is there any way to leverage boto3's multipart file upload capabilities (i.e. retries, multithreading, etc.) when using presigned URLs?

(For context, we allow our users to upload large files, and we'd rather use boto3's code for the actual multipart uploads than roll our own custom code to upload each chunk with e.g. requests.)
@julien-c did you find a way to achieve this with boto3-only methods? We have exactly the same needs as you and have our own custom code covering the whole process efficiently, but as we are rethinking part of our codebase, I came here to see if there is any new, simpler way to leverage boto3's API.

@julien-c @PN-picsell Continuing the chain: did either of you achieve this by reusing anything from boto3?

@matteosimone @PN-picsell No, I rolled my own...
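For anyone weighing the roll-your-own route, here is a minimal sketch of a parallel part uploader with naive retries. All names here are hypothetical, and the HTTP `put` callable (e.g. `requests.put`) is injected so the transport can be swapped or tested offline:

```python
from concurrent.futures import ThreadPoolExecutor

def split_parts(data, part_size=5 * 1024 * 1024):
    """Split a bytes payload into parts (S3 requires all but the last >= 5 MiB)."""
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]

def upload_parts(signed_urls, part_bytes, put, max_workers=4, retries=3):
    """Upload each part through its presigned URL in parallel.

    `put` is an HTTP PUT callable such as requests.put. Returns the parts
    list expected by complete_multipart_upload, in part order.
    """
    def upload_one(job):
        part_no, url, body = job
        for attempt in range(retries):
            try:
                resp = put(url, data=body)
                return {'ETag': resp.headers['ETag'], 'PartNumber': part_no}
            except Exception:
                if attempt == retries - 1:
                    raise

    jobs = [(i + 1, url, body)
            for i, (url, body) in enumerate(zip(signed_urls, part_bytes))]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(upload_one, jobs))
```

This covers the multithreading and retry behavior asked about above, at the cost of maintaining it yourself; boto3's managed transfer machinery cannot be pointed at presigned URLs directly.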
@julien-c did you happen to implement the multipart uploads in a public repo? We're about to roll our own as well for https://github.com/trytoolchest/toolchest-client-python/, and it would be amazing to have another open source reference for the additional functionality (retries, multithreading, etc).

@lebovic not really, sorry. What I have in the open is mostly inside https://github.com/huggingface/huggingface_hub/blob/main/src/huggingface_hub/commands/lfs.py but this is probably a bit specific to the context of implementing a LFS custom transfer agent. Let me know if this helps.

@julien-c that's very helpful! Thanks for the reference.
Hi, has anyone actually managed to get a pre-signed URL for the `complete_multipart_upload` operation? It seems to work fine for the other calls. This is how I create the URL:

This is the URL that gets spit out:

Or am I expected to put the parts list in the request body? Related SO question: https://stackoverflow.com/q/70754676/1370154
@suzukieng, yes! See https://stackoverflow.com/q/70754676/1370154 for how to update your code (`MultipartUpload` is removed from `Params`, and `CompleteMultipartUpload` is passed in the body as XML).
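A sketch of that XML-body step, assuming the parts list has the same `{'ETag': ..., 'PartNumber': ...}` shape used earlier in the thread (`complete_multipart_xml` is a hypothetical helper name; the namespace is the standard S3 one):

```python
def complete_multipart_xml(parts):
    """Build the CompleteMultipartUpload request body from a parts list."""
    body = ''.join(
        '<Part><ETag>{ETag}</ETag><PartNumber>{PartNumber}</PartNumber></Part>'
        .format(**p)
        for p in sorted(parts, key=lambda p: p['PartNumber']))  # parts must be in order
    return ('<CompleteMultipartUpload '
            'xmlns="http://s3.amazonaws.com/doc/2006-03-01/">'
            + body + '</CompleteMultipartUpload>')
```

The URL itself would be presigned with `ClientMethod='complete_multipart_upload'` and only `Bucket`, `Key`, and `UploadId` in `Params`, then sent with e.g. `requests.post(signed_url, data=complete_multipart_xml(parts))`.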
Were you able to resolve this issue using `generate_presigned_url`?
I am already using pre-signed URLs to let the client side put files in S3. But what is still not clear is how to upload large files to S3 in multipart form using these pre-signed URLs.