Optimize MD5 checksum calculation #10278
Comments
Agreed, it would be great if there were a way to disable the checksum, because it takes too long on >100GB files. I notice people complain about it, e.g. https://community.rackspace.com/general/f/general-discussion-forum/1775/cyberduck-incredibly-slow PS: (looking through some code changes I see you already know this) although the ETag calculation is not officially documented by Amazon, the resulting ETag of a completed multipart upload is the MD5 of the concatenated per-part MD5 digests, followed by "-" and the number of parts. This could be verified if you're paranoid, though I guess it would have to be recomputed if Cyberduck is restarted in the middle of a multipart upload.
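For reference, a minimal Java sketch of that ETag computation. The class name, buffer size, and the assumption that every part except the last uses the same part size are illustrative, not taken from Cyberduck's code:

```java
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;

public final class MultipartEtag {

    // Expected S3 ETag of a completed multipart upload: the MD5 of the
    // concatenated per-part MD5 digests, followed by "-" and the part count.
    public static String compute(final Path file, final long partSize) throws Exception {
        final MessageDigest partMd5 = MessageDigest.getInstance("MD5");
        final MessageDigest etagMd5 = MessageDigest.getInstance("MD5");
        final byte[] buffer = new byte[8192];
        int parts = 0;
        long remaining = partSize;
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            while ((read = in.read(buffer, 0, (int) Math.min(buffer.length, remaining))) != -1) {
                partMd5.update(buffer, 0, read);
                remaining -= read;
                if (remaining == 0) {
                    etagMd5.update(partMd5.digest()); // digest() also resets partMd5 for the next part
                    parts++;
                    remaining = partSize;
                }
            }
        }
        if (remaining < partSize) { // flush a trailing partial part
            etagMd5.update(partMd5.digest());
            parts++;
        }
        final StringBuilder hex = new StringBuilder();
        for (final byte b : etagMd5.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex + "-" + parts;
    }
}
```

Comparing this value against the ETag returned when the multipart upload completes would give an end-to-end check, at the cost of having to recompute it if the upload is restarted, as noted above.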
Replying to [comment:2 jamshid]:
We should possibly skip the whole-file checksum calculation for multipart uploads, as the individual parts are already checksummed and verified on the server side.
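A rough sketch of that idea (not Cyberduck's actual implementation): hash each part just before it is sent and pass the digest as Content-MD5 so the server verifies that part, instead of making a separate whole-file pass up front. The class name, part size, and uploadPart stub below are all hypothetical:

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

// Sketch: per-part Content-MD5 instead of a whole-file checksum pass.
// uploadPart(...) is a hypothetical stub, not a Cyberduck or S3 SDK method.
public final class PerPartChecksumSketch {

    static final int PART_SIZE = 5 * 1024 * 1024; // illustrative 5 MiB part size

    public static void upload(final Path file) throws IOException, NoSuchAlgorithmException {
        final MessageDigest md5 = MessageDigest.getInstance("MD5");
        final byte[] part = new byte[PART_SIZE];
        int partNumber = 1;
        try (InputStream in = Files.newInputStream(file)) {
            int length;
            // readNBytes fills the buffer as far as possible and returns 0 at EOF
            while ((length = in.readNBytes(part, 0, part.length)) > 0) {
                md5.reset();
                md5.update(part, 0, length);
                // The Content-MD5 header expects the Base64-encoded binary digest
                final String contentMd5 = Base64.getEncoder().encodeToString(md5.digest());
                uploadPart(partNumber++, part, length, contentMd5);
            }
        }
    }

    private static void uploadPart(final int partNumber, final byte[] data, final int length, final String contentMd5) {
        // Placeholder for the actual part transfer with its Content-MD5 header.
    }
}
```

The trade-off is trusting per-part server-side verification rather than a single whole-file digest, in exchange for removing the up-front hashing pass entirely.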
Two suggestions to optimize checksum calculation while uploading to S3.
I frequently upload very large files (75-100 GB) to S3, and the checksum calculation adds a significant delay to a time-sensitive workflow. I was just uploading a 75 GB file, and the checksum calculation took 10 minutes before the actual upload started. The actual upload took 32 minutes, so the checksum adds roughly a 30% time penalty, which is significant and very unfortunate.