Perform MD5 hash calculation during upload #5186

Closed
cyberduck opened this issue Sep 13, 2010 · 4 comments

Comments

@cyberduck cyberduck commented Sep 13, 2010

4cddcf2 created the issue

Currently, an MD5 hash of every upload to S3 is calculated before the upload starts. This can take a long time, and no progress bar can be shown during that operation, so the upload time estimate is useless.

I suggest calculating the MD5 hash during the upload, while reading from the stream. For an example, see: http://stackoverflow.com/questions/304268/using-java-to-get-a-files-md5-checksum
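To illustrate the idea, here is a minimal sketch that hashes the bytes as they are read for the upload, so no separate pass over the file is needed. It assumes the upload is a plain stream copy; the class name `StreamingMd5` and the method `copyAndDigest` are made up for this example and are not taken from the Cyberduck code base.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.security.DigestInputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, for illustration only.
final class StreamingMd5 {

    /**
     * Copies all bytes from in to out and returns the MD5 digest of the
     * copied bytes as a lowercase hex string.
     */
    static String copyAndDigest(InputStream in, OutputStream out)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        // DigestInputStream updates the digest as a side effect of every read,
        // so the hash is ready as soon as the last byte has been sent.
        try (DigestInputStream digestIn = new DigestInputStream(in, md5)) {
            byte[] buffer = new byte[8192];
            int read;
            while ((read = digestIn.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        }
        return toHex(md5.digest());
    }

    /** Encodes a byte array as lowercase hexadecimal. */
    static String toHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(String.format("%02x", b));
        }
        return sb.toString();
    }
}
```

Because the digest is updated inside the same read loop that feeds the request body, the existing progress reporting for the transfer would cover the hashing as well.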

With this change, S3 will no longer return an error for a corrupted upload, since it has no hash to compare against. Instead, the ETag returned by S3 has to be used to verify that the upload was successful: http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectPOST.html
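The client-side check could look roughly like the sketch below. It assumes a single-request upload for which the returned ETag is the hex MD5 of the object, wrapped in double quotes; `EtagCheck` and `matches` are hypothetical names used only for this example.

```java
// Hypothetical helper, for illustration only.
final class EtagCheck {

    /**
     * Returns true if the ETag returned by the server equals the locally
     * computed MD5 (hex encoded). Surrounding quotes on the ETag are ignored.
     */
    static boolean matches(String etag, String localMd5Hex) {
        if (etag == null || localMd5Hex == null) {
            return false;
        }
        String normalized = etag.trim();
        // Strip the double quotes that wrap the ETag header value.
        if (normalized.length() >= 2
                && normalized.startsWith("\"") && normalized.endsWith("\"")) {
            normalized = normalized.substring(1, normalized.length() - 1);
        }
        // Hex digits may differ in case, so compare case-insensitively.
        return normalized.equalsIgnoreCase(localMd5Hex);
    }
}
```

If the values differ, the client would have to treat the upload as failed itself, since S3 has already accepted the object at that point.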

Alternatively, it would be good to at least have the option to disable the hash computation, since there are cases where the overhead is not justified.

@cyberduck cyberduck commented Sep 13, 2010

@dkocher commented

I agree. Calculating the hash on the fly would be an improvement. The only downside is that we need a second request if we still want to store the MD5 value in the file's metadata, as we currently do (see md5-hash in metadata).

@cyberduck cyberduck commented Sep 13, 2010

4cddcf2 commented

Agreed, and I was thinking about that. However, I think the metadata entry is obsolete; one could use the ETag all the way through instead.

See:
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectGET.html
http://docs.amazonwebservices.com/AmazonS3/latest/API/RESTObjectHEAD.html

@cyberduck cyberduck commented Nov 19, 2010

@dkocher commented

If the property s3.upload.metadata.md5 is set to true (the default is false), we set the Content-MD5 header and let S3 check the integrity of the upload. Otherwise, we calculate the MD5 on the fly during the upload and compare it to the ETag returned for the upload.

In 82239e1.
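For readers who have not looked at the commit, the two modes described above can be sketched as follows. This is not the code from the commit. It only shows how the header value differs: Content-MD5 carries the Base64 encoding of the raw 16-byte digest, while the ETag comparison uses the hex form. The class `UploadIntegrityMode` and the boolean parameter standing in for the `s3.upload.metadata.md5` property are assumptions made for the example.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, for illustration only.
final class UploadIntegrityMode {

    /**
     * Builds the integrity-related request headers for an upload.
     *
     * @param content     the bytes to upload (hashed up front in this mode)
     * @param metadataMd5 stand-in for the s3.upload.metadata.md5 property
     */
    static Map<String, String> headersFor(byte[] content, boolean metadataMd5)
            throws NoSuchAlgorithmException {
        Map<String, String> headers = new LinkedHashMap<>();
        if (metadataMd5) {
            // Server-side check: the hash must be known before the request
            // is sent, so it is computed ahead of the upload.
            byte[] digest = MessageDigest.getInstance("MD5").digest(content);
            headers.put("Content-MD5", Base64.getEncoder().encodeToString(digest));
        }
        // Otherwise no header is sent; the hash is computed while streaming
        // and compared to the returned ETag after the upload completes.
        return headers;
    }
}
```

With the flag off, the header map stays empty and the integrity check moves to the client, which is the trade-off discussed earlier in this thread.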

@cyberduck cyberduck closed this Nov 19, 2010

@cyberduck cyberduck commented Nov 19, 2010

@dkocher commented

Same fix for Rackspace Cloudfiles in 52ab01c. Addendum in d7f153a.

@iterate-ch iterate-ch locked as resolved and limited conversation to collaborators Nov 26, 2021