only upload file to AWS if MD5 hash has changed #480

Closed
wants to merge 1 commit into from

@paulboone paulboone commented Sep 18, 2014

Compares the MD5 hash of the local file with the MD5 stored on S3 (in the etag field) before uploading, so that only changed files get uploaded to S3. Saves time and $!
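
A minimal sketch of the idea described above, not the code from this PR. `RemoteFile` and `files_to_upload` are hypothetical names, and the remote listing is simulated with plain structs instead of a real S3 call. It assumes each remote entry exposes a key and an etag that holds the hex MD5 of the stored content.

```ruby
require 'digest/md5'
require 'tmpdir'

# Hypothetical stand-in for a remote storage entry (not a real fog class).
RemoteFile = Struct.new(:key, :etag)

# Returns the keys that need uploading: files whose local MD5 differs from
# the remote etag, plus files that have no remote entry at all.
def files_to_upload(local_paths_by_key, remote_files)
  md5_by_key = remote_files.each_with_object({}) { |f, h| h[f.key] = f.etag }
  local_paths_by_key.select do |key, path|
    Digest::MD5.file(path).hexdigest != md5_by_key[key]
  end.keys
end

Dir.mktmpdir do |dir|
  same = File.join(dir, 'same.html')
  changed = File.join(dir, 'changed.html')
  File.write(same, 'unchanged')
  File.write(changed, 'new content')

  remote = [
    RemoteFile.new('same.html', Digest::MD5.hexdigest('unchanged')),
    RemoteFile.new('changed.html', Digest::MD5.hexdigest('old content'))
  ]
  local = { 'same.html' => same, 'changed.html' => changed }
  p files_to_upload(local, remote) # only 'changed.html' differs
end
```

Collecting the remote hashes once, up front, keeps the comparison to one local digest per file and no extra requests per file.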

Member

@ddfreyne ddfreyne commented Nov 3, 2014

This PR is missing a test case. Could you provide one? Looks good otherwise.

Author

@paulboone paulboone commented Nov 5, 2014

My hazy recollection is that testing the specifics was kind of awkward, but it didn't break the existing general test case, so I figured it was kind of already tested...?

Happy to take another look, but it won't be for a bit, since I'm pretty crushed until Dec.

:key => key,
:body => File.open(file_path),
:public => true)
if (config[:provider] != "aws") || (Digest::MD5.file(file_path).hexdigest != md5_by_key[key])
Member

@mpapis mpapis Nov 5, 2014

try:

if (!md5_by_key[key]) || (Digest::MD5.file(file_path).hexdigest != md5_by_key[key])

this way when adding new provider MD5s there will be no need to change the code

Member

@ddfreyne ddfreyne Nov 5, 2014

I’d like this to be extracted into its own helper function too. Something along the lines of #identical? that takes the hashes, the config and the filepath.
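
One possible shape for such a helper, offered as a sketch rather than the code that was eventually merged. The signature is an assumption: it follows the earlier suggestion to test for a missing remote hash instead of checking the provider, so it takes a key instead of the config.

```ruby
require 'digest/md5'
require 'tempfile'

# Hypothetical helper: true only when a remote MD5 is known for the key
# and it matches the MD5 of the local file.
def identical?(md5_by_key, key, file_path)
  remote_md5 = md5_by_key[key]
  return false if remote_md5.nil? # unknown remote state: upload to be safe

  Digest::MD5.file(file_path).hexdigest == remote_md5
end

Tempfile.create('page') do |f|
  f.write('hello')
  f.flush
  known = { 'index.html' => Digest::MD5.hexdigest('hello') }
  p identical?(known, 'index.html', f.path) # => true (hashes match)
  p identical?(known, 'about.html', f.path) # => false (no remote hash for key)
  p identical?({}, 'index.html', f.path)    # => false (no hashes collected)
end
```

The upload call would then be guarded with `unless identical?(md5_by_key, key, file_path)`, which leaves providers that supply no hashes on the old "always upload" path.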

Member

@ddfreyne ddfreyne Nov 5, 2014

Also, please use something other than MD5.

Member

@mpapis mpapis Nov 5, 2014

Well, this takes advantage of file.etag being an MD5. I do not think there is any other way to verify whether a file has changed without downloading the whole file.

Author

@paulboone paulboone Nov 5, 2014

Yep, S3 populates the etag field automatically with the MD5, which is what makes this possible.
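
One caveat: S3 does not make the ETag an MD5 of the content for every object. Objects uploaded in multiple parts, for example, get an ETag of the form `<hex>-<part count>`. A defensive sketch that only trusts values shaped like a plain MD5 hex digest (the helper name `md5_from_etag` is hypothetical, and whether the client library has already stripped the surrounding quotes is not assumed either way):

```ruby
# Hypothetical helper: returns the etag as a lowercase MD5 hex string when it
# looks like one, or nil when it should not be compared against a local MD5.
def md5_from_etag(etag)
  return nil if etag.nil?

  value = etag.delete('"').downcase # raw ETag header values are quoted
  value.match?(/\A[0-9a-f]{32}\z/) ? value : nil
end

p md5_from_etag('"5D41402ABC4B2A76B9719D911017C592"')  # plain MD5, quoted
p md5_from_etag('d41d8cd98f00b204e9800998ecf8427e-3')  # multipart-style etag
```

Treating an unrecognised etag as "no hash known" means the file is simply uploaded again, which is the safe fallback.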

Member

@mpapis mpapis commented Nov 5, 2014

Hmm, maybe we can merge it as is, and I would improve it in a new PR, including extracting an Md5Hash class to handle collecting and checking.

Member

@ddfreyne ddfreyne commented Dec 21, 2014

@mpapis Can you take over this PR? It shouldn’t be merged if it’s not ready, but it can get some more work in a branch of your own.

Member

@mpapis mpapis commented Dec 21, 2014

I will work on it in the evening

Member

@ddfreyne ddfreyne commented Dec 25, 2014

@mpapis What’s the status of this?

Member

@mpapis mpapis commented Dec 25, 2014

Sorry, I forgot to specify which evening ;) Christmas caught up with me. I will have more time for it starting next week, but if you have time for it now, go for it.

Member

@ddfreyne ddfreyne commented Jan 11, 2015

@mpapis Are you working on this now?

Member

@mpapis mpapis commented Jan 11, 2015

No, I'm not. If you have time for it right now, go for it; I probably won't have time before the evening (it's waiting in my queue).


@ddfreyne ddfreyne added this to the 3.7.6 milestone Jan 12, 2015
Member

@ddfreyne ddfreyne commented Jan 12, 2015

Let’s get this in 3.7.6!

Member

@ddfreyne ddfreyne commented Feb 21, 2015

I’ll take over this PR.

I’d like to have this change in 3.7.x rather than in master (3.8.0). This is not a feature and one could argue that uploading unchanged files is a bug.

Member

@ddfreyne ddfreyne commented Feb 21, 2015

Superseded by #536.

3 participants