New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

only upload file to AWS if MD5 hash has changed #480

Closed
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
3 participants
@paulboone

paulboone commented Sep 18, 2014

Compares an MD5 hash of the local file to the stored MD5 on S3 (in the etag field) before uploading so that only changed files get uploaded to S3. Save time and $!

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Nov 3, 2014

Member

This PR is missing a test case. Could you provide one? Looks good otherwise.

Member

ddfreyne commented Nov 3, 2014

This PR is missing a test case. Could you provide one? Looks good otherwise.

@paulboone

This comment has been minimized.

Show comment
Hide comment
@paulboone

paulboone Nov 5, 2014

My hazy recollection is that testing the specifics was kind of awkward, but it didn't break the existing general test case, so I figured it was kind of already tested...?

Happy to take another look, but it won't be for a bit, since I'm pretty crushed until Dec.

paulboone commented Nov 5, 2014

My hazy recollection is that testing the specifics was kind of awkward, but it didn't break the existing general test case, so I figured it was kind of already tested...?

Happy to take another look, but it won't be for a bit, since I'm pretty crushed until Dec.

:key => key,
:body => File.open(file_path),
:public => true)
if (config[:provider] != "aws") || (Digest::MD5.file(file_path).hexdigest != md5_by_key[key])

This comment has been minimized.

@mpapis

mpapis Nov 5, 2014

Member

try:

if (!md5_by_key[key]) || (Digest::MD5.file(file_path).hexdigest != md5_by_key[key])

this way when adding new provider MD5s there will be no need to change the code

@mpapis

mpapis Nov 5, 2014

Member

try:

if (!md5_by_key[key]) || (Digest::MD5.file(file_path).hexdigest != md5_by_key[key])

this way when adding new provider MD5s there will be no need to change the code

This comment has been minimized.

@ddfreyne

ddfreyne Nov 5, 2014

Member

I’d like this to be extracted in its own helper function too. Something along the lines of #identical? that takes the hashes, the config and the filepath.

@ddfreyne

ddfreyne Nov 5, 2014

Member

I’d like this to be extracted in its own helper function too. Something along the lines of #identical? that takes the hashes, the config and the filepath.

This comment has been minimized.

@ddfreyne

ddfreyne Nov 5, 2014

Member

Also, please use something else than MD5.

@ddfreyne

ddfreyne Nov 5, 2014

Member

Also, please use something else than MD5.

This comment has been minimized.

@mpapis

mpapis Nov 5, 2014

Member

well this takes advantage of file.etag being MD5 - I do not think there is any other way to verify if file changed without downloading the whole file

@mpapis

mpapis Nov 5, 2014

Member

well this takes advantage of file.etag being MD5 - I do not think there is any other way to verify if file changed without downloading the whole file

This comment has been minimized.

@paulboone

paulboone Nov 5, 2014

Yep, S3 populates the hexdigest field automatically with the MD5, which is what makes this possible.

@paulboone

paulboone Nov 5, 2014

Yep, S3 populates the hexdigest field automatically with the MD5, which is what makes this possible.

md5_by_key = {}
if config[:provider] == "aws"
files.each {|file| md5_by_key[file.key] = file.etag}
else # should support other providers here, as possible

This comment has been minimized.

@mpapis

mpapis Nov 5, 2014

Member

assuming more providers can be added a better style would be:

case config[:provider]
when "aws" then files.each {|file| md5_by_key[file.key] = file.etag}
end
@mpapis

mpapis Nov 5, 2014

Member

assuming more providers can be added a better style would be:

case config[:provider]
when "aws" then files.each {|file| md5_by_key[file.key] = file.etag}
end

This comment has been minimized.

@ddfreyne

ddfreyne Nov 5, 2014

Member

Good suggestion.

I’d extract the entire construction of md5_by_key into a separate helper function.

(A separate helper function that takes config and files will be much easier to test.)

@ddfreyne

ddfreyne Nov 5, 2014

Member

Good suggestion.

I’d extract the entire construction of md5_by_key into a separate helper function.

(A separate helper function that takes config and files will be much easier to test.)

@mpapis

This comment has been minimized.

Show comment
Hide comment
@mpapis

mpapis Nov 5, 2014

Member

hmm, maybe we can merge it as it is and I would improve it in new PR including extraction of Md5Hash class to handle collecting and checking.

Member

mpapis commented Nov 5, 2014

hmm, maybe we can merge it as it is and I would improve it in new PR including extraction of Md5Hash class to handle collecting and checking.

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Dec 21, 2014

Member

@mpapis Can you take over this PR? It shouldn’t be merged if it’s not ready, but it can get some more work in a branch of your own.

Member

ddfreyne commented Dec 21, 2014

@mpapis Can you take over this PR? It shouldn’t be merged if it’s not ready, but it can get some more work in a branch of your own.

@mpapis

This comment has been minimized.

Show comment
Hide comment
@mpapis

mpapis Dec 21, 2014

Member

I will work on it in the evening

Member

mpapis commented Dec 21, 2014

I will work on it in the evening

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Dec 25, 2014

Member

@mpapis What’s the status of this?

Member

ddfreyne commented Dec 25, 2014

@mpapis What’s the status of this?

@mpapis

This comment has been minimized.

Show comment
Hide comment
@mpapis

mpapis Dec 25, 2014

Member

sorry missed to define which evening ;) caught by Christmas, will have more time for it starting next week, but if you have time for it now then go for it

Member

mpapis commented Dec 25, 2014

sorry missed to define which evening ;) caught by Christmas, will have more time for it starting next week, but if you have time for it now then go for it

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 11, 2015

Member

@mpapis Are you working on this now?

Member

ddfreyne commented Jan 11, 2015

@mpapis Are you working on this now?

@mpapis

This comment has been minimized.

Show comment
Hide comment
@mpapis

mpapis Jan 11, 2015

Member

no I'm not, if you have time for it right now go for it, I probably wont have time before evening (it's waiting in my queue)

Member

mpapis commented Jan 11, 2015

no I'm not, if you have time for it right now go for it, I probably wont have time before evening (it's waiting in my queue)

@mpapis mpapis referenced this pull request Jan 11, 2015

Merged

Add release notes for 3.7.5 #506

@ddfreyne ddfreyne modified the milestone: 3.7.6 Jan 12, 2015

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Jan 12, 2015

Member

Let’s get this in 3.7.6!

Member

ddfreyne commented Jan 12, 2015

Let’s get this in 3.7.6!

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Feb 21, 2015

Member

I’ll take over this PR.

I’d like to have this change in 3.7.x rather than in master (3.8.0). This is not a feature and one could argue that uploading unchanged files is a bug.

Member

ddfreyne commented Feb 21, 2015

I’ll take over this PR.

I’d like to have this change in 3.7.x rather than in master (3.8.0). This is not a feature and one could argue that uploading unchanged files is a bug.

@ddfreyne

This comment has been minimized.

Show comment
Hide comment
@ddfreyne

ddfreyne Feb 21, 2015

Member

Superseded by #536.

Member

ddfreyne commented Feb 21, 2015

Superseded by #536.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment