Commit
Catching the md5sum up with previously uploaded content to seed it for the actual transfer, and then incrementally updating the md5sum as bytes are uploaded. Also adding md5sum rollback support in the event of a retryable exception. In addition to speeding up resumed uploads (especially where the previously uploaded portion is small relative to the file size), this also simplifies the md5 logic, as we now use the incremental md5 in all cases.
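As a rough illustration of the idea in this commit message (not the code from the commit itself), the sketch below seeds a digest from the bytes already uploaded, updates it as each chunk is sent, and restores a saved snapshot if a retryable error occurs. The names `catch_up_md5`, `upload_rest`, and `send` are hypothetical, and treating `IOError` as the retryable exception is an assumption made for the example.

```python
import hashlib
import io

CHUNK = 8192  # arbitrary chunk size chosen for this sketch


def catch_up_md5(fp, already_uploaded, chunk_size=CHUNK):
    """Hash the first `already_uploaded` bytes of fp to seed the digest."""
    md5 = hashlib.md5()
    fp.seek(0)
    remaining = already_uploaded
    while remaining > 0:
        data = fp.read(min(chunk_size, remaining))
        if not data:
            break
        md5.update(data)
        remaining -= len(data)
    return md5


def upload_rest(fp, md5, send, chunk_size=CHUNK):
    """Send the rest of fp, updating md5 after each chunk is sent.

    `send` is a hypothetical callable standing in for the real transfer.
    On IOError (assumed retryable here) the digest and file position are
    rolled back to their state on entry. Returns (digest, succeeded).
    """
    saved = md5.copy()   # snapshot used for rollback
    start = fp.tell()
    try:
        while True:
            data = fp.read(chunk_size)
            if not data:
                break
            send(data)
            md5.update(data)
    except IOError:
        fp.seek(start)
        return saved, False
    return md5, True
```

Because `hashlib` digests support `copy()`, the rollback is cheap: no bytes need to be re-read unless the retry resumes from an earlier offset than the snapshot.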
3 comments
on commit 60cc631
Evan,
Nice improvement, making separate-process resume use incremental md5 computation.
I think the else at line 597 is sufficiently important that it would be good to add a test for it. Can you please take a look at tests/s3/test_resumable_uploads.py? I implemented a CallbackTestHarnass that lets you force exceptions to occur partway through an upload - so it shouldn't be too hard to exercise this case.
Thanks again.
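For readers unfamiliar with the approach, a callback that forces a failure partway through a transfer might look roughly like the sketch below. This is not the harness from tests/s3/test_resumable_uploads.py: `FailAfterCallback` and `fake_upload` are hypothetical names, and the two-argument `(bytes_sent, total_bytes)` callback shape is an assumption for illustration.

```python
class FailAfterCallback:
    """Hypothetical progress callback that raises once a byte threshold is reached."""

    def __init__(self, fail_after_bytes, exception=None):
        self.fail_after_bytes = fail_after_bytes
        self.exception = exception or IOError("simulated failure")
        self.calls = 0
        self.fired = False  # raise only once so a retry can succeed

    def __call__(self, bytes_sent, total_bytes):
        self.calls += 1
        if not self.fired and bytes_sent >= self.fail_after_bytes:
            self.fired = True
            raise self.exception


def fake_upload(data, cb, chunk_size=4):
    """Stand-in for an upload loop: reports progress to cb after each chunk."""
    sent = 0
    while sent < len(data):
        sent += len(data[sent:sent + chunk_size])
        cb(sent, len(data))
    return sent
```

Raising only once lets a test drive the failure path first and then confirm that the retry completes.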
Hi Mike,
I agree that else is important. There is already a test which exposes this feature (https://github.com/boto/boto/blob/develop/tests/s3/test_resumable_uploads.py#L354).
That test was failing until I added the "md5sum rollback" support.
Is that sufficient, or would you like some other type of test added?
Thanks,
Evan
You're right, test_upload_with_inital_partial_upload_before_failure does test that case.
Code looks good, and all tests pass for me too. I just added one tiny comment - once you do that I'll commit this change.
Thanks again.
Could you please delete "key=key"?
It's leftover from some debugging I did a while back; I often do something like:
    if condition_being_debugged:
        import pdb;pdb.set_trace()
        key=key
so I'll break at that point.
Thanks.