Copy hardlinks #58
Commits on Feb 24, 2012
- Apply excludes/includes at local os.walk() time
  Matt Domsch authored and Matt Domsch committed Feb 24, 2012
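The idea of applying excludes at os.walk() time is to prune filtered directories before the walk ever descends into them, instead of filtering the complete file list afterwards. A minimal sketch of that technique (the function and parameter names here are hypothetical, not s3cmd's actual code):

```python
import fnmatch
import os

def walk_with_excludes(root, exclude_patterns, include_patterns=()):
    """Yield file paths under root, applying exclude/include patterns
    during the walk itself so excluded directories are never entered."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune excluded directories in place; os.walk honours in-place
        # edits of dirnames and skips the removed entries entirely.
        dirnames[:] = [
            d for d in dirnames
            if not _excluded(os.path.join(dirpath, d),
                             exclude_patterns, include_patterns)
        ]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not _excluded(path, exclude_patterns, include_patterns):
                yield path

def _excluded(path, excludes, includes):
    # An include match overrides an exclude match (rsync-style semantics).
    if any(fnmatch.fnmatch(path, pat) for pat in includes):
        return False
    return any(fnmatch.fnmatch(path, pat) for pat in excludes)
```

Pruning `dirnames` in place is the key move: a directory excluded here costs one pattern check rather than a full recursive listing.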
Commits on Feb 27, 2012
- add --delete-after option for sync
  Matt Domsch authored and Matt Domsch committed Feb 27, 2012
- add more --delete-after to sync variations
  Matt Domsch authored and Matt Domsch committed Feb 27, 2012
- Merge remote-tracking branch 'origin/master' into merge
  Matt Domsch authored and Matt Domsch committed Feb 27, 2012
- Merge branch 'delete-after' into merge
  Matt Domsch authored and Matt Domsch committed Feb 27, 2012
- fix os.walk() exclusions for new upstream code
  Matt Domsch authored and Matt Domsch committed Feb 27, 2012
- Merge branch 'master' into merge
  Matt Domsch authored and Matt Domsch committed Feb 27, 2012
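Assuming --delete-after follows the usual rsync-style meaning (deletions of destination-only files are deferred until all transfers finish, rather than performed up front), the ordering difference can be sketched as follows; this is an illustrative toy, not s3cmd's implementation:

```python
def sync(transfer_list, delete_list, delete_after=False):
    """Return the ordered action log for one sync pass.

    With delete_after=True, destination-only files are removed only
    after every transfer has completed, so the destination never dips
    below a full copy mid-sync (assumed rsync-like semantics).
    """
    log = []
    if not delete_after:
        log += [("delete", f) for f in delete_list]
    log += [("transfer", f) for f in transfer_list]
    if delete_after:
        log += [("delete", f) for f in delete_list]
    return log
```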
Commits on Mar 1, 2012
- Matt Domsch authored and Matt Domsch committed Mar 1, 2012
- Matt Domsch authored and Matt Domsch committed Mar 1, 2012
Commits on Jun 16, 2012
- Handle hardlinks and duplicate files
  Minimize uploads in local->remote sync by looking for an existing identical file elsewhere in the remote destination and issuing an S3 COPY command instead of uploading the file again. We now store the locally generated md5 of the file in the x-amz-meta-s3cmd-attrs metadata, because we can't count on the ETag being correct due to multipart uploads; use this value if it's available. This also reduces the number of local stat() calls by recording more useful information during the initial os.walk(), cutting the number of stat()s in half.
- If the remote doesn't have any copies of the file, we transfer one instance first, then copy thereafter. But we were dereferencing the destination list improperly in this case, causing a crash. This patch fixes the crash cleanly.
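The two commits above describe one planning loop: the first file with a given checksum is uploaded, and every later duplicate becomes a server-side copy from it. A minimal sketch of that logic (hypothetical helper names, with plain action tuples standing in for s3cmd's actual upload/COPY calls):

```python
import hashlib

def plan_uploads(local_files, remote_by_md5):
    """Plan sync actions so duplicate content is COPYed, not re-uploaded.

    remote_by_md5 maps md5 hex digests to an existing remote source.
    The first occurrence of a new checksum is planned as an upload and
    then registered as the copy source for any later duplicates.
    """
    actions = []
    for path in local_files:
        with open(path, "rb") as f:
            md5 = hashlib.md5(f.read()).hexdigest()
        if md5 in remote_by_md5:
            # Same content already exists remotely: server-side copy.
            actions.append(("copy", remote_by_md5[md5], path))
        else:
            actions.append(("upload", None, path))
            remote_by_md5[md5] = path  # later duplicates copy from this one
    return actions
```

Storing the locally computed md5 in object metadata matters because, as the commit notes, the ETag of a multipart upload is not the md5 of the object's content.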
Commits on Jun 17, 2012
Commits on Jun 18, 2012
- handle remote->local transfers with local hardlink/copy if possible
  Reworked some of the hardlink / same-file detection code to be a little more general purpose; it can now detect duplicate files on either the remote or the local side. When transferring remote->local, if we already have a local copy (same md5sum) of a file we would otherwise transfer, don't transfer it — hardlink it. Should hardlink not be available (e.g. on Windows), use shutil.copy2() instead. This lets us avoid the second download completely. _get_filelist_local() grew an initial list argument, which lets us avoid copying / merging / updating a bunch of different lists back into one: it starts as one list and grows. Much cleaner (and the fact that these were separate lists cost me several hours of debugging, tracking down why something like the by_md5 hash would get set only to be empty shortly thereafter).
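The hardlink-with-copy-fallback step described above can be sketched in a few lines (a hypothetical helper, not s3cmd's actual function): try os.link() first, and fall back to shutil.copy2() where hardlinks aren't supported.

```python
import os
import shutil

def link_or_copy(existing, dest):
    """Materialise dest from an identical local file instead of
    re-downloading it: hardlink where the platform allows, otherwise
    fall back to a metadata-preserving copy via shutil.copy2()."""
    try:
        os.link(existing, dest)
    except OSError:
        # Hardlinks unavailable (e.g. some Windows/filesystem setups).
        shutil.copy2(existing, dest)
```

Either branch leaves dest with the same content as the already-downloaded file, so the second download is skipped entirely.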