Commits on Jul 14, 2012
  1. sync: refactor parent/child and single process code

    os.fork() and os.wait() don't exist on Windows, and the
    multiprocessing module doesn't exist until python 2.6.  So instead, we
    conditionalize calling os.fork() depending on its existence, and on
    there being > 1 destination.
    
    Also simply rearranges the code so that subfunctions within
    local2remote are defined at the top of their respective functions, for
    better readability through the main execution of the function.
    committed Jun 19, 2012
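    The fork conditionalization described above can be sketched as follows.
    This is an illustrative outline, not s3cmd's actual code; the names
    run_for_destinations and upload are hypothetical.

    ```python
    import os

    def run_for_destinations(destinations, upload):
        """Upload to each destination, forking one child per destination
        when os.fork() exists (it does not on Windows) and there is more
        than one destination; otherwise run sequentially in-process."""
        if not hasattr(os, 'fork') or len(destinations) <= 1:
            for dest in destinations:       # single-process fallback
                upload(dest)
            return
        children = []
        for dest in destinations:
            pid = os.fork()
            if pid == 0:                    # child: do one upload, then exit
                upload(dest)
                os._exit(0)
            children.append(pid)
        for _ in children:                  # parent: reap every child
            os.wait()
    ```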
Commits on Jul 13, 2012
  1. merge manpage conflict

    committed Jul 13, 2012
Commits on Jun 19, 2012
  1. fix getting uid

    committed Jun 18, 2012
  2. add local tree MD5 caching

    This creates and maintains a cache (aka HashCache) of each inode in
    the local tree.  This is used to avoid doing local disk I/O to
    calculate an MD5 value for a file if its inode, mtime, and size
    haven't changed.  If these values have changed, then it does the disk
    I/O.
    
    This introduces command line option --cache-file <foo>.  The file is
    created if it does not exist, is read upon start and written upon
    close. The contents are only useful for a given directory tree, so
    caches should not be reused for different directory tree syncs.
    committed Jun 18, 2012
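    The caching idea above can be sketched like this. The class and method
    names are illustrative, not s3cmd's actual HashCache implementation:
    key each file by (inode, mtime, size), and recompute the MD5 only when
    any of those change.

    ```python
    import hashlib
    import os

    class HashCache(dict):
        """Map path -> ((inode, mtime, size), md5hex)."""

        def md5(self, path):
            st = os.stat(path)
            key = (st.st_ino, st.st_mtime, st.st_size)
            cached = self.get(path)
            if cached and cached[0] == key:
                return cached[1]            # hit: no disk I/O on the data
            with open(path, 'rb') as f:     # miss: read and hash the file
                digest = hashlib.md5(f.read()).hexdigest()
            self[path] = (key, digest)
            return digest
    ```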
  3. sync: add --add-destination, parallelize uploads to multiple destinations
    
    Only meaningful at present in the sync local->remote(s) case, this
    adds the --add-destination <foo> command line option.  For the last
    arg (the traditional destination), and each destination specified via
    --add-destination, fork and upload after the initial walk of the local
    file system has completed (and done all the disk I/O to calculate md5
    values for each file).
    
    This keeps us from pounding the file system doing (the same) disk I/O
    for each possible destination, and allows full use of our bandwidth to
    upload in parallel.
    committed Jun 18, 2012
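    The walk-once, upload-many structure can be sketched as below. The
    commit forks a child per destination; this sketch uses a thread pool
    instead, purely for brevity, and the helpers walk_with_md5 and
    upload_one are hypothetical stand-ins.

    ```python
    from concurrent.futures import ThreadPoolExecutor

    def sync_local_to_remotes(local_root, destinations,
                              walk_with_md5, upload_one):
        # All disk I/O (the walk plus MD5 calculation) happens once,
        # up front, and the resulting list is shared by every uploader.
        filelist = walk_with_md5(local_root)
        # Fan out one worker per destination so uploads run in parallel.
        with ThreadPoolExecutor(max_workers=len(destinations)) as pool:
            for dest in destinations:
                pool.submit(upload_one, dest, filelist)
    ```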
  4. handle remote->local transfers with local hardlink/copy if possible

    Reworked some of the hardlink / same file detection code to be a
    little more general purpose.  Now it can be used to detect duplicate
    files on either remote or local side.
    
    When transferring remote->local, if we already have a copy (same
    md5sum) of a file locally that we would otherwise transfer, don't
    transfer, but hardlink it.  Should hardlink not be available (e.g. on
    Windows), use shutil.copy2() instead.  This lets us avoid the second
    download completely.
    
    _get_filelist_local() grew an initial list argument.  This lets us
    avoid copying / merging / updating a bunch of different lists back
    into one - it starts as one list and grows.  Much cleaner (and the
    fact these were separate cost me several hours of debugging to track
    down why something would get set, like the by_md5 hash, only to have
    it be empty shortly thereafter).
    committed Jun 18, 2012
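    The hardlink-with-copy-fallback described above amounts to something
    like this minimal sketch (the function name is illustrative):

    ```python
    import os
    import shutil

    def link_or_copy(existing_path, new_path):
        """Prefer a hardlink to an identical local file; fall back to
        shutil.copy2() where hardlinks are unavailable (e.g. Windows)."""
        try:
            os.link(existing_path, new_path)   # no file data duplicated
        except (AttributeError, OSError):
            shutil.copy2(existing_path, new_path)  # copies data + metadata
    ```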
  5. hardlink/copy fix

    If remote doesn't have any copies of the file, we transfer one
    instance first, then copy thereafter.  But we were dereferencing the
    destination list improperly in this case, causing a crash.  This patch
    fixes the crash cleanly.
    committed Jun 16, 2012
  6. Handle hardlinks and duplicate files

    Minimize uploads in sync local->remote by looking for existing
    identical files elsewhere in the remote destination and issuing an S3
    COPY command instead of uploading the file again.
    
    We now store the (locally generated) md5 of the file in the
    x-amz-meta-s3cmd-attrs metadata, because we can't count on the ETag
    being correct due to multipart uploads.  Use this value if it's
    available.
    
    This also reduces the number of local stat() calls made by
    recording more useful information during the initial
    os.walk().  This cuts the number of stat()s in half.
    committed Jun 15, 2012
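    The stat()-reduction idea above can be sketched as follows, assuming a
    hypothetical helper name: record each file's attributes while walking,
    instead of stat()ing every path a second time later.

    ```python
    import os

    def walk_with_attrs(root):
        """Walk the tree once, capturing the stat fields later stages
        need, so no file has to be stat()ed again."""
        entries = {}
        for dirpath, dirnames, filenames in os.walk(root):
            for name in filenames:
                full = os.path.join(dirpath, name)
                st = os.lstat(full)        # the one and only stat per file
                entries[full] = {'size': st.st_size,
                                 'mtime': st.st_mtime,
                                 'inode': st.st_ino,
                                 'dev': st.st_dev}
        return entries
    ```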
  7. add --delay-updates option

    committed Mar 1, 2012
  8. add Config.delete_after

    committed Feb 27, 2012
  9. add --delete-after option for sync

    committed Feb 27, 2012
Commits on Apr 12, 2012
  1. @mludvig

    Merge pull request #40 from res0nat0r/bucket-locations

    Add all bucket endpoints to --help
    mludvig committed Apr 12, 2012
  2. @mludvig

    Merge pull request #46 from smcq/license-file

    adding LICENSE file containing GPL v2 text
    mludvig committed Apr 12, 2012
Commits on Mar 29, 2012
  1. @res0nat0r

    Added all bucket endpoints

    res0nat0r committed Mar 29, 2012
Commits on Mar 1, 2012
  1. finish merge

    committed Mar 1, 2012
  2. add --delay-updates option

    committed Mar 1, 2012
  3. @mludvig

    Merge pull request #32 from kellymclaughlin/check-for-empty-error-response-body
    
    Handle empty return bodies when processing S3 errors.
    mludvig committed Feb 29, 2012
Commits on Feb 29, 2012
  1. @kellymclaughlin

    Handle empty return bodies when processing S3 errors.

    Currently error commands that do not return a body cause
    s3cmd to output an ugly backtrace. This change checks to
    see if the data field of the response is non-empty before
    calling `getTreeFromXml` on it. An example of an offending
    command is using `s3cmd info` on a nonexistent object.
    kellymclaughlin committed Feb 29, 2012
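    The empty-body guard described above reduces to a check like this
    sketch. getTreeFromXml is s3cmd's XML helper; here it is passed in as
    a parameter (parse_xml) so the sketch stays self-contained, and the
    function name is illustrative.

    ```python
    def parse_error_body(data, parse_xml):
        """Parse a non-empty S3 error body; return None for an empty one
        (e.g. `s3cmd info` on a nonexistent object) instead of letting
        the XML parser raise an ugly backtrace."""
        if not data:
            return None
        return parse_xml(data)
    ```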
Commits on Feb 27, 2012
  1. Merge branch 'master' into merge

    committed Feb 27, 2012