
add --files-from=FILE to allow transfer of select files only #116

Merged
merged 4 commits into from about 1 year ago

2 participants

Matt Domsch Michal Ludvig
Matt Domsch
Collaborator

This solves the change of behavior introduced by processing
excludes/includes during os.walk(), where previously:

s3cmd sync --exclude='*' --include='*.gpg'

would walk the whole tree and transfer only the files named *.gpg.

Since the change to os.walk(), the exclude '*' matches everything, and
nothing is transferred.

This patch introduces --files-from=FILE to match rsync behaviour,
where the list of files to transfer (local to remote) is taken not
from an os.walk(), but from the explicit list in FILE.

The equivalent for remote to local, and remote to remote, is not yet
implemented.
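The core transformation the patch describes — turning an explicit, flat list of file names into the same (dirname, dirs, files) tuples that os.walk() would have produced, so downstream code needs no changes — can be sketched in modern Python (function and parameter names here are illustrative, not s3cmd's actual API):

```python
import os

def filelist_from_names(names, local_path="."):
    """Group a flat list of relative paths into os.walk()-style tuples."""
    grouped = {}
    for line in names:
        # Anchor each entry under the source directory, as the patch does
        # with os.path.normpath(os.path.join(local_path, line)).
        path = os.path.normpath(os.path.join(local_path, line.strip()))
        grouped.setdefault(os.path.dirname(path), []).append(os.path.basename(path))
    # os.walk() yields (dirpath, dirnames, filenames); dirnames stays empty
    # because only the explicitly listed files should be considered.
    return [(d, [], sorted(grouped[d])) for d in sorted(grouped)]

print(filelist_from_names(["a/x.gpg", "a/y.gpg", "b/z.gpg"], "src"))
# [('src/a', [], ['x.gpg', 'y.gpg']), ('src/b', [], ['z.gpg'])]
```

Keeping the os.walk() tuple shape is what lets fetch_local_list() consume either source interchangeably.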

mdomsch added some commits
Matt Domsch mdomsch add --files-from=FILE to allow transfer of select files only
This solves the change of behavior introduced by processing
excludes/includes during os.walk(), where previously:

s3cmd sync --exclude='*' --include='*.gpg'

would walk the whole tree and transfer only the files named *.gpg.

Since the change to os.walk(), the exclude '*' matches everything, and
nothing is transferred.

This patch introduces --files-from=FILE to match rsync behaviour,
where the list of files to transfer (local to remote) is taken not
from an os.walk(), but from the explicit list in FILE.

The equivalent for remote to local, and remote to remote, is not yet
implemented.
3ce5e98
Matt Domsch mdomsch accept --files-from=- to read from stdin
This allows shell syntax:

find . -name \*.gpg | s3cmd sync --files-from=- src dst

to take the list of files to transfer from stdin.

Be careful: when used with a --delete option, files on the remote side
not listed on stdin will be deleted too.
b76c5b3
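The '-' convention above — treat '-' as "read names from stdin", anything else as a filename, and close only handles we opened ourselves — can be sketched like this (a minimal illustration, not s3cmd's actual code; the stdin parameter exists only to make the sketch testable):

```python
import sys

def iter_names(sources, stdin=None):
    """Yield non-empty, stripped lines from each source; '-' means stdin."""
    stdin = stdin if stdin is not None else sys.stdin
    for fname in sources:
        f = stdin if fname == "-" else open(fname, "r")
        try:
            for line in f:
                line = line.strip()
                if line:
                    yield line
        finally:
            # Never close stdin; only close files we opened here.
            if f is not stdin:
                f.close()
```

With this shape, a pipeline like `find . -name '*.gpg' | s3cmd sync --files-from=- src dst` simply feeds stdin through the same path as a named file.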
Matt Domsch
Collaborator

By request, I added a second patch that accepts '-' to mean read from stdin.

mdomsch added some commits
Matt Domsch mdomsch add --files-from to manpage bc547d1
Matt Domsch mdomsch use --files-from only on local source (not remote, not dest)
The restriction to use --files-from on the source and not the
destination list comes from the equivalent behavior in rsync.

The restriction on not using it for remote sources is only because I
haven't figured out the best way to handle that.  That may be added in
the future.
ba412e2
Matt Domsch mdomsch referenced this pull request from a commit in mdomsch/s3cmd
Matt Domsch mdomsch Merge branch 'files-from' into merge, pull request #116 6a82024
Michal Ludvig mludvig merged commit ba412e2 into from
Michal Ludvig mludvig closed this

Showing 4 unique commits by 2 authors.

Feb 19, 2013
Matt Domsch mdomsch add --files-from=FILE to allow transfer of select files only 3ce5e98
Feb 20, 2013
Matt Domsch mdomsch accept --files-from=- to read from stdin b76c5b3
Feb 21, 2013
Matt Domsch mdomsch add --files-from to manpage bc547d1
Matt Domsch mdomsch use --files-from only on local source (not remote, not dest) ba412e2

Showing 4 changed files with 59 additions and 8 deletions.

  1. +1 0  S3/Config.py
  2. +48 5 S3/FileLists.py
  3. +6 3 s3cmd
  4. +4 0 s3cmd.1
1  S3/Config.py
@@ -92,6 +92,7 @@ class Config(object):
92 92 website_error = ""
93 93 website_endpoint = "http://%(bucket)s.s3-website-%(location)s.amazonaws.com/"
94 94 additional_destinations = []
  95 + files_from = []
95 96 cache_file = ""
96 97 add_headers = ""
97 98
53 S3/FileLists.py
@@ -14,6 +14,7 @@
14 14 from logging import debug, info, warning, error
15 15
16 16 import os
  17 +import sys
17 18 import glob
18 19 import copy
19 20
@@ -140,7 +141,45 @@ def handle_exclude_include_walk(root, dirs, files):
140 141 else:
141 142 debug(u"PASS: %r" % (file))
142 143
143   -def fetch_local_list(args, recursive = None):
  144 +
  145 +def _get_filelist_from_file(cfg, local_path):
  146 + def _append(d, key, value):
  147 + if key not in d:
  148 + d[key] = [value]
  149 + else:
  150 + d[key].append(value)
  151 +
  152 + filelist = {}
  153 + for fname in cfg.files_from:
  154 + if fname == u'-':
  155 + f = sys.stdin
  156 + else:
  157 + try:
  158 + f = open(fname, 'r')
  159 + except IOError, e:
  160 + warning(u"--files-from input file %s could not be opened for reading (%s), skipping." % (fname, e.strerror))
  161 + continue
  162 +
  163 + for line in f:
  164 + line = line.strip()
  165 + line = os.path.normpath(os.path.join(local_path, line))
  166 + dirname = os.path.dirname(line)
  167 + basename = os.path.basename(line)
  168 + _append(filelist, dirname, basename)
  169 + if f != sys.stdin:
  170 + f.close()
  171 +
  172 + # reformat to match os.walk()
  173 + result = []
  174 + keys = filelist.keys()
  175 + keys.sort()
  176 + for key in keys:
  177 + values = filelist[key]
  178 + values.sort()
  179 + result.append((key, [], values))
  180 + return result
  181 +
  182 +def fetch_local_list(args, is_src = False, recursive = None):
144 183 def _get_filelist_local(loc_list, local_uri, cache):
145 184 info(u"Compiling list of local files...")
146 185
@@ -156,11 +195,15 @@ def _get_filelist_local(loc_list, local_uri, cache):
156 195 if local_uri.isdir():
157 196 local_base = deunicodise(local_uri.basename())
158 197 local_path = deunicodise(local_uri.path())
159   - if cfg.follow_symlinks:
160   - filelist = _fswalk_follow_symlinks(local_path)
  198 + if is_src and len(cfg.files_from):
  199 + filelist = _get_filelist_from_file(cfg, local_path)
  200 + single_file = False
161 201 else:
162   - filelist = _fswalk_no_symlinks(local_path)
163   - single_file = False
  202 + if cfg.follow_symlinks:
  203 + filelist = _fswalk_follow_symlinks(local_path)
  204 + else:
  205 + filelist = _fswalk_no_symlinks(local_path)
  206 + single_file = False
164 207 else:
165 208 local_base = ""
166 209 local_path = deunicodise(local_uri.dirname())
9 s3cmd
@@ -271,7 +271,7 @@ def cmd_object_put(args):
271 271 if len(args) == 0:
272 272 raise ParameterError("Nothing to upload. Expecting a local file or directory.")
273 273
274   - local_list, single_file_local = fetch_local_list(args)
  274 + local_list, single_file_local = fetch_local_list(args, is_src = True)
275 275
276 276 local_list, exclude_list = filter_exclude_include(local_list)
277 277
@@ -717,7 +717,7 @@ def cmd_sync_remote2local(args):
717 717 s3 = S3(Config())
718 718
719 719 destination_base = args[-1]
720   - local_list, single_file_local = fetch_local_list(destination_base, recursive = True)
  720 + local_list, single_file_local = fetch_local_list(destination_base, is_src = False, recursive = True)
721 721 remote_list = fetch_remote_list(args[:-1], recursive = True, require_attribs = True)
722 722
723 723 local_count = len(local_list)
@@ -1136,7 +1136,7 @@ def cmd_sync_local2remote(args):
1136 1136 error(u"or disable encryption with --no-encrypt parameter.")
1137 1137 sys.exit(1)
1138 1138
1139   - local_list, single_file_local = fetch_local_list(args[:-1], recursive = True)
  1139 + local_list, single_file_local = fetch_local_list(args[:-1], is_src = True, recursive = True)
1140 1140
1141 1141 destinations = [args[-1]]
1142 1142 if cfg.additional_destinations:
@@ -1738,6 +1738,7 @@ def main():
1738 1738 optparser.add_option( "--rinclude", dest="rinclude", action="append", metavar="REGEXP", help="Same as --include but uses REGEXP (regular expression) instead of GLOB")
1739 1739 optparser.add_option( "--rinclude-from", dest="rinclude_from", action="append", metavar="FILE", help="Read --rinclude REGEXPs from FILE")
1740 1740
  1741 + optparser.add_option( "--files-from", dest="files_from", action="append", metavar="FILE", help="Read list of source-file names from FILE. Use - to read from stdin.")
1741 1742 optparser.add_option( "--bucket-location", dest="bucket_location", help="Datacentre to create bucket in. As of now the datacenters are: US (default), EU, ap-northeast-1, ap-southeast-1, sa-east-1, us-west-1 and us-west-2")
1742 1743 optparser.add_option( "--reduced-redundancy", "--rr", dest="reduced_redundancy", action="store_true", help="Store object with 'Reduced redundancy'. Lower per-GB price. [put, cp, mv]")
1743 1744
@@ -1910,6 +1911,8 @@ def main():
1910 1911
1911 1912 if options.additional_destinations:
1912 1913 cfg.additional_destinations = options.additional_destinations
  1914 + if options.files_from:
  1915 + cfg.files_from = options.files_from
1913 1916
1914 1917 ## Set output and filesystem encoding for printing out filenames.
1915 1918 sys.stdout = codecs.getwriter(cfg.encoding)(sys.stdout, "replace")
4 s3cmd.1
@@ -222,6 +222,10 @@ Same as --include but uses REGEXP (regular expression) instead of GLOB
222 222 \fB\-\-rinclude\-from\fR=FILE
223 223 Read --rinclude REGEXPs from FILE
224 224 .TP
  225 +\fB\-\-files\-from\fR=FILE
  226 +Read list of source-file names from FILE. Use - to read from stdin.
  227 +May be repeated.
  228 +.TP
225 229 \fB\-\-bucket\-location\fR=BUCKET_LOCATION
226 230 Datacentre to create bucket in. As of now the datacenters are: US (default), EU, ap-northeast-1, ap-southeast-1, sa-east-1, us-west-1 and us-west-2
227 231 .TP
