Merge branch 'master' of git://github.com/s3tools/s3cmd

Conflicts:
	S3/S3.py
2 parents 07d6d87 + 19a529a, commit 9847f33cab85ea8931ff4d751bb4e133f5bee302, committed by ksperling on Jun 27, 2012
Showing with 609 additions and 62 deletions.
  1. +339 −0 LICENSE
  2. +7 −1 NEWS
  3. +13 −1 S3/CloudFront.py
  4. +4 −1 S3/Config.py
  5. +1 −1 S3/Exceptions.py
  6. +11 −10 S3/FileLists.py
  7. +113 −0 S3/MultiPart.py
  8. +1 −1 S3/PkgInfo.py
  9. +57 −20 S3/S3.py
  10. +3 −0 S3/S3Uri.py
  11. +13 −4 format-manpage.pl
  12. +1 −1 run-tests.py
  13. +20 −10 s3cmd
  14. +26 −12 s3cmd.1
339 LICENSE
@@ -0,0 +1,339 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Lesser General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License along
+ with this program; if not, write to the Free Software Foundation, Inc.,
+ 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) year name of author
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Lesser General
+Public License instead of this License.
8 NEWS
@@ -1,11 +1,17 @@
s3cmd 1.1.0 - ???
===========
+* MultiPart upload enabled for both [put] and [sync]. Default chunk
+ size is 15MB.
* CloudFront invalidation via [sync --cf-invalidate] and [cfinvalinfo].
* Increased socket_timeout from 10 secs to 5 mins.
* Added "Static WebSite" support [ws-create / ws-delete / ws-info]
(contributed by Jens Braeuer)
* Force MIME type with --mime-type=abc/xyz, also --guess-mime-type
- is no longer on by default.
+ is now on by default, -M is no longer shorthand for --guess-mime-type
+* Allow parameters in MIME types, for example:
+ --mime-type="text/plain; charset=utf-8"
+* MIME type can be guessed by python-magic which is a lot better than
+ relying on the extension. Contributed by Karsten Sperling.
* Support for environment variables as config values. For instance
in ~/.s3cmd put "access_key=$S3_ACCESS_KEY". Contributed by Ori Bar.
* Support for --configure checking access to a specific bucket instead
14 S3/CloudFront.py
@@ -553,7 +553,19 @@ def get_dist_name_for_bucket(self, uri):
response = self.GetList()
CloudFront.dist_list = {}
for d in response['dist_list'].dist_summs:
- CloudFront.dist_list[getBucketFromHostname(d.info['S3Origin']['DNSName'])[0]] = d.uri()
+ if d.info.has_key("S3Origin"):
+ CloudFront.dist_list[getBucketFromHostname(d.info['S3Origin']['DNSName'])[0]] = d.uri()
+ elif d.info.has_key("CustomOrigin"):
+ # Aral: This used to skip over distributions with CustomOrigin, however, we mustn't
+ # do this since S3 buckets that are set up as websites use custom origins.
+ # Thankfully, the custom origin URLs they use start with the URL of the
+ # S3 bucket. Here, we make use of this naming convention to support this use case.
+ distListIndex = getBucketFromHostname(d.info['CustomOrigin']['DNSName'])[0];
+ distListIndex = distListIndex[:len(uri.bucket())]
+ CloudFront.dist_list[distListIndex] = d.uri()
+ else:
+ # Aral: I'm not sure when this condition will be reached, but keeping it in there.
+ continue
debug("dist_list: %s" % CloudFront.dist_list)
try:
return CloudFront.dist_list[uri.bucket()]
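
Note on the CustomOrigin branch (a reviewer's sketch, not part of this commit): the hunk keys a distribution by the first len(bucket) characters of its origin name. The snippet below illustrates that truncation and one possible stricter check. Both `origin_key` and `origin_matches_bucket` are hypothetical helpers, the hostname is made up, and the sketch is written in Python 3 rather than the Python 2 used by s3cmd at the time.

```python
def origin_key(origin_name, bucket):
    """Truncate an origin name to the bucket's length, mirroring the hunk."""
    return origin_name[:len(bucket)]


def origin_matches_bucket(origin_name, bucket):
    """Stricter alternative (an assumption, not s3cmd behaviour):
    the bucket name must be followed by a '.' or be the whole name."""
    return origin_name == bucket or origin_name.startswith(bucket + ".")


# Made-up website-style origin for a bucket called "example-bucket".
origin = "example-bucket.s3-website-us-east-1.amazonaws.com"

print(origin_key(origin, "example-bucket"))
# Plain truncation also "matches" a shorter bucket that is only a prefix:
print(origin_key(origin, "example"))
# The stricter check rejects that case:
print(origin_matches_bucket(origin, "example"))
```

Plain truncation is enough when bucket names never share a prefix; the stricter check is one way to avoid attributing a distribution to the wrong bucket when they do.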
5 S3/Config.py
@@ -61,8 +61,10 @@ class Config(object):
use_https = False
bucket_location = "US"
default_mime_type = "binary/octet-stream"
- guess_mime_type = False
+ guess_mime_type = True
mime_type = ""
+ enable_multipart = True
+ multipart_chunk_size_mb = 15 # MB
# List of checks to be performed for 'sync'
sync_checks = ['size', 'md5'] # 'weak-timestamp'
# List of compiled REGEXPs
@@ -202,3 +204,4 @@ def dump(self, section, config):
for option in config.option_list():
self.stream.write("%s = %s\n" % (option, getattr(config, option)))
+# vim:et:ts=4:sts=4:ai
2 S3/Exceptions.py
@@ -44,7 +44,7 @@ def __init__(self, response):
if response.has_key("headers"):
for header in response["headers"]:
debug("HttpHeader: %s: %s" % (header, response["headers"][header]))
- if response.has_key("data"):
+ if response.has_key("data") and response["data"]:
tree = getTreeFromXml(response["data"])
error_node = tree
if not error_node.tag == "Error":
21 S3/FileLists.py
@@ -8,6 +8,7 @@
from S3Uri import S3Uri
from SortedDict import SortedDict
from Utils import *
+from Exceptions import ParameterError
from logging import debug, info, warning, error
@@ -23,18 +24,12 @@ def _fswalk_follow_symlinks(path):
If a recursive directory link is detected, emit a warning and skip.
'''
assert os.path.isdir(path) # only designed for directory argument
- walkdirs = set([path])
- targets = set()
+ walkdirs = [path]
for dirpath, dirnames, filenames in os.walk(path):
for dirname in dirnames:
current = os.path.join(dirpath, dirname)
- target = os.path.realpath(current)
if os.path.islink(current):
- if target in targets:
- warning("Skipping recursively symlinked directory %s" % dirname)
- else:
- walkdirs.add(current)
- targets.add(target)
+ walkdirs.append(current)
for walkdir in walkdirs:
for value in os.walk(walkdir):
yield value
@@ -300,8 +295,14 @@ def __direction_str(is_remote):
debug(u"XFER: %s (size mismatch: src=%s dst=%s)" % (file, src_list[file]['size'], dst_list[file]['size']))
attribs_match = False
- if attribs_match and 'md5' in cfg.sync_checks:
- ## ... same size, check MD5
+ ## Check MD5
+ compare_md5 = 'md5' in cfg.sync_checks
+ # Multipart-uploaded files don't have a valid MD5 sum - it ends with "...-NN"
+ if compare_md5:
+ if (src_remote == True and src_list[file]['md5'].find("-") >= 0) or (dst_remote == True and dst_list[file]['md5'].find("-") >= 0):
+ compare_md5 = False
+ info(u"Disabled MD5 check for %s" % file)
+ if attribs_match and compare_md5:
try:
if src_remote == False and dst_remote == True:
src_md5 = hash_file_md5(src_list[file]['full_name'])
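
Note on the MD5 check (a reviewer's sketch, not part of this commit): the hunk disables MD5 comparison when a remote "md5" value contains a dash, because an ETag of a multipart-uploaded object ends in "-NN" and is not the MD5 of the content. The sketch below swaps the hunk's `find("-")` test for a stricter regular expression; `is_multipart_etag` and `can_compare_md5` are hypothetical names and the ETag values are invented.

```python
import re

# 32 hex digits, a dash, then the part count; surrounding quotes are optional.
_MULTIPART_ETAG = re.compile(r'^"?[0-9a-fA-F]{32}-\d+"?$')


def is_multipart_etag(etag):
    """Return True if the ETag looks like one produced by a multipart upload."""
    return bool(_MULTIPART_ETAG.match(etag))


def can_compare_md5(src_md5, dst_md5, src_remote, dst_remote):
    """Only a remote side can carry a multipart ETag; skip MD5 if it does."""
    if src_remote and is_multipart_etag(src_md5):
        return False
    if dst_remote and is_multipart_etag(dst_md5):
        return False
    return True


plain = "9b2cf535f27731c974343645a3985328"
multipart = "9b2cf535f27731c974343645a3985328-3"
print(is_multipart_etag(plain), is_multipart_etag(multipart))
```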
113 S3/MultiPart.py
@@ -0,0 +1,113 @@
+## Amazon S3 Multipart upload support
+## Author: Jerome Leclanche <jerome.leclanche@gmail.com>
+## License: GPL Version 2
+
+import os
+from stat import ST_SIZE
+from logging import debug, info, warning, error
+from Utils import getTextFromXml, formatSize, unicodise
+from Exceptions import S3UploadError
+
+class MultiPartUpload(object):
+
+ MIN_CHUNK_SIZE_MB = 5 # 5MB
+ MAX_CHUNK_SIZE_MB = 5120 # 5GB
+ MAX_FILE_SIZE = 42949672960 # 5TB
+
+ def __init__(self, s3, file, uri, headers_baseline = {}):
+ self.s3 = s3
+ self.file = file
+ self.uri = uri
+ self.parts = {}
+ self.headers_baseline = headers_baseline
+ self.upload_id = self.initiate_multipart_upload()
+
+ def initiate_multipart_upload(self):
+ """
+ Begin a multipart upload
+ http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadInitiate.html
+ """
+ request = self.s3.create_request("OBJECT_POST", uri = self.uri, headers = self.headers_baseline, extra = "?uploads")
+ response = self.s3.send_request(request)
+ data = response["data"]
+ self.upload_id = getTextFromXml(data, "UploadId")
+ return self.upload_id
+
+ def upload_all_parts(self):
+ """
+ Execute a full multipart upload on a file
+ Returns the seq/etag dict
+ TODO use num_processes to thread it
+ """
+ if not self.upload_id:
+ raise RuntimeError("Attempting to use a multipart upload that has not been initiated.")
+
+ size_left = file_size = os.stat(self.file.name)[ST_SIZE]
+ self.chunk_size = self.s3.config.multipart_chunk_size_mb * 1024 * 1024
+ nr_parts = file_size / self.chunk_size + (file_size % self.chunk_size and 1)
+ debug("MultiPart: Uploading %s in %d parts" % (self.file.name, nr_parts))
+
+ seq = 1
+ while size_left > 0:
+ offset = self.chunk_size * (seq - 1)
+ current_chunk_size = min(file_size - offset, self.chunk_size)
+ size_left -= current_chunk_size
+ labels = {
+ 'source' : unicodise(self.file.name),
+ 'destination' : unicodise(self.uri.uri()),
+ 'extra' : "[part %d of %d, %s]" % (seq, nr_parts, "%d%sB" % formatSize(current_chunk_size, human_readable = True))
+ }
+ try:
+ self.upload_part(seq, offset, current_chunk_size, labels)
+ except:
+ error(u"Upload of '%s' part %d failed. Aborting multipart upload." % (self.file.name, seq))
+ self.abort_upload()
+ raise
+ seq += 1
+
+ debug("MultiPart: Upload finished: %d parts", seq - 1)
+
+ def upload_part(self, seq, offset, chunk_size, labels):
+ """
+ Upload a file chunk
+ http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadUploadPart.html
+ """
+ # TODO implement Content-MD5
+ debug("Uploading part %i of %r (%s bytes)" % (seq, self.upload_id, chunk_size))
+ headers = { "content-length": chunk_size }
+ query_string = "?partNumber=%i&uploadId=%s" % (seq, self.upload_id)
+ request = self.s3.create_request("OBJECT_PUT", uri = self.uri, headers = headers, extra = query_string)
+ response = self.s3.send_file(request, self.file, labels, offset = offset, chunk_size = chunk_size)
+ self.parts[seq] = response["headers"]["etag"]
+ return response
+
+ def complete_multipart_upload(self):
+ """
+ Finish a multipart upload
+ http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadComplete.html
+ """
+ debug("MultiPart: Completing upload: %s" % self.upload_id)
+
+ parts_xml = []
+ part_xml = "<Part><PartNumber>%i</PartNumber><ETag>%s</ETag></Part>"
+ for seq, etag in sorted(self.parts.items()): # S3 expects parts in ascending order
+ parts_xml.append(part_xml % (seq, etag))
+ body = "<CompleteMultipartUpload>%s</CompleteMultipartUpload>" % ("".join(parts_xml))
+
+ headers = { "content-length": len(body) }
+ request = self.s3.create_request("OBJECT_POST", uri = self.uri, headers = headers, extra = "?uploadId=%s" % (self.upload_id))
+ response = self.s3.send_request(request, body = body)
+
+ return response
+
+ def abort_upload(self):
+ """
+ Abort multipart upload
+ http://docs.amazonwebservices.com/AmazonS3/latest/API/index.html?mpUploadAbort.html
+ """
+ debug("MultiPart: Aborting upload: %s" % self.upload_id)
+ request = self.s3.create_request("OBJECT_DELETE", uri = self.uri, extra = "?uploadId=%s" % (self.upload_id))
+ response = self.s3.send_request(request)
+ return response
+
+# vim:et:ts=4:sts=4:ai
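
Note on the part arithmetic in upload_all_parts (a reviewer's sketch, not part of this commit): the loop derives each part's offset and size from the sequence number and the configured chunk size. The standalone version below shows the same arithmetic in Python 3, where integer division needs `//`. `count_parts` and `plan_parts` are hypothetical helper names.

```python
MB = 1024 * 1024


def count_parts(file_size, chunk_size):
    """Number of parts: full chunks plus one for any remainder."""
    return file_size // chunk_size + (1 if file_size % chunk_size else 0)


def plan_parts(file_size, chunk_size):
    """Return a list of (seq, offset, size) tuples covering the whole file."""
    parts = []
    seq = 1
    offset = 0
    while offset < file_size:
        size = min(chunk_size, file_size - offset)
        parts.append((seq, offset, size))
        offset += size
        seq += 1
    return parts


# A 35 MB file with the default 15 MB chunk size splits into three parts.
for seq, offset, size in plan_parts(35 * MB, 15 * MB):
    print("part %d: offset=%d size=%d" % (seq, offset, size))
```

Only the last part may be smaller than the chunk size, which is why the size is computed with `min()` rather than assumed.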
2 S3/PkgInfo.py
@@ -1,5 +1,5 @@
package = "s3cmd"
-version = "1.1.0-beta1"
+version = "1.1.0-beta3"
url = "http://s3tools.org"
license = "GPL version 2"
short_description = "Command line tool for managing Amazon S3 and CloudFront services"
77 S3/S3.py
@@ -20,11 +20,12 @@
from Utils import *
from SortedDict import SortedDict
+from AccessLog import AccessLog
+from ACL import ACL, GranteeLogDelivery
from BidirMap import BidirMap
from Config import Config
from Exceptions import *
-from ACL import ACL, GranteeLogDelivery
-from AccessLog import AccessLog
+from MultiPart import MultiPartUpload
from S3Uri import S3Uri
try:
@@ -36,7 +37,7 @@ def mime_magic_file(file):
return magic_.from_file(file)
def mime_magic_buffer(buffer):
return magic_.from_buffer(buffer)
- except AttributeError:
+ except (TypeError, AttributeError):
## Older python-magic versions
magic_ = magic.open(magic.MAGIC_MIME)
magic_.load()
@@ -52,16 +53,20 @@ def mime_magic(file):
else:
return (mime_magic_buffer(gzip.open(file).read(8192)), 'gzip')
-except ImportError:
+except ImportError, e:
+ if str(e).find("magic") >= 0:
+ magic_message = "Module python-magic is not available."
+ else:
+ magic_message = "Module python-magic can't be used (%s)." % e.message
+ magic_message += " Guessing MIME types based on file extensions."
magic_warned = False
def mime_magic(file):
global magic_warned
if (not magic_warned):
- warning("python-magic is not available, guessing MIME types based on file extensions only")
+ warning(magic_message)
magic_warned = True
return mimetypes.guess_type(file)[0]
-
__all__ = []
class S3Request(object):
def __init__(self, s3, method_string, resource, headers, params = {}):
@@ -123,15 +128,16 @@ class S3(object):
PUT = 0x02,
HEAD = 0x04,
DELETE = 0x08,
- MASK = 0x0F,
- )
+ POST = 0x10,
+ MASK = 0x1F,
+ )
targets = BidirMap(
SERVICE = 0x0100,
BUCKET = 0x0200,
OBJECT = 0x0400,
MASK = 0x0700,
- )
+ )
operations = BidirMap(
UNDFINED = 0x0000,
@@ -143,13 +149,14 @@ class S3(object):
OBJECT_GET = targets["OBJECT"] | http_methods["GET"],
OBJECT_HEAD = targets["OBJECT"] | http_methods["HEAD"],
OBJECT_DELETE = targets["OBJECT"] | http_methods["DELETE"],
+ OBJECT_POST = targets["OBJECT"] | http_methods["POST"],
)
codes = {
"NoSuchBucket" : "Bucket '%s' does not exist",
"AccessDenied" : "Access to bucket '%s' was denied",
"BucketAlreadyExists" : "Bucket '%s' already exists",
- }
+ }
## S3 sometimes sends HTTP-307 response
redir_map = {}
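
Note on the operation bitmasks (a reviewer's sketch, not part of this commit): an operation code is the bitwise OR of a target and an HTTP method, so adding POST = 0x10 also requires widening the method mask from 0x0F to 0x1F. The sketch below uses plain dicts in place of s3cmd's BidirMap; `make_operation` and `split_operation` are hypothetical helpers, and GET = 0x01 is assumed because that line falls outside the hunk.

```python
HTTP_METHODS = {"GET": 0x01,  # assumed value, not visible in the hunk
                "PUT": 0x02, "HEAD": 0x04, "DELETE": 0x08, "POST": 0x10}
TARGETS = {"SERVICE": 0x0100, "BUCKET": 0x0200, "OBJECT": 0x0400}
METHOD_MASK = 0x1F
TARGET_MASK = 0x0700


def make_operation(target, method):
    """Combine a target and an HTTP method into one operation code."""
    return TARGETS[target] | HTTP_METHODS[method]


def split_operation(op):
    """Recover (target, method) names from an operation code."""
    target_bits = op & TARGET_MASK
    method_bits = op & METHOD_MASK
    target = [k for k, v in TARGETS.items() if v == target_bits][0]
    method = [k for k, v in HTTP_METHODS.items() if v == method_bits][0]
    return target, method


op = make_operation("OBJECT", "POST")
print(hex(op), split_operation(op))
# With the old 0x0F mask the POST bit would be masked away entirely:
print(op & 0x0F)
```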
@@ -357,10 +364,12 @@ def object_put(self, filename, uri, extra_headers = None, extra_label = ""):
size = os.stat(filename)[ST_SIZE]
except (IOError, OSError), e:
raise InvalidFileError(u"%s: %s" % (unicodise(filename), e.strerror))
+
headers = SortedDict(ignore_case = True)
if extra_headers:
headers.update(extra_headers)
- headers["content-length"] = size
+
+ ## MIME-type handling
content_type = self.config.mime_type
content_encoding = None
if not content_type and self.config.guess_mime_type:
@@ -371,10 +380,24 @@ def object_put(self, filename, uri, extra_headers = None, extra_label = ""):
headers["content-type"] = content_type
if content_encoding is not None:
headers["content-encoding"] = content_encoding
+
+ ## Other Amazon S3 attributes
if self.config.acl_public:
headers["x-amz-acl"] = "public-read"
if self.config.reduced_redundancy:
headers["x-amz-storage-class"] = "REDUCED_REDUNDANCY"
+
+ ## Multipart decision
+ multipart = False
+ if self.config.enable_multipart:
+ if size > self.config.multipart_chunk_size_mb * 1024 * 1024:
+ multipart = True
+ if multipart:
+ # Multipart requests are quite different... drop here
+ return self.send_file_multipart(file, headers, uri, size)
+
+ ## Not multipart...
+ headers["content-length"] = size
request = self.create_request("OBJECT_PUT", uri = uri, headers = headers)
labels = { 'source' : unicodise(filename), 'destination' : unicodise(uri.uri()), 'extra' : extra_label }
response = self.send_file(request, file, labels)
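
Note on the multipart decision (a reviewer's sketch, not part of this commit): object_put switches to a multipart upload only when the feature is enabled and the file is strictly larger than one chunk. The same rule as a standalone Python 3 function, with the defaults taken from the Config.py hunk; `use_multipart` is a hypothetical name.

```python
def use_multipart(size, enable_multipart=True, multipart_chunk_size_mb=15):
    """Return True when a file of `size` bytes should be uploaded in parts."""
    threshold = multipart_chunk_size_mb * 1024 * 1024
    return bool(enable_multipart) and size > threshold


MB = 1024 * 1024
print(use_multipart(15 * MB))        # exactly one chunk: single PUT
print(use_multipart(15 * MB + 1))    # one byte over: multipart
print(use_multipart(100 * MB, enable_multipart=False))
```

A file that is exactly one chunk long still goes through the ordinary single-request PUT, since the comparison is `>` rather than `>=`.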
@@ -573,7 +596,9 @@ def send_request(self, request, body = None, retries = _max_retries):
for header in headers.keys():
headers[header] = str(headers[header])
conn = self.get_connection(resource['bucket'])
- conn.request(method_string, self.format_uri(resource), body, headers)
+ uri = self.format_uri(resource)
+ debug("Sending request method_string=%r, uri=%r, headers=%r, body=(%i bytes)" % (method_string, uri, headers, len(body or "")))
+ conn.request(method_string, uri, body, headers)
response = {}
http_response = conn.getresponse()
response["status"] = http_response.status
@@ -615,7 +640,7 @@ def send_request(self, request, body = None, retries = _max_retries):
return response
- def send_file(self, request, file, labels, throttle = 0, retries = _max_retries):
+ def send_file(self, request, file, labels, throttle = 0, retries = _max_retries, offset = 0, chunk_size = -1):
method_string, resource, headers = request.get_triplet()
size_left = size_total = headers.get("content-length")
if self.config.progress_meter:
@@ -638,15 +663,15 @@ def send_file(self, request, file, labels, throttle = 0, retries = _max_retries)
warning("Waiting %d sec..." % self._fail_wait(retries))
time.sleep(self._fail_wait(retries))
# Connection error -> same throttle value
- return self.send_file(request, file, labels, throttle, retries - 1)
+ return self.send_file(request, file, labels, throttle, retries - 1, offset, chunk_size)
else:
raise S3UploadError("Upload failed for: %s" % resource['uri'])
- file.seek(0)
+ file.seek(offset)
md5_hash = md5()
try:
while (size_left > 0):
#debug("SendFile: Reading up to %d bytes from '%s'" % (self.config.send_chunk, file.name))
- data = file.read(self.config.send_chunk)
+ data = file.read(min(self.config.send_chunk, size_left))
md5_hash.update(data)
conn.send(data)
if self.config.progress_meter:
@@ -675,7 +700,7 @@ def send_file(self, request, file, labels, throttle = 0, retries = _max_retries)
warning("Waiting %d sec..." % self._fail_wait(retries))
time.sleep(self._fail_wait(retries))
# Connection error -> same throttle value
- return self.send_file(request, file, labels, throttle, retries - 1)
+ return self.send_file(request, file, labels, throttle, retries - 1, offset, chunk_size)
else:
debug("Giving up on '%s' %s" % (file.name, e))
raise S3UploadError("Upload failed for: %s" % resource['uri'])
@@ -697,7 +722,7 @@ def send_file(self, request, file, labels, throttle = 0, retries = _max_retries)
redir_hostname = getTextFromXml(response['data'], ".//Endpoint")
self.set_hostname(redir_bucket, redir_hostname)
warning("Redirected to: %s" % (redir_hostname))
- return self.send_file(request, file, labels)
+ return self.send_file(request, file, labels, offset = offset, chunk_size = chunk_size)
# S3 from time to time doesn't send ETag back in a response :-(
# Force re-upload here.
@@ -720,7 +745,7 @@ def send_file(self, request, file, labels, throttle = 0, retries = _max_retries)
warning("Upload failed: %s (%s)" % (resource['uri'], S3Error(response)))
warning("Waiting %d sec..." % self._fail_wait(retries))
time.sleep(self._fail_wait(retries))
- return self.send_file(request, file, labels, throttle, retries - 1)
+ return self.send_file(request, file, labels, throttle, retries - 1, offset, chunk_size)
else:
warning("Too many failures. Giving up on '%s'" % (file.name))
raise S3UploadError
@@ -733,13 +758,25 @@ def send_file(self, request, file, labels, throttle = 0, retries = _max_retries)
warning("MD5 Sums don't match!")
if retries:
warning("Retrying upload of %s" % (file.name))
- return self.send_file(request, file, labels, throttle, retries - 1)
+ return self.send_file(request, file, labels, throttle, retries - 1, offset, chunk_size)
else:
warning("Too many failures. Giving up on '%s'" % (file.name))
raise S3UploadError
return response
+ def send_file_multipart(self, file, headers, uri, size):
+ chunk_size = self.config.multipart_chunk_size_mb * 1024 * 1024
+ timestamp_start = time.time()
+ upload = MultiPartUpload(self, file, uri, headers)
+ upload.upload_all_parts()
+ response = upload.complete_multipart_upload()
+ timestamp_end = time.time()
+ response["elapsed"] = timestamp_end - timestamp_start
+ response["size"] = size
+ response["speed"] = response["elapsed"] and float(response["size"]) / response["elapsed"] or float(-1)
+ return response
+
def recv_file(self, request, stream, labels, start_position = 0, retries = _max_retries):
method_string, resource, headers = request.get_triplet()
if self.config.progress_meter:
3 S3/S3Uri.py
@@ -40,6 +40,9 @@ def __str__(self):
def __unicode__(self):
return self.uri()
+ def __repr__(self):
+ return "<%s: %s>" % (self.__class__.__name__, self.__unicode__())
+
def public_url(self):
raise ValueError("This S3 URI does not have Anonymous URL representation")
17 format-manpage.pl
@@ -8,6 +8,7 @@
my $commands = "";
my $cfcommands = "";
+my $wscommands = "";
my $options = "";
while (<>) {
@@ -20,6 +21,8 @@
$cmd = $1;
if ($cmd =~ /^cf/) {
$cfcommands .= ".TP\n$cmdline\n$desc\n";
+ } elsif ($cmd =~ /^ws/) {
+ $wscommands .= ".TP\n$cmdline\n$desc\n";
} else {
$commands .= ".TP\n$cmdline\n$desc\n";
}
@@ -29,18 +32,20 @@
my ($opt, $desc);
while (<>) {
last if (/^\s*$/);
- $_ =~ s/\s*(.*?)\s*$/$1/;
+ $_ =~ s/(.*?)\s*$/$1/;
$desc = "";
$opt = "";
- if (/^(-.*)/) {
+ if (/^ (-.*)/) {
$opt = $1;
if ($opt =~ / /) {
($opt, $desc) = split(/\s\s+/, $opt, 2);
}
- $opt =~ s/(-[^ ,=]+)/\\fB$1\\fR/g;
+ $opt =~ s/(-[^ ,=\.]+)/\\fB$1\\fR/g;
$opt =~ s/-/\\-/g;
$options .= ".TP\n$opt\n";
} else {
+ $_ =~ s/\s*(.*?)\s*$/$1/;
+ $_ =~ s/(--[^ ,=\.]+)/\\fB$1\\fR/g;
$desc .= $_;
}
if ($desc) {
@@ -71,6 +76,10 @@
$commands
.PP
+Commands for static WebSites configuration
+$wscommands
+
+.PP
Commands for CloudFront management
$cfcommands
@@ -179,7 +188,7 @@
Report bugs to
.I s3tools\\-bugs\@lists.sourceforge.net
.SH COPYRIGHT
-Copyright \\(co 2007,2008,2009,2010,2011 Michal Ludvig <http://www.logix.cz/michal>
+Copyright \\(co 2007,2008,2009,2010,2011,2012 Michal Ludvig <http://www.logix.cz/michal>
.br
This is free software. You may redistribute copies of it under the terms of
the GNU General Public License version 2 <http://www.gnu.org/licenses/gpl.html>.
2 run-tests.py
@@ -432,7 +432,7 @@ def pbucket(tail):
## ====== Get multiple files
test_s3cmd("Get multiple files", ['get', '%s/xyz/etc2/Logo.PNG' % pbucket(1), '%s/xyz/etc/AtomicClockRadio.ttf' % pbucket(1), 'testsuite-out'],
retcode = 1,
- must_find = [ 'Destination must be a directory when downloading multiple sources.' ])
+ must_find = [ 'Destination must be a directory or stdout when downloading multiple sources.' ])
## ====== Make dst dir for get
30 s3cmd
@@ -1,4 +1,4 @@
-#!/usr/bin/env python
+#!/usr/bin/python
## Amazon S3 manager
## Author: Michal Ludvig <michal@logix.cz>
@@ -510,11 +510,11 @@ def subcmd_cp_mv(args, process_fce, action_str, message):
for key in remote_list:
remote_list[key]['dest_name'] = destination_base + key
else:
- key = remote_list.keys()[0]
- if destination_base.endswith("/"):
- remote_list[key]['dest_name'] = destination_base + key
- else:
- remote_list[key]['dest_name'] = destination_base
+ for key in remote_list:
+ if destination_base.endswith("/"):
+ remote_list[key]['dest_name'] = destination_base + key
+ else:
+ remote_list[key]['dest_name'] = destination_base
if cfg.dry_run:
for key in exclude_list:
@@ -1420,7 +1420,7 @@ def format_commands(progname, commands_list):
class OptionMimeType(Option):
def check_mimetype(option, opt, value):
- if re.compile("^[a-z0-9]+/[a-z0-9+\.-]+$", re.IGNORECASE).match(value):
+ if re.compile("^[a-z0-9]+/[a-z0-9+\.-]+(;.*)?$", re.IGNORECASE).match(value):
return value
raise OptionValueError("option %s: invalid MIME-Type format: %r" % (opt, value))
@@ -1511,14 +1511,14 @@ def main():
optparser.add_option( "--rinclude", dest="rinclude", action="append", metavar="REGEXP", help="Same as --include but uses REGEXP (regular expression) instead of GLOB")
optparser.add_option( "--rinclude-from", dest="rinclude_from", action="append", metavar="FILE", help="Read --rinclude REGEXPs from FILE")
- optparser.add_option( "--bucket-location", dest="bucket_location", help="Datacentre to create bucket in. As of now the datacenters are: US (default), EU, us-west-1, and ap-southeast-1")
+ optparser.add_option( "--bucket-location", dest="bucket_location", help="Datacentre to create bucket in. As of now the datacenters are: US (default), EU, ap-northeast-1, ap-southeast-1, sa-east-1, us-west-1 and us-west-2")
optparser.add_option( "--reduced-redundancy", "--rr", dest="reduced_redundancy", action="store_true", help="Store object with 'Reduced redundancy'. Lower per-GB price. [put, cp, mv]")
optparser.add_option( "--access-logging-target-prefix", dest="log_target_prefix", help="Target prefix for access logs (S3 URI) (for [cfmodify] and [accesslog] commands)")
optparser.add_option( "--no-access-logging", dest="log_target_prefix", action="store_false", help="Disable access logging (for [cfmodify] and [accesslog] commands)")
optparser.add_option( "--default-mime-type", dest="default_mime_type", action="store_true", help="Default MIME-type for stored objects. Application default is binary/octet-stream.")
- optparser.add_option("-M", "--guess-mime-type", dest="guess_mime_type", action="store_true", help="Guess MIME-type of files by their extension. Fall back to default MIME-Type as specified by --default-mime-type option")
+ optparser.add_option( "--guess-mime-type", dest="guess_mime_type", action="store_true", help="Guess MIME-type of files by their extension or mime magic. Fall back to default MIME-Type as specified by --default-mime-type option")
optparser.add_option( "--no-guess-mime-type", dest="guess_mime_type", action="store_false", help="Don't guess MIME-type and use the default type instead.")
optparser.add_option("-m", "--mime-type", dest="mime_type", type="mimetype", metavar="MIME/TYPE", help="Force MIME-type. Override both --default-mime-type and --guess-mime-type.")
@@ -1527,6 +1527,9 @@ def main():
optparser.add_option( "--encoding", dest="encoding", metavar="ENCODING", help="Override autodetected terminal and filesystem encoding (character set). Autodetected: %s" % preferred_encoding)
optparser.add_option( "--verbatim", dest="urlencoding_mode", action="store_const", const="verbatim", help="Use the S3 name as given on the command line. No pre-processing, encoding, etc. Use with caution!")
+ optparser.add_option( "--disable-multipart", dest="enable_multipart", action="store_false", help="Disable multipart upload on files bigger than --multipart-chunk-size-mb")
+ optparser.add_option( "--multipart-chunk-size-mb", dest="multipart_chunk_size_mb", type="int", action="store", metavar="SIZE", help="Size of each chunk of a multipart upload. Files bigger than SIZE are automatically uploaded as multithreaded-multipart, smaller files are uploaded using the traditional method. SIZE is in Mega-Bytes, default chunk size is %defaultMB, minimum allowed chunk size is 5MB, maximum is 5GB.")
+
optparser.add_option( "--list-md5", dest="list_md5", action="store_true", help="Include MD5 sums in bucket listings (only for 'ls' command).")
optparser.add_option("-H", "--human-readable-sizes", dest="human_readable_sizes", action="store_true", help="Print sizes in human readable form (eg 1kB instead of 1234).")
@@ -1631,7 +1634,7 @@ def main():
if options.check_md5 == False:
try:
cfg.sync_checks.remove("md5")
- except:
+ except Exception:
pass
if options.check_md5 == True and cfg.sync_checks.count("md5") == 0:
cfg.sync_checks.append("md5")
@@ -1650,6 +1653,12 @@ def main():
cfg.update_option("enable", options.enable)
cfg.update_option("acl_public", options.acl_public)
+ ## Check multipart chunk constraints
+ if cfg.multipart_chunk_size_mb < MultiPartUpload.MIN_CHUNK_SIZE_MB:
+ raise ParameterError("Chunk size %d MB is too small, must be >= %d MB. Please adjust --multipart-chunk-size-mb" % (cfg.multipart_chunk_size_mb, MultiPartUpload.MIN_CHUNK_SIZE_MB))
+ if cfg.multipart_chunk_size_mb > MultiPartUpload.MAX_CHUNK_SIZE_MB:
+ raise ParameterError("Chunk size %d MB is too large, must be <= %d MB. Please adjust --multipart-chunk-size-mb" % (cfg.multipart_chunk_size_mb, MultiPartUpload.MAX_CHUNK_SIZE_MB))
+
## CloudFront's cf_enable and Config's enable share the same --enable switch
options.cf_enable = options.enable
@@ -1787,6 +1796,7 @@ if __name__ == '__main__':
from S3.CloudFront import Cmd as CfCmd
from S3.CloudFront import CloudFront
from S3.FileLists import *
+ from S3.MultiPart import MultiPartUpload
main()
sys.exit(0)
38 s3cmd.1
@@ -65,6 +65,10 @@ Sign arbitrary string using the secret key
.TP
s3cmd \fBfixbucket\fR \fIs3://BUCKET[/PREFIX]\fR
Fix invalid file names in a bucket
+
+
+.PP
+Commands for static WebSites configuration
.TP
s3cmd \fBws-create\fR \fIs3://BUCKET\fR
Create Website from bucket
@@ -112,7 +116,7 @@ show this help message and exit
.TP
\fB\-\-configure\fR
Invoke interactive (re)configuration tool. Optionally
-use as '--configure s3://come-bucket' to test access
+use as '\fB--configure\fR s3://come-bucket' to test access
to a specific bucket instead of attempting to list
them all.
.TP
@@ -206,8 +210,7 @@ Read --rexclude REGEXPs from FILE
\fB\-\-include\fR=GLOB
Filenames and paths matching GLOB will be included
even if previously excluded by one of
-.TP
-\fB\-\-(r)exclude(\-from)\fR patterns
+\fB--(r)exclude(-from)\fR patterns
.TP
\fB\-\-include\-from\fR=FILE
Read --include GLOBs from FILE
@@ -240,19 +243,18 @@ commands)
Default MIME-type for stored objects. Application
default is binary/octet-stream.
.TP
-\fB\-M\fR, \fB\-\-guess\-mime\-type\fR
-Guess MIME-type of files by their extension. Fall back
-to default MIME-Type as specified by --default-mime-
-type option
+\fB\-\-guess\-mime\-type\fR
+Guess MIME-type of files by their extension or mime
+magic. Fall back to default MIME-Type as specified by
+\fB--default-mime-type\fR option
.TP
\fB\-\-no\-guess\-mime\-type\fR
Don't guess MIME-type and use the default type
instead.
.TP
\fB\-m\fR MIME/TYPE, \fB\-\-mime\-type\fR=MIME/TYPE
-Force MIME-type. Override both --default-mime-type and
-.TP
-\fB\-\-guess\-mime\-type.\fR
+Force MIME-type. Override both \fB--default-mime-type\fR and
+\fB--guess-mime-type\fR.
.TP
\fB\-\-add\-header\fR=NAME:VALUE
Add a given HTTP header to the upload request. Can be
@@ -268,6 +270,18 @@ Override autodetected terminal and filesystem encoding
Use the S3 name as given on the command line. No pre-
processing, encoding, etc. Use with caution!
.TP
+\fB\-\-disable\-multipart\fR
+Disable multipart upload on files bigger than
+\fB--multipart-chunk-size-mb\fR
+.TP
+\fB\-\-multipart\-chunk\-size\-mb\fR=SIZE
+Size of each chunk of a multipart upload. Files bigger
+than SIZE are automatically uploaded as multithreaded-
+multipart, smaller files are uploaded using the
+traditional method. SIZE is in Mega-Bytes, default
+chunk size is noneMB, minimum allowed chunk size is
+5MB, maximum is 5GB.
+.TP
\fB\-\-list\-md5\fR
Include MD5 sums in bucket listings (only for 'ls'
command).
@@ -326,7 +340,7 @@ Enable verbose output.
Enable debug output.
.TP
\fB\-\-version\fR
-Show s3cmd version (1.1.0-beta1) and exit.
+Show s3cmd version (1.1.0-beta3) and exit.
.TP
\fB\-F\fR, \fB\-\-follow\-symlinks\fR
Follow symbolic links as if they are regular files
@@ -427,7 +441,7 @@ Prefered way to get support is our mailing list:
Report bugs to
.I s3tools\-bugs@lists.sourceforge.net
.SH COPYRIGHT
-Copyright \(co 2007,2008,2009,2010,2011 Michal Ludvig <http://www.logix.cz/michal>
+Copyright \(co 2007,2008,2009,2010,2011,2012 Michal Ludvig <http://www.logix.cz/michal>
.br
This is free software. You may redistribute copies of it under the terms of
the GNU General Public License version 2 <http://www.gnu.org/licenses/gpl.html>.
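
The S3/S3.py hunks in this commit decide on multipart when `size > multipart_chunk_size_mb * 1024 * 1024` and give `send_file` new `offset` and `chunk_size` arguments. The actual splitting is done in S3/MultiPart.py, which is not part of this excerpt, so the following is only a minimal sketch of how such (offset, length) pairs could be derived. The function name `plan_parts` and the two constants are hypothetical; the limits are taken from the `--multipart-chunk-size-mb` help text (5MB minimum, 5GB maximum), not from the real `MultiPartUpload` class.

```python
# Hypothetical limits, read off the --multipart-chunk-size-mb help text.
MIN_CHUNK_SIZE_MB = 5
MAX_CHUNK_SIZE_MB = 5 * 1024  # "5GB" expressed in MB (assumption)


def plan_parts(size, chunk_size_mb, enable_multipart=True):
    """Return a list of (offset, length) tuples, one per part.

    An empty list means "do not use multipart": either it is disabled
    or the file is not larger than one chunk, mirroring the
    `size > chunk_size` test in object_put().
    """
    if not MIN_CHUNK_SIZE_MB <= chunk_size_mb <= MAX_CHUNK_SIZE_MB:
        raise ValueError("chunk size %d MB is out of range" % chunk_size_mb)

    chunk_size = chunk_size_mb * 1024 * 1024
    if not enable_multipart or size <= chunk_size:
        return []

    parts = []
    offset = 0
    while offset < size:
        # The last part is usually shorter than chunk_size.
        length = min(chunk_size, size - offset)
        parts.append((offset, length))
        offset += length
    return parts


if __name__ == "__main__":
    mb = 1024 * 1024
    for offset, length in plan_parts(12 * mb, 5):
        print("offset=%d length=%d" % (offset, length))
```

Each pair would correspond to one call along the lines of `send_file(..., offset = offset, chunk_size = length)`; how the real code sets the per-part `content-length` header is not visible in this excerpt.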
