Fix for issue #80 #81

Closed
wants to merge 3 commits into from

3 participants: Sumit Kumar, Matt Domsch, Michal Ludvig
Sumit Kumar

fixes #80

This fix makes every download write to a temporary file named .s3cmd.XXXXX.tmp in the same directory as the target file. Once the download is complete, the temporary file is renamed to the actual destination. The rename is atomic, so any parallel thread or process sees only fully downloaded data, provided it skips files matching the .s3cmd.XXXXX.tmp pattern while walking the data directory.
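The temp-file-then-rename approach described above can be sketched in isolation as follows. This is a minimal Python 3 sketch, not the patch itself: `download_atomically` and the `fetch` callback are hypothetical names standing in for the s3cmd download call, and the cleanup on failure is an added assumption that the description does not mention. It uses `os.replace` rather than `os.rename` so that an existing destination is overwritten on every platform.

```python
import os
import tempfile


def download_atomically(fetch, dst_file):
    """Write to a hidden temp file next to dst_file, then rename it into place.

    `fetch` is a hypothetical callback that writes the payload to a
    binary stream; it stands in for the real download call.
    """
    dst_dir = os.path.dirname(os.path.abspath(dst_file))
    # The temp file lives in the destination directory, so the final
    # rename stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(suffix=".tmp", prefix=".s3cmd.", dir=dst_dir)
    try:
        with os.fdopen(fd, "wb") as stream:
            fetch(stream)
        # Readers see either the old destination or the complete new file.
        os.replace(tmp_name, dst_file)
    except BaseException:
        # Assumption: a failed download should not leave a partial temp file.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return dst_file
```

A call such as `download_atomically(lambda s: s.write(data), "/data/object.bin")` leaves either no file or the complete file at the destination. Note that `tempfile.mkstemp` creates the file readable and writable only by the owner, so a caller that needs other permissions has to set them explicitly.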

Sumit Kumar added some commits October 02, 2012
Creates checkpoint files of type .s3cmd.XXXX.tmp while downloading from
S3 in the same directory as of the local destination. This can be useful
for checkpointing state of the downloaded data and then processing on a
checkpointed state itself. This is important for synchronized processing
of data downloaded from S3 synchronously (so that processing doesn't
fail on incomplete data)
9fb5900
removed possibility of creating a 0 byte file if python process is
killed abruptly
67268a2
formatting
5c6dc67
Matt Domsch
Collaborator

I manually merged this (hand-applied each line of diff) to my 'merge' branch, given this portion of the code has changed quite a bit since the last formal release your patch was based upon. You still got credit in the git log.

Thanks!

Michal Ludvig
Owner

Merged through Matt's tree.

Michal Ludvig mludvig closed this March 08, 2013
Kazuhiro Suzuki ksauzz referenced this pull request from a commit in ksauzz/s3cmd December 06, 2012
[sync] download files to a temporary filename, then rename
This fix makes all the downloads happen to temporary files of type
.s3cmd.XXXXX.tmp in the same folder as the target file's. Once the
download is complete, the file is renamed to the actual
destination. This renaming is atomic in nature; hence any parallel
thread or process could work on fully downloaded data (by filtering
all files matching .s3cmd.XXXXX.tmp pattern while walking the data
directory).

s3tools#81

Patch manually applied by Matt Domsch because this portion of the code has
changed more than pulling or rebasing could handle.
eecbcba
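On the consumer side, the filtering step mentioned in the commit message could look roughly like this. It is a sketch under assumptions: `completed_files` is a hypothetical helper, not part of s3cmd, and the glob `.s3cmd.*.tmp` is taken to match the `.s3cmd.XXXXX.tmp` names produced by the patch.

```python
import fnmatch
import os

# Glob for the in-progress download files named in the commit message.
TEMP_PATTERN = ".s3cmd.*.tmp"


def completed_files(root):
    """Yield paths under root, skipping in-progress download temp files."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if fnmatch.fnmatchcase(name, TEMP_PATTERN):
                continue  # still being downloaded
            yield os.path.join(dirpath, name)
```

A processing job would iterate over `completed_files(data_dir)` instead of walking the directory directly, so it never opens a partially written object.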

Showing 3 unique commits by 1 author.

Oct 02, 2012: 9fb5900
Oct 20, 2012: 67268a2, 5c6dc67

Showing 1 changed file with 9 additions and 8 deletions.

s3cmd
@@ -23,6 +23,7 @@ import locale
 import subprocess
 import htmlentitydefs
 import socket
+import tempfile
 
 from copy import copy
 from optparse import OptionParser, Option, OptionValueError, IndentedHelpFormatter
@@ -735,17 +736,17 @@ def cmd_sync_remote2local(args):
                 warning(u"%s: destination directory not writable: %s" % (file, dst_dir))
                 continue
             try:
-                open_flags = os.O_CREAT
-                open_flags |= os.O_TRUNC
-                # open_flags |= os.O_EXCL
-
                 debug(u"dst_file=%s" % unicodise(dst_file))
-                # This will have failed should the file exist
-                os.close(os.open(dst_file, open_flags))
-                # Yeah I know there is a race condition here. Sadly I don't know how to open() in exclusive mode.
-                dst_stream = open(dst_file, "wb")
+                # create temporary files (of type .s3cmd.XXXX.tmp) in the same directory
+                # for downloading and then rename once downloaded
+                chkptfd, chkptfname = tempfile.mkstemp(".tmp",".s3cmd.",os.path.dirname(dst_file))
+                debug(u"created chkptfname=%s" % unicodise(chkptfname))
+                dst_stream = os.fdopen(chkptfd, "wb")
                 response = s3.object_get(uri, dst_stream, extra_label = seq_label)
                 dst_stream.close()
+                # download completed, rename the file to destination
+                os.rename(chkptfname, dst_file)
+                debug(u"renamed chkptfname=%s to dst_file=%s" % (unicodise(chkptfname), unicodise(dst_file)))
                 if response['headers'].has_key('x-amz-meta-s3cmd-attrs') and cfg.preserve_attrs:
                     attrs = _parse_attrs_header(response['headers']['x-amz-meta-s3cmd-attrs'])
                     if attrs.has_key('mode'):