I used s3cmd sync to upload Fedora and EPEL content into a bucket. So far so good. However, on subsequent runs of s3cmd sync, it fails:
INFO: Summary: 113117 local files to upload, 192 remote files to delete
ERROR: no element found: line 1, column 0
ERROR: Parameter problem: Bucket contains invalid filenames. Please run: s3cmd fixbucket s3://your-bucket/
I traced this down to failure to delete a file in s3cmd:
for key in remote_list:
    uri = S3Uri(remote_list[key]['object_uri_str'])
    output(u"deleted: '%s'" % uri)
In this case, key contains the file name (e.g. 'bonnie  -1.0.1.i386.rpm'), but notice that the plus signs in the name 'bonnie++' have been replaced by spaces. Apparently the URI cannot contain plus sign characters, and something (was it s3cmd or Amazon?) replaced the plus signs with spaces. Please note that the file can be downloaded from S3 over HTTP using the URL containing plus signs, so that's not a problem. And I can use the AWS Management Console to manually delete such files. It's only invoking object_delete() on the URI whose plus chars have been replaced by space chars that fails.
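One plausible way a plus sign turns into a space is form-style URL decoding, which treats '+' as an encoded space. The sketch below only illustrates that behaviour with Python 3's standard library; it is not taken from s3cmd, and whether s3cmd or S3 performs such a decode here is an assumption, not something established in this thread.

```python
from urllib.parse import unquote, unquote_plus

name = "bonnie++-1.0.1.i386.rpm"

# Path-style decoding only expands %XX escapes and leaves '+' alone.
path_decoded = unquote(name)

# Form-style decoding additionally treats '+' as an encoded space.
form_decoded = unquote_plus(name)

print(repr(path_decoded))
print(repr(form_decoded))
```

If a key goes through the second kind of decoding anywhere between the bucket listing and the delete request, the request ends up naming an object that does not exist.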
ping. This failure will prevent Fedora from being able to use this tool for syncing.
If I switch to using the fixbucket mode, files upload and can be deleted just fine, but cannot be downloaded using the URL name with plus signs - only using URLs with %2B expansion in them. However, this expanded form is not what yum/urlgrabber use.
Just noting that Fedora is using this tool, and is ignoring the fact that it can't delete these files. Not ideal, but isn't causing significant harm at the moment.
Perhaps we can escape + as %2B in the delete URLs? That may help...
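A minimal sketch of that idea, assuming Python 3 and the standard library only. The helper name escape_key_for_url and the example key path are hypothetical and are not part of s3cmd; where such escaping would have to be applied inside s3cmd is not shown here.

```python
from urllib.parse import quote

def escape_key_for_url(key):
    """Percent-encode an object key for use in a request path.

    '/' is kept as a path separator; '+' is not in the safe set,
    so it is emitted as %2B.
    """
    return quote(key, safe="/")

# Hypothetical key, for illustration only.
escaped = escape_key_for_url("pub/epel/bonnie++-1.0.1.i386.rpm")
print(escaped)
```

With the plus signs sent as %2B, no decoder along the way can mistake them for spaces.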
Maybe the (new) multi-delete API can help too, as we no longer have to pass the objects in the HTTP request URL, but inside an XML doc. The respective _do_deletes() functions already take a list of S3 objects to delete and then call s3.object_delete() on each one individually. We could turn that into a single S3 call. That would be really nice for Fedora when we delete the development/18 tree a few weeks after we've published releases/18, as there are >30k files deleted at that point. Daily updates would benefit too, but to a much lesser extent.
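To make the multi-delete idea concrete, here is a rough sketch of building the request bodies, assuming Python 3. The function names (chunked, build_delete_body, content_md5) are hypothetical and not part of s3cmd. It relies on the multi-object delete request taking a Delete XML document with one Object/Key element per key, at most 1000 keys per request, and a Content-MD5 header over the body. Signing and sending the POST to the bucket's ?delete resource is left out.

```python
import base64
import hashlib
from xml.etree import ElementTree as ET

MAX_KEYS_PER_REQUEST = 1000  # per-request limit of the multi-object delete API

def chunked(keys, size=MAX_KEYS_PER_REQUEST):
    """Yield successive slices of at most `size` keys."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]

def build_delete_body(keys, quiet=True):
    """Return the XML body (bytes) listing the keys to delete."""
    root = ET.Element("Delete")
    if quiet:
        # Quiet mode asks S3 to report only the keys that failed.
        ET.SubElement(root, "Quiet").text = "true"
    for key in keys:
        obj = ET.SubElement(root, "Object")
        # Keys travel as XML text, so '+' needs no URL escaping here.
        ET.SubElement(obj, "Key").text = key
    return ET.tostring(root, encoding="utf-8")

def content_md5(body):
    """Base64-encoded MD5 digest of the body, for the Content-MD5 header."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

# Hypothetical keys, for illustration only.
body = build_delete_body(["pub/epel/bonnie++-1.0.1.i386.rpm", "pub/epel/other.rpm"])
print(body.decode("utf-8"))
print(content_md5(body))
```

For a tree of more than 30k files this would mean a few dozen requests instead of one per file, and the key names never appear in the request URL.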
Author: Matt Domsch firstname.lastname@example.org
Date: Sun Apr 20 21:20:53 2014 -0500
batch_mode-ectomy to fix recursive bucket_list
at least this works now, perhaps because it uses the batch delete operation.