Failure deleting files that had plus characters in name #28

Closed
mdomsch opened this Issue · 5 comments

2 participants

@mdomsch
Owner

I used s3cmd sync to upload Fedora and EPEL content into a bucket. So far so good. However, subsequent runs of s3cmd sync fail:

INFO: Summary: 113117 local files to upload, 192 remote files to delete
ERROR: no element found: line 1, column 0
ERROR: Parameter problem: Bucket contains invalid filenames. Please run: s3cmd fixbucket s3://your-bucket/

I traced this down to failure to delete a file in s3cmd:

    if cfg.delete_removed:
        for key in remote_list:
            uri = S3Uri(remote_list[key]['object_uri_str'])
            s3.object_delete(uri)
            output(u"deleted: '%s'" % uri)

In this case, key contains the file name (e.g. 'bonnie  -1.0.1.i386.rpm'), but notice that the plus signs in the name 'bonnie++' have been replaced by spaces. Apparently the URI cannot contain plus characters, and something (s3cmd or Amazon?) replaced them with spaces. Note that the file can still be downloaded from S3 over HTTP using the URL containing plus signs, so that's not a problem, and I can use the AWS Management Console to delete such files manually. Only invoking object_delete() on a URI whose plus characters have been replaced by spaces fails.
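The plus-to-space substitution described above matches what form-style URL decoding does to a literal '+'. The following is only a sketch of that general behaviour using Python 3's standard library; it does not identify where in s3cmd or S3 the substitution actually happens, which this report leaves open.

```python
from urllib.parse import unquote, unquote_plus

name = "bonnie++-1.0.1.i386.rpm"

# Form-style decoding treats every '+' as an encoded space.
form_decoded = unquote_plus(name)

# Plain percent-decoding leaves a literal '+' untouched.
path_decoded = unquote(name)

print(repr(form_decoded))
print(repr(path_decoded))
```

Any layer that applies the first kind of decoding to an object key would produce a name with two spaces where 'bonnie++' had plus signs, which no longer matches the stored object.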

@mdomsch
Owner

ping. This failure will prevent Fedora from being able to use this tool for syncing.

If I switch to using the fixbucket mode, files upload and can be deleted just fine, but they cannot be downloaded using the URL with plus signs, only using URLs that escape each plus as %2B. However, this escaped form is not what yum/urlgrabber use.

@mdomsch
Owner

Just noting that Fedora is using this tool, and is ignoring the fact that it can't delete these files. Not ideal, but isn't causing significant harm at the moment.

@mludvig
Owner

Perhaps we can escape + as %2B in the delete URLs? That may help...
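A minimal sketch of that escaping, assuming Python 3's urllib.parse. The helper name escape_key_for_url and the example key are hypothetical and are not part of s3cmd; whether S3 accepts the escaped form for delete requests is not confirmed in this thread.

```python
from urllib.parse import quote, unquote

def escape_key_for_url(key: str) -> str:
    """Percent-encode an object key for use in a request path.

    '/' is kept as a path separator; '+' is not in the safe set,
    so it is emitted as %2B.
    """
    return quote(key, safe="/")

# Hypothetical key, for illustration only.
key = "epel/6/bonnie++-1.0.1.i386.rpm"
escaped = escape_key_for_url(key)
print(escaped)

# Percent-decoding restores the original key, plus signs included.
assert unquote(escaped) == key
```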

@mdomsch
Owner

Maybe the (new) multi-delete API can help too, as we no longer have to pass the objects in the HTTP request URL, but inside an XML doc. The respective _do_deletes() functions already take a list of S3 objects to delete, and then call s3.object_delete() on each one individually. We could turn that into a single S3 call. That would be really nice for Fedora when we delete the development/18 tree a few weeks after we've published releases/18, as there are >30k files deleted at that point. Daily updates would benefit too, but to a much lesser extent.
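As a rough sketch of the idea, the request body for S3's Multi-Object Delete can be built with the standard library alone. The function names below are hypothetical and are not s3cmd's; the sketch only builds the XML and the Content-MD5 value that the operation requires, and leaves signing and sending the POST to the bucket's ?delete subresource out. S3 accepts at most 1000 keys per request, so a >30k-file tree has to be split into batches.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

MAX_KEYS_PER_REQUEST = 1000  # S3 limit for one Multi-Object Delete call

def chunked(keys, size=MAX_KEYS_PER_REQUEST):
    """Yield successive slices of at most `size` keys."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]

def build_multi_delete_body(keys):
    """Return (xml_body_bytes, content_md5) for one batch of keys.

    Keys travel inside the XML document, not in the URL, so a
    literal '+' needs no URL escaping here.
    """
    root = ET.Element("Delete")
    for key in keys:
        obj = ET.SubElement(root, "Object")
        ET.SubElement(obj, "Key").text = key
    body = ET.tostring(root, encoding="utf-8")
    content_md5 = base64.b64encode(hashlib.md5(body).digest()).decode("ascii")
    return body, content_md5

# Hypothetical usage: 2500 keys become three requests.
keys = ["pkg%d++.rpm" % i for i in range(2500)]
batches = [build_multi_delete_body(batch) for batch in chunked(keys)]
print(len(batches))
```

The ElementTree serializer takes care of XML escaping for characters such as '&' and '<' in key names, which hand-built string concatenation would get wrong.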

@mdomsch
Owner

commit addb152
Author: Matt Domsch <matt@domsch.com>
Date: Sun Apr 20 21:20:53 2014 -0500

    batch_mode-ectomy to fix recursive bucket_list

At least this works now, perhaps because it uses the batch delete operation.

@mdomsch mdomsch closed this