Failure deleting files that had plus characters in name #28

mdomsch · 2012-02-24T21:18:57Z

I used s3cmd sync to upload Fedora and EPEL content into a bucket. So far so good. However, on subsequent runs of s3cmd sync, it fails:

INFO: Summary: 113117 local files to upload, 192 remote files to delete
ERROR: no element found: line 1, column 0
ERROR: Parameter problem: Bucket contains invalid filenames. Please run: s3cmd fixbucket s3://your-bucket/

I traced this down to failure to delete a file in s3cmd:

    if cfg.delete_removed:
        for key in remote_list:
            uri = S3Uri(remote_list[key]['object_uri_str'])
            s3.object_delete(uri)
            output(u"deleted: '%s'" % uri)

In this case, key contains the file name (e.g. bonnie -1.0.1.i386.rpm), but notice that the plus signs in the name 'bonnie++' have been replaced by spaces. Apparently the URI cannot contain plus characters, and something (was it s3cmd or Amazon?) replaced the plus signs with spaces. Note that the file can still be downloaded from S3 over HTTP using the URL containing plus signs, so that's not a problem, and I can use the AWS Management Console to delete such files manually. Only invoking object_delete() on the URI whose plus characters have been replaced by spaces fails.
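For illustration, one mechanism that produces exactly this substitution is form-style URL decoding, where '+' stands for an encoded space. Whether s3cmd or S3 is the side doing this is not established in this report; the Python 3 sketch below only shows the effect using the standard library, and is not s3cmd's own code.

```python
from urllib.parse import unquote, unquote_plus

name = "bonnie++-1.0.1.i386.rpm"

# Form-style decoding treats '+' as an encoded space.
form_decoded = unquote_plus(name)

# Plain percent-decoding leaves '+' untouched.
path_decoded = unquote(name)

print(repr(form_decoded))
print(repr(path_decoded))
```

If any layer between the bucket listing and the DELETE request applies the first kind of decoding, the key it ends up with no longer matches the stored object.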

mdomsch · 2012-02-29T06:08:59Z

ping. This failure will prevent Fedora from being able to use this tool for syncing.

If I switch to using the fixbucket mode, files upload and can be deleted just fine, but cannot be downloaded using the URL name with plus signs - only using URLs with %2B expansion in them. However, this expanded form is not what yum/urlgrabber use.

mdomsch · 2013-03-09T05:13:57Z

Just noting that Fedora is using this tool, and is ignoring the fact that it can't delete these files. Not ideal, but isn't causing significant harm at the moment.

mludvig · 2013-03-10T08:26:02Z

Perhaps we can escape + as %2B in the delete URLs? That may help...
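A minimal sketch of that idea in Python 3, assuming the object key is percent-encoded just before it is placed in the request path. The helper name escape_key_for_url is hypothetical and not part of s3cmd; only urllib.parse.quote is a real library call.

```python
from urllib.parse import quote

def escape_key_for_url(key):
    """Percent-encode an object key for use in a request path.

    Hypothetical helper: '/' is kept so the key hierarchy stays readable,
    while '+' becomes %2B and a space becomes %20.
    """
    return quote(key, safe="/")

print(escape_key_for_url("pool/bonnie++-1.0.1.i386.rpm"))
```

With '+' sent as %2B, the receiving side cannot mistake it for an encoded space.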

mdomsch · 2013-03-10T12:50:28Z

Maybe the (new) multi-delete API can help too, as we no longer have to pass the objects in the HTTP request URL, but inside an XML doc. The respective _do_deletes() functions already take a list of S3 objects to delete, and then call s3.object_delete() on each one individually. We could turn that into a single S3 call. That would be really nice for Fedora when we delete the development/18 tree a few weeks after we've published releases/18, as there are >30k files deleted at that point. Daily updates would benefit too, but to a much lesser extent.
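A rough Python 3 sketch of what building such a request body could look like. It assumes the documented shape of S3's Multi-Object Delete call: a Delete XML document with one Object/Key element per file, a Content-MD5 header, and at most 1000 keys per request. The function names below are hypothetical and are not s3cmd's; sending and signing the request is left out.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

# Assumed per-request limit for Multi-Object Delete.
MAX_KEYS_PER_REQUEST = 1000

def build_delete_body(keys, quiet=True):
    """Return the XML request body (bytes) listing the keys to delete."""
    root = ET.Element("Delete")
    ET.SubElement(root, "Quiet").text = "true" if quiet else "false"
    for key in keys:
        obj = ET.SubElement(root, "Object")
        # The key travels as XML text, so '+' needs no URL escaping here.
        ET.SubElement(obj, "Key").text = key
    return ET.tostring(root, encoding="utf-8")

def content_md5(body):
    """Base64-encoded MD5 digest of the body, for the Content-MD5 header."""
    return base64.b64encode(hashlib.md5(body).digest()).decode("ascii")

def batches(keys, size=MAX_KEYS_PER_REQUEST):
    """Yield consecutive slices of at most `size` keys."""
    for start in range(0, len(keys), size):
        yield keys[start:start + size]

body = build_delete_body(["pool/bonnie++-1.0.1.i386.rpm"])
print(body.decode("utf-8"))
print(content_md5(body))
```

Under these assumptions, a tree of just over 30k files would take a few dozen requests instead of one DELETE per file, and the key names never pass through the request URL.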

mdomsch · 2014-04-25T04:09:39Z

commit addb152
Author: Matt Domsch <matt@domsch.com>
Date: Sun Apr 20 21:20:53 2014 -0500

batch_mode-ectomy to fix recursive bucket_list

at least this works now, perhaps because it uses the batch delete operation.

mdomsch closed this as completed Apr 25, 2014
