Failure deleting files that had plus characters in name #28

Closed
mdomsch opened this issue Feb 24, 2012 · 5 comments

Comments

@mdomsch
Contributor

mdomsch commented Feb 24, 2012

I used s3cmd sync to upload Fedora and EPEL content into a bucket. So far so good. However, on subsequent runs of s3cmd sync, it fails:

INFO: Summary: 113117 local files to upload, 192 remote files to delete
ERROR: no element found: line 1, column 0
ERROR: Parameter problem: Bucket contains invalid filenames. Please run: s3cmd fixbucket s3://your-bucket/

I traced this down to failure to delete a file in s3cmd:

if cfg.delete_removed:
    for key in remote_list:
        uri = S3Uri(remote_list[key]['object_uri_str'])
        s3.object_delete(uri)
        output(u"deleted: '%s'" % uri)
In this case, key contains the file name (e.g. bonnie -1.0.1.i386.rpm), but notice that the plus signs in the name 'bonnie++' have been replaced by spaces. Apparently the URI cannot carry literal plus-sign characters, and something (whether s3cmd or Amazon, I'm not sure) replaced the plus signs with spaces. Please note that the file can be downloaded from S3 over HTTP using the URL containing plus signs, so that is not a problem, and I can use the AWS Management Console to manually delete such files. It is only invoking object_delete() on a URI that has had the plus chars replaced by space chars that fails.
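
To illustrate the mechanism, here is a minimal sketch, not s3cmd's actual code; it uses Python 3's urllib.parse purely for the demo and makes no claim about which function s3cmd calls. A decoder that treats '+' as an encoded space mangles such a key, while percent-encoding the plus signs as %2B keeps them intact:

from urllib.parse import quote, unquote_plus

key = u"bonnie++-1.0.1.i386.rpm"

# Decoding with a '+'-as-space rule mangles the name:
print(unquote_plus(key))            # 'bonnie  -1.0.1.i386.rpm'

# Percent-encoding first ('+' becomes %2B) preserves it:
encoded = quote(key, safe="/")      # 'bonnie%2B%2B-1.0.1.i386.rpm'
print(unquote_plus(encoded))        # 'bonnie++-1.0.1.i386.rpm'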

@mdomsch
Contributor Author

mdomsch commented Feb 29, 2012

ping. This failure will prevent Fedora from being able to use this tool for syncing.

If I switch to using the fixbucket mode, files upload and can be deleted just fine, but they cannot be downloaded using a URL containing literal plus signs, only using URLs with %2B escapes in them. However, that escaped form is not what yum/urlgrabber use.

@mdomsch
Contributor Author

mdomsch commented Mar 9, 2013

Just noting that Fedora is using this tool, and is ignoring the fact that it can't delete these files. Not ideal, but isn't causing significant harm at the moment.

@mludvig
Contributor

mludvig commented Mar 10, 2013

Perhaps we can escape + as %2B in the delete URLs? That may help...
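
As a rough sketch of that suggestion (object_delete_url is a hypothetical helper, not an existing s3cmd function), the key would be percent-encoded, plus signs included, before being placed in the DELETE request path:

from urllib.parse import quote

def object_delete_url(bucket, key):
    # quote() leaves '/' alone but encodes '+' as %2B, so a later
    # decode cannot turn the plus signs back into spaces.
    return "/%s/%s" % (bucket, quote(key, safe="/"))

print(object_delete_url("my-bucket", "bonnie++-1.0.1.i386.rpm"))
# -> /my-bucket/bonnie%2B%2B-1.0.1.i386.rpm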

@mdomsch
Contributor Author

mdomsch commented Mar 10, 2013

Maybe the (new) multi-object delete API can help too, since we would no longer have to pass each object in the HTTP request URL; the keys go inside an XML document in the request body instead. The respective _do_deletes() functions already take a list of S3 objects to delete and then call s3.object_delete() on each one individually. We could turn that into a single S3 call. That would be really nice for Fedora when we delete the development/18 tree a few weeks after we've published releases/18, as more than 30k files get deleted at that point. Daily updates would benefit too, but to a much lesser extent.
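
A rough sketch of that idea, using boto3 for illustration rather than s3cmd's own S3 class (so none of these names match s3cmd's code): the multi-object delete API accepts up to 1000 keys per request in the request body, so the per-key loop collapses into batched calls.

import boto3

s3 = boto3.client("s3")

def delete_keys(bucket, keys, batch_size=1000):
    # Multi-object delete takes up to 1000 keys per request, passed in
    # the request body rather than one key per DELETE URL.
    for i in range(0, len(keys), batch_size):
        batch = keys[i:i + batch_size]
        s3.delete_objects(
            Bucket=bucket,
            Delete={"Objects": [{"Key": k} for k in batch]},
        )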

@mdomsch
Contributor Author

mdomsch commented Apr 25, 2014

commit addb152
Author: Matt Domsch <matt@domsch.com>
Date:   Sun Apr 20 21:20:53 2014 -0500

    batch_mode-ectomy to fix recursive bucket_list

At least this works now, perhaps because it uses the batch delete operation.

@mdomsch mdomsch closed this as completed Apr 25, 2014