AWS CLI can't remove objects containing "\n" in Key #675
Comments
Looking into this. From what I can tell it's only when we try to do a recursive type of delete (either
It seems to be that in the recursive case we send:
But when I specify the single file directly we send:
Though it's interesting that S3 sends a 204 response in both cases. I would have expected the first request with %0A to generate a 404.
So I believe the issue here is our use of cElementTree as an XML parser. I can repro the issue using the XML parser alone. What's happening is that if we give the parser a key containing a carriage return, it will automatically convert it to a newline character. This is why it does not work in the recursive delete case: we need to parse the XML response to know which keys to delete, so we end up with the wrong key name. But when a single file is explicitly specified by the user, we properly encode the character.
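For anyone who wants to see the parser behavior in isolation, here is a minimal sketch. It assumes Python 3's `xml.etree.ElementTree` (the successor to cElementTree) and a made-up `<Contents><Key>` fragment; the `parse_key` helper is hypothetical and is not part of botocore.

```python
import xml.etree.ElementTree as ET


def parse_key(xml_text):
    """Return the text of the <Key> element in a tiny, made-up fragment."""
    return ET.fromstring(xml_text).findtext("Key")


# A literal carriage return inside the element text. XML end-of-line
# handling normalizes it to a newline before we ever see the key.
literal_cr = parse_key("<Contents><Key>foo\rbar</Key></Contents>")

# The same character written as a character reference survives parsing.
escaped_cr = parse_key("<Contents><Key>foo&#13;bar</Key></Contents>")

print(repr(literal_cr))
print(repr(escaped_cr))
```

The first key comes back with a newline in place of the carriage return, so a delete request built from it would name a key that does not exist.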
I'll need to do some investigation with our XML parser to see what options we have.
One potential option we have is to use the encoding-type parameter to force the key name to be urlencoded. I'll need to investigate what, if anything, this does to performance as we'd now have to urldecode every key name, but I can confirm that it works:
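A rough sketch of the decoding side of that idea, not the actual botocore change: it assumes a simplified listing response (no XML namespace, hand-written sample data) in which the keys arrive percent-encoded, and a hypothetical `decoded_keys` helper that urldecodes them after parsing.

```python
import xml.etree.ElementTree as ET
from urllib.parse import unquote

# Hand-written, simplified sample of a listing response requested with
# encoding-type=url. A real response has more elements and a namespace.
SAMPLE = (
    "<ListBucketResult>"
    "<EncodingType>url</EncodingType>"
    "<Contents><Key>foo%0Dbar</Key></Contents>"
    "<Contents><Key>plain.txt</Key></Contents>"
    "</ListBucketResult>"
)


def decoded_keys(xml_text):
    """Parse the listing and urldecode keys if the response says they are encoded."""
    root = ET.fromstring(xml_text)
    keys = [element.text for element in root.iter("Key")]
    if root.findtext("EncodingType") == "url":
        # The control character never reaches the XML parser as a raw
        # character, so it cannot be normalized away.
        keys = [unquote(key) for key in keys]
    return keys


print([repr(key) for key in decoded_keys(SAMPLE)])
```

The cost is one urldecode per key, which is the performance question raised above.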
Botocore's xml parser does not handle control chars properly, so we need to urlencode the keys in the response so that we're able to handle them appropriately. Fixes aws#675.
Fixes aws#749. This was a regression from the fix for aws#675, where we use the encoding_type of "url" to work around the stdlib XML parser not handling newlines. The problem is that pagination in S3 uses the last key name as the marker, and because the keys are returned urlencoded, we need to urldecode the keys so that botocore sends the correct next marker. In the case where urldecoded(key) != key, we would incorrectly sync new files. Also added an integ test for syncing with '+' chars.
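To illustrate why the marker has to be the decoded key, here is a small sketch with an in-memory stand-in for the listing call. `list_page` and `list_all` are hypothetical helpers written for this example, not botocore or S3 APIs, and the assumption that '+' comes back as `%2B` is only for illustration.

```python
from urllib.parse import quote, unquote


def list_page(all_keys, marker, max_keys=1):
    """Hypothetical listing call: return up to max_keys urlencoded keys
    that sort after `marker`, plus a flag saying whether more remain."""
    remaining = [key for key in sorted(all_keys) if key > marker]
    page = remaining[:max_keys]
    return [quote(key, safe="") for key in page], len(remaining) > max_keys


def list_all(all_keys):
    """Page through every key, decoding before choosing the next marker."""
    keys, marker, truncated = [], "", True
    while truncated:
        encoded, truncated = list_page(all_keys, marker)
        decoded = [unquote(key) for key in encoded]
        keys.extend(decoded)
        if decoded:
            # The marker must be the real key name. The encoded form
            # sorts differently, so using it would resume at the wrong place.
            marker = decoded[-1]
    return keys


bucket = ["a+1", "a+2", "a+3"]
print(list_all(bucket))
# The encoded key sorts before the real key, so it is not a valid marker.
print(quote("a+1", safe="") < "a+1")
```

With the encoded key as the marker, the stand-in would return "a+1" again on the next page, which is the kind of mismatch that makes already-synced files look new.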
Here is how to reproduce the issue.