Skip to content

HTTPS clone URL

Subversion checkout URL

You can clone with
or
.
Download ZIP

Loading…

[aws|storage] Delete multiple objects #686

Closed
wants to merge 3 commits into from

6 participants

@galfert

This adds support for S3 "Delete multiple objects" as described in
http://docs.amazonwebservices.com/AmazonS3/latest/API/multiobjectdeleteapi.html

tests/aws/requests/storage/object_tests.rb
@@ -84,6 +84,15 @@
Fog::Storage[:aws].delete_object(@directory.identity, 'fog_object')
end
+ tests("delete_multiple_objects('#{@directory.identity}', ['fog_object'])").returns({
+ 'DeleteResult' => [
+ { 'Deleted' => { 'Key' => 'fog_object' } }
+ ]
+ }) do
+ Fog::Storage[:aws].delete_multiple_objects(@directory.identity, ['fog_object'])
@dylanegan Collaborator

Any reason for this line? Do you need to delete it twice?

@galfert
galfert added a note

You are right. That line is superfluous and shouldn't have been in there. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
tests/aws/requests/storage/object_tests.rb
@@ -84,6 +84,15 @@
Fog::Storage[:aws].delete_object(@directory.identity, 'fog_object')
end
+ tests("delete_multiple_objects('#{@directory.identity}', ['fog_object'])").returns({
+ 'DeleteResult' => [
+ { 'Deleted' => { 'Key' => 'fog_object' } }
+ ]
+ }) do
@dylanegan Collaborator

Would be nice to put the format in a constant.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@geemus
Owner

@galfert looks good. I think we should consider how to handle calling this when you want to specific versions for some (or all) of the objects to be deleted though. The current arguments don't provide an obvious way to allow for this, so I think it would be good to figure out something for that before we merge this in (and promptly have to change it). I'm not sure what the best way to add versions would be though. I'd say maybe a hash with object_key => version pairs, but that seems like a pain for non-versioned things. Ideally it would allow for passing in similar args either way and add something optionally for versions, but in any event, we should discuss that a bit. Thoughts?

@galfert

@geemus as soon as I find some time in the next few days I will look into that and try to figure something out. I will get back to you with a suggestion how to handle the versioned objects.

@geemus
Owner

@galfert Just thought I'd check in and see if there is anything I can help with. Thanks!

@nirvdrum nirvdrum commented on the diff
lib/fog/aws/requests/storage/delete_multiple_objects.rb
((29 lines not shown))
+ # ==== See Also
+ # http://docs.amazonwebservices.com/AmazonS3/latest/API/multiobjectdeleteapi.html
+
+ def delete_multiple_objects(bucket_name, object_names, options = {})
+ data = "<Delete>"
+ data << "<Quiet>true</Quiet>" if options.delete(:quiet)
+ object_names.each do |object_name|
+ data << "<Object>"
+ data << "<Key>#{object_name}</Key>"
+ data << "</Object>"
+ end
+ data << "</Delete>"
+
+ headers = options
+ headers['Content-Length'] = data.length
+ headers['Content-MD5'] = Base64.encode64(Digest::MD5.digest(data)).strip
@nirvdrum Collaborator

Would this be susceptible to the recent bug that was fixed for 1.8.7, whereby Base64 could span multiple lines and strip would truncate? If so, we probably want to gsub('\n', '') again.

@geemus Owner
geemus added a note

Yeah, I think it would be best to go with the gsub (safer and more consistent).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@mdonoughe mdonoughe commented on the diff
lib/fog/aws/requests/storage/delete_multiple_objects.rb
((18 lines not shown))
+ # * 'Deleted'<~Hash>:
+ # * 'Key'<~String> - Name of the object that was deleted
+ # * 'VersionId'<~String> - ID for the versioned onject in case of a versioned delete
+ # * 'DeleteMarker'<~Boolean> - Indicates if the request accessed a delete marker
+ # * 'DeleteMarkerVersionId'<~String> - Version ID of the delete marker accessed
+ # * 'Error'<~Hash>:
+ # * 'Key'<~String> - Name of the object that failed to be deleted
+ # * 'VersionId'<~String> - ID of the versioned object that was attempted to be deleted
+ # * 'Code'<~String> - Status code for the result of the failed delete
+ # * 'Message'<~String> - Error description
+ #
+ # ==== See Also
+ # http://docs.amazonwebservices.com/AmazonS3/latest/API/multiobjectdeleteapi.html
+
+ def delete_multiple_objects(bucket_name, object_names, options = {})
+ data = "<Delete>"

Concatenating strings together to form XML like this will not work if object_name contains a left angle bracket or an ampersand, and watch out for "index.html.png" or similar.

That comment got mangled. It was an XML injection to delete index.html. </Key></Object><Object><Key>index.html</Key></Object><Object><Key>.png

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
@geemus
Owner

@galfert - could you rebase and fix the last couple little things so we could get this merged in? Thanks!

@timuralp
Collaborator

Hey guys, I was just wondering if there are plans to merge this into fog. I could probably finish up the small fixes and make a new pull request if @galfert hasn't had time to clean it up?

@nirvdrum
Collaborator

I'll have to read the S3 docs on the multi-object delete, but it seems odd that it wouldn't be impacted by bucket versioning. I would expect S3 to add a DeleteMarker for each object being deleted. The current mock code doesn't do that. But, like I said, I haven't actually tried it yet to see what the real response is from Amazon.

@timuralp
Collaborator

After reading the API, there are a few scenarios for the multi-delete operation. For a non-versioned bucked, it removes the objects from the bucket, which is straightforward. For a versioned bucket, it depends on the request.

If the request only specifies the object keys, S3 will create a delete marker and return its version ID. If a version is specified, S3 will either delete a particular version or remove the delete marker (undeleting the object).

In the case of multi-delete operation, either responses from all the deletes are turned to the called -- verbose mode, or only errors generated by deleting any of the objects -- quiet mode.

It's basically the same semantics as the delete operation (it seems to me), but with the responses concatenated for up to 1000 objects.

@nirvdrum
Collaborator

Cool. I already implemented versioned deletes for the delete_object request. You could probably adopt most of that directly or even extract out to a utility method for the S3 mock storage class.

@geemus
Owner

@timuralp @nirvdrum - sounds great, if you guys need any help coordinating on this or have questions let me know. Thanks!

@timuralp
Collaborator

Sorry for the delay -- I finally have time to finish this up. @nirvdrum I think a utility method makes the most sense. Where should it be placed? I can then call it for each of the elements in the delete_multiple call.

@timuralp
Collaborator

I put a pull request that adds versioned bucket support to @galfert patches. I put it up as a separate pull request -- wasn't sure how to add it to this one.

@geemus
Owner

Going to go ahead and close this one in favor of #1117

@geemus geemus closed this
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
This page is out of date. Refresh to see the latest.
View
50 lib/fog/aws/parsers/storage/delete_multiple_objects.rb
@@ -0,0 +1,50 @@
+module Fog
+ module Parsers
+ module Storage
+ module AWS
+
+ class DeleteMultipleObjects < Fog::Parsers::Base
+
+ def reset
+ @deleted = { 'Deleted' => {} }
+ @error = { 'Error' => {} }
+ @response = { 'DeleteResult' => [] }
+ end
+
+ def start_element(name, attrs = [])
+ super
+ case name
+ when 'Deleted'
+ @in_deleted = true
+ end
+ end
+
+ def end_element(name)
+ case name
+ when 'Deleted'
+ @response['DeleteResult'] << @deleted
+ @deleted = { 'Deleted' => {} }
+ @in_deleted = false
+ when 'Error'
+ @response['DeleteResult'] << @error
+ @error = { 'Error' => {} }
+ when 'Key', 'VersionId'
+ if @in_deleted
+ @deleted['Deleted'][name] = value
+ else
+ @error['Error'][name] = value
+ end
+ when 'DeleteMarker', 'DeletemarkerVersionId'
+ @deleted['Deleted'][name] = value
+ when 'Code', 'Message'
+ @error['Error'][name] = value
+ end
+ end
+
+ end
+
+ end
+ end
+ end
+end
+
View
82 lib/fog/aws/requests/storage/delete_multiple_objects.rb
@@ -0,0 +1,82 @@
+module Fog
+ module Storage
+ class AWS
+ class Real
+
+ require 'fog/aws/parsers/storage/delete_multiple_objects'
+
+ # Delete multiple objects from S3
+ #
+ # ==== Parameters
+ # * bucket_name<~String> - Name of bucket containing object to delete
+ # * object_names<~Array> - Array of object names to delete
+ #
+ # ==== Returns
+ # * response<~Excon::Response>:
+ # * body<~Hash>:
+ # * 'DeleteResult'<~Array>:
+ # * 'Deleted'<~Hash>:
+ # * 'Key'<~String> - Name of the object that was deleted
+ # * 'VersionId'<~String> - ID for the versioned onject in case of a versioned delete
+ # * 'DeleteMarker'<~Boolean> - Indicates if the request accessed a delete marker
+ # * 'DeleteMarkerVersionId'<~String> - Version ID of the delete marker accessed
+ # * 'Error'<~Hash>:
+ # * 'Key'<~String> - Name of the object that failed to be deleted
+ # * 'VersionId'<~String> - ID of the versioned object that was attempted to be deleted
+ # * 'Code'<~String> - Status code for the result of the failed delete
+ # * 'Message'<~String> - Error description
+ #
+ # ==== See Also
+ # http://docs.amazonwebservices.com/AmazonS3/latest/API/multiobjectdeleteapi.html
+
+ def delete_multiple_objects(bucket_name, object_names, options = {})
+ data = "<Delete>"

Concatenating strings together to form XML like this will not work if object_name contains a left angle bracket or an ampersand, and watch out for "index.html.png" or similar.

That comment got mangled. It was an XML injection to delete index.html. </Key></Object><Object><Key>index.html</Key></Object><Object><Key>.png

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+ data << "<Quiet>true</Quiet>" if options.delete(:quiet)
+ object_names.each do |object_name|
+ data << "<Object>"
+ data << "<Key>#{object_name}</Key>"
+ data << "</Object>"
+ end
+ data << "</Delete>"
+
+ headers = options
+ headers['Content-Length'] = data.length
+ headers['Content-MD5'] = Base64.encode64(Digest::MD5.digest(data)).strip
@nirvdrum Collaborator

Would this be susceptible to the recent bug that was fixed for 1.8.7, whereby Base64 could span multiple lines and strip would truncate? If so, we probably want to gsub('\n', '') again.

@geemus Owner
geemus added a note

Yeah, I think it would be best to go with the gsub (safer and more consistent).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
+
+ request({
+ :body => data,
+ :expects => 200,
+ :headers => headers,
+ :host => "#{bucket_name}.#{@host}",
+ :method => 'POST',
+ :parser => Fog::Parsers::Storage::AWS::DeleteMultipleObjects.new,
+ :query => {'delete' => nil}
+ })
+ end
+
+ end
+
+ class Mock # :nodoc:all
+
+ def delete_multiple_objects(bucket_name, object_names, options = {})
+ response = Excon::Response.new
+ if bucket = self.data[:buckets][bucket_name]
+ response.status = 200
+ response.body = { 'DeleteResult' => [] }
+ object_names.each do |object_name|
+ bucket[:objects].delete(object_name)
+ deleted_entry = { 'Deleted' => { 'Key' => object_name } }
+ response.body['DeleteResult'] << deleted_entry
+ end
+ else
+ response.status = 404
+ raise(Excon::Errors.status_error({:expects => 200}, response))
+ end
+ response
+ end
+
+ end
+ end
+ end
+end
+
View
2  lib/fog/aws/storage.rb
@@ -22,6 +22,7 @@ class AWS < Fog::Service
request :delete_bucket_policy
request :delete_bucket_website
request :delete_object
+ request :delete_multiple_objects
request :get_bucket
request :get_bucket_acl
request :get_bucket_location
@@ -338,6 +339,7 @@ def signature(params)
for key in (params[:query] || {}).keys.sort
if %w{
acl
+ delete
location
logging
notification
View
16 tests/aws/requests/storage/object_tests.rb
@@ -4,6 +4,14 @@
tests('success') do
+ @multiple_delete_format = {
+ 'DeleteResult' => [{
+ 'Deleted' => {
+ 'Key' => String
+ }
+ }]
+ }
+
tests("#put_object('#{@directory.identity}', 'fog_object', lorem_file)").succeeds do
Fog::Storage[:aws].put_object(@directory.identity, 'fog_object', lorem_file)
end
@@ -84,6 +92,10 @@
Fog::Storage[:aws].delete_object(@directory.identity, 'fog_object')
end
+ tests("delete_multiple_objects('#{@directory.identity}', ['fog_object', 'fog_other_object'])").formats(@multiple_delete_format) do
+ Fog::Storage[:aws].delete_multiple_objects(@directory.identity, ['fog_object', 'fog_other_object']).body
+ end
+
end
tests('failure') do
@@ -124,6 +136,10 @@
Fog::Storage[:aws].delete_object('fognonbucket', 'fog_non_object')
end
+ tests("#delete_multiple_objects('fognonbucket', ['fog_non_object'])").raises(Excon::Errors::NotFound) do
+ Fog::Storage[:aws].delete_multiple_objects('fognonbucket', ['fog_non_object'])
+ end
+
tests("#put_object_acl('#{@directory.identity}', 'fog_object', 'invalid')").raises(Excon::Errors::BadRequest) do
Fog::Storage[:aws].put_object_acl('#{@directory.identity}', 'fog_object', 'invalid')
end
Something went wrong with that request. Please try again.