Setting ACLs via headers at PUT Object creation #631

reiddraper
Contributor

Originally reported by Eugene Doudine via the riak-users mailing list.

Attempts to set ACLs via x-amz-grant-* headers during object creation (PUT) are ignored. Setting them after the object is created, via the XML body of a PUT with the ?acl query string parameter appended, works.

Test case using the AWS Ruby SDK:

require "aws-sdk"
require "logger"

AWS.config(
  logger: Logger.new($stdout),
  log_formatter: AWS::Core::LogFormatter.debug,
  log_level: :debug
)

s3 = AWS::S3.new(
  # Credentials for admin@admin.com.
  access_key_id: "T3J4N9T8PNLAJMYLSWXB",
  secret_access_key: "oWl6LH9BJumR6tFMvzc8KduClDKfFYxi3ADXPA==",
  proxy_uri: "http://localhost:8888", # 8888 is for Charles Proxy
  use_ssl: false,
  http_read_timeout: 2000,
  max_retries: 0
)

timestamp   = Time.now.to_i
bucket_name = "acl-test-bucket-#{timestamp}"
object_name = "acl-test-object-#{timestamp}"

bucket = s3.buckets.create(bucket_name)

# Attempt to set ACLs at object creation time.
s3.buckets[bucket_name].objects[object_name].write("Joker",
  grant_full_control: "emailAddress\"admin@admin.com\"",
  grant_read: "emailAddress=\"acl_test@basho.com\""
)

# List ACLs to see if it worked -- it doesn't.
s3.buckets[bucket_name].objects[object_name].acl

# Create similar ACLs to those above.
acl = AWS::S3::AccessControlList.new
acl.grant(:full_control).to(amazon_customer_email: "admin@admin.com")
acl.grant(:read).to(amazon_customer_email: "acl_test@basho.com")

# Apply those ACLs after the object is created.
s3.buckets[bucket_name].objects[object_name].acl = acl

# List ACLs to see if it worked -- it did.
s3.buckets[bucket_name].objects[object_name].acl

Debugging output of the above script running against a local Riak CS instance with two active users:

+-------------------------------------------------------------------------------
| AWS us-east-1 S3 create_bucket 0.028978 0 retries
+-------------------------------------------------------------------------------
|   REQUEST
+-------------------------------------------------------------------------------
|    METHOD: PUT
|       URL: http://acl-test-bucket-1375233769.s3.amazonaws.com::80:/
|   HEADERS: {"user-agent"=>"aws-sdk-ruby/1.14.1 ruby/1.9.3 x86_64-darwin12.4.0", "date"=>"Wed, 31 Jul 2013 01:22:49 GMT", "authorization"=>"AWS T3J4N9T8PNLAJMYLSWXB:xoSN9EAmxvEl1LugYMfPpk6ic0c="}
|      BODY:
+-------------------------------------------------------------------------------
|  RESPONSE
+-------------------------------------------------------------------------------
|    STATUS: 200
|   HEADERS: {"server"=>["Riak CS"], "date"=>["Wed, 31 Jul 2013 01:22:49 GMT"], "content-type"=>["application/xml"], "content-length"=>["0"], "proxy-connection"=>["Keep-alive"]}
|      BODY:
+-------------------------------------------------------------------------------
| AWS us-east-1 S3 put_object 0.018525 0 retries
+-------------------------------------------------------------------------------
|   REQUEST
+-------------------------------------------------------------------------------
|    METHOD: PUT
|       URL: http://acl-test-bucket-1375233769.s3.amazonaws.com::80:/acl-test-object-1375233769
|   HEADERS: {"content-length"=>5, "x-amz-grant-read"=>"emailAddress=\"acl_test@basho.com\"", "x-amz-grant-full-control"=>"emailAddress\"admin@admin.com\"", "user-agent"=>"aws-sdk-ruby/1.14.1 ruby/1.9.3 x86_64-darwin12.4.0", "date"=>"Wed, 31 Jul 2013 01:22:49 GMT", "authorization"=>"AWS T3J4N9T8PNLAJMYLSWXB:yqhecCmkheKGsuIscKP1HSMU/sg="}
|      BODY:
+-------------------------------------------------------------------------------
|  RESPONSE
+-------------------------------------------------------------------------------
|    STATUS: 200
|   HEADERS: {"server"=>["Riak CS"], "etag"=>["\"e55318040a854ebdc8d0508d2522bee5\""], "date"=>["Wed, 31 Jul 2013 01:22:49 GMT"], "content-type"=>["text/plain"], "content-length"=>["0"], "proxy-connection"=>["Keep-alive"]}
|      BODY:
+-------------------------------------------------------------------------------
| AWS us-east-1 S3 get_object_acl 0.013928 0 retries
+-------------------------------------------------------------------------------
|   REQUEST
+-------------------------------------------------------------------------------
|    METHOD: GET
|       URL: http://acl-test-bucket-1375233769.s3.amazonaws.com::80:/acl-test-object-1375233769?acl
|   HEADERS: {"user-agent"=>"aws-sdk-ruby/1.14.1 ruby/1.9.3 x86_64-darwin12.4.0", "date"=>"Wed, 31 Jul 2013 01:22:49 GMT", "authorization"=>"AWS T3J4N9T8PNLAJMYLSWXB:QW094CPm9FHBaQZk5CcjgAwT58g="}
|      BODY:
+-------------------------------------------------------------------------------
|  RESPONSE
+-------------------------------------------------------------------------------
|    STATUS: 200
|   HEADERS: {"server"=>["Riak CS"], "date"=>["Wed, 31 Jul 2013 01:22:49 GMT"], "content-type"=>["application/octet-stream"], "content-length"=>["495"], "proxy-connection"=>["Keep-alive"]}
|      BODY: <?xml version="1.0" encoding="UTF-8"?><AccessControlPolicy><Owner><ID>51e365ea87a8cfb147ed24533cc36493f6664880c6960f5e378b1dcfe5d43039</ID><DisplayName>admin</DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>51e365ea87a8cfb147ed24533cc36493f6664880c6960f5e378b1dcfe5d43039</ID><DisplayName>admin</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant></AccessControlList></AccessControlPolicy>
+-------------------------------------------------------------------------------
| AWS us-east-1 S3 put_object_acl 0.041236 0 retries
+-------------------------------------------------------------------------------
|   REQUEST
+-------------------------------------------------------------------------------
|    METHOD: PUT
|       URL: http://acl-test-bucket-1375233769.s3.amazonaws.com::80:/acl-test-object-1375233769?acl
|   HEADERS: {"content-length"=>529, "user-agent"=>"aws-sdk-ruby/1.14.1 ruby/1.9.3 x86_64-darwin12.4.0", "date"=>"Wed, 31 Jul 2013 01:22:49 GMT", "authorization"=>"AWS T3J4N9T8PNLAJMYLSWXB:mhdmPbILw1l3XqSdatQTTpeqiyE="}
|      BODY: <AccessControlPolicy xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="AmazonCustomerByEmail"><EmailAddress>admin@admin.com</EmailAddress></Grantee><Permission>FULL_CONTROL</Permission></Grant><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="AmazonCustomerByEmail"><EmailAddress>acl_test@basho.com</EmailAddress></Grantee><Permission>READ</Permission></Grant></AccessControlList></AccessControlPolicy>
+-------------------------------------------------------------------------------
|  RESPONSE
+-------------------------------------------------------------------------------
|    STATUS: 200
|   HEADERS: {"server"=>["Riak CS"], "date"=>["Wed, 31 Jul 2013 01:22:49 GMT"], "content-type"=>["text/plain"], "content-length"=>["0"], "proxy-connection"=>["Keep-alive"]}
|      BODY:
+-------------------------------------------------------------------------------
| AWS us-east-1 S3 get_object_acl 0.010334 0 retries
+-------------------------------------------------------------------------------
|   REQUEST
+-------------------------------------------------------------------------------
|    METHOD: GET
|       URL: http://acl-test-bucket-1375233769.s3.amazonaws.com::80:/acl-test-object-1375233769?acl
|   HEADERS: {"user-agent"=>"aws-sdk-ruby/1.14.1 ruby/1.9.3 x86_64-darwin12.4.0", "date"=>"Wed, 31 Jul 2013 01:22:49 GMT", "authorization"=>"AWS T3J4N9T8PNLAJMYLSWXB:QW094CPm9FHBaQZk5CcjgAwT58g="}
|      BODY:
+-------------------------------------------------------------------------------
|  RESPONSE
+-------------------------------------------------------------------------------
|    STATUS: 200
|   HEADERS: {"server"=>["Riak CS"], "date"=>["Wed, 31 Jul 2013 01:22:49 GMT"], "content-type"=>["application/octet-stream"], "content-length"=>["676"], "proxy-connection"=>["Keep-alive"]}
|      BODY: <?xml version="1.0" encoding="UTF-8"?><AccessControlPolicy><Owner><ID></ID><DisplayName></DisplayName></Owner><AccessControlList><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>51e365ea87a8cfb147ed24533cc36493f6664880c6960f5e378b1dcfe5d43039</ID><DisplayName>admin</DisplayName></Grantee><Permission>FULL_CONTROL</Permission></Grant><Grant><Grantee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="CanonicalUser"><ID>d9fc95243a15bcb0253977cb65a8ed51204ed16c848241eb409a04c1e196bf07</ID><DisplayName>acl_test</DisplayName></Grantee><Permission>READ</Permission></Grant></AccessControlList></AccessControlPolicy>

@ghost ghost assigned reiddraper Aug 1, 2013
@reiddraper
Contributor

I've also now been able to reproduce this with s3cmd.

@reiddraper
Contributor

The core issue here is that we only look for 'canned' ACLs in HTTP headers, both for the normal PUT object resource and the ?acl subresource. In the ?acl subresource we do read specific grants from the xml, if there is a body. The fix will be to add support for non-canned ACLs to be read from headers.
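
As a rough illustration of what reading non-canned grants from headers involves, here is a minimal Ruby sketch (Ruby to match the test case above; Riak CS itself is Erlang). The helper name `grants_from_headers` and both constants are hypothetical and are not the Riak CS implementation. It assumes the S3 header syntax of comma-separated `type="value"` pairs, where the type is `emailAddress`, `id`, or `uri`.

```ruby
# Hypothetical helper for illustration only; not the Riak CS code.
# Maps each x-amz-grant-* header name to the permission it grants.
GRANT_HEADERS = {
  "x-amz-grant-read"         => :read,
  "x-amz-grant-write"        => :write,
  "x-amz-grant-read-acp"     => :read_acp,
  "x-amz-grant-write-acp"    => :write_acp,
  "x-amz-grant-full-control" => :full_control
}.freeze

# One grantee looks like: emailAddress="user@example.com"
GRANTEE_PATTERN = /\A(emailAddress|id|uri)="([^"]+)"\z/

# Returns [permission, grantee_type, grantee_value] triples for every
# grant header present. Raises ArgumentError on a malformed grantee.
def grants_from_headers(headers)
  headers.flat_map do |name, value|
    permission = GRANT_HEADERS[name.downcase]
    next [] unless permission # not a grant header, skip it

    value.split(",").map(&:strip).map do |grantee|
      match = GRANTEE_PATTERN.match(grantee)
      raise ArgumentError, "malformed grantee: #{grantee}" unless match

      [permission, match[1], match[2]]
    end
  end
end
```

A strict parser along these lines would reject the `x-amz-grant-full-control` value sent by the script above, since that value lacks the `=` after `emailAddress`.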

@reiddraper
Contributor

Note to self: 'If you use these ACL specific headers, you cannot use x-amz-acl header to set a canned ACL.', from here.

@reiddraper
Contributor

As far as I can tell, s3cmd actually doesn't correctly do this with S3 either. We should still support it though.

@reiddraper
Contributor

This is going to slip to 1.5. WIP branch is at bugfix/create-specific-acl-grants-from-headers.

@reiddraper
Contributor

  • Figure out what to do if both headers and XML body are provided on the ACL subresource
  • Figure out what to do if both headers and canned ACL are provided
  • Make sure ACL owner is correctly calculated
  • Do buckets need to be able to accept ACL headers too?
  • What about multi-part upload?

@reiddraper
Contributor

Figure out what to do if both headers and XML body are provided on the ACL subresource

Returns HTTP 400:

<Error><Code>UnexpectedContent</Code><Message>This request does not support content</Message><RequestId>85EB8464B89BB03E</RequestId><HostId>drrWMRbRXPGihrJdRKAZ7Alb1GpeNzgPvoWF7NvvQdixs8YgGFTU92qtxk5ie+V9</HostId></Error>

@reiddraper
Contributor

Do buckets need to be able to accept ACL headers too?

Yes. See PUT bucket acl.

@reiddraper
Contributor

Figure out what to do if both headers and canned ACL are provided

Returns HTTP 400:

<Error><Code>InvalidRequest</Code><Message>Specifying both Canned ACLs and Header Grants is not allowed</Message><RequestId>EC102B8ECAD083BC</RequestId><HostId>aXzlpoh3iYt1Z8zDflklsxV/J7Hitcx2+8sA4x1vt0sZDaCi4ekozzOFf7Ccdjbv</HostId></Error>

@reiddraper
Contributor

And if you specify all three:

  • canned acl
  • specific acl header
  • xml body

you get the HTTP 400 UnexpectedContent error.
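
The S3 responses observed in the last few comments can be summarized as a small decision function. This is only a sketch of that observed behavior: `check_acl_sources` and `AclError` are hypothetical names, and the case of a canned ACL combined with an XML body (without header grants) was not tested in this thread, so treating it like the header-grant case is an assumption.

```ruby
# Hypothetical sketch of the S3 behavior observed above; not Riak CS code.
AclError = Struct.new(:status, :code, :message)

# Each argument is truthy when that ACL source is present on the request.
# Returns nil when at most one source is present, otherwise an AclError.
def check_acl_sources(canned_acl:, header_grants:, xml_body:)
  acl_in_headers = canned_acl || header_grants

  if acl_in_headers && xml_body
    # Observed for header grants + body, and for all three together.
    # Assumed (not tested in the thread) for canned ACL + body.
    AclError.new(400, "UnexpectedContent",
                 "This request does not support content")
  elsif canned_acl && header_grants
    AclError.new(400, "InvalidRequest",
                 "Specifying both Canned ACLs and Header Grants is not allowed")
  end
end
```

Checking the body first is what makes the all-three case report UnexpectedContent rather than InvalidRequest, which matches the response described above.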

@reiddraper
Contributor

What about multi-part upload?

We'll need to support the same acl headers in Initiate multipart upload.

@reiddraper
Contributor

Note: we don't support changing the ACL right now with Put Copy (we should). This was added in #425.

Previously, ACLs could only be set via canned-ACL headers, or through
the XML document PUT to the ?acl subresource. S3 also supports ACLs
being set through non-canned ACL headers, like `x-amz-grant-read` and
`x-amz-grant-write`. The specifics of the syntax are described in each
of the resources, like in [PUT
Bucket](http://docs.aws.amazon.com/AmazonS3/latest/API/RESTBucketPUT.html).
More information on ACLs can also be found on the [ACL
Overview](http://docs.aws.amazon.com/AmazonS3/latest/dev/ACLOverview.html).
With this change, there are now three ways to specify an ACL for a
bucket or object:

1. a canned-ACL header
2. specific ACL-grants in the headers
3. an XML document PUT to the ACL subresource

Only _one_ of these can be provided on a given request. HTTP 400 will be
returned if the user provides more than one of these ACLs, with an error
message equivalent to what S3 provides. This change applies to the
following resources:

* riak_cs_wm_object
* riak_cs_wm_object_acl
* riak_cs_wm_bucket
* riak_cs_wm_bucket_acl
* riak_cs_wm_object_upload (initiate multi-part upload)

The following lists several other notable changes:

* Along the way, this commit also fixes a bug in
 `riak_cs_acl_utils:add_grant/2`:

 `lists:splitwith` only splits on the first occurrence. `lists:partition`
 provides the expected behavior here. An EQC test has been added.

* Remove unused record field

 This record is never persisted, so changing the field would only matter
 for live-code changes, which we tend not to do.

* Add riak_test and tests to both the Clojure and Python test suites.
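
The difference between `lists:splitwith` and `lists:partition` mentioned in the first bullet can be shown with a Ruby analogue. This is a sketch only: `split_with` is a hypothetical helper that mimics `lists:splitwith/2` using `take_while` and `drop_while`, the sample grants are made up, and how `add_grant/2` actually consumes the result is not shown in this thread.

```ruby
# Ruby analogue of Erlang's lists:splitwith/2: cut the list at the first
# element that fails the predicate. Hypothetical helper for illustration.
def split_with(list, &pred)
  [list.take_while(&pred), list.drop_while(&pred)]
end

# Made-up sample data: [grantee, permission] pairs.
grants = [[:alice, :read], [:bob, :read], [:alice, :write]]
is_alice = ->(grant) { grant.first == :alice }

# Stops at the first non-matching element, so the later :alice grant
# is left behind in the second list.
prefix, rest = split_with(grants, &is_alice)
# prefix => [[:alice, :read]]
# rest   => [[:bob, :read], [:alice, :write]]

# partition (like lists:partition/2) examines every element.
matching, others = grants.partition(&is_alice)
# matching => [[:alice, :read], [:alice, :write]]
# others   => [[:bob, :read]]
```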
@@ -92,8 +93,7 @@
api :: atom()
}).

-record(key_context, {context :: #context{},
Contributor

Nice catch. I didn't even know it was never used.

@kuenishi
Contributor

I'll take this review, but the time is up for today. Will resume tomorrow.

  • code review
  • spec review
  • riak_test
  • dialyzer
  • xref
  • eunit

Here's one high-level question: does the ACL thing have nothing to do with the Swift API?

@reiddraper
Contributor

Here's one high-level question: does the ACL thing have nothing to do with the Swift API?

Nope.

@kuenishi
Contributor

Note: in R16B03 make pulse failed. With patched R1501 it went well.

Compiled test/twop_set_eqc.erl
test/riak_cs_get_fsm_pulse.erl:none: internal error in lint_module;
crash reason: {function_clause,
                  [{erl_internal,bif,
                       [{atom,50,setup},{integer,50,0}],
                       [{file,"erl_internal.erl"},{line,248}]},
                   {erl_lint,expr,3,[{file,"erl_lint.erl"},{line,2018}]},
                   {erl_lint,'-expr_list/3-fun-0-',3,
                       [{file,"erl_lint.erl"},{line,2151}]},
                   {lists,foldl,3,[{file,"lists.erl"},{line,1248}]},
                   {erl_lint,expr_list,3,[{file,"erl_lint.erl"},{line,2150}]},
                   {erl_lint,'-expr_list/3-fun-0-',3,
                       [{file,"erl_lint.erl"},{line,2151}]},
                   {lists,foldl,3,[{file,"lists.erl"},{line,1248}]},
                   {erl_lint,expr_list,3,
                       [{file,"erl_lint.erl"},{line,2150}]}]}
ERROR: eunit failed while processing /home/kuenishi/src/riak_cs: rebar_abort

@kuenishi
Contributor

One new warning seems to appear with this branch:

riak_cs_acl_utils.erl:188: Invalid type specification for function riak_cs_acl_utils:split_header_values_and_strip/1. The success typing is (string()) -> [string()]

Reid: fixed in 2f32ef8

eqc_test_() ->
{spawn,
[
{timeout, 60, ?_assertEqual(true, quickcheck(numtests(?TEST_ITERATIONS, ?QC_OUT(prop_add_grant_idempotent()))))}
Contributor

Trivial, but maybe this line is too long?

Contributor Author

I agree it's long, but I've always had trouble in Erlang getting long lines to be readable once wrapped. This is what I came up with, but I think it's quite a bit less readable. Thoughts?

@kuenishi
Contributor

So here comes another question; we probably don't need it, but I just want to be sure. Do we do any canonical ID validation when PUTting an ACL in the "id=deadbeef01010101badbadbad" style?

The rest of the code looks nice and straightforward, though I'm not a good Clojure reader.

@reiddraper
Contributor

Note: in R16B03 make pulse failed. With patched R1501 it went well.

I've been using the basho-patched R16B02.

Fixes dialyzer error
The test was written, and passing, but it wasn't being run because it
wasn't in the test suite description.
Prevent starts_and_ends_with_quotes from raising an exception by calling
`hd` on an empty list. Further, ensure at least one character could be
between the quotes.
@kuenishi
Contributor

+1 to merge. Nice work! (Confirmed that make pulse passed with R16B02)

Note that this code will conflict with #761. Please merge with care.

@reiddraper
Contributor

I still have to fix the test that wasn't included, too. I'll probably just end up rebasing and squashing this whole branch against develop one more time before merging too.

This test was previously written, but just not run
@reiddraper
Contributor

I've pushed a rebased (not-yet-squashed) branch here. Some of the conflicts were detected in the rebase, and others were fixed in 3be0660.

@reiddraper
Contributor

@shino would you mind taking a look at my rebased branch (and 3be0660) to make sure I've correctly used the features from your branch?

@reiddraper
Contributor

Blagh, looks like 3be0660 doesn't work actually. Will look for a fix now.

@reiddraper
Contributor

So, the issue I'm running into is that not all of the resources use the #key_context{} type. Some of them have an empty local_context, like riak_cs_wm_bucket and riak_cs_wm_bucket_acl. Since the bucket_object field was added to key_context, it's not available (nor populated with ensure_doc) in those resources. And my use of riak_cs_wm_utils:maybe_acl_from_context_and_request/2 tries to be agnostic to the specific context it's being called with, i.e., sometimes the local_context field is going to be populated, and sometimes it won't. Any ideas? Maybe the bucket_object field should belong in the 'upper/outer' context?

@shino
Contributor

shino commented Jan 22, 2014

Quick note before reading diffs/conflicts.

I added bucket_object to #key_context because bucket objects are not modified by object operations.
It may be possible to add bucket_object to the upper context (#context) if we take care of bucket object updates.

@shino
Contributor

shino commented Jan 22, 2014

One possible, but not-so-beautiful, fix is (I haven't executed or tested it):

--- a/src/riak_cs_wm_utils.erl
+++ b/src/riak_cs_wm_utils.erl
@@ -420,10 +420,21 @@ maybe_update_context_with_acl_from_headers(RD, Ctx=#context{user=User}) ->
 %% It could also reasonable be called `nothing'.
 -spec maybe_acl_from_context_and_request(#wm_reqdata{}, #context{}) ->
     {ok, acl_or_error()} | error.
-maybe_acl_from_context_and_request(RD, #context{user=User,
-                                                local_context=KeyContext,
+maybe_acl_from_context_and_request(RD, #context{bucket=Bucket,
+                                                user=User,
+                                                local_context=LocalContext,
                                                 riakc_pid=RiakcPid}) ->
-    BucketObj = KeyContext#key_context.bucket_object,
+    BucketObj = case LocalContext of
+                    %% #key_context has bucket object as its field
+                    _ when is_record(LocalContext, key_context) ->
+                        LocalContext#key_context.bucket_object;
+                    _ ->
+                        case riak_cs_utils:fetch_bucket_object(Bucket, RiakcPid) of
+                            {ok, Obj} -> Obj;
+                            %% In PUT Bucket, a bucket object does not exists beforehand.
+                            {error, no_such_bucket} -> ??;
+                            {error, notfound} -> ??;
+                            {error, Reason} -> real error??
+                end,

@reiddraper
Contributor

Yeah, that was basically my thought as well. Hmm.

@reiddraper
Contributor

What are the downsides of putting the bucket_object field in the outer #context{}, instead of #key_context{}?

@reiddraper
Contributor

Put more concretely, what i'm suggesting is moving bucket_object to the outer context, and writing a function similar to ensure_doc, but that works on the outer #context{}.

@shino
Contributor

shino commented Jan 23, 2014

Put more concretely, what i'm suggesting is moving bucket_object to the outer context, and writing a function similar to ensure_doc, but that works on the outer #context{}.

No strong objection :)

I decided to use bucket_object as a cache (= unmodified during the wm resource lifecycle)
and put it in #key_context in #761.
Some thoughts from that time:

  • riak_cs_wm_buckets is not bucket-specific
  • Some of the wm resources related to buckets (e.g. riak_cs_wm_bucket) trigger updates
    of bucket objects, but the updates are executed in stanchion, outside of riak_cs.
    We would have to decide what bucket_object is set to: undefined,
    it_has_been_updated_and_stale_so_get_again, or a new riak object obtained by executing get again.

So if I included bucket_object in #context, it would have to represent several
kinds of status, which is a little too complicated for a single field.

  1. normal bucket object
  2. does not exist: this is correct for PUT Bucket (create bucket)
  3. unspecified: the riak_cs_wm_buckets case
  4. (optional) stale marker

@shino
Contributor

shino commented Jan 23, 2014

Just an idea: instead of using the raw riak object, holding a record (a brand new one or ?RCS_BUCKET, hmm) might work better (?)

@kuenishi
Contributor

Put more concretely, what i'm suggesting is moving bucket_object to the outer context, and writing a function similar to ensure_doc, but that works on the outer #context{}.

+1 to this idea.

That's enough to proceed with the other work for now, but is there any reason why the ACL is stored as metadata rather than in the bucket object itself? As a standalone refactoring, we'd be better off moving from #moss_bucket_v1{} to #rcs_bucket_v2{}, which includes the bucket policy as well. Or this might be a separate refactoring issue.

@reiddraper
Contributor

Yeah, @shino makes some good points. Not all resources that have a context operate on just a single bucket (though there is already a bucket field in the context now (facepalm)). Maybe I will just make a separate local_context for riak_cs_wm_bucket and riak_cs_wm_bucket_acl for now, and ensure the bucket_object field is populated. It would be really nice if we were able to be polymorphic and say something like context.bucket_object, where context was one of key_context or this new context I describe. Guess we'll have to wait for maps.

@reiddraper
Contributor

Alright, still need to squash and stuff, but feeling pretty good about the changes here. Not the most elegant solution, but it passes the tests, and could be worse.

@kuenishi
Contributor

Those changes look pretty good to me.

@kuenishi
Contributor

Dialyzer, riak_test, xref, eunit, pulse: all tests are green for me as well. A newly squashed branch named bugfix/create-specific-acl-grants-from-headers-squashed-rebased-squashed would be cool :D

---------------------------------------------
0 Tests Failed
17 Tests Passed
That's 100.0% for those keeping score

@reiddraper
Contributor

Alright, what I intend to be the final branch/commit has been pushed here. It's just been squashed from the previous branch.

@kuenishi
Contributor

+1 to that branch (bugfix/create-specific-acl-grants-from-headers-squashed-rebased-squashed).
I'm coming to think that a much more elegant solution can only be found after a major refactoring of the module and function naming around the wm resources (though the FSMs are already fairly modular).

@reiddraper
Contributor

Merged in 3039694.

@reiddraper reiddraper closed this Jan 29, 2014