Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Expose --metadata-directive for s3 commands #1188

Merged
merged 3 commits into from Mar 5, 2015

Conversation

kyleknap
Copy link
Member

Before if you specified parameters like --content-type, --content-encoding, etc when you were doing a CopyObject, those values would not be applied to the copied object. To actually use those values you need to specify REPLACE for the metadata-directive header. So this exposes the header as --metadata-directive for cp and mv. Note this is the same parameter name as the a3api copy-object parameter.

cc @jamesls @danielgtaylor

@coveralls
Copy link

Coverage Status

Coverage increased (+0.23%) to 92.03% when pulling b88c3b4 on kyleknap:cp-replace into bd14d17 on aws:develop.

@danielgtaylor
Copy link
Contributor

Overall I think that this looks okay, but I have some high level comments. The fact that I can run this and get surprising behavior without any warnings or errors is kind of scary:

$ aws s3 cp s3://bucket/key s3://bucket/key2 --content-type text/plain

Considering the s3 command suite is very high level we might think about either making sure we detect cases like this and print out a big warning, or we handle this for the user by automatically setting REPLACE and copying all the other values. As a user I would expect the above command to do the following:

  1. Copy the object data to the new location
  2. Copy any associated metadata to the new location
  3. Overwrite the content type to be text/plain

If I want fine-grained control, I could always pass --metadata-directive myself to prevent the automatic behavior above. I'm guessing the above steps may be able to remove the element of surprise for the 80% of "normal" use cases. Thoughts?

@kyleknap
Copy link
Member Author

Yes that is the ideal scenario that you laid out.

I talked to @jamesls about this. We considered automatically setting REPLACE at the beginning but we started realizing that it would require a larger change in the s3 commands and much more discussion in order to ensure consistency and to meet user expectations.

One HeadObject will not give us all of the information that we need to get an exact copy of the object. In the end it will require like 4 or 5 different operations not including the copy to get an exact copy of the object. So it would be weird if just used HeadObject and some information was copied over while others parts were not. Also, it would force users, at the minimum, to use an extra HeadObject call for each object copied even though the user may know exactly what metadata parameters they want in the copied object. In short, we need some opt-in or opt-out option for these types of workflows.

Exposing --metadata-directive just provides a quick way for users to work around this issue currently as it was exposed in s3api but not s3 commands.

I would not be opposed to adding a warning, but there is one edge case that comes up: a multipart copy has completely different metadata behavior than a non-multipart copy. For the multipart-copy, it will actually obey the passing of metadata like --content-type because it is unable to automatically copy metadata. This causes an issue because there is no way to differentiate if any of the files that we plan to upload require a non-multipart copy until we are well into the file transferring process. To successfully be able to warn about this, it will be a good amount of extra complexity in the file transferring part of the code (i.e. is not a simple check at the beginning of the process). If we did add a simple check at the beginning, it would make false-positive warnings at the beginning of the code for multipart copies. I could also just include this tidbit about multipart copies in the description of the parameter.

Let me know what you think.

@danielgtaylor
Copy link
Contributor

@kyleknap, given the complexity of making this simpler and the fact that customers of the high level S3 commands are currently blocked on getting any method of setting this parameter, I'm in favor of merging in this change to unblock those customers.

I would like to circle back in the future and think about how best to provide an interface with the least surprise for customers. I don't think that being difficult to implement is a good argument, but increasing customer cost and general inconsistency from the underlying low-level API are both good reasons to punt on this for now until we can come up with a better solution. It feels like this may need to be an opt-in feature.

Please add some information about multipart copies to the docs, then 🚢-it!

@kyleknap
Copy link
Member Author

kyleknap commented Mar 5, 2015

@danielgtaylor I updated the wording for the --metadata-directive docs let me know how that sounds.

@jamesls
Copy link
Member

jamesls commented Mar 5, 2015

FWIW, :shipit: looks good to me.

@danielgtaylor
Copy link
Contributor

LGTM 🚢-it!

kyleknap added a commit that referenced this pull request Mar 5, 2015
Expose --metadata-directive for s3 commands
@kyleknap kyleknap merged commit 272791f into aws:develop Mar 5, 2015
@kyleknap kyleknap deleted the cp-replace branch March 5, 2015 20:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants