S3: head_object and get_object should accept the same parameters #3945

Closed
jrandall opened this issue Nov 22, 2023 · 4 comments
Assignees
Labels
api-documentation documentation This is a problem with documentation. s3 service-api This issue is caused by the service API, not the SDK implementation.

Comments

@jrandall

Describe the bug

The documentation for HeadObject says that "A HEAD request has the same options as a GET action on an object." This is consistent with the HTTP semantics for HEAD as defined in RFC7231 section 4.3.2. I have verified that the AWS S3 service does in fact accept all of the parameters for GetObject as parameters to HeadObject and that the response is identical to that of a GetObject request.

However, the URI Request Parameters that the documentation goes on to enumerate for HeadObject do not match those listed for GetObject: the HeadObject list is missing the six response-* parameters that the GetObject list includes.

My guess is that this documentation and the botocore service description for s3 are generated from the same source material, and as a result the botocore client does not accept the six Response* parameters to head_object that are accepted by get_object. Importantly, this is also a problem when generating a presigned URL (e.g. by calling generate_presigned_url with ClientMethod='head_object').

When building an HTTP service that generates pre-signed S3 URLs to which it redirects its own GET requests (including setting custom response headers), this limitation means it is not possible to implement a semantically appropriate response to a HEAD request. Ideally, we could call generate_presigned_url with exactly the same set of Params and change only the ClientMethod: get_object when redirecting a GET request, and head_object when redirecting a HEAD request.
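The redirect pattern described above could be sketched roughly as follows. This is only an illustration: `presign_for` is a hypothetical helper name, not part of boto3, and it assumes a client whose `generate_presigned_url` accepts the same Params for both operations, which is exactly what fails on the botocore version reported in this issue.

```python
def presign_for(s3_client, http_method, params, expires_in=3600):
    """Return a presigned URL for the S3 operation matching the HTTP method.

    `presign_for` is a hypothetical helper for illustration only.
    `s3_client` is expected to behave like boto3.client('s3').
    """
    # Map the incoming HTTP verb onto the corresponding client method.
    client_methods = {"GET": "get_object", "HEAD": "head_object"}
    try:
        client_method = client_methods[http_method.upper()]
    except KeyError:
        raise ValueError(f"unsupported HTTP method: {http_method!r}")
    # The same Params dict is passed through unchanged for both verbs.
    return s3_client.generate_presigned_url(
        ClientMethod=client_method,
        Params=params,
        ExpiresIn=expires_in,
    )
```

With the SDK versions listed in this report, the HEAD branch would raise ParamValidationError as soon as `params` contains any of the Response* keys.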

Expected Behavior

Calls to head_object should accept the same parameters as calls to get_object, and calls to generate_presigned_url with ClientMethod='head_object' should accept the same parameters as with ClientMethod='get_object'.

Current Behavior

% python3
Python 3.11.6 (main, Oct  2 2023, 20:46:17) [Clang 14.0.0 (clang-1400.0.29.202)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import boto3
>>> s3 = boto3.client('s3')
>>> params = {'Bucket': 'BUCKET_NAME', 'Key': 'OBJECT_NAME', 'ResponseContentType': 'text/plain'}
>>> s3.generate_presigned_url(ClientMethod='get_object', Params=params)
'https://s3.us-west-2.amazonaws.com/BUCKET_NAME/OBJECT_NAME?response-content-type=text%2Fplain&AWSAccessKeyId=NOT_A_REAL_ACCESS_KEY_ID&Signature=VCnPUJ0OX7JFZO6FXJemDaT%2BCNs%3D&Expires=1700685212'
>>> s3.generate_presigned_url(ClientMethod='head_object', Params=params)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/boto-venv/lib/python3.11/site-packages/botocore/signers.py", line 672, in generate_presigned_url
    request_dict = self._convert_to_request_dict(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/boto-venv/lib/python3.11/site-packages/botocore/client.py", line 1010, in _convert_to_request_dict
    request_dict = self._serializer.serialize_to_request(
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/boto-venv/lib/python3.11/site-packages/botocore/validate.py", line 381, in serialize_to_request
    raise ParamValidationError(report=report.generate_report())
botocore.exceptions.ParamValidationError: Parameter validation failed:
Unknown parameter in input: "ResponseContentType", must be one of: Bucket, IfMatch, IfModifiedSince, IfNoneMatch, IfUnmodifiedSince, Key, Range, VersionId, SSECustomerAlgorithm, SSECustomerKey, SSECustomerKeyMD5, RequestPayer, PartNumber, ExpectedBucketOwner, ChecksumMode

Reproduction Steps

import boto3
s3 = boto3.client('s3')
params = {'Bucket': 'BUCKET_NAME', 'Key': 'OBJECT_NAME', 'ResponseContentType': 'text/plain'}
s3.generate_presigned_url(ClientMethod='get_object', Params=params)
s3.generate_presigned_url(ClientMethod='head_object', Params=params)

Possible Solution

I have confirmed that a simple fix solves this issue completely: modify the S3 service description in botocore so that the members field of HeadObjectRequest includes the same six Response* entries as GetObjectRequest (by copying those entries from GetObjectRequest to the appropriate place in HeadObjectRequest).

Patch:

diff --git a/botocore/data/s3/2006-03-01/service-2.json b/botocore/data/s3/2006-03-01/service-2.json
index 94ce66d32..f557fb699 100644
--- a/botocore/data/s3/2006-03-01/service-2.json
+++ b/botocore/data/s3/2006-03-01/service-2.json
@@ -5451,6 +5451,42 @@
           "location":"header",
           "locationName":"Range"
         },
+        "ResponseCacheControl":{
+          "shape":"ResponseCacheControl",
+          "documentation":"<p>Sets the <code>Cache-Control</code> header of the response.</p>",
+          "location":"querystring",
+          "locationName":"response-cache-control"
+        },
+        "ResponseContentDisposition":{
+          "shape":"ResponseContentDisposition",
+          "documentation":"<p>Sets the <code>Content-Disposition</code> header of the response</p>",
+          "location":"querystring",
+          "locationName":"response-content-disposition"
+        },
+        "ResponseContentEncoding":{
+          "shape":"ResponseContentEncoding",
+          "documentation":"<p>Sets the <code>Content-Encoding</code> header of the response.</p>",
+          "location":"querystring",
+          "locationName":"response-content-encoding"
+        },
+        "ResponseContentLanguage":{
+          "shape":"ResponseContentLanguage",
+          "documentation":"<p>Sets the <code>Content-Language</code> header of the response.</p>",
+          "location":"querystring",
+          "locationName":"response-content-language"
+        },
+        "ResponseContentType":{
+          "shape":"ResponseContentType",
+          "documentation":"<p>Sets the <code>Content-Type</code> header of the response.</p>",
+          "location":"querystring",
+          "locationName":"response-content-type"
+        },
+        "ResponseExpires":{
+          "shape":"ResponseExpires",
+          "documentation":"<p>Sets the <code>Expires</code> header of the response.</p>",
+          "location":"querystring",
+          "locationName":"response-expires"
+        },
         "VersionId":{
           "shape":"ObjectVersionId",
           "documentation":"<p>VersionId used to reference a specific version of the object.</p>",
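For anyone who wants to try the same idea without editing the JSON by hand, here is a rough sketch of the copy step. It runs against a tiny in-memory stand-in for the service description rather than the real service-2.json; `copy_response_overrides` and `toy_model` are hypothetical names, and the toy shapes only include the fields needed for the illustration.

```python
import copy

# The six overrides and their query-string names, as listed in the patch above.
RESPONSE_OVERRIDES = {
    "ResponseCacheControl": "response-cache-control",
    "ResponseContentDisposition": "response-content-disposition",
    "ResponseContentEncoding": "response-content-encoding",
    "ResponseContentLanguage": "response-content-language",
    "ResponseContentType": "response-content-type",
    "ResponseExpires": "response-expires",
}


def copy_response_overrides(model):
    """Return a copy of `model` with the overrides added to HeadObjectRequest."""
    patched = copy.deepcopy(model)  # leave the caller's model untouched
    get_members = patched["shapes"]["GetObjectRequest"]["members"]
    head_members = patched["shapes"]["HeadObjectRequest"]["members"]
    for name in RESPONSE_OVERRIDES:
        # Only add members that HeadObjectRequest does not already define.
        head_members.setdefault(name, copy.deepcopy(get_members[name]))
    return patched


# Toy stand-in for the service description (an assumption, heavily trimmed).
_bucket = {"shape": "BucketName", "location": "uri", "locationName": "Bucket"}
toy_model = {
    "shapes": {
        "GetObjectRequest": {
            "type": "structure",
            "members": {
                "Bucket": dict(_bucket),
                **{
                    name: {"shape": name, "location": "querystring", "locationName": query}
                    for name, query in RESPONSE_OVERRIDES.items()
                },
            },
        },
        "HeadObjectRequest": {"type": "structure", "members": {"Bucket": dict(_bucket)}},
    }
}

patched = copy_response_overrides(toy_model)
print(sorted(patched["shapes"]["HeadObjectRequest"]["members"]))
```

Unlike the patch, this appends the members at the end of HeadObjectRequest instead of placing them after Range; the position inside the members object should not matter for parameter validation.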

Additional Information/Context

No response

SDK version used

boto3 1.29.5, botocore 1.32.5

Environment details (OS name and version, etc.)

Darwin 23.1.0 arm64

@jrandall jrandall added bug This issue is a confirmed bug. needs-triage This issue or PR still needs to be triaged. labels Nov 22, 2023
@jrandall jrandall changed the title head_object and get_object should accept the same parameters S3: head_object and get_object should accept the same parameters Nov 22, 2023
@tim-finnigan tim-finnigan self-assigned this Nov 29, 2023
@tim-finnigan tim-finnigan added investigating This issue is being investigated and/or work is in progress to resolve the issue. s3 labels Nov 29, 2023
@tim-finnigan
Contributor

Hi @jrandall thanks for reaching out. In the documentation for the HeadObject API it notes:

The HEAD operation retrieves metadata from an object without returning the object itself. This operation is useful if you're interested only in an object's metadata.

A HEAD request has the same options as a GET operation on an object. The response is identical to the GET response except that there is no response body.

So the Response* parameters you referenced are intentionally excluded by design. Also these service API models are provided by the S3 team - and for questions involving service API functionality we recommend reaching out through AWS Support. For feedback on the documentation we recommend using the Provide feedback link at the bottom of API docs pages.

@tim-finnigan tim-finnigan added documentation This is a problem with documentation. service-api This issue is caused by the service API, not the SDK implementation. and removed bug This issue is a confirmed bug. investigating This issue is being investigated and/or work is in progress to resolve the issue. needs-triage This issue or PR still needs to be triaged. labels Nov 29, 2023
@jrandall
Author

Hi @tim-finnigan - thanks for your attention and for looking into this. The documentation you cited is exactly the documentation that tells me that the response parameters should be accepted by the HEAD method.

A HEAD request has the same options as a GET operation on an object. The response is identical to the GET response except that there is no response body.

To be clear, the response options do not in any way alter the response body, they only alter the headers sent back in the response. See the documentation for GetObject:

Overriding response header values through the request
There are times when you want to override certain response header values of a GetObject response. For example, you might override the Content-Disposition response header value through your GetObject request.

You can override values for a set of response headers. These modified response header values are included only in a successful response, that is, when the HTTP status code 200 OK is returned. The headers you can override using the following query parameters in the request are a subset of the headers that Amazon S3 accepts when you create an object.

At an HTTP level, the purpose of making a HEAD request on a resource is to examine the headers of the response that would be sent if a GET request were made, but without transferring the actual payload body.

From RFC7231:

The server SHOULD send the same header fields in response to a HEAD request as it would have sent if the request had been a GET,

To be clear, with the AWS S3 service this is possible. The S3 service itself appears to implement this completely consistently with the documentation and the expected HTTP semantics.

The only issue here is that the API documentation, and the clients (such as boto3 / botocore) whose implementation is generated from the same API model as the documentation, do not list the response parameters for HeadObject, only for GetObject.

Without support for the response options, a presigned URL for the HEAD method cannot be constructed that would give the same headers as a presigned URL for the GET method that sets the Response* options, because the headers in the response would be set differently in the GET response and the HEAD response. This is exactly the issue I am having. I can use a third-party library for interacting with S3 and avoid this issue, but I thought it would be good to bring it to your attention so that you can fix all of the AWS supported SDK libraries.

I will raise this issue through the documentation feedback form as well, as you suggest.

@jrandall
Author

Just a note to say that I did pursue this with the AWS S3 service team and they have now released the fix. Public facing documentation has been fixed, and as of botocore and boto3 1.34.138, all of the response overrides available to the GetObject method are now also available to the HeadObject method: https://github.com/boto/boto3/blob/develop/.changes/1.34.138.json#L12-L16
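If your code has to run against both older and newer SDK releases, a simple version gate is one option. The sketch below is only an assumption-laden illustration: `supports_head_response_overrides` is a hypothetical helper, and it assumes a plain X.Y.Z version string with numeric components.

```python
def supports_head_response_overrides(version):
    """Return True if `version` is 1.34.138 or newer.

    Hypothetical helper; assumes a plain numeric 'X.Y.Z' version string.
    """
    # Compare numerically so that, for example, 1.34.138 sorts after 1.34.99.
    parts = tuple(int(piece) for piece in version.split(".")[:3])
    return parts >= (1, 34, 138)


print(supports_head_response_overrides("1.32.5"))    # version from this report
print(supports_head_response_overrides("1.34.138"))  # release containing the fix
```

In practice you could pass `botocore.__version__` to it before deciding whether to include Response* keys in the Params for head_object.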
