Can not PUT a 0 bytes object when Object Lock is enabled on s3 bucket #1979

Closed
wimglenn opened this issue Feb 21, 2020 · 9 comments · Fixed by #1985 or #2221
Labels: bug (This issue is a confirmed bug.), s3

Comments

@wimglenn
Contributor

wimglenn commented Feb 21, 2020

When uploading an empty file to an s3 bucket which has object lock enabled:

An error occurred (InvalidRequest) when calling the PutObject operation: Content-MD5 HTTP header is required for Put Object requests with Object Lock parameters

The API requires this header, but botocore does not inject it. I think the problem is probably in the handler for before-call.s3.PutObject:

https://github.com/boto/botocore/blame/1aac4d7c1f311826a1a4686e54444f6ded72ea45/botocore/handlers.py#L187-L196

Look at the if-statement:

if request_dict['body'] and 'Content-MD5' not in params['headers']:
    # ... the params['headers']['Content-MD5'] gets set ...

It neglects the possibility that request_dict["body"] is empty.

For files of 1 byte or more, Content-MD5 is set automatically and the upload succeeds.
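To make the point about the if-statement concrete, here is a minimal sketch of the quoted condition applied to an empty and a non-empty body. The function name `old_guard` and the hand-built dictionaries are assumptions for illustration only; they are not botocore code.

```python
def old_guard(request_dict):
    """Mirror of the quoted condition: it tests the truthiness of the body."""
    return bool(request_dict['body']) and 'Content-MD5' not in request_dict['headers']


# Hypothetical request dictionaries, reduced to the two keys the condition reads.
non_empty = {'body': b'x', 'headers': {}}
empty = {'body': b'', 'headers': {}}

print(old_guard(non_empty))  # True: the header would be computed
print(old_guard(empty))      # False: the header is skipped for a 0 byte body
```

An empty body is falsy in Python, so the branch that sets the header is never entered for it.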

@swetashre swetashre self-assigned this Feb 21, 2020
@swetashre swetashre added the s3 label Feb 21, 2020
@swetashre
Contributor

Thank you for your post. I am not able to reproduce the issue with an Object Lock enabled S3 bucket. Can you please give me a code sample that reproduces it? I have tested with this code:

import boto3

s3 = boto3.client('s3')
data = ''  # empty body
res = s3.put_object(Bucket='bucketname', Body=data, Key='emptyfile.txt')

It would be helpful if you could provide debug logs as well. You can enable logging by adding boto3.set_stream_logger('') to your code.

@wimglenn
Contributor Author

wimglenn commented Feb 21, 2020

Hi @swetashre, thank you for the prompt response. It seems you're correct that this doesn't occur with put_object. But you should see that when using s3.upload_file, the Content-MD5 header does not get set. Here is a reproducer:

import boto3

boto3.set_stream_logger('')
s3 = boto3.client('s3')
with open("./emptyfile.txt", "w") as f:
    pass
res = s3.upload_file(Bucket='bucketname', Filename="emptyfile.txt", Key='emptyfile.txt')

@swetashre
Contributor

I am not able to reproduce the issue with the upload_file API call. Can you please provide the debug log? You can enable logging by adding boto3.set_stream_logger('') to your code.

@wimglenn
Contributor Author

Here is the log. I have search/replaced the actual bucket name with example-bucket and replaced some strings which looked auth-related with XXXXX.

>>> botocore.__version__
'1.15.5'
>>> boto3.__version__
'1.12.5'
>>> client.upload_file(Bucket=bucket, Filename=fname, Key=key)
2020-02-24 14:50:32,471 s3transfer.utils [DEBUG] Acquiring 0
2020-02-24 14:50:32,472 s3transfer.tasks [DEBUG] UploadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7fffe77f5780>}) about to wait for the following futures []
2020-02-24 14:50:32,472 s3transfer.tasks [DEBUG] UploadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7fffe77f5780>}) done waiting for dependent futures
2020-02-24 14:50:32,473 s3transfer.tasks [DEBUG] Executing task UploadSubmissionTask(transfer_id=0, {'transfer_future': <s3transfer.futures.TransferFuture object at 0x7fffe77f5780>}) with kwargs {'client': <botocore.client.S3 object at 0x7fffe7676a20>, 'config': <boto3.s3.transfer.TransferConfig object at 0x7fffe785cef0>, 'osutil': <s3transfer.utils.OSUtils object at 0x7fffe785cdd8>, 'request_executor': <s3transfer.futures.BoundedExecutor object at 0x7fffe77f5048>, 'transfer_future': <s3transfer.futures.TransferFuture object at 0x7fffe77f5780>}
2020-02-24 14:50:32,473 s3transfer.futures [DEBUG] Submitting task PutObjectTask(transfer_id=0, {'bucket': 'example-bucket', 'key': 'emptyfile.txt', 'extra_args': {}}) to executor <s3transfer.futures.BoundedExecutor object at 0x7fffe77f5048> for transfer request: 0.
2020-02-24 14:50:32,473 s3transfer.utils [DEBUG] Acquiring 0
2020-02-24 14:50:32,474 s3transfer.tasks [DEBUG] PutObjectTask(transfer_id=0, {'bucket': 'example-bucket', 'key': 'emptyfile.txt', 'extra_args': {}}) about to wait for the following futures []
2020-02-24 14:50:32,474 s3transfer.utils [DEBUG] Releasing acquire 0/None
2020-02-24 14:50:32,474 s3transfer.tasks [DEBUG] PutObjectTask(transfer_id=0, {'bucket': 'example-bucket', 'key': 'emptyfile.txt', 'extra_args': {}}) done waiting for dependent futures
2020-02-24 14:50:32,474 s3transfer.tasks [DEBUG] Executing task PutObjectTask(transfer_id=0, {'bucket': 'example-bucket', 'key': 'emptyfile.txt', 'extra_args': {}}) with kwargs {'client': <botocore.client.S3 object at 0x7fffe7676a20>, 'fileobj': <s3transfer.utils.ReadFileChunk object at 0x7fffe77f5a20>, 'bucket': 'example-bucket', 'key': 'emptyfile.txt', 'extra_args': {}}
2020-02-24 14:50:32,476 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function validate_ascii_metadata at 0x7fffe94d3e18>
2020-02-24 14:50:32,476 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function sse_md5 at 0x7fffe94d32f0>
2020-02-24 14:50:32,476 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function convert_body_to_file_like_object at 0x7fffe94d4730>
2020-02-24 14:50:32,476 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function validate_bucket_name at 0x7fffe94d3268>
2020-02-24 14:50:32,476 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <bound method S3RegionRedirector.redirect_from_cache of <botocore.utils.S3RegionRedirector object at 0x7fffe78a1c50>>
2020-02-24 14:50:32,476 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <bound method S3ArnParamHandler.handle_arn of <botocore.utils.S3ArnParamHandler object at 0x7fffe785c278>>
2020-02-24 14:50:32,476 botocore.hooks [DEBUG] Event before-parameter-build.s3.PutObject: calling handler <function generate_idempotent_uuid at 0x7fffe94d0e18>
2020-02-24 14:50:32,477 botocore.hooks [DEBUG] Event before-call.s3.PutObject: calling handler <function conditionally_calculate_md5 at 0x7fffe94d31e0>
2020-02-24 14:50:32,477 botocore.hooks [DEBUG] Event before-call.s3.PutObject: calling handler <function add_expect_header at 0x7fffe94d3598>
2020-02-24 14:50:32,477 botocore.handlers [DEBUG] Adding expect 100 continue header to request.
2020-02-24 14:50:32,477 botocore.hooks [DEBUG] Event before-call.s3.PutObject: calling handler <bound method S3RegionRedirector.set_request_url of <botocore.utils.S3RegionRedirector object at 0x7fffe78a1c50>>
2020-02-24 14:50:32,477 botocore.hooks [DEBUG] Event before-call.s3.PutObject: calling handler <function inject_api_version_header_if_needed at 0x7fffe94d4840>
2020-02-24 14:50:32,477 botocore.endpoint [DEBUG] Making request for OperationModel(name=PutObject) with params: {'url_path': '/example-bucket/emptyfile.txt', 'query_string': {}, 'method': 'PUT', 'headers': {'User-Agent': 'Boto3/1.12.5 Python/3.6.7 Linux/3.10.0-862.3.2.el7.jump1.x86_64 Botocore/1.15.5 Resource', 'Expect': '100-continue'}, 'body': <s3transfer.utils.ReadFileChunk object at 0x7fffe77f5a20>, 'url': 'https://s3.us-east-2.amazonaws.com/example-bucket/emptyfile.txt', 'context': {'client_region': 'us-east-2', 'client_config': <botocore.config.Config object at 0x7fffe7680320>, 'has_streaming_input': True, 'auth_type': None, 'signing': {'bucket': 'example-bucket'}}}
2020-02-24 14:50:32,478 botocore.hooks [DEBUG] Event request-created.s3.PutObject: calling handler <function signal_not_transferring at 0x7fffe78dfd90>
2020-02-24 14:50:32,478 botocore.hooks [DEBUG] Event request-created.s3.PutObject: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7fffe7676e80>>
2020-02-24 14:50:32,478 botocore.hooks [DEBUG] Event choose-signer.s3.PutObject: calling handler <function set_operation_specific_signer at 0x7fffe94d0d08>
2020-02-24 14:50:32,478 botocore.hooks [DEBUG] Event before-sign.s3.PutObject: calling handler <bound method S3EndpointSetter.set_endpoint of <botocore.utils.S3EndpointSetter object at 0x7fffe785c2e8>>
2020-02-24 14:50:32,479 botocore.utils [DEBUG] Defaulting to S3 virtual host style addressing with path style addressing fallback.
2020-02-24 14:50:32,479 botocore.utils [DEBUG] Checking for DNS compatible bucket for: https://s3.us-east-2.amazonaws.com/example-bucket/emptyfile.txt
2020-02-24 14:50:32,479 botocore.utils [DEBUG] URI updated to: https://example-bucket.s3.us-east-2.amazonaws.com/emptyfile.txt
2020-02-24 14:50:32,479 botocore.auth [DEBUG] Calculating signature using v4 auth.
2020-02-24 14:50:32,479 botocore.auth [DEBUG] CanonicalRequest:
PUT
/emptyfile.txt

host:example-bucket.s3.us-east-2.amazonaws.com
x-amz-content-sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
x-amz-date:20200224T205032Z

host;x-amz-content-sha256;x-amz-date
e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
2020-02-24 14:50:32,479 botocore.auth [DEBUG] StringToSign:
AWS4-HMAC-SHA256
20200224T205032Z
20200224/us-east-2/s3/aws4_request
25179a5532ae4de45526bd3205b8d77b8ddf6d2c73220697fc77635a4716f65c
2020-02-24 14:50:32,479 botocore.auth [DEBUG] Signature:
XXXX
2020-02-24 14:50:32,480 botocore.hooks [DEBUG] Event request-created.s3.PutObject: calling handler <function signal_transferring at 0x7fffe78dfe18>
2020-02-24 14:50:32,480 botocore.endpoint [DEBUG] Sending http request: <AWSPreparedRequest stream_output=False, method=PUT, url=https://example-bucket.s3.us-east-2.amazonaws.com/emptyfile.txt, headers={'User-Agent': b'Boto3/1.12.5 Python/3.6.7 Linux/3.10.0-862.3.2.el7.jump1.x86_64 Botocore/1.15.5 Resource', 'Expect': b'100-continue', 'X-Amz-Date': b'20200224T205032Z', 'X-Amz-Content-SHA256': b'e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855', 'Authorization': b'AWS4-HMAC-SHA256 Credential=XXXXX/us-east-2/s3/aws4_request, SignedHeaders=host;x-amz-content-sha256;x-amz-date, Signature=XXXX', 'Content-Length': '0'}>
2020-02-24 14:50:32,481 urllib3.connectionpool [DEBUG] Starting new HTTPS connection (1): example-bucket.s3.us-east-2.amazonaws.com:443
2020-02-24 14:50:32,519 botocore.awsrequest [DEBUG] Waiting for 100 Continue response.
2020-02-24 14:50:32,531 botocore.awsrequest [DEBUG] Received a non 100 Continue response from the server, NOT sending request body.
2020-02-24 14:50:32,531 urllib3.connectionpool [DEBUG] https://example-bucket.s3.us-east-2.amazonaws.com:443 "PUT /emptyfile.txt HTTP/1.1" 400 None
2020-02-24 14:50:32,532 botocore.parsers [DEBUG] Response headers: {'x-amz-request-id': '810C625F52FBD9A9', 'x-amz-id-2': 'eQKHswqMZ2Gq8GaEKpnEVvl1jZVguBgw9C0HfL8nLQtu8UW0ZCw2iy1TyzkCmEQxDvrcYlsBvlw=', 'Content-Type': 'application/xml', 'Transfer-Encoding': 'chunked', 'Date': 'Mon, 24 Feb 2020 20:50:32 GMT', 'Connection': 'close', 'Server': 'AmazonS3'}
2020-02-24 14:50:32,532 botocore.parsers [DEBUG] Response body:
b'<?xml version="1.0" encoding="UTF-8"?>\n<Error><Code>InvalidRequest</Code><Message>Content-MD5 HTTP header is required for Put Object requests with Object Lock parameters</Message><RequestId>810C625F52FBD9A9</RequestId><HostId>eQKHswqMZ2Gq8GaEKpnEVvl1jZVguBgw9C0HfL8nLQtu8UW0ZCw2iy1TyzkCmEQxDvrcYlsBvlw=</HostId></Error>'
2020-02-24 14:50:32,532 botocore.hooks [DEBUG] Event needs-retry.s3.PutObject: calling handler <botocore.retryhandler.RetryHandler object at 0x7fffe78a1ac8>
2020-02-24 14:50:32,532 botocore.retryhandler [DEBUG] No retry needed.
2020-02-24 14:50:32,533 botocore.hooks [DEBUG] Event needs-retry.s3.PutObject: calling handler <bound method S3RegionRedirector.redirect_from_error of <botocore.utils.S3RegionRedirector object at 0x7fffe78a1c50>>
2020-02-24 14:50:32,535 s3transfer.tasks [DEBUG] Exception raised.
Traceback (most recent call last):
  File "/tmp/.venv/lib64/python3.6/site-packages/s3transfer/tasks.py", line 126, in __call__
    return self._execute_main(kwargs)
  File "/tmp/.venv/lib64/python3.6/site-packages/s3transfer/tasks.py", line 150, in _execute_main
    return_value = self._main(**kwargs)
  File "/tmp/.venv/lib64/python3.6/site-packages/s3transfer/upload.py", line 692, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "/tmp/.venv/lib64/python3.6/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/tmp/.venv/lib64/python3.6/site-packages/botocore/client.py", line 626, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the PutObject operation: Content-MD5 HTTP header is required for Put Object requests with Object Lock parameters
2020-02-24 14:50:32,536 s3transfer.utils [DEBUG] Releasing acquire 0/None
Traceback (most recent call last):
  File "/tmp/.venv/lib64/python3.6/site-packages/boto3/s3/transfer.py", line 279, in upload_file
    future.result()
  File "/tmp/.venv/lib64/python3.6/site-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/tmp/.venv/lib64/python3.6/site-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/tmp/.venv/lib64/python3.6/site-packages/s3transfer/tasks.py", line 126, in __call__
    return self._execute_main(kwargs)
  File "/tmp/.venv/lib64/python3.6/site-packages/s3transfer/tasks.py", line 150, in _execute_main
    return_value = self._main(**kwargs)
  File "/tmp/.venv/lib64/python3.6/site-packages/s3transfer/upload.py", line 692, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "/tmp/.venv/lib64/python3.6/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/tmp/.venv/lib64/python3.6/site-packages/botocore/client.py", line 626, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidRequest) when calling the PutObject operation: Content-MD5 HTTP header is required for Put Object requests with Object Lock parameters

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/.venv/lib64/python3.6/site-packages/boto3/s3/inject.py", line 131, in upload_file
    extra_args=ExtraArgs, callback=Callback)
  File "/tmp/.venv/lib64/python3.6/site-packages/boto3/s3/transfer.py", line 287, in upload_file
    filename, '/'.join([bucket, key]), e))
boto3.exceptions.S3UploadFailedError: Failed to upload /abspath/emptyfile.txt to example-bucket/emptyfile.txt: An error occurred (InvalidRequest) when calling the PutObject operation: Content-MD5 HTTP header is required for Put Object requests with Object Lock parameters

@wimglenn
Contributor Author

wimglenn commented Feb 24, 2020

I put a breakpoint in botocore (within handlers.py:calculate_md5) and see this:

In [1]: request_dict
Out[1]: 
{'url_path': '/example-bucket/emptyfile.txt',
 'query_string': {},
 'method': 'PUT',
 'headers': {'User-Agent': '...'},
 'body': <s3transfer.utils.ReadFileChunk at 0x7fffe6955e20>,
 'url': 'https://s3.us-east-2.amazonaws.com/example-bucket/emptyfile.txt',
 'context': {'client_region': 'us-east-2',
  'client_config': <botocore.config.Config at 0x7fffe78fccd0>,
  'has_streaming_input': True,
  'auth_type': None,
  'signing': {'bucket': 'example-bucket'}}}

In [2]: request_dict["body"]
Out[2]: <s3transfer.utils.ReadFileChunk at 0x7fffe6955e20>

In [3]: bool(request_dict["body"])
Out[3]: False
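
A likely explanation for the falsy result above is that Python falls back to `__len__` when an object defines no `__bool__`, so a wrapper that reports a size of 0 evaluates to False even though it is a perfectly valid object. The sketch below assumes ReadFileChunk exposes its size through `__len__`, which this thread does not show; `SizedChunk` is a hypothetical stand-in, not the s3transfer class.

```python
import io


class SizedChunk:
    """Hypothetical stand-in for a file wrapper that reports its size via __len__."""

    def __init__(self, fileobj, size):
        self._fileobj = fileobj
        self._size = size

    def __len__(self):
        return self._size

    def read(self, amount=None):
        return self._fileobj.read(amount)


empty_chunk = SizedChunk(io.BytesIO(b''), 0)
one_byte_chunk = SizedChunk(io.BytesIO(b'x'), 1)

# The object exists, but its truth value follows len(), which is 0.
print(bool(empty_chunk), empty_chunk is not None)  # False True
print(bool(one_byte_chunk))                        # True
```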

The issue is fixed with this patch:

diff --git botocore/handlers.py botocore/handlers.py
index daa22886..b6a4b276 100644
--- botocore/handlers.py
+++ botocore/handlers.py
@@ -186,7 +186,7 @@ def json_decode_template_body(parsed, **kwargs):

 def calculate_md5(params, **kwargs):
     request_dict = params
-    if request_dict['body'] and 'Content-MD5' not in params['headers']:
+    if 'body' in request_dict and 'Content-MD5' not in params['headers']:
         body = request_dict['body']
         if isinstance(body, (bytes, bytearray)):
             binary_md5 = _calculate_md5_from_bytes(body)
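
As a rough illustration of what the patched condition changes, the sketch below computes the Content-MD5 value (the base64-encoded MD5 digest) for an empty body and applies a guard that checks for the presence of the key rather than the truthiness of the body. The names `content_md5` and `patched_guard` are made up for this example and are not botocore's helpers.

```python
import base64
import hashlib
import io


def content_md5(body):
    """Return the base64-encoded MD5 digest, the format of the Content-MD5 header."""
    if isinstance(body, (bytes, bytearray)):
        digest = hashlib.md5(body).digest()
    else:
        # File-like body: hash in chunks, then restore the read position.
        start = body.tell()
        md5 = hashlib.md5()
        for chunk in iter(lambda: body.read(1024 * 1024), b''):
            md5.update(chunk)
        body.seek(start)
        digest = md5.digest()
    return base64.b64encode(digest).decode('ascii')


def patched_guard(request_dict):
    """Mirror of the patched condition: key presence, not body truthiness."""
    return 'body' in request_dict and 'Content-MD5' not in request_dict['headers']


# Hypothetical request dictionary with an empty body.
request_dict = {'body': b'', 'headers': {}}
if patched_guard(request_dict):
    request_dict['headers']['Content-MD5'] = content_md5(request_dict['body'])

print(request_dict['headers']['Content-MD5'])  # 1B2M2Y8AsgTpgAmY7PhCfg==
```

With this guard the empty body still gets a header, whose value is simply the digest of zero bytes.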

@swetashre
Contributor

@wimglenn - I had not configured the retention period correctly, which is why I was not getting the error. After configuring it correctly, I am able to reproduce the issue.

@swetashre swetashre added the bug This issue is a confirmed bug. label Feb 26, 2020
wimglenn added a commit to wimglenn/botocore that referenced this issue Feb 27, 2020
@wimglenn
Contributor Author

wimglenn commented Mar 5, 2020

@swetashre Any update on this? We are limping along with a locally patched botocore for the moment.

@jangrewe

@swetashre we're experiencing the same issue. Any chance to get the fix from @wimglenn merged, please?

wimglenn added a commit to wimglenn/botocore that referenced this issue Oct 17, 2020
nateprewitt pushed a commit that referenced this issue Nov 16, 2020
* Include Content-MD5 header even when body is empty. Closes #1979
