Simulate S3 BadDigest and InvalidDigest errors #2562

charleswhchan · 2020-06-15T18:41:08Z

Type of request: This is a ...

[ ] bug report
[x] feature request

Detailed description

As a Developer,
I want to be able to simulate BadDigest and InvalidDigest errors,
So that I can test the error handling logic

Expected behavior

https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html
BadDigest: The Content-MD5 you specified did not match what we received.
InvalidDigest: The Content-MD5 you specified is not valid.

Actual behavior

localstack returns InvalidDigest when MD5 does not match payload.

whummer · 2020-06-16T22:16:57Z

Thanks for reporting @charleswhchan . Not sure I fully understand your use case - can you please elaborate a bit? I think it should already be possible to pass the Content-MD5 header on certain S3 API requests. See here for the code that checks the user-specified MD5 hash and compares it to the actual computed hash of the request payload:

localstack/localstack/services/s3/s3_listener.py

Lines 706 to 714 in a60eb05

    
           def check_content_md5(data, headers): 
        
               actual = md5(strip_chunk_signatures(data)) 
        
               expected = headers['Content-MD5'] 
        
               try: 
        
                   expected = to_str(codecs.encode(base64.b64decode(expected), 'hex')) 
        
               except Exception: 
        
                   expected = '__invalid__' 
        
               if actual != expected: 
        
                   return error_response('The Content-MD5 you specified was invalid', 'InvalidDigest', status_code=400)

Can you please provide a small, isolated example of what you're trying to achieve? Thanks!

charleswhchan · 2020-06-17T00:03:32Z

Hi @whummer, thanks for looking into this.

What I am trying to do is write a test to write to S3 with localstack and produce conditions that would lead to BadDigest and InvalidDigest errors. This is so that I can check the error handling is working correctly in my code.

Based on my understanding of the AWS REST API.
In the code you listed above, when "actual != expected", localstack should return BadDigest as the checksum does not match.
Whereas if the content of "Context-MD5" is not a base64 encoded MD5, then the API should return InvalidDigest.

Does this help?

whummer · 2020-06-20T19:05:38Z

Ok, thanks for the clarification @charleswhchan . This should be fixed in #2592 - can you please give it a try with the latest version of the Docker image. The code now distinguishes between BadDigest and InvalidDigest, as per the AWS REST API spec. Hope that helps - please report here if the behavior is still not fully correct. Thanks

charleswhchan · 2020-06-21T22:54:55Z

Hi @whummer.

I just tried the latest docker image (link) and it seems to broken in a different way -- it is now returning BadDigest all the time.

I am using go-lang, and here are my 2 test cases:

Test 1.
checksum := "not base64 encoded checksum"

Expect InvalidDigest
Actual 0.11.1: InvalidDigest ✅
Actual latest: BadDigest ❌

Test 2

wrongCheckSum := md5.Sum([]byte("wrong checksum"))
checksum := base64.StdEncoding.EncodeToString(wrongCheckSum[:])

Expect BadDigest
Actual 0.11.1: InvalidDigest ❌
Actual latest: BadDigest ✅

Note: I also run the same test against S3 and MinIO to to confirm the expected behaviour.

whummer · 2020-06-26T21:08:40Z

Hi @charleswhchan , thanks for the update. Haven't been able to reproduce this issue, unfortunately. We've added an extended set of tests here which tests for both - InvalidDigest and BadDigest errors: https://github.com/localstack/localstack/blob/master/tests/integration/test_s3.py#L698-L715 Can you please post an isolated, self-contained example code snippet that can be directly executed to reproduce the issue you're facing? Thanks

charleswhchan · 2020-06-27T22:35:36Z

Hi @whummer : I really appreciate your effort in troubleshooting this issue 👍

After reviewing your tests and comparing to my test, I found something interesting that triggers the error. I have created a gist

Specifically, if you refer to the checksum used in Test #3:

	fmt.Println("Test #3: InvalidDigest ---------------------------------------")
	contentMD5 = "not base64 encoded checksum"
	uploadPartInput = &s3.UploadPartInput{
		Body:       aws.ReadSeekCloser(bytes.NewReader(data)),
		Bucket:     aws.String(bucket),
		Key:        aws.String(key),
		ContentMD5: aws.String(contentMD5),
		PartNumber: aws.Int64(int64(partNumber)),
		UploadId:   aws.String(uploadID),
	}

	fmt.Println("UploadPart ...")
	_, err = svc.UploadPart(uploadPartInput)
	fmt.Printf("expect InvalidDigest; actual %+v\n", err)

Output

Test #3: InvalidDigest ---------------------------------------
UploadPart ...
expect InvalidDigest; actual BadDigest: The Content-MD5 you specified did not match what we received.

I did a bit more investigation and it appears it might have to something with the length of the contentMD5?
If the length is <= 19, localstack would return InvalidDigest (correctly): contentMD5 = "1234567890123456789"
Otherwise, localstack would return BadDigest? contentMD5 = "12345678901234567890"

Strange problem. I hope this information helps 😃

whummer · 2020-06-28T11:13:39Z

Thanks for the update @charleswhchan , we were able to reproduce this issue based on the gist you provided. 👍

The issue was related to whitespaces. Coincidentally, the digest string you were using for testing is technically a valid base64 encoded string, if your strip off the whitespaces:

>>> base64.b64decode('not base64 encoded checksum')
b'\x9e\x8b[j\xc7\xba\xe1\xe9\xdc\xa1\xd7\x9dr\x17\x9c\x92\xcb\xa6'

If you add one more character, then the string is not valid base64 anymore:

>>> base64.b64decode('not base64 encoded checksum1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../3.8.2/lib/python3.8/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Invalid base64-encoded string: number of data characters (25) cannot be 1 more than a multiple of 4

This should be now (finally) fixed in #2635 - we are now considering whitespaces as illegal base64 encoding. Please report here if the problem persists with the latest version. Thanks

charleswhchan · 2020-06-28T22:11:27Z

Thanks @whummer, I can confirm with the latest image, the not base64 encoded checksum case is now fixed (and returns InvalidDigest). 👏👍

I did mention another case in my last post (not in gist): 12345678901234567890 returns BadDigest while 1234567890123456789 returns InvalidDigest. It don't really matter to me but I am not sure if you want to take a look also?

whummer added enhancement status: triage needed Requires evaluation by maintainers labels Jun 16, 2020

whummer self-assigned this Jun 20, 2020

whummer mentioned this issue Jun 20, 2020

Fix error responses for S3 Content-MD5 header #2592

Merged

whummer closed this as completed Jun 20, 2020

whummer reopened this Jun 22, 2020

whummer added the priority: high label Jun 22, 2020

whummer added the status: resolved/fixed Resolved via a fix or an implementation label Jun 27, 2020

whummer mentioned this issue Jun 28, 2020

Add support for DynamoDB Global Tables #2635

Merged

whummer closed this as completed Jun 28, 2020

alexrashed added type: feature New feature, or improvement to an existing feature and removed type: enhancement labels Apr 29, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Simulate S3 BadDigest and InvalidDigest errors #2562

Simulate S3 BadDigest and InvalidDigest errors #2562

charleswhchan commented Jun 15, 2020

whummer commented Jun 16, 2020

charleswhchan commented Jun 17, 2020

whummer commented Jun 20, 2020

charleswhchan commented Jun 21, 2020 •

edited

whummer commented Jun 26, 2020

charleswhchan commented Jun 27, 2020

whummer commented Jun 28, 2020

charleswhchan commented Jun 28, 2020

Simulate S3 BadDigest and InvalidDigest errors #2562

Simulate S3 BadDigest and InvalidDigest errors #2562

Comments

charleswhchan commented Jun 15, 2020

Type of request: This is a ...

Detailed description

Expected behavior

Actual behavior

whummer commented Jun 16, 2020

charleswhchan commented Jun 17, 2020

whummer commented Jun 20, 2020

charleswhchan commented Jun 21, 2020 • edited

whummer commented Jun 26, 2020

charleswhchan commented Jun 27, 2020

whummer commented Jun 28, 2020

charleswhchan commented Jun 28, 2020

charleswhchan commented Jun 21, 2020 •

edited