Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simulate S3 BadDigest and InvalidDigest errors #2562

Closed
charleswhchan opened this issue Jun 15, 2020 · 8 comments
Closed

Simulate S3 BadDigest and InvalidDigest errors #2562

charleswhchan opened this issue Jun 15, 2020 · 8 comments
Assignees
Labels
priority: high status: resolved/fixed Resolved via a fix or an implementation status: triage needed Requires evaluation by maintainers type: feature New feature, or improvement to an existing feature

Comments

@charleswhchan
Copy link

Type of request: This is a ...

[ ] bug report
[x] feature request

Detailed description

As a Developer,
I want to be able to simulate BadDigest and InvalidDigest errors,
So that I can test the error handling logic

Expected behavior

https://docs.aws.amazon.com/AmazonS3/latest/API/ErrorResponses.html
BadDigest: The Content-MD5 you specified did not match what we received.
InvalidDigest: The Content-MD5 you specified is not valid.

Actual behavior

localstack returns InvalidDigest when MD5 does not match payload.

@whummer
Copy link
Member

whummer commented Jun 16, 2020

Thanks for reporting @charleswhchan . Not sure I fully understand your use case - can you please elaborate a bit? I think it should already be possible to pass the Content-MD5 header on certain S3 API requests. See here for the code that checks the user-specified MD5 hash and compares it to the actual computed hash of the request payload:

def check_content_md5(data, headers):
actual = md5(strip_chunk_signatures(data))
expected = headers['Content-MD5']
try:
expected = to_str(codecs.encode(base64.b64decode(expected), 'hex'))
except Exception:
expected = '__invalid__'
if actual != expected:
return error_response('The Content-MD5 you specified was invalid', 'InvalidDigest', status_code=400)
Can you please provide a small, isolated example of what you're trying to achieve? Thanks!

@whummer whummer added enhancement status: triage needed Requires evaluation by maintainers labels Jun 16, 2020
@charleswhchan
Copy link
Author

Hi @whummer, thanks for looking into this.

What I am trying to do is write a test to write to S3 with localstack and produce conditions that would lead to BadDigest and InvalidDigest errors. This is so that I can check the error handling is working correctly in my code.

Based on my understanding of the AWS REST API.
In the code you listed above, when "actual != expected", localstack should return BadDigest as the checksum does not match.
Whereas if the content of "Context-MD5" is not a base64 encoded MD5, then the API should return InvalidDigest.

Does this help?

@whummer
Copy link
Member

whummer commented Jun 20, 2020

Ok, thanks for the clarification @charleswhchan . This should be fixed in #2592 - can you please give it a try with the latest version of the Docker image. The code now distinguishes between BadDigest and InvalidDigest, as per the AWS REST API spec. Hope that helps - please report here if the behavior is still not fully correct. Thanks

@whummer whummer closed this as completed Jun 20, 2020
@charleswhchan
Copy link
Author

charleswhchan commented Jun 21, 2020

Hi @whummer.

I just tried the latest docker image (link) and it seems to broken in a different way -- it is now returning BadDigest all the time.

I am using go-lang, and here are my 2 test cases:

Test 1.
checksum := "not base64 encoded checksum"

  • Expect InvalidDigest
  • Actual 0.11.1: InvalidDigest
  • Actual latest: BadDigest

Test 2

wrongCheckSum := md5.Sum([]byte("wrong checksum"))
checksum := base64.StdEncoding.EncodeToString(wrongCheckSum[:])
  • Expect BadDigest
  • Actual 0.11.1: InvalidDigest
  • Actual latest: BadDigest

Note: I also run the same test against S3 and MinIO to to confirm the expected behaviour.

@whummer
Copy link
Member

whummer commented Jun 26, 2020

Hi @charleswhchan , thanks for the update. Haven't been able to reproduce this issue, unfortunately. We've added an extended set of tests here which tests for both - InvalidDigest and BadDigest errors: https://github.com/localstack/localstack/blob/master/tests/integration/test_s3.py#L698-L715 Can you please post an isolated, self-contained example code snippet that can be directly executed to reproduce the issue you're facing? Thanks

@whummer whummer added the status: resolved/fixed Resolved via a fix or an implementation label Jun 27, 2020
@charleswhchan
Copy link
Author

Hi @whummer : I really appreciate your effort in troubleshooting this issue 👍

After reviewing your tests and comparing to my test, I found something interesting that triggers the error. I have created a gist

Specifically, if you refer to the checksum used in Test #3:

	fmt.Println("Test #3: InvalidDigest ---------------------------------------")
	contentMD5 = "not base64 encoded checksum"
	uploadPartInput = &s3.UploadPartInput{
		Body:       aws.ReadSeekCloser(bytes.NewReader(data)),
		Bucket:     aws.String(bucket),
		Key:        aws.String(key),
		ContentMD5: aws.String(contentMD5),
		PartNumber: aws.Int64(int64(partNumber)),
		UploadId:   aws.String(uploadID),
	}

	fmt.Println("UploadPart ...")
	_, err = svc.UploadPart(uploadPartInput)
	fmt.Printf("expect InvalidDigest; actual %+v\n", err)

Output

Test #3: InvalidDigest ---------------------------------------
UploadPart ...
expect InvalidDigest; actual BadDigest: The Content-MD5 you specified did not match what we received.

I did a bit more investigation and it appears it might have to something with the length of the contentMD5?
If the length is <= 19, localstack would return InvalidDigest (correctly): contentMD5 = "1234567890123456789"
Otherwise, localstack would return BadDigest? contentMD5 = "12345678901234567890"

Strange problem. I hope this information helps 😃

@whummer
Copy link
Member

whummer commented Jun 28, 2020

Thanks for the update @charleswhchan , we were able to reproduce this issue based on the gist you provided. 👍

The issue was related to whitespaces. Coincidentally, the digest string you were using for testing is technically a valid base64 encoded string, if your strip off the whitespaces:

>>> base64.b64decode('not base64 encoded checksum')
b'\x9e\x8b[j\xc7\xba\xe1\xe9\xdc\xa1\xd7\x9dr\x17\x9c\x92\xcb\xa6'

If you add one more character, then the string is not valid base64 anymore:

>>> base64.b64decode('not base64 encoded checksum1')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File ".../3.8.2/lib/python3.8/base64.py", line 87, in b64decode
    return binascii.a2b_base64(s)
binascii.Error: Invalid base64-encoded string: number of data characters (25) cannot be 1 more than a multiple of 4

This should be now (finally) fixed in #2635 - we are now considering whitespaces as illegal base64 encoding. Please report here if the problem persists with the latest version. Thanks

@whummer whummer closed this as completed Jun 28, 2020
@charleswhchan
Copy link
Author

Thanks @whummer, I can confirm with the latest image, the not base64 encoded checksum case is now fixed (and returns InvalidDigest). 👏👍

I did mention another case in my last post (not in gist): 12345678901234567890 returns BadDigest while 1234567890123456789 returns InvalidDigest. It don't really matter to me but I am not sure if you want to take a look also?

@alexrashed alexrashed added type: feature New feature, or improvement to an existing feature and removed type: enhancement labels Apr 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority: high status: resolved/fixed Resolved via a fix or an implementation status: triage needed Requires evaluation by maintainers type: feature New feature, or improvement to an existing feature
Projects
None yet
Development

No branches or pull requests

3 participants