Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds support for checksums in streamed request trailers #962

Merged
merged 5 commits into from
Nov 29, 2022

Conversation

terricain
Copy link
Collaborator

Description of Change

Adds support for S3 streaming uploads to support checksums in appended on the end of the request stream (trailers)

Running a snippet like the following will use streaming uploads and also the new SHA256 checksums

async def main():
    session = get_session()
    async with session.create_client('s3', region_name='eu-west-2') as s3_client:
        data = b'test1234'
        digest = hashlib.sha256(data).digest()

        resp = await s3_client.put_object(Bucket="BUCKET", Key='foobarbaz', Body=data, ChecksumAlgorithm='SHA256')
        sha256_trailer_checksum = base64.b64decode(resp['ChecksumSHA256'])

        assert digest == sha256_trailer_checksum

Note - when using Moto or an S3 endpoint not using HTTPS, the checksum will be added to the header, not the trailers as the headers are signed, the trailer is not (hence the requirements for HTTPS)

The problem comes with botocore.httpchecksum.AwsChunkedWrapper ending up as the body passed to aiohttp and it not liking it. We now subclass that wrapper class and add an async iterator implementation.

Fixes #954
Fixes terricain/aioboto3#265

Assumptions

That if "Transfer-Encoding": "Chunked" is always removed from the header dict before being passed to aiohttp is ok. (aiohttp reallllllly doesnt like it if you send it streaming body, and already set the transfer encoding :/ ). This ends up being injected from multiple places, mainly some of the checksum code as wells somewhere deep in the aws prepare_request() functions,


AWS Page - https://aws.amazon.com/blogs/aws/new-additional-checksum-algorithms-for-amazon-s3/

@terricain
Copy link
Collaborator Author

@thehesiod am not too pleased with headers.pop('Transfer-Encoding', '') in httpsession.py you think theres a better way to do it?

@terricain terricain force-pushed the terrycain/s3-request-checksum branch from fdfea39 to 1705f45 Compare August 10, 2022 22:29
@terricain
Copy link
Collaborator Author

So when I run the S3 tests locally without the changes, I get the same number failed as with the changes, so I don't think they're related :)

@terricain
Copy link
Collaborator Author

terricain commented Aug 10, 2022

Yeah it looks like moto is unhappy for some reason. I ran the server standalone and pointed the tests at it and they seem to be getting 404'd at least on creating multipart uploads, even using boto3.


Update:

So if you downgrade Werkzeug to 2.1.2 and Flask to 2.1.3, all the tests pass, is related to getmoto/moto#5341

@thehesiod
Copy link
Collaborator

thanks for all the research Terry! hopefully get to this soon, as it is getting that last release out it's 3am and I need to wake up at 7am, yowzers ;)

@thehesiod
Copy link
Collaborator

looks like they pinned werkzeug + flask and I bumped us to the latest moto, may want to try with what we have and rebase to master

@thehesiod
Copy link
Collaborator

just read a bit about this feature, very cool! This will solve the whole async file streaming support, since otherwise it didn't make much sense. I'm very interested in getting this working asap so will put further priority on reviewing this

@thehesiod
Copy link
Collaborator

I could have used this feature several times in the last few years :)

@terricain terricain force-pushed the terrycain/s3-request-checksum branch from 1705f45 to 7eeb54a Compare August 25, 2022 22:47
@terricain
Copy link
Collaborator Author

Yeah the coverage isnt going to be great as we can't test this unless moto runs with TLS. We could make that would though I think that should be done separately.

@thehesiod
Copy link
Collaborator

hey @terrycain for the delay, been nuts. Hey some good news I got my first sponsor for this project, can I share a few bucks with you?

@terricain
Copy link
Collaborator Author

Sure :D always up for some free money this is my paypal link else I've finally had the will to setup the github sponsors stuff :).

@terricain terricain force-pushed the terrycain/s3-request-checksum branch from 7eeb54a to c1c5f91 Compare November 9, 2022 17:30
@coolacid
Copy link

I dislike being "that guy" but wanted to check in to see what needs to happen to get this in? Looking forward to this fix.

@terricain
Copy link
Collaborator Author

@thehesiod can we get this merged ignoring codecov?

@thehesiod
Copy link
Collaborator

hey sorry been nuts, got covid and perhaps RSV too from our kids one of my highest priorities is getting this in, really want to look at this in detail

@thehesiod
Copy link
Collaborator

ok changes look good, just want to dig into the Transfer-Encoding issue

@thehesiod thehesiod self-requested a review November 29, 2022 05:46
@thehesiod
Copy link
Collaborator

ok figured out the chunk thing, and added support for running https tests, unfortunately moto doesn't appear to support the algs yet so still need to xfail for now.

@aio-libs aio-libs deleted a comment from codecov bot Nov 29, 2022
@codecov
Copy link

codecov bot commented Nov 29, 2022

Codecov Report

Merging #962 (f0f911a) into master (551343c) will increase coverage by 0.12%.
The diff coverage is 94.05%.

@@            Coverage Diff             @@
##           master     #962      +/-   ##
==========================================
+ Coverage   84.98%   85.11%   +0.12%     
==========================================
  Files          50       50              
  Lines        4529     4602      +73     
==========================================
+ Hits         3849     3917      +68     
- Misses        680      685       +5     
Flag Coverage Δ
unittests 85.11% <94.05%> (+0.12%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
aiobotocore/httpchecksum.py 65.85% <92.00%> (+40.85%) ⬆️
tests/test_basic_s3.py 85.95% <92.30%> (+0.23%) ⬆️
tests/mock_server.py 74.75% <95.45%> (ø)
aiobotocore/client.py 86.26% <100.00%> (+0.07%) ⬆️
aiobotocore/httpsession.py 79.20% <100.00%> (+0.68%) ⬆️
tests/conftest.py 91.95% <100.00%> (+0.08%) ⬆️
tests/moto_server.py 81.91% <100.00%> (+0.39%) ⬆️
tests/test_patches.py 96.07% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@thehesiod thehesiod merged commit 8d9d71a into master Nov 29, 2022
@thehesiod thehesiod deleted the terrycain/s3-request-checksum branch November 29, 2022 07:30
@terricain
Copy link
Collaborator Author

🎉

@coolacid
Copy link

Confirmed this fixed setting checksum algorithm when using S3FS. Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Multipart Uploads using Checksums S3 ChecksumAlgorithm
3 participants