Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[>=4.3.1] Chunked uploads are broken with urllib3 v2 #734

Closed
sathieu opened this issue Jul 3, 2023 · 6 comments · Fixed by #739
Closed

[>=4.3.1] Chunked uploads are broken with urllib3 v2 #734

sathieu opened this issue Jul 3, 2023 · 6 comments · Fixed by #739

Comments

@sathieu
Copy link

sathieu commented Jul 3, 2023

With vcrpy 5.0.0, when updating urllib3 from 1.26.16 to 2.0.3, chunked uploads are failing.

This is because the body appears to vcrpy unchunked:

(Pdb) self.cassette.requests[18].url == self._vcr_request.url
True
(Pdb) self.cassette.requests[18].body                         
b'45\r\nBlob bf9500cc6aa4fac1a75c958496c9e9caeceacb256113e28836fc5b3bd448f706\r\n0\r\n\r\n'
(Pdb) [c for c in self._vcr_request.body]
[b'Blob bf9500cc6aa4fac1a75c958496c9e9caeceacb256113e28836fc5b3bd448f706']
(Pdb) self._vcr_request.headers          
{'User-Agent': 'gitlabracadabra/1.13.0', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Authorization': 'Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlJJWTM6SUhGMjpZUlhVOlJGTVI6NEVRRDpJWVZDOjZMSzM6MzVJTDpGRFJGOklKUE46Q0MySjpMTEIzIiwidHlwIjoiSldUIn0.eyJhY2Nlc3MiOlt7InR5cGUiOiJyZXBvc2l0b3J5IiwibmFtZSI6InJvb3QvdGVzdF9yZWdpc3RyeS9vcGVyYXRvci1mcmFtZXdvcmsvb2xtIiwiYWN0aW9ucyI6WyJwdXNoIiwicHVsbCJdfV0sImp0aSI6ImM5ZGQ1YWQxLWQ0M2ItNDI0Yi1iZDUwLWU4NzJlYjYxZjYwNSIsImF1ZCI6ImNvbnRhaW5lcl9yZWdpc3RyeSIsInN1YiI6InJvb3QiLCJpc3MiOiJnaXRsYWItaXNzdWVyIiwiaWF0IjoxNjE3MTkzMjE5LCJuYmYiOjE2MTcxOTMyMTQsImV4cCI6MTYxNzE5MzUxOX0.b6PKFISAEPnDnfTKR8XVIHUOc6JYomUOB-u_w3QQIqF8gkb92n199zvFSwcpOZolcXjk1DvCfMtUkOvklzAgXyIwNcBx7t_o_SO4mCqWEnbfWxzhJKKSPm6YLAC0IoQbxCsPfA4CK7XEOCJU7eTLaCTqLvq6GvmpjtyLF-KgZV8eUV4zPDNYkhvUqcJXSQR_UHqQ4k32h8itBGYzfhykHSP2B2ZvbBDHLYmBlsDFtc7LJVUEEvKKhCR1VQppwgIoHwJwu7OR3R6I5_XO23NwBbPYvAQZC9twPBKNvoPfcToMzSF_keuAVodUgAyYMUHmnFPw5jX0-PcPt0L9j6a9rA', 'Cookie': 'perf_bar_enabled=true; experimentation_subject_id=eyJfcmFpbHMiOnsibWVzc2FnZSI6IkltSmpaRFZrTXpRMkxXWXlPR010TkRZek1DMWhNMll5TFRNMlpEYzNZVFJrWWpsa09DST0iLCJleHAiOm51bGwsInB1ciI6ImNvb2tpZS5leHBlcmltZW50YXRpb25fc3ViamVjdF9pZCJ9fQ%3D%3D--403d642cd6f4aaa4365664b26bec713ff089b313', 'Transfer-Encoding': 'chunked'}
(Pdb) self.cassette.requests[18].headers                     
{'Authorization': 'Bearer eyJhbGciOiJSUzI1NiIsImtpZCI6IlJJWTM6SUhGMjpZUlhVOlJGTVI6NEVRRDpJWVZDOjZMSzM6MzVJTDpGRFJGOklKUE46Q0MySjpMTEIzIiwidHlwIjoiSldUIn0.eyJhY2Nlc3MiOlt7InR5cGUiOiJyZXBvc2l0b3J5IiwibmFtZSI6InJvb3QvdGVzdF9yZWdpc3RyeS9vcGVyYXRvci1mcmFtZXdvcmsvb2xtIiwiYWN0aW9ucyI6WyJwdXNoIiwicHVsbCJdfV0sImp0aSI6ImM5ZGQ1YWQxLWQ0M2ItNDI0Yi1iZDUwLWU4NzJlYjYxZjYwNSIsImF1ZCI6ImNvbnRhaW5lcl9yZWdpc3RyeSIsInN1YiI6InJvb3QiLCJpc3MiOiJnaXRsYWItaXNzdWVyIiwiaWF0IjoxNjE3MTkzMjE5LCJuYmYiOjE2MTcxOTMyMTQsImV4cCI6MTYxNzE5MzUxOX0.b6PKFISAEPnDnfTKR8XVIHUOc6JYomUOB-u_w3QQIqF8gkb92n199zvFSwcpOZolcXjk1DvCfMtUkOvklzAgXyIwNcBx7t_o_SO4mCqWEnbfWxzhJKKSPm6YLAC0IoQbxCsPfA4CK7XEOCJU7eTLaCTqLvq6GvmpjtyLF-KgZV8eUV4zPDNYkhvUqcJXSQR_UHqQ4k32h8itBGYzfhykHSP2B2ZvbBDHLYmBlsDFtc7LJVUEEvKKhCR1VQppwgIoHwJwu7OR3R6I5_XO23NwBbPYvAQZC9twPBKNvoPfcToMzSF_keuAVodUgAyYMUHmnFPw5jX0-PcPt0L9j6a9rA', 'Cookie': 'experimentation_subject_id=eyJfcmFpbHMiOnsibWVzc2FnZSI6IkltSmpaRFZrTXpRMkxXWXlPR010TkRZek1DMWhNMll5TFRNMlpEYzNZVFJrWWpsa09DST0iLCJleHAiOm51bGwsInB1ciI6ImNvb2tpZS5leHBlcmltZW50YXRpb25fc3ViamVjdF9pZCJ9fQ%3D%3D--403d642cd6f4aaa4365664b26bec713ff089b313; perf_bar_enabled=true', 'Docker-Distribution-Api-Version': 'registry/2.0', 'Transfer-Encoding': 'chunked'}

Notes:

@hartwork
Copy link
Collaborator

hartwork commented Jul 3, 2023

Hi @sathieu, I have not been successful in reproducing this issue yet. Could you provide a minimal reproducer in Python with VCR.py 5.0.0 and chunked transfer that works with urllib3 v2 but not v2?


In case the log at https://gitlab.com/gitlabracadabra/gitlabracadabra/-/jobs/4563346377 disappears, I am attaching a full raw copy here: full_job.log

@sathieu
Copy link
Author

sathieu commented Jul 5, 2023

@hartwork Here is a minimal reproducer:

testcase.tar.gz

works with requirements-good.txt, fails with requirements-bad.txt.

@hartwork
Copy link
Collaborator

hartwork commented Jul 6, 2023

Hi @sathieu thanks for these files, well done! 🙏

I confirm that this works with urllib3 v1 but not v2. My understanding is that the request matcher that compres request bodies does not consider these arguably equal bodies as equal.

Comparision of bodies is done as follows:

vcrpy/vcr/matchers.py

Lines 51 to 57 in 4f70152

def body(r1, r2):
transformer = _get_transformer(r1)
r2_transformer = _get_transformer(r2)
if transformer != r2_transformer:
transformer = _identity
if transformer(read_body(r1)) != transformer(read_body(r2)):
raise AssertionError

So both sides are transformed before comparison. Which transformer that is is looked up here…

vcrpy/vcr/matchers.py

Lines 99 to 104 in 4f70152

def _get_transformer(request):
for checker, transformer in _checker_transformer_pairs:
if checker(request.headers):
return transformer
else:
return _identity

…and it is fed by this block here:

vcrpy/vcr/matchers.py

Lines 85 to 92 in 4f70152

_checker_transformer_pairs = (
(
_header_checker("application/x-www-form-urlencoded"),
lambda body: urllib.parse.parse_qs(body.decode("ascii")),
),
(_header_checker("application/json"), _transform_json),
(lambda request: _xml_header_checker(request) and _xmlrpc_header_checker(request), xmlrpc.client.loads),
)

Now that we have that: Maybe one way to fix this is to add a new transformer there? I have a demo of something like that on a new branch issue-734-fix-body-matcher-for-chunked-requests now. Could you have a look? What do you think?

One more thing: method no_read that you added — where can I learn more about that and its purpose, is that some common API? Thanks!

@sathieu
Copy link
Author

sathieu commented Jul 7, 2023

@hartwork Thanks !

Adding a transformer would work, but the question is why the body is unchuncked with the Transfer-Encoding: chunked header still present? Maybe urllib3/urllib3#2565 (see also urllib3/urllib3#2291 (comment), especially If Content-Length can be determined, set Content-Length and proceed not chunked.)?

NB: this no_read method was a test (for the read method). It was not needed, and you can ignore it.

@hartwork hartwork changed the title chunked uploads are broken with urllib3 v2 [>=4.3.1] Chunked uploads are broken with urllib3 v2 Jul 7, 2023
@hartwork
Copy link
Collaborator

hartwork commented Jul 7, 2023

@sathieu thanks for your reply!

I have fixed multiple bugs in the demo in branch issue-734-fix-body-matcher-for-chunked-requests in the meantime.

Regarding the difference, with urllib3 v1 request.body is of type bytearray while with v2 it's of type Stream, i.e. your custom class.

@hartwork
Copy link
Collaborator

hartwork commented Jul 8, 2023

@sathieu I'm at a point now where the code is ready for proper review and testing: it's new pull request #739. If someone has ideas for a simpler solution, happy to do something simpler.

hartwork added a commit that referenced this issue Jul 23, 2023
…-chunked-requests

Fix body matcher for chunked requests (fixes #734)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants