-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Standardize HTTPResponse.read(X) behavior regardless of compression #2798
Conversation
Co-authored-by: Franek Magiera <framagie@gmail.com>
077d3b3
to
34ff983
Compare
ee8fe37
to
9e83b03
Compare
9e83b03
to
75735fe
Compare
Co-authored-by: Franek Magiera <framagie@gmail.com> When asking for amt bytes, we'll always return that amount unless there is not enough data left. When no data is left, we return b'' to signal EOF. This is the only situation where we can return an empty number of bytes.
75735fe
to
a7933eb
Compare
4f9d876
to
9674b7a
Compare
The buffer was "losing" in test_requesting_large_resources_via_ssl, but only because there was no compression.
e272839
to
9f2e2a9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great, only had a few nits and two question then we can ship!
src/urllib3/response.py
Outdated
while fetched < n: | ||
remaining = n - fetched | ||
chunk = self.buffer.popleft() | ||
if remaining < len(chunk): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
len(chunk) is calculated 3 times, we can cache this value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I usually don't add micro optimizations without benchmarking, but this time I was lazy: b24d1da
(#2798)
src/urllib3/response.py
Outdated
left_chunk, right_chunk = chunk[:remaining], chunk[remaining:] | ||
ret.write(left_chunk) | ||
self.buffer.appendleft(right_chunk) | ||
self._size -= len(left_chunk) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If I'm reading the logic right, len(left_chunk) will always be "remaining", right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in b24d1da
(#2798)
src/urllib3/response.py
Outdated
|
||
def get(self, n: int) -> bytes: | ||
if not self.buffer: | ||
raise ValueError("buffer is empty") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Should this be a RuntimeError
instead of ValueError
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed in 0dc26a3
(#2798).
|
||
data = self._decode(data, decode_content, flush_decoder) | ||
if not data and len(self._decoded_buffer) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the locations you're referencing for #2799 ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
9f2e2a9
to
c9cf398
Compare
My implementation of #2800 breaks the requests test suite, not sure why yet. |
@pquentin I wonder if we need to change the call to read() with decode_content=False inside of the "drain" method on HTTPResponse? I know Requests allows for raw data access. |
079811f
to
b518302
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks excellent, let's merge!
The deadsnakes PPA no longer supports 3.11-dev it seems.
Using the findings from #2787, I reimplemented #2712 to only do one copy at worst, using 4GiB of memory for a 2GiB read. This is much better than #2712 which was doing three copies at worst, and two copies when using memory view.
More importantly, this is equivalent to the current solution where we also have one copy at worst when calling
self._decode()
! So the blockers from #2718 are gone.Closes #709 (given the decision made in #2769), Closes #2712, Closes #2128
Open questions:
read(decode_content=True)
followed byread(decode_content=False)
? #2800