yield streamed bytes as soon as they arrive #2125

djrobstep · 2020-12-31T05:17:41Z

Right now, calling read() with amt=None (or requests' iter_content(chunk_size=None)) appears to wait for the entire response body to arrive before yielding anything, which contradicts the docs in both urllib3 and requests. This is discussed in #2123.

For instance, this change makes the following correctly yield bytes as they arrive, provided CHUNK_SIZE is set to -1:

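import requests

URL = 'https://httpbin.org/drip?duration=2'
# -1 opts in to the "stream on any bytes" behaviour added by this change
CHUNK_SIZE = -1

r = requests.get(URL, stream=True)

# Each chunk is printed as soon as it arrives, not after the full body is read
for x in r.iter_content(chunk_size=CHUNK_SIZE):
    print(f'response: {x}')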

The requests docs say this should already happen when CHUNK_SIZE is None, but it doesn't:

chunk_size must be of type int or None. A value of None will function differently depending on the value of stream. stream=True will read data as it arrives in whatever size the chunks are received. If stream=False, data is returned as a single chunk

So do the docs for HTTPResponse.stream(), which this value is passed through to:

How much of the content to read. The generator will return up to this much data per iteration, but may return less. This is particularly likely when using compressed data. However, the empty string will never be returned.

But in fact None forces the entire response to be read before yielding.

I was hesitant to make a change that would affect the existing behaviour of None or positive integers, so here you specify a chunk size of -1 to activate this "stream on any bytes" behaviour. Hopefully this is suitable.

sethmlarson · 2022-03-05T01:43:57Z

Sorry that this PR has stalled completely for over a year and now has conflicts. Unless there's someone on our team that wants to push this one over the finish line I may close it.

I'm not sure I love .read(-1) being used to mean .read1(). Shouldn't we instead implement HTTPResponse.read1()?
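
For readers unfamiliar with the distinction, the standard library's buffered readers already separate read() from read1() in the way suggested here. The sketch below is only an illustration under stated assumptions: an os.pipe() stands in for a socket whose peer has sent some bytes but not yet closed the connection, and nothing in it is urllib3 code.

```python
import io
import os

# Assumption: a pipe stands in for a socket whose peer has sent some bytes
# but has not closed the connection yet.
read_fd, write_fd = os.pipe()
reader = io.BufferedReader(io.FileIO(read_fd, "rb"))

os.write(write_fd, b"first")

# read1() makes at most one call to the underlying raw stream, so it returns
# the bytes that are already available instead of waiting for end-of-stream.
available = reader.read1(65536)

os.write(write_fd, b"second")
os.close(write_fd)  # signals end-of-stream to the reader

# read() with no size reads until EOF, so it can only return once the
# writing side has been closed.
remainder = reader.read()
reader.close()

print(available, remainder)
```

Had read() been called before os.close(write_fd), it would have blocked, which mirrors the behaviour reported for amt=None above.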

pquentin · 2022-04-26T05:27:11Z

Agreed that read1() is the way to go here.

sethmlarson · 2022-04-26T11:37:08Z

Going to close this PR in favor of one implementing read1().
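
As a rough sketch of what read1()-based streaming could look like, the generator below yields whatever bytes are available on each iteration and never yields an empty chunk. The name iter_available and its max_size parameter are hypothetical and not part of urllib3 or requests; the pipe again stands in for a slow network peer.

```python
import io
import os
from typing import Iterator


def iter_available(fp: io.BufferedIOBase, max_size: int = 65536) -> Iterator[bytes]:
    """Hypothetical helper: yield available bytes (up to max_size) until EOF."""
    while True:
        chunk = fp.read1(max_size)
        if not chunk:  # b"" means end-of-stream; it is never yielded
            return
        yield chunk


# Assumption: a pipe stands in for a server that drips data over time.
read_fd, write_fd = os.pipe()
reader = io.BufferedReader(io.FileIO(read_fd, "rb"))
chunks = iter_available(reader)

os.write(write_fd, b"drip-1")
first = next(chunks)   # returned without waiting for the writer to finish
os.write(write_fd, b"drip-2")
second = next(chunks)
os.close(write_fd)
rest = list(chunks)    # the generator stops at EOF
reader.close()
```

The chunk boundaries here follow the individual writes only because each write is consumed before the next one happens; over a real connection the sizes depend on how the data arrives.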

Method suggested in urllib3#2125 as a nicer way to get resp.stream(None) working for non-chunked responses. Passes a parameter down the _*_read chain so that their implementation can be reused.

yield streamed bytes as soon as they arrive

80642af

Base automatically changed from master to main January 16, 2021 20:06

Silverfishnet approved these changes Apr 21, 2022

sethmlarson closed this Apr 26, 2022

smason mentioned this pull request Nov 8, 2023

Support read1() method in HTTPResponse #3186

Merged

smason mentioned this pull request Nov 27, 2023

make HTTPResponse.stream use read1 when amt=None #3216

Open
