-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
No Retry in HTTPResponse.read() method. #542
Comments
I think you're right, we do need something like this. Especially because we ultimately want to make I agree it makes sense to attach the retry object to the response, but keep in mind that it's immutable, so you'd need to replace the response on retries somehow. I'm not sure the API you proposed would work in this scenario. Ideally, the response should know to retry based on the request's retry configuration (which would be easier if the request's retry object is attached to the response). I'm not 100% sure that would be feasible for a response to trigger a request implicitly with the way the direction of responsibility is laid out in the library, but it's worth exploring first. |
Wouldn't my last line do that? The |
You are setting |
My example is a little wrong, it'd like be |
@dstufft Ah I see, yea that might work. Ideally it would be implicit, based on the response's retries object rather than necessarily explicit? |
I'm not sure I follow, maybe it would help if I annotated this? # Make our original request, any connection errors that occur during this will
# be handled by the built in urllib3 retries.
response = pool.urlopen(...)
# Now we want to read the data from the response, we use a loop here so that
# this block of code can be repeated until we either run out of retries or
# we successfully read the entire response. It is important that end users
# ensure to have read the *entire* body of the response in this loop or they
# otherwise ensure that the response has been closed within the body of this
# loop.
while not response.closed:
# *Everything* that works with data from the response must be within this
# look, this includes initializing any buffers, etc. If any of this is done
# outside of the loop then we will leak state between retries.
data = []
# We need a try: except block to catch any of the errors that we want to
# retry on that might occur while reading the response.
try:
# Actually read our data from the response
read_data = response.read(4096)
while read_data:
data.append(read_data)
read_data.response.read(4096)
# We catch an object here which lists all of the normal retry exceptions,
# this will ensure that end users do not need to maintain a list of
# different exceptions themselves.
except response.retries.exceptions as exc:
# We've encountered an error that we can retry on, so we'll call the
# response.retry() method which will implicit make a new connection,
# get a new response object and return that response object. The retry
# method will also handle checking if we have an retries left and if
# not it will raise a MaxRetryError
response = response.retry(exc)
# If we got here, then it means we've successfully read all of the data off
# the response and we do not want to repeat this anymore. urllib3 will
# close the response when it detects it's read the last bit of data so
# there is no need for anything else to happen here, the loop condition
# will evaluate to False and we'll exit this loop and continue on with the
# execution of the program.
# Now we've handled the response and gotten all of our data including handling
# retries. So do something with our data.
with open(...) as fp:
for chunk in data:
fp.write(chunk) |
Note: I dropped the check if the retries are exhausted from the loop conditional. We don't want to skip the loop if we've used up all of our retries trying to connect to the server but on our very last attempt we successfully connected. |
+1, I think this is a valuable recipe that we should probably add to the docs once it is supported. |
As an additional thing, we can provide a helper function that will wrap up some of this but it will force people to write a function that does the actual reading. The urllib3 provided function would look something like: def read_with_retries(response, reader):
while not response.closed:
try:
return reader(response)
except response.retries.exceptions as exc:
response = response.retry(exc) You can see that The end user would then do something like this: def reader(response):
data = []
read_data = response.read(4096)
while read_data:
data.append(read_data)
read_data = response.read(4096)
return data
response = pool.urlopen(...)
data = read_with_retries(response, reader)
with open(...) as fp:
for chunk in data:
fp.write(chunk) |
any progress on this? |
Nope |
It appears that the retry support in urllib3 doesn't cover the HTTPResponse.read() method. This means that if you pass
preload_content=False
tourlopen()
as requests does, that any errors that occur whileread()
ing the body will not automatically be retried. Doing this transparently would be difficult to do since urllib3 doesn't know what the calling library has done with the parts of the body it's already received.What would be nice though is to provide the
Retry()
object on the response object and provide instructions for how the caller of the library can use it to implement retries themselves. Perhaps this could look something like:Thoughts?
The text was updated successfully, but these errors were encountered: