Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PoolManager retries argument is not passed down to requests #2475

Open
vbarbaresi opened this issue Oct 30, 2021 · 3 comments
Open

PoolManager retries argument is not passed down to requests #2475

vbarbaresi opened this issue Oct 30, 2021 · 3 comments

Comments

@vbarbaresi
Copy link

Subject

The documentation states that retries argument can be passed to a PoolManager to be used as default in pool requests:
https://urllib3.readthedocs.io/en/latest/reference/urllib3.util.html#utilities

It seems that retries is actually ignored in the pool requests

Environment

OS macOS-10.15.3-x86_64-i386-64bit
Python 3.8.6
urllib3 1.26.7

Also reproduced in development mode in urllib3 main branch

Steps to Reproduce

from urllib3 import Retry, PoolManager
from urllib3.exceptions import MaxRetryError

retries = Retry(connect=0, read=0, redirect=0)
http = PoolManager(retries=retries)

# Max redirect retries not honored
response = http.request("GET", "http://nghttp2.org/httpbin/redirect/2")
assert response.status == 200
print("Should have raised MaxRetryError")

try:
    http.request("GET", "http://nghttp2.org/httpbin/redirect/3", retries=retries)
except MaxRetryError:
    print("Max retry honored")

Expected Behavior

I would expect the first request to raise MaxRetryError

Actual Behavior

The first request actually use a default retry strategy instead of the pool's strategy

diff --git a/src/urllib3/poolmanager.py b/src/urllib3/poolmanager.py
index a36f96ae..70ebc65e 100644
--- a/src/urllib3/poolmanager.py
+++ b/src/urllib3/poolmanager.py
@@ -453,7 +453,7 @@ class PoolManager(RequestMethods):
         if response.status == 303:
             method = "GET"

-        retries = kw.get("retries")
+        retries = kw.get("retries") or self.connection_pool_kw.get("retries")
         if not isinstance(retries, Retry):
             retries = Retry.from_int(retries, redirect=redirect)

This fixes the issue. Does that seem like the right way to fix this? I can write a PR and a unit test.
Or maybe connection_pool_kw arguments should be merged in a more generic place like _merge_pool_kwargs().

@misha778
Copy link

Thank you! it's a pity that doesn't help for requests, it doesn't seem to be using the poolmanager urlopen method.

@misha778
Copy link

misha778 commented Nov 11, 2021

with a timeout error, retry is working fine

retry = Retry(total=10)
http = urllib3.PoolManager(retries=retry, timeout=2)

r = http.request("GET", "https://httpbin.org/delay/5")

@vbarbaresi
Copy link
Author

Indeed, for timeouts it goes through a different codepath in connectionpool.urlopen()

It default on self.retries when retries is not passed as keyword argument:

if not isinstance(retries, Retry):
retries = Retry.from_int(retries, redirect=redirect, default=self.retries)

So this only affects other type of errors like redirects (in poolmanager.py)

vbarbaresi added a commit to vbarbaresi/trafilatura that referenced this issue Dec 30, 2023
…dling

Fixes issue adbar#450

After setting `MAX_REDIRECTS` to 5, I could fetch the original URL from the issue:
`trafilatura -u https://www.hydrogeninsight.com/production/breaking-us-reveals-the-seven-regional-hydrogen-hubs-to-receive-7bn-of-government-funding/2-1-1534596`

I also fixed this old issue: adbar#128
The underlying urllib3 bug has not been fixed: urllib3/urllib3#2475

I had to pass the retry strategy to the actual request method: it doesn't propagate from the pool maanger
adbar added a commit to adbar/trafilatura that referenced this issue Jan 4, 2024
…dling (#461)

* introduce `MAX_REDIRECTS` config setting and fix urllib3 redirect handling

Fixes issue #450

After setting `MAX_REDIRECTS` to 5, I could fetch the original URL from the issue:
`trafilatura -u https://www.hydrogeninsight.com/production/breaking-us-reveals-the-seven-regional-hydrogen-hubs-to-receive-7bn-of-government-funding/2-1-1534596`

I also fixed this old issue: #128
The underlying urllib3 bug has not been fixed: urllib3/urllib3#2475

I had to pass the retry strategy to the actual request method: it doesn't propagate from the pool maanger

* use httpbin everywhere instead of httpbun

* restore MAX_REDIRECTS defaut config value to 2

* reset global objects after test_fetch to avoid side-effects

* pin lxml to < 5

* test no_ssl True & False to fix coverage

* move _reset_global_objects() to the test file

* rewrite assignment in multiple lines for readability

---------

Co-authored-by: Adrien Barbaresi <adbar@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants