[Discussion] Don’t retry response status codes from handle_httpstatus_list #3675

Closed


Gallaecio
Member

Given a request with 'handle_httpstatus_list': [404] in a project with RETRY_HTTP_CODES = […, 404, …], there is currently no way to prevent retries of that specific request for 404 responses only: dont_retry disables retries for every status code.
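To make the limitation concrete, here is a minimal sketch of the current retry decision. It models request.meta as a plain dict instead of importing Scrapy, and it ignores the retry-count limit. The function name would_retry_current and the example code list are assumptions for illustration, not part of Scrapy or of this project.

```python
# Illustrative values only: the project's real RETRY_HTTP_CODES is not shown here.
EXAMPLE_RETRY_HTTP_CODES = [503, 404]


def would_retry_current(meta, status, retry_http_codes=EXAMPLE_RETRY_HTTP_CODES):
    """Model of the existing decision: only dont_retry can veto a retry."""
    if meta.get('dont_retry', False):
        return False
    return status in retry_http_codes


# The request wants to handle 404 itself, yet the 404 is still retried.
meta = {'handle_httpstatus_list': [404]}
assert would_retry_current(meta, 404) is True

# dont_retry stops the 404 retry, but it also stops retries for 503.
meta = {'handle_httpstatus_list': [404], 'dont_retry': True}
assert would_retry_current(meta, 404) is False
assert would_retry_current(meta, 503) is False
```

The second case is the problem: the only available switch is all-or-nothing.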

These changes solve that issue by copying some code from the Redirect middleware into the Retry middleware, making the latter take handle_httpstatus_list and handle_httpstatus_all into account for retries.
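As a rough sketch of the proposed check, the function below reproduces the condition from the diff as a standalone predicate. The name should_skip_retry and the spider_handle_list parameter (standing in for the spider's handle_httpstatus_list attribute) are hypothetical, and request.meta is modelled as a plain dict.

```python
def should_skip_retry(meta, status, spider_handle_list=()):
    """Return True when the response should be passed through without a retry."""
    return bool(
        meta.get('dont_retry', False)
        or status in spider_handle_list
        or status in meta.get('handle_httpstatus_list', [])
        or meta.get('handle_httpstatus_all', False)
    )


# 404 is handled by the request, so it is no longer retried ...
assert should_skip_retry({'handle_httpstatus_list': [404]}, 404) is True
# ... while other codes on the same request stay eligible for retries.
assert should_skip_retry({'handle_httpstatus_list': [404]}, 503) is False
```

With this predicate, a status is exempt from retries exactly when the request or spider has declared that it handles that status itself.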

These changes are backward incompatible. Not only do they change the current behavior, but they also make the previous behavior impossible without replacing the built-in Retry middleware.

What are your thoughts? Should I go ahead and complete the pull request as is (add docs and tests)? Should we follow a backward-compatible approach, such as supporting a new dont_retry_httpstatus_list meta key? Can you think of a better approach? Or is it better for people to rewrite the Retry middleware when they need this behavior?
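For comparison, the backward-compatible alternative could look roughly like the sketch below. The dont_retry_httpstatus_list key does not exist in Scrapy; it is only the name floated in this discussion, and would_retry_with_optout and the example code list are likewise hypothetical.

```python
def would_retry_with_optout(meta, status, retry_http_codes):
    """Retry decision with a per-request, per-status opt-out (hypothetical key)."""
    if meta.get('dont_retry', False):
        return False
    # New, opt-in veto: only the listed status codes are exempt from retries.
    if status in meta.get('dont_retry_httpstatus_list', []):
        return False
    return status in retry_http_codes


codes = [503, 404]  # illustrative values, not the project's real setting

# Existing requests are unaffected: handle_httpstatus_list alone changes nothing.
assert would_retry_with_optout({'handle_httpstatus_list': [404]}, 404, codes) is True

# Requests that opt in skip retries for 404 only.
meta = {'handle_httpstatus_list': [404], 'dont_retry_httpstatus_list': [404]}
assert would_retry_with_optout(meta, 404, codes) is False
assert would_retry_with_optout(meta, 503, codes) is True
```

The trade-off is that callers must repeat the status code in two meta keys.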


codecov bot commented Mar 8, 2019

Codecov Report

Merging #3675 into master will not change coverage.
The diff coverage is 100%.

@@           Coverage Diff           @@
##           master    #3675   +/-   ##
=======================================
  Coverage   84.52%   84.52%           
=======================================
  Files         167      167           
  Lines        9410     9410           
  Branches     1397     1397           
=======================================
  Hits         7954     7954           
  Misses       1199     1199           
  Partials      257      257
Impacted Files | Coverage Δ
scrapy/downloadermiddlewares/retry.py | 95.83% <100%> (ø) ⬆️

if (request.meta.get('dont_retry', False) or
        response.status in getattr(spider, 'handle_httpstatus_list', []) or
        response.status in request.meta.get('handle_httpstatus_list', []) or
        request.meta.get('handle_httpstatus_all', False)):
Member

We could keep it backward compatible by only taking handle_httpstatus_all into consideration when a settings key is present, for instance "DONT_RETRY_HANDLE_HTTPSTATUS_ALL = True".
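A minimal sketch of that suggestion, assuming settings and request.meta are plain dicts: the setting named in this comment is only a proposal and is not an existing Scrapy setting, and skip_retry_for_handle_all is a hypothetical name. The comment only mentions handle_httpstatus_all, so the sketch gates that key alone and leaves the handle_httpstatus_list checks out.

```python
def skip_retry_for_handle_all(meta, settings):
    """Honor handle_httpstatus_all for retries only when the project opts in."""
    if meta.get('dont_retry', False):
        return True
    # Proposed, hypothetical setting; off by default to keep the old behavior.
    opted_in = settings.get('DONT_RETRY_HANDLE_HTTPSTATUS_ALL', False)
    return bool(opted_in and meta.get('handle_httpstatus_all', False))


meta = {'handle_httpstatus_all': True}

# Default settings: nothing changes, so the response can still be retried.
assert skip_retry_for_handle_all(meta, {}) is False

# Opted-in project: handle_httpstatus_all now suppresses retries.
assert skip_retry_for_handle_all(meta, {'DONT_RETRY_HANDLE_HTTPSTATUS_ALL': True}) is True
```

Defaulting the setting to False is what keeps existing projects on the previous behavior.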
