Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve http status all on http error middleware #4694

Merged
merged 2 commits into from Apr 1, 2021

Conversation

Jgaldos
Copy link

@Jgaldos Jgaldos commented Jul 23, 2020

Fixes #3851

on this spider

from scrapy import Spider, Request


class HttpStatusSpider(Spider):
    name = "http_status"

    def start_requests(self):
        http_status = [200, 300, 400, 500]
        urls = [
            f"http://httpbin.org/status/{status}" for status in http_status
        ]
        for url in urls:
            yield Request(
                url=url,
                callback=self.parse,
                meta={
                    "handle_httpstatus_all": False
                }
            )

    def parse(self, response, **kwargs):
        print(f"response url: {response.url}")

previous the fix was printing all elements on the list, after the fix just print 200 status

@codecov
Copy link

codecov bot commented Jul 24, 2020

Codecov Report

Merging #4694 into master will increase coverage by 0.91%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #4694      +/-   ##
==========================================
+ Coverage   86.29%   87.21%   +0.91%     
==========================================
  Files         160      160              
  Lines        9646     9813     +167     
  Branches     1416     1447      +31     
==========================================
+ Hits         8324     8558     +234     
+ Misses       1062      994      -68     
- Partials      260      261       +1     
Impacted Files Coverage Δ
scrapy/spidermiddlewares/httperror.py 100.00% <100.00%> (ø)
scrapy/core/downloader/webclient.py 94.48% <0.00%> (-3.48%) ⬇️
scrapy/utils/trackref.py 82.85% <0.00%> (-2.86%) ⬇️
scrapy/core/scraper.py 87.11% <0.00%> (-0.86%) ⬇️
scrapy/cmdline.py 67.20% <0.00%> (-0.55%) ⬇️
scrapy/utils/defer.py 95.45% <0.00%> (-0.20%) ⬇️
scrapy/pipelines/media.py 97.16% <0.00%> (-0.14%) ⬇️
scrapy/pqueues.py 99.08% <0.00%> (ø)
scrapy/utils/gz.py 96.87% <0.00%> (ø)
scrapy/exporters.py 100.00% <0.00%> (ø)
... and 34 more

Copy link
Member

@Gallaecio Gallaecio left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you add a test for this?

scrapy/spidermiddlewares/httperror.py Outdated Show resolved Hide resolved
@Jgaldos Jgaldos force-pushed the improve-httpstatus-all-meta branch from ffda6c3 to 9ff77a6 Compare August 7, 2020 01:42
@Jgaldos Jgaldos force-pushed the improve-httpstatus-all-meta branch from 9ff77a6 to 13181ba Compare August 7, 2020 01:43
@Jgaldos Jgaldos requested a review from Gallaecio August 7, 2020 01:45
@eLRuLL
Copy link
Member

eLRuLL commented Aug 7, 2020

@Jgaldos could you please improve the documentation too? as @kmike mentioned we could be more descriptive here

@Jgaldos Jgaldos force-pushed the improve-httpstatus-all-meta branch 2 times, most recently from 21034b3 to dde335d Compare August 28, 2020 15:07
@Jgaldos Jgaldos force-pushed the improve-httpstatus-all-meta branch from dde335d to a41c205 Compare August 31, 2020 13:59
@eLRuLL
Copy link
Member

eLRuLL commented Aug 31, 2020

@Gallaecio what should be the procedure to merge this? Are we waiting for another reviewer to accept and merge this? Please let me know if I can help in any way

@Gallaecio
Copy link
Member

All reviews are more than welcome, but yes, at the moment we require 2 core developer approvals to merge.

@kmike kmike added this to the 2.5 milestone Apr 1, 2021
@kmike
Copy link
Member

kmike commented Apr 1, 2021

Thanks @Jgaldos!

We should make sure to document it properly in the release notes (cc @wRAR).

@kmike kmike merged commit f0c8d31 into scrapy:master Apr 1, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

HttpErrorMiddleware not honoring handle_httpstatus_all meta as documented
4 participants