Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Download handlers: from_crawler factory method, take crawler in __init__ #4126

Merged
merged 15 commits into from Jan 24, 2020

Conversation

elacuesta
Copy link
Member

@elacuesta elacuesta commented Nov 5, 2019

The inspiration for this came because of the proposed "bytes received" signal (#3793). With this change, the handlers could access the signal manager stored in the crawler, allowing them to trigger signals.

I guess this could lead to compatibility issues with custom handlers that inherit from the built-in ones, I figured that shouldn't be a problem if we're changing the main version number to 2.

@elacuesta elacuesta added this to the v2.0 milestone Nov 5, 2019
@codecov
Copy link

@codecov codecov bot commented Nov 5, 2019

Codecov Report

Merging #4126 into master will increase coverage by 0.05%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #4126      +/-   ##
==========================================
+ Coverage      84%   84.05%   +0.05%     
==========================================
  Files         165      165              
  Lines        9675     9687      +12     
  Branches     1447     1447              
==========================================
+ Hits         8127     8142      +15     
+ Misses       1294     1293       -1     
+ Partials      254      252       -2
Impacted Files Coverage Δ
scrapy/core/downloader/handlers/http11.py 93.07% <100%> (+0.08%) ⬆️
scrapy/core/downloader/handlers/datauri.py 93.33% <100%> (-0.79%) ⬇️
scrapy/core/downloader/handlers/s3.py 65.67% <100%> (+2.76%) ⬆️
scrapy/core/downloader/handlers/http10.py 100% <100%> (ø) ⬆️
scrapy/core/downloader/handlers/file.py 100% <100%> (ø) ⬆️
scrapy/core/downloader/handlers/ftp.py 98.36% <100%> (+0.08%) ⬆️
scrapy/core/downloader/handlers/__init__.py 83.63% <100%> (ø) ⬆️
scrapy/exporters.py 100% <0%> (ø) ⬆️
scrapy/core/downloader/__init__.py 90.83% <0%> (+1.52%) ⬆️
scrapy/utils/trackref.py 85.71% <0%> (+2.85%) ⬆️

Copy link
Member

@Gallaecio Gallaecio left a comment

There seem to be two affected lines not covered by the tests. I’m not sure how hard it would be to reach them, so up to you.

scrapy/core/downloader/handlers/datauri.py Outdated Show resolved Hide resolved
@elacuesta elacuesta force-pushed the from_crawler_downloader_handlers branch from 3a9a447 to c73094a Compare Nov 5, 2019
@elacuesta elacuesta force-pushed the from_crawler_downloader_handlers branch from d8210d2 to e43f37f Compare Dec 2, 2019
@elacuesta elacuesta closed this Dec 3, 2019
@elacuesta elacuesta reopened this Dec 3, 2019
dhcls,
self._crawler.settings,
self._crawler,
)
Copy link
Member

@kmike kmike Dec 23, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

scrapy/core/downloader/handlers/s3.py Outdated Show resolved Hide resolved
tests/test_downloader_handlers.py Outdated Show resolved Hide resolved
@kmike
Copy link
Member

@kmike kmike commented Dec 23, 2019

Hey! create_instance for download handlers is a great new feature, +1 to add it.

I think we should try more to preserve backwards compatibility; 2.0 is not a reason to make backwards incompatible changes without deprecation period. We're dropping Python 2 support in Scrapy 2.0, but Scrapy already worked with Python 3 for years.

@elacuesta
Copy link
Member Author

@elacuesta elacuesta commented Dec 23, 2019

Hi @kmike! Good point about the deprecation period, thanks 👍
Updated the PR, left a note about the S3 handler, would love you hear your thoughts about it.

tests/test_downloader_handlers.py Outdated Show resolved Hide resolved
@elacuesta elacuesta changed the title Download handlers: from_crawler factory method, take crawler instead of settings in __init__ Download handlers: from_crawler factory method, take crawler in __init__ Dec 24, 2019
@kmike kmike merged commit 7a62bd3 into scrapy:master Jan 24, 2020
2 checks passed
@kmike
Copy link
Member

@kmike kmike commented Jan 24, 2020

Thanks for all the updates @elacuesta!

@elacuesta elacuesta deleted the from_crawler_downloader_handlers branch Jan 25, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants