Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] New signal for reqeuests reached downloader #3393

Conversation

@whalebot-helmsman
Copy link
Contributor

@whalebot-helmsman whalebot-helmsman commented Aug 17, 2018

New signal added with documentation to be emitted for requests reached downloader. It is done to provide better integration between scheduler and downloader without adding much complexity, based on previous discussions: #2435 #2474

Signal name is discussable. Documentation can be improved.

@codecov
Copy link

@codecov codecov bot commented Aug 17, 2018

Codecov Report

Merging #3393 into master will increase coverage by <.01%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##           master    #3393      +/-   ##
==========================================
+ Coverage   84.46%   84.47%   +<.01%     
==========================================
  Files         167      167              
  Lines        9362     9366       +4     
  Branches     1390     1390              
==========================================
+ Hits         7908     7912       +4     
  Misses       1199     1199              
  Partials      255      255
Impacted Files Coverage Δ
scrapy/signals.py 100% <100%> (ø) ⬆️
scrapy/core/downloader/__init__.py 88.97% <100%> (+0.08%) ⬆️
scrapy/spidermiddlewares/offsite.py 100% <0%> (ø) ⬆️
@whalebot-helmsman
Copy link
Contributor Author

@whalebot-helmsman whalebot-helmsman commented Aug 28, 2018

Made a performance comparison with new signal and without one. Full log available https://gist.github.com/whalebot-helmsman/427908453e28e4d91ba24062a3a1aa05

==> /home/hotdox/exc/master <==
2018-08-28 09:11:37 [scrapy.core.engine] INFO: Spider closed (closespider_itemcount)

The results of the benchmark are (all speeds in items/sec) : 


Test = 'Book Spider' Iterations = '10'


Mean : 86.57660125097868 Median : 86.4017169447081 Std Dev : 0.8022010534209844


==> /home/hotdox/exc/branch <==
2018-08-28 09:09:11 [scrapy.core.engine] INFO: Spider closed (closespider_itemcount)

The results of the benchmark are (all speeds in items/sec) : 


Test = 'Book Spider' Iterations = '10'


Mean : 86.94572398605807 Median : 86.71087801245771 Std Dev : 0.9639990239084966

No substantial impact on performance

.. signal:: request_reached_downloader
.. function:: request_reached_downloader(request, spider)

Sent when a :class:`~scrapy.http.Request`, reached downloader.

This comment has been minimized.

@kmike

kmike Aug 28, 2018
Member

typo: comma is not needed here

@kmike kmike changed the title New signal for reqeuests reached downloader [MRG+1] New signal for reqeuests reached downloader Aug 28, 2018
@kmike kmike added this to the v1.6 milestone Aug 28, 2018
@kmike
Copy link
Member

@kmike kmike commented Aug 28, 2018

👍 I think that's a good feature which may be used not only for custom schedulers; overhead doesn't seem significant. Signal name and the implementation look fine.

@dangra dangra merged commit 4da0b59 into scrapy:master Sep 5, 2018
3 checks passed
3 checks passed
codecov/patch 100% of diff hit (target 84.46%)
Details
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@kmike kmike mentioned this pull request Sep 24, 2018
@whalebot-helmsman whalebot-helmsman deleted the whalebot-helmsman:singal-request-added-to-downloader-slot branch Jul 18, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

3 participants