Allow disabling the AutoThrottle extension for a given slot #6246
Codecov Report
Additional details and impacted files

@@           Coverage Diff           @@
##           master    #6246   +/-   ##
=======================================
  Coverage   88.90%   88.90%
=======================================
  Files         161      161
  Lines       11790    11792    +2
  Branches     1913     1913
=======================================
+ Hits        10482    10484    +2
  Misses        980      980
  Partials      328      328
|
@Gallaecio During the preparation of #5328, compatibility with the AutoThrottle extension was not mentioned as a requirement. It was tested with default settings (with the AutoThrottle extension disabled). Updating AutoThrottle extension to make it compatible with While disabling throttle on a specific slot more or less looks like applying of And there is no need to make backward-incompatible changes to the scrapy/core downloader slot to implement this. Technically, nothing prevents us from updating |
So, what you are suggesting is to implement an AutoThrottle-specific setting, e.g. However, my end goal here, the reason why I implemented this, is to be able to disable AutoThrottle on slots created by scrapy-zyte-api, so that we can implement latency reporting without triggering AutoThrottle. For that, I don’t think a setting would be a good choice. I am completely open to suggestions, though. I really do not like modifying a core component to control an extension behavior, it is simply the cleanest thing I could come up with. |
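To make the per-slot idea concrete, here is a minimal, self-contained sketch of a slot-level opt-out. It does not use Scrapy's real classes: `Slot`, its `throttle` attribute, and `adjust_delay` are hypothetical stand-ins, and the delay formula is a simplified version of the documented AutoThrottle rule (it ignores the special handling of non-200 responses).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Slot:
    """Toy stand-in for a downloader slot (not Scrapy's real class)."""
    delay: float
    # Hypothetical flag: False means "the throttle must leave this slot alone".
    throttle: Optional[bool] = None


def adjust_delay(slot: Slot, latency: Optional[float],
                 target_concurrency: float = 1.0,
                 min_delay: float = 0.0, max_delay: float = 60.0) -> None:
    """Move slot.delay toward latency / target_concurrency unless opted out."""
    if latency is None or slot.throttle is False:
        return  # nothing to measure, or the slot opted out
    target_delay = latency / target_concurrency
    # Average the current delay with the target, but never go below the target.
    new_delay = max(target_delay, (slot.delay + target_delay) / 2.0)
    # Clamp the result to the configured bounds.
    slot.delay = min(max(min_delay, new_delay), max_delay)


regular = Slot(delay=1.0)
zyte_api = Slot(delay=0.0, throttle=False)  # slot that opts out
adjust_delay(regular, latency=3.0)
adjust_delay(zyte_api, latency=3.0)
```

In this sketch the regular slot's delay is adjusted while the opted-out slot keeps its original delay, even though a latency was reported for both.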
I mean something like
This will require updating the params for all download slots with
My suggestion is to update.. method of
This approach doesn't require changing either the downloader or the AutoThrottle extension (only the Zyte API middleware code needs to be updated) |
What about exposing a |
@Gallaecio
With the approach I proposed earlier, the end user would get the expected result after updating only the version of the Zyte API middleware. This approach (adding another setting at the Scrapy code level), as well as the changes from this PR, also requires updating Scrapy to the latest version (the one that includes this change), which may not be possible for some projects. |
Yes, but if I understood correctly, you are suggesting to monkey-patch Scrapy code from the extension, which I would like to avoid.
I think this is OK, since the only thing we need this for is to support latency reporting for responses coming from scrapy-zyte-api, and most users will probably not care about that. |
As far as I know scrapy-zyte-api (handler) doesn't set scrapy/scrapy/extensions/throttle.py Lines 64 to 66 in 02b97f9
The scrapy-zyte-api handler has its own logic that duplicates the logic of RetryMiddleware and the AutoThrottle extension (as mentioned in scrapy-plugins/scrapy-zyte-api#99 (comment)). In this case, I propose updating scrapy-zyte-api to write (or not write) something in |
This was not done on purpose, though; it was something we accidentally forgot to implement, i.e. a bug. And by the time we realized it, we were already exploiting this bug to prevent AutoThrottle from affecting Zyte API traffic. But we do want to fix the bug, i.e. implement download_latency, only we need to first figure out a way to handle AutoThrottle that does not depend on download_latency not being defined.
I think this is fine.
I think putting a value in |
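The dependency described above can be illustrated with a small, self-contained sketch. It assumes that a throttle skips its adjustment when no `download_latency` is present in the request metadata, and it shows how an explicit opt-out would let a handler report latency without being throttled. Plain dictionaries stand in for request metadata, and the `skip_throttle` key and both function names are hypothetical, not part of Scrapy or scrapy-zyte-api.

```python
from typing import Any, Dict, Optional


def record_latency(meta: Dict[str, Any], started: float, finished: float) -> None:
    """What a download handler could do once latency reporting is implemented."""
    meta["download_latency"] = finished - started


def latency_for_throttle(meta: Dict[str, Any]) -> Optional[float]:
    """Return the latency a throttle should use, or None to skip adjusting."""
    if meta.get("skip_throttle"):  # hypothetical explicit opt-out
        return None
    # A missing download_latency also means there is nothing to adjust on.
    return meta.get("download_latency")


meta: Dict[str, Any] = {}
before_fix = latency_for_throttle(meta)   # no latency recorded yet

record_latency(meta, started=10.0, finished=10.5)
after_fix = latency_for_throttle(meta)    # latency is now visible to the throttle

meta["skip_throttle"] = True
with_opt_out = latency_for_throttle(meta)  # latency kept in meta, throttle skipped
```

With an explicit opt-out, the latency stays available for reporting while the throttle ignores the request, so skipping no longer relies on the value being absent.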
No description provided.