Adding Round Robin Queue #21
Codecov Report
@@            Coverage Diff            @@
##           master     #21     +/-   ##
=======================================
- Coverage   98.52%   97.53%    -1%
=======================================
  Files           3        4     +1
  Lines         204      243    +39
  Branches       26       34     +8
=======================================
+ Hits          201      237    +36
- Misses          1        2     +1
- Partials        2        4     +2
Continue to review full report at Codecov.
.gitignore
Outdated
@@ -0,0 +1,101 @@
# Byte-compiled / optimized / DLL files
Please remove this file. These kinds of files are part of the developer's environment; if we have something project-specific to ignore, we add it here.
I like the idea and implementation. Let's merge as soon as my |
Thanks @dangra: I removed the |
@cathalgarvey LGTM and release, but we need to fix the travis-ci build, which seems broken due to a missing pypy binary.
Broken travis-ci builds addressed by #22.
This queue (with tests) is meant to solve the issues raised in scrapy/scrapy#2474 and scrapy/scrapy#1802.
I would like a domain scheduler implemented here that scrapes in a domain-smart way, by round-robin cycling through the domains. This has two benefits:
CONCURRENT_REQUESTS_PER_IP type restrictions.
This implements the proposed solution in scrapy/scrapy#1802. I would like to merge the round-robin queue first, and then merge the changes from the domain scheduler into scrapy.
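For readers unfamiliar with the idea, here is a minimal sketch of what cycling through domains in round-robin order could look like. It is not the code in this PR: the class name `DomainRoundRobin`, its `push`/`pop` methods, and the example URLs are all hypothetical, and the queue is kept in memory only for illustration.

```python
from collections import deque
from urllib.parse import urlparse


class DomainRoundRobin:
    """Hypothetical sketch: one FIFO per domain, served in rotation."""

    def __init__(self):
        self._queues = {}         # domain -> deque of pending URLs
        self._rotation = deque()  # domains that still have pending URLs

    def push(self, url):
        domain = urlparse(url).netloc
        if domain not in self._queues:
            # First URL for this domain: give it a slot in the rotation.
            self._queues[domain] = deque()
            self._rotation.append(domain)
        self._queues[domain].append(url)

    def pop(self):
        if not self._rotation:
            return None
        # Serve the domain at the front, then move it to the back
        # if it still has work left; otherwise drop it entirely.
        domain = self._rotation.popleft()
        queue = self._queues[domain]
        url = queue.popleft()
        if queue:
            self._rotation.append(domain)
        else:
            del self._queues[domain]
        return url

    def __len__(self):
        return sum(len(q) for q in self._queues.values())


rr = DomainRoundRobin()
for u in [
    "http://a.example/1",
    "http://a.example/2",
    "http://a.example/3",
    "http://b.example/1",
    "http://c.example/1",
]:
    rr.push(u)

# Drain the queue; len(rr) is evaluated once, before any pop.
order = [rr.pop() for _ in range(len(rr))]
```

With this ordering, a domain with many queued URLs (`a.example` above) does not starve the others: each domain with pending work gets one turn per cycle, and only once the smaller domains are exhausted are the remaining `a.example` URLs served back to back.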