Add DOWNLOAD_DELAY=0.5 to Scrapy config #109
Merged
At the time of writing, oh-bugimporters has difficulty downloading
all the bugs it requests from github.com.
@ehashman discovered that GitHub throttles API requests after
5000 per hour.
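
As a rough sanity check on the numbers, the sketch below relates a
fixed per-domain delay to an hourly request ceiling. It assumes
strictly sequential requests to a single domain and ignores response
latency, concurrency, and Scrapy's delay randomization, so it is an
approximation and not a measurement of the crawler. The function
names are hypothetical and exist only for this illustration.

```python
SECONDS_PER_HOUR = 3600


def max_requests_per_hour(delay_seconds: float) -> float:
    """Ceiling on sequential requests per hour to one domain,
    assuming a fixed delay and zero response latency."""
    return SECONDS_PER_HOUR / delay_seconds


def min_delay_for_limit(limit_per_hour: int) -> float:
    """Fixed delay (in seconds) that keeps sequential requests
    at or below the given hourly limit."""
    return SECONDS_PER_HOUR / limit_per_hour


if __name__ == "__main__":
    print(max_requests_per_hour(0.5))   # ceiling for a 0.5 s delay
    print(min_delay_for_limit(5000))    # delay matching 5000 requests/hour
```

Under these assumptions a 0.5 s delay still permits up to 7200
requests per hour, and roughly 0.72 s would be needed to stay at
5000. Real throughput is lower once response times are included.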
The Scrapy DOWNLOAD_DELAY setting affects only "consecutive pages
from the same website", so requests to different sites can still
run in parallel after this change. However, the setting is global
and not specific to github.com, so every domain we crawl is
delayed and we might still see a general slowdown.
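
For reference, a minimal sketch of the change described here. The
file name and its location in the project are assumptions; the
diff itself is not reproduced on this page.

```python
# settings.py (assumed file name for the project's Scrapy settings module)

# Wait 0.5 seconds between consecutive requests to the same website.
DOWNLOAD_DELAY = 0.5

# Scrapy's RANDOMIZE_DOWNLOAD_DELAY defaults to True, in which case the
# actual wait is a random value between 0.5x and 1.5x of DOWNLOAD_DELAY.
```

A spider can also set its own `download_delay` attribute, which
would limit the slowdown to the GitHub importer alone; that is an
alternative and not what this pull request does.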