Merge pull request #132 from oltarasenko/Ziinc-patch-2
enhanced docs for :concurrent_requests_per_domain
oltarasenko committed Nov 2, 2020
2 parents a96b1e6 + 9a53078 commit e497f58
Showing 1 changed file with 7 additions and 5 deletions.
12 changes: 7 additions & 5 deletions documentation/configuration.md
@@ -67,11 +67,13 @@ Defines a minimal amount of items which needs to be scraped by the spider within

default: 4

The maximum number of concurrent (i.e. simultaneous) requests that will be performed by the Crawly workers.
Indicates the number of Crawly workers that will be used to make simultaneous requests on a per-spider basis.

NOTE: A worker's speed is often limited by the speed of the actual HTTP client and
network bandwidth. Crawly itself does not allow one worker to send more than
4 requests per minute.
Each Crawly worker is rate-limited to a maximum of 4 requests per minute. Hence, with the default of 4 workers, the maximum request rate is `4 x 4 = 16 req/min`.

In order to increase the number of requests made per minute, set this option to a higher value. The default value is intentionally set at 4 for ethical crawling reasons.

NOTE: A worker's speed is often limited by the speed of the actual HTTP client and network bandwidth, and may be less than 4 requests per minute.
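
For illustration, a higher limit can be set in the application config. The snippet below is a minimal sketch using the standard Mix config mechanism Crawly reads its settings from; the value 8 is an arbitrary example, not a recommendation:

```elixir
# config/config.exs
import Config

# Use 8 workers instead of the default 4, allowing up to
# 8 x 4 = 32 requests per minute (given the 4 req/min per-worker limit).
config :crawly,
  concurrent_requests_per_domain: 8
```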

### retry :: Keyword list

@@ -138,4 +140,4 @@ The full list of overridable settings:
- fetcher,
- retry,
- middlewares,
- pipelines
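
As a rough illustration (assuming a spider module shaped like the ones in Crawly's tutorial, and that the optional `override_settings/0` callback is available in the Crawly version in use), a single spider could override one of these settings without touching the global config:

```elixir
defmodule ExampleSpider do
  use Crawly.Spider

  def base_url(), do: "https://example.com"

  def init(), do: [start_urls: ["https://example.com/"]]

  # Per-spider override (assumes the optional override_settings/0 callback):
  # lower the concurrency for this spider only, leaving global config untouched.
  def override_settings(), do: [concurrent_requests_per_domain: 2]

  # Minimal parse callback that extracts nothing; a real spider would
  # return scraped items and follow-up requests here.
  def parse_item(_response), do: %{items: [], requests: []}
end
```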
