Document that the crawl order is BFO for small numbers of start requests #3621
Codecov Report
@@ Coverage Diff @@
## master #3621 +/- ##
==========================================
- Coverage 85.59% 85.03% -0.56%
==========================================
Files 164 164
Lines 9545 9545
Branches 1430 1430
==========================================
- Hits 8170 8117 -53
- Misses 1128 1175 +47
- Partials 247 253 +6
16272b6 to 0411dda
Looks good to me 😄
docs/faq.rst (outdated)
- in most cases. If you do want to crawl in true `BFO order`_, you can do it by
+ in most cases.
+
+ However, when the number of start request is small it always `crawls in BFO
I don't think this is correct: if there is a single request in start_requests, but the spider then schedules many requests (e.g. after extracting links), the behavior would be as usual, except perhaps for the first few requests.
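To illustrate the point, here is a toy model of a crawl that starts from a single request and then schedules the links it extracts. It is only a sketch: `LINKS` and `crawl_order` are hypothetical names, requests are processed one at a time, and it ignores concurrency, priorities and everything else the real scheduler does. It only contrasts a LIFO pending queue (Scrapy's default, which gives roughly depth-first order) with a FIFO one (breadth-first order).

```python
from collections import deque

# Hypothetical toy site: each page maps to the links extracted from it.
LINKS = {
    "start": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1", "b2"],
    "a1": [], "a2": [], "b1": [], "b2": [],
}


def crawl_order(start_requests, lifo=True):
    """Return the visit order of a one-request-at-a-time crawl.

    lifo=True pops the most recently scheduled request (depth-first),
    lifo=False pops the oldest scheduled request (breadth-first).
    """
    pending = deque(start_requests)
    seen = set(start_requests)
    order = []
    while pending:
        page = pending.pop() if lifo else pending.popleft()
        order.append(page)
        # Schedule every link found on the page that was not seen before.
        for link in LINKS[page]:
            if link not in seen:
                seen.add(link)
                pending.append(link)
    return order


print(crawl_order(["start"]))              # LIFO queue
print(crawl_order(["start"], lifo=False))  # FIFO queue
```

With a single start request, the LIFO run finishes the whole `b` branch before touching `a`, while the FIFO run visits both depth-1 pages before any depth-2 page. So in this model a small number of start requests does not by itself make the crawl breadth-first.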
0411dda to 783de8a
I’ve switched to a completely new wording that I hope is more accurate and clear.
Thanks @Gallaecio, I like the clarity of the explanation!
Fixes #1739