[MRG] process robots.txt once #896
Merged
pablohoffman added a commit that referenced this pull request on Oct 2, 2014: [MRG] process robots.txt once
Currently RobotsTxtMiddleware also processes the robots.txt requests that it sends itself. If the initial request is redirected, two requests to robots.txt are sent. To fix that, I've added a `dont_obey_robotstxt` Request.meta flag; initially it was private, but since it can be useful for users it is now public. Also, an unused `self._spider_netlocs` set is deleted.
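
To illustrate the idea, here is a minimal stdlib-only sketch of how such a flag can stop a middleware from reacting to its own robots.txt requests. It is not the code from this PR: `ToyRobotsMiddleware`, its `scheduled` list, and the `(url, meta)` call shape are hypothetical stand-ins, and the redirect scenario (the robots.txt request is redirected to another host and re-enters the middleware with its meta preserved) is an assumption about how the duplicate arises.

```python
from urllib.parse import urlparse


class ToyRobotsMiddleware:
    """Hypothetical stand-in for a robots.txt middleware (not Scrapy's code)."""

    def __init__(self):
        # robots.txt requests this middleware has issued, as (url, meta) pairs
        self.scheduled = []
        self._seen_netlocs = set()

    def process_request(self, url, meta):
        # Requests carrying the flag bypass robots.txt handling entirely,
        # so the middleware's own robots.txt fetch never triggers another one.
        if meta.get("dont_obey_robotstxt"):
            return
        parts = urlparse(url)
        if parts.netloc not in self._seen_netlocs:
            self._seen_netlocs.add(parts.netloc)
            robots_url = f"{parts.scheme}://{parts.netloc}/robots.txt"
            # Mark the outgoing robots.txt request with the flag.
            self.scheduled.append((robots_url, {"dont_obey_robotstxt": True}))


mw = ToyRobotsMiddleware()
mw.process_request("http://example.com/page", {})

# Assumed scenario: the robots.txt request is redirected to another host
# and passes through the middleware again, keeping its meta.
robots_url, robots_meta = mw.scheduled[0]
mw.process_request("http://www.example.com/robots.txt", robots_meta)

print(len(mw.scheduled))
```

Because the flag is public, a user could presumably set the same key on their own requests, for example `meta={'dont_obey_robotstxt': True}`, to skip robots.txt handling for a specific request.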