[MRG] process robots.txt once #896

Merged
merged 2 commits into from Oct 2, 2014

Conversation

@kmike (Member) commented Sep 22, 2014

Currently, RobotsTxtMiddleware also processes the robots.txt requests that it sends itself. If the initial request is redirected, two requests to robots.txt are sent. To fix that, I've added a dont_obey_robotstxt Request.meta flag; it was initially private, but since it can be useful to users, it is now public. Also, the unused self._spider_netlocs set is removed.
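
To illustrate the idea, here is a minimal, self-contained sketch of how a meta flag can keep a middleware from processing its own robots.txt requests. It is a toy model, not the code from this pull request: `FakeRequest`, `RobotsOnceMiddleware`, `robots_requests` and `_seen_netlocs` are hypothetical names made up for the example, and only the `dont_obey_robotstxt` meta key comes from the description above.

```python
from urllib.parse import urlparse


class FakeRequest:
    """Hypothetical stand-in for a crawler request: a URL plus a meta dict."""

    def __init__(self, url, meta=None):
        self.url = url
        self.meta = dict(meta or {})


class RobotsOnceMiddleware:
    """Toy model of the fix; NOT Scrapy's actual RobotsTxtMiddleware."""

    def __init__(self):
        self.robots_requests = []    # robots.txt requests this middleware issued
        self._seen_netlocs = set()   # hosts whose robots.txt was already requested

    def process_request(self, request):
        # Requests carrying the flag are skipped, so the middleware never
        # triggers a second robots.txt fetch for its own (possibly
        # redirected) robots.txt request.
        if request.meta.get("dont_obey_robotstxt"):
            return
        parts = urlparse(request.url)
        if parts.netloc not in self._seen_netlocs:
            self._seen_netlocs.add(parts.netloc)
            robots_req = FakeRequest(
                f"{parts.scheme}://{parts.netloc}/robots.txt",
                meta={"dont_obey_robotstxt": True},
            )
            self.robots_requests.append(robots_req)


mw = RobotsOnceMiddleware()
mw.process_request(FakeRequest("http://example.com/page"))

# Simulate the robots.txt request re-entering the middleware after a
# redirect to another host; the flag travels with the request's meta.
redirected = FakeRequest(
    "http://www.example.com/robots.txt",
    meta=mw.robots_requests[0].meta,
)
mw.process_request(redirected)
```

Because the flag is public, a spider could presumably set it on its own requests as well, for example by passing `meta={'dont_obey_robotstxt': True}` when building a request that should bypass the robots.txt check.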

pablohoffman added a commit that referenced this pull request Oct 2, 2014
[MRG] process robots.txt once
@pablohoffman pablohoffman merged commit 5835224 into master Oct 2, 2014
1 check passed
continuous-integration/travis-ci: The Travis CI build passed
@kmike kmike deleted the robotstxt-once branch Mar 13, 2015