Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Verify SPIDER_MANAGER_CLASS interface while loading it in CrawlerRunner #1148

Merged
merged 1 commit into from Apr 13, 2015

Conversation

@curita
Copy link
Member

@curita curita commented Apr 10, 2015

Discussed in #873.

pablohoffman added a commit that referenced this pull request Apr 13, 2015
Verify SPIDER_MANAGER_CLASS interface while loading it in CrawlerRunner
@pablohoffman pablohoffman merged commit 71c0afa into scrapy:master Apr 13, 2015
1 check passed
1 check passed
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@curita curita added this to the Scrapy 1.0 milestone Apr 14, 2015
@curita curita deleted the curita:verify-spidermanager-interface branch Apr 14, 2015
@kmike
Copy link
Member

@kmike kmike commented Apr 16, 2015

This change broke some SpiderManager implementations which don't implement find_by_request method. find_by_request is only useful for scrapy fetch, scrapy parse and scrapy shell commands; crawling should work fine without this method.

@@ -42,3 +43,19 @@ class CustomSettingsSpider(DefaultSpider):

self.assertFalse(settings.frozen)
self.assertTrue(crawler.settings.frozen)


def SpiderManagerWithWrongInterface(object):

This comment has been minimized.

@kmike

kmike Apr 16, 2015
Member

why def?

This comment has been minimized.

@curita

curita Apr 16, 2015
Author Member

Oh god, why did we merged this (should be 'class', got a little too functional there). I'm not even sure how the test is working.

@curita
Copy link
Member Author

@curita curita commented Apr 16, 2015

About find_by_request, I'm not sure, I kept it in the interface on purpose (It was there before my changes in the Crawler API), I'd like that any SpiderManager implemented all needed functions to run all commands. We're doing this instead of letting it raise an exception later on because some method is not implemented, and that's going to happen eventually if we remove find_by_request from the interface (but I grant that another use-case, probably more important, for checking the interface is to verify that the SpiderManager is not using the old interface, and that could be done without find_by_request since it's present in the old and the new interface).

I still prefer this check being done over the complete interface, but we could change it.

@curita
Copy link
Member Author

@curita curita commented Apr 16, 2015

Now that I think of it again maybe we could issue a warning instead of an exception if SpiderManager (soon to be SpiderLoader) doesn't comply with the interface, seems like it's the more tolerant alternative.

Another option could be to raise an exception if the absolutely needed methods to run are not implemented (which I think are from_settings and load, list seems like it's only needed for scrapy list) and then issue a warning if the other two methods are not there.

I like both options (but the first one slightly more).

@MojoJolo
Copy link

@MojoJolo MojoJolo commented Apr 17, 2015

@kmike pointed me into this issue. Looks like my problem is related.

I'm having an error and getting The find_by_request attribute was not provided. message when executing scrapy crawl or even scrapy version.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

4 participants