= SEP-015: !ScrapyManager and !SpiderManager API refactoring =
[[PageOutline(2-5,Contents)]]
||'''SEP:'''||15||
||'''Title:'''||!ScrapyManager and !SpiderManager API refactoring||
||'''Author:'''||Insophia Team||
||'''Created:'''||2010-03-10||
||'''Status:'''||Final||
== Introduction ==
This SEP proposes a refactoring of the !ScrapyManager and !SpiderManager APIs.
== !SpiderManager ==
* get(spider_name) -> Spider instance
* find_by_request(request) -> list of spider names
* list() -> list of spider names
* remove: fromdomain(), fromurl()
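The interface above could be sketched roughly as follows. This is an illustration only, not the actual implementation: the `Spider` and `Request` classes are hypothetical stand-ins, and the matching rule in `find_by_request()` (comparing the request host against an assumed `allowed_domains` attribute) is a guess, since this SEP does not say how spiders are matched to requests.

```python
from urllib.parse import urlparse


class Spider:
    """Hypothetical stand-in for a spider; not Scrapy's real base class."""

    def __init__(self, name, allowed_domains):
        self.name = name
        self.allowed_domains = list(allowed_domains)


class Request:
    """Hypothetical stand-in for a request that only carries a URL."""

    def __init__(self, url):
        self.url = url


class SpiderManager:
    """Sketch of the proposed interface: get(), find_by_request(), list()."""

    def __init__(self, spiders):
        self._spiders = {spider.name: spider for spider in spiders}

    def get(self, spider_name):
        # Raises KeyError for unknown names; the SEP does not specify this.
        return self._spiders[spider_name]

    def find_by_request(self, request):
        # Assumed rule: the request host equals an allowed domain
        # or is a subdomain of one.
        host = urlparse(request.url).hostname or ""
        return sorted(
            name
            for name, spider in self._spiders.items()
            if any(host == d or host.endswith("." + d)
                   for d in spider.allowed_domains)
        )

    def list(self):
        # Returns spider names, not Spider instances.
        return sorted(self._spiders)
```

Note that `find_by_request()` and `list()` return names rather than instances, so a caller still goes through `get()` to obtain a spider.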
== !ScrapyManager ==
* crawl_request(request, spider=None)
  * calls !SpiderManager.find_by_request(request) if spider is None
  * fails if len(spiders returned) != 1
* crawl_spider(spider)
  * calls spider.start_requests()
* crawl_spider_name(spider_name)
  * calls !SpiderManager.get(spider_name)
  * calls spider.start_requests()
* crawl_url(url)
  * calls spider.make_requests_from_url()
* remove crawl(), runonce()
Instead of using runonce(), commands (such as crawl/parse) would call crawl_* and then start().
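A minimal sketch of how the crawl_* methods might fit together, under several assumptions: all classes are simplified stand-ins, the `scheduled` list and `started` flag are hypothetical replacements for the real engine, `find_by_request()` uses a toy substring match, and `crawl_url()` is assumed to locate its spider through `find_by_request()` (the SEP only says it calls spider.make_requests_from_url()).

```python
class Request:
    """Hypothetical stand-in for a request that only carries a URL."""

    def __init__(self, url):
        self.url = url


class Spider:
    """Hypothetical stand-in exposing the two methods the SEP refers to."""

    def __init__(self, name, start_urls=()):
        self.name = name
        self.start_urls = list(start_urls)

    def make_requests_from_url(self, url):
        return Request(url)

    def start_requests(self):
        return [self.make_requests_from_url(url) for url in self.start_urls]


class SpiderManager:
    def __init__(self, spiders):
        self._spiders = {spider.name: spider for spider in spiders}

    def get(self, spider_name):
        return self._spiders[spider_name]

    def find_by_request(self, request):
        # Toy matching rule (assumption): spider name appears in the URL.
        return [name for name in self._spiders if name in request.url]


class ScrapyManager:
    def __init__(self, spider_manager):
        self.spiders = spider_manager
        self.scheduled = []   # (request, spider) pairs; stands in for the engine
        self.started = False

    def _resolve_spider(self, request):
        names = self.spiders.find_by_request(request)
        if len(names) != 1:
            raise ValueError("expected exactly one spider, got %d" % len(names))
        return self.spiders.get(names[0])

    def crawl_request(self, request, spider=None):
        if spider is None:
            spider = self._resolve_spider(request)
        self.scheduled.append((request, spider))

    def crawl_spider(self, spider):
        for request in spider.start_requests():
            self.crawl_request(request, spider)

    def crawl_spider_name(self, spider_name):
        self.crawl_spider(self.spiders.get(spider_name))

    def crawl_url(self, url):
        # Assumption: the spider is looked up from the URL itself.
        spider = self._resolve_spider(Request(url))
        self.crawl_request(spider.make_requests_from_url(url), spider)

    def start(self):
        # Placeholder: the real method would run the engine.
        self.started = True
```

In this sketch every crawl_* method only queues work, and nothing runs until start() is called, which is the split that replaces runonce().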
== Changes to Commands ==
* if is_url(arg):
  * calls !ScrapyManager.crawl_url(arg)
* else:
  * calls !ScrapyManager.crawl_spider_name(arg)
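The dispatch above might look like this inside a command. It is a sketch under assumptions: `is_url()` is named in this SEP but not defined, so the check below (an http or https scheme plus a host) is a guess, and `RecordingManager` and `run_crawl_command()` are hypothetical names used only to make the example self-contained.

```python
from urllib.parse import urlparse


def is_url(arg):
    # Assumed definition: an http(s) scheme and a non-empty host.
    parts = urlparse(arg)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


class RecordingManager:
    """Hypothetical stand-in for ScrapyManager that records the calls it gets."""

    def __init__(self):
        self.calls = []

    def crawl_url(self, url):
        self.calls.append(("crawl_url", url))

    def crawl_spider_name(self, spider_name):
        self.calls.append(("crawl_spider_name", spider_name))

    def start(self):
        self.calls.append(("start",))


def run_crawl_command(manager, args):
    # Schedule each argument as a URL or a spider name, then start once.
    for arg in args:
        if is_url(arg):
            manager.crawl_url(arg)
        else:
            manager.crawl_spider_name(arg)
    manager.start()
```

With this shape, an argument such as `example.com` (no scheme) would be treated as a spider name, which may or may not be the intended behaviour.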
== Pending issues ==
* Should we rename !ScrapyManager.crawl_* to schedule_* or add_*?
* !SpiderManager.find_by_request(request) or !SpiderManager.search(request=request)?