[MRG+1] Enable genspider command outside project folder #2052
Current coverage is 83.32%
@@             master    #2052   diff @@
==========================================
  Files         161      161
  Lines        8678     8682     +4
  Methods         0        0
  Messages        0        0
  Branches     1272     1274     +2
==========================================
+ Hits         7231     7234     +3
  Misses       1196     1196
- Partials      251      252     +1
+1, this is cool! Only thing missing now is to update docs. :)
The short description for genspider should also be updated, and it would be nice to mention the meaning of the arguments when using standalone (like, how
This PR fails when setting the template to crawl, xmlfeed or csvfeed, because those templates include code to import the items module. I'm going to look into how to fix it and then update the PR.
I was looking at how to make this work for the 'crawl', 'xmlfeed' and 'csvfeed' templates. The issue is that those templates import the items module. We could solve that by removing the import and the uses of the Item class in the spider code, as in the 'basic' template. The crawl template would become:

    import scrapy
    from scrapy.linkextractors import LinkExtractor
    from scrapy.spiders import CrawlSpider, Rule


    class $classname(CrawlSpider):
        name = '$name'
        allowed_domains = ['$domain']
        start_urls = ['http://www.$domain/']

        rules = (
            Rule(LinkExtractor(allow=r'Items/'), callback='parse_item', follow=True),
        )

        def parse_item(self, response):
            pass

Or we could employ a template engine, such as jinja2, but that looks like overkill. Thoughts?
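As a rough illustration of why the simplified crawl template needs no project context, here is a minimal sketch that fills its `$classname`, `$name` and `$domain` placeholders with Python's standard `string.Template`. The `render_spider` helper and its class-naming rule are hypothetical, made up for this example, and are not Scrapy's API; the sketch only assumes that the placeholders are plain `$`-style substitutions.

```python
from string import Template

# The proposed 'crawl' template, with the items import removed.
CRAWL_TEMPLATE = Template('''\
import scrapy
from scrapy.linkextractors import LinkExtractor
from scrapy.spiders import CrawlSpider, Rule


class $classname(CrawlSpider):
    name = '$name'
    allowed_domains = ['$domain']
    start_urls = ['http://www.$domain/']

    rules = (
        Rule(LinkExtractor(allow=r'Items/'), callback='parse_item', follow=True),
    )

    def parse_item(self, response):
        pass
''')


def render_spider(name, domain):
    """Fill the template placeholders (hypothetical helper, not Scrapy's API)."""
    # Assumed naming rule for this sketch: 'my_shop' -> 'MyShopSpider'.
    classname = ''.join(part.capitalize() for part in name.split('_')) + 'Spider'
    return CRAWL_TEMPLATE.substitute(classname=classname, name=name, domain=domain)


source = render_spider('example', 'example.com')
print(source)
```

Since the rendered file only imports from `scrapy` itself, nothing in it depends on a surrounding project, which is the property the standalone use case relies on.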
+1 on changing the
@stummjr looks good!
The build is failing because of an unrelated coverage error.
sorry, I should've said
Thanks @stummjr!
[backport][1.1] Enable genspider command outside project folder (PR #2052)
This PR enables the genspider CLI command even when the working dir is not a scrapy project. The rationale is that some users (including me) are used to creating standalone spiders and running them with the runspider command, because it's a quick and convenient way to fire up simple spiders. Having genspider available would make it even quicker.