
Tutorial needs an update #107

Closed · johnmee opened this issue Apr 3, 2012 · 4 comments

johnmee commented Apr 3, 2012

I'm just trying to run through the tutorial and am thinking quite a few things have changed since it was last revised...

For example:

  • It talks about creating a directory "dmoz/spiders", which contrasts with the comment in "spiders/__init__.py" suggesting we run "scrapy genspider ...".
  • The code in the tutorial subclasses "BaseSpider", yet genspider creates an almost identical spider that subclasses "CrawlSpider".
  • Doing as instructed (creating dmoz/spiders/dmoz_spider.py and running scrapy crawl ...) fails with "Spider not found: dmoz", but following the comments in the codebase works OK.
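For reference, the hand-written spider the tutorial described at the time looked roughly like this. This is a sketch assuming the Scrapy 0.14-era API (spiders subclassed `BaseSpider`, which later releases renamed to `scrapy.Spider`); it requires Scrapy to be installed and a "dmoz" project as in the tutorial:

```python
# Tutorial-era spider sketch (Scrapy ~0.14 API; BaseSpider was later
# renamed scrapy.Spider). Saved as dmoz/spiders/dmoz_spider.py.
from scrapy.spider import BaseSpider


class DmozSpider(BaseSpider):
    # "scrapy crawl" looks spiders up by this name, so a
    # "Spider not found: dmoz" error usually means the module
    # containing this class isn't importable from the project.
    name = "dmoz"
    allowed_domains = ["dmoz.org"]
    start_urls = [
        "http://www.dmoz.org/Computers/Programming/Languages/Python/Books/",
    ]

    def parse(self, response):
        # The tutorial of that era simply dumped each page body to a file.
        filename = response.url.split("/")[-2]
        open(filename, "wb").write(response.body)
```

Run from the project's top-level directory with `scrapy crawl dmoz`.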

That area seems to be the only bit that has changed significantly.

@barraponto (Contributor)

It seems scrapy genspider defaults to the crawler template, yet I'd rather use -t basic...
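For anyone hitting the same surprise, the template genspider uses can be chosen explicitly. The commands below are a sketch assuming Scrapy is installed and the tutorial's "dmoz" project; available template names can differ across Scrapy versions:

```shell
# List the spider templates shipped with this Scrapy version
scrapy genspider -l

# Generate a spider from the "basic" template explicitly,
# rather than relying on whatever the default template is
scrapy genspider -t basic dmoz dmoz.org
```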

@pablohoffman (Member)

The Scrapy tutorial doesn't cover the genspider command. That's more of an advanced command, for when you're dealing with a project with lots of spiders and have predefined templates for some of them.

I don't understand the last point. Are you saying the code in the tutorial is wrong? Which part? Did you know there is fully functional code in the scrapy/dirbot GitHub project?


johnmee commented Apr 16, 2012

You're right, the tutorial doesn't cover the genspider command. When I followed the tutorial and got to the "Our first spider" section, it conflicted with the comment in spiders/__init__.py ("To create first spider ... use scrapy genspider ..."). This caused me to lose confidence in the tutorial. Would it be better to have the tutorial go straight to showing how to use genspider? Or at least mention that you're demonstrating first principles and that genspider comes later.

With the last point: from memory, I didn't appreciate at the time that the commands available vary with the current working directory. It made more sense once I realized that, so perhaps put more emphasis on the text "go to the project’s top level directory" and explain why that's relevant.
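A quick way to see the working-directory behavior described above (a sketch assuming Scrapy is installed and that "tutorial" is the project directory created by the tutorial):

```shell
# Run outside any project: scrapy lists only its global commands
scrapy -h

# Project-only commands such as "crawl" appear once you are inside
# the project's top-level directory (the one containing scrapy.cfg)
cd tutorial
scrapy crawl dmoz
```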

No big drama. Just keen to help because the tutorial is great and Scrapy even better!

pablohoffman added a commit that referenced this issue Apr 19, 2012
…r command, which may be considered as an advanced feature. refs #107
pablohoffman added a commit that referenced this issue Apr 19, 2012
…ds are available when run from project directory. refs #107
@pablohoffman (Member)

@johnmee thanks for your suggestions; I've implemented pretty much all of them (see the referenced commits).

3 participants