[MRG+1] some improvements to overview page #1106
Conversation
In the ``parse`` callback, we scrape the links to the questions and
yield a few more requests to be processed, registering for them
the method ``parse_question`` as the callback to be called when the
requests are complete.
I think here we should explain that these requests are scheduled and processed asynchronously.
Right, let me fix that!
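The point under discussion, that the yielded requests are scheduled and processed asynchronously rather than one after another, can be sketched with the standard library's ``asyncio``. This is only an illustration of the scheduling idea, not Scrapy's actual machinery (Scrapy is built on Twisted); the ``fetch`` coroutine and the URLs below are made-up stand-ins:

```python
import asyncio

# Illustrative stand-in for downloading a page; no real network involved.
async def fetch(url):
    await asyncio.sleep(0.01)  # simulate network latency
    return f"<html>body of {url}</html>"

async def parse_question(url, results):
    # Callback-style handler for one question page.
    body = await fetch(url)
    results.append((url, len(body)))

async def parse(start_url, results):
    # Pretend we scraped these question links from the start page.
    links = [f"{start_url}/questions/{i}" for i in range(3)]
    # Schedule one task per link; they run concurrently,
    # much like Scrapy schedules yielded Requests.
    await asyncio.gather(*(parse_question(link, results) for link in links))

results = []
asyncio.run(parse("http://example.com", results))
print(results)
```

Because the three `parse_question` tasks overlap, the whole batch takes roughly the latency of one fetch, not three, which is the advantage the excerpt hints at.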
For more information about XPath see the `XPath reference`_.
Here you notice one of the main advantages about Scrapy: requests are
There was a single request in start_urls; it should be easier to see the advantage for the requests sent from the parse method.
Yeah, I was in doubt about that one... Better move it one paragraph down.
@eliasdorneles a good overview, I like it 👍 I'm trying to attack it from a position of a person who can hack together a spider using requests + concurrent.futures + pyquery + json. Please don't take it as a criticism :) Why should such person bother with Scrapy?
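For context, the hand-rolled approach described above might look roughly like this. This is a hedged sketch: ``download`` is a stub standing in for a real ``requests.get`` call, and the URLs are invented. It makes the comparison concrete, since Scrapy bundles the scheduling, retrying, throttling, and extraction plumbing that this code leaves entirely to the author:

```python
from concurrent.futures import ThreadPoolExecutor
import json

# Stub standing in for requests.get(url).text -- no real network here.
def download(url):
    return f'{{"url": "{url}", "title": "page at {url}"}}'

def scrape(url):
    # In the hand-rolled version this would be pyquery/parsel extraction
    # of the downloaded HTML; here the stub already returns JSON.
    return json.loads(download(url))

urls = [f"http://example.com/page/{i}" for i in range(4)]

# Fan the downloads out over a thread pool, the way the
# requests + concurrent.futures approach would.
with ThreadPoolExecutor(max_workers=4) as pool:
    items = list(pool.map(scrape, urls))

print(items[0])
```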
Don't worry @kmike, I appreciate your feedback a good deal, always good points! :)
Hey @kmike -- I've just updated addressing your concerns and did some more editing.
provide any API or mechanism to access that info programmatically. Scrapy can
help you extract that information.
Once you're ready to dive in more, you can :ref:`follow the tutorial
and build a full-blown Scrapy project <intro-tutorial>`.
I think this note can be moved to the end: it is unclear whether users should continue reading the overview or go straight to the tutorial. It seems the reason you've put it here is that in addition to 'scrapy runspider' there is 'scrapy crawl' with full-blown project support, and you wanted to mention it. We could add project support to the list of Scrapy advantages: Scrapy helps to organize the code, so that projects with tens or hundreds of spiders remain manageable.
//cc @pablohoffman @shaneaevans @dangra and everyone else - thoughts? Use https://github.com/eliasdorneles/scrapy/blob/overview-page-improvements/docs/intro/overview.rst link to read it. I think this introduction is nearly perfect :)
Everyone Else here. Looks good, I like it. Well, meh, I actually clicked on that AAWS link and thought I would get some info on how to extract API data. Amazon is big enough; maybe this could point somewhere else, like something on http://www.programmableweb.com/ ? There are some minor things like misplaced or missing commas, but those can be ignored.
Hey @nyov -- since this is already merged, please feel free to send a PR to fix those commas or whatever. =)
Hey folks!
Here is my proposal for addressing issue #609 (it replaces PR #1023, while keeping some of its ideas).
Since the overview is "the pitch", I tried my best to make it short and to the point.
Summary of the changes:
Note: the example spider also showcases the features from PRs #1081 and #1086, which I assume will also be in the Scrapy 1.0 release.
So, what do you think, does this look good?
Thank you!