
[MRG+1] some improvements to overview page #1106

Merged
Merged 7 commits into scrapy:master on Mar 27, 2015

Conversation

eliasdorneles (Member)

Hey folks!

Here is my proposal for addressing issue #609 (it replaces PR #1023, while keeping some of its ideas).

Since the overview is "the pitch", I tried my best to make it short and to the point.

Summary of the changes:

  • Added example spider showcasing both scraping and crawling (link following)
  • Wrote an explanation of what the code does, without delving much into details
  • Summarized table of features in the end, and reordered them based on my gut feeling
  • Cut some text
  • Cut some more

Note: the example spider also showcases the features from PRs #1081 and #1086, which I assume will also be in the Scrapy 1.0 release.

So, what do you think, does this look good?

Thank you!

In the ``parse`` callback, we scrape the links to the questions and
yield a few more requests to be processed, registering for them
the method ``parse_question`` as the callback to be called when the
requests are complete.
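The pattern the excerpt describes can be sketched without Scrapy at all. Below is a toy model of the request/callback flow, using a hypothetical in-memory "site" in place of real HTTP responses (the `Request` class, page data, and driver loop are all illustrative, not Scrapy's actual internals, which schedule and process requests asynchronously):

```python
# Toy model of the request/callback flow described above -- not Scrapy
# itself, just a sketch of the idea, with hypothetical page data standing
# in for real HTTP responses.

class Request:
    def __init__(self, url, callback):
        self.url = url
        self.callback = callback

# Hypothetical "site": a listing page linking to two question pages.
PAGES = {
    "/questions": ["/questions/1", "/questions/2"],
    "/questions/1": {"title": "First question"},
    "/questions/2": {"title": "Second question"},
}

def parse(url):
    # Scrape the links and yield a few more requests, registering
    # parse_question as the callback for each of them.
    for link in PAGES[url]:
        yield Request(link, callback=parse_question)

def parse_question(url):
    # Yield the scraped item for a single question page.
    yield PAGES[url]

def crawl(start_url, start_callback):
    # The scheduler keeps a queue of pending requests; callbacks may
    # yield new requests (queued) or scraped items (collected).
    queue, items = [Request(start_url, start_callback)], []
    while queue:
        request = queue.pop(0)
        for result in request.callback(request.url):
            if isinstance(result, Request):
                queue.append(result)
            else:
                items.append(result)
    return items

print(crawl("/questions", parse))
# -> [{'title': 'First question'}, {'title': 'Second question'}]
```

In real Scrapy the queue is driven by an asynchronous engine, so many requests are in flight at once rather than processed one by one as in this sketch.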
Member

I think here we should explain that these requests are scheduled and processed asynchronously.

Member Author

Right, let me fix that!


For more information about XPath see the `XPath reference`_.
Here you notice one of the main advantages about Scrapy: requests are
Member

there was a single request in start_urls; it should be easier to see the advantage with the requests sent from the parse method

Member Author

Yeah, I was in doubt about that one... Better move it one paragraph down.

kmike (Member) commented Mar 26, 2015

@eliasdorneles a good overview, I like it 👍

I'm trying to attack it from the position of a person who can hack together a spider using requests + concurrent.futures + pyquery + json. Please don't take it as criticism :) Why should such a person bother with Scrapy?
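For concreteness, the hand-rolled alternative kmike describes might look roughly like this (a minimal sketch: `fetch` is a stand-in for a real HTTP call, which in practice would use requests + pyquery; the URLs are made up):

```python
# Sketch of the "requests + concurrent.futures" approach: fan out over a
# fixed URL list with a thread pool. fetch() fakes the network call by
# deriving a "title" from the URL itself.
from concurrent.futures import ThreadPoolExecutor

def fetch(url):
    # Stand-in for requests.get(url) + parsing the page with pyquery.
    return {"url": url, "title": url.rsplit("/", 1)[-1].title()}

def scrape_all(urls):
    with ThreadPoolExecutor(max_workers=4) as pool:
        # pool.map preserves input order even though fetches run concurrently.
        return list(pool.map(fetch, urls))

items = scrape_all(["https://example.com/first", "https://example.com/second"])
print(items)
```

This covers the concurrency part, but link following, scheduling, retries, throttling, and export formats are exactly the things such a script ends up reimplementing, which is the overview's pitch for Scrapy.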

eliasdorneles (Member Author)

Don't worry @kmike, I appreciate your feedback a good deal, always good points! :)

eliasdorneles (Member Author)

Hey @kmike -- I've just updated it to address your concerns and did some more editing.
Can you please have a look again?
Thank you!

provide any API or mechanism to access that info programmatically. Scrapy can
help you extract that information.
Once you're ready to dive in more, you can :ref:`follow the tutorial
and build a full-blown Scrapy project <intro-tutorial>`.
Member

I think this note can be moved to the end - it is unclear if users should continue reading the overview, or if they should go to the tutorial. It seems the reason you've put it here is that in addition to 'scrapy runspider' there is 'scrapy crawl' with full-blown project support, and you wanted to mention it. We can add project support to the list of Scrapy advantages - Scrapy helps to organize the code, so that projects with tens and hundreds of spiders are still manageable.

kmike (Member) commented Mar 26, 2015

//cc @pablohoffman @shaneaevans @dangra and everyone else - thoughts? Use https://github.com/eliasdorneles/scrapy/blob/overview-page-improvements/docs/intro/overview.rst link to read it.

I think this introduction is nearly perfect :)
+1 to merge it once we have the required PRs merged.

@kmike kmike changed the title some improvements to overview page [MRG+1] some improvements to overview page Mar 26, 2015
nyov (Contributor) commented Mar 27, 2015

Everyone Else here. Looks good, I like it.

Well, meh, I actually clicked on that AAWS link, thought I would get some info on how to extract API data. Amazon is big enough, maybe this could point to somewhere else, like something on http://www.programmableweb.com/ ?

I would ask for a single, tiny ``response.xpath`` query in the spider example, just to let old-timers know they aren't deprecated yet :)
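The kind of tiny query nyov asks for would read something like ``response.xpath('//a/@href')`` in a Scrapy spider. The same path idea can be shown with only the standard library, since ElementTree supports a small XPath subset (the HTML snippet here is made up for illustration):

```python
# A tiny XPath-style query using the stdlib ElementTree, whose findall()
# accepts a limited XPath subset ('.//a' standing in for '//a').
import xml.etree.ElementTree as ET

html = "<div><a href='/questions/1'>Q1</a><a href='/questions/2'>Q2</a></div>"
root = ET.fromstring(html)

# Collect the href attribute of every <a> element in the document.
links = [a.get("href") for a in root.findall(".//a")]
print(links)  # -> ['/questions/1', '/questions/2']
```

Scrapy's own selectors run full XPath 1.0 over real (possibly broken) HTML, so this is only an approximation of what ``response.xpath`` does.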

Some minor things like misplaced or missing commas, but that can be ignored.

curita added a commit that referenced this pull request Mar 27, 2015
[MRG+1] some improvements to overview page
@curita curita merged commit f4e241a into scrapy:master Mar 27, 2015
eliasdorneles (Member Author)

Hey @nyov -- since this is already merged, please feel free to send a PR to fix those commas or whatever. =)

4 participants