Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG+1] docs: update overview spider code to use toscrape.com and minor changes #2249

Merged
merged 3 commits into from Sep 16, 2016

Conversation

eliasdorneles
Copy link
Member

So, this will replace the spider example code from the overview that
scrapes questions from StackOverflow by a spider scraping quotes (much
like the one in the tutorial), and upates the text around it to be
consistent.

There are also minor wording changes plus a small Sphinx/reST syntax fix
on the features list at the bottom (it was creating a definition list,
causing one line to be bold).

Does this look good?
Thanks!

So, this will replace the spider example code from the overview that
scrapes questions from StackOverflow by a spider scraping quotes (much
like the one in the tutorial), and upates the text around it to be
consistent.

There are also minor wording changes plus a small Sphinx/reST syntax fix
on the features list at the bottom (it was creating a definition list,
causing one line to be bold).
of them as they finish.
an argument. In the ``parse`` callback we loop through the quote elements
using a CSS Selector, yield a Python dict with the extracted quote text and author,
look for a link to the next page and schedules another request using the same
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and schedules another request using the same: schedules should be in 1st person of plural.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed!

@codecov-io
Copy link

codecov-io commented Sep 15, 2016

Current coverage is 83.36% (diff: 100%)

Merging #2249 into master will not change coverage

Powered by Codecov. Last update 2f60f2a...75531e4

Copy link
Contributor

@redapple redapple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

'link': response.url,
}
next_page = response.css('li.next a::attr("href")').extract_first()
if next_page:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think next_page is None could be slightly better because an empty string is a valid relative URL

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed!

@redapple redapple changed the title docs: update overview spider code to use toscrape.com and minor changes [MRG+1] docs: update overview spider code to use toscrape.com and minor changes Sep 16, 2016
@eliasdorneles
Copy link
Member Author

hey @kmike, I'll merge this to move things forward, feel free to point out if you have any other concern.
thanks for reviewing!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants