Review web scraping lesson and get it ready for publication #35

ostephens · 2017-05-31T11:01:36Z

The web scraping lesson https://github.com/data-lessons/library-webscraping was initially developed by @timtomch

The contents of @timtomch's lesson have been copied to this repository to be reviewed and amended before publication as part of the Library Carpentry materials.

Use https://github.com/data-lessons/library-webscraping/issues for issues to be worked on during the sprint

drjwbaker · 2017-06-04T14:10:54Z

When someone gets a minute, please report back on progress during the sprint and close this issue. Ta!

ldko · 2017-06-05T15:38:23Z

During the 2017 sprint we worked on setup and README files and outlined a plan to:

Remove the Chrome extensions section to focus more on scraping with Python as a way to learn and apply programming concepts, rather than using an available tool that may not be supported for very long.
Add CSS selector examples in addition to the existing XPath examples, as they will lead into the episodes about selecting items with BeautifulSoup.
Replace Scrapy instructions with Requests and BeautifulSoup (#6, #14).
We also laid out a general outline of how the teaching of BeautifulSoup (#11, #12) might go, following the structure of a scraping tutorial used by the University of Oklahoma but using the UN site to get data about Security Council resolutions for the examples.

We also discussed modifying where/how ethics are brought up in the lesson and possible benefits of using URLs from archive.org Wayback Machine for the scraping examples, since those should be static as opposed to using a live production site that may change at any time. Also, we set up a GitHub Project in the library-webscraping repo for tracking progress.

Because fewer people worked on the web scraping lesson on day 2 of the sprint, not much was done to actually make the content structure changes that were proposed on the first day.

ostephens added the mozsprint label May 31, 2017

weaverbel closed this as completed Jun 1, 2018
