This repository contains the source code for the fourteen examples included in the book Practical Web Scraping for Data Science: Best Practices and Examples with Python by Seppe vanden Broucke and Bart Baesens.
See http://www.webscrapingfordatascience.com/ for more information, or buy the book on Amazon.
The following examples are included and explained in the book and available here under python-examples:

- Scraping Hacker News, see hacker-news folder
- Using the Hacker News API, see hacker-news folder
- Quotes to Scrape, see quotes-to-scrape folder
- Books to Scrape, see books-to-scrape folder
- Scraping GitHub Stars, see github folder
- Scraping Mortgage Rates, see mortgage-rates folder
- Scraping and Visualizing IMDB Ratings, see imdb folder
- Scraping IATA Airline Information, see iata folder
- Scraping and Analyzing Web Forum Interactions, see web-forum folder
- Collecting and Clustering a Fashion Data Set, see fashion-clustering folder
- Sentiment Analysis of Scraped Amazon Reviews, see product-reviews folder
- Scraping and Analyzing News Articles, see news-articles folder
- Scraping and Analyzing a Wikipedia Graph, see wikipedia-graph folder
- Scraping and Visualizing a Board Members Graph, see board-members folder
- Breaking CAPTCHA’s Using Deep Learning, see captcha-cracking folder