# Intro to Web Scraping

## Dessert First: A Taste of Things to Come

You know what the words "web" and "scraping" mean, but perhaps not what they mean together. To get a feel for it, we'll take a quick look at a page with [a list of quotes](http://quotes.toscrape.com/).

![quotes](images/quotes.png)

Suppose we want to extract the text of the quotes on the page so we can start our own inspirational story!

We need to install two packages: `requests` and `BeautifulSoup` (published on PyPI as `beautifulsoup4`). You can install them by typing:

    pip install requests beautifulsoup4
    
We'll go into more detail about these packages later. With them installed, the short script below extracts the quotes from the first page of the site.

    import requests
    from bs4 import BeautifulSoup

    # Download the HTML that makes up the page
    page = requests.get('http://quotes.toscrape.com/')

    # Let BeautifulSoup parse the content
    soup = BeautifulSoup(page.content, 'html.parser')

    # Find all the quotes. Returns span element <span class="text" itemprop="text">...</span>
    quotes = soup.find_all('span', class_='text')

    # Use the .text attribute to get just the text inside the elements. Slicing `[1:-1]` removes the quotation marks.
    quotes = [quote.text[1:-1] for quote in quotes]

    # Look at the output
    quotes

Output:

    ['The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.',
     'It is our choices, Harry, that show what we truly are, far more than our abilities.',
     'There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.',
     'The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.',
     "Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.",
     'Try not to become a man of success. Rather become a man of value.',
     'It is better to be hated for what you are than to be loved for what you are not.',
     "I have not failed. I've just found 10,000 ways that won't work.",
     "A woman is like a tea bag; you never know how strong it is until it's in hot water.",
     'A day without sunshine is like, you know, night.']

As you can see, the quotes match what we see on the site.
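
One caveat about the `[1:-1]` slice: it drops the first and last character no matter what they are, so it only works if every quote really is wrapped in exactly one quotation mark on each side. The sketch below shows a slightly more defensive alternative using `str.strip`. The helper name `strip_quote_marks` and the sample strings are made up for illustration, and it assumes the site wraps quotes in curly (or straight) double quotation marks.

```python
def strip_quote_marks(text):
    """Remove straight or curly double quotation marks from both ends of text."""
    # str.strip removes any of the listed characters from the start and end,
    # and leaves the string untouched if none are present.
    return text.strip('“”"')


wrapped = '“A day without sunshine is like, you know, night.”'
bare = 'A day without sunshine is like, you know, night.'

print(strip_quote_marks(wrapped))  # quotation marks removed
print(strip_quote_marks(bare))     # unchanged, nothing to strip
print(bare[1:-1])                  # slicing cuts off the "A" and the final "."
```

Note that `strip` removes every matching character at each end, so a quote that itself ends with a quoted phrase would lose that inner quotation mark too. For this page either approach should be fine.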

## What is Web Scraping?

Web scraping is the practice of programmatically extracting data from web pages. The process usually consists of three parts:

1. Download the page content.
2. Extract the data you need.
3. Store the data somewhere.
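
The three parts above can be sketched as one small function each, reusing the quotes page from earlier. This is a minimal illustration, not a definitive implementation: the function names, the ten-second timeout, and the choice of a one-column CSV file for storage are all assumptions made for this example.

```python
import csv

import requests
from bs4 import BeautifulSoup


def download_page(url):
    """Step 1: download the page content."""
    response = requests.get(url, timeout=10)  # timeout value is an arbitrary choice
    response.raise_for_status()  # raise an error on 4xx/5xx responses
    return response.content


def extract_quotes(html):
    """Step 2: extract the data you need (here, the text of each quote)."""
    soup = BeautifulSoup(html, 'html.parser')
    return [span.get_text() for span in soup.find_all('span', class_='text')]


def store_quotes(quotes, path):
    """Step 3: store the data somewhere (here, a one-column CSV file)."""
    with open(path, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        writer.writerow(['quote'])  # header row
        writer.writerows([quote] for quote in quotes)


def scrape(url, path):
    """Run the three steps in order: download, extract, store."""
    store_quotes(extract_quotes(download_page(url)), path)
```

Calling `scrape('http://quotes.toscrape.com/', 'quotes.csv')` would then write the quotes from the first page to a file named `quotes.csv`. Keeping the steps separate means the extraction logic can be tried on a saved HTML string without touching the network.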
