# Webscraping Quiz

We are going to webscrape: https://quotes.toscrape.com/
You will have an expanding set of tasks that starts with collecting a

In [7]:
import requests
from bs4 import BeautifulSoup

# Tasks:

4 Levels:<br>
Level 1 (B): Grab the first quote from the front page. Print out the quote. <br>
Level 2 (B+): Grab all the quotes from the front page. Store quotes in a list or dictionary (see Level 4).<br>
Level 3 (A): Grab all the quotes from all 10 pages of the website. Store quotes in a list or dictionary (see Level 4).<br>
Level 4 (A+): Create a dictionary where the keys are people and the values are lists of quotes by that person. Use all quotes from Level 3.<br>

Full points given to programmatic solutions that effectively use BeautifulSoup.<br><br>
Use an intelligent combination of code and text blocks to demonstrate your understanding of using Python and BeautifulSoup to solve this problem.

Variables for later

In [2]:
site = "https://quotes.toscrape.com"
page_append = "/page/{page}" # Use with str.format(page=n)
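As a quick check of how the template above expands, the sketch below builds the URL for each page with str.format. It assumes pages are numbered 1 through 10, as the task description states; the name page_urls is introduced here only for illustration.

```python
# Same values as the variables defined above.
site = "https://quotes.toscrape.com"
page_append = "/page/{page}"

# Build one URL per page; page numbers start at 1, not 0.
page_urls = [site + page_append.format(page=n) for n in range(1, 11)]

print(page_urls[0])
print(page_urls[-1])
```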

Level 1

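Requests the front page of the website and parses it with BeautifulSoup. Because find returns only the first match, the first div with the quote class is the first quote on the page; the quote itself is the text of the span with the text class inside that div.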

In [8]:
resp = requests.get(site, timeout=20)
soup = BeautifulSoup(resp.text, "html.parser")

quote = soup.find("div", class_="quote").find("span", class_="text").get_text()

quote

'“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”'
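One thing worth knowing about the chained find calls above: find returns None when nothing matches, so the chain raises AttributeError on a page without quotes. The sketch below is one possible guard, not part of the original solution. The helper name first_quote_text and the two inline HTML strings are hypothetical and exist only so the example runs without a network request.

```python
from bs4 import BeautifulSoup


def first_quote_text(html):
    """Return the text of the first quote in html, or None if there is none."""
    soup = BeautifulSoup(html, "html.parser")
    quote_div = soup.find("div", class_="quote")
    if quote_div is None:
        # No quote container on this page.
        return None
    text_span = quote_div.find("span", class_="text")
    return text_span.get_text() if text_span is not None else None


# Hypothetical markup that mimics the structure the code above relies on.
with_quote = '<div class="quote"><span class="text">Sample quote.</span></div>'
without_quote = "<p>No quotes here.</p>"

print(first_quote_text(with_quote))
print(first_quote_text(without_quote))
```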

Level 2

Same approach as Level 1, except that find_all returns every div with the quote class instead of only the first. A list comprehension then pulls the quote text out of each one.

In [4]:
resp = requests.get(site, timeout=20)
soup = BeautifulSoup(resp.text, "html.parser")

quotes = [q.find("span", class_="text").get_text() for q in soup.find_all("div", class_="quote")]

quotes

['“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
 '“It is our choices, Harry, that show what we truly are, far more than our abilities.”',
 '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
 '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”',
 "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
 '“Try not to become a man of success. Rather become a man of value.”',
 '“It is better to be hated for what you are than to be loved for what you are not.”',
 "“I have not failed. I've just found 10,000 ways that won't work.”",
 "“A woman is like a tea bag; you never know how strong it is until it's in hot water.”",
 '“A day without sunshine is like, you know, night.”']
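The same extraction can also be written with a CSS selector through BeautifulSoup's select method, which some readers find easier to scan than nested find calls. This is an alternative sketch rather than the approach used above; sample_html is a made-up two-quote snippet that assumes the same div.quote / span.text structure.

```python
from bs4 import BeautifulSoup

# Hypothetical snippet that mimics the markup assumed by the code above.
sample_html = """
<div class="quote">
  <span class="text">First sample quote.</span>
  <small class="author">Author A</small>
</div>
<div class="quote">
  <span class="text">Second sample quote.</span>
  <small class="author">Author B</small>
</div>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# "div.quote span.text" matches every span.text nested inside a div.quote.
sample_quotes = [span.get_text() for span in soup.select("div.quote span.text")]

print(sample_quotes)
```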

Level 3

Repeats the Level 2 extraction for each of the 10 pages, building each page's URL from the page_append template.
The result is a list of lists, where each inner list holds the quotes from one page of the website.
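A list of lists keeps the page boundaries, which makes it easy to check that every page returned quotes. If one flat list is wanted instead, the inner lists can be chained together. The sketch below uses placeholder data; sample_pages is a hypothetical stand-in for the per-page lists, not real output.

```python
from itertools import chain

# Hypothetical stand-in for the per-page lists: two "pages" of placeholder strings.
sample_pages = [["quote 1", "quote 2"], ["quote 3"]]

# chain.from_iterable walks each inner list in order and yields its items.
all_quotes = list(chain.from_iterable(sample_pages))

print(all_quotes)
```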

In [5]:
quotes_by_page = []

for page in range(1, 11):  # pages are numbered 1 through 10
    new_url = site + page_append.format(page=page)
    
    resp = requests.get(new_url, timeout=20)
    soup = BeautifulSoup(resp.text, "html.parser")

    quotes_by_page.append([q.find("span", class_="text").get_text() for q in soup.find_all("div", class_="quote")])

quotes_by_page

[['“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
  '“It is our choices, Harry, that show what we truly are, far more than our abilities.”',
  '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
  '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”',
  "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
  '“Try not to become a man of success. Rather become a man of value.”',
  '“It is better to be hated for what you are than to be loved for what you are not.”',
  "“I have not failed. I've just found 10,000 ways that won't work.”",
  "“A woman is like a tea bag; you never know how strong it is until it's in hot water.”",
  '“A day without sunshine is like, you know, night.”'],
 ["“This life is what you make it. No matter

Level 4

Like Level 3, this iterates over the 10 pages of the website, but for each quote it also reads the author from the small element with the author class.
The result is a dictionary in which each author name maps to a list of that author's quotes.
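The grouping step itself does not depend on scraping. As an alternative to checking whether the key already exists, collections.defaultdict creates the empty list on first access. The sketch below shows that pattern on placeholder data; sample_pairs and grouped are hypothetical names, and the pairs are not real quotes.

```python
from collections import defaultdict

# Hypothetical (author, quote) pairs standing in for scraped data.
sample_pairs = [
    ("Author A", "quote 1"),
    ("Author B", "quote 2"),
    ("Author A", "quote 3"),
]

grouped = defaultdict(list)
for author, text in sample_pairs:
    # A missing author key is created with an empty list automatically.
    grouped[author].append(text)

# Convert back to a plain dict so later lookups of unknown authors raise KeyError.
grouped = dict(grouped)

print(grouped)
```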

In [13]:
quotes_by_person = {}

for page in range(1, 11):  # pages are numbered 1 through 10
    new_url = site + page_append.format(page=page)
    
    resp = requests.get(new_url, timeout=20)
    soup = BeautifulSoup(resp.text, "html.parser")

    for q in soup.find_all("div", class_="quote"):
        text = q.find("span", class_="text").get_text()
        author = q.find("small", class_="author").get_text()

        # Start an empty list the first time an author appears.
        if author not in quotes_by_person:
            quotes_by_person[author] = []
        
        quotes_by_person[author].append(text)

quotes_by_person

{'Albert Einstein': ['“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
  '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
  '“Try not to become a man of success. Rather become a man of value.”',
  "“If you can't explain it to a six year old, you don't understand it yourself.”",
  '“If you want your children to be intelligent, read them fairy tales. If you want them to be more intelligent, read them more fairy tales.”',
  '“Logic will get you from A to Z; imagination will get you everywhere.”',
  '“Any fool can know. The point is to understand.”',
  '“Life is like riding a bicycle. To keep your balance, you must keep moving.”',
  '“If I were not a physicist, I would probably be a musician. I often think in music. I live my daydreams in music. I see my life in terms of music.”',
  '“Anyone who has never made a mistake has never tried a