# Module 2 Assessment

Welcome to your Mod 2 Assessment. You will be tested on your understanding of concepts and your ability to solve problems covered in class and in the curriculum.

Use any libraries you want to solve the problems in the assessment.

You will have up to two hours to complete this assessment.

The sections of the assessment are:

- Accessing Data Through APIs
- Object Oriented Programming
- SQL and Relational Databases
- HTML, CSS and Web Scraping
- Other Database Structures (MongoDB)

In this assessment you will be exploring two datasets: Pokemon and Quotes.

In [1]:
# import the necessary libraries
import requests
import json
import pandas as pd
import sqlite3
from bs4 import BeautifulSoup
import pymongo

## Part 1: Accessing Data Through APIs

In this section we'll be using PokeAPI to get data on Pokemon. Let's first define functions to get information from the API. Provided below is a URL that will get you started with the first 151 Pokemon! Run the cell below to see what we get.

In [22]:
url = 'https://pokeapi.co/api/v2/pokemon/?limit=151'
results = requests.get(url).json()['results']
results

[{'name': 'bulbasaur', 'url': 'https://pokeapi.co/api/v2/pokemon/1/'},
 {'name': 'ivysaur', 'url': 'https://pokeapi.co/api/v2/pokemon/2/'},
 {'name': 'venusaur', 'url': 'https://pokeapi.co/api/v2/pokemon/3/'},
 {'name': 'charmander', 'url': 'https://pokeapi.co/api/v2/pokemon/4/'},
 {'name': 'charmeleon', 'url': 'https://pokeapi.co/api/v2/pokemon/5/'},
 {'name': 'charizard', 'url': 'https://pokeapi.co/api/v2/pokemon/6/'},
 {'name': 'squirtle', 'url': 'https://pokeapi.co/api/v2/pokemon/7/'},
 {'name': 'wartortle', 'url': 'https://pokeapi.co/api/v2/pokemon/8/'},
 {'name': 'blastoise', 'url': 'https://pokeapi.co/api/v2/pokemon/9/'},
 {'name': 'caterpie', 'url': 'https://pokeapi.co/api/v2/pokemon/10/'},
 {'name': 'metapod', 'url': 'https://pokeapi.co/api/v2/pokemon/11/'},
 {'name': 'butterfree', 'url': 'https://pokeapi.co/api/v2/pokemon/12/'},
 {'name': 'weedle', 'url': 'https://pokeapi.co/api/v2/pokemon/13/'},
 {'name': 'kakuna', 'url': 'https://pokeapi.co/api/v2/pokemon/14/'},
 {'name': '

In [11]:
results[0]['url']

'https://pokeapi.co/api/v2/pokemon/1/'

[Read the documentation here](https://pokeapi.co/docs/v2.html) for information on navigating this API, then use the API to obtain data to answer the following questions.

### Accessing Data

1. For any **one** Pokemon, retrieve the following information in a dictionary format with the following keys:
    - ID
    - Name
    - Base experience
    - Weight
    - Height
    - Types

The `types` attribute requires some manipulation, because the API does not return it in the desired format. You might want to write helper functions that convert the API's response into a `list` of strings.

Your output should look like this:

```
{'id': 1, 
'name': 'bulbasaur', 
'base_experience': 64, 
'weight': 69, 
'height': 7, 
'types': ['poison', 'grass']}
```
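
The conversion described above can be sketched as a small helper. This is only an illustration: it assumes each entry of the API's `types` field is a dictionary with a nested `type` object carrying a `name`, and `extract_type_names` and `sample_types` are hypothetical names (the URLs in the sample are illustrative too).

```python
def extract_type_names(types):
    """Return the type names from a PokeAPI-style `types` list."""
    # Each entry is assumed to look like {'slot': ..., 'type': {'name': ..., 'url': ...}}
    return [entry['type']['name'] for entry in types]


# Hypothetical sample shaped like the API response (assumed structure).
sample_types = [
    {'slot': 2, 'type': {'name': 'poison', 'url': 'https://pokeapi.co/api/v2/type/4/'}},
    {'slot': 1, 'type': {'name': 'grass', 'url': 'https://pokeapi.co/api/v2/type/12/'}},
]

print(extract_type_names(sample_types))
```

Keeping the conversion in its own function makes it easy to check against a hand-written sample before calling the live API.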

In [56]:
# you may define any helper functions here
def get_url_from_results(url):
    """Return the detail URL of every Pokemon listed at `url`."""
    results = requests.get(url).json()['results']
    # iterate over the entries directly instead of reusing `url` as a loop index
    return [entry['url'] for entry in results]

get_url_from_results('https://pokeapi.co/api/v2/pokemon/?limit=151')[:5]

['https://pokeapi.co/api/v2/pokemon/1/',
 'https://pokeapi.co/api/v2/pokemon/2/',
 'https://pokeapi.co/api/v2/pokemon/3/',
 'https://pokeapi.co/api/v2/pokemon/4/',
 'https://pokeapi.co/api/v2/pokemon/5/']

In [24]:
def get_name_from_results(res):
    """Return the name of every Pokemon in a list of result entries."""
    return [entry['name'] for entry in res]

get_name_from_results(results)[:5]

['bulbasaur', 'ivysaur', 'venusaur', 'charmander', 'charmeleon']

In [21]:
url_1 = 'https://pokeapi.co/api/v2/pokemon/1/'
results_1 = requests.get(url_1).json()
results_1

{'abilities': [{'ability': {'name': 'chlorophyll',
    'url': 'https://pokeapi.co/api/v2/ability/34/'},
   'is_hidden': True,
   'slot': 3},
  {'ability': {'name': 'overgrow',
    'url': 'https://pokeapi.co/api/v2/ability/65/'},
   'is_hidden': False,
   'slot': 1}],
 'base_experience': 64,
 'forms': [{'name': 'bulbasaur',
   'url': 'https://pokeapi.co/api/v2/pokemon-form/1/'}],
 'game_indices': [{'game_index': 1,
   'version': {'name': 'white-2',
    'url': 'https://pokeapi.co/api/v2/version/22/'}},
  {'game_index': 1,
   'version': {'name': 'black-2',
    'url': 'https://pokeapi.co/api/v2/version/21/'}},
  {'game_index': 1,
   'version': {'name': 'white',
    'url': 'https://pokeapi.co/api/v2/version/18/'}},
  {'game_index': 1,
   'version': {'name': 'black',
    'url': 'https://pokeapi.co/api/v2/version/17/'}},
  {'game_index': 1,
   'version': {'name': 'soulsilver',
    'url': 'https://pokeapi.co/api/v2/version/16/'}},
  {'game_index': 1,
   'version': {'name': 'heartgold',
    'ur

In [74]:
def get_pokedata(url):
    """Return the id, name, base experience, weight, height and types of one Pokemon."""
    data = requests.get(url).json()
    # each entry of data['types'] nests the type name under ['type']['name']
    type_list = [entry['type']['name'] for entry in data['types']]
    return {'id': data['id'],
            'name': data['name'],
            'base_experience': data['base_experience'],
            'weight': data['weight'],
            'height': data['height'],
            'types': type_list}
    
get_pokedata('https://pokeapi.co/api/v2/pokemon/1/')


{'id': 1,
 'name': 'bulbasaur',
 'base_experience': 64,
 'weight': 69,
 'height': 7,
 'types': ['poison', 'grass']}

### Processing All The Data

2. Get the same information for the first **151** Pokemon as a list of dictionaries ordered by Pokemon ID. Save the list to a variable and print its first and last elements. The list should look like this:

```
[{'id': 1, 
'name': 'bulbasaur', 
'base_experience': 64, 
'weight': 69, 
'height': 7, 
'types': ['poison', 'grass']}, 
{'id': 2, 
'name': 'ivysaur', 
'base_experience': 142, 
'weight': 130, 
'height': 10, 
'types': ['poison', 'grass']}, ... ]
```
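
One way to guarantee the ordering by ID is to sort explicitly after fetching. The sketch below is a hedged illustration: `collect_pokedata` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real request so the example runs offline.

```python
def collect_pokedata(urls, fetch_one):
    """Fetch one record per URL and return the records ordered by Pokemon ID."""
    records = [fetch_one(url) for url in urls]
    return sorted(records, key=lambda record: record['id'])


def fake_fetch(url):
    """Hypothetical stand-in for a real API call: derive the id from the URL."""
    pokemon_id = int(url.rstrip('/').split('/')[-1])
    return {'id': pokemon_id, 'name': f'pokemon-{pokemon_id}'}


# URLs deliberately out of order to show that the result is sorted by id.
urls = [
    'https://pokeapi.co/api/v2/pokemon/2/',
    'https://pokeapi.co/api/v2/pokemon/1/',
]
ordered = collect_pokedata(urls, fake_fetch)
print(ordered[0], ordered[-1])
```

Passing the fetching function as an argument lets the same code be exercised with a fake before it is pointed at the live API.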



In [76]:
# Assign to "pokedata" the list of 151 dictionaries.
# You may use your function from the previous question.

def get_pokedata_all(url):
    """Return the data for every Pokemon listed at `url`, in listing order."""
    url_list = get_url_from_results(url)
    return [get_pokedata(link) for link in url_list]

# call once and keep the result; every call makes one request per Pokemon
pokedata = get_pokedata_all("https://pokeapi.co/api/v2/pokemon/?limit=151")


In [77]:
# printing first and last elements

print(pokedata[0], pokedata[-1])

{'id': 1, 'name': 'bulbasaur', 'base_experience': 64, 'weight': 69, 'height': 7, 'types': ['poison', 'grass']} {'id': 151, 'name': 'mew', 'base_experience': 270, 'weight': 40, 'height': 4, 'types': ['psychic']}


## Part 2: Object Oriented Programming

We're going to use the data gathered in the previous section on APIs for this section on Object Oriented Programming to instantiate Pokemon objects and write instance methods.

### Creating a Class

1. Create a class called `Pokemon` with an `__init__` method to instantiate the following attributes:
    - ID
    - Name
    - Base experience
    - Weight
    - Height
    - Types

In [68]:
# if you were unable to get the data from the API in the right format,
# uncomment the code below to access a JSON file with the list of dictionaries

with open('data/pokemon.json') as f:  
    pokelist = json.load(f)

In [70]:
pokelist[0]

{'id': 1,
 'name': 'bulbasaur',
 'base_experience': 64,
 'weight': 69,
 'height': 7,
 'types': ['poison', 'grass']}

In [135]:
"""
Create your class below with the correct syntax, including an __init__ method.

"""
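class Pokemon:
    def __init__(self, ID=None, name=None, base_experience=None, weight=None, height=None, types=None):
        self.ID = ID  # store the argument, not the built-in id function
        self.name = name
        self.base_experience = base_experience
        self.weight = weight
        self.height = height
        self.types = types

    def bmi(self):
        """Return the BMI in kg/m^2 from weight in hectograms and height in decimeters."""
        height_squared = self.height**2
        bmi = self.weight / height_squared
        # (w/10) / (h/10)^2 simplifies to 10 * w / h^2, hence the factor of 10
        return bmi * 10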

    
        

        


    
### Instantiating Objects

2. Using the data you obtained from the API, instantiate the first, fourth and seventh Pokemon. Assign them to the variables `bulbasaur`, `charmander` and `squirtle`.
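
Rather than retyping each value, an object can be built directly from one of the dictionaries. The following is a minimal sketch under stated assumptions: `PokemonSketch` and its `from_dict` classmethod are hypothetical and exist only to illustrate the idea, and the sample record copies the values shown for Charmander in this notebook.

```python
class PokemonSketch:
    """Hypothetical minimal class used only to illustrate instantiation."""

    def __init__(self, ID, name, base_experience, weight, height, types):
        self.ID = ID
        self.name = name
        self.base_experience = base_experience
        self.weight = weight
        self.height = height
        self.types = types

    @classmethod
    def from_dict(cls, data):
        """Build an instance from a dictionary keyed like the API output."""
        return cls(
            ID=data['id'],
            name=data['name'],
            base_experience=data['base_experience'],
            weight=data['weight'],
            height=data['height'],
            types=data['types'],
        )


record = {'id': 4, 'name': 'charmander', 'base_experience': 62,
          'weight': 85, 'height': 6, 'types': ['fire']}
charmander_sketch = PokemonSketch.from_dict(record)
print(charmander_sketch.ID, charmander_sketch.name, charmander_sketch.types)
```

With a list such as `pokedata`, the same call would take `pokedata[0]`, `pokedata[3]` and `pokedata[6]` for the first, fourth and seventh Pokemon.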

In [124]:
list(pokedata[0].values())

[1, 'bulbasaur', 64, 69, 7, ['poison', 'grass']]

In [125]:
list(pokedata[3].values())

[4, 'charmander', 62, 85, 6, ['fire']]

In [126]:
list(pokedata[6].values())

[7, 'squirtle', 63, 90, 5, ['water']]

In [127]:
for item in pokedata[0].values():
    print(item)

1
bulbasaur
64
69
7
['poison', 'grass']


In [136]:
# run this cell to test and check your code
# you may need to edit the attribute variable names if you named them differently!

bulbasaur = Pokemon(1, 'bulbasaur', 64, 69, 7, ['poison', 'grass'])
charmander = Pokemon(4, 'charmander', 62, 85, 6, ['fire'])
squirtle = Pokemon(7, 'squirtle', 63, 90, 5, ['water'])


# def print_pokeinfo(pokemon_object):
#     o = pokemon_object
#     print('ID: ' + str(o.ID) + '\n' +
#           'Name: ' + o.name.title() + '\n' +
#           'Base experience: ' + str(o.base_experience) + '\n' +
#           'Weight: ' + str(o.weight) + '\n' +
#           'Height: ' + str(o.height) + '\n' +
#           'Types: ' + str(o.types) + '\n'
#          )


# print_pokeinfo(bulbasaur)
# print_pokeinfo(charmander)
# print_pokeinfo(squirtle)

### Instance Methods

3. Write an instance method within the class `Pokemon` to find the BMI of a Pokemon. BMI is defined by $\frac{weight}{height^{2}}$ with weight in **kilograms** and height in **meters**. The height and weight data of Pokemon from the API is in **decimeters** and **hectograms** respectively.


    1 decimeter = 0.1 meters
    1 hectogram = 0.1 kilograms
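
The unit conversion can be worked through as a standalone function before it is moved into the class. This is a sketch only; `bmi_from_api_units` is a hypothetical name, and the inputs are the weight and height values listed for Bulbasaur and Squirtle in this notebook.

```python
def bmi_from_api_units(weight_hg, height_dm):
    """Compute BMI in kg/m^2 from a weight in hectograms and a height in decimeters."""
    weight_kg = weight_hg / 10   # 1 hectogram = 0.1 kilograms
    height_m = height_dm / 10    # 1 decimeter = 0.1 meters
    return weight_kg / height_m ** 2


bulbasaur_bmi = bmi_from_api_units(69, 7)
squirtle_bmi = bmi_from_api_units(90, 5)
print(round(bulbasaur_bmi, 2), round(squirtle_bmi, 2))
```

Algebraically, $\frac{w/10}{(h/10)^{2}} = \frac{10\,w}{h^{2}}$, so the conversion amounts to multiplying the raw ratio by 10.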

In [137]:
# run this cell to test and check your code
# you will probably have to rerun the code to instantiate your objects

print(bulbasaur.bmi()) # 14.08
print(charmander.bmi()) # 23.61
print(squirtle.bmi()) # 36

14.081632653061224
23.61111111111111
36.0


## Part 3: SQL and Relational Databases

For this section, we've put the Pokemon data into SQL tables. You won't need to use your list of dictionaries or the JSON file for this section. The schema of `pokemon.db` is as follows:

<img src="data/pokemondb.png" alt="db schema" style="width:500px;"/>

Assign your SQL queries as strings to the variables `q1`, `q2`, etc. and run the cells at the end of this section to print your results as Pandas DataFrames.

- q1: query all columns from `Pokemon` for the Pokemon that have base_experience above 200

  
- q2: query the id, name, type1 and type2 of Pokemon that have **water** types as either their first or second type


- q3: query the average weight of Pokemon by their first type in descending order


- q4: query the Pokemon name, Pokemon type2, and the type that its **type2** does "2xdamage" to


- q5: query the top 5 most common type1s, the minimum height, maximum height, minimum weight and maximum weight of pokemon with those type1s, and what associated type they do "0.5xdamage" to


**Important note on syntax**: use `double quotes ""` when quoting strings **within** your query and wrap the entire query in `single quotes ''`. For the column titles that begin with numbers, you need to wrap the column names in double quotes.
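
The quoting of column names that begin with a number can be tried out on a throwaway database. The sketch below is an assumption-laden illustration: it builds a cut-down, in-memory copy of a `types` table with two rows taken from the output shown in this section, and it passes the filter value as a bound parameter (a swapped-in technique that avoids quoting the string inside the query at all).

```python
import sqlite3

# In-memory database with a cut-down, hypothetical copy of the `types` table.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE types (id INTEGER, name TEXT, "2xdamage" TEXT, "0.5xdamage" TEXT)')
conn.executemany(
    'INSERT INTO types VALUES (?, ?, ?, ?)',
    [(4, 'poison', 'grass', 'fighting'), (6, 'rock', 'flying', 'normal')],
)

# Double quotes mark "2xdamage" as a column name; the value is bound with `?`.
query = 'SELECT name, "2xdamage" FROM types WHERE name = ?'
rows = conn.execute(query, ('poison',)).fetchall()
print(rows)
conn.close()
```

Without the double quotes, SQLite would fail to parse `2xdamage` as a column name because it starts with a digit.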

In [138]:
cnx = sqlite3.connect('data/pokemon.db')

In [139]:
# q1: query all columns from Pokemon for the Pokemon that have base_experience above 200
q1 = 'SELECT * FROM pokemon WHERE base_experience > 200'
pd.read_sql(q1, cnx)

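|   | id | name | base_experience | weight | height | type1 | type2 |
|---|---|---|---|---|---|---|---|
| 0 | 3 | venusaur | 236 | 1000 | 20 | grass | poison |
| 1 | 6 | charizard | 240 | 905 | 17 | fire | flying |
| 2 | 9 | blastoise | 239 | 855 | 16 | water | |
| 3 | 18 | pidgeot | 216 | 395 | 15 | normal | flying |
| 4 | 26 | raichu | 218 | 300 | 8 | electric | |
| 5 | 31 | nidoqueen | 227 | 600 | 13 | poison | ground |
| 6 | 34 | nidoking | 227 | 620 | 14 | poison | ground |
| 7 | 36 | clefable | 217 | 400 | 13 | fairy | |
| 8 | 45 | vileplume | 221 | 186 | 12 | grass | poison |
| 9 | 62 | poliwrath | 230 | 540 | 13 | water | fighting |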


In [151]:
q6 = 'SELECT * FROM types'
pd.read_sql(q6, cnx)

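|   | id | name | 2xdamage | 0.5xdamage |
|---|---|---|---|---|
| 0 | 1 | normal | | |
| 1 | 2 | fighting | normal | rock |
| 2 | 3 | flying | fighting | fighting |
| 3 | 4 | poison | grass | fighting |
| 4 | 5 | ground | poison | poison |
| 5 | 6 | rock | flying | normal |
| 6 | 7 | bug | grass | fighting |
| 7 | 8 | ghost | ghost | poison |
| 8 | 9 | steel | rock | normal |
| 9 | 10 | fire | bug | bug |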


In [141]:
# q2: query the id, name, type1 and type2 of Pokemon that have water types as either their first or second type
q2 = """SELECT id, name, type1, type2 FROM pokemon WHERE type1 = 'water' OR type2 = 'water'"""
pd.read_sql(q2, cnx)

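|   | id | name | type1 | type2 |
|---|---|---|---|---|
| 0 | 7 | squirtle | water | |
| 1 | 8 | wartortle | water | |
| 2 | 9 | blastoise | water | |
| 3 | 54 | psyduck | water | |
| 4 | 55 | golduck | water | |
| 5 | 60 | poliwag | water | |
| 6 | 61 | poliwhirl | water | |
| 7 | 62 | poliwrath | water | fighting |
| 8 | 72 | tentacool | water | poison |
| 9 | 73 | tentacruel | water | poison |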


In [147]:
# q3: query the average weight of Pokemon by their first type in descending order
q3 = 'SELECT AVG(weight), type1 FROM pokemon GROUP BY type1 ORDER BY AVG(weight) DESC'
pd.read_sql(q3, cnx)

|   | AVG(weight) | type1 |
|---|---|---|
| 0 | 876.111111 | rock |
| 1 | 766.0 | dragon |
| 2 | 579.678571 | water |
| 3 | 542.857143 | fighting |
| 4 | 515.625 | psychic |
| 5 | 500.863636 | normal |
| 6 | 480.25 | fire |
| 7 | 480.0 | ice |
| 8 | 452.625 | ground |
| 9 | 317.888889 | electric |


In [166]:
# q4: query the Pokemon name, Pokemon type2, and the type that its type2 does "2xdamage" to
q4 = 'SELECT pokemon.name, type2, types."2xdamage" FROM pokemon JOIN types ON types.name = pokemon.type2'
pd.read_sql(q4, cnx)

|   | name | type2 | 2xdamage |
|---|---|---|---|
| 0 | bulbasaur | poison | grass |
| 1 | ivysaur | poison | grass |
| 2 | venusaur | poison | grass |
| 3 | charizard | flying | fighting |
| 4 | butterfree | flying | fighting |
| 5 | weedle | poison | grass |
| 6 | kakuna | poison | grass |
| 7 | beedrill | poison | grass |
| 8 | pidgey | flying | fighting |
| 9 | pidgeotto | flying | fighting |


In [175]:
# q5: query the top 5 most common type1s, the minimum height, maximum height, minimum weight and maximum weight of pokemon with those type1s, and what associated type they do "0.5xdamage" to
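# note: the join is on type2, so Pokemon with no second type are left out of the counts
# and "0.5xdamage" refers to the second type
q5 = 'SELECT COUNT(type1), type1, MIN(height), MAX(height), MIN(weight), MAX(weight), types."0.5xdamage" FROM pokemon JOIN types ON types.name = pokemon.type2 GROUP BY 2 ORDER BY COUNT(type1) DESC LIMIT 5'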
pd.read_sql(q5, cnx)

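|   | COUNT(type1) | type1 | MIN(height) | MAX(height) | MIN(weight) | MAX(weight) | 0.5xdamage |
|---|---|---|---|---|---|---|---|
| 0 | 11 | grass | 4 | 20 | 25 | 1200 | fighting |
| 1 | 10 | normal | 3 | 18 | 18 | 852 | fighting |
| 2 | 10 | water | 9 | 65 | 360 | 2350 | fighting |
| 3 | 9 | bug | 3 | 15 | 32 | 560 | fighting |
| 4 | 9 | rock | 4 | 88 | 75 | 3000 | poison |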


## Part 4: Web Scraping

### Accessing Data Using BeautifulSoup

Use BeautifulSoup to get quotes, authors, and tags from [Quotes to Scrape](http://quotes.toscrape.com/).

Before answering these questions, go to the site and inspect the page. Make sure to look at what links there are and how the site is structured.

1. Get the first author and the path for the author's page as a tuple from the [homepage](http://quotes.toscrape.com/).

In [190]:
# Make a get request to retrieve the page
html_page = requests.get('http://quotes.toscrape.com/') 
# Pass the page contents to beautiful soup for parsing
soup = BeautifulSoup(html_page.content, 'html.parser')


# the first quote block holds both the author's name and the link to the author's page
first_quote = soup.find('div', class_="quote")
name = first_quote.find('small', class_="author").text
link = 'http://quotes.toscrape.com' + first_quote.find('a').attrs['href']

(name, link)
# Your code here


('Albert Einstein', 'http://quotes.toscrape.com/author/Albert-Einstein')
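
The author links on the page are relative paths, so they have to be combined with the site address. String concatenation works here; a slightly more defensive alternative, sketched below with the standard library's `urljoin`, also copes with a page URL that already contains a path. The function name `absolute_author_url` is a hypothetical one chosen for this illustration.

```python
from urllib.parse import urljoin


def absolute_author_url(page_url, href):
    """Resolve a relative author link against the URL of the page it came from."""
    return urljoin(page_url, href)


resolved = absolute_author_url(
    'http://quotes.toscrape.com/page/3',
    '/author/Albert-Einstein',
)
print(resolved)
```

Because the `href` starts with `/`, it replaces the `/page/3` part of the page URL rather than being appended to it.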

In [221]:
[item.text for item in soup.find_all('small', class_="author")]

['Albert Einstein',
 'J.K. Rowling',
 'Albert Einstein',
 'Jane Austen',
 'Marilyn Monroe',
 'Albert Einstein',
 'André Gide',
 'Thomas A. Edison',
 'Eleanor Roosevelt',
 'Steve Martin']

In [233]:
['http://quotes.toscrape.com' +item.find('a').attrs['href'] for item in soup.find_all('div', class_="quote")]

['http://quotes.toscrape.com/author/Albert-Einstein',
 'http://quotes.toscrape.com/author/J-K-Rowling',
 'http://quotes.toscrape.com/author/Albert-Einstein',
 'http://quotes.toscrape.com/author/Jane-Austen',
 'http://quotes.toscrape.com/author/Marilyn-Monroe',
 'http://quotes.toscrape.com/author/Albert-Einstein',
 'http://quotes.toscrape.com/author/Andre-Gide',
 'http://quotes.toscrape.com/author/Thomas-A-Edison',
 'http://quotes.toscrape.com/author/Eleanor-Roosevelt',
 'http://quotes.toscrape.com/author/Steve-Martin']

2. Write a function to get **all** the authors and href links for the authors from the [homepage](http://quotes.toscrape.com/).


In [234]:
def authors(url):
    '''
    input: url

    return: a dictionary of authors and their urls
            {'author_1':'url_of_author_1', 'author_2':'url_of_author_2' ...}
    '''
    html_page = requests.get(url)
    soup = BeautifulSoup(html_page.content, 'html.parser')
    quote_divs = soup.find_all('div', class_="quote")
    name_list = [item.find('small', class_="author").text for item in quote_divs]
    link_list = ['http://quotes.toscrape.com' + item.find('a').attrs['href'] for item in quote_divs]
    # authors who appear more than once collapse into a single key
    return dict(zip(name_list, link_list))

authors('http://quotes.toscrape.com')

{'Albert Einstein': 'http://quotes.toscrape.com/author/Albert-Einstein',
 'J.K. Rowling': 'http://quotes.toscrape.com/author/J-K-Rowling',
 'Jane Austen': 'http://quotes.toscrape.com/author/Jane-Austen',
 'Marilyn Monroe': 'http://quotes.toscrape.com/author/Marilyn-Monroe',
 'André Gide': 'http://quotes.toscrape.com/author/Andre-Gide',
 'Thomas A. Edison': 'http://quotes.toscrape.com/author/Thomas-A-Edison',
 'Eleanor Roosevelt': 'http://quotes.toscrape.com/author/Eleanor-Roosevelt',
 'Steve Martin': 'http://quotes.toscrape.com/author/Steve-Martin'}

In [235]:
# run this cell to test the function
print(authors('http://quotes.toscrape.com/'))
print('\n')
print(authors('http://quotes.toscrape.com/page/3'))

{'Albert Einstein': 'http://quotes.toscrape.com/author/Albert-Einstein', 'J.K. Rowling': 'http://quotes.toscrape.com/author/J-K-Rowling', 'Jane Austen': 'http://quotes.toscrape.com/author/Jane-Austen', 'Marilyn Monroe': 'http://quotes.toscrape.com/author/Marilyn-Monroe', 'André Gide': 'http://quotes.toscrape.com/author/Andre-Gide', 'Thomas A. Edison': 'http://quotes.toscrape.com/author/Thomas-A-Edison', 'Eleanor Roosevelt': 'http://quotes.toscrape.com/author/Eleanor-Roosevelt', 'Steve Martin': 'http://quotes.toscrape.com/author/Steve-Martin'}


{'Pablo Neruda': 'http://quotes.toscrape.com/author/Pablo-Neruda', 'Ralph Waldo Emerson': 'http://quotes.toscrape.com/author/Ralph-Waldo-Emerson', 'Mother Teresa': 'http://quotes.toscrape.com/author/Mother-Teresa', 'Garrison Keillor': 'http://quotes.toscrape.com/author/Garrison-Keillor', 'Jim Henson': 'http://quotes.toscrape.com/author/Jim-Henson', 'Dr. Seuss': 'http://quotes.toscrape.com/author/Dr-Seuss', 'Albert Einstein': 'http://quotes.toscr

### Pagination

3. Get the first author on each of the first 5 pages of quotes. You can get to the next page with the next button at the bottom of the homepage.


In [237]:
for page in range(1,6):
    print(authors('http://quotes.toscrape.com/page/{}'.format(page)))

# Your code here


{'Albert Einstein': 'http://quotes.toscrape.com/author/Albert-Einstein', 'J.K. Rowling': 'http://quotes.toscrape.com/author/J-K-Rowling', 'Jane Austen': 'http://quotes.toscrape.com/author/Jane-Austen', 'Marilyn Monroe': 'http://quotes.toscrape.com/author/Marilyn-Monroe', 'André Gide': 'http://quotes.toscrape.com/author/Andre-Gide', 'Thomas A. Edison': 'http://quotes.toscrape.com/author/Thomas-A-Edison', 'Eleanor Roosevelt': 'http://quotes.toscrape.com/author/Eleanor-Roosevelt', 'Steve Martin': 'http://quotes.toscrape.com/author/Steve-Martin'}
{'Marilyn Monroe': 'http://quotes.toscrape.com/author/Marilyn-Monroe', 'J.K. Rowling': 'http://quotes.toscrape.com/author/J-K-Rowling', 'Albert Einstein': 'http://quotes.toscrape.com/author/Albert-Einstein', 'Bob Marley': 'http://quotes.toscrape.com/author/Bob-Marley', 'Dr. Seuss': 'http://quotes.toscrape.com/author/Dr-Seuss', 'Douglas Adams': 'http://quotes.toscrape.com/author/Douglas-Adams', 'Elie Wiesel': 'http://quotes.toscrape.com/author/Elie
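
To keep only the first author of each page, the dictionary returned for a page can be reduced to its first key. The sketch below is a hedged illustration: `first_author_per_page` is a hypothetical helper, and `fake_get_authors` is a stand-in for a page-scraping function (such as one returning `{author: url}`) so the example runs offline; its entries copy the first authors shown in the output above.

```python
def first_author_per_page(base_url, n_pages, get_authors):
    """Return (name, url) for the first author on each of the first n_pages pages."""
    firsts = []
    for page in range(1, n_pages + 1):
        page_authors = get_authors(f'{base_url}/page/{page}')
        # dicts keep insertion order, so the first key is the first author on the page
        first_name = next(iter(page_authors))
        firsts.append((first_name, page_authors[first_name]))
    return firsts


def fake_get_authors(url):
    """Hypothetical offline stand-in for a function that scrapes one page."""
    pages = {
        'http://quotes.toscrape.com/page/1': {
            'Albert Einstein': 'http://quotes.toscrape.com/author/Albert-Einstein',
            'J.K. Rowling': 'http://quotes.toscrape.com/author/J-K-Rowling',
        },
        'http://quotes.toscrape.com/page/2': {
            'Marilyn Monroe': 'http://quotes.toscrape.com/author/Marilyn-Monroe',
            'J.K. Rowling': 'http://quotes.toscrape.com/author/J-K-Rowling',
        },
    }
    return pages[url]


first_authors = first_author_per_page('http://quotes.toscrape.com', 2, fake_get_authors)
print(first_authors)
```

Swapping `fake_get_authors` for a real scraping function and `2` for `5` would cover the first five pages.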

4. Write a function to get all of the quotes from a page.

In [245]:
soup.find('div', class_='quote').span.text

'“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”'

In [260]:
[item.span.text for item in soup.findAll('div', class_='quote')]

['“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
 '“It is our choices, Harry, that show what we truly are, far more than our abilities.”',
 '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
 '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”',
 "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
 '“Try not to become a man of success. Rather become a man of value.”',
 '“It is better to be hated for what you are than to be loved for what you are not.”',
 "“I have not failed. I've just found 10,000 ways that won't work.”",
 "“A woman is like a tea bag; you never know how strong it is until it's in hot water.”",
 '“A day without sunshine is like, you know, night.”']

In [265]:
['http://quotes.toscrape.com' +item.find('a').attrs['href'] for item in soup.find_all('div', class_="quote")]

['http://quotes.toscrape.com/author/Albert-Einstein',
 'http://quotes.toscrape.com/author/J-K-Rowling',
 'http://quotes.toscrape.com/author/Albert-Einstein',
 'http://quotes.toscrape.com/author/Jane-Austen',
 'http://quotes.toscrape.com/author/Marilyn-Monroe',
 'http://quotes.toscrape.com/author/Albert-Einstein',
 'http://quotes.toscrape.com/author/Andre-Gide',
 'http://quotes.toscrape.com/author/Thomas-A-Edison',
 'http://quotes.toscrape.com/author/Eleanor-Roosevelt',
 'http://quotes.toscrape.com/author/Steve-Martin']

In [283]:
def get_some_quotes(url):
    '''
    input: url of the page to scrape

    return: a list of dictionaries of quotes with their attributes
            [{'quote':'quote_1_text', 'author':'url_of_author_1'},
            {'quote':'quote_2_text', 'author':'url_of_author_2'}, ...]
    '''
    html_page = requests.get(url)
    soup = BeautifulSoup(html_page.content, 'html.parser')
    quote_divs = soup.find_all('div', class_='quote')
    quote_list = [item.span.text for item in quote_divs]
    url_list = ['http://quotes.toscrape.com' + item.find('a').attrs['href'] for item in quote_divs]
    # pair each quote with the URL of its author's page
    return [{'quote': q, 'author': u} for q, u in zip(quote_list, url_list)]

get_some_quotes('http://quotes.toscrape.com/')


[{'quote': '“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
  'author': 'http://quotes.toscrape.com/author/Albert-Einstein'},
 {'quote': '“It is our choices, Harry, that show what we truly are, far more than our abilities.”',
  'author': 'http://quotes.toscrape.com/author/J-K-Rowling'},
 {'quote': '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
  'author': 'http://quotes.toscrape.com/author/Albert-Einstein'},
 {'quote': '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”',
  'author': 'http://quotes.toscrape.com/author/Jane-Austen'},
 {'quote': "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
  'author': 'http://quotes.toscrape.com/author/Marilyn-Monroe'},
 {'quote': '“Try not to become a man of success. Rather be

In [293]:
# assign the function's result to a variable to use later
quotes_for_mongo = get_some_quotes('http://quotes.toscrape.com/' )
quotes_for_mongo

[{'quote': '“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
  'author': 'http://quotes.toscrape.com/author/Albert-Einstein'},
 {'quote': '“It is our choices, Harry, that show what we truly are, far more than our abilities.”',
  'author': 'http://quotes.toscrape.com/author/J-K-Rowling'},
 {'quote': '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
  'author': 'http://quotes.toscrape.com/author/Albert-Einstein'},
 {'quote': '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”',
  'author': 'http://quotes.toscrape.com/author/Jane-Austen'},
 {'quote': "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
  'author': 'http://quotes.toscrape.com/author/Marilyn-Monroe'},
 {'quote': '“Try not to become a man of success. Rather be

## Part 5: MongoDB

To do this section, start a MongoDB server from the terminal with `mongod`. You will then **create**, **update**, and **read** from a Mongo database.

Create and connect to a mongo database.

In [324]:
myclient = pymongo.MongoClient("mongodb://127.0.0.1:27017/")
mydb = myclient['quote_database']

In [325]:
mycollection = mydb['new_quote_collection']

1. Add the quotes you obtained from the `get_some_quotes` function for the [homepage](http://quotes.toscrape.com/) to the mongo database. (You can also use the JSON file `quotes.json` to insert data into the database.) To verify that you've successfully inserted the data, retrieve the resulting `_id`s from the `results` variable.
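
The verification step can be sketched as follows. In pymongo, `insert_many` returns an `InsertManyResult` whose `inserted_ids` attribute lists the `_id` of each inserted document. Because a running server is needed for the real thing, this illustration uses `FakeCollection` and `FakeInsertResult`, two hypothetical stand-ins that mimic only that small part of the interface; `insert_and_get_ids` is likewise a hypothetical helper.

```python
def insert_and_get_ids(collection, documents):
    """Insert the documents and return the ids reported by the insert result."""
    result = collection.insert_many(documents)
    return result.inserted_ids


class FakeInsertResult:
    """Hypothetical stand-in for pymongo's InsertManyResult."""

    def __init__(self, inserted_ids):
        self.inserted_ids = inserted_ids


class FakeCollection:
    """Hypothetical stand-in for a pymongo collection, for offline use only."""

    def __init__(self):
        self.docs = []

    def insert_many(self, documents):
        start = len(self.docs)
        self.docs.extend(documents)
        # use simple integers in place of real ObjectId values
        return FakeInsertResult(list(range(start, start + len(documents))))


sample_quotes = [
    {'quote': 'quote_1_text', 'author': 'author_1'},
    {'quote': 'quote_2_text', 'author': 'author_2'},
]
ids = insert_and_get_ids(FakeCollection(), sample_quotes)
print(ids)
```

With a real collection, the same helper would return one `ObjectId` per inserted quote, and comparing `len(ids)` with the number of quotes is a quick sanity check.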

In [296]:
# if you were unable to get the data from webscraping in the right format,
# uncomment the code below to access a JSON file with the list of dictionaries

# with open(r"data/quotes.json", "r") as r:
#     data = json.load(r)

In [308]:
quotes_for_mongo

[{'quote': '“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
  'author': 'http://quotes.toscrape.com/author/Albert-Einstein',
  '_id': ObjectId('5d6ea0dfda0bbb82bc0a74ec')},
 {'quote': '“It is our choices, Harry, that show what we truly are, far more than our abilities.”',
  'author': 'http://quotes.toscrape.com/author/J-K-Rowling',
  '_id': ObjectId('5d6ea0dfda0bbb82bc0a74ed')},
 {'quote': '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
  'author': 'http://quotes.toscrape.com/author/Albert-Einstein',
  '_id': ObjectId('5d6ea0dfda0bbb82bc0a74ee')},
 {'quote': '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”',
  'author': 'http://quotes.toscrape.com/author/Jane-Austen',
  '_id': ObjectId('5d6ea0dfda0bbb82bc0a74ef')},
 {'quote': "“Imperfection is beauty, madness is genius and it

In [326]:
def legible_get_some_quotes(url):
    """Like get_some_quotes, but store each author's name instead of the author URL."""
    html_page = requests.get(url)
    soup = BeautifulSoup(html_page.content, 'html.parser')
    quote_list = [item.span.text for item in soup.find_all('div', class_='quote')]
    author_list = [item.text for item in soup.find_all('small', class_="author")]
    # pair each quote with the name of its author
    return [{'quote': q, 'author': a} for q, a in zip(quote_list, author_list)]

legible_get_some_quotes('http://quotes.toscrape.com/' )

[{'quote': '“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
  'author': 'Albert Einstein'},
 {'quote': '“It is our choices, Harry, that show what we truly are, far more than our abilities.”',
  'author': 'J.K. Rowling'},
 {'quote': '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
  'author': 'Albert Einstein'},
 {'quote': '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”',
  'author': 'Jane Austen'},
 {'quote': "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
  'author': 'Marilyn Monroe'},
 {'quote': '“Try not to become a man of success. Rather become a man of value.”',
  'author': 'Albert Einstein'},
 {'quote': '“It is better to be hated for what you are than to be loved for what you are not.”',
  'author': 'And

In [327]:
quotes_for_mongo = legible_get_some_quotes('http://quotes.toscrape.com/' )
quotes_for_mongo

[{'quote': '“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”',
  'author': 'Albert Einstein'},
 {'quote': '“It is our choices, Harry, that show what we truly are, far more than our abilities.”',
  'author': 'J.K. Rowling'},
 {'quote': '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”',
  'author': 'Albert Einstein'},
 {'quote': '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”',
  'author': 'Jane Austen'},
 {'quote': "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”",
  'author': 'Marilyn Monroe'},
 {'quote': '“Try not to become a man of success. Rather become a man of value.”',
  'author': 'Albert Einstein'},
 {'quote': '“It is better to be hated for what you are than to be loved for what you are not.”',
  'author': 'And

In [328]:
# use the results variable to confirm the data was inserted
results = mycollection.insert_many(quotes_for_mongo)
results

<pymongo.results.InsertManyResult at 0x12044c388>

2. Query the database for all the quotes written by `'Albert Einstein'`.

In [329]:
q9 = mycollection.find({}, {'quote': 1, 'author': 1})

for item in q9:
    print(item)

{'_id': ObjectId('5d6ea3a6da0bbb82bc0a7502'), 'quote': '“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”', 'author': 'Albert Einstein'}
{'_id': ObjectId('5d6ea3a6da0bbb82bc0a7503'), 'quote': '“It is our choices, Harry, that show what we truly are, far more than our abilities.”', 'author': 'J.K. Rowling'}
{'_id': ObjectId('5d6ea3a6da0bbb82bc0a7504'), 'quote': '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”', 'author': 'Albert Einstein'}
{'_id': ObjectId('5d6ea3a6da0bbb82bc0a7505'), 'quote': '“The person, be it gentleman or lady, who has not pleasure in a good novel, must be intolerably stupid.”', 'author': 'Jane Austen'}
{'_id': ObjectId('5d6ea3a6da0bbb82bc0a7506'), 'quote': "“Imperfection is beauty, madness is genius and it's better to be absolutely ridiculous than absolutely boring.”", 'author': 'Marilyn Monroe'}
{'_id': ObjectId('5d6e

In [330]:
q1 = mycollection.find({'author': 'Albert Einstein'})

for item in q1:
    print(item)

{'_id': ObjectId('5d6ea3a6da0bbb82bc0a7502'), 'quote': '“The world as we have created it is a process of our thinking. It cannot be changed without changing our thinking.”', 'author': 'Albert Einstein'}
{'_id': ObjectId('5d6ea3a6da0bbb82bc0a7504'), 'quote': '“There are only two ways to live your life. One is as though nothing is a miracle. The other is as though everything is a miracle.”', 'author': 'Albert Einstein'}
{'_id': ObjectId('5d6ea3a6da0bbb82bc0a7507'), 'quote': '“Try not to become a man of success. Rather become a man of value.”', 'author': 'Albert Einstein'}


3. Update Steve Martin's quote with the tags for the quote stored in the variable `steve_martin_tags`.

In [334]:
steve_martin_tags['quote_tags']

['change', 'deep-thoughts', 'thinking', 'world']

In [335]:
steve_martin_tags = {'quote_tags': ['change', 'deep-thoughts', 'thinking', 'world']}
record_to_update = {'author': 'Steve Martin'}
update_steve = {'$set': {'steve_martin_tags': steve_martin_tags['quote_tags']}}
mycollection.update_many(record_to_update, update_steve)
# first_quote_tags = mycollection.find({'author': 'Steve Martin'})


<pymongo.results.UpdateResult at 0x1206733c8>

4. Query the database to confirm that `'Steve Martin'` is updated with `steve_martin_tags`.

In [336]:
q2 = mycollection.find({'author': 'Steve Martin'})

for item in q2:
    print(item)

{'_id': ObjectId('5d6ea3a6da0bbb82bc0a750b'), 'quote': '“A day without sunshine is like, you know, night.”', 'author': 'Steve Martin', 'steve_martin_tags': ['change', 'deep-thoughts', 'thinking', 'world']}
