# Mod 2 Assessment

### Congratulations on making it to your second assessment! Just a few reminders before you continue:
- This should only take an hour, so be sure to manage your time effectively.
- Read the instructions carefully for _specified variable names_.
- Check your progress by running **`make tests`** in the terminal, <br/>
  OR **if you have a PC**, manually run these lines in the terminal:
  - `jupyter nbconvert --to script mod2_assessment.ipynb`
  - `python -m pytest --disable-pytest-warnings -v`

If a question is unclear, please ask a coach for clarification. <br/>Though we can't give you the answer, we can help clear up any misunderstandings and get you back on track.

In [1]:
# You'll need these imports to start.
import pymongo
from bson.json_util import loads
from collections import Counter

# put all of your additional imports here:
import sqlite3
import os
import json
import requests
from bs4 import BeautifulSoup
import pandas as pd

## Section 1: SQL (20 Minutes)
There is a sqlite3 database in `assets/books.db`. The SQL that creates it is in `assets/books.sql`; to build the database manually, import that file at https://sqliteonline.com/ or run the SQL file directly. Both have the same schema and data.

The schema has three tables, which you can explore in the files mentioned above. Please answer the following questions.

Connect to the database using sqlite3. <br/>
Assign `conn` to the sqlite3 connection <br/>
Assign `cur`  to the connection's cursor object

In [3]:
file_path = os.path.join('assets', 'books.db')
conn = sqlite3.connect(file_path)
cur = conn.cursor()
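
One way to explore the schema from Python is to ask SQLite's built-in catalog. The sketch below is illustrative only: it runs against a stand-in in-memory database that mirrors just the table and column names used in the queries in this notebook (the real `books.db` may hold more columns), and the helper names `list_tables` and `list_columns` are made up for this example.

```python
import sqlite3

# Stand-in database so the sketch runs anywhere; it only mimics the names
# that appear in this notebook's queries, not the real contents of books.db.
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
CREATE TABLE author (author_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, pages INTEGER);
CREATE TABLE book_author (book_id INTEGER, author_id INTEGER);
""")


def list_tables(connection):
    """Return the table names recorded in the sqlite_master catalog."""
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


def list_columns(connection, table):
    """Return the column names of one table via PRAGMA table_info."""
    # PRAGMA arguments cannot be bound with ?, so only pass trusted table names.
    # Each row is (cid, name, type, notnull, dflt_value, pk).
    return [row[1] for row in connection.execute(f"PRAGMA table_info({table})")]


tables = list_tables(demo_conn)
for table in tables:
    print(table, list_columns(demo_conn, table))
```

Pointing the same two helpers at the notebook's `conn` should list the actual tables and columns.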

#### Querying the DB: 

1. How many pages are in the book "Nine Stories"?

In [5]:
# Assign `answer_1` to your final answer
cur.execute(""" 
SELECT pages
FROM book
WHERE title = 'Nine Stories'
;""")

answer_1 = cur.fetchone()[0]
answer_1

600
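
A lookup like this can also pass the title as a bound parameter instead of writing it into the SQL text. The following is a minimal sketch, assuming a stand-in in-memory `book` table with one made-up row; `pages_for` is a hypothetical helper, not part of the assessment.

```python
import sqlite3

# Stand-in table with a made-up row; the real data lives in assets/books.db.
demo_conn = sqlite3.connect(":memory:")
demo_conn.execute(
    "CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, pages INTEGER)"
)
demo_conn.execute(
    "INSERT INTO book (title, pages) VALUES (?, ?)", ("Example Title", 123)
)


def pages_for(connection, title):
    """Look up a book's page count, binding the title with a ? placeholder."""
    row = connection.execute(
        "SELECT pages FROM book WHERE title = ?", (title,)
    ).fetchone()
    # fetchone() returns None when no row matches, so guard before indexing.
    return None if row is None else row[0]


print(pages_for(demo_conn, "Example Title"))  # 123
print(pages_for(demo_conn, "No Such Book"))   # None
```

Binding the value sidesteps quoting problems in titles and avoids indexing into `None` when the book is missing.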

2. How many authors are from the USA?

In [9]:
# Assign `answer_2` to your final answer
cur.execute(""" 
SELECT COUNT(DISTINCT(author_id)) AS total_authors
FROM author
WHERE country = 'USA'
; """)

answer_2 = cur.fetchone()[0]
answer_2

6

3. How many authors does the book "Professional ASP.NET 4.5 in C# and VB" have?

In [11]:
# Assign `answer_3` to your final answer
cur.execute(""" 
SELECT COUNT(author_id)
FROM author
INNER JOIN book_author
    USING(author_id)
INNER JOIN book
    USING(book_id)
WHERE book.title = 'Professional ASP.NET 4.5 in C# and VB'
;""")
answer_3 = cur.fetchone()[0]
answer_3

5

4. How many pages total have been written by non-American authors?

In [17]:
# Assign `answer_4` to your final answer
cur.execute(""" 
SELECT SUM(book.pages)
FROM book
INNER JOIN book_author 
    USING(book_id)
INNER JOIN author 
    USING(author_id)
WHERE author.country != 'USA'
;               """)
answer_4 = cur.fetchone()[0]
answer_4

30003
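
Summing over a join through `book_author` counts a book's pages once per matching author, so a book co-written by two non-American authors contributes twice. Which reading the question intends is not stated here; the sketch below only illustrates the difference on a made-up in-memory dataset that reuses the table and column names from the queries above.

```python
import sqlite3

# Made-up data: book 1 has two non-American authors, book 2 one American author.
demo_conn = sqlite3.connect(":memory:")
demo_conn.executescript("""
CREATE TABLE author (author_id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT, pages INTEGER);
CREATE TABLE book_author (book_id INTEGER, author_id INTEGER);
INSERT INTO author VALUES (1, 'USA'), (2, 'UK'), (3, 'France');
INSERT INTO book VALUES (1, 'Co-written Book', 100), (2, 'Solo Book', 50);
INSERT INTO book_author VALUES (1, 2), (1, 3), (2, 1);
""")

# One row per (book, author) pair: the co-written book is summed twice.
per_pair_total = demo_conn.execute("""
SELECT SUM(book.pages)
FROM book
INNER JOIN book_author USING(book_id)
INNER JOIN author USING(author_id)
WHERE author.country != 'USA'
""").fetchone()[0]

# One row per book: the subquery only decides which books qualify.
per_book_total = demo_conn.execute("""
SELECT SUM(pages)
FROM book
WHERE book_id IN (
    SELECT book_author.book_id
    FROM book_author
    INNER JOIN author USING(author_id)
    WHERE author.country != 'USA'
)
""").fetchone()[0]

print(per_pair_total)  # 200
print(per_book_total)  # 100
```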

## Section 2: Object Oriented Programming (10 Minutes)

### Creating a Class
1. Force every new instance of `WeWorkMember` to expect a value which is assigned to the `name` **attribute**.
1. Give every new instance of `WeWorkMember` a `caffeinated` **attribute** that is set to `False`. 
1. Give `WeWorkMember` an **instance method** called `caffeinate()` that prints out "Getting coffee!" and sets the `caffeinated` attribute to `True`.

In [21]:
class WeWorkMember:
    def __init__(self, name):
        self.name = name
        self.caffeinated = False
    
    def caffeinate(self):
        print('Getting coffee!')
        self.caffeinated = True

laurent = WeWorkMember('L')
print(f'{laurent.name} is {laurent.caffeinated}')
laurent.caffeinate() 
print(f'{laurent.name} is {laurent.caffeinated}')

L is False
Getting coffee!
L is True


### Inheriting from a Class

1. Have `Staff` and `Student` inherit all methods from `WeWorkMember`
1. Give `Staff` a **static method** called `cheer()` that prints out "Goooooooo Flatiron Students!"
1. Give `Student` a **class method** called `learn()` that takes in an integer and returns that number +1

In [23]:
class Staff(WeWorkMember):    
    @staticmethod
    def cheer():
        print("Goooooooo Flatiron Students!")

class Student(WeWorkMember):
    @classmethod
    def learn(cls, integer):
        return integer + 1
# cheer() prints its own message and returns None, so call it without print()
Staff.cheer()
print(Student.learn(2))

Goooooooo Flatiron Students!
3
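
The difference between the two decorators is easiest to see when a subclass overrides a class attribute. This is a small sketch with hypothetical classes (`Member`, `Guest`) that are not part of the assessment: a static method receives neither the instance nor the class, while a class method receives whichever class it was called on.

```python
class Member:
    greeting = "Hello"

    @staticmethod
    def describe():
        # No self or cls argument: behaves like a plain function kept on the class.
        return "static: no access to the class or instance"

    @classmethod
    def greet(cls):
        # cls is the class the call was made on, so subclasses see their own data.
        return f"{cls.greeting} from {cls.__name__}"


class Guest(Member):
    greeting = "Welcome"


print(Member.greet())    # Hello from Member
print(Guest.greet())     # Welcome from Guest
print(Guest.describe())  # static: no access to the class or instance
```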


## Section 3: APIs & Web Scraping
### APIs (10 Minutes)
Using the API about RuPaul's Drag Race, tell me how many judges of each **`type`** there were in the **first 50 records** the API returns.

API: http://www.nokeynoshade.party/api/judges <br/>
Docs: https://drag-race-api.readme.io/docs/get-all-judges

 - Assign `rupaul_resp` to the response of the API request.
 - Ensure the request only returns the first 50 records
     - note the documentation shows parameters as uppercase, but they should be **lowercase**
     - the judges will have **`id`** ranging from 1 to 53
 - Do the aggregation in pure Python, Pandas, or SQL -- whatever's easiest for you
 - Assign `judge_count` to a dictionary with the number of judges for each type

In [7]:
url = 'http://www.nokeynoshade.party/api/judges'
PARMS = {'limit' : 50,
         'offset': 0}
rupaul_resp = requests.get(url, params=PARMS).json()
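
To see what a `params` dictionary like this turns into, the query string can be built by hand with the standard library. This is only a sketch of the URL construction (it makes no network call), and `build_url` is a hypothetical helper; the lowercase `limit` and `offset` names are the ones used in the cell above.

```python
from urllib.parse import urlencode

BASE_URL = "http://www.nokeynoshade.party/api/judges"


def build_url(base_url, limit, offset=0):
    """Append lowercase limit/offset query parameters to the base URL."""
    query = urlencode({"limit": limit, "offset": offset})
    return f"{base_url}?{query}"


print(build_url(BASE_URL, 50))
# http://www.nokeynoshade.party/api/judges?limit=50&offset=0
```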

In [30]:
rupaul_resp[0]

{'id': 1,
 'name': 'Rupaul',
 'image_url': 'https://vignette.wikia.nocookie.net/logosrupaulsdragrace/images/b/ba/Rupaul_blackpink_final.jpg/revision/latest/scale-to-width-down/350?cb=20110731183922',
 'bio': 'Drag performer, actor, television host, and recording artist',
 'type': 'regular',
 'createdAt': '2018-01-11T00:43:31.258Z',
 'updatedAt': '2018-01-11T00:43:31.258Z'}

In [33]:
judge_count = dict(Counter([judge['type'] for judge in rupaul_resp]))
judge_count

{'regular': 2, 'guest': 47, 'interim': 1}

In [9]:
j_count = {}
for judge in rupaul_resp:
    # dict.get supplies 0 the first time a type is seen, so no try/except is needed
    j_count[judge['type']] = j_count.get(judge['type'], 0) + 1
j_count

{'regular': 2, 'guest': 47, 'interim': 1}
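
The task also allows doing the aggregation in SQL. A minimal sketch of that route, assuming the records are a list of dictionaries with `id` and `type` keys; the three sample records and the `count_types_sql` helper are made up for illustration and are not API output.

```python
import sqlite3

# Made-up records shaped like the API's judge objects (only the keys used here).
sample_judges = [
    {"id": 1, "type": "regular"},
    {"id": 2, "type": "guest"},
    {"id": 3, "type": "guest"},
]


def count_types_sql(records):
    """Load records into a throwaway SQLite table and count rows per type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE judge (id INTEGER, type TEXT)")
    # Named placeholders pull the matching keys out of each dictionary.
    conn.executemany("INSERT INTO judge VALUES (:id, :type)", records)
    rows = conn.execute(
        "SELECT type, COUNT(*) FROM judge GROUP BY type"
    ).fetchall()
    conn.close()
    return dict(rows)


print(count_types_sql(sample_judges))
```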

### Web Scraping (15 Minutes)
We want all of the **Music** books from [http://books.toscrape.com/](http://books.toscrape.com/).  Download the page of books in the *music* category and save it to the `book_resp` variable. Then use BeautifulSoup to convert each book on the page into a row in a Pandas dataframe.  The dataframe should consist of five columns:
- FULL book title
- book cover image link (relative is fine)
- book cost (as a float or decimal, not a string)
- inventory status (in stock?)
- **BONUS** Star rating (if you choose not to include this, just fill the column with `NaN`s).

The final result should be a dataframe of 13 rows (not counting header) and 5 columns (not counting index), named `book_df`.

- Set the response to a variable named `book_resp`
- Set the BeautifulSoup() object to a variable named `book_soup`
- Set the final dataframe object to a variable named `book_df`

#### Methods

In [125]:
#product_pod = cell.find_all('article', attrs={'class' : "col-xs-6 col-sm-4 col-md-3 col-lg-3"})
def scrape_data(pods):
    """Collect one row of book details from each `product_pod` article tag."""
    data = {
        'title': [],
        'book_cover' : [],
        'book_cost' : [],
        'inventory_status' : [],
        'star_rating': []
    }
    for pod in pods:
        data['title'].append(pod.find('h3').find('a').attrs['title'])
        data['book_cover'].append(pod.find('div', {'class':'image_container'}).find('a').find('img').attrs['src'])

        # the task asks for the cost as a number, so drop the currency symbol before converting
        price = pod.find('div', {'class':'product_price'}).find('p', {'class': 'price_color'}).text
        data['book_cost'].append(float(price.strip().lstrip('£')))

        # collapse the surrounding whitespace and newlines into single spaces
        instock = pod.find('div', {'class':'product_price'}).find('p', {'class': 'instock availability'}).text
        data['inventory_status'].append(' '.join(instock.split()))

        # the second CSS class on the star-rating tag is the rating word, e.g. 'Five'
        star_rating = pod.find('p', {'class':'star-rating'})
        data['star_rating'].append(star_rating['class'][1])
    return data

In [135]:
# I do not like this one bit....
book_resp = requests.get('http://books.toscrape.com/catalogue/category/books/music_14/index.html')
book_soup = BeautifulSoup(book_resp.content, 'html.parser')
product_pods = book_soup.find_all('article', attrs={'class' : "product_pod"})
book_df = pd.DataFrame(scrape_data(product_pods))

In [136]:
book_df.head()

| | title | book_cover | book_cost | inventory_status | star_rating |
|---|---|---|---|---|---|
| 0 | Rip it Up and Start Again | ../../../../media/cache/81/c4/81c4a973364e17d0... | 35.02 | In stock | Five |
| 1 | Our Band Could Be Your Life: Scenes from the A... | ../../../../media/cache/54/60/54607fe8945897cd... | 57.25 | In stock | Three |
| 2 | How Music Works | ../../../../media/cache/5c/c8/5cc8e107246cb478... | 37.32 | In stock | Two |
| 3 | Love Is a Mix Tape (Music #1) | ../../../../media/cache/a2/6d/a26d8449abb3381e... | 18.03 | In stock | One |
| 4 | Please Kill Me: The Uncensored Oral History of... | ../../../../media/cache/06/f1/06f185c0be2ad6e2... | 31.19 | In stock | Four |
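
The task accepts the cost as a float or a decimal, and the bonus star rating is scraped as a word. A small sketch of both conversions follows; the helper names `parse_price` and `parse_rating` are hypothetical, and the word-to-number mapping assumes ratings run from `One` to `Five`, as in the output above.

```python
from decimal import Decimal

# Assumed rating vocabulary: the words seen in the scraped star-rating column.
RATING_WORDS = {"One": 1, "Two": 2, "Three": 3, "Four": 4, "Five": 5}


def parse_price(text):
    """Keep only digits and the decimal point, then build an exact Decimal."""
    digits = "".join(ch for ch in text if ch.isdigit() or ch == ".")
    return Decimal(digits)


def parse_rating(word):
    """Map a rating word to an integer; unknown words give None."""
    return RATING_WORDS.get(word)


print(parse_price("£35.02"))  # 35.02
print(parse_rating("Five"))   # 5
```

`Decimal` keeps prices exact, which can matter if the costs are later summed; a plain `float` is enough for display.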


## Section 4: NoSQL (10 Minutes)

#### Load data from `assets/grades.jsonl` into Mongo 

(this code is written for you)

In [128]:
# you shouldn't need to edit this cell!
db_name = "mod2db"

with open('assets/grades.jsonl') as f:
    # loads() comes from the bson library
    file_data = [loads(line) for line in f.readlines()]

client = pymongo.MongoClient("mongodb://localhost:27017/")
client.drop_database(db_name)
db = client[db_name]
coll = db["testcoll"]

coll.insert_many(file_data)

<pymongo.results.InsertManyResult at 0x2801ca98b88>
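
The loading cell treats the file as JSON Lines: one JSON document per line. The sketch below shows the same reading pattern with the standard library's `json` module on a made-up in-memory file. It is not a drop-in replacement for the cell above, because `bson.json_util.loads` additionally understands MongoDB Extended JSON, which plain `json.loads` does not convert. `read_json_lines` is a hypothetical helper.

```python
import io
import json

# Made-up JSON Lines content; the blank line checks that empty lines are skipped.
sample_file = io.StringIO(
    '{"student_id": 1, "class_id": 2}\n'
    "\n"
    '{"student_id": 3, "class_id": 4}\n'
)


def read_json_lines(handle):
    """Parse one JSON document per non-empty line of an open text file."""
    # Iterating the handle streams lines without building a list via readlines().
    return [json.loads(line) for line in handle if line.strip()]


records = read_json_lines(sample_file)
print(len(records))  # 2
```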

#### Answer all of the following questions by querying Mongo and manipulating the results in Python

1. How many records are there total?

In [130]:
# Set `nosql_answer1` to your final answer
nosql_answer1 = coll.count_documents(filter = {})
nosql_answer1

280

2. How many students have taken the class with `class_id` = **29**?

In [131]:
# Set `nosql_answer2` to your final answer
nosql_answer2 = coll.count_documents(filter = {'class_id': 29})
nosql_answer2 

9

**Super bonus question**: <br/>For student **12** in class **23**, what grade did they get on their exam?

In [133]:
# Set `nosql_answer3` to your final answer
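scores = coll.find_one(filter = {'class_id': 23, 'student_id': 12})['scores']
# the exam is taken to be the first entry in `scores`; store the number itself as the answer
nosql_answer3 = scores[0]['score']
nosql_answer3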

26.9857216299485
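
Reading the first entry relies on the exam always being listed first. If each entry in `scores` also carries a label saying what kind of score it is, selecting by that label is more robust. The sketch below assumes such a label exists under a `type` key; that key, the sample document, and the `score_of_type` helper are all assumptions for illustration, so check a real document before relying on them.

```python
# Made-up document; the "type" key on each score entry is an assumption.
sample_doc = {
    "student_id": 12,
    "class_id": 23,
    "scores": [
        {"type": "quiz", "score": 71.5},
        {"type": "exam", "score": 88.0},
    ],
}


def score_of_type(document, wanted_type):
    """Return the first score whose (assumed) type label matches, else None."""
    for entry in document["scores"]:
        if entry.get("type") == wanted_type:
            return entry["score"]
    return None


print(score_of_type(sample_doc, "exam"))      # 88.0
print(score_of_type(sample_doc, "homework"))  # None
```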

In [134]:
# for when you're done with this portion, this deletes the data we added.
client.drop_database("mod2db")

## Assessment submission (2 Minutes)
Please **save** your completed file (`mod2_assessment.ipynb`) and upload it using [this form](https://docs.google.com/forms/d/e/1FAIpQLSf1uGNuz4fyzVz5i3aFTxmMKvH50DEJiN5uRmNFghpmFzoi3g/viewform).