# Getting Data
The topic of week 6 continues to be getting data, in this case using an API to access structured data. 

In this lab notebook you will gain experience reading data from and posting to an API. 


## Autograder Setup 

The next code cell should be uncommented to run the autograder tests when using Colab/DeepNote. If you are using an environment with otter-grader already installed (your own machine, lab machines), then do not uncomment the code.

In [2]:
#!pip install otter-grader

To get the data files and test files for this lab, download the `tests.zip` file from the Canvas page for this lab into the same working directory as this notebook. Next, uncomment and run the following cell. Please comment it out again before submission. 

In [1]:
#!unzip tests.zip

*Remember that your answers will potentially be evaluated on additional hidden test cases as well as manual code review.* 

## Lab Setup

In [1]:
import requests
import json
import datetime
from io import StringIO
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as mpl 
%matplotlib inline  

import otter
grader = otter.Notebook()

## API Getting Data

So far we have seen examples of getting data from an API.  These examples make use of GET requests to the API server. 

An HTTP GET request can be made using several Python libraries, including: 

* httplib (`http.client` in Python 3) 
* urllib 
* requests 

We have been using the `requests` module.
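
For comparison, the sketch below builds the same kind of GET request with only the standard library (`urllib`), without sending it. The endpoint and ISBN are the ones used in the Google Books example in this notebook; `requests` performs comparable query-string encoding when it is given a `params` dictionary.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Endpoint and query taken from the Google Books example in this lab
base_url = "https://www.googleapis.com/books/v1/volumes"

# urlencode percent-encodes reserved characters such as ':'
query = urlencode({"q": "isbn:0553386794"})

# Build (but do not send) the GET request
req = Request(f"{base_url}?{query}", method="GET")

print(req.get_method())
print(req.full_url)
```

Passing `req` to `urllib.request.urlopen` would actually send it; that step is left out here so the sketch runs without network access.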

Let's look at another example.

## Example: Google Books

Here we will examine using the Google Books API:  
https://developers.google.com/books/docs/overview

In [2]:
# api-endpoint 
url = "https://www.googleapis.com/books/v1/volumes"
  
isbn = "isbn:0553386794"

# set the parameters to be sent to the API
params = {'q': isbn}

resp = requests.get(url, params)

Look at the response. What does it contain? 

How do we then extract the data?

In [3]:
resp

<Response [200]>

In [4]:
dat = resp.json()
dat

{'kind': 'books#volumes',
 'totalItems': 2,
 'items': [{'kind': 'books#volume',
   'id': 'hXNvadj27ekC',
   'etag': '/+iS5ZH6mSM',
   'selfLink': 'https://www.googleapis.com/books/v1/volumes/hXNvadj27ekC',
   'volumeInfo': {'title': 'A Game of Thrones (HBO Tie-in Edition)',
    'subtitle': 'A Song of Ice and Fire: Book One',
    'authors': ['George R. R. Martin'],
    'publisher': 'Bantam',
    'publishedDate': '2011-03-22',
    'description': 'NOW THE ACCLAIMED HBO SERIES GAME OF THRONES—THE MASTERPIECE THAT BECAME A CULTURAL PHENOMENON Winter is coming. Such is the stern motto of House Stark, the northernmost of the fiefdoms that owe allegiance to King Robert Baratheon in far-off King’s Landing. There Eddard Stark of Winterfell rules in Robert’s name. There his family dwells in peace and comfort: his proud wife, Catelyn; his sons Robb, Brandon, and Rickon; his daughters Sansa and Arya; and his bastard son, Jon Snow. Far to the north, behind the towering Wall, lie savage Wildings and 

There is a lot of information here.  Explore the structure of the JSON information. 

In [5]:
# First, we can print it better! 
print(json.dumps(resp.json(), indent=2))

{
  "kind": "books#volumes",
  "totalItems": 2,
  "items": [
    {
      "kind": "books#volume",
      "id": "hXNvadj27ekC",
      "etag": "/+iS5ZH6mSM",
      "selfLink": "https://www.googleapis.com/books/v1/volumes/hXNvadj27ekC",
      "volumeInfo": {
        "title": "A Game of Thrones (HBO Tie-in Edition)",
        "subtitle": "A Song of Ice and Fire: Book One",
        "authors": [
          "George R. R. Martin"
        ],
        "publisher": "Bantam",
        "publishedDate": "2011-03-22",
        "description": "NOW THE ACCLAIMED HBO SERIES GAME OF THRONES\u2014THE MASTERPIECE THAT BECAME A CULTURAL PHENOMENON Winter is coming. Such is the stern motto of House Stark, the northernmost of the fiefdoms that owe allegiance to King Robert Baratheon in far-off King\u2019s Landing. There Eddard Stark of Winterfell rules in Robert\u2019s name. There his family dwells in peace and comfort: his proud wife, Catelyn; his sons Robb, Brandon, and Rickon; his daughters Sansa and Arya; and hi

In [6]:
dat.keys()

dict_keys(['kind', 'totalItems', 'items'])

In [7]:
dat['kind']

'books#volumes'

In [8]:
dat['totalItems']

2

In [9]:
dat['items']

[{'kind': 'books#volume',
  'id': 'hXNvadj27ekC',
  'etag': '/+iS5ZH6mSM',
  'selfLink': 'https://www.googleapis.com/books/v1/volumes/hXNvadj27ekC',
  'volumeInfo': {'title': 'A Game of Thrones (HBO Tie-in Edition)',
   'subtitle': 'A Song of Ice and Fire: Book One',
   'authors': ['George R. R. Martin'],
   'publisher': 'Bantam',
   'publishedDate': '2011-03-22',
   'description': 'NOW THE ACCLAIMED HBO SERIES GAME OF THRONES—THE MASTERPIECE THAT BECAME A CULTURAL PHENOMENON Winter is coming. Such is the stern motto of House Stark, the northernmost of the fiefdoms that owe allegiance to King Robert Baratheon in far-off King’s Landing. There Eddard Stark of Winterfell rules in Robert’s name. There his family dwells in peace and comfort: his proud wife, Catelyn; his sons Robb, Brandon, and Rickon; his daughters Sansa and Arya; and his bastard son, Jon Snow. Far to the north, behind the towering Wall, lie savage Wildings and worse—unnatural things relegated to myth during the centuries-l

`dat['items']` returns a list of items.

In [10]:
type(dat['items'])

list

In [11]:
# We can look at the first item on the list 
dat['items'][0]

{'kind': 'books#volume',
 'id': 'hXNvadj27ekC',
 'etag': '/+iS5ZH6mSM',
 'selfLink': 'https://www.googleapis.com/books/v1/volumes/hXNvadj27ekC',
 'volumeInfo': {'title': 'A Game of Thrones (HBO Tie-in Edition)',
  'subtitle': 'A Song of Ice and Fire: Book One',
  'authors': ['George R. R. Martin'],
  'publisher': 'Bantam',
  'publishedDate': '2011-03-22',
  'description': 'NOW THE ACCLAIMED HBO SERIES GAME OF THRONES—THE MASTERPIECE THAT BECAME A CULTURAL PHENOMENON Winter is coming. Such is the stern motto of House Stark, the northernmost of the fiefdoms that owe allegiance to King Robert Baratheon in far-off King’s Landing. There Eddard Stark of Winterfell rules in Robert’s name. There his family dwells in peace and comfort: his proud wife, Catelyn; his sons Robb, Brandon, and Rickon; his daughters Sansa and Arya; and his bastard son, Jon Snow. Far to the north, behind the towering Wall, lie savage Wildings and worse—unnatural things relegated to myth during the centuries-long summer

In [12]:
'''We can investigate the keys where information is stored for each item'''
dat['items'][1]['volumeInfo'].keys()

dict_keys(['title', 'authors', 'publishedDate', 'description', 'industryIdentifiers', 'readingModes', 'printType', 'averageRating', 'ratingsCount', 'maturityRating', 'allowAnonLogging', 'contentVersion', 'panelizationSummary', 'language', 'previewLink', 'infoLink', 'canonicalVolumeLink'])

In [13]:
# You can start building pretty long lines of code to access information deep 
#  in the structure. 
# Print out the first industry identifier listed for the book 
#  (in the output below this is the 13-digit ISBN) 
dat['items'][0]['volumeInfo']['industryIdentifiers'][0]['identifier']

'9780553386790'
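
Positional indexing depends on the order in which the API lists the identifiers, which is why the cell above returns a 13-digit number. A more robust option is to select the identifier by its `type` field. The following is a minimal sketch; `sample_volume_info` is hand-written illustrative data shaped like a Google Books `volumeInfo` entry, and `get_identifier` is a hypothetical helper.

```python
# Illustrative data shaped like one 'volumeInfo' entry (not fetched from the API)
sample_volume_info = {
    "title": "A Game of Thrones (HBO Tie-in Edition)",
    "industryIdentifiers": [
        {"type": "ISBN_13", "identifier": "9780553386790"},
        {"type": "ISBN_10", "identifier": "0553386794"},
    ],
}


def get_identifier(volume_info, id_type):
    """Return the identifier of the given type, or None if it is absent."""
    for entry in volume_info.get("industryIdentifiers", []):
        if entry.get("type") == id_type:
            return entry.get("identifier")
    return None


print(get_identifier(sample_volume_info, "ISBN_10"))
print(get_identifier(sample_volume_info, "ISBN_13"))
```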

## Exercise 1 

Which of the Game of Thrones books is longest?

Get information about each book and print out the title and number of pages.  Then, report the title and number of pages for the longest book.  

*Note: the API may return multiple entries for each ISBN.  You may use the first entry for information.  If an entry is missing the page count, it is likely an audiobook, and you should then use the next entry.  If no entry has the title and page count, return the title as "no title" and the number of pages as `-1`.*

In [14]:
''' The following are the ISBN codes for the Game of Thrones books. '''

isbns = ['0553386794', '0553579908', '9780345543981', '9780553582024', '9780553582017']

In [15]:
'''
Iterate over the ISBNs to find the title and page count for each book. 
Collect this information in a list. 
You can use "volumeInfo" to gather the information needed.
Print the title and the number of pages inside the loop. 

Outside the loop:
- Convert the list to a DataFrame with column names 'Title' and 'NumPages' 
- Report the longest book in longestBookTitle and longestBookNumPages.
'''
ex1list = [] 

for i in isbns:
    # Query the API for this ISBN
    params = {'q': 'isbn:' + i}
    response = requests.get(url, params)
    data = response.json()
    # The 'items' key may be absent when the search finds nothing
    items = data.get('items', [])
    
    title, pages = "no title", -1
    # Prefer the first entry that reports a page count
    for item in items:
        if 'pageCount' in item['volumeInfo']:
            title, pages = item['volumeInfo'].get('title', 'no title'), item['volumeInfo']['pageCount']
            break
    # Otherwise fall back to the first entry that at least has a title
    if title == 'no title':
        for item in items:
            if 'title' in item['volumeInfo']:
                title, pages = item['volumeInfo']['title'], item['volumeInfo'].get('pageCount', -1)
                break
    ex1list.append([title, pages])
    print(title + " has " + str(pages) + " pages.")
    
ex1df = pd.DataFrame(ex1list, columns=['Title', 'NumPages'])

longestBookTitle = ex1df.loc[ex1df['NumPages'].idxmax()]['Title']
longestBookNumPages = ex1df.loc[ex1df['NumPages'].idxmax()]['NumPages']

A Game of Thrones (HBO Tie-in Edition) has 722 pages.
A Clash of Kings has 0 pages.
A Storm of Swords (HBO Tie-in Edition): A Song of Ice and Fire: Book Three has 1218 pages.
A Feast for Crows has 1106 pages.
A Dance with Dragons has 1154 pages.
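
The output above reports 0 pages for one title, which suggests that an entry can carry a `pageCount` of 0 rather than omitting the key. One possible refinement, sketched below under that assumption, treats a zero or missing `pageCount` as unavailable and moves on to the next entry. The `sample_items` list is made-up data for illustration, and `pick_title_and_pages` is a hypothetical helper, not part of the lab's required solution.

```python
def pick_title_and_pages(items):
    """Return (title, pages) from the first entry with a title and a positive pageCount."""
    for item in items:
        info = item.get("volumeInfo", {})
        # A missing key, None, and 0 are all treated as "no page count"
        pages = info.get("pageCount") or 0
        if "title" in info and pages > 0:
            return info["title"], pages
    return "no title", -1


# Made-up entries: no page count, a zero page count, and a usable entry
sample_items = [
    {"volumeInfo": {"title": "Example Audiobook"}},
    {"volumeInfo": {"title": "Example Ebook", "pageCount": 0}},
    {"volumeInfo": {"title": "Example Paperback", "pageCount": 784}},
]

print(pick_title_and_pages(sample_items))
print(pick_title_and_pages([]))
```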


In [21]:
grader.check("q1")

ValueError: Tests directory does not exist and no notebook path provided

## Example: Government API - Iceland

Many cities or countries have begun making data available for developers and researchers. 
Iceland has created a single [API](http://docs.apis.is/) that has many endpoints including weather data, concerts, bus, earthquakes, bicycle counters, etc. 

Try out a few different endpoints and gather some data.  *Note: some of the endpoints are not available at this time, so you may get a 404 or 500 error.*

**Earthquake Information**

In [24]:
resp = requests.get('https://apis.is/earthquake/is')

In [25]:
resp

<Response [200]>

In [26]:
len(resp.json()['results'])

51

Under the `results` key there is a list of items with the earthquake information. 

Let's try to get this list in a DataFrame.

In [27]:
# The JSON was already stored in "resp" and we only want the list of results 
#  under the key "results".  Therefore, we can take this information and 
#  re-serialize the information and let pandas read_json parse in the data 
eq = pd.read_json(json.dumps(resp.json()['results']))
eq.head()

|   | timestamp | latitude | longitude | depth | size | quality | humanReadableLocation |
|---|---|---|---|---|---|---|---|
| 0 | 2017-10-13 12:07:24+00:00 | 63.976 | -21.949 | 1.1 | 0.6 | 58.73 | 6,1 km SV af Helgafelli |
| 1 | 2017-10-13 09:50:50+00:00 | 65.124 | -16.288 | 7.2 | 0.9 | 78.51 | 6,1 km NA af Herðubreiðartöglum |
| 2 | 2017-10-13 09:41:09+00:00 | 63.945 | -21.143 | 7.4 | 0.2 | 33.12 | 6,5 km SSA af Hveragerði |
| 3 | 2017-10-13 09:37:45+00:00 | 65.114 | -16.3 | 6.3 | 1.2 | 90.01 | 5,0 km NA af Herðubreiðartöglum |
| 4 | 2017-10-13 09:37:21+00:00 | 65.113 | -16.301 | 5.9 | 1.4 | 90.01 | 4,9 km NA af Herðubreiðartöglum |


If you get a `ValueError` reporting that the protocol is not known, wrap `json.dumps(resp.json()['results'])` in `StringIO` so that pandas receives a file-like object instead of a raw string: 

```python
from io import StringIO
eq = pd.read_json(StringIO(json.dumps(resp.json()['results'])))
```
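
Because `results` is already a list of dictionaries, another option is to pass it straight to the `DataFrame` constructor and skip the re-serialization step. A minimal sketch follows, using two hand-written records in place of a live response (`sample_results` is an illustrative name, and the timestamp strings are assumed to be ISO-formatted). The constructor leaves `timestamp` as plain strings, so the conversion is done explicitly.

```python
import pandas as pd

# Illustrative records shaped like entries under the "results" key
sample_results = [
    {"timestamp": "2017-10-13T12:07:24.000Z", "latitude": 63.976, "longitude": -21.949, "size": 0.6},
    {"timestamp": "2017-10-13T09:50:50.000Z", "latitude": 65.124, "longitude": -16.288, "size": 0.9},
]

# Each dictionary becomes one row; keys become column names
eq_sample = pd.DataFrame(sample_results)

# Parse the timestamp strings into timezone-aware datetimes
eq_sample["timestamp"] = pd.to_datetime(eq_sample["timestamp"], utc=True)

print(eq_sample.dtypes)
```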

**Football**

We can get information on football events. 



In [28]:
resp = requests.get('https://apis.is/sports/football')
resp

<Response [200]>

In [29]:
resp.json()

{'results': []}

It looks like no upcoming football matches are listed. 

**Currency** 

In [30]:
params = {'source' : 'lb'}
resp = requests.get('https://apis.is/currency', params)
resp

<Response [200]>

In [31]:
resp.json()

{'results': [{'shortName': 'ISK',
   'longName': 'Íslensk króna',
   'value': 1,
   'askValue': 1,
   'bidValue': 1,
   'changeCur': 0,
   'changePer': '0.00'},
  {'shortName': 'USD',
   'longName': 'Bandarískur dalur',
   'value': 143.66,
   'askValue': 144.16,
   'bidValue': 143.16,
   'changeCur': -0.770925,
   'changePer': '-0.01'},
  {'shortName': 'GBP',
   'longName': 'Sterlingspund',
   'value': 172.755,
   'askValue': 173.36,
   'bidValue': 172.15,
   'changeCur': -0.184247,
   'changePer': 0},
  {'shortName': 'EUR',
   'longName': 'Evra',
   'value': 152.3,
   'askValue': 152.83,
   'bidValue': 151.77,
   'changeCur': -0.261045,
   'changePer': 0},
  {'shortName': 'CAD',
   'longName': 'Kanadískur dalur',
   'value': 105.9,
   'askValue': 106.27,
   'bidValue': 105.53,
   'changeCur': -0.505571,
   'changePer': 0},
  {'shortName': 'DKK',
   'longName': 'Dönsk króna',
   'value': 20.463,
   'askValue': 20.535,
   'bidValue': 20.391,
   'changeCur': -0.247741,
   'changePer': '-
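
In the output above, `changePer` appears as a string in some entries and as a number in others, so values are worth normalizing before use. The sketch below builds a lookup table keyed by `shortName` from a few records copied from that output and converts an amount in ISK to another currency. It assumes that `value` is the price of one unit of the currency in ISK; the helper name `isk_to` is hypothetical.

```python
# A few records copied from the output above (trimmed to the fields used here)
sample_rates = [
    {"shortName": "ISK", "value": 1, "changePer": "0.00"},
    {"shortName": "USD", "value": 143.66, "changePer": "-0.01"},
    {"shortName": "GBP", "value": 172.755, "changePer": 0},
]

# Index the records by currency code and coerce mixed string/number fields to float
rates = {
    r["shortName"]: {"value": float(r["value"]), "changePer": float(r["changePer"])}
    for r in sample_rates
}


def isk_to(amount_isk, currency):
    """Convert an ISK amount, assuming 'value' is the ISK price of one unit."""
    return amount_isk / rates[currency]["value"]


print(round(isk_to(14366, "USD"), 2))
```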

**Weather Observations**

In [32]:
params = {'stations': '1'}
resp = requests.get('https://apis.is/weather/observations/en', params)
resp


<Response [200]>

In [33]:
resp.json()

{'results': [{'name': 'Reykjavík',
   'time': '2023-02-27 21:00:00',
   'err': '',
   'link': 'http://en.vedur.is/weather/observations/areas/reykjavik/#group=100&station=1',
   'F': '8',
   'FX': '9',
   'FG': '17',
   'D': 'SE',
   'T': '7.5',
   'W': 'Light drizzle',
   'V': '20',
   'N': '100',
   'P': '1025',
   'RH': '80',
   'SNC': '',
   'SND': '',
   'SED': '',
   'RTE': '',
   'TD': '4.3',
   'R': '0.2',
   'id': '1',
   'valid': '1'}]}

A description of what the codes stand for is available in the API documentation:  
http://docs.apis.is/#endpoint-weather
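
Every value in the observation record arrives as a string, and fields without a reading are empty strings. A small conversion step, sketched below, makes the numeric fields usable. The `sample_obs` dictionary copies a few fields from the output above, and `to_float` is a hypothetical helper.

```python
# A few fields copied from the observation output above
sample_obs = {"name": "Reykjavík", "F": "8", "T": "7.5", "RH": "80", "SNC": "", "TD": "4.3"}


def to_float(text):
    """Convert a numeric string to float; return None for empty or non-numeric text."""
    try:
        return float(text)
    except (TypeError, ValueError):
        return None


numeric_fields = ["F", "T", "RH", "SNC", "TD"]
parsed = {key: to_float(sample_obs[key]) for key in numeric_fields}
print(parsed)
```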

<!-- BEGIN QUESTION -->

## Exercise 2 

Look at forecasts for the Reykjavík weather station (station : 1).  Report the mean temperature and mean wind speed for the next 24 forecasts. 

In [34]:
# Use the Iceland API to collect weather forecasts for the Reykjavik station. 
# Report the mean temperature and mean wind speed for the next 24 forecasts. 

params = {"stations": 1}
url = "http://apis.is/weather/forecasts/en/"

resp = requests.get(url, params=params)

# Get the next 24 forecasts (fewer if the API returns fewer)
forecasts = resp.json()['results'][0]['forecast'][:24]

# Calculate the mean temperature and mean wind speed over those forecasts
meanTemp = sum(float(f["T"]) for f in forecasts) / len(forecasts)
meanWS = sum(float(f["F"]) for f in forecasts) / len(forecasts)

print("Mean Temperature (deg C): ", meanTemp)
print("Mean Wind Speed (m/s)   : ", meanWS)

Mean Temperature (deg C):  6.833333333333333
Mean Wind Speed (m/s)   :  10.041666666666666


<!-- END QUESTION -->

## Example: iTunes Content 

Apple has a simple [API](https://affiliate.itunes.apple.com/resources/documentation/itunes-store-web-service-search-api/) for looking up iTunes content.

In [35]:
# api-endpoint
url = 'https://itunes.apple.com/search'

# For example let's search for lord of the rings ebooks 
params = {'term': 'lord+of+the+rings', 'entity': 'ebook' }

resp = requests.get(url, params)
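
One detail worth checking is how the search term is encoded. With the standard library's `urlencode` (and `requests` encodes `params` values in a comparable way), a literal `+` in a value is percent-encoded as `%2B`, while a space becomes `+`. The sketch below compares the two spellings without contacting the API; if a search returns fewer results than expected, passing the term with plain spaces may be worth trying.

```python
from urllib.parse import urlencode

# Term written with literal plus signs, as in the cell above
with_plus = urlencode({"term": "lord+of+the+rings", "entity": "ebook"})

# Term written with plain spaces, leaving the encoding to the library
with_spaces = urlencode({"term": "lord of the rings", "entity": "ebook"})

print(with_plus)
print(with_spaces)
```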

In [36]:
resp

<Response [200]>

In [37]:
print(resp.text)




{
 "resultCount":1,
 "results": [
{"artistIds":[482333908], 
"description":"\" With New Line Cinema's production of The Lord of the Rings film trilogy, the popularity of the works of J.R.R. Tolkien is unparalleled. Tolkien's books continue to be bestsellers decades after their original publication. An epic in league with those of Spenser and Malory, The Lord of the Rings trilogy, begun during Hitler's rise to power, celebrates the insignificant individual as hero in the modern world. Jane Chance's critical appraisal of Tolkien's heroic masterwork is the first to explore its \"mythology of power\"–that is, how power, politics, and language interact. Chance looks beyond the fantastic, self-contained world of Middle-earth to the twentieth-century parallels presented in the trilogy.", "trackId":739542595, "trackName":"Lord of the Rings", "genreIds":["10084", "38", "9031"], "releaseDate":"2001-10-26T07:00:00Z", "currency":"USD", "fileSizeBytes":501517, "artistViewUrl":"https://books.appl

<!-- BEGIN QUESTION -->

## Exercise 3

Search for up to 50 "The Expanse" e-books (the search may return fewer). Create a DataFrame from the response containing the `trackName`, `trackId`, `price`, and `averageUserRating`. Sort the results from highest to lowest price.

In [38]:
url = 'https://itunes.apple.com/search'

# Search for "The Expanse" ebooks

params = {'term': 'expanse', 'entity': 'ebook' }
resp = requests.get(url, params) 

response = resp.json()['results']

In [39]:
response

[{'artistIds': [570545592],
  'description': 'Many double-spread photos of trees and the landscape taken in 2012-13. Virtually no text, the pictures have to say it all.',
  'trackId': 627422853,
  'trackName': 'Expanse',
  'genreIds': ['10092', '38', '9007'],
  'releaseDate': '2013-03-24T07:00:00Z',
  'currency': 'USD',
  'fileSizeBytes': 249512361,
  'artistViewUrl': 'https://books.apple.com/us/artist/robin-hull/570545592?uo=4',
  'formattedPrice': 'Free',
  'trackCensoredName': 'Expanse',
  'trackViewUrl': 'https://books.apple.com/us/book/expanse/id627422853?uo=4',
  'artworkUrl100': 'https://is4-ssl.mzstatic.com/image/thumb/Publication/v4/7b/b6/5e/7bb65e69-6b59-6944-52ea-8ab91c0e2b80/9781457945212.jpg/100x100bb.jpg',
  'artworkUrl60': 'https://is4-ssl.mzstatic.com/image/thumb/Publication/v4/7b/b6/5e/7bb65e69-6b59-6944-52ea-8ab91c0e2b80/9781457945212.jpg/60x60bb.jpg',
  'price': 0.0,
  'genres': ['Photography', 'Books', 'Arts & Entertainment'],
  'artistId': 570545592,
  'artistName'

In [40]:
obj = json.loads(resp.text)
# Alternatively 
obj2 = resp.json()
obj2

{'resultCount': 50,
 'results': [{'artistIds': [570545592],
   'description': 'Many double-spread photos of trees and the landscape taken in 2012-13. Virtually no text, the pictures have to say it all.',
   'trackId': 627422853,
   'trackName': 'Expanse',
   'genreIds': ['10092', '38', '9007'],
   'releaseDate': '2013-03-24T07:00:00Z',
   'currency': 'USD',
   'fileSizeBytes': 249512361,
   'artistViewUrl': 'https://books.apple.com/us/artist/robin-hull/570545592?uo=4',
   'formattedPrice': 'Free',
   'trackCensoredName': 'Expanse',
   'trackViewUrl': 'https://books.apple.com/us/book/expanse/id627422853?uo=4',
   'artworkUrl100': 'https://is4-ssl.mzstatic.com/image/thumb/Publication/v4/7b/b6/5e/7bb65e69-6b59-6944-52ea-8ab91c0e2b80/9781457945212.jpg/100x100bb.jpg',
   'artworkUrl60': 'https://is4-ssl.mzstatic.com/image/thumb/Publication/v4/7b/b6/5e/7bb65e69-6b59-6944-52ea-8ab91c0e2b80/9781457945212.jpg/60x60bb.jpg',
   'price': 0.0,
   'genres': ['Photography', 'Books', 'Arts & Entertain

Try using at least two approaches to create the DataFrame, e.g., 

* *Method 1* - Keep track of rows in a list, then convert the list to a DataFrame.  Note: do not create an empty DataFrame and append entries to it in a loop (this is not scalable).  
https://stackoverflow.com/questions/13784192/creating-an-empty-pandas-dataframe-and-then-filling-it/41529411#41529411
* *Method 2* - Use the pandas `read_json` function to convert JSON to a pandas object
* *Method 3* - Use the `json_normalize` function, which normalizes semi-structured JSON data into a flat table.   
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html

In [41]:
# Method 1 - Capture items in a list, convert to DataFrame
# Columns: trackName, trackId, price, and averageUserRating

rows = []
# Slice rather than index so that fewer than 50 results does not raise an IndexError
for item in response[:50]:
    # First collect the entries into a dictionary
    data_dict = {}
    data_dict['trackName'] = item['trackName']
    data_dict['trackId'] = item['trackId']
    data_dict['price'] = item.get('price', 0.0)
    data_dict['averageUserRating'] = item.get('averageUserRating', 0.0)
    
    # Append the dictionary to a list
    rows.append(data_dict)

# Convert the list of dictionaries (rows) into a dataframe
q3df1 = pd.DataFrame(rows)

# Sort the DataFrame by price in descending order
q3df1.sort_values('price', ascending=False, inplace=True)

q3df1.shape

(50, 4)

In [42]:
q3df1.head(15)

|   | trackName | trackId | price | averageUserRating |
|---|---|---|---|---|
| 17 | Termination Shock | 1553747973 | 16.99 | 4.0 |
| 26 | Memory's Legion | 1571083431 | 14.99 | 4.5 |
| 3 | Hillbilly Elegy | 1345038422 | 12.99 | 4.5 |
| 7 | Babylon's Ashes | 1063592195 | 11.99 | 4.5 |
| 8 | Persepolis Rising | 1215092254 | 11.99 | 4.5 |
| 9 | Tiamat's Wrath | 1367091224 | 11.99 | 4.5 |
| 41 | The Eternity Artifact | 380490509 | 11.99 | 4.5 |
| 38 | A Very Large Expanse of Sea | 1330481470 | 10.99 | 4.5 |
| 46 | Defy the Fates | 1451478385 | 10.99 | 4.5 |
| 43 | There Before the Chaos | 1344714317 | 9.99 | 4.5 |


In [43]:
# Method 2 - Use the pandas read_json function to parse the JSON records
# read_json expects a file-like object, so wrap the JSON string in StringIO
# from io import StringIO  # Already imported at top of file 

json_data = StringIO(json.dumps(response))

q3df2 = pd.read_json(json_data, orient="records")
q3df2 = q3df2[['trackName', 'trackId', 'price', 'averageUserRating']]

# Sort the DataFrame by price in descending order
q3df2.sort_values('price', ascending=False, inplace=True)

# q3df2.head()

q3df2.shape

(50, 4)

In [44]:
q3df2.head(15)

|   | trackName | trackId | price | averageUserRating |
|---|---|---|---|---|
| 17 | Termination Shock | 1553747973 | 16.99 | 4.0 |
| 26 | Memory's Legion | 1571083431 | 14.99 | 4.5 |
| 3 | Hillbilly Elegy | 1345038422 | 12.99 | 4.5 |
| 7 | Babylon's Ashes | 1063592195 | 11.99 | 4.5 |
| 8 | Persepolis Rising | 1215092254 | 11.99 | 4.5 |
| 9 | Tiamat's Wrath | 1367091224 | 11.99 | 4.5 |
| 41 | The Eternity Artifact | 380490509 | 11.99 | 4.5 |
| 38 | A Very Large Expanse of Sea | 1330481470 | 10.99 | 4.5 |
| 46 | Defy the Fates | 1451478385 | 10.99 | 4.5 |
| 43 | There Before the Chaos | 1344714317 | 9.99 | 4.5 |


In [45]:
# Method 3 - Use json_normalize function
# The json_normalize function normalizes a semi-structured JSON data object into a flat table. 
# https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.json_normalize.html

# Flatten the list of records stored under the "results" key.
# json_normalize accepts the parsed list directly, so no read_json step is needed.
q3df3 = pd.json_normalize(resp.json()['results'])

# Extract the needed columns from the dataset
q3df3 = q3df3[['trackName', 'trackId', 'price', 'averageUserRating']]

# Sort the DataFrame by price in descending order
q3df3.sort_values('price', ascending=False, inplace=True)
q3df3.head()
# q3df3.shape

|   | trackName | trackId | price | averageUserRating |
|---|---|---|---|---|
| 17 | Termination Shock | 1553747973 | 16.99 | 4.0 |
| 26 | Memory's Legion | 1571083431 | 14.99 | 4.5 |
| 3 | Hillbilly Elegy | 1345038422 | 12.99 | 4.5 |
| 7 | Babylon's Ashes | 1063592195 | 11.99 | 4.5 |
| 8 | Persepolis Rising | 1215092254 | 11.99 | 4.5 |


In [46]:
q3df3.head(15)

|   | trackName | trackId | price | averageUserRating |
|---|---|---|---|---|
| 17 | Termination Shock | 1553747973 | 16.99 | 4.0 |
| 26 | Memory's Legion | 1571083431 | 14.99 | 4.5 |
| 3 | Hillbilly Elegy | 1345038422 | 12.99 | 4.5 |
| 7 | Babylon's Ashes | 1063592195 | 11.99 | 4.5 |
| 8 | Persepolis Rising | 1215092254 | 11.99 | 4.5 |
| 9 | Tiamat's Wrath | 1367091224 | 11.99 | 4.5 |
| 41 | The Eternity Artifact | 380490509 | 11.99 | 4.5 |
| 38 | A Very Large Expanse of Sea | 1330481470 | 10.99 | 4.5 |
| 46 | Defy the Fates | 1451478385 | 10.99 | 4.5 |
| 43 | There Before the Chaos | 1344714317 | 9.99 | 4.5 |


<!-- END QUESTION -->

## Example: TV Shows 

Here we can use an API that provides TV show information:  
http://api.tvmaze.com/

In [47]:
# We can find the TVmaze id for a show based on its IMDb id. 
imdb_id = 'tt3032476'
resp = requests.get('http://api.tvmaze.com/lookup/shows', params={'imdb': imdb_id})

In [48]:
resp.json()

{'id': 618,
 'url': 'https://www.tvmaze.com/shows/618/better-call-saul',
 'name': 'Better Call Saul',
 'type': 'Scripted',
 'language': 'English',
 'genres': ['Drama', 'Crime', 'Legal'],
 'status': 'Ended',
 'runtime': 60,
 'averageRuntime': 64,
 'premiered': '2015-02-08',
 'ended': '2022-08-15',
 'officialSite': 'https://www.amc.com/shows/better-call-saul--1002228',
 'schedule': {'time': '21:00', 'days': ['Monday']},
 'rating': {'average': 8.6},
 'weight': 99,
 'network': {'id': 20,
  'name': 'AMC',
  'country': {'name': 'United States',
   'code': 'US',
   'timezone': 'America/New_York'},
  'officialSite': None},
 'webChannel': None,
 'dvdCountry': None,
 'externals': {'tvrage': 37780, 'thetvdb': 273181, 'imdb': 'tt3032476'},
 'image': {'medium': 'https://static.tvmaze.com/uploads/images/medium_portrait/399/998743.jpg',
  'original': 'https://static.tvmaze.com/uploads/images/original_untouched/399/998743.jpg'},
 'summary': '<p><b>Better Call Saul</b> is the prequel to the award-winni

We now know the TVmaze ID for Better Call Saul is **618**. 
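
The show record is nested: `network` contains `country`, and `rating` contains `average`. The `json_normalize` function listed among the methods earlier in this lab flattens such records into dotted column names. A minimal sketch, using a trimmed hand-copied subset of the response above (`sample_show` is an illustrative name):

```python
import pandas as pd

# Trimmed subset of the show record shown above
sample_show = {
    "id": 618,
    "name": "Better Call Saul",
    "rating": {"average": 8.6},
    "network": {"name": "AMC", "country": {"code": "US"}},
}

# Nested keys are joined with '.' to form flat column names
flat = pd.json_normalize(sample_show)
print(sorted(flat.columns))
```

Each nested value then sits in its own column, for example `flat["network.country.code"]`.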

## Exercise 4 

Calculate and report the min, mean, and max running time for Better Call Saul episodes by season in a DataFrame. 
Rows are indexed by season number (e.g., 1, 2, 3, ...) and Columns are "Min", "Mean", and "Max". 

Suggestion: 

* Create DataFrame of episodes information 
* Consider using the endpoint - http://www.tvmaze.com/api#show-episode-list
* Then use .groupby function to group by season
* Construct final DataFrame "bcs" to contain the requested information

In [49]:
# Create a DataFrame "bcs" with row index of season number and 
#  columns of "Min", "Mean" and "Max" running time of the episodes for that season.

resp = requests.get('http://api.tvmaze.com/shows/618/episodes')
response = resp.json()

bcs = pd.DataFrame(response)
#bcs.info()

# Keep only the needed columns; copy so the assignment below modifies an independent frame
bcs = bcs[['season', 'runtime']].copy()

# Convert the runtime column from object to numeric values
bcs['runtime'] = pd.to_numeric(bcs['runtime'])

# Group the episodes by season
bcs = bcs.groupby('season').agg({'runtime': ['min', 'mean', 'max']})

# Rename the columns of the dataframe
bcs.columns = ["Min", "Mean", "Max"]

bcs

| season | Min | Mean | Max |
|---|---|---|---|
| 1 | 60 | 60.0 | 60 |
| 2 | 60 | 60.0 | 60 |
| 3 | 70 | 71.3 | 77 |
| 4 | 60 | 60.0 | 60 |
| 5 | 60 | 72.3 | 85 |
| 6 | 60 | 60.923077 | 72 |
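
An equivalent way to get the renamed columns in one step is named aggregation, where each keyword becomes an output column. The sketch below uses a small made-up set of episodes (`sample_episodes`) rather than the live episode list.

```python
import pandas as pd

# Made-up episode runtimes for two seasons (illustrative only)
sample_episodes = pd.DataFrame({
    "season": [1, 1, 2, 2, 2],
    "runtime": [60, 62, 58, 60, 65],
})

# Each keyword names an output column and the aggregation that fills it
summary = sample_episodes.groupby("season")["runtime"].agg(Min="min", Mean="mean", Max="max")
print(summary)
```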


In [20]:
grader.check("q4")

ValueError: Tests directory does not exist and no notebook path provided