# 1. Retrieve the data, and examine it

In [29]:
import requests

r = requests.get('http://linserv1.cims.nyu.edu:10000/films?_page=1')
d = r.json()  # parses the JSON response body into a list of dicts (one per film)

In [30]:
# get the first 3 elements
d[:3]

[{'id': '2baf70d1-42bb-4437-b551-e5fed5a87abe',
  'title': 'Castle in the Sky',
  'original_title': '天空の城ラピュタ',
  'original_title_romanised': 'Tenkū no shiro Rapyuta',
  'description': "The orphan Sheeta inherited a mysterious crystal that links her to the mythical sky-kingdom of Laputa. With the help of resourceful Pazu and a rollicking band of sky pirates, she makes her way to the ruins of the once-great civilization. Sheeta and Pazu must outwit the evil Muska, who plans to use Laputa's science to make himself ruler of the world.",
  'director': 'Hayao Miyazaki',
  'producer': 'Isao Takahata',
  'release_date': '1986',
  'running_time': '124',
  'rt_score': '95',
  'people': ['https://ghibliapi.herokuapp.com/people/'],
  'species': ['https://ghibliapi.herokuapp.com/species/af3910a6-429f-4c74-9ad5-dfe1c4aa04f2'],
  'locations': ['https://ghibliapi.herokuapp.com/locations/'],
  'vehicles': ['https://ghibliapi.herokuapp.com/vehicles/'],
  'url': 'https://ghibliapi.herokuapp.com/films/2b

To create the report specified above, I am interested in the following keys:
- director
- rt_score
- title
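
As a minimal sketch of how the three keys listed above could be pulled out of each record, the helper below trims every film dict down to those keys. The name `pick_keys` and the shortened `sample` record are illustrative assumptions, not part of the original notebook.

```python
# Keys needed for the report (taken from the list above).
KEYS = ("director", "rt_score", "title")


def pick_keys(films, keys=KEYS):
    """Return a list of dicts that contain only the requested keys."""
    return [{k: film[k] for k in keys} for film in films]


# Stand-in record shaped like one element of `d` (shortened for the example).
sample = [{
    "id": "2baf70d1-42bb-4437-b551-e5fed5a87abe",
    "title": "Castle in the Sky",
    "director": "Hayao Miyazaki",
    "producer": "Isao Takahata",
    "rt_score": "95",
}]

trimmed = pick_keys(sample)
print(trimmed)
```

In the notebook itself, `pick_keys(d)` would be applied to the list returned by `r.json()`.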

In [31]:
r_2 = requests.get('http://linserv1.cims.nyu.edu:10000/films?_page=2')
d_2 = r_2.json()  # parses the JSON response body into a list of dicts

In [32]:
d_2[:3]

[{'id': 'dc2e6bd1-8156-4886-adff-b39e6043af0c',
  'title': 'Spirited Away',
  'original_title': '千と千尋の神隠し',
  'original_title_romanised': 'Sen to Chihiro no kamikakushi',
  'description': 'Spirited Away is an Oscar winning Japanese animated film about a ten year old girl who wanders away from her parents along a path that leads to a world ruled by strange and unusual monster-like animals. Her parents have been changed into pigs along with others inside a bathhouse full of these creatures. Will she ever see the world how it once was?',
  'director': 'Hayao Miyazaki',
  'producer': 'Toshio Suzuki',
  'release_date': '2001',
  'running_time': '124',
  'rt_score': '97',
  'people': ['https://ghibliapi.herokuapp.com/people/'],
  'species': ['https://ghibliapi.herokuapp.com/species/af3910a6-429f-4c74-9ad5-dfe1c4aa04f2'],
  'locations': ['https://ghibliapi.herokuapp.com/locations/'],
  'vehicles': ['https://ghibliapi.herokuapp.com/vehicles/'],
  'url': 'https://ghibliapi.herokuapp.com/films/d

In [35]:
r_3 = requests.get('http://linserv1.cims.nyu.edu:10000/films?_page=3')
d_3 = r_3.json()  # parses the JSON response body into a list of dicts

In [39]:
d_3

[{'id': 'd868e6ec-c44a-405b-8fa6-f7f0f8cfb500',
  'title': 'The Red Turtle',
  'original_title': 'レッドタートル ある島の物語',
  'original_title_romanised': 'Reddotātoru aru shima no monogatari',
  'description': 'A man set adrift by a storm wakes up on a beach. He discovers that he is on a deserted island with plenty of fresh water, fruit and a dense bamboo forest. He builds a raft from bamboo and attempts to sail away, but his raft is destroyed by an unseen monster in the sea, forcing him back to the island. He tries again with another, larger raft, but is again foiled by the creature. A third attempt again ends with the raft destroyed, but this time he is confronted by a giant red turtle, which stares at him, and forces him back to the island.',
  'director': 'Michaël Dudok de Wit',
  'producer': 'Toshio Suzuki, Isao Takahata, Vincent Maraval, Pascal Caucheteux, Grégoire Sorlat',
  'release_date': '2016',
  'running_time': '80',
  'rt_score': '93',
  'people': ['https://ghibliapi.herokuapp.com/

In [40]:
r_4 = requests.get('http://linserv1.cims.nyu.edu:10000/films?_page=4')
d_4 = r_4.json()  # parses the JSON response body into a list (empty for this page)

In [41]:
d_4[:3]

[]

- Incrementing the `_page` query parameter in the URL returns information about different Studio Ghibli films. Pages 2 and 3 contain data, but page 4 returns an empty list (`[]`), which suggests that there is no content beyond page 3.
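
The manual probing above can be sketched as a small generator that keeps requesting pages until an empty one comes back. This is only an illustration: `iter_pages`, `fake_fetch`, and the `max_pages` safety cap are hypothetical names, and the fake pages stand in for the real API so the example runs offline.

```python
def iter_pages(fetch_page, start=1, max_pages=100):
    """Yield each non-empty page, stopping at the first empty one.

    `fetch_page` is any callable that takes a page number and returns a list.
    `max_pages` is a safety cap in case the API never returns an empty page.
    """
    for page in range(start, start + max_pages):
        data = fetch_page(page)
        if not data:
            return
        yield data


# Offline stand-in for the API: pages 1-3 have data, later pages are empty.
fake_api = {
    1: [{"title": "Castle in the Sky"}],
    2: [{"title": "Spirited Away"}],
    3: [{"title": "The Red Turtle"}],
}


def fake_fetch(page):
    return fake_api.get(page, [])


films = [film for page in iter_pages(fake_fetch) for film in page]
print(len(films))
```

Against the real service, `fetch_page` could wrap a call such as `requests.get(url, params={"_page": page}).json()`.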

# 2. Load the data into a DataFrame

**a. Make a request to http://linserv1.cims.nyu.edu:10000/films?_page=1 again, but this time, load the result into a DataFrame**

In [42]:
import pandas as pd
import requests

In [44]:
# define the URL and parameters for your API request.
base_url = "http://linserv1.cims.nyu.edu:10000/films"
params = {"_page": 1}  # Initial page number

# create an empty DataFrame to store the collected data.
df = pd.DataFrame()

In [45]:
# set up a loop to make requests and retrieve data until there's no more data.
while True:
    response = requests.get(base_url, params=params)
    
    if response.status_code != 200:
        print(f"Error in API request. Status Code: {response.status_code}")
        break  # Exit the loop if there's an error
    
    data = response.json()  # Assuming the API returns JSON data

    # Check if there's data to add to the DataFrame
    if not data:
        break  # No more data to retrieve

    # Append this page's rows to the DataFrame
    # (DataFrame.append was deprecated and then removed in pandas 2.0, so use pd.concat)
    df = pd.concat([df, pd.DataFrame(data)], ignore_index=True)

    # Increment the page number to get the next page (assuming a "_page" parameter)
    params["_page"] += 1

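
An alternative to growing the DataFrame inside the loop is to collect the raw records first and build the DataFrame once at the end, which avoids re-copying the frame on every page. The sketch below assumes the pages have the same shape as the API responses; the `pages` list and its shortened records are stand-ins so that the example runs without network access.

```python
import pandas as pd

# Stand-in pages shaped like the API responses (shortened, hypothetical records).
pages = [
    [{"title": "Castle in the Sky", "rt_score": "95"}],
    [{"title": "Spirited Away", "rt_score": "97"}],
    [],  # an empty page marks the end of the data
]

records = []
for data in pages:
    if not data:
        break  # no more data to collect
    records.extend(data)  # keep plain dicts until all pages are read

# Build the DataFrame once from the full list of records.
films_df = pd.DataFrame(records)
print(films_df.shape)
```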


In [46]:
df

,id,title,original_title,original_title_romanised,description,director,producer,release_date,running_time,rt_score,people,species,locations,vehicles,url
0,2baf70d1-42bb-4437-b551-e5fed5a87abe,Castle in the Sky,天空の城ラピュタ,Tenkū no shiro Rapyuta,The orphan Sheeta inherited a mysterious cryst...,Hayao Miyazaki,Isao Takahata,1986,124,95,[https://ghibliapi.herokuapp.com/people/],[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/2baf70d1...
1,12cfb892-aac0-4c5b-94af-521852e46d6a,Grave of the Fireflies,火垂るの墓,Hotaru no haka,"In the latter part of World War II, a boy and ...",Isao Takahata,Toru Hara,1988,89,97,[https://ghibliapi.herokuapp.com/people/],[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/12cfb892...
2,58611129-2dbc-4a81-a72f-77ddfc1b1b49,My Neighbor Totoro,となりのトトロ,Tonari no Totoro,Two sisters move to the country with their fat...,Hayao Miyazaki,Hayao Miyazaki,1988,86,93,[https://ghibliapi.herokuapp.com/people/986faa...,[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/58611129...
3,ea660b10-85c4-4ae3-8a5f-41cea3648e3e,Kiki's Delivery Service,魔女の宅急便,Majo no takkyūbin,"A young witch, on her mandatory year of indepe...",Hayao Miyazaki,Hayao Miyazaki,1989,102,96,[https://ghibliapi.herokuapp.com/people/],[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/ea660b10...
4,4e236f34-b981-41c3-8c65-f8c9000b94e7,Only Yesterday,おもひでぽろぽろ,Omoide poro poro,"It’s 1982, and Taeko is 27 years old, unmarrie...",Isao Takahata,Toshio Suzuki,1991,118,100,[https://ghibliapi.herokuapp.com/people/],[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/4e236f34...
5,ebbb6b7c-945c-41ee-a792-de0e43191bd8,Porco Rosso,紅の豚,Kurenai no buta,"Porco Rosso, known in Japan as Crimson Pig (Ku...",Hayao Miyazaki,Toshio Suzuki,1992,93,94,[https://ghibliapi.herokuapp.com/people/],[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/ebbb6b7c...
6,1b67aa9a-2e4a-45af-ac98-64d6ad15b16c,Pom Poko,平成狸合戦ぽんぽこ,Heisei tanuki gassen Ponpoko,As the human city development encroaches on th...,Isao Takahata,Toshio Suzuki,1994,119,78,[https://ghibliapi.herokuapp.com/people/],[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/1b67aa9a...
7,ff24da26-a969-4f0e-ba1e-a122ead6c6e3,Whisper of the Heart,耳をすませば,Mimi wo sumaseba,"Shizuku lives a simple life, dominated by her ...",Yoshifumi Kondō,Toshio Suzuki,1995,111,91,[https://ghibliapi.herokuapp.com/people/],[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/ff24da26...
8,0440483e-ca0e-4120-8c50-4c8cd9b965d6,Princess Mononoke,もののけ姫,Mononoke hime,"Ashitaka, a prince of the disappearing Ainu tr...",Hayao Miyazaki,Toshio Suzuki,1997,134,92,[https://ghibliapi.herokuapp.com/people/ba9246...,[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/0440483e...
9,45204234-adfd-45cb-a505-a8e7a676b114,My Neighbors the Yamadas,ホーホケキョ となりの山田くん,Hōhokekyo tonari no Yamada-kun,The Yamadas are a typical middle class Japanes...,Isao Takahata,Toshio Suzuki,1999,104,75,[https://ghibliapi.herokuapp.com/people/],[https://ghibliapi.herokuapp.com/species/af391...,[https://ghibliapi.herokuapp.com/locations/],[https://ghibliapi.herokuapp.com/vehicles/],https://ghibliapi.herokuapp.com/films/45204234...


# 3. Report

In [64]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21 entries, 0 to 20
Data columns (total 15 columns):
 #   Column                    Non-Null Count  Dtype 
---  ------                    --------------  ----- 
 0   id                        21 non-null     object
 1   title                     21 non-null     object
 2   original_title            21 non-null     object
 3   original_title_romanised  21 non-null     object
 4   description               21 non-null     object
 5   director                  21 non-null     object
 6   producer                  21 non-null     object
 7   release_date              21 non-null     object
 8   running_time              21 non-null     object
 9   rt_score                  21 non-null     object
 10  people                    21 non-null     object
 11  species                   21 non-null     object
 12  locations                 21 non-null     object
 13  vehicles                  21 non-null     object
 14  url                       21

- We can see that the `rt_score` column has dtype `object` because the scores are stored as strings. Let's convert it to a numeric type, since we will use its values in calculations later.

In [65]:
df['rt_score'] = pd.to_numeric(df['rt_score'], errors='coerce')
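
One thing to keep in mind with `errors='coerce'` is that any value that cannot be parsed is silently replaced by `NaN`. A rough sketch of how that could be checked is shown below; the `"n/a"` entry is a made-up value used only to trigger the coercion and does not appear in the actual data.

```python
import pandas as pd

# String scores as returned by the API, plus one hypothetical bad value.
scores = pd.Series(["95", "97", "n/a"], name="rt_score")

converted = pd.to_numeric(scores, errors="coerce")

# Count how many values could not be parsed and became NaN.
n_bad = int(converted.isna().sum())
print(converted.dtype, n_bad)
```

If `n_bad` is zero, as the `21 non-null` count for this dataset suggests, no scores were lost in the conversion.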

In [66]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 21 entries, 0 to 20
Data columns (total 15 columns):
 #   Column                    Non-Null Count  Dtype 
---  ------                    --------------  ----- 
 0   id                        21 non-null     object
 1   title                     21 non-null     object
 2   original_title            21 non-null     object
 3   original_title_romanised  21 non-null     object
 4   description               21 non-null     object
 5   director                  21 non-null     object
 6   producer                  21 non-null     object
 7   release_date              21 non-null     object
 8   running_time              21 non-null     object
 9   rt_score                  21 non-null     int64 
 10  people                    21 non-null     object
 11  species                   21 non-null     object
 12  locations                 21 non-null     object
 13  vehicles                  21 non-null     object
 14  url                       21

- Now, let's create a report

In [67]:
# Group the data by the directors' names
director_group = df.groupby('director')

# Calculate the average Rotten Tomatoes score and count of films for each director
director_report = director_group.agg({'rt_score': 'mean', 'title': 'count'})

In [68]:
director_report

director,rt_score,title
Gorō Miyazaki,62.0,2
Hayao Miyazaki,92.777778,9
Hiromasa Yonebayashi,93.5,2
Hiroyuki Morita,89.0,1
Isao Takahata,90.0,5
Michaël Dudok de Wit,93.0,1
Yoshifumi Kondō,91.0,1


In [69]:
# Rename the columns for clarity
director_report.rename(columns={'rt_score': 'avg_rt_score', 'title': 'count'}, inplace=True)

# groupby already uses the director names as the index; name the index explicitly
director_report.index.name = 'director'

# Sort the report by the average Rotten Tomatoes score in descending order
director_report = director_report.sort_values(by='avg_rt_score', ascending=False)

In [71]:
director_report

director,avg_rt_score,count
Hiromasa Yonebayashi,93.5,2
Michaël Dudok de Wit,93.0,1
Hayao Miyazaki,92.777778,9
Yoshifumi Kondō,91.0,1
Isao Takahata,90.0,5
Hiroyuki Morita,89.0,1
Gorō Miyazaki,62.0,2
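
The group, aggregate, rename, and sort steps above could also be written as one chain using pandas named aggregation. This is a sketch on a three-film subset of the data (scores taken from the table in section 2), not a replacement for the cells above; the variable names `films` and `report` are illustrative.

```python
import pandas as pd

# Small subset of the films, with rt_score already numeric.
films = pd.DataFrame({
    "director": ["Hayao Miyazaki", "Hayao Miyazaki", "Isao Takahata"],
    "title": ["Castle in the Sky", "My Neighbor Totoro", "Grave of the Fireflies"],
    "rt_score": [95, 93, 97],
})

report = (
    films.groupby("director")
    # Named aggregation: output column name = (input column, function)
    .agg(avg_rt_score=("rt_score", "mean"), count=("title", "count"))
    .sort_values("avg_rt_score", ascending=False)
)
print(report)
```

Named aggregation sets the final column names directly, so the separate `rename` step is not needed.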
