# Project 2: Data Import - Working with Web APIs and JSON (Movies Dataset)

# Project Brief for Self-Coders

Here you'll have the opportunity to code major parts of Project 2 on your own. If you need any help or inspiration, have a look at the videos or the Jupyter Notebook with the full code. <br> <br>
Keep in mind that it's all about __getting the right results/conclusions__. It's not about writing identical code. Things can be coded in many different ways. Even if you come to the same conclusions, it's very unlikely that we have the very same code.

## Importing Data from JSON files 

1. __Import__ the json files __blockbusters.json__, __blockbusters2.json__, __blockbusters3.json__ and load the datasets into Pandas DataFrames.


In [9]:
import pandas as pd
import json
pd.options.display.max_columns = 30

In [10]:
with open("blockbusters.json") as f:
    data = json.load(f)

In [11]:
data

[{'title': 'Avengers: Endgame',
  'id': 299534,
  'revenue': 2797800564,
  'genres': [{'id': 12, 'name': 'Adventure'},
   {'id': 878, 'name': 'Science Fiction'},
   {'id': 28, 'name': 'Action'}],
  'belongs_to_collection': {'id': 86311,
   'name': 'The Avengers Collection',
   'poster_path': '/yFSIUVTCvgYrpalUktulvk3Gi5Y.jpg',
   'backdrop_path': '/zuW6fOiusv4X9nnW3paHGfXcSll.jpg'},
  'runtime': 181},
 {'title': 'Avatar',
  'id': 19995,
  'revenue': 2787965087,
  'genres': [{'id': 28, 'name': 'Action'},
   {'id': 12, 'name': 'Adventure'},
   {'id': 14, 'name': 'Fantasy'},
   {'id': 878, 'name': 'Science Fiction'}],
  'belongs_to_collection': {'id': 87096,
   'name': 'Avatar Collection',
   'poster_path': '/nslJVsO58Etqkk17oXMuVK4gNOF.jpg',
   'backdrop_path': '/8nCr9W7sKus2q9PLbYsnT7iCkuT.jpg'},
  'runtime': 162},
 {'title': 'Star Wars: The Force Awakens',
  'id': 140607,
  'revenue': 2068223624,
  'genres': [{'id': 28, 'name': 'Action'},
   {'id': 12, 'name': 'Adventure'},
   {'id': 8

In [12]:
df2 = pd.read_json('blockbusters2.json', orient = 'columns')
df2.head()

Unnamed: 0,title,id,revenue,genres,belongs_to_collection,runtime
0,Avengers: Endgame,299534,2797800564,"[{'id': 12, 'name': 'Adventure'}, {'id': 878, ...","{'id': 86311, 'name': 'The Avengers Collection...",181
1,Avatar,19995,2787965087,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...","{'id': 87096, 'name': 'Avatar Collection', 'po...",162
2,Star Wars: The Force Awakens,140607,2068223624,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...","{'id': 10, 'name': 'Star Wars Collection', 'po...",136
3,Avengers: Infinity War,299536,2046239637,"[{'id': 12, 'name': 'Adventure'}, {'id': 28, '...","{'id': 86311, 'name': 'The Avengers Collection...",149
4,Titanic,597,1845034188,"[{'id': 18, 'name': 'Drama'}, {'id': 10749, 'n...",,194


In [5]:
df3 = pd.read_json('blockbusters3.json', orient = 'split')
df3.head()

Unnamed: 0,title,id,revenue,genres,belongs_to_collection,runtime
0,Avengers: Endgame,299534,2797800564,"[{'id': 12, 'name': 'Adventure'}, {'id': 878, ...","{'id': 86311, 'name': 'The Avengers Collection...",181
1,Avatar,19995,2787965087,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...","{'id': 87096, 'name': 'Avatar Collection', 'po...",162
2,Star Wars: The Force Awakens,140607,2068223624,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...","{'id': 10, 'name': 'Star Wars Collection', 'po...",136
3,Avengers: Infinity War,299536,2046239637,"[{'id': 12, 'name': 'Adventure'}, {'id': 28, '...","{'id': 86311, 'name': 'The Avengers Collection...",149
4,Titanic,597,1845034188,"[{'id': 18, 'name': 'Drama'}, {'id': 10749, 'n...",,194


## Working with APIs and JSON (Part 1)

2. __Create an account__ on https://www.themoviedb.org/

3. Get your personal __API Key__

4. __API-Request__ (movie module): Load all available information for the movie with __movie id = 140607__ into a Pandas DataFrame. <br> See https://developers.themoviedb.org/3/movies/get-movie-details for more information

In [6]:
import config
import requests
api_key  = 'api_key=' + config.api_key
api_key

'api_key=f87c31e443414f56263b3306d1fada82'
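
The `config` module imported above is not shown in this notebook. One possible stand-in, purely as an assumption, is to read the key from an environment variable so that it never appears in the code. The variable name `TMDB_API_KEY`, the helper `load_api_key` and the placeholder value `YOUR_API_KEY` are hypothetical and not part of the original project.

```python
import os


def load_api_key(env_var="TMDB_API_KEY"):
    """Return the API key stored in an environment variable (hypothetical name)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError("Set the {} environment variable first.".format(env_var))
    return key


# Demo only: fall back to a placeholder if no key is set in this session.
os.environ.setdefault("TMDB_API_KEY", "YOUR_API_KEY")

# Same query-string fragment as in the cell above.
api_key_param = "api_key=" + load_api_key()
print(api_key_param.split("=")[0])
```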

In [7]:
movie_id = 140607

In [8]:
movie_api = "https://api.themoviedb.org/3/movie/{}?"
url = movie_api.format(movie_id) + api_key
url

'https://api.themoviedb.org/3/movie/140607?api_key=f87c31e443414f56263b3306d1fada82'
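
The URL above is built by concatenating strings. As a sketch of an alternative, the standard library's `urlencode` can assemble the query string, which also takes care of escaping special characters. `build_movie_url` and the placeholder key `YOUR_API_KEY` are illustrative names, not part of the original notebook.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.themoviedb.org/3"


def build_movie_url(movie_id, api_key):
    """Return the movie-module URL for a single movie id."""
    query = urlencode({"api_key": api_key})
    return "{}/movie/{}?{}".format(BASE_URL, movie_id, query)


# Placeholder key; replace it with your personal API key.
movie_url = build_movie_url(140607, "YOUR_API_KEY")
print(movie_url)
```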

In [9]:
r = requests.get(url)
r

<Response [200]>

In [10]:
data = r.json()
type(data)

dict

In [11]:
data_df = pd.Series(data)
data_df = data_df.to_frame().T
data_df

Unnamed: 0,adult,backdrop_path,belongs_to_collection,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,popularity,poster_path,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count
0,False,/k6EOrckWFuz7I4z4wiRwz8zsj4H.jpg,"{'id': 10, 'name': 'Star Wars Collection', 'po...",245000000,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",http://www.starwars.com/films/star-wars-episod...,140607,tt2488496,en,Star Wars: The Force Awakens,Thirty years after defeating the Galactic Empi...,46.289,/wqnLdwVXoBjKibFRR5U3y0aDUhs.jpg,"[{'id': 1, 'logo_path': '/o86DbpburjxrqAzEDhXZ...","[{'iso_3166_1': 'US', 'name': 'United States o...",2015-12-15,2068223624,136,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Every generation has a story.,Star Wars: The Force Awakens,False,7.4,15717


In [24]:
pd.json_normalize(data, sep = "_")

Unnamed: 0,adult,backdrop_path,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,popularity,poster_path,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count,belongs_to_collection_id,belongs_to_collection_name,belongs_to_collection_poster_path,belongs_to_collection_backdrop_path
0,False,/k6EOrckWFuz7I4z4wiRwz8zsj4H.jpg,245000000,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",http://www.starwars.com/films/star-wars-episod...,140607,tt2488496,en,Star Wars: The Force Awakens,Thirty years after defeating the Galactic Empi...,46.289,/wqnLdwVXoBjKibFRR5U3y0aDUhs.jpg,"[{'id': 1, 'logo_path': '/o86DbpburjxrqAzEDhXZ...","[{'iso_3166_1': 'US', 'name': 'United States o...",2015-12-15,2068223624,136,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Every generation has a story.,Star Wars: The Force Awakens,False,7.4,15717,10,Star Wars Collection,/iTQHKziZy9pAAY4hHEDCGPaOvFC.jpg,/d8duYyyC9J5T825Hg7grmaabfxQ.jpg


In [22]:
pd.json_normalize(data, record_path = 'genres', meta = 'id', meta_prefix = 'movies_')

Unnamed: 0,id,name,movies_id
0,28,Action,140607
1,12,Adventure,140607
2,878,Science Fiction,140607
3,14,Fantasy,140607
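
The two `pd.json_normalize` calls above can be tried without an API request. The following sketch uses a shortened copy of the Avatar record from the blockbusters output (the two image paths in `belongs_to_collection` are left out) and shows both uses: flattening a nested dict into columns, and turning a nested list into one row per element.

```python
import pandas as pd

# Shortened sample record (image paths omitted), copied from the output above.
movie = {
    "title": "Avatar",
    "id": 19995,
    "revenue": 2787965087,
    "genres": [
        {"id": 28, "name": "Action"},
        {"id": 12, "name": "Adventure"},
        {"id": 14, "name": "Fantasy"},
        {"id": 878, "name": "Science Fiction"},
    ],
    "belongs_to_collection": {"id": 87096, "name": "Avatar Collection"},
    "runtime": 162,
}

# Nested dict: each inner key becomes its own column, joined with the separator.
flat = pd.json_normalize(movie, sep="_")
print(flat.columns.tolist())

# Nested list: one row per genre, with the movie id carried along as metadata.
# meta_prefix avoids a name clash between the movie "id" and the genre "id".
genres = pd.json_normalize(movie, record_path="genres", meta="id", meta_prefix="movies_")
print(genres)
```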


## Working with APIs and JSON (Part 2)

5. __API-Request__ (discover module): Load all movies with __release date between 2020-01-01 and 2020-02-29__ into a Pandas DataFrame. <br>
See https://www.themoviedb.org/documentation/api/discover and https://developers.themoviedb.org/3/discover/movie-discover for more information.

In [57]:
date_api = 'https://api.themoviedb.org/3/discover/movie?{}&primary_release_date.gte={}&primary_release_date.lte={}'
date1 = '2020-01-01'
date2 = '2020-02-29'
url2 = date_api.format(api_key, date1,date2)
url2

'https://api.themoviedb.org/3/discover/movie?api_key=f87c31e443414f56263b3306d1fada82&primary_release_date.gte=2020-01-01&primary_release_date.lte=2020-02-29'
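
The discover URL above could also be assembled from a dict of query parameters, which keeps filter names such as `primary_release_date.gte` next to their values. This is only a sketch: `build_discover_url` and the placeholder key `YOUR_API_KEY` are illustrative names, not part of the original notebook.

```python
from urllib.parse import urlencode

DISCOVER_URL = "https://api.themoviedb.org/3/discover/movie"


def build_discover_url(api_key, filters):
    """Return a discover-module URL from an API key and a dict of filters."""
    params = {"api_key": api_key}
    params.update(filters)
    return DISCOVER_URL + "?" + urlencode(params)


date_filters = {
    "primary_release_date.gte": "2020-01-01",
    "primary_release_date.lte": "2020-02-29",
}

# Placeholder key; replace it with your personal API key.
discover_url = build_discover_url("YOUR_API_KEY", date_filters)
print(discover_url)
```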

In [58]:
r2 = requests.get(url2)

In [59]:
data2 = r2.json()
type(data2)

dict

In [60]:
data2_df = pd.DataFrame(data2['results'])
data2_df.head()

Unnamed: 0,adult,backdrop_path,genre_ids,id,original_language,original_title,overview,popularity,poster_path,release_date,title,video,vote_average,vote_count
0,False,/jiqD14fg7UTZOT6qgvzTmfRYpWI.jpg,"[28, 80]",495764,en,Birds of Prey (and the Fantabulous Emancipatio...,"Harley Quinn joins forces with a singer, an as...",405.833,/h4VB6m0RwcicVEZvzftYZyKXs6K.jpg,2020-02-05,Birds of Prey (and the Fantabulous Emancipatio...,False,7.1,7415
1,False,/3N316jUSdhvPyYTW29G4v9ebbcS.jpg,"[53, 28, 80]",38700,en,Bad Boys for Life,Marcus and Mike are forced to confront new thr...,388.348,/y95lQLnuNKdPAzw9F9Ab8kJ80c3.jpg,2020-01-15,Bad Boys for Life,False,7.2,6225
2,False,/1umKVgbjFG5Cho5ZKTpcvRFJjuJ.jpg,"[35, 53, 80]",609242,es,El robo del siglo,"In 2006, a group of thieves performed what is ...",325.219,/aSGwXbaTMxUhrfXT6xyZKqoklfB.jpg,2020-01-16,The Heist of the Century,False,7.9,484
3,False,/6mKAKhj8POVGqV1GsroS5mGIUe9.jpg,"[14, 28, 12]",666750,en,Dragonheart: Vengeance,"Lukas, a young farmer whose family is killed b...",304.248,/qs6gz6atyQcAvqC6qZaslOjliUG.jpg,2020-02-04,Dragonheart: Vengeance,False,7.0,218
4,False,/5VKquU8PNujrxLmsYGHf2TCRNFQ.jpg,"[878, 28, 12, 9648, 36, 14]",582306,en,Assassin 33 A.D.,When a billionaire gives a group of young scie...,305.144,/8jDvtdH327I8TgX3UPdkAsZF1dA.jpg,2020-01-24,Assassin 33 A.D.,False,5.2,58


In [62]:
data2_df.sort_values(by = 'release_date', ascending = False).head()

Unnamed: 0,adult,backdrop_path,genre_ids,id,original_language,original_title,overview,popularity,poster_path,release_date,title,video,vote_average,vote_count
8,False,/4br4B8C0SRIYcKHUgoaOlGo50MU.jpg,[27],575088,ru,Яга. Кошмар тёмного леса,The young family who moved to a new apartment ...,181.197,/8m5HTXzwewlfXhtZtLlLts53YTW.jpg,2020-02-27,Baba Yaga: Terror of the Dark Forest,False,6.1,124
18,False,/fssCO59bqU5f0zngeYKex0g1vyb.jpg,"[35, 28]",457335,en,Guns Akimbo,An ordinary guy suddenly finds himself forced ...,94.737,/vV23MzddmlZJ6TIXpmRUyGV9961.jpg,2020-02-27,Guns Akimbo,False,6.5,1323
10,False,/a8ppJJIQmEJcLSFfhxupc4aT4KW.jpg,"[18, 28, 53]",571785,ko,사냥의 시간,Four young men who want to leave their dystopi...,184.058,/bkuuvDoPkOJpg0ZDzHkUWt8ZG5A.jpg,2020-02-22,Time to Hunt,False,7.3,177
11,False,/p6ExERRwodksg0fFKzCjmNCR6Hw.jpg,[53],531299,en,Kill Chain,A hotel room shootout between two assassins ki...,173.868,/wy0Xs5mGtD92PyKvsl0lxzbzscG.jpg,2020-02-20,Kill Chain,False,5.5,71
12,False,/cGUxPXVZF5n5P09dnlhWC8bLVp7.jpg,"[18, 53]",505225,en,The Last Thing He Wanted,At the turning point of the Iran-Contra affair...,173.162,/gItrnbEbMBbUrdIkFz8kgS2gkt.jpg,2020-02-14,The Last Thing He Wanted,False,5.0,326
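
Note that `data2['results']` is only one part of the returned dict. To my understanding, the discover module delivers its results in pages, so a single request likely returns only the first page. The sketch below shows how several pages could be combined. It assumes that each response carries a `total_pages` field and that the page is selected with a `page` query parameter; please verify both against the API documentation. `collect_results` and `fetch_page` are hypothetical helpers, and the demo uses made-up data instead of a real request.

```python
def collect_results(fetch_page, max_pages=5):
    """Gather the 'results' lists from consecutive pages into one list.

    fetch_page is any callable that takes a page number and returns the
    decoded JSON dict for that page (hypothetical helper, not an API function).
    """
    results = []
    page = 1
    while page <= max_pages:
        payload = fetch_page(page)
        results.extend(payload.get("results", []))
        # Stop at the last page; treat a missing field as a single page (assumption).
        if page >= payload.get("total_pages", 1):
            break
        page += 1
    return results


# Offline stand-in for the API: three pages with two made-up entries each.
def fake_fetch_page(page):
    return {
        "page": page,
        "total_pages": 3,
        "results": [{"id": page * 10 + k} for k in range(2)],
    }


movies = collect_results(fake_fetch_page)
print(len(movies))
```

With the notebook's variables, `fetch_page` could be something like `lambda p: requests.get(url2 + "&page={}".format(p)).json()`, and the combined list can then be passed to `pd.DataFrame`.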


##  Importing and Saving the Movies Dataset (Best Practice)

6. __API-Request__ (movie module): Load all available information for the movies with movie id = [__299534, 19995, 140607, 299536, 597, 135397, 420818, 24428, 168259, 99861, 284054, 12445, 181808, 330457, 351286, 109445, 321612, 260513__] into a Pandas DataFrame and __save the dataset in a local json file__.

In [82]:
# 299534 (Avengers: Endgame) is requested separately below to initialize data_df
movie_id = [19995, 140607, 299536, 597, 135397, 420818, 24428, 168259, 99861, 284054, 12445, 181808, 330457, 351286, 109445, 321612, 260513]

In [83]:
movie_api
url

'https://api.themoviedb.org/3/movie/260513?api_key=f87c31e443414f56263b3306d1fada82'

In [84]:
movie_id1 = '299534'
url = movie_api.format(movie_id1) + api_key
r = requests.get(url)
data = r.json()
data_df = pd.Series(data)
data_df = data_df.to_frame().T
data_df.head()

Unnamed: 0,adult,backdrop_path,belongs_to_collection,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,popularity,poster_path,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count
0,False,/7RyHsO4yDXtBv1zUU3mTpHeQ0d5.jpg,"{'id': 86311, 'name': 'The Avengers Collection...",356000000,"[{'id': 12, 'name': 'Adventure'}, {'id': 878, ...",https://www.marvel.com/movies/avengers-endgame,299534,tt4154796,en,Avengers: Endgame,After the devastating events of Avengers: Infi...,250.137,/ulzhLuWrPK07P1YkdWQLZnQh1JL.jpg,"[{'id': 420, 'logo_path': '/hUzeosd33nzE5MCNsZ...","[{'iso_3166_1': 'US', 'name': 'United States o...",2019-04-24,2797800564,181,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Part of the journey is the end.,Avengers: Endgame,False,8.3,18001


In [85]:
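# Request each remaining movie and append it as a new row to data_df
for i in movie_id:
    url = movie_api.format(i) + api_key
    r = requests.get(url)
    data = r.json()
    data = pd.Series(data)        # one movie as a Series
    data = data.to_frame().T      # transpose into a one-row DataFrame
    data_df = pd.concat([data_df, data], axis = 0, ignore_index = True)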

In [87]:
data_df.head(n=3)

Unnamed: 0,adult,backdrop_path,belongs_to_collection,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,popularity,poster_path,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count
0,False,/7RyHsO4yDXtBv1zUU3mTpHeQ0d5.jpg,"{'id': 86311, 'name': 'The Avengers Collection...",356000000,"[{'id': 12, 'name': 'Adventure'}, {'id': 878, ...",https://www.marvel.com/movies/avengers-endgame,299534,tt4154796,en,Avengers: Endgame,After the devastating events of Avengers: Infi...,250.137,/ulzhLuWrPK07P1YkdWQLZnQh1JL.jpg,"[{'id': 420, 'logo_path': '/hUzeosd33nzE5MCNsZ...","[{'iso_3166_1': 'US', 'name': 'United States o...",2019-04-24,2797800564,181,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Part of the journey is the end.,Avengers: Endgame,False,8.3,18001
1,False,/AmHOQ7rpHwiaUMRjKXztnauSJb7.jpg,"{'id': 87096, 'name': 'Avatar Collection', 'po...",237000000,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",http://www.avatarmovie.com/,19995,tt0499549,en,Avatar,"In the 22nd century, a paraplegic Marine is di...",79.151,/6EiRUJpuoeQPghrs3YNktfnqOVh.jpg,"[{'id': 444, 'logo_path': '/42UPdZl6B2cFXgNUAS...","[{'iso_3166_1': 'US', 'name': 'United States o...",2009-12-10,2787965087,162,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Enter the World of Pandora.,Avatar,False,7.5,23289
2,False,/k6EOrckWFuz7I4z4wiRwz8zsj4H.jpg,"{'id': 10, 'name': 'Star Wars Collection', 'po...",245000000,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",http://www.starwars.com/films/star-wars-episod...,140607,tt2488496,en,Star Wars: The Force Awakens,Thirty years after defeating the Galactic Empi...,53.303,/wqnLdwVXoBjKibFRR5U3y0aDUhs.jpg,"[{'id': 1, 'logo_path': '/o86DbpburjxrqAzEDhXZ...","[{'iso_3166_1': 'US', 'name': 'United States o...",2015-12-15,2068223624,136,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Every generation has a story.,Star Wars: The Force Awakens,False,7.4,15740
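
Task 6 also asks to save the dataset in a local json file, which the cells above do not show. A minimal sketch of one way to do this: collect the dicts in a list, build the DataFrame in one step, write it with `to_json` and read it back. The three records are stand-ins for the dicts returned by `r.json()`, with values taken from the outputs above; the file name `movies.json` and the temporary folder are assumptions for the demo.

```python
import os
import tempfile

import pandas as pd

# Stand-ins for the API responses (only four of the fields are kept).
records = [
    {"title": "Avengers: Endgame", "id": 299534, "revenue": 2797800564, "runtime": 181},
    {"title": "Avatar", "id": 19995, "revenue": 2787965087, "runtime": 162},
    {"title": "Star Wars: The Force Awakens", "id": 140607, "revenue": 2068223624, "runtime": 136},
]

# One DataFrame from the whole list, instead of concatenating inside the loop.
movies_df = pd.DataFrame(records)

# Save as a list of records, then read the file back to check the round trip.
path = os.path.join(tempfile.mkdtemp(), "movies.json")  # hypothetical file name
movies_df.to_json(path, orient="records")
restored = pd.read_json(path, orient="records")
print(restored)
```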


# +++++++++ See some Hints below +++++++++++++

# ++++++++++++++ Hints +++++++++++++++++++++

__Hints for 1.__ <br>
To load json files you can use 

In [None]:
with open("filename.json") as f:
    data = json.load(f)

and 

In [None]:
pd.DataFrame(data), pd.read_json("filename.json"), pd.json_normalize(data)

The json files have the following orientations (important when using pd.read_json(), which expects the names "records", "columns" and "split"):
- blockbusters.json -> records
- blockbusters2.json -> columns
- blockbusters3.json -> split
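
To see what these three orientations look like, the following sketch writes a tiny DataFrame to JSON text in each layout and reads it back with the matching `orient` value. The two rows reuse titles and runtimes from the outputs above; nothing here touches the blockbusters files.

```python
from io import StringIO

import pandas as pd

df = pd.DataFrame({"title": ["Avatar", "Titanic"], "runtime": [162, 194]})

restored = {}
for orient in ["records", "columns", "split"]:
    text = df.to_json(orient=orient)
    print(orient, "->", text)
    # The same orient value is needed to read the text back correctly.
    restored[orient] = pd.read_json(StringIO(text), orient=orient)
```

Printing the three strings side by side makes the difference visible: a list of row dicts, a dict of columns, and a dict with separate `columns`, `index` and `data` entries.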

__Hints for 4., 5., 6.__<br>
Make API GET requests with the requests library (import requests):

In [None]:
data = requests.get(url).json()
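
The one-liner above returns whatever the server sends back, including error messages. A slightly more defensive sketch checks the HTTP status before decoding. `get_json` is a hypothetical helper; `http_get` is expected to behave like `requests.get`, and the `FakeResponse` class only exists so that the example runs without a network connection or an API key.

```python
def get_json(url, http_get, timeout=10):
    """Fetch a URL and return the decoded JSON, failing loudly on HTTP errors.

    http_get is expected to behave like requests.get (pass requests.get in real use).
    """
    response = http_get(url, timeout=timeout)
    response.raise_for_status()  # in requests, this raises for 4xx/5xx responses
    return response.json()


class FakeResponse:
    """Offline stand-in that mimics the two response methods used above."""

    def __init__(self, payload):
        self._payload = payload

    def raise_for_status(self):
        return None

    def json(self):
        return self._payload


def fake_get(url, timeout=None):
    # Made-up reply with two fields taken from the notebook output.
    return FakeResponse({"id": 140607, "title": "Star Wars: The Force Awakens"})


data = get_json("https://api.themoviedb.org/3/movie/140607?api_key=YOUR_API_KEY", fake_get)
print(data["title"])
```

In the notebook itself the call would be `get_json(url, requests.get)`.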

__Hints for 4. and 6.__ <br> URL structure for the movie module:

"https://api.themoviedb.org/3/movie/insert_movie_id?api_key=insert_api_key" (replace "insert_movie_id" with the movie id and "insert_api_key" with your personal API key)

__Hints for 5.__<br>
URL structure for the discover module:

"https://api.themoviedb.org/3/discover/movie?api_key=insert_api_key&query1&query2..." (replace "insert_api_key" with your personal API key and add appropriate queries)