# Project 2: Data Import - Working with Web APIs and JSON (Movies Dataset)

# Project Brief for Self-Coders

Here you´ll have the opportunity to code major parts of Project 2 on your own. If you need any help or inspiration, have a look at the Videos or the Jupyter Notebook with the full code. <br> <br>
Keep in mind that it´s all about __getting the right results/conclusions__. It´s not about finding the identical code. Things can be coded in many different ways. Even if you come to the same conclusions, it´s very unlikely that we have the very same code. 

## Importing Data from JSON files 

1. __Import__ the json files __blockbusters.json__, __blockbusters2.json__, __blockbusters3.json__ and load the datasets into Pandas DataFrames.


In [19]:
import pandas as pd
import json
import requests

bb = pd.read_json('blockbusters.json')
bb2 = pd.read_json('blockbusters2.json', orient='records')
bb3 = pd.read_json('blockbusters3.json', orient='split')

#genreDF = pd.json_normalize(data=bb, sep="_", record_path='genres', meta=['title', 'id'], record_prefix='genre')

bb.head()
bb2.head()
bb3.head()

Unnamed: 0,title,id,revenue,genres,belongs_to_collection,runtime
0,Avengers: Endgame,299534,2797800564,"[{'id': 12, 'name': 'Adventure'}, {'id': 878, ...","{'id': 86311, 'name': 'The Avengers Collection...",181
1,Avatar,19995,2787965087,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...","{'id': 87096, 'name': 'Avatar Collection', 'po...",162
2,Star Wars: The Force Awakens,140607,2068223624,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...","{'id': 10, 'name': 'Star Wars Collection', 'po...",136
3,Avengers: Infinity War,299536,2046239637,"[{'id': 12, 'name': 'Adventure'}, {'id': 28, '...","{'id': 86311, 'name': 'The Avengers Collection...",149
4,Titanic,597,1845034188,"[{'id': 18, 'name': 'Drama'}, {'id': 10749, 'n...",,194


## Working with APIs and JSON (Part 1)

2. __Create an account__ on https://www.themoviedb.org/

3. Get your personal __API Key__

4. __API-Request__ (movie module): Load all available information for the movie with __movie id = 140607__ into a Pandas DataFrame. <br> See https://developers.themoviedb.org/3/movies/get-movie-details for more information

In [26]:
apiKey = '6d70064d319ce27ddb6f38e4856eb669'
movieID = 140607
url = 'https://api.themoviedb.org/3/movie/%s?api_key=%s&language=en-US' % (movieID, apiKey)

req = requests.get(url)
data = req.json()
df = pd.Series(data).to_frame()
pd.json_normalize(data, sep='_')

pd.json_normalize(data, record_path='production_companies', meta='title')

Unnamed: 0,id,logo_path,name,origin_country,title
0,1,/o86DbpburjxrqAzEDhXZcyE8pDb.png,Lucasfilm Ltd.,US,Star Wars: The Force Awakens
1,11461,/p9FoEt5shEKRWRKVIlvFaEmRnun.png,Bad Robot,US,Star Wars: The Force Awakens


## Working with APIs and JSON (Part 2)

5. __API-Request__ (discover module): Load all movies with __release date between 2020-01-01 and 2020-02-29__ into a Pandas DataFrame. <br>
See https://www.themoviedb.org/documentation/api/discover and https://developers.themoviedb.org/3/discover/movie-discover for more information.

In [29]:
discoverAPIBase = 'https://api.themoviedb.org/3/discover/movie?api_key=%s' % apiKey
startDate = '2020-01-01'
endDate = '2020-02-29'
query1 = '&primary_release_date.gte=%s&primary_release_date.lte=%s' % (startDate, endDate)

req = requests.get(discoverAPIBase+query1)
data = req.json()

pd.DataFrame(data['results'])

Unnamed: 0,adult,backdrop_path,genre_ids,id,original_language,original_title,overview,popularity,poster_path,release_date,title,video,vote_average,vote_count
0,False,/jiqD14fg7UTZOT6qgvzTmfRYpWI.jpg,"[28, 80]",495764,en,Birds of Prey (and the Fantabulous Emancipatio...,"Harley Quinn joins forces with a singer, an as...",558.568,/h4VB6m0RwcicVEZvzftYZyKXs6K.jpg,2020-02-05,Birds of Prey (and the Fantabulous Emancipatio...,False,7.1,7334
1,False,/3N316jUSdhvPyYTW29G4v9ebbcS.jpg,"[53, 28, 80]",38700,en,Bad Boys for Life,Marcus and Mike are forced to confront new thr...,467.754,/y95lQLnuNKdPAzw9F9Ab8kJ80c3.jpg,2020-01-15,Bad Boys for Life,False,7.2,6170
2,False,/1umKVgbjFG5Cho5ZKTpcvRFJjuJ.jpg,"[35, 53, 80]",609242,es,El robo del siglo,"In 2006, a group of thieves performed what is ...",388.673,/aSGwXbaTMxUhrfXT6xyZKqoklfB.jpg,2020-01-16,The Heist of the Century,False,8.0,481
3,False,/6mKAKhj8POVGqV1GsroS5mGIUe9.jpg,"[14, 28, 12]",666750,en,Dragonheart: Vengeance,"Lukas, a young farmer whose family is killed b...",352.099,/qs6gz6atyQcAvqC6qZaslOjliUG.jpg,2020-02-04,Dragonheart: Vengeance,False,6.9,212
4,False,/gGwA6YErMjiROavfGyxdciQnlTA.jpg,"[18, 53]",596247,es,Pacto de fuga,During the last years of Pinochet's military r...,325.648,/qDFfu73R8uO94ydFtdxEdSfTlg6.jpg,2020-01-23,Pacto de Fuga,False,7.8,55
5,False,/5VKquU8PNujrxLmsYGHf2TCRNFQ.jpg,"[878, 28, 12, 9648, 36, 14]",582306,en,Assassin 33 A.D.,When a billionaire gives a group of young scie...,319.174,/8jDvtdH327I8TgX3UPdkAsZF1dA.jpg,2020-01-24,Assassin 33 A.D.,False,5.2,58
6,False,/ww7eC3BqSbFsyE5H5qMde8WkxJ2.jpg,"[28, 27, 878, 53]",443791,en,Underwater,After an earthquake destroys their underwater ...,227.394,/gzlbb3yeVISpQ3REd3Ga1scWGTU.jpg,2020-01-08,Underwater,False,6.3,2014
7,False,/dT05ycGuf4h1uYYAJttxTFKkfBQ.jpg,"[10752, 18]",662334,es,Chaco,"In 1934, Bolivia is at war with Paraguay. Libo...",225.488,/hCR2i9rK6P4VHMfFw2MT5jDGJcN.jpg,2020-01-28,Chaco,False,7.9,40
8,False,/4br4B8C0SRIYcKHUgoaOlGo50MU.jpg,[27],575088,ru,Яга. Кошмар тёмного леса,The young family who moved to a new apartment ...,212.264,/8m5HTXzwewlfXhtZtLlLts53YTW.jpg,2020-02-27,Baba Yaga: Terror of the Dark Forest,False,6.2,117
9,False,/lsgYcIbcoQeDZXsHYMOnkvk3sn0.jpg,"[18, 53]",505225,en,The Last Thing He Wanted,At the turning point of the Iran-Contra affair...,192.931,/gItrnbEbMBbUrdIkFz8kgS2gkt.jpg,2020-02-14,The Last Thing He Wanted,False,5.0,323


##  Importing and Saving the Movies Dataset (Best Practice)

6. __API-Request__ (movie module): Load all available information for the movies with movie id = [__299534, 19995, 140607, 299536, 597, 135397, 420818, 24428, 168259, 99861, 284054, 12445, 181808, 330457, 351286, 109445, 321612, 260513__] into a Pandas DataFrame and __save the dataset in a local json file__.

In [43]:
movieIDs = [299534, 19995, 140607, 299536, 597, 135397, 420818, 24428, 168259, 99861, 284054, 12445, 181808, 330457, 351286, 109445, 321612, 260513]
jsonList = []
baseURL = 'https://api.themoviedb.org/3/movie/{}?api_key={}&language=en-US'
for movie in movieIDs:
    url = baseURL.format(movie, apiKey)
    req = requests.get(url)
    if req.status_code != 200:
        continue
    else:
        data = req.json()
        jsonList.append(data)
        
df = pd.DataFrame(jsonList)
df.to_json('movies.json', orient='records')

data = json.load(open('movies.json'))
df = pd.json_normalize(data)
df.head()

Unnamed: 0,adult,backdrop_path,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,...,tagline,title,video,vote_average,vote_count,belongs_to_collection.id,belongs_to_collection.name,belongs_to_collection.poster_path,belongs_to_collection.backdrop_path,belongs_to_collection
0,False,/7RyHsO4yDXtBv1zUU3mTpHeQ0d5.jpg,356000000,"[{'id': 12, 'name': 'Adventure'}, {'id': 878, ...",https://www.marvel.com/movies/avengers-endgame,299534,tt4154796,en,Avengers: Endgame,After the devastating events of Avengers: Infi...,...,Part of the journey is the end.,Avengers: Endgame,False,8.3,17838,86311.0,The Avengers Collection,/yQpAleQ1KHebVem2vwWL6VPqILT.jpg,/zuW6fOiusv4X9nnW3paHGfXcSll.jpg,
1,False,/AmHOQ7rpHwiaUMRjKXztnauSJb7.jpg,237000000,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",http://www.avatarmovie.com/,19995,tt0499549,en,Avatar,"In the 22nd century, a paraplegic Marine is di...",...,Enter the World of Pandora.,Avatar,False,7.5,23201,87096.0,Avatar Collection,/gC3tW9a45RGOzzSh6wv91pFnmFr.jpg,/syGPZuzcHBBHMLiNDN0x0Tms4Fk.jpg,
2,False,/k6EOrckWFuz7I4z4wiRwz8zsj4H.jpg,245000000,"[{'id': 28, 'name': 'Action'}, {'id': 12, 'nam...",http://www.starwars.com/films/star-wars-episod...,140607,tt2488496,en,Star Wars: The Force Awakens,Thirty years after defeating the Galactic Empi...,...,Every generation has a story.,Star Wars: The Force Awakens,False,7.4,15688,10.0,Star Wars Collection,/iTQHKziZy9pAAY4hHEDCGPaOvFC.jpg,/d8duYyyC9J5T825Hg7grmaabfxQ.jpg,
3,False,/lmZFxXgJE3vgrciwuDib0N8CfQo.jpg,300000000,"[{'id': 12, 'name': 'Adventure'}, {'id': 28, '...",https://www.marvel.com/movies/avengers-infinit...,299536,tt4154756,en,Avengers: Infinity War,As the Avengers and their allies have continue...,...,An entire universe. Once and for all.,Avengers: Infinity War,False,8.3,21519,86311.0,The Avengers Collection,/yQpAleQ1KHebVem2vwWL6VPqILT.jpg,/zuW6fOiusv4X9nnW3paHGfXcSll.jpg,
4,False,/6VmFqApQRyZZzmiGOQq2C92jyvH.jpg,200000000,"[{'id': 18, 'name': 'Drama'}, {'id': 10749, 'n...",,597,tt0120338,en,Titanic,101-year-old Rose DeWitt Bukater tells the sto...,...,Nothing on Earth could come between them.,Titanic,False,7.9,19003,,,,,


# +++++++++ See some Hints below +++++++++++++

# ++++++++++++++ Hints +++++++++++++++++++++

__Hints for 1.__ <br>
To load json files you can use 

In [None]:
with open("filename.json") as f:
    data = json.load(f)

and 

In [None]:
pd.DataFrame(data), pd.read_json(filename.json), pd.json_normalize(data)

the json files have the following orientation (important when using pd.read_json()):
- blockbusters.json -> record
- blockbusters2.json -> column
- blockbusters3.json -> split 

__Hints for 4., 5., 6.__<br>
Make API GET-requests with the library requests (import requests):

In [None]:
data = requests.get(url).json()

__Hints for 4. and 6.,__ <br> url structure for movie module:

"https://api.themoviedb.org/3/movie/insert_movie_id?api_key=insert_api_key" (replace "insert_movie_id" with movie id and "insert_api_key" with your personal api-key)

__Hints for 5.__<br>
url structure for discover module:

"https://api.themoviedb.org/3/discover/movie?api_key=insert_api_key&query1&query2..." (replace "insert_api_key" with your personal api-key and add appropriate queries)