# Project 2: Data Import - Working with Web APIs and JSON (Movies Dataset)

# Project Brief for Self-Coders

Here you'll have the opportunity to code major parts of Project 2 on your own. If you need any help or inspiration, have a look at the videos or the Jupyter Notebook with the full code. <br> <br>
Keep in mind that it's all about __getting the right results/conclusions__, not about reproducing identical code. Things can be coded in many different ways, so even if you reach the same conclusions, your code is very unlikely to match ours exactly.

## Importing Data from JSON files 

1. __Import__ the json files __blockbusters.json__, __blockbusters2.json__, __blockbusters3.json__ and load the datasets into Pandas DataFrames.


In [1]:
import pandas as pd
import json

In [2]:
# Paths to the .json files
path1='blockbusters.json'
path2='blockbusters2.json'
path3='blockbusters3.json'

In [10]:
# Open the file with json module
with open(path1) as f:
    data = json.load(f)

In [12]:
# We have 18 movies
len(data)

18

In [8]:
# let's select the first movie
data[0]

{'title': 'Avengers: Endgame',
 'id': 299534,
 'revenue': 2797800564,
 'genres': [{'id': 12, 'name': 'Adventure'},
  {'id': 878, 'name': 'Science Fiction'},
  {'id': 28, 'name': 'Action'}],
 'belongs_to_collection': {'id': 86311,
  'name': 'The Avengers Collection',
  'poster_path': '/yFSIUVTCvgYrpalUktulvk3Gi5Y.jpg',
  'backdrop_path': '/zuW6fOiusv4X9nnW3paHGfXcSll.jpg'},
 'runtime': 181}

In [13]:
# We could load it into a DataFrame directly, but the result would not be well organized (nested columns)
df = pd.DataFrame(data)
df.head()

| | title | id | revenue | genres | belongs_to_collection | runtime |
|---|---|---|---|---|---|---|
| 0 | Avengers: Endgame | 299534 | 2797800564 | [{'id': 12, 'name': 'Adventure'}, {'id': 878, ... | {'id': 86311, 'name': 'The Avengers Collection... | 181 |
| 1 | Avatar | 19995 | 2787965087 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | {'id': 87096, 'name': 'Avatar Collection', 'po... | 162 |
| 2 | Star Wars: The Force Awakens | 140607 | 2068223624 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | {'id': 10, 'name': 'Star Wars Collection', 'po... | 136 |
| 3 | Avengers: Infinity War | 299536 | 2046239637 | [{'id': 12, 'name': 'Adventure'}, {'id': 28, '... | {'id': 86311, 'name': 'The Avengers Collection... | 149 |
| 4 | Titanic | 597 | 1845034188 | [{'id': 18, 'name': 'Drama'}, {'id': 10749, 'n... | | 194 |


pandas also provides the function `pd.read_json()` to import JSON directly into a DataFrame.
Source: https://pandas.pydata.org/docs/reference/api/pandas.read_json.html
Its `orient` parameter is an indication of the expected JSON string format. Compatible JSON strings can be produced by `to_json()` with a corresponding orient value. One of the possible orients is:
`'records'` : list like `[{column -> value}, ... , {column -> value}]`


In [14]:
# blockbusters.json is a list of records, so we state the orient explicitly
df = pd.read_json(path1, orient='records')
df.head()

| | title | id | revenue | genres | belongs_to_collection | runtime |
|---|---|---|---|---|---|---|
| 0 | Avengers: Endgame | 299534 | 2797800564 | [{'id': 12, 'name': 'Adventure'}, {'id': 878, ... | {'id': 86311, 'name': 'The Avengers Collection... | 181 |
| 1 | Avatar | 19995 | 2787965087 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | {'id': 87096, 'name': 'Avatar Collection', 'po... | 162 |
| 2 | Star Wars: The Force Awakens | 140607 | 2068223624 | [{'id': 28, 'name': 'Action'}, {'id': 12, 'nam... | {'id': 10, 'name': 'Star Wars Collection', 'po... | 136 |
| 3 | Avengers: Infinity War | 299536 | 2046239637 | [{'id': 12, 'name': 'Adventure'}, {'id': 28, '... | {'id': 86311, 'name': 'The Avengers Collection... | 149 |
| 4 | Titanic | 597 | 1845034188 | [{'id': 18, 'name': 'Drama'}, {'id': 10749, 'n... | | 194 |
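
The `genres` and `belongs_to_collection` columns still hold nested lists and dicts. One possible way to flatten them is `pd.json_normalize()`. The following is a minimal sketch, not necessarily the approach used in the full notebook: the record is the first movie shown above with the image paths trimmed, and the `genre_` prefix and `_` separator are choices made for this illustration.

```python
import pandas as pd

# First movie from blockbusters.json (poster/backdrop paths trimmed for brevity).
movie = {
    "title": "Avengers: Endgame",
    "id": 299534,
    "revenue": 2797800564,
    "genres": [
        {"id": 12, "name": "Adventure"},
        {"id": 878, "name": "Science Fiction"},
        {"id": 28, "name": "Action"},
    ],
    "belongs_to_collection": {"id": 86311, "name": "The Avengers Collection"},
    "runtime": 181,
}

# Nested dicts become separate columns such as belongs_to_collection_name.
flat = pd.json_normalize([movie], sep="_")

# Lists of dicts can be expanded into one row per element via record_path.
genres = pd.json_normalize(
    [movie],
    record_path="genres",
    meta=["title"],          # carry the movie title along with each genre row
    record_prefix="genre_",  # illustrative prefix for the genre columns
)

print(flat.columns.tolist())
print(genres)
```

Note that `flat` keeps `genres` as a list column; only dicts are unpacked automatically, which is why the second call with `record_path` is needed for the genres.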


## Working with APIs and JSON (Part 1)

2. __Create an account__ on https://www.themoviedb.org/

3. Get your personal __API Key__

4. __API-Request__ (movie module): Load all available information for the movie with __movie id = 140607__ into a Pandas DataFrame. <br> See https://developers.themoviedb.org/3/movies/get-movie-details for more information

5. __API-Request__ (discover module): Load all movies with __release date between 2020-01-01 and 2020-02-29__ into a Pandas DataFrame. <br>
See https://www.themoviedb.org/documentation/api/discover and https://developers.themoviedb.org/3/discover/movie-discover for more information.

## Working with APIs and JSON (Part 2)

## Importing and Saving the Movies Dataset (Best Practice)

6. __API-Request__ (movie module): Load all available information for the movies with movie id = [__299534, 19995, 140607, 299536, 597, 135397, 420818, 24428, 168259, 99861, 284054, 12445, 181808, 330457, 351286, 109445, 321612, 260513__] into a Pandas DataFrame and __save the dataset in a local json file__.
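
For the saving step, a minimal sketch could look like the following. It assumes the API responses have already been collected into a list of dicts; the two records here are a trimmed stand-in taken from the blockbusters preview, and `movies.json` is a hypothetical file name.

```python
import os
import tempfile

import pandas as pd

# Stand-in for the collected API responses (one dict per movie).
records = [
    {"title": "Avengers: Endgame", "id": 299534, "revenue": 2797800564, "runtime": 181},
    {"title": "Avatar", "id": 19995, "revenue": 2787965087, "runtime": 162},
]
df = pd.DataFrame(records)

# Written to a temporary folder here; in the project, use any local path.
out_path = os.path.join(tempfile.mkdtemp(), "movies.json")
df.to_json(out_path, orient="records")

# Reading back with the same orient should reproduce the table.
restored = pd.read_json(out_path, orient="records")
print(restored)
```

Using the same `orient` for writing and reading keeps the round trip predictable.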

# +++++++++ See some Hints below +++++++++++++

# ++++++++++++++ Hints +++++++++++++++++++++

__Hints for 1.__ <br>
To load json files you can use 

In [None]:
with open("filename.json") as f:
    data = json.load(f)

and 

In [None]:
pd.DataFrame(data), pd.read_json("filename.json"), pd.json_normalize(data)

The JSON files have the following orientation (important when using `pd.read_json()`, which expects these values for its `orient` parameter):
- blockbusters.json -> `'records'`
- blockbusters2.json -> `'columns'`
- blockbusters3.json -> `'split'`
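
To see what the three orientations look like, the sketch below writes a tiny DataFrame in each layout and reads it back. It uses in-memory strings instead of the project files, and the two rows are taken from the blockbusters preview.

```python
from io import StringIO

import pandas as pd

# Two rows taken from the blockbusters preview.
df = pd.DataFrame({"title": ["Avatar", "Titanic"], "runtime": [162, 194]})

restored = {}
for orient in ("records", "columns", "split"):
    text = df.to_json(orient=orient)  # serialise in this layout
    print(f"{orient}: {text}")
    # Reading back needs the same orient that was used for writing.
    restored[orient] = pd.read_json(StringIO(text), orient=orient)
```

For the project files, the same idea applies with a path instead of a string, for example `pd.read_json("blockbusters3.json", orient="split")`.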

__Hints for 4., 5., 6.__<br>
Make GET requests to the API with the `requests` library (`import requests`):

In [None]:
data = requests.get(url).json()
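
The one-liner works when the request succeeds. A slightly more defensive variant is sketched below: it sets a timeout and raises on HTTP error codes (for example an invalid API key). The helper name `fetch_json` and the optional `session` argument are assumptions made for this illustration, not part of the project code.

```python
def fetch_json(url, session=None, timeout=10):
    """GET `url` and return the decoded JSON body, raising on HTTP errors."""
    if session is None:
        # Third-party library; only needed when a real request is made.
        import requests
        session = requests
    response = session.get(url, timeout=timeout)
    response.raise_for_status()  # turn 4xx/5xx answers into an exception
    return response.json()
```

Passing a `requests.Session()` as `session` reuses the connection across many calls, which can help when requesting several movies in a loop.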

__Hints for 4. and 6.__ <br> URL structure for the movie module:

"https://api.themoviedb.org/3/movie/insert_movie_id?api_key=insert_api_key" (replace "insert_movie_id" with movie id and "insert_api_key" with your personal api-key)
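
As a sketch of how this URL template can be filled in, the helper below (the name `movie_url` is hypothetical) formats the movie id and key into the string. The sample dict is a hand-written stand-in for a response, using values from the blockbusters preview, to show that a single movie dict becomes a one-row DataFrame.

```python
import pandas as pd

MOVIE_URL = "https://api.themoviedb.org/3/movie/{movie_id}?api_key={api_key}"


def movie_url(movie_id, api_key):
    """Fill the movie id and the personal API key into the URL template."""
    return MOVIE_URL.format(movie_id=movie_id, api_key=api_key)


url = movie_url(140607, "insert_api_key")  # replace with your own key
print(url)

# Stand-in for the decoded response of a single movie (trimmed).
sample = {
    "title": "Star Wars: The Force Awakens",
    "id": 140607,
    "revenue": 2068223624,
    "runtime": 136,
}
df = pd.json_normalize(sample)  # one dict -> one row
print(df)
```

For several movie ids, the same helper can be called in a loop and the resulting dicts collected in a list before building the DataFrame.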

__Hints for 5.__<br>
URL structure for the discover module:

"https://api.themoviedb.org/3/discover/movie?api_key=insert_api_key&query1&query2..." (replace "insert_api_key" with your personal API key and add appropriate queries)
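
One way to assemble the queries is `urllib.parse.urlencode`. In the sketch below, the filter names `primary_release_date.gte` and `primary_release_date.lte` are an assumption about which discover parameters fit the task, so check them against the discover documentation. The payload is a hand-written placeholder that assumes the movies are returned under a `results` key.

```python
from urllib.parse import urlencode

import pandas as pd

DISCOVER_URL = "https://api.themoviedb.org/3/discover/movie"


def discover_url(api_key, start, end, page=1):
    """Build a discover URL for movies released between `start` and `end`."""
    params = {
        "api_key": api_key,
        # Assumed filter names; confirm them in the discover documentation.
        "primary_release_date.gte": start,
        "primary_release_date.lte": end,
        "page": page,
    }
    return f"{DISCOVER_URL}?{urlencode(params)}"


url = discover_url("insert_api_key", "2020-01-01", "2020-02-29")
print(url)

# Placeholder response, assuming the movie list sits under "results".
payload = {
    "page": 1,
    "results": [
        {"id": 1, "title": "Example A"},
        {"id": 2, "title": "Example B"},
    ],
}
df = pd.DataFrame(payload["results"])
print(df)
```

If the response is paginated, the `page` argument can be increased to request further result pages.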