## Step 2 - Data Collection

In this notebook we query the TMDB API (https://developers.themoviedb.org/3/getting-started/introduction) to collect data on films that have already been released.

At the end, we export the resulting DataFrame to a CSV file, to be used in the following stages of the project.

#### Installation of libraries

In [178]:
!pip install tmdbv3api



#### Importing libraries

In [179]:
# Data Handling
import pandas as pd

# API request
import requests
from tmdbv3api import Movie, TMDb

# JSON manipulation
import json

# Ignore useless warning
import warnings
warnings.filterwarnings("ignore")

#### Setting the TMDB API (https://www.themoviedb.org/settings/api)

In [180]:
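import os

tmdb_movie = Movie()
tmdb = TMDb()
# Read the key from an environment variable (the name TMDB_API_KEY is an assumption) instead of hard-coding it
tmdb.api_key = os.environ["TMDB_API_KEY"]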

#### Checking the data from movies_metadata.csv (from a Kaggle competition)


In [181]:
df = pd.read_csv('/content/movies_metadata.csv')

In [182]:
df.head(1)

Unnamed: 0,adult,belongs_to_collection,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,...,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count
0,False,"{'id': 10194, 'name': 'Toy Story Collection', ...",30000000,"[{'id': 16, 'name': 'Animation'}, {'id': 35, '...",http://toystory.disney.com/toy-story,862,tt0114709,en,Toy Story,"Led by Woody, Andy's toys live happily in his ...",...,1995-10-30,373554033.0,81.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,,Toy Story,False,7.7,5415.0


In [183]:
df.tail(1)
# The last movie is from 2017

Unnamed: 0,adult,belongs_to_collection,budget,genres,homepage,id,imdb_id,original_language,original_title,overview,...,release_date,revenue,runtime,spoken_languages,status,tagline,title,video,vote_average,vote_count
45465,False,,0,[],,461257,tt6980792,en,Queerama,50 years after decriminalisation of homosexual...,...,2017-06-09,0.0,75.0,"[{'iso_639_1': 'en', 'name': 'English'}]",Released,,Queerama,False,0.0,0.0


In [184]:
df.shape
# 45k movies

(45466, 24)

#### 2018 Movies (unfortunately, the only reliable source I could find covers American movies only) (https://en.wikipedia.org/wiki/List_of_American_films_of_2018)

In [185]:
url = "https://en.wikipedia.org/wiki/List_of_American_films_of_2018"

In [186]:
# The complete list of movies, from January 5 (Insidious: The Last Key) to December 28 (Black Mirror: Bandersnatch)
# Fetch and parse the page once; tables 2 to 5 hold the release lists
tables = pd.read_html(url, header=0)
df1, df2, df3, df4 = tables[2:6]

In [187]:
df1.head(1)

Unnamed: 0,Opening,Opening.1,Title,Production company,Cast and crew,.mw-parser-output .tooltip-dotted{border-bottom:1px dotted;cursor:help}Ref.
0,JANUARY,5,Insidious: The Last Key,Universal Pictures / Blumhouse Productions / S...,Adam Robitel (director); Leigh Whannell (scree...,[2]


In [188]:
df4.tail(1)

Unnamed: 0,Opening,Opening.1,Title,Production company,Cast and crew,Ref.
65,DECEMBER,28,Black Mirror: Bandersnatch,Netflix,David Slade (director); Charlie Brooker (scree...,[261]


In [189]:
type(df1)

pandas.core.frame.DataFrame

In [190]:
# Merging all DataFrames from the Wikipedia page (DataFrame.append was removed in pandas 2.0)
df_2018 = pd.concat([df1, df2, df3, df4], ignore_index=True)

In [191]:
df_2018.head(1)

Unnamed: 0,Opening,Opening.1,Title,Production company,Cast and crew,.mw-parser-output .tooltip-dotted{border-bottom:1px dotted;cursor:help}Ref.,Ref.
0,JANUARY,5,Insidious: The Last Key,Universal Pictures / Blumhouse Productions / S...,Adam Robitel (director); Leigh Whannell (scree...,[2],


In [192]:
df_2018.tail(1)

Unnamed: 0,Opening,Opening.1,Title,Production company,Cast and crew,.mw-parser-output .tooltip-dotted{border-bottom:1px dotted;cursor:help}Ref.,Ref.
271,DECEMBER,28,Black Mirror: Bandersnatch,Netflix,David Slade (director); Charlie Brooker (scree...,,[261]


In [193]:
# Keep two columns so the result is a pandas.DataFrame rather than a pandas.Series
df_2018 = df_2018.filter(["Title", "Opening"], axis = 1) 
type(df_2018)

pandas.core.frame.DataFrame

In [194]:
df_2018.head(5)

Unnamed: 0,Title,Opening
0,Insidious: The Last Key,JANUARY
1,The Strange Ones,JANUARY
2,Stratton,JANUARY
3,Sweet Country,JANUARY
4,The Commuter,JANUARY


In [195]:
df_2018.tail(1)

Unnamed: 0,Title,Opening
271,Black Mirror: Bandersnatch,DECEMBER


In [196]:
df_2018.shape

(272, 2)

In [197]:
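def get_id_movie(title_movie):
  # Search TMDB by title and return the ID of the first hit.
  # Note: this raises if the search returns nothing, and the first hit may not be the intended film.
  result = tmdb_movie.search(title_movie)
  movie_id = result[0].id
  return movie_id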

In [198]:
df_2018["id"] = df_2018["Title"].map(lambda x: get_id_movie(str(x)))
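Because `result[0]` fails when a title search comes back empty, a more defensive lookup may be worth considering. The following is only a minimal sketch: `first_match_id` and `fake_search` are hypothetical names, and the function assumes nothing more than that `search(title)` returns an iterable of objects with an `id` attribute, as `tmdb_movie.search` appears to do above.

```python
from types import SimpleNamespace


def first_match_id(search, title):
    """Return the id of the first search hit, or None when there are no hits."""
    results = list(search(title))
    if not results:
        return None
    return results[0].id


# Stand-in for tmdb_movie.search so the sketch runs offline (hypothetical data source)
def fake_search(title):
    catalog = {"Insidious: The Last Key": [SimpleNamespace(id=406563)]}
    return catalog.get(title, [])


print(first_match_id(fake_search, "Insidious: The Last Key"))  # 406563
print(first_match_id(fake_search, "No Such Film"))             # None
```

In the notebook, this could be applied as `df_2018["Title"].map(lambda t: first_match_id(tmdb_movie.search, str(t)))`, which leaves missing values to inspect instead of stopping at the first unmatched title.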

In [199]:
# Checking with movie_id
# recommendations = tmdb_movie.recommendations(movie_id=426258)

# for recommendation in recommendations:
#   print(recommendation.title)
#   print(recommendation.overview)

In [200]:
df_2018.drop(columns = ["Opening"], inplace = True)

In [201]:
df_2018.rename(columns = {"Title": "original_title"}, inplace = True)

In [202]:
df_2018.head(1)

Unnamed: 0,original_title,id
0,Insidious: The Last Key,406563


In [203]:
df_2018.tail(1)

Unnamed: 0,original_title,id
271,Black Mirror: Bandersnatch,569547


#### 2019 Movies (again, American movies only) (https://en.wikipedia.org/wiki/List_of_American_films_of_2019)




In [204]:
url_2019 = "https://en.wikipedia.org/wiki/List_of_American_films_of_2019"

In [205]:
# The complete list of movies, from January 4 (Escape Room) to December 27 (Clemency)
# Fetch and parse the page once; tables 2 to 5 hold the release lists
tables_2019 = pd.read_html(url_2019, header=0)
df_url0_2019, df_url1_2019, df_url2_2019, df_url3_2019 = tables_2019[2:6]

In [206]:
df_url0_2019.head(1)

Unnamed: 0,Opening,Opening.1,Title,Production company,Cast and crew,Ref.
0,JANUARY,4,Escape Room,Columbia Pictures / Original Film,"Adam Robitel (director); Bragi F. Schut, Maria...",[2]


In [207]:
df_url3_2019.tail(1)

Unnamed: 0,Opening,Opening.1,Title,Production company,Cast and crew,Ref.
69,DECEMBER,27,Clemency,Neon,Chinonye Chukwu (director/screenplay); Alfre W...,[224]


In [208]:
# Same merge for 2019 (DataFrame.append was removed in pandas 2.0); the index is not reset here, so row labels restart in each table
df_2019 = pd.concat([df_url0_2019, df_url1_2019, df_url2_2019, df_url3_2019])

In [209]:
df_2019.head(1)

Unnamed: 0,Opening,Opening.1,Title,Production company,Cast and crew,Ref.
0,JANUARY,4,Escape Room,Columbia Pictures / Original Film,"Adam Robitel (director); Bragi F. Schut, Maria...",[2]


In [210]:
df_2019.tail(1)

Unnamed: 0,Opening,Opening.1,Title,Production company,Cast and crew,Ref.
69,DECEMBER,27,Clemency,Neon,Chinonye Chukwu (director/screenplay); Alfre W...,[224]


In [211]:
df_2019 = df_2019.filter(["Title", "Opening"], axis = 1)
df_2019.head()

Unnamed: 0,Title,Opening
0,Escape Room,JANUARY
1,Rust Creek,JANUARY
2,American Hangman,JANUARY
3,A Dog's Way Home,JANUARY
4,The Upside,JANUARY


In [212]:
df_2019.shape

(242, 2)

In [213]:
df_2019["id"] = df_2019["Title"].map(lambda x: get_id_movie(str(x)))

In [214]:
df_2019.drop(columns=["Opening"], inplace=True)

In [215]:
df_2019.head(1)

Unnamed: 0,Title,id
0,Escape Room,522681


In [216]:
df_2019.rename(columns= {"Title": "original_title"}, inplace=True)

In [217]:
df_2019.head(1)

Unnamed: 0,original_title,id
0,Escape Room,522681


#### Concatenating

In [218]:
dfs_lists = [df_2018, df_2019]

In [219]:
type(dfs_lists)

list

In [220]:
dfs_concat = pd.concat(dfs_lists)

In [221]:
dfs_concat.head(1)

Unnamed: 0,original_title,id
0,Insidious: The Last Key,406563


In [222]:
dfs_concat.tail(1)

Unnamed: 0,original_title,id
69,Clemency,565307


In [223]:
df_2018.shape

(272, 2)

In [224]:
df_2019.shape

(242, 2)

In [225]:
dfs_concat.shape

(514, 2)
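The `tail` output above shows index label 69 on the last of 514 rows, which suggests that index labels repeat in `dfs_concat`. The sketch below illustrates the effect on two made-up frames (`a` and `b` are not part of the notebook) and how `ignore_index=True` avoids it.

```python
import pandas as pd

# Toy frames standing in for df_2018 and df_2019
a = pd.DataFrame({"original_title": ["A1", "A2"], "id": [1, 2]})
b = pd.DataFrame({"original_title": ["B1", "B2"], "id": [3, 4]})

kept = pd.concat([a, b])                      # index labels: 0, 1, 0, 1
reset = pd.concat([a, b], ignore_index=True)  # index labels: 0, 1, 2, 3

print(kept.index.is_unique)   # False
print(reset.index.tolist())   # [0, 1, 2, 3]
print(len(kept.loc[0]))       # 2, because two rows share label 0
```

With repeated labels, `.loc[label]` can return several rows, so resetting the index is usually the safer choice before any label-based lookup or join.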

#### Kaggle Dataset (https://www.kaggle.com/datasets/tmdb/tmdb-movie-metadata?select=tmdb_5000_credits.csv)

In [226]:
df_movies = pd.read_csv("tmdb_5000_movies.csv")

In [227]:
df_movies.head()

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2009-12-10,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800
1,300000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",http://disney.go.com/disneypictures/pirates/,285,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2007-05-19,961000000,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500
2,245000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.sonypictures.com/movies/spectre/,206647,"[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name...",en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""nam...","[{""iso_3166_1"": ""GB"", ""name"": ""United Kingdom""...",2015-10-26,880674609,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466
3,250000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",http://www.thedarkknightrises.com/,49026,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...",en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,"[{""name"": ""Legendary Pictures"", ""id"": 923}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2012-07-16,1084939099,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106
4,260000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://movies.disney.com/john-carter,49529,"[{""id"": 818, ""name"": ""based on novel""}, {""id"":...",en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}]","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2012-03-07,284139100,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124


#### Evaluating return from JSON

In [228]:
response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(406563, tmdb.api_key))

In [229]:
print(response)

<Response [200]>


In [230]:
data_json = response.json()

In [231]:
data_json

{'adult': False,
 'backdrop_path': '/PwI3EfasE9fVuXsmMu9ffJh0Re.jpg',
 'belongs_to_collection': {'backdrop_path': '/5FrPZHgbbmTIq0oxpwSGqu5HyXC.jpg',
  'id': 228446,
  'name': 'Insidious Collection',
  'poster_path': '/w1213HKk1XKSwHiBgjkWghn9biC.jpg'},
 'budget': 10000000,
 'genres': [{'id': 27, 'name': 'Horror'},
  {'id': 9648, 'name': 'Mystery'},
  {'id': 53, 'name': 'Thriller'}],
 'homepage': 'http://www.insidiousmovie.com',
 'id': 406563,
 'imdb_id': 'tt5726086',
 'original_language': 'en',
 'original_title': 'Insidious: The Last Key',
 'overview': 'Parapsychologist Elise Rainier and her team travel to Five Keys, NM, to investigate a man’s claim of a haunting. Terror soon strikes when Rainier realizes that the house he lives in was her family’s old home.',
 'popularity': 52.151,
 'poster_path': '/nb9fc9INMg8kQ8L7sE7XTNsZnUX.jpg',
 'production_companies': [{'id': 11341,
   'logo_path': '/xytTBODEy3p20ksHL4Ftxr483Iv.png',
   'name': 'Stage 6 Films',
   'origin_country': 'US'},
  {'i

#### genres

In [232]:
dfs_concat.head(1)

Unnamed: 0,original_title,id
0,Insidious: The Last Key,406563


In [233]:
def get_genres(movie_id):
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["genres"]

In [234]:
dfs_concat['genres'] = dfs_concat["id"].map(lambda x: get_genres(str(x)))

In [235]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n..."
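`get_genres` and the per-field helpers in the following sections each request the same `/movie/{id}` endpoint, so every movie is downloaded once per column. One possible consolidation is sketched below under stated assumptions: `fetch_movie`, `pick_fields` and `FIELDS` are hypothetical names, and the standard-library `urllib` is swapped in for `requests` only to keep the block self-contained.

```python
import json
from urllib.request import urlopen

# Fields read from the /movie/{id} payload shown under "Evaluating return from JSON"
FIELDS = [
    "genres", "budget", "homepage", "original_language", "overview",
    "popularity", "production_companies", "production_countries",
    "release_date", "revenue", "runtime", "spoken_languages", "status",
    "tagline", "title", "vote_average", "vote_count",
]


def fetch_movie(movie_id, api_key, timeout=10):
    """Download the details payload for one movie (a single HTTP request)."""
    url = "https://api.themoviedb.org/3/movie/{}?api_key={}".format(movie_id, api_key)
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def pick_fields(payload, fields=FIELDS):
    """Keep only the wanted keys; keys absent from the payload become None."""
    return {name: payload.get(name) for name in fields}


# Offline check on a trimmed copy of the payload shown earlier
sample = {"id": 406563, "budget": 10000000, "status": "Released",
          "title": "Insidious: The Last Key"}
row = pick_fields(sample, ["budget", "status", "tagline"])
print(row)  # {'budget': 10000000, 'status': 'Released', 'tagline': None}
```

In the notebook this might look like `rows = [pick_fields(fetch_movie(i, tmdb.api_key)) for i in dfs_concat["id"]]` followed by `pd.DataFrame(rows)`, which makes one request per movie instead of one per movie and column.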


#### budget

In [236]:
def get_budget(movie_id):
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["budget"]

In [237]:
dfs_concat['budget'] = dfs_concat["id"].map(lambda x: get_budget(str(x)))

In [238]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000


#### homepage

In [239]:
def get_homepage(movie_id):
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["homepage"]

In [240]:
dfs_concat['homepage'] = dfs_concat["id"].map(lambda x: get_homepage(str(x)))

In [241]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com


#### keywords

In [242]:
response = requests.get("https://api.themoviedb.org/3/movie/{}/keywords?api_key={}". format(406563, tmdb.api_key))

In [243]:
data_json = response.json()

In [244]:
data_json

{'id': 406563,
 'keywords': [{'id': 2723, 'name': 'medium'},
  {'id': 2849, 'name': 'key'},
  {'id': 3358, 'name': 'haunted house'},
  {'id': 9675, 'name': 'prequel'},
  {'id': 13153, 'name': 'spirit'},
  {'id': 208611, 'name': '1950s'}]}

In [245]:
def get_keywords(movie_id):
  response = requests.get("https://api.themoviedb.org/3/movie/{}/keywords?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["keywords"]

In [246]:
dfs_concat['keywords'] = dfs_concat['id'].map(lambda x: get_keywords(str(x)))

In [247]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ..."
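Keywords come from a separate endpoint, which means a second request per movie. TMDB also offers an `append_to_response` query parameter that bundles sub-resources into the details call. The sketch below assumes, based on my reading of the TMDB v3 documentation, that appended movie keywords arrive nested as `payload["keywords"]["keywords"]`; that shape should be verified against a live response. `movie_url` and `keywords_from` are hypothetical helper names.

```python
from urllib.parse import urlencode


def movie_url(movie_id, api_key, append=()):
    """Build a details URL, optionally asking TMDB to append sub-resources."""
    params = {"api_key": api_key}
    if append:
        params["append_to_response"] = ",".join(append)
    # safe="," keeps the comma-separated list readable in the URL
    return "https://api.themoviedb.org/3/movie/{}?{}".format(
        movie_id, urlencode(params, safe=","))


def keywords_from(payload):
    """Extract the keyword list from a details payload (assumed nesting)."""
    return payload.get("keywords", {}).get("keywords", [])


print(movie_url(406563, "YOUR_KEY", append=["keywords"]))
```

If the assumed nesting holds, one request per movie would cover both the detail columns and the `keywords` column.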


#### original_language	

In [248]:
def get_original_language(movie_id):
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["original_language"]

In [249]:
dfs_concat['original_language'] = dfs_concat['id'].map(lambda x: get_original_language(str(x)))

In [250]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en


#### overview

In [251]:
def get_overview(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["overview"]

In [252]:
dfs_concat['overview'] = dfs_concat['id'].map(lambda x: get_overview(str(x)))

In [253]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...


#### popularity

In [254]:
def get_popularity(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["popularity"]

In [255]:
dfs_concat['popularity'] = dfs_concat['id'].map(lambda x: get_popularity(str(x)))

In [256]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151


#### production_companies	

In [257]:
def get_production_companies(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["production_companies"]

In [258]:
dfs_concat['production_companies'] = dfs_concat['id'].map(lambda x: get_production_companies(str(x)))

In [259]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH..."


#### production_countries

In [260]:
def get_production_countries(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["production_countries"]

In [261]:
dfs_concat['production_countries'] = dfs_concat['id'].map(lambda x: get_production_countries(str(x)))

In [262]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o..."


#### release_date

In [263]:
def get_release_date(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["release_date"]

In [264]:
dfs_concat['release_date'] = dfs_concat['id'].map(lambda x: get_release_date(str(x)))

In [265]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03


#### revenue

In [266]:
def get_revenue(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["revenue"]

In [267]:
dfs_concat['revenue'] = dfs_concat['id'].map(lambda x: get_revenue(str(x)))

In [268]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112


#### runtime

In [269]:
def get_runtime(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["runtime"]

In [270]:
dfs_concat['runtime'] = dfs_concat['id'].map(lambda x: get_runtime(str(x)))

In [271]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,runtime
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,103


#### spoken_languages	

In [272]:
def get_spoken_languages(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["spoken_languages"]

In [273]:
dfs_concat['spoken_languages'] = dfs_concat['id'].map(lambda x: get_spoken_languages(str(x)))

In [274]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,103,"[{'english_name': 'English', 'iso_639_1': 'en'..."


#### status

In [275]:
def get_status(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["status"]

In [276]:
dfs_concat['status'] = dfs_concat['id'].map(lambda x: get_status(str(x)))

In [277]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,103,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released


#### tagline

In [278]:
def get_tagline(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["tagline"]

In [279]:
dfs_concat['tagline'] = dfs_concat['id'].map(lambda x: get_tagline(str(x)))

In [280]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,103,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Fear comes home.


#### title

In [281]:
def get_title(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["title"]

In [282]:
dfs_concat['title'] = dfs_concat['id'].map(lambda x: get_title(str(x)))

In [283]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,103,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Fear comes home.,Insidious: The Last Key


#### vote_average	

In [284]:
def get_vote_average(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["vote_average"]

In [285]:
dfs_concat['vote_average'] = dfs_concat['id'].map(lambda x: get_vote_average(str(x)))

In [286]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,103,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Fear comes home.,Insidious: The Last Key,6.2


#### vote_count

In [287]:
def get_vote_count(movie_id): 
  response = requests.get("https://api.themoviedb.org/3/movie/{}?api_key={}". format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["vote_count"]

In [288]:
dfs_concat['vote_count'] = dfs_concat['id'].map(lambda x: get_vote_count(str(x)))

In [289]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,103,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Fear comes home.,Insidious: The Last Key,6.2,2241


#### Comparing DataFrames (columns)

In [290]:
df_movies.columns

Index(['budget', 'genres', 'homepage', 'id', 'keywords', 'original_language',
       'original_title', 'overview', 'popularity', 'production_companies',
       'production_countries', 'release_date', 'revenue', 'runtime',
       'spoken_languages', 'status', 'tagline', 'title', 'vote_average',
       'vote_count'],
      dtype='object')

In [291]:
df_movies.shape

(4803, 20)

In [292]:
dfs_concat.columns

Index(['original_title', 'id', 'genres', 'budget', 'homepage', 'keywords',
       'original_language', 'overview', 'popularity', 'production_companies',
       'production_countries', 'release_date', 'revenue', 'runtime',
       'spoken_languages', 'status', 'tagline', 'title', 'vote_average',
       'vote_count'],
      dtype='object')

In [293]:
dfs_concat.shape

(514, 20)

In [294]:
dfs_concat.drop(columns=["runtime"], inplace=True)

In [295]:
dfs_concat.shape

(514, 19)

In [296]:
df_movies.head(2)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,production_countries,release_date,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2009-12-10,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800
1,300000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",http://disney.go.com/disneypictures/pirates/,285,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""...","[{""iso_3166_1"": ""US"", ""name"": ""United States o...",2007-05-19,961000000,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500


In [297]:
dfs_concat.head(2)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,spoken_languages,status,tagline,title,vote_average,vote_count
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Fear comes home.,Insidious: The Last Key,6.2,2241
1,The Strange Ones,426258,"[{'id': 53, 'name': 'Thriller'}, {'id': 18, 'n...",0,http://thestrangeones.com/,"[{'id': 380, 'name': 'sibling relationship'}, ...",en,Mysterious events surround the travels of two ...,5.9,"[{'id': 35562, 'logo_path': '/cpeGUCuKo2vgaxzr...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-05,0,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,,The Strange Ones,5.5,62


#### Credits merge

In [298]:
df_movies_credits = pd.read_csv("tmdb_5000_credits.csv")

In [299]:
df_movies_credits.head(1)

Unnamed: 0,movie_id,title,cast,crew
0,19995,Avatar,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."


In [300]:
df_movies_credits.rename(columns = {'movie_id': "id"}, inplace=True)

In [301]:
df_movies_credits.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4803 entries, 0 to 4802
Data columns (total 4 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   id      4803 non-null   int64 
 1   title   4803 non-null   object
 2   cast    4803 non-null   object
 3   crew    4803 non-null   object
dtypes: int64(1), object(3)
memory usage: 150.2+ KB


In [302]:
df_movies.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4803 entries, 0 to 4802
Data columns (total 20 columns):
 #   Column                Non-Null Count  Dtype  
---  ------                --------------  -----  
 0   budget                4803 non-null   int64  
 1   genres                4803 non-null   object 
 2   homepage              1712 non-null   object 
 3   id                    4803 non-null   int64  
 4   keywords              4803 non-null   object 
 5   original_language     4803 non-null   object 
 6   original_title        4803 non-null   object 
 7   overview              4800 non-null   object 
 8   popularity            4803 non-null   float64
 9   production_companies  4803 non-null   object 
 10  production_countries  4803 non-null   object 
 11  release_date          4802 non-null   object 
 12  revenue               4803 non-null   int64  
 13  runtime               4801 non-null   float64
 14  spoken_languages      4803 non-null   object 
 15  status               

In [303]:
df_movies = df_movies.merge(df_movies_credits, on="id")

In [304]:
df_movies.head(1)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,runtime,spoken_languages,status,tagline,title_x,vote_average,vote_count,title_y,cast,crew
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,Avatar,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
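The `title_x` and `title_y` columns in the output above appear because both dataframes carry a `title` column. A minimal sketch, using toy frames with illustrative values (not the real CSV contents), of dropping the duplicate column before merging and asking pandas to verify that `id` is unique on both sides through the `validate` argument:

```python
import pandas as pd

# Toy stand-ins for df_movies and df_movies_credits (illustrative values only).
movies = pd.DataFrame({"id": [19995, 285], "title": ["Avatar", "Pirates"]})
credits = pd.DataFrame({
    "movie_id": [19995, 285],
    "title": ["Avatar", "Pirates"],
    "cast": ["[]", "[]"],
    "crew": ["[]", "[]"],
})

# Align the key name and drop the redundant title so no _x/_y suffixes are created.
credits_clean = credits.rename(columns={"movie_id": "id"}).drop(columns=["title"])

# validate="one_to_one" raises MergeError if an id is duplicated in either frame.
merged = movies.merge(credits_clean, on="id", validate="one_to_one")
print(merged.columns.tolist())
```

With this approach the later `drop` and `rename` steps on `title_y` and `title_x` would not be needed.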


#### Movies added later (2018 and 2019)

In [305]:
# Request the credits of a single movie (id 19995, Avatar) to inspect the response
response = requests.get("https://api.themoviedb.org/3/movie/{}/credits?api_key={}".format(19995, tmdb.api_key))

In [306]:
print(response)

<Response [200]>


In [307]:
data_json = response.json()

In [308]:
data_json

{'cast': [{'adult': False,
   'cast_id': 242,
   'character': 'Jake Sully',
   'credit_id': '5602a8a7c3a3685532001c9a',
   'gender': 2,
   'id': 65731,
   'known_for_department': 'Acting',
   'name': 'Sam Worthington',
   'order': 0,
   'original_name': 'Sam Worthington',
   'popularity': 28.647,
   'profile_path': '/blKKsHlJIL9PmUQZB8f3YmMBW5Y.jpg'},
  {'adult': False,
   'cast_id': 3,
   'character': 'Neytiri',
   'credit_id': '52fe48009251416c750ac9cb',
   'gender': 1,
   'id': 8691,
   'known_for_department': 'Acting',
   'name': 'Zoe Saldana',
   'order': 1,
   'original_name': 'Zoe Saldana',
   'popularity': 30.797,
   'profile_path': '/aGb3JzoumA89gRFwbJYAxFm5Qdk.jpg'},
  {'adult': False,
   'cast_id': 25,
   'character': 'Dr. Grace Augustine',
   'credit_id': '52fe48009251416c750aca39',
   'gender': 1,
   'id': 10205,
   'known_for_department': 'Acting',
   'name': 'Sigourney Weaver',
   'order': 2,
   'original_name': 'Sigourney Weaver',
   'popularity': 15.092,
   'profile_pa

In [309]:
def get_credits_cast(movie_id):
  """Return the cast list of a movie from the TMDB credits endpoint."""
  response = requests.get("https://api.themoviedb.org/3/movie/{}/credits?api_key={}".format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["cast"]

In [310]:
dfs_concat['cast'] = dfs_concat['id'].map(lambda x: get_credits_cast(str(x)))

In [311]:
dfs_concat.head(1)

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,production_countries,release_date,revenue,spoken_languages,status,tagline,title,vote_average,vote_count,cast
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...","[{'iso_3166_1': 'US', 'name': 'United States o...",2018-01-03,167184112,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Fear comes home.,Insidious: The Last Key,6.2,2241,"[{'adult': False, 'gender': 1, 'id': 7401, 'kn..."


In [312]:
def get_credits_crew(movie_id):
  """Return the crew list of a movie from the TMDB credits endpoint."""
  response = requests.get("https://api.themoviedb.org/3/movie/{}/credits?api_key={}".format(movie_id, tmdb.api_key))
  data_json = response.json()
  return data_json["crew"]

In [313]:
dfs_concat['crew'] = dfs_concat['id'].map(lambda x: get_credits_crew(str(x)))

In [314]:
dfs_concat.head()

Unnamed: 0,original_title,id,genres,budget,homepage,keywords,original_language,overview,popularity,production_companies,...,release_date,revenue,spoken_languages,status,tagline,title,vote_average,vote_count,cast,crew
0,Insidious: The Last Key,406563,"[{'id': 27, 'name': 'Horror'}, {'id': 9648, 'n...",10000000,http://www.insidiousmovie.com,"[{'id': 2723, 'name': 'medium'}, {'id': 2849, ...",en,Parapsychologist Elise Rainier and her team tr...,52.151,"[{'id': 11341, 'logo_path': '/xytTBODEy3p20ksH...",...,2018-01-03,167184112,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Fear comes home.,Insidious: The Last Key,6.2,2241,"[{'adult': False, 'gender': 1, 'id': 7401, 'kn...","[{'adult': False, 'gender': 1, 'id': 494, 'kno..."
1,The Strange Ones,426258,"[{'id': 53, 'name': 'Thriller'}, {'id': 18, 'n...",0,http://thestrangeones.com/,"[{'id': 380, 'name': 'sibling relationship'}, ...",en,Mysterious events surround the travels of two ...,5.9,"[{'id': 35562, 'logo_path': '/cpeGUCuKo2vgaxzr...",...,2018-01-05,0,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,,The Strange Ones,5.5,62,"[{'adult': False, 'gender': 2, 'id': 61363, 'k...","[{'adult': False, 'gender': 1, 'id': 17450, 'k..."
2,Stratton,348389,"[{'id': 28, 'name': 'Action'}, {'id': 53, 'nam...",0,http://www.gfmfilms.co.uk/stratton,"[{'id': 818, 'name': 'based on novel or book'}...",en,A British Special Boat Service commando tracks...,14.717,"[{'id': 23970, 'logo_path': None, 'name': 'Twi...",...,2017-07-06,0,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,The enemy has a weapon. So do we.,Stratton,5.0,176,"[{'adult': False, 'gender': 2, 'id': 55470, 'k...","[{'adult': False, 'gender': 1, 'id': 10496, 'k..."
3,Sweet Country,468210,"[{'id': 18, 'name': 'Drama'}, {'id': 36, 'name...",0,https://bunyaproductions.com.au/sweet-country/,"[{'id': 570, 'name': 'rape'}, {'id': 2831, 'na...",en,"It’s 1929 on the vast, desert-like, Eastern Ar...",6.646,"[{'id': 86737, 'logo_path': None, 'name': 'Bun...",...,2018-01-25,0,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Justice itself is put on trial,Sweet Country,6.5,136,"[{'adult': False, 'gender': 2, 'id': 1887857, ...","[{'adult': False, 'gender': 0, 'id': 55906, 'k..."
4,The Commuter,399035,"[{'id': 28, 'name': 'Action'}, {'id': 53, 'nam...",30000000,https://thecommuter.movie/,"[{'id': 10410, 'name': 'conspiracy'}, {'id': 1...",en,"A businessman, on his daily commute home, gets...",25.341,"[{'id': 694, 'logo_path': '/5LEHONGkZBIoWvp1yg...",...,2018-01-11,119942387,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Lives are on the line,The Commuter,6.3,3758,"[{'adult': False, 'gender': 2, 'id': 3896, 'kn...","[{'adult': False, 'gender': 2, 'id': 7230, 'kn..."
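`get_credits_cast` and `get_credits_crew` call the same endpoint, so every movie is requested twice. The sketch below shows one possible way to fetch both lists with a single request and to fail early on HTTP errors. The name `fetch_credits` and its `session` parameter are hypothetical; `session` is assumed to be any object with a requests-style `get` method, such as `requests.Session()`.

```python
CREDITS_URL = "https://api.themoviedb.org/3/movie/{}/credits"


def fetch_credits(movie_id, api_key, session, timeout=10):
    """Return (cast, crew) for one movie using a single HTTP request.

    `session` is assumed to expose a requests-style ``get`` method,
    for example ``requests.Session()``.
    """
    response = session.get(
        CREDITS_URL.format(movie_id),
        params={"api_key": api_key},
        timeout=timeout,
    )
    # Raise on 4xx/5xx responses instead of failing later with a KeyError.
    response.raise_for_status()
    payload = response.json()
    # Fall back to empty lists if a key is missing from the payload.
    return payload.get("cast", []), payload.get("crew", [])
```

In the notebook this could be mapped over `dfs_concat['id']` once, with the first element of each returned pair assigned to `cast` and the second to `crew`.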


#### Final Concatenation

In [315]:
df_movies.head(1)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,runtime,spoken_languages,status,tagline,title_x,vote_average,vote_count,title_y,cast,crew
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,Avatar,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."


In [316]:
# title_y duplicates title_x after the merge, so it can be dropped
df_movies.drop(columns=["title_y"], inplace=True)

In [317]:
df_movies.head()

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,revenue,runtime,spoken_languages,status,tagline,title_x,vote_average,vote_count,cast,crew
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,300000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",http://disney.go.com/disneypictures/pirates/,285,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""...",...,961000000,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500,"[{""cast_id"": 4, ""character"": ""Captain Jack Spa...","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,245000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.sonypictures.com/movies/spectre/,206647,"[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name...",en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""nam...",...,880674609,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466,"[{""cast_id"": 1, ""character"": ""James Bond"", ""cr...","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,250000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",http://www.thedarkknightrises.com/,49026,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...",en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,"[{""name"": ""Legendary Pictures"", ""id"": 923}, {""...",...,1084939099,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106,"[{""cast_id"": 2, ""character"": ""Bruce Wayne / Ba...","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,260000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://movies.disney.com/john-carter,49529,"[{""id"": 818, ""name"": ""based on novel""}, {""id"":...",en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}]",...,284139100,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124,"[{""cast_id"": 5, ""character"": ""John Carter"", ""c...","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."


In [318]:
df_movies.rename(columns={"title_x": "title"}, inplace=True)

In [319]:
df_movies.head(1)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,cast,crew
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."


In [320]:
dataframes = [df_movies, dfs_concat]

In [321]:
df_final = pd.concat(dataframes)

In [322]:
df_final.head(1)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,cast,crew
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."


In [323]:
df_final.tail(1)

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,cast,crew
69,0,"[{'id': 18, 'name': 'Drama'}]",,565307,"[{'id': 378, 'name': 'prison'}, {'id': 2501, '...",en,Clemency,Years of carrying out death row executions hav...,5.873,"[{'id': 112399, 'logo_path': '/plWc00ADe9sk3sr...",...,309776,,"[{'english_name': 'English', 'iso_639_1': 'en'...",Released,Murder: How do we feel about it?,Clemency,6.6,68,"[{'adult': False, 'gender': 1, 'id': 1981, 'kn...","[{'adult': False, 'gender': 1, 'id': 1981, 'kn..."


In [324]:
df_movies.shape

(4803, 22)

In [325]:
dfs_concat.shape

(514, 21)

In [326]:
df_final.shape

(5317, 22)
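The shapes above show that `dfs_concat` has one column fewer than `df_movies`; judging by the column lists and the `NaN` in the last row of `df_final`, the missing column appears to be `runtime`. A small sketch, on toy frames with illustrative values, of how the column difference could be checked before concatenating, and how `ignore_index=True` avoids repeated index labels in the result:

```python
import pandas as pd

# Toy frames standing in for df_movies and dfs_concat (illustrative values only).
older = pd.DataFrame({"id": [19995], "title": ["Avatar"], "runtime": [162.0]})
newer = pd.DataFrame({"id": [406563], "title": ["Insidious: The Last Key"]})

# Columns present in only one frame are filled with NaN by pd.concat.
only_in_older = sorted(set(older.columns) - set(newer.columns))
only_in_newer = sorted(set(newer.columns) - set(older.columns))
print(only_in_older, only_in_newer)

# ignore_index=True renumbers rows 0..n-1 instead of repeating each frame's index.
combined = pd.concat([older, newer], ignore_index=True)
print(combined.shape)
```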

#### Save to a CSV file

In [327]:
df_final.to_csv(r'/content/df_final.csv', index=False, header=True)

In [328]:
# Read the exported file back from the same path to verify it
df_test = pd.read_csv("/content/df_final.csv")

In [329]:
df_test.head()

Unnamed: 0,budget,genres,homepage,id,keywords,original_language,original_title,overview,popularity,production_companies,...,revenue,runtime,spoken_languages,status,tagline,title,vote_average,vote_count,cast,crew
0,237000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.avatarmovie.com/,19995,"[{""id"": 1463, ""name"": ""culture clash""}, {""id"":...",en,Avatar,"In the 22nd century, a paraplegic Marine is di...",150.437577,"[{""name"": ""Ingenious Film Partners"", ""id"": 289...",...,2787965087,162.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}, {""iso...",Released,Enter the World of Pandora.,Avatar,7.2,11800,"[{""cast_id"": 242, ""character"": ""Jake Sully"", ""...","[{""credit_id"": ""52fe48009251416c750aca23"", ""de..."
1,300000000,"[{""id"": 12, ""name"": ""Adventure""}, {""id"": 14, ""...",http://disney.go.com/disneypictures/pirates/,285,"[{""id"": 270, ""name"": ""ocean""}, {""id"": 726, ""na...",en,Pirates of the Caribbean: At World's End,"Captain Barbossa, long believed to be dead, ha...",139.082615,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}, {""...",...,961000000,169.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"At the end of the world, the adventure begins.",Pirates of the Caribbean: At World's End,6.9,4500,"[{""cast_id"": 4, ""character"": ""Captain Jack Spa...","[{""credit_id"": ""52fe4232c3a36847f800b579"", ""de..."
2,245000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://www.sonypictures.com/movies/spectre/,206647,"[{""id"": 470, ""name"": ""spy""}, {""id"": 818, ""name...",en,Spectre,A cryptic message from Bond’s past sends him o...,107.376788,"[{""name"": ""Columbia Pictures"", ""id"": 5}, {""nam...",...,880674609,148.0,"[{""iso_639_1"": ""fr"", ""name"": ""Fran\u00e7ais""},...",Released,A Plan No One Escapes,Spectre,6.3,4466,"[{""cast_id"": 1, ""character"": ""James Bond"", ""cr...","[{""credit_id"": ""54805967c3a36829b5002c41"", ""de..."
3,250000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 80, ""nam...",http://www.thedarkknightrises.com/,49026,"[{""id"": 849, ""name"": ""dc comics""}, {""id"": 853,...",en,The Dark Knight Rises,Following the death of District Attorney Harve...,112.31295,"[{""name"": ""Legendary Pictures"", ""id"": 923}, {""...",...,1084939099,165.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,The Legend Ends,The Dark Knight Rises,7.6,9106,"[{""cast_id"": 2, ""character"": ""Bruce Wayne / Ba...","[{""credit_id"": ""52fe4781c3a36847f81398c3"", ""de..."
4,260000000,"[{""id"": 28, ""name"": ""Action""}, {""id"": 12, ""nam...",http://movies.disney.com/john-carter,49529,"[{""id"": 818, ""name"": ""based on novel""}, {""id"":...",en,John Carter,"John Carter is a war-weary, former military ca...",43.926995,"[{""name"": ""Walt Disney Pictures"", ""id"": 2}]",...,284139100,132.0,"[{""iso_639_1"": ""en"", ""name"": ""English""}]",Released,"Lost in our world, found in another.",John Carter,6.1,2124,"[{""cast_id"": 5, ""character"": ""John Carter"", ""c...","[{""credit_id"": ""52fe479ac3a36847f813eaa3"", ""de..."
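After the CSV round trip, nested columns such as `genres`, `cast` and `crew` come back as plain strings. The rows from the Kaggle files use JSON syntax (double quotes), while the rows collected from the API were stored as Python lists and are written in Python literal syntax (single quotes, `None`, `False`). The sketch below, with a hypothetical helper named `parse_nested`, shows one way a later stage could decode both styles:

```python
import ast
import json


def parse_nested(text):
    """Turn a stringified list of dicts back into Python objects."""
    if not isinstance(text, str):
        return text  # e.g. NaN for missing values
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Python literals use single quotes and None/False/True, which JSON rejects.
        return ast.literal_eval(text)


kaggle_style = '[{"id": 28, "name": "Action"}]'
api_style = "[{'id': 18, 'name': 'Drama', 'logo_path': None}]"
print(parse_nested(kaggle_style)[0]["name"])
print(parse_nested(api_style)[0]["name"])
```

It could be applied column by column, for example `df_test["cast"].map(parse_nested)`.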
