### Movie budgets

Using these data, we want to find out whether a film's budget is related to how much audiences like the film. For this purpose we use two different DataFrames.

In [567]:
import numpy as np
import os
import re
import pandas as pd
import seaborn as sns
from datetime import datetime
import squarify
import matplotlib.pyplot as plt

***

### df_budgets

In [568]:
df_budgets = pd.read_csv("top-500-movies.csv", index_col = 0)

In [569]:
df_budgets.head(2)

rank,release_date,title,url,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,year
1,2019-04-23,Avengers: Endgame,/movie/Avengers-Endgame-(2019)#tab=summary,400000000,858373000,2797800564,357115007.0,PG-13,Action,4662.0,181.0,2019.0
2,2011-05-20,Pirates of the Caribbean: On Stranger Tides,/movie/Pirates-of-the-Caribbean-On-Stranger-Ti...,379000000,241071802,1045713802,90151958.0,PG-13,Adventure,4164.0,136.0,2011.0


First, we drop the "url" column:

In [570]:
df_budgets.drop("url", inplace = True, axis = 1)

In [571]:
df_budgets.dtypes

release_date        object
title               object
production_cost      int64
domestic_gross       int64
worldwide_gross      int64
opening_weekend    float64
mpaa                object
genre               object
theaters           float64
runtime            float64
year               float64
dtype: object

In [572]:
df_budgets.isnull().sum()

release_date        1
title               0
production_cost     0
domestic_gross      0
worldwide_gross     0
opening_weekend    21
mpaa                8
genre               5
theaters           21
runtime            13
year                1
dtype: int64

Several columns contain null values (at most 21 of the 500 rows in any one column). We will fill them in manually.

In [573]:
# Only one release date is missing; fill it with the year as a string so the column stays text
df_budgets["release_date"] = df_budgets["release_date"].fillna("2022")

In [574]:
df_budgets[df_budgets["theaters"].isna()]

rank,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,year
9,2023-07-11,Mission: Impossible Dead Reckoning Part One,290000000,0,0,,,Action,,,2023.0
84,2020-09-04,Mulan,200000000,0,69965374,,PG-13,Adventure,,115.0,2020.0
85,2021-07-02,The Tomorrow War,200000000,0,19220000,,PG-13,Action,,140.0,2021.0
86,2022-07-13,The Gray Man,200000000,0,451178,,PG-13,Thriller/Suspense,,129.0,2022.0
110,2005-12-09,"The Chronicles of Narnia: The Lion, the Witch a…",180000000,291710957,720539572,,,,,,2005.0
141,2022-03-10,Turning Red,175000000,0,10965045,,PG,Adventure,,100.0,2022.0
180,2019-11-01,The Irishman,159000000,0,910234,,R,Drama,,210.0,2019.0
182,2010-12-10,The Chronicles of Narnia: The Voyage of the Daw…,155000000,104386950,418186950,,,,,,2010.0
235,2021-11-04,Red Notice,150000000,0,173638,,PG-13,Action,,115.0,2021.0
236,2019-12-13,6 Underground,150000000,0,0,,R,Action,,128.0,2019.0


***

### df_ratings

In [575]:
df_ratings = pd.read_csv("movie_ratings.csv")

In [576]:
df_ratings.head()

Unnamed: 0,filmtv_id,title,year,genre,duration,country,directors,actors,avg_vote,critics_vote,public_vote,total_votes,description,notes,humor,rhythm,effort,tension,erotism
0,2,Bugs Bunny's Third Movie: 1001 Rabbit Tales,1982,Animation,76,United States,"David Detiege, Art Davis, Bill Perez",,7.7,8.0,7.0,22,"With two protruding front teeth, a slightly sl...","These are many small independent stories, whic...",3,3,0,0,0
1,3,18 anni tra una settimana,1991,Drama,98,Italy,Luigi Perelli,"Kim Rossi Stuart, Simona Cavallari, Ennio Fant...",6.5,6.0,7.0,4,"Samantha, not yet eighteen, leaves the comfort...","Luigi Perelli, the director of the ""Piovra"", o...",0,2,0,2,0
2,17,Ride a Wild Pony,1976,Romantic,91,United States,Don Chaffey,"Michael Craig, John Meillon, Eva Griffith, Gra...",5.6,6.0,5.0,9,"In the Australia of the pioneers, a boy and a ...","""Ecological"" story with a happy ending, not wi...",1,2,1,0,0
3,18,Diner,1982,Comedy,95,United States,Barry Levinson,"Mickey Rourke, Steve Guttenberg, Ellen Barkin,...",7.0,8.0,6.0,18,Five boys from Baltimore have a habit of meeti...,A cast of will be famous for Levinson's direct...,2,2,0,1,2
4,20,A che servono questi quattrini?,1942,Comedy,85,Italy,Esodo Pratelli,"Eduardo De Filippo, Peppino De Filippo, Clelia...",5.9,5.33,7.0,15,"With a stratagem, the penniless and somewhat p...",Taken from the play by Armando Curcio that the...,3,1,1,0,0


We drop the columns we do not need:

In [577]:
# Drop all unneeded columns in a single call
df_ratings.drop(columns=["actors", "humor", "rhythm", "effort", "tension", "erotism", "total_votes", "notes"], inplace=True)

In [585]:
df_ratings[df_ratings["title"] == "Avengers: Endgame"]

Unnamed: 0,filmtv_id,title,year,genre,duration,country,directors,avg_vote,critics_vote,public_vote,description
36532,166273,Avengers: Endgame,2019,Super-hero,181,United States,"Anthony Russo, Joe Russo",6.2,6.33,6.0,Half of the living beings in the universe were...


In [586]:
df_ratings.head(2)

Unnamed: 0,filmtv_id,title,year,genre,duration,country,directors,avg_vote,critics_vote,public_vote,description
0,2,Bugs Bunny's Third Movie: 1001 Rabbit Tales,1982,Animation,76,United States,"David Detiege, Art Davis, Bill Perez",7.7,8.0,7.0,"With two protruding front teeth, a slightly sl..."
1,3,18 anni tra una settimana,1991,Drama,98,Italy,Luigi Perelli,6.5,6.0,7.0,"Samantha, not yet eighteen, leaves the comfort..."


In [587]:
df_budgets.head()

rank,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,year
1,2019-04-23,Avengers: Endgame,400000000,858373000,2797800564,357115007.0,PG-13,Action,4662.0,181.0,2019.0
2,2011-05-20,Pirates of the Caribbean: On Stranger Tides,379000000,241071802,1045713802,90151958.0,PG-13,Adventure,4164.0,136.0,2011.0
3,2015-04-22,Avengers: Age of Ultron,365000000,459005868,1395316979,191271109.0,PG-13,Action,4276.0,141.0,2015.0
4,2015-12-16,Star Wars Ep. VII: The Force Awakens,306000000,936662225,2064615817,247966675.0,PG-13,Adventure,4134.0,136.0,2015.0
5,2018-04-25,Avengers: Infinity War,300000000,678815482,2048359754,257698183.0,PG-13,Action,4474.0,156.0,2018.0


In [588]:
df_budgets.isnull().sum()

release_date        0
title               0
production_cost     0
domestic_gross      0
worldwide_gross     0
opening_weekend    21
mpaa                8
genre               5
theaters           21
runtime            13
year                1
dtype: int64

In [589]:
df_ratings.isnull().sum()

filmtv_id          0
title              0
year               0
genre             95
duration           0
country           11
directors         33
avg_vote           0
critics_vote    4600
public_vote      474
description     1455
dtype: int64

In [590]:
df_ratings.shape[0]

40303

In [591]:
df_budgets.shape[0]

500

***

### Merging both DataFrames

In [592]:
df_total_merge = df_budgets.merge(df_ratings)

In [593]:
df_total_merge.shape

(97, 19)
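
Only 97 of the 500 budget rows survive because `merge` called without `on` joins on every column name the two frames share (here `title`, `genre` and `year`) and keeps only rows that match on all of them. The two sources label genres differently, so a film such as Avengers: Endgame ("Action" in one, "Super-hero" in the other) is dropped. The sketch below reproduces this on two toy frames built from values shown above; the names `budgets`, `ratings`, `default_merge` and `keyed_merge`, and the choice of `title` plus `year` as keys, are assumptions for illustration.

```python
import pandas as pd

# Toy stand-ins for df_budgets and df_ratings (values copied from the outputs above)
budgets = pd.DataFrame({
    "title": ["Avengers: Endgame", "Spectre"],
    "genre": ["Action", "Action"],
    "year": [2019.0, 2015.0],
    "production_cost": [400000000, 300000000],
})
ratings = pd.DataFrame({
    "title": ["Avengers: Endgame", "Spectre"],
    "genre": ["Super-hero", "Action"],
    "year": [2019, 2015],
    "avg_vote": [6.2, 6.1],
})

# Default merge: joins on title, genre AND year, so Endgame is lost
default_merge = budgets.merge(ratings)

# Joining on title and year only keeps both films;
# the suffixes label the two genre columns
keyed_merge = budgets.merge(
    ratings, on=["title", "year"], suffixes=("_budget", "_rating")
)

print(len(default_merge), len(keyed_merge))
```

Matching on title and year still misses films whose titles are spelled differently in the two sources, so the row count after such a merge is worth checking against the 500 budget rows.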

In [594]:
df_total_merge.head()

Unnamed: 0,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,year,filmtv_id,duration,country,directors,avg_vote,critics_vote,public_vote,description
0,2015-10-06,Spectre,300000000,200074175,879500760,70403148.0,PG-13,Action,3929.0,148.0,2015.0,73835,148,"United States, Great Britain",Sam Mendes,6.1,6.38,6.0,A cryptic message from his past sends James Bo...
1,2013-12-12,The Hobbit: The Desolation of Smaug,250000000,258241522,959358436,73645197.0,PG-13,Adventure,3928.0,201.0,2013.0,44164,170,"United States, New Zealand",Peter Jackson,6.4,6.3,7.0,"After managing to escape the misty mountains, ..."
2,2017-04-07,The Fate of the Furious,250000000,225764765,1236703796,98786705.0,PG-13,Action,4329.0,136.0,2017.0,124716,136,"United States, Japan, France, Canada, American...",F. Gary Gray,4.8,4.71,5.0,Dom and Letty are on their honeymoon while Bri...
3,2012-07-19,The Dark Knight Rises,230000000,448139099,1082228107,160887295.0,PG-13,Action,4404.0,164.0,2012.0,45819,165,"United States, Great Britain",Christopher Nolan,6.9,7.08,7.0,It's been a few years since Bruce Wayne (Chris...
4,2012-04-11,Battleship,220000000,65233400,313477717,25534825.0,PG-13,Action,3702.0,130.0,2012.0,46201,131,United States,Peter Berg,4.7,4.64,5.0,To save the planet from the attack of a superi...


In [595]:
df_total_concat = pd.concat([df_budgets, df_ratings], axis = 1)

**To fix: `pd.concat` with `axis = 1` aligns rows on the index, not on the film title.**

In [596]:
df_budgets.head()

rank,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,year
1,2019-04-23,Avengers: Endgame,400000000,858373000,2797800564,357115007.0,PG-13,Action,4662.0,181.0,2019.0
2,2011-05-20,Pirates of the Caribbean: On Stranger Tides,379000000,241071802,1045713802,90151958.0,PG-13,Adventure,4164.0,136.0,2011.0
3,2015-04-22,Avengers: Age of Ultron,365000000,459005868,1395316979,191271109.0,PG-13,Action,4276.0,141.0,2015.0
4,2015-12-16,Star Wars Ep. VII: The Force Awakens,306000000,936662225,2064615817,247966675.0,PG-13,Adventure,4134.0,136.0,2015.0
5,2018-04-25,Avengers: Infinity War,300000000,678815482,2048359754,257698183.0,PG-13,Action,4474.0,156.0,2018.0


In [597]:
df_ratings = df_ratings.set_index('title')

In [598]:
df_ratings.head()

title,filmtv_id,year,genre,duration,country,directors,avg_vote,critics_vote,public_vote,description
Bugs Bunny's Third Movie: 1001 Rabbit Tales,2,1982,Animation,76,United States,"David Detiege, Art Davis, Bill Perez",7.7,8.0,7.0,"With two protruding front teeth, a slightly sl..."
18 anni tra una settimana,3,1991,Drama,98,Italy,Luigi Perelli,6.5,6.0,7.0,"Samantha, not yet eighteen, leaves the comfort..."
Ride a Wild Pony,17,1976,Romantic,91,United States,Don Chaffey,5.6,6.0,5.0,"In the Australia of the pioneers, a boy and a ..."
Diner,18,1982,Comedy,95,United States,Barry Levinson,7.0,8.0,6.0,Five boys from Baltimore have a habit of meeti...
A che servono questi quattrini?,20,1942,Comedy,85,Italy,Esodo Pratelli,5.9,5.33,7.0,"With a stratagem, the penniless and somewhat p..."


In [599]:
df_total2 = pd.concat([df_budgets, df_ratings], axis = 1, join='inner')

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
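
The error is raised because `pd.concat` along `axis = 1` has to align both frames on their index, which requires unique labels, and the `title` index of `df_ratings` contains repeated titles. Note also that `df_budgets` is still indexed by `rank` here, so even with unique labels an inner join would compare rank numbers with titles. A minimal sketch of one possible fix on toy data follows; the frame names, the rank values and the repeated "Battleship" row are hypothetical, and keeping the first row per title is an arbitrary choice.

```python
import pandas as pd

# Toy stand-in for df_budgets, indexed by rank (rank values are hypothetical)
budgets = pd.DataFrame(
    {"title": ["Spectre", "Battleship"],
     "production_cost": [300000000, 220000000]},
    index=pd.Index([1, 2], name="rank"),
)

# Toy stand-in for df_ratings; the second "Battleship" row is hypothetical
ratings = pd.DataFrame({
    "title": ["Spectre", "Battleship", "Battleship"],
    "avg_vote": [6.1, 4.7, 5.0],
})

# Index both frames by title, keeping the first row for each repeated title
budgets_by_title = budgets.set_index("title")
ratings_by_title = ratings.drop_duplicates(subset="title").set_index("title")

# With unique labels on both sides, the inner join keeps titles present in both
combined = pd.concat([budgets_by_title, ratings_by_title], axis=1, join="inner")
print(combined)
```

Dropping repeated titles can discard the wrong film when a remake shares its name with the original, so matching on title and year together is likely to be more reliable.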

In [None]:
df_total_concat.head()

Unnamed: 0,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,...,title.1,year,genre.1,duration,country,directors,avg_vote,critics_vote,public_vote,description
0,,,,,,,,,,,...,Bugs Bunny's Third Movie: 1001 Rabbit Tales,1982,Animation,76,United States,"David Detiege, Art Davis, Bill Perez",7.7,8.0,7.0,"With two protruding front teeth, a slightly sl..."
1,2019-04-23,Avengers: Endgame,400000000.0,858373000.0,2797801000.0,357115007.0,PG-13,Action,4662.0,181.0,...,18 anni tra una settimana,1991,Drama,98,Italy,Luigi Perelli,6.5,6.0,7.0,"Samantha, not yet eighteen, leaves the comfort..."
2,2011-05-20,Pirates of the Caribbean: On Stranger Tides,379000000.0,241071802.0,1045714000.0,90151958.0,PG-13,Adventure,4164.0,136.0,...,Ride a Wild Pony,1976,Romantic,91,United States,Don Chaffey,5.6,6.0,5.0,"In the Australia of the pioneers, a boy and a ..."
3,2015-04-22,Avengers: Age of Ultron,365000000.0,459005868.0,1395317000.0,191271109.0,PG-13,Action,4276.0,141.0,...,Diner,1982,Comedy,95,United States,Barry Levinson,7.0,8.0,6.0,Five boys from Baltimore have a habit of meeti...
4,2015-12-16,Star Wars Ep. VII: The Force Awakens,306000000.0,936662225.0,2064616000.0,247966675.0,PG-13,Adventure,4134.0,136.0,...,A che servono questi quattrini?,1942,Comedy,85,Italy,Esodo Pratelli,5.9,5.33,7.0,"With a stratagem, the penniless and somewhat p..."


In [None]:
df_total_concat.shape

(40303, 22)
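
The 40,303 rows follow from how `pd.concat` aligns along `axis = 1`: rows are matched by index label, and every label found in either frame is kept. `df_budgets` is indexed by `rank`, which starts at 1, while `df_ratings` still has its default index starting at 0, so each film is placed beside whatever rating happens to carry the same number, not beside its own rating. The toy frames below (`left`, `right` and their values are made up for illustration) show the same effect.

```python
import pandas as pd

# One frame indexed from 1 (like rank), one with the default 0-based index
left = pd.DataFrame({"title": ["A", "B"]}, index=[1, 2])
right = pd.DataFrame({"avg_vote": [7.7, 6.5, 5.6]})

# axis=1 matches rows by index label and keeps the union of labels;
# sort_index() only makes the printed order predictable
side_by_side = pd.concat([left, right], axis=1).sort_index()
print(side_by_side)
```

Label 0 exists only in `right`, so its `title` is missing, which is the same pattern as the first row of `df_total_concat.head()`.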

In [None]:
df_total_concat.isnull().sum()

release_date       39803
title              39803
production_cost    39803
domestic_gross     39803
worldwide_gross    39803
opening_weekend    39824
mpaa               39811
genre              39808
theaters           39824
runtime            39816
year               39804
filmtv_id              0
title                  0
year                   0
genre                 95
duration               0
country               11
directors             33
avg_vote               0
critics_vote        4600
public_vote          474
description         1455
dtype: int64

In [None]:
df_final = df_total_concat.dropna()

In [None]:
df_final.isnull().sum()

release_date       0
title              0
production_cost    0
domestic_gross     0
worldwide_gross    0
opening_weekend    0
mpaa               0
genre              0
theaters           0
runtime            0
year               0
filmtv_id          0
title              0
year               0
genre              0
duration           0
country            0
directors          0
avg_vote           0
critics_vote       0
public_vote        0
description        0
dtype: int64

In [None]:
df_final.shape

(474, 22)

In [None]:
df_final.tail()

Unnamed: 0,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,...,title.1,year,genre.1,duration,country,directors,avg_vote,critics_vote,public_vote,description
494,2008-02-14,The Spiderwick Chronicles,92500000.0,71195053.0,162839667.0,19004058.0,PG,Adventure,3847.0,96.0,...,"Kiss Me, Stupid",1964,Comedy,123,United States,Billy Wilder,8.3,8.43,8.0,Orville is a piano teacher from a provincial t...
495,2004-10-22,The Incredibles,92000000.0,261441092.0,631441092.0,70467623.0,PG,Adventure,3933.0,115.0,...,Il bacio,1974,Drama,105,Italy,Mario Lanfranchi,3.8,4.0,4.0,A coveted noble title (and relative patrimony)...
496,2013-02-06,A Good Day to Die Hard,92000000.0,67349198.0,304249198.0,24834845.0,R,Action,3555.0,98.0,...,Killer's Kiss,1955,Noir,67,United States,Stanley Kubrick,7.1,7.1,7.0,"Dave Gordon, a failed boxer, meets Gloria who ..."
497,2004-04-09,The Alamo,92000000.0,22406362.0,23911362.0,9124701.0,PG-13,Western,2609.0,137.0,...,Kiss of the Spider Woman,1985,Drama,119,"United States, Brazil",Hector Babenco,7.7,7.75,8.0,Two men are being held in the same cell in a S...
499,2013-12-19,The Secret Life of Walter Mitty,91000000.0,58236838.0,187861183.0,12765508.0,PG,Adventure,2922.0,114.0,...,Cat People,1982,Horror,118,United States,Paul Schrader,6.7,6.79,7.0,"Irena Gallier (Kinski) is already grown up, bu..."


In [None]:
df_final[df_final.index.duplicated()]

Unnamed: 0,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,...,title.1,year,genre.1,duration,country,directors,avg_vote,critics_vote,public_vote,description


In [None]:
df_final.index.is_unique

True

In [None]:
df_final.loc[~df_final.index.duplicated(), :]

Unnamed: 0,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,...,title.1,year,genre.1,duration,country,directors,avg_vote,critics_vote,public_vote,description
1,2019-04-23,Avengers: Endgame,400000000.0,858373000.0,2.797801e+09,357115007.0,PG-13,Action,4662.0,181.0,...,18 anni tra una settimana,1991,Drama,98,Italy,Luigi Perelli,6.5,6.00,7.0,"Samantha, not yet eighteen, leaves the comfort..."
2,2011-05-20,Pirates of the Caribbean: On Stranger Tides,379000000.0,241071802.0,1.045714e+09,90151958.0,PG-13,Adventure,4164.0,136.0,...,Ride a Wild Pony,1976,Romantic,91,United States,Don Chaffey,5.6,6.00,5.0,"In the Australia of the pioneers, a boy and a ..."
3,2015-04-22,Avengers: Age of Ultron,365000000.0,459005868.0,1.395317e+09,191271109.0,PG-13,Action,4276.0,141.0,...,Diner,1982,Comedy,95,United States,Barry Levinson,7.0,8.00,6.0,Five boys from Baltimore have a habit of meeti...
4,2015-12-16,Star Wars Ep. VII: The Force Awakens,306000000.0,936662225.0,2.064616e+09,247966675.0,PG-13,Adventure,4134.0,136.0,...,A che servono questi quattrini?,1942,Comedy,85,Italy,Esodo Pratelli,5.9,5.33,7.0,"With a stratagem, the penniless and somewhat p..."
5,2018-04-25,Avengers: Infinity War,300000000.0,678815482.0,2.048360e+09,257698183.0,PG-13,Action,4474.0,156.0,...,The Uranian Conspiracy,1978,Spy,117,"Italy, Germany, Israel","Gianfranco Baldanello, Menahem Golan",4.8,3.50,6.0,Two Israeli secret agents discover that traffi...
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
494,2008-02-14,The Spiderwick Chronicles,92500000.0,71195053.0,1.628397e+08,19004058.0,PG,Adventure,3847.0,96.0,...,"Kiss Me, Stupid",1964,Comedy,123,United States,Billy Wilder,8.3,8.43,8.0,Orville is a piano teacher from a provincial t...
495,2004-10-22,The Incredibles,92000000.0,261441092.0,6.314411e+08,70467623.0,PG,Adventure,3933.0,115.0,...,Il bacio,1974,Drama,105,Italy,Mario Lanfranchi,3.8,4.00,4.0,A coveted noble title (and relative patrimony)...
496,2013-02-06,A Good Day to Die Hard,92000000.0,67349198.0,3.042492e+08,24834845.0,R,Action,3555.0,98.0,...,Killer's Kiss,1955,Noir,67,United States,Stanley Kubrick,7.1,7.10,7.0,"Dave Gordon, a failed boxer, meets Gloria who ..."
497,2004-04-09,The Alamo,92000000.0,22406362.0,2.391136e+07,9124701.0,PG-13,Western,2609.0,137.0,...,Kiss of the Spider Woman,1985,Drama,119,"United States, Brazil",Hector Babenco,7.7,7.75,8.0,Two men are being held in the same cell in a S...


In [None]:
df_final.dtypes

release_date        object
title               object
production_cost    float64
domestic_gross     float64
worldwide_gross    float64
opening_weekend    float64
mpaa                object
genre               object
theaters           float64
runtime            float64
year               float64
filmtv_id            int64
title               object
year                 int64
genre               object
duration             int64
country             object
directors           object
avg_vote           float64
critics_vote       float64
public_vote        float64
description         object
dtype: object

In [None]:
columns = list(df_final.columns)

# Positions bounding the two runs of whole-number columns stored as float64
start1 = columns.index("production_cost")
end1 = columns.index("opening_weekend")
start2 = columns.index("theaters")
end2 = columns.index("year")

# Cast every column that falls inside either run to int
for index, col in enumerate(columns):
    if start1 <= index <= end1 or start2 <= index <= end2:
        df_final[col] = df_final[col].astype(int)

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_final[col] = df_final[col].astype(int)
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy
  df_final[col] = df_final[col].astype(int)
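
This is pandas' `SettingWithCopyWarning`: `df_final` was derived from `df_total_concat` with `dropna()`, and pandas cannot tell whether the assignment is meant to reach the original frame. The conversion itself still takes effect. One common way to avoid the warning, sketched here on a toy frame (the names `raw` and `clean` are assumptions), is to take an explicit copy and cast the columns in one `astype` call.

```python
import pandas as pd

# Toy frame standing in for df_total_concat (one row has a missing value)
raw = pd.DataFrame({
    "title": ["Avengers: Endgame", "Mulan"],
    "production_cost": [400000000.0, 200000000.0],
    "theaters": [4662.0, None],
})

# An explicit copy makes the result independent of `raw`
clean = raw.dropna().copy()

# Cast the whole-number columns in one call
clean = clean.astype({"production_cost": int, "theaters": int})
print(clean.dtypes)
```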


In [None]:
df_final.dtypes

release_date        object
title               object
production_cost      int64
domestic_gross       int64
worldwide_gross      int64
opening_weekend      int64
mpaa                object
genre               object
theaters             int64
runtime              int64
year                 int64
filmtv_id            int64
title               object
year                 int64
genre               object
duration             int64
country             object
directors           object
avg_vote           float64
critics_vote       float64
public_vote        float64
description         object
dtype: object

In [None]:
df_final.head(2)

Unnamed: 0,release_date,title,production_cost,domestic_gross,worldwide_gross,opening_weekend,mpaa,genre,theaters,runtime,...,title.1,year,genre.1,duration,country,directors,avg_vote,critics_vote,public_vote,description
1,2019-04-23,Avengers: Endgame,400000000,858373000,2797800564,357115007,PG-13,Action,4662,181,...,18 anni tra una settimana,1991,Drama,98,Italy,Luigi Perelli,6.5,6.0,7.0,"Samantha, not yet eighteen, leaves the comfort..."
2,2011-05-20,Pirates of the Caribbean: On Stranger Tides,379000000,241071802,1045713802,90151958,PG-13,Adventure,4164,136,...,Ride a Wild Pony,1976,Romantic,91,United States,Don Chaffey,5.6,6.0,5.0,"In the Australia of the pioneers, a boy and a ..."


In [None]:
df_final[df_final["title"] == "The Incredibles"]

ValueError: cannot reindex from a duplicate axis
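
This error comes from the column labels, not the row index checked earlier: both source frames had a `title` column, so `df_final` carries the label `title` twice (as the `dtypes` output shows) and `df_final["title"]` returns a two-column DataFrame instead of a Series. A minimal sketch of detecting and renaming repeated labels follows; the toy row reuses values from the `tail()` output above, and the `_rating` suffix is an arbitrary choice.

```python
import pandas as pd

# Toy frame with a repeated "title" label, as produced by the side-by-side concat
df = pd.DataFrame(
    [["The Incredibles", 92000000, "Il bacio", 3.8]],
    columns=["title", "production_cost", "title", "avg_vote"],
)

# List the labels that occur more than once
repeated = df.columns[df.columns.duplicated()].tolist()

# Give every column a unique name: later occurrences get a suffix
seen = set()
new_names = []
for name in df.columns:
    new_names.append(f"{name}_rating" if name in seen else name)
    seen.add(name)
df.columns = new_names

# With unique labels, the comparison yields a Series and filtering works
match = df[df["title"] == "The Incredibles"]
print(match)
```

Passing `suffixes` to `merge` when the frames are first combined avoids the repeated labels altogether.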