# DS3 - Introduction to Pandas

Welcome to our pandas intro notebook. Here you can practice useful functions for DataFrame manipulation and analysis, with interactive problems to guide you through the learning process. Have fun!

First, let's import pandas and NumPy, two staples of most Jupyter notebook sessions.

In [1]:
import pandas as pd
import numpy as np

The file Movies.csv contains data on movies shown on various streaming services. Let's read it into Python:

In [2]:
df = pd.read_csv('Movies.csv')
df.head()

Unnamed: 0.1,Unnamed: 0,ID,Title,Year,Age,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Type,Directors,Genres,Country,Language,Runtime
0,0,1,Inception,2010,13+,8.8,87%,1,0,0,0,0,Christopher Nolan,"Action,Adventure,Sci-Fi,Thriller","United States,United Kingdom","English,Japanese,French",148.0
1,1,2,The Matrix,1999,18+,8.7,87%,1,0,0,0,0,"Lana Wachowski,Lilly Wachowski","Action,Sci-Fi",United States,English,136.0
2,2,3,Avengers: Infinity War,2018,13+,8.5,84%,1,0,0,0,0,"Anthony Russo,Joe Russo","Action,Adventure,Sci-Fi",United States,English,149.0
3,3,4,Back to the Future,1985,7+,8.5,96%,1,0,0,0,0,Robert Zemeckis,"Adventure,Comedy,Sci-Fi",United States,English,116.0
4,4,5,"The Good, the Bad and the Ugly",1966,18+,8.8,97%,1,0,1,0,0,Sergio Leone,Western,"Italy,Spain,West Germany",Italian,161.0


**Note.** pandas loads data into a DataFrame through its family of pd.read_ functions (pd.read_csv, pd.read_excel, pd.read_json, etc.), one for each file type.

For example:

movies = pd.read_excel('Movies.xls')
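
To illustrate the note above without needing any files on disk, here is a minimal sketch that loads the same two-row table from CSV text and from JSON text. The inline strings and the names `csv_text`, `json_text`, `from_csv`, and `from_json` are made up for this example. With real files you would pass a path instead of a `StringIO` object.

```python
import io
import pandas as pd

# Hypothetical two-row table, written inline so the example needs no files.
csv_text = "Title,Year\nInception,2010\nThe Matrix,1999\n"
json_text = '[{"Title": "Inception", "Year": 2010}, {"Title": "The Matrix", "Year": 1999}]'

# Each reader parses its own format but returns the same kind of object.
from_csv = pd.read_csv(io.StringIO(csv_text))
from_json = pd.read_json(io.StringIO(json_text))

print(from_csv.shape, from_json.shape)
print(list(from_csv.columns), list(from_json.columns))
```

Whichever reader you use, the result is an ordinary DataFrame, so everything later in this notebook applies equally.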

In [3]:
# Feel free to play around with the integer n in df.head(n) or df.tail(n)
df.tail(3)

Unnamed: 0.1,Unnamed: 0,ID,Title,Year,Age,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Type,Directors,Genres,Country,Language,Runtime
16741,16741,16742,Sharks of Lost Island,2013,,5.7,,0,0,0,1,0,Neil Gelinas,Documentary,United States,English,
16742,16742,16743,Man Among Cheetahs,2017,,6.6,,0,0,0,1,0,Richard Slater-Jones,Documentary,United States,English,
16743,16743,16744,In Beaver Valley,1950,,,,0,0,0,1,0,James Algar,"Documentary,Short,Family",United States,English,32.0


## DataFrame Structure and Data Types

Take a look at the structure of your DataFrame using *df.shape*, which returns its dimensions as a (rows, columns) tuple. DataFrames often contain several data types, and it helps to know which type each column holds. Here, *df.dtypes* is especially helpful.

In [4]:
df.shape

(16744, 17)

In [5]:
df.dtypes

Unnamed: 0           int64
ID                   int64
Title               object
Year                 int64
Age                 object
IMDb               float64
Rotten Tomatoes     object
Netflix              int64
Hulu                 int64
Prime Video          int64
Disney+              int64
Type                 int64
Directors           object
Genres              object
Country             object
Language            object
Runtime            float64
dtype: object
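
As a small, self-contained sketch of the same two attributes, the example below builds a toy frame with made-up values (the name `toy` is hypothetical) and uses its shape and dtypes programmatically rather than just printing them.

```python
import pandas as pd

# Toy frame with made-up values, mirroring three of the Movies.csv columns.
toy = pd.DataFrame({
    "Title": ["Inception", "The Matrix"],
    "Year": [2010, 1999],
    "IMDb": [8.8, 8.7],
})

# shape is a (rows, columns) tuple, so it can be unpacked directly.
n_rows, n_cols = toy.shape

# select_dtypes filters columns by type; "number" covers ints and floats.
numeric_cols = toy.select_dtypes(include="number").columns.tolist()

print(n_rows, n_cols)   # 2 3
print(numeric_cols)     # ['Year', 'IMDb']
```

Knowing which columns are numeric is useful before sorting or filtering on them, as later cells do.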

In [6]:
df.head(3)

Unnamed: 0.1,Unnamed: 0,ID,Title,Year,Age,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Type,Directors,Genres,Country,Language,Runtime
0,0,1,Inception,2010,13+,8.8,87%,1,0,0,0,0,Christopher Nolan,"Action,Adventure,Sci-Fi,Thriller","United States,United Kingdom","English,Japanese,French",148.0
1,1,2,The Matrix,1999,18+,8.7,87%,1,0,0,0,0,"Lana Wachowski,Lilly Wachowski","Action,Sci-Fi",United States,English,136.0
2,2,3,Avengers: Infinity War,2018,13+,8.5,84%,1,0,0,0,0,"Anthony Russo,Joe Russo","Action,Adventure,Sci-Fi",United States,English,149.0


In [7]:
df.shape

(16744, 17)

In [8]:
df.drop(['Unnamed: 0', 'ID', 'Type'], axis = 1, inplace = True)
df.head(3)

Unnamed: 0,Title,Year,Age,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime
0,Inception,2010,13+,8.8,87%,1,0,0,0,Christopher Nolan,"Action,Adventure,Sci-Fi,Thriller","United States,United Kingdom","English,Japanese,French",148.0
1,The Matrix,1999,18+,8.7,87%,1,0,0,0,"Lana Wachowski,Lilly Wachowski","Action,Sci-Fi",United States,English,136.0
2,Avengers: Infinity War,2018,13+,8.5,84%,1,0,0,0,"Anthony Russo,Joe Russo","Action,Adventure,Sci-Fi",United States,English,149.0


In [9]:
df.shape

(16744, 14)
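
The drop call above modifies df in place. A minimal sketch of the non-mutating alternative, on a made-up two-row frame (`toy` and `trimmed` are hypothetical names), might look like this:

```python
import pandas as pd

# Toy frame with made-up values; ID and Type stand in for unwanted columns.
toy = pd.DataFrame({"ID": [1, 2], "Title": ["A", "B"], "Type": [0, 0]})

# columns=[...] is equivalent to passing the labels with axis=1.
# Without inplace=True, drop returns a new frame and leaves toy untouched.
trimmed = toy.drop(columns=["ID", "Type"])

print(toy.shape)      # (2, 3)
print(trimmed.shape)  # (2, 1)
```

Assigning the result to a new name keeps the original available if you later find you dropped a column you needed.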

In [10]:
# The following line drops every row that has at least one NaN value. Out of the 16744 total rows in the dataset,
# only 3301 rows have no missing values, so 13443 (16744 - 3301) rows, around 80% of the dataset, are dropped by this call!
# This is NOT a good example of data cleaning. (Without inplace = True, df itself is left unchanged.)
df.dropna().head(3)

Unnamed: 0,Title,Year,Age,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime
0,Inception,2010,13+,8.8,87%,1,0,0,0,Christopher Nolan,"Action,Adventure,Sci-Fi,Thriller","United States,United Kingdom","English,Japanese,French",148.0
1,The Matrix,1999,18+,8.7,87%,1,0,0,0,"Lana Wachowski,Lilly Wachowski","Action,Sci-Fi",United States,English,136.0
2,Avengers: Infinity War,2018,13+,8.5,84%,1,0,0,0,"Anthony Russo,Joe Russo","Action,Adventure,Sci-Fi",United States,English,149.0


In [11]:
df.dropna().shape

(3301, 14)
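
Before dropping rows, it can help to see where the missing values actually are. The sketch below, on a made-up four-row frame (`toy` is a hypothetical name), counts NaN values per column and compares that with how many rows a blanket dropna would keep.

```python
import numpy as np
import pandas as pd

# Toy frame with made-up values and a few deliberately missing entries.
toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "IMDb": [7.1, np.nan, 6.4, 8.0],
    "Rotten Tomatoes": ["87%", np.nan, np.nan, "90%"],
})

# isna() marks each missing cell; summing the booleans counts them per column.
missing_per_column = toy.isna().sum()

# dropna() with no arguments keeps only rows with no missing value at all.
rows_kept = len(toy.dropna())

print(missing_per_column)
print(rows_kept)  # 2
```

If one column accounts for most of the missing values, dropping on that column alone (or not at all) usually loses far less data.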

In [12]:
# The default axis for dropna is 0, which stands for rows (indices). The how parameter dictates whether a row/column
# is dropped if ALL of its values are NaN or if ANY of its values are NaN. The default is 'any'. This line drops any rows (indices)
# where ALL the values are NaN. As we can see, the dimensions are the same as they were before we dropped any NaN,
# so we can conclude that there are no garbage rows with fully NaN values.

df.dropna(how = 'all', inplace = True)
df.shape

(16744, 14)

In [13]:
df.dropna(subset = ['IMDb']).shape

(16173, 14)

In [14]:
df.dropna(subset = ['IMDb', 'Rotten Tomatoes']).shape

(5156, 14)
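
The how and subset options used in the cells above can be compared side by side on a tiny frame. This is only a sketch with made-up values; `toy` and the three result names are hypothetical.

```python
import numpy as np
import pandas as pd

# Row 0 is complete, row 1 lacks a rating, row 2 is entirely empty.
toy = pd.DataFrame({
    "Title": ["A", "B", np.nan],
    "IMDb": [7.1, np.nan, np.nan],
})

any_missing = toy.dropna()                  # default how='any': keeps row 0 only
all_missing = toy.dropna(how="all")         # drops only the fully empty row 2
title_only = toy.dropna(subset=["Title"])   # checks the Title column alone

print(len(any_missing), len(all_missing), len(title_only))  # 1 2 2
```

The subset form is usually the most targeted: it removes a row only when a column you actually need is missing.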

In [15]:
m1 = df.rename(columns = {'Age':'Age Rating', 'Runtime':'Runtime (min)'})
m1.head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
0,Inception,2010,13+,8.8,87%,1,0,0,0,Christopher Nolan,"Action,Adventure,Sci-Fi,Thriller","United States,United Kingdom","English,Japanese,French",148.0
1,The Matrix,1999,18+,8.7,87%,1,0,0,0,"Lana Wachowski,Lilly Wachowski","Action,Sci-Fi",United States,English,136.0
2,Avengers: Infinity War,2018,13+,8.5,84%,1,0,0,0,"Anthony Russo,Joe Russo","Action,Adventure,Sci-Fi",United States,English,149.0


In [16]:
df.columns

Index(['Title', 'Year', 'Age', 'IMDb', 'Rotten Tomatoes', 'Netflix', 'Hulu',
       'Prime Video', 'Disney+', 'Directors', 'Genres', 'Country', 'Language',
       'Runtime'],
      dtype='object')

In [17]:
df.columns = ['Title', 'Year', 'Age Rating', 'IMDb', 'Rotten Tomatoes', 'Netflix', 'Hulu',
       'Prime Video', 'Disney+', 'Directors', 'Genres', 'Country', 'Language',
       'Runtime (min)']
df.head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
0,Inception,2010,13+,8.8,87%,1,0,0,0,Christopher Nolan,"Action,Adventure,Sci-Fi,Thriller","United States,United Kingdom","English,Japanese,French",148.0
1,The Matrix,1999,18+,8.7,87%,1,0,0,0,"Lana Wachowski,Lilly Wachowski","Action,Sci-Fi",United States,English,136.0
2,Avengers: Infinity War,2018,13+,8.5,84%,1,0,0,0,"Anthony Russo,Joe Russo","Action,Adventure,Sci-Fi",United States,English,149.0


In [18]:
# An example of how you can rename all the columns at once by calling a function on them

In [19]:
df.rename(str.lower, axis = 1)

Unnamed: 0,title,year,age rating,imdb,rotten tomatoes,netflix,hulu,prime video,disney+,directors,genres,country,language,runtime (min)
0,Inception,2010,13+,8.8,87%,1,0,0,0,Christopher Nolan,"Action,Adventure,Sci-Fi,Thriller","United States,United Kingdom","English,Japanese,French",148.0
1,The Matrix,1999,18+,8.7,87%,1,0,0,0,"Lana Wachowski,Lilly Wachowski","Action,Sci-Fi",United States,English,136.0
2,Avengers: Infinity War,2018,13+,8.5,84%,1,0,0,0,"Anthony Russo,Joe Russo","Action,Adventure,Sci-Fi",United States,English,149.0
3,Back to the Future,1985,7+,8.5,96%,1,0,0,0,Robert Zemeckis,"Adventure,Comedy,Sci-Fi",United States,English,116.0
4,"The Good, the Bad and the Ugly",1966,18+,8.8,97%,1,0,1,0,Sergio Leone,Western,"Italy,Spain,West Germany",Italian,161.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
16739,The Ghosts of Buxley Hall,1980,,6.2,,0,0,0,1,Bruce Bilson,"Comedy,Family,Fantasy,Horror",United States,English,120.0
16740,The Poof Point,2001,7+,4.7,,0,0,0,1,Neal Israel,"Comedy,Family,Sci-Fi",United States,English,90.0
16741,Sharks of Lost Island,2013,,5.7,,0,0,0,1,Neil Gelinas,Documentary,United States,English,
16742,Man Among Cheetahs,2017,,6.6,,0,0,0,1,Richard Slater-Jones,Documentary,United States,English,
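
The two renaming styles seen so far, a mapping for selected columns and a function for every column, can be sketched on a one-row frame. The values and the names `toy`, `by_mapping`, and `by_function` are made up for illustration.

```python
import pandas as pd

toy = pd.DataFrame({"Age": ["13+"], "Runtime": [148.0]})

# A dict renames only the columns it mentions.
by_mapping = toy.rename(columns={"Age": "Age Rating"})

# A callable is applied to every column label.
by_function = toy.rename(columns=str.lower)

print(list(by_mapping.columns))   # ['Age Rating', 'Runtime']
print(list(by_function.columns))  # ['age', 'runtime']
print(list(toy.columns))          # ['Age', 'Runtime']
```

Note that rename returns a new DataFrame, so the original labels stay as they were unless you assign the result back.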


In [20]:
df['Rotten Tomatoes'] = df['Rotten Tomatoes'].str.strip('%')
df['Rotten Tomatoes'] = pd.to_numeric(df['Rotten Tomatoes'])

In [21]:
df.head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
0,Inception,2010,13+,8.8,87.0,1,0,0,0,Christopher Nolan,"Action,Adventure,Sci-Fi,Thriller","United States,United Kingdom","English,Japanese,French",148.0
1,The Matrix,1999,18+,8.7,87.0,1,0,0,0,"Lana Wachowski,Lilly Wachowski","Action,Sci-Fi",United States,English,136.0
2,Avengers: Infinity War,2018,13+,8.5,84.0,1,0,0,0,"Anthony Russo,Joe Russo","Action,Adventure,Sci-Fi",United States,English,149.0
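
The percent-string conversion above works because every non-missing value looks like a number once the % sign is removed. As a hedged sketch of what one might do when that is not guaranteed, the example below uses the errors='coerce' option of pd.to_numeric. The series values, including the stray 'n/a', are invented.

```python
import pandas as pd

# Made-up ratings: two valid percentages, one missing value, one junk entry.
raw = pd.Series(["87%", "96%", None, "n/a"])

# rstrip removes the trailing percent sign; errors="coerce" turns anything
# that still cannot be parsed into NaN instead of raising an error.
cleaned = pd.to_numeric(raw.str.rstrip("%"), errors="coerce")

print(cleaned.tolist())
print(cleaned.dtype)  # float64
```

Coercing silently hides bad entries, so it is worth counting the NaN values before and after to see how many were introduced.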


In [22]:
df['Disney+'] == 1

0        False
1        False
2        False
3        False
4        False
         ...  
16739     True
16740     True
16741     True
16742     True
16743     True
Name: Disney+, Length: 16744, dtype: bool

In [23]:
disney = df[df['Disney+'] == 1]
disney.head()

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
95,Saving Mr. Banks,2013,13+,7.5,79.0,1,0,0,1,John Lee Hancock,"Biography,Comedy,Drama","United States,United Kingdom,Australia",English,125.0
103,Amy,2015,18+,7.8,95.0,1,0,1,1,,Drama,United States,English,60.0
122,Bolt,2008,7+,6.8,89.0,1,0,0,1,"Byron Howard,Chris Williams","Animation,Adventure,Comedy,Drama,Family",United States,English,96.0
125,The Princess and the Frog,2009,all,7.1,85.0,1,0,0,1,"Ron Clements,John Musker","Animation,Adventure,Comedy,Family,Fantasy,Musi...",United States,"English,French",97.0
150,Miracle,2004,7+,7.5,81.0,1,0,0,1,Gavin O'Connor,"Biography,Drama,History,Sport","Canada,United States",English,135.0


In [24]:
disney.shape

(564, 14)

In [25]:
director = df['Directors'].str.contains('Steven Spielberg')
family_genre = df['Genres'].str.contains('Family')
action_genre = df['Genres'].str.contains('Action')
df[director & (family_genre | action_genre)]

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
8,Raiders of the Lost Ark,1981,7+,8.4,95.0,1,0,0,0,Steven Spielberg,"Action,Adventure",United States,"English,German,Hebrew,Spanish,Arabic,Nepali",115.0
16,Indiana Jones and the Last Crusade,1989,13+,8.2,88.0,1,0,0,0,Steven Spielberg,"Action,Adventure",United States,"English,German,Greek,Arabic",127.0
36,Minority Report,2002,13+,7.6,90.0,1,0,0,0,Steven Spielberg,"Action,Crime,Mystery,Sci-Fi,Thriller",United States,"English,Swedish",145.0
44,Indiana Jones and the Temple of Doom,1984,7+,7.6,85.0,1,0,0,0,Steven Spielberg,"Action,Adventure",United States,"English,Sinhalese,Hindi",118.0
121,The Adventures of Tintin,2011,7+,7.3,74.0,1,0,0,0,Steven Spielberg,"Animation,Action,Adventure,Family,Mystery","United States,New Zealand,United Kingdom",English,107.0
186,War Horse,2011,13+,7.2,74.0,1,0,0,0,Steven Spielberg,"Action,Adventure,Drama,History,War","United States,India","English,German",146.0
219,Indiana Jones and the Kingdom of the Crystal S...,2008,13+,6.1,78.0,1,0,0,0,Steven Spielberg,"Action,Adventure",United States,"English,German,Russian",122.0
16331,The BFG,2016,7+,6.4,75.0,0,0,0,1,Steven Spielberg,"Adventure,Family,Fantasy","United States,India,United Kingdom",English,117.0


In [26]:
df[(df['Directors'].str.contains('Steven Spielberg')) &
   ((df['Genres'].str.contains('Family')) | (df['Genres'].str.contains('Action')))]

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
8,Raiders of the Lost Ark,1981,7+,8.4,95.0,1,0,0,0,Steven Spielberg,"Action,Adventure",United States,"English,German,Hebrew,Spanish,Arabic,Nepali",115.0
16,Indiana Jones and the Last Crusade,1989,13+,8.2,88.0,1,0,0,0,Steven Spielberg,"Action,Adventure",United States,"English,German,Greek,Arabic",127.0
36,Minority Report,2002,13+,7.6,90.0,1,0,0,0,Steven Spielberg,"Action,Crime,Mystery,Sci-Fi,Thriller",United States,"English,Swedish",145.0
44,Indiana Jones and the Temple of Doom,1984,7+,7.6,85.0,1,0,0,0,Steven Spielberg,"Action,Adventure",United States,"English,Sinhalese,Hindi",118.0
121,The Adventures of Tintin,2011,7+,7.3,74.0,1,0,0,0,Steven Spielberg,"Animation,Action,Adventure,Family,Mystery","United States,New Zealand,United Kingdom",English,107.0
186,War Horse,2011,13+,7.2,74.0,1,0,0,0,Steven Spielberg,"Action,Adventure,Drama,History,War","United States,India","English,German",146.0
219,Indiana Jones and the Kingdom of the Crystal S...,2008,13+,6.1,78.0,1,0,0,0,Steven Spielberg,"Action,Adventure",United States,"English,German,Russian",122.0
16331,The BFG,2016,7+,6.4,75.0,0,0,0,1,Steven Spielberg,"Adventure,Family,Fantasy","United States,India,United Kingdom",English,117.0
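
Some rows in this dataset have no director listed, and str.contains returns a missing value rather than False for those rows. A minimal sketch of making the mask explicitly boolean with the na parameter, on a made-up three-row frame (`toy`, `director`, `genre`, and `result` are hypothetical names):

```python
import pandas as pd

toy = pd.DataFrame({
    "Title": ["A", "B", "C"],
    "Directors": ["Steven Spielberg", None, "Sergio Leone"],
    "Genres": ["Action,Adventure", "Drama", "Western"],
})

# na=False treats a missing director as "does not match".
director = toy["Directors"].str.contains("Steven Spielberg", na=False)

# | is element-wise OR, & is element-wise AND.
genre = toy["Genres"].str.contains("Action") | toy["Genres"].str.contains("Family")

result = toy[director & genre]
print(director.tolist())          # [True, False, False]
print(result["Title"].tolist())   # ['A']
```

Building each condition as a named mask, as the notebook does, also makes it easy to inspect one condition at a time when a filter returns fewer rows than expected.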


In [27]:
babysitting = df[((df['Netflix'] == 1) | (df['Prime Video'] == 1)) & 
   (df['IMDb'] >= 8) & (df['Year'] >= 2000) & (df['Age Rating'] == 'all') & 
    (df['Runtime (min)'] <= 90)]
babysitting.head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
536,I Am Kalam,2010,all,8.0,80.0,1,0,1,0,Nila Madhab Panda,"Comedy,Drama,Family",India,Hindi,88.0
6987,If so,2003,all,8.1,,0,0,1,0,Rajiv Whabi,Mystery,United Arab Emirates,English,80.0
8543,Fetish,2010,all,8.5,,0,0,1,0,Soopum Sohn,Thriller,United States,"English,Korean",87.0


In [28]:
platform = (df['Netflix'] == 1) | (df['Prime Video'] == 1)
rating = df['IMDb'] >= 8
year = df['Year'] >= 2000
age = df['Age Rating'] == 'all'
time = df['Runtime (min)'] <= 90
babysitting = df[platform & rating & year & age & time]
babysitting.head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
536,I Am Kalam,2010,all,8.0,80.0,1,0,1,0,Nila Madhab Panda,"Comedy,Drama,Family",India,Hindi,88.0
6987,If so,2003,all,8.1,,0,0,1,0,Rajiv Whabi,Mystery,United Arab Emirates,English,80.0
8543,Fetish,2010,all,8.5,,0,0,1,0,Soopum Sohn,Thriller,United States,"English,Korean",87.0


In [29]:
babysitting.shape[0]

7

In [30]:
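# regex = False matches '16+' and '18+' literally; as a regex, '+' would mean "one or more of the previous character"
social = df[(df['Netflix'] == 1) & ((df['Age Rating'].str.contains('16+', regex = False)) | (df['Age Rating'].str.contains('18+', regex = False)))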
    & (df['Genres'].str.contains('Comedy')) & ((df['Year'] >= 2000) & (df['Year'] <= 2016))
    & (df['IMDb'] > 5.5)]
social.head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
26,Silver Linings Playbook,2012,18+,7.7,92.0,1,0,0,0,David O. Russell,"Comedy,Drama,Romance",United States,English,122.0
73,About Time,2013,18+,7.8,68.0,1,0,0,0,Richard Curtis,"Comedy,Drama,Fantasy,Romance,Sci-Fi",United Kingdom,English,123.0
74,Kung Fu Hustle,2004,18+,7.7,90.0,1,0,0,0,Stephen Chow,"Action,Comedy,Fantasy","Hong Kong,China,United States","Cantonese,Mandarin",99.0


In [31]:
nfx_pty = df['Netflix'] == 1
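# regex = False matches '16+' and '18+' literally instead of treating '+' as a regex quantifier
ages = (df['Age Rating'].str.contains('16+', regex = False)) | (df['Age Rating'].str.contains('18+', regex = False))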
genre = df['Genres'].str.contains('Comedy')
year_range = (df['Year'] >= 2000) & (df['Year'] <= 2016)
rating = df['IMDb'] > 5.5
social = df[nfx_pty & ages & genre & year_range & rating]
social.head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
26,Silver Linings Playbook,2012,18+,7.7,92.0,1,0,0,0,David O. Russell,"Comedy,Drama,Romance",United States,English,122.0
73,About Time,2013,18+,7.8,68.0,1,0,0,0,Richard Curtis,"Comedy,Drama,Fantasy,Romance,Sci-Fi",United Kingdom,English,123.0
74,Kung Fu Hustle,2004,18+,7.7,90.0,1,0,0,0,Stephen Chow,"Action,Comedy,Fantasy","Hong Kong,China,United States","Cantonese,Mandarin",99.0


In [32]:
social.shape[0]

108
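
One caution about the Age Rating filter: str.contains treats its pattern as a regular expression unless regex=False is passed, and in a regex the + sign means "one or more of the previous character". The sketch below shows the difference on a made-up series. The value '166' is invented purely to expose the pitfall and does not come from the dataset.

```python
import pandas as pd

ratings = pd.Series(["16+", "18+", "166", "7+", "all"])

# As a regex, "16+" means "1 followed by one or more 6s", so "166" matches too.
as_regex = ratings.str.contains("16+")

# regex=False looks for the literal three characters "16+".
as_literal = ratings.str.contains("16+", regex=False)

# isin tests for exact equality with any of the listed values.
exact = ratings.isin(["16+", "18+"])

print(as_regex.tolist())    # [True, False, True, False, False]
print(as_literal.tolist())  # [True, False, False, False, False]
print(exact.tolist())       # [True, True, False, False, False]
```

When a column holds a small set of fixed labels, as Age Rating does, isin is often the clearest way to say what you mean.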

In [33]:
df.sort_values('Rotten Tomatoes', ascending = True).head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
8133,Kickin' It Old Skool,2007,13+,4.6,2.0,0,0,1,0,Harvey Glazer,Comedy,"United States,Canada",English,108.0
6938,Strange Wilderness,2008,18+,5.3,2.0,0,0,1,0,Fred Wolf,"Adventure,Comedy",United States,English,87.0
4208,Getaway,2013,13+,4.4,2.0,0,1,0,0,Sam Peckinpah,"Action,Crime,Thriller",United States,"English,Spanish",123.0


In [34]:
df.sort_values(by = "Rotten Tomatoes", ascending = False).head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
481,Dance Academy: The Movie,2017,,7.0,100.0,1,0,0,0,Jeffrey Walker,Drama,"Germany,Australia",English,101.0
828,National Bird,2016,,7.1,100.0,1,0,0,0,Sonia Kennebeck,Documentary,United States,"English,Dari",92.0
6507,The Shelter,2015,,3.6,100.0,0,0,1,0,María Lidón,"Drama,Sci-Fi",Spain,English,95.0


In [35]:
df.sort_values(by = ["Rotten Tomatoes", 'IMDb'], ascending = False).head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
4473,Stop Making Sense,1984,,8.6,100.0,0,0,1,0,Jonathan Demme,"Documentary,Music",United States,English,88.0
4662,Tom Petty and the Heartbreakers: Runnin' Down ...,2007,,8.6,100.0,0,0,1,0,Peter Bogdanovich,"Documentary,Music",United States,English,239.0
4591,Mahanati,2018,7+,8.5,100.0,0,0,1,0,Nag Ashwin,"Biography,Drama",India,"Telugu,Tamil",177.0


In [36]:
df.sort_values(by = ['IMDb', 'Rotten Tomatoes'], ascending = False).head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
1292,My Next Guest with David Letterman and Shah Ru...,2019,,9.3,,1,0,0,0,,Talk-Show,,,61.0
5110,Love on a Leash,2011,,9.3,,0,0,1,0,Fen Tian,"Comedy,Drama,Fantasy,Romance",United States,,90.0
6566,Square One,2019,,9.3,,0,0,1,0,Danny Wu,"Documentary,Drama,Music",United States,English,83.0


In [37]:
df.sort_values(by = ['Rotten Tomatoes', 'Year'], ascending = [False, True]).head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
4467,A Trip to the Moon,1902,all,8.2,100.0,0,0,1,0,Georges Méliès,"Short,Action,Adventure,Comedy,Fantasy,Sci-Fi",France,"None,French",13.0
4470,The Cabinet of Dr. Caligari,1920,7+,8.1,100.0,0,0,1,0,Robert Wiene,"Fantasy,Horror,Mystery,Thriller",Germany,German,76.0
4844,The Golem: How He Came into the World,1920,,7.2,100.0,0,0,1,0,"Carl Boese,Paul Wegener","Fantasy,Horror",Germany,,76.0
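
The multi-key sorts above can be traced by hand on a small frame. This sketch uses made-up values (`toy` and `ordered` are hypothetical names) and the same per-column ascending list as the previous cell. It also shows that rows with a missing sort key end up last by default.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Rotten Tomatoes": [100.0, 100.0, np.nan, 90.0],
    "Year": [2010, 1984, 2001, 1999],
})

# Sort by score (highest first), breaking ties by year (oldest first).
ordered = toy.sort_values(by=["Rotten Tomatoes", "Year"], ascending=[False, True])

print(ordered["Title"].tolist())  # ['B', 'A', 'D', 'C']
```

The first key decides the order, and later keys only matter among rows that tie on the earlier ones, which is why swapping the order of the by list changes the result.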


In [38]:
social_sorted = social.sort_values(by = ['Rotten Tomatoes', 'IMDb'], ascending = False)
social_sorted.head(10)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
141,Bill Burr: I'm Sorry You Feel That Way,2014,18+,8.4,100.0,1,0,0,0,Jay Karas,Comedy,United States,English,80.0
1014,Jim Gaffigan: Obsessed,2014,16+,7.6,100.0,1,0,0,0,Jay Chapman,"Documentary,Comedy",United States,English,60.0
521,Melvin Goes to Dinner,2003,18+,6.8,100.0,1,0,0,0,Bob Odenkirk,"Comedy,Drama,Romance",United States,English,83.0
884,Hannibal Buress: Comedy Camisado,2016,18+,6.6,100.0,1,0,0,0,Lance Bangs,Comedy,United States,English,83.0
473,Aśoka,2001,18+,6.5,100.0,1,0,0,0,,Comedy,United States,English,30.0
332,Sour Grapes,2016,16+,7.3,96.0,1,0,0,0,Larry David,Comedy,United States,English,91.0
79,The Edge of Seventeen,2016,18+,7.3,94.0,1,0,0,0,Kelly Fremon Craig,"Comedy,Drama","United States,China",English,104.0
244,The Death of Mr. Lazarescu,2005,18+,7.9,93.0,1,0,0,0,Cristi Puiu,"Comedy,Drama",Romania,Romanian,153.0
26,Silver Linings Playbook,2012,18+,7.7,92.0,1,0,0,0,David O. Russell,"Comedy,Drama,Romance",United States,English,122.0
134,Frances Ha,2013,18+,7.5,92.0,1,0,0,0,Noah Baumbach,"Comedy,Drama,Romance",United States,"French,English",86.0


In [39]:
social_sorted.loc[0:9, 'Title']

KeyError: 0

In [40]:
social_sorted.reset_index(drop = True, inplace = True)
social_sorted.head(3)

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
0,Bill Burr: I'm Sorry You Feel That Way,2014,18+,8.4,100.0,1,0,0,0,Jay Karas,Comedy,United States,English,80.0
1,Jim Gaffigan: Obsessed,2014,16+,7.6,100.0,1,0,0,0,Jay Chapman,"Documentary,Comedy",United States,English,60.0
2,Melvin Goes to Dinner,2003,18+,6.8,100.0,1,0,0,0,Bob Odenkirk,"Comedy,Drama,Romance",United States,English,83.0


In [41]:
social_sorted.loc[0:9, 'Title']

0    Bill Burr: I'm Sorry You Feel That Way
1                    Jim Gaffigan: Obsessed
2                     Melvin Goes to Dinner
3          Hannibal Buress: Comedy Camisado
4                                     Aśoka
5                               Sour Grapes
6                     The Edge of Seventeen
7                The Death of Mr. Lazarescu
8                   Silver Linings Playbook
9                                Frances Ha
Name: Title, dtype: object

In [42]:
social_sorted.iloc[0:10, 0]

0    Bill Burr: I'm Sorry You Feel That Way
1                    Jim Gaffigan: Obsessed
2                     Melvin Goes to Dinner
3          Hannibal Buress: Comedy Camisado
4                                     Aśoka
5                               Sour Grapes
6                     The Edge of Seventeen
7                The Death of Mr. Lazarescu
8                   Silver Linings Playbook
9                                Frances Ha
Name: Title, dtype: object
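
The KeyError seen earlier and the two working selections come down to the difference between labels and positions. Here is a minimal sketch on a made-up frame whose index, like that of a sorted subset, is out of order and does not contain 0. All names are hypothetical.

```python
import pandas as pd

# Index labels left over from an earlier sort; none of them is 0.
toy = pd.DataFrame({"Title": ["A", "B", "C", "D"]}, index=[141, 1014, 521, 884])

# iloc works by position, so it ignores the labels entirely.
# Position slices exclude the end point, like ordinary Python slices.
by_position = toy.iloc[0:2, 0].tolist()

# loc works by label; slicing from a label that is absent from an
# unsorted index raises KeyError.
try:
    toy.loc[0:2, "Title"]
    label_lookup_failed = False
except KeyError:
    label_lookup_failed = True

# After reset_index the labels are 0..n-1, and label slices include the end point.
reset = toy.reset_index(drop=True)
by_label = reset.loc[0:2, "Title"].tolist()

print(by_position)          # ['A', 'B']
print(label_lookup_failed)  # True
print(by_label)             # ['A', 'B', 'C']
```

This is why loc[0:9] and iloc[0:10] both return ten rows in the cells above: the label slice includes its end point and the position slice does not.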

In [43]:
social_sorted.to_csv('Club_Movie_Social.csv')

In [44]:
social_sorted.to_csv('Club_Movie_Social_wo_Index.csv', index = False)

In [45]:
df1 = pd.read_csv("Club_Movie_Social.csv")
df1.head()

Unnamed: 0.1,Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
0,0,Bill Burr: I'm Sorry You Feel That Way,2014,18+,8.4,100.0,1,0,0,0,Jay Karas,Comedy,United States,English,80.0
1,1,Jim Gaffigan: Obsessed,2014,16+,7.6,100.0,1,0,0,0,Jay Chapman,"Documentary,Comedy",United States,English,60.0
2,2,Melvin Goes to Dinner,2003,18+,6.8,100.0,1,0,0,0,Bob Odenkirk,"Comedy,Drama,Romance",United States,English,83.0
3,3,Hannibal Buress: Comedy Camisado,2016,18+,6.6,100.0,1,0,0,0,Lance Bangs,Comedy,United States,English,83.0
4,4,Aśoka,2001,18+,6.5,100.0,1,0,0,0,,Comedy,United States,English,30.0


In [46]:
df2 = pd.read_csv('Club_Movie_Social_wo_Index.csv')
df2.head()

Unnamed: 0,Title,Year,Age Rating,IMDb,Rotten Tomatoes,Netflix,Hulu,Prime Video,Disney+,Directors,Genres,Country,Language,Runtime (min)
0,Bill Burr: I'm Sorry You Feel That Way,2014,18+,8.4,100.0,1,0,0,0,Jay Karas,Comedy,United States,English,80.0
1,Jim Gaffigan: Obsessed,2014,16+,7.6,100.0,1,0,0,0,Jay Chapman,"Documentary,Comedy",United States,English,60.0
2,Melvin Goes to Dinner,2003,18+,6.8,100.0,1,0,0,0,Bob Odenkirk,"Comedy,Drama,Romance",United States,English,83.0
3,Hannibal Buress: Comedy Camisado,2016,18+,6.6,100.0,1,0,0,0,Lance Bangs,Comedy,United States,English,83.0
4,Aśoka,2001,18+,6.5,100.0,1,0,0,0,,Comedy,United States,English,30.0
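
The extra 'Unnamed: 0' column in df1 comes from the index that to_csv writes by default. As a sketch of the round trip without touching the disk, the example below writes a made-up frame to CSV text and reads it back three ways. The names are hypothetical.

```python
import io
import pandas as pd

toy = pd.DataFrame({"Title": ["A", "B"], "IMDb": [8.4, 7.6]})

with_index = toy.to_csv()                # index written as an unnamed first column
without_index = toy.to_csv(index=False)  # index omitted

# Reading the first form back turns the saved index into an ordinary column...
reread_default = pd.read_csv(io.StringIO(with_index))

# ...unless index_col=0 tells read_csv to use that column as the index again.
reread_indexed = pd.read_csv(io.StringIO(with_index), index_col=0)

reread_clean = pd.read_csv(io.StringIO(without_index))

print(list(reread_default.columns))  # ['Unnamed: 0', 'Title', 'IMDb']
print(list(reread_indexed.columns))  # ['Title', 'IMDb']
print(list(reread_clean.columns))    # ['Title', 'IMDb']
```

Either index = False when writing or index_col = 0 when reading avoids the stray column. Which one fits depends on whether the index carries information worth keeping.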
