### About Netflix
Netflix is one of the most popular media and video streaming platforms. It has over 10,000 movies and TV shows available on its platform and, as of mid-2021, over 222M subscribers globally. This tabular dataset lists all the movies and TV shows available on Netflix, along with details such as cast, directors, ratings, release year, and duration.

### Business Problem

Analyze the data and generate insights that could help Netflix in deciding which type of shows/movies to produce and how they can grow the business in different countries.

In [92]:
import pandas as pd

In [236]:
df = pd.read_csv("netflix.csv")

### About Dataset
1. `show_id`: Unique ID for every movie / TV show
1. `type`: Identifier, either Movie or TV Show
1. `title`: Title of the movie / TV show
1. `director`: Director of the movie/show
1. `cast`: Actors involved in the movie/show
1. `country`: Country where the movie/show was produced
1. `date_added`: Date it was added on Netflix
1. `release_year`: Actual release year of the movie/show
1. `rating`: TV rating of the movie/show
1. `duration`: Total duration, in minutes (movies) or number of seasons (TV shows)
1. `listed_in`: Genre
1. `description`: The summary description
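
Because `duration` mixes minutes and season counts in one text column, it usually has to be split before any numeric analysis. The following is a minimal sketch, assuming the values always look like `"90 min"`, `"1 Season"`, or `"2 Seasons"` as in the rows shown in this notebook; the names `durations` and `parts` are illustrative only.

```python
import pandas as pd

# A few duration strings in the same form as the dataset's `duration` column.
durations = pd.Series(["90 min", "2 Seasons", "1 Season", "125 min"], name="duration")

# Split each string into a numeric value and a unit ("min" or "Season").
# The trailing "s?" absorbs the plural form "Seasons".
parts = durations.str.extract(r"^(?P<value>\d+)\s+(?P<unit>min|Season)s?$")
parts["value"] = parts["value"].astype(int)

print(parts)
```

With the unit in its own column, movie runtimes and season counts can be summarised separately instead of being compared as strings.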

In [94]:
df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...


### 1. Defining Problem Statement and Analysing basic metrics (10 Points)
> `Problem statement`: How has the number of movies released per year changed over the last 20-30 years?
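
One way to approach this question is to count distinct movie titles per `release_year`. The sketch below uses a tiny hand-built frame (`sample`, a stand-in for the data loaded from `netflix.csv`) with a handful of titles taken from the previews in this notebook; the 30-year window is an assumption based on the wording of the problem statement.

```python
import pandas as pd

# Tiny stand-in for the Netflix listing; the real notebook loads netflix.csv.
sample = pd.DataFrame({
    "title": ["Dick Johnson Is Dead", "Blood & Water", "Sankofa", "The Starling", "Jeans"],
    "type": ["Movie", "TV Show", "Movie", "Movie", "Movie"],
    "release_year": [2020, 2021, 1993, 2021, 1998],
})

# Keep movies only, then count distinct titles per release year, oldest first.
movies = sample[sample["type"] == "Movie"]
movies_per_year = movies.groupby("release_year")["title"].nunique().sort_index()

# Restrict to roughly the last 30 years, measured from the newest release year.
last_year = movies_per_year.index.max()
recent = movies_per_year[movies_per_year.index >= last_year - 30]

print(recent)
```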


### 2. Observations on the shape of data, data types of all the attributes, conversion of categorical attributes to 'category' (If required), missing value detection, statistical summary (10 Points)

In [95]:
print("Shape of data:", df.shape)

Shape of data: (8807, 12)


In [96]:
# data type of all the attributes
df.dtypes

show_id         object
type            object
title           object
director        object
cast            object
country         object
date_added      object
release_year     int64
rating          object
duration        object
listed_in       object
description     object
dtype: object
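
The section heading asks for conversion of categorical attributes to `category` where required, and the output above shows every text column stored as `object`. A minimal sketch of that conversion follows, assuming `type` and `rating` are the low-cardinality columns worth converting; the frame `sample` and the list `low_cardinality` are illustrative names, not part of the original notebook.

```python
import pandas as pd

# Small stand-in frame with two low-cardinality text columns and one integer column.
sample = pd.DataFrame({
    "type": ["Movie", "TV Show", "TV Show", "Movie"],
    "rating": ["PG-13", "TV-MA", "TV-MA", "TV-14"],
    "release_year": [2020, 2021, 2021, 1993],
})

# Convert only the chosen columns; all other columns keep their dtype.
low_cardinality = ["type", "rating"]
converted = sample.astype({col: "category" for col in low_cardinality})

print(converted.dtypes)
```

Columns such as `title` or `description` are nearly unique per row, so converting them would bring little benefit.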

In [97]:
# count missing values per column before handling them
print("Missing Value in all Columns:")
df.isnull().sum()

Missing Value in all Columns:


show_id            0
type               0
title              0
director        2634
cast             825
country          831
date_added        10
release_year       0
rating             4
duration           3
listed_in          0
description        0
dtype: int64

### Handle Missing Values

In [98]:
df = df.dropna().reset_index(drop=True)

In [99]:
df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
1,s9,TV Show,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,"September 24, 2021",2021,TV-14,9 Seasons,"British TV Shows, Reality TV",A talented batch of amateur bakers face off in...
2,s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...
3,s13,Movie,Je Suis Karl,Christian Schwochow,"Luna Wedler, Jannis Niewöhner, Milan Peschel, ...","Germany, Czech Republic","September 23, 2021",2021,TV-MA,127 min,"Dramas, International Movies",After most of her family is murdered in a terr...
4,s25,Movie,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,"September 21, 2021",1998,TV-14,166 min,"Comedies, International Movies, Romantic Movies",When the father of the man she loves insists t...


In [100]:
# count missing values per column after handling them
print("Missing Value in all Columns:")
df.isnull().sum()

Missing Value in all Columns:


show_id         0
type            0
title           0
director        0
cast            0
country         0
date_added      0
release_year    0
rating          0
duration        0
listed_in       0
description     0
dtype: int64

In [101]:
# shape after handling missing value
df.shape

(5332, 12)
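
Dropping every row that has any missing value shrinks the data from 8807 to 5332 rows, largely because `director` is missing in 2634 rows. A gentler alternative, sketched below, is to fill the descriptive text columns with a placeholder and drop rows only when a key field is missing. This is an assumption about what might suit the analysis, not the approach used in this notebook: the placeholder `"Unknown"`, the row named `"Hypothetical Title"`, and the shortened cast lists are made up for illustration.

```python
import pandas as pd

# Stand-in rows; the last row is invented to show a missing rating.
sample = pd.DataFrame({
    "title": ["Dick Johnson Is Dead", "Blood & Water", "Ganglands", "Hypothetical Title"],
    "director": ["Kirsten Johnson", None, "Julien Leclercq", None],
    "cast": [None, "Ama Qamata, Khosi Ngema", "Sami Bouajila, Tracy Gotoas", None],
    "country": ["United States", "South Africa", None, None],
    "rating": ["PG-13", "TV-MA", "TV-MA", None],
})

# Fill the descriptive columns with a placeholder instead of dropping the row.
text_cols = ["director", "cast", "country"]
filled = sample.copy()
filled[text_cols] = filled[text_cols].fillna("Unknown")

# Drop rows only when a field needed for the analysis is missing.
cleaned = filled.dropna(subset=["rating"]).reset_index(drop=True)

print(cleaned)
```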

In [102]:
#Statistical Summary of object features
df.describe(include='object')

Unnamed: 0,show_id,type,title,director,cast,country,date_added,rating,duration,listed_in,description
count,5332,5332,5332,5332,5332,5332,5332,5332,5332,5332,5332
unique,5332,2,5332,3945,5200,604,1453,14,198,335,5321
top,s8,Movie,Sankofa,"Raúl Campos, Jan Suter",Samuel West,United States,"January 1, 2020",TV-MA,94 min,"Dramas, International Movies",When pretty new neighbor Seema falls for their...
freq,1,5185,1,18,10,1846,92,1822,135,336,2


In [103]:
#Statistical Summary of integer features
df.describe(include='int')

Unnamed: 0,release_year
count,5332.0
mean,2012.742123
std,9.625831
min,1942.0
25%,2011.0
50%,2016.0
75%,2018.0
max,2021.0


### 3. Non-Graphical Analysis: Value counts and unique attributes (10 Points)

In [104]:
def non_graphica_analysis(column):
    count = df[column].value_counts()
    unique = df[column].unique()
    print("Unique Attrubutes:", unique.tolist())
    print("Value Counts:")
    print(count)
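
For columns with thousands of distinct values, such as `title` or `cast`, printing the full list of unique values produces very long output. A more compact variant might report only the number of unique values and the most frequent ones. The helper `summarize_column` and its `top_n` parameter below are hypothetical names for this sketch.

```python
import pandas as pd


def summarize_column(frame: pd.DataFrame, column: str, top_n: int = 5) -> pd.Series:
    """Print the number of unique values and return the top_n most frequent ones."""
    print(f"{column}: {frame[column].nunique()} unique values")
    return frame[column].value_counts().head(top_n)


# Small stand-in frame to demonstrate the helper.
sample = pd.DataFrame({"rating": ["TV-MA", "TV-14", "TV-MA", "R"]})
top = summarize_column(sample, "rating", top_n=2)
print(top)
```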

In [105]:
df.columns

Index(['show_id', 'type', 'title', 'director', 'cast', 'country', 'date_added',
       'release_year', 'rating', 'duration', 'listed_in', 'description'],
      dtype='object')

In [106]:
print("-"*25+"show_id"+"-"*25)
non_graphica_analysis('show_id')

-------------------------show_id-------------------------
Unique Attrubutes: ['s8', 's9', 's10', 's13', 's25', 's28', 's29', 's30', 's39', 's42', 's43', 's44', 's45', 's47', 's49', 's52', 's53', 's54', 's55', 's57', 's58', 's59', 's60', 's61', 's62', 's63', 's64', 's74', 's82', 's85', 's91', 's95', 's97', 's106', 's108', 's115', 's116', 's117', 's119', 's123', 's127', 's128', 's130', 's132', 's134', 's135', 's136', 's137', 's138', 's139', 's140', 's141', 's142', 's143', 's144', 's145', 's146', 's147', 's150', 's151', 's152', 's153', 's156', 's157', 's158', 's159', 's160', 's162', 's163', 's164', 's165', 's167', 's168', 's169', 's170', 's171', 's172', 's173', 's174', 's175', 's176', 's177', 's178', 's179', 's180', 's183', 's184', 's189', 's191', 's192', 's193', 's196', 's199', 's200', 's201', 's202', 's203', 's204', 's205', 's206', 's207', 's208', 's209', 's210', 's211', 's212', 's216', 's217', 's218', 's228', 's229', 's230', 's232', 's248', 's252', 's254', 's260', 's265', 's268', 's271

In [107]:
print("-"*25+"type"+"-"*25)
non_graphica_analysis('type')

-------------------------type-------------------------
Unique Attrubutes: ['Movie', 'TV Show']
Value Counts:
type
Movie      5185
TV Show     147
Name: count, dtype: int64


In [108]:
print("-"*25+"title"+"-"*25)
non_graphica_analysis('title')

-------------------------title-------------------------
Value Counts:
title
Sankofa                             1
Benji's Very Own Christmas Story    1
Beneath the Leaves                  1
Below Her Mouth                     1
Being AP                            1
                                   ..
Evvarikee Cheppoddu                 1
Defiance                            1
Holiday Rush                        1
The Island                          1
Zubaan                              1
Name: count, Length: 5332, dtype: int64


In [109]:
print("-"*25+"director"+"-"*25)
non_graphica_analysis('director')

-------------------------director-------------------------
Unique Attrubutes: ['Haile Gerima', 'Andy Devonshire', 'Theodore Melfi', 'Christian Schwochow', 'S. Shankar', 'Dennis Dugan', 'Scott Stewart', 'Robert Luketic', 'George Nolfi', 'Steven Spielberg', 'Jeannot Szwarc', 'Joe Alves', 'Joseph Sargent', 'Daniel Espinosa', 'Antoine Fuqua', 'Toshiya Shinohara', 'Masahiko Murata', 'Hajime Kamegaki', 'Hirotsugu Kawasaki', 'Toshiyuki Tsuru', 'Tensai Okamura', 'Kemi Adetiba', 'Cedric Nicolas-Troyan', 'JJC Skillz, Funke Akindele', 'Alice Waddington', 'Raja Gosnell', 'Stephen Kijak', 'Lijo Jose Pellissery', 'David de Vos', 'Rahul Rawail', 'Jane Campion', 'Nagesh Kukunoor', 'Shanker Raman', 'Vidhu Vinod Chopra', 'Mark Rosman', 'Lasse Hallström', 'Ridley Scott', 'Neill Blomkamp', 'Phillip Noyce', 'Renny Harlin', 'Anthony Minghella', 'Simon Wincer', 'Spike Lee', 'Sebastián Schindel', 'Steven C. Miller', 'Richard LaGravenese', 'Martin Campbell', 'Reginald Hudlin', 'George Jackson, Doug McHenry', '

In [110]:
print("-"*25+"cast"+"-"*25)
non_graphica_analysis('cast')

-------------------------cast-------------------------
Unique Attrubutes: ['Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra Duah, Nick Medley, Mutabaruka, Afemo Omilami, Reggie Carter, Mzuri', 'Mel Giedroyc, Sue Perkins, Mary Berry, Paul Hollywood', "Melissa McCarthy, Chris O'Dowd, Kevin Kline, Timothy Olyphant, Daveed Diggs, Skyler Gisondo, Laura Harrier, Rosalind Chao, Kimberly Quinn, Loretta Devine, Ravi Kapoor", 'Luna Wedler, Jannis Niewöhner, Milan Peschel, Edin Hasanović, Anna Fialová, Marlon Boess, Victor Boccard, Fleur Geffrier, Aziz Dyab, Mélanie Fouché, Elizaveta Maximová', 'Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi, Nassar', 'Adam Sandler, Kevin James, Chris Rock, David Spade, Rob Schneider, Salma Hayek, Maria Bello, Maya Rudolph, Colin Quinn, Tim Meadows, Joyce Van Patten', 'Keri Russell, Josh Hamilton, J.K. Simmons, Dakota Goyo, Kadan Rockett, L.J. Benet, Rich Hutchman, Myndy Crist, Annie Thurman, Jake Brennan', 'Liam Hemsworth, Gary Oldman, Amber Heard, Harrison Ford, L

In [111]:
print("-"*25+"country"+"-"*25)
non_graphica_analysis('country')

-------------------------country-------------------------
Unique Attrubutes: ['United States, Ghana, Burkina Faso, United Kingdom, Germany, Ethiopia', 'United Kingdom', 'United States', 'Germany, Czech Republic', 'India', 'United States, India, France', 'China, Canada, United States', 'South Africa, United States, Japan', 'Japan', 'Nigeria', 'Spain, United States', 'United Kingdom, United States', 'United Kingdom, Australia, France', 'United Kingdom, Australia, France, United States', 'United States, Canada', 'Germany, United States', 'South Africa, United States', 'United States, Mexico', 'United States, Italy, France, Japan', 'United States, Italy, Romania, United Kingdom', 'Australia, United States', 'Argentina, Venezuela', 'United States, United Kingdom, Canada', 'China, Hong Kong', 'Canada', 'Hong Kong', 'United States, China, Hong Kong', 'Italy, United States', 'United States, Germany', 'France', 'United Kingdom, Canada, United States', 'United States, United Kingdom', 'India, Ne

In [112]:
print("-"*25+"date_added"+"-"*25)
non_graphica_analysis('date_added')

-------------------------date_added-------------------------
Unique Attrubutes: ['September 24, 2021', 'September 23, 2021', 'September 21, 2021', 'September 20, 2021', 'September 19, 2021', 'September 16, 2021', 'September 15, 2021', 'September 14, 2021', 'September 10, 2021', 'September 9, 2021', 'September 8, 2021', 'September 7, 2021', 'September 5, 2021', 'September 4, 2021', 'September 2, 2021', 'September 1, 2021', 'August 31, 2021', 'August 28, 2021', 'August 27, 2021', 'August 25, 2021', 'August 20, 2021', 'August 19, 2021', 'August 18, 2021', 'August 16, 2021', 'August 15, 2021', 'August 13, 2021', 'August 12, 2021', 'August 11, 2021', 'August 8, 2021', 'August 7, 2021', 'August 6, 2021', 'August 5, 2021', 'August 4, 2021', 'August 3, 2021', 'August 1, 2021', 'July 30, 2021', 'July 29, 2021', 'July 28, 2021', 'July 27, 2021', 'July 24, 2021', 'July 23, 2021', 'July 22, 2021', 'July 21, 2021', 'July 20, 2021', 'July 19, 2021', 'July 17, 2021', 'July 16, 2021', 'July 15, 2021',

In [113]:
print("-"*25+"release_year"+"-"*25)
non_graphica_analysis('release_year')

-------------------------release_year-------------------------
Unique Attrubutes: [1993, 2021, 1998, 2010, 2013, 2017, 1975, 1978, 1983, 1987, 2012, 2001, 2002, 2003, 2004, 2011, 2008, 2009, 2007, 2005, 2006, 2018, 2020, 2019, 1994, 2015, 1982, 1989, 2014, 1990, 1991, 1999, 2016, 1986, 1996, 1984, 1997, 1980, 1961, 1995, 1985, 1992, 2000, 1976, 1959, 1988, 1972, 1981, 1964, 1954, 1979, 1958, 1956, 1963, 1970, 1973, 1960, 1974, 1966, 1971, 1962, 1969, 1977, 1967, 1968, 1965, 1945, 1946, 1955, 1942, 1947, 1944]
Value Counts:
release_year
2017    657
2018    648
2016    577
2019    519
2020    442
       ... 
1946      1
1961      1
1942      1
1947      1
1944      1
Name: count, Length: 72, dtype: int64


In [114]:
print("-"*25+"rating"+"-"*25)
non_graphica_analysis('rating')

-------------------------rating-------------------------
Unique Attrubutes: ['TV-MA', 'TV-14', 'PG-13', 'PG', 'R', 'TV-PG', 'G', 'TV-Y7', 'TV-G', 'TV-Y', 'NC-17', 'NR', 'TV-Y7-FV', 'UR']
Value Counts:
rating
TV-MA       1822
TV-14       1214
R            778
PG-13        470
TV-PG        431
PG           275
TV-G          84
TV-Y7         76
TV-Y          76
NR            58
G             40
TV-Y7-FV       3
UR             3
NC-17          2
Name: count, dtype: int64


In [115]:
print("-"*25+"duration"+"-"*25)
non_graphica_analysis('duration')

-------------------------duration-------------------------
Unique Attrubutes: ['125 min', '9 Seasons', '104 min', '127 min', '166 min', '103 min', '97 min', '106 min', '96 min', '124 min', '116 min', '98 min', '91 min', '115 min', '122 min', '99 min', '88 min', '100 min', '102 min', '93 min', '95 min', '85 min', '83 min', '182 min', '147 min', '90 min', '128 min', '143 min', '119 min', '114 min', '118 min', '108 min', '117 min', '121 min', '142 min', '113 min', '154 min', '120 min', '82 min', '94 min', '109 min', '101 min', '105 min', '86 min', '229 min', '76 min', '89 min', '110 min', '156 min', '112 min', '129 min', '107 min', '1 Season', '135 min', '136 min', '165 min', '150 min', '133 min', '145 min', '92 min', '2 Seasons', '64 min', '59 min', '111 min', '87 min', '148 min', '189 min', '141 min', '130 min', '7 Seasons', '68 min', '131 min', '126 min', '155 min', '123 min', '84 min', '4 Seasons', '13 min', '77 min', '74 min', '49 min', '72 min', '78 min', '70 min', '132 min', '140 m

In [116]:
print("-"*25+"listed_in"+"-"*25)
non_graphica_analysis('listed_in')

-------------------------listed_in-------------------------
Unique Attrubutes: ['Dramas, Independent Movies, International Movies', 'British TV Shows, Reality TV', 'Comedies, Dramas', 'Dramas, International Movies', 'Comedies, International Movies, Romantic Movies', 'Comedies', 'Horror Movies, Sci-Fi & Fantasy', 'Thrillers', 'Action & Adventure, Dramas', 'Action & Adventure, Classic Movies, Dramas', 'Dramas, Horror Movies, Thrillers', 'Action & Adventure, Horror Movies, Thrillers', 'Action & Adventure', 'Dramas, Thrillers', 'Action & Adventure, Anime Features, International Movies', 'Action & Adventure, Comedies, Dramas', 'Sci-Fi & Fantasy, Thrillers', 'Children & Family Movies, Comedies', 'Documentaries, Music & Musicals', 'Children & Family Movies, Dramas', 'Dramas, International Movies, Thrillers', 'Dramas, Romantic Movies', 'Comedies, Dramas, Independent Movies', 'Dramas, International Movies, Romantic Movies', 'Dramas', 'Action & Adventure, Classic Movies, Cult Movies', 'Action & 

In [117]:
print("-"*25+"description"+"-"*25)
non_graphica_analysis('description')

-------------------------description-------------------------
Value Counts:
description
When pretty new neighbor Seema falls for their shy roommate Sid, jealous womanizers Omi and Jai plot to break up the new lovebirds.                        2
Mistakenly accused of an attack on the Fourth Raikage, ninja Naruto is imprisoned in the impenetrable Hozuki Castle and his powers are sealed.             2
With their biggest foe seemingly defeated, InuYasha and his friends return to everyday life. But the peace is soon shattered by an emerging new enemy.     2
When Elastigirl gets recruited for a high-profile crime-fighting mission, Mr. Incredible takes on his toughest assignment ever: full-time parenting.       2
After devastating terror attacks in Norway, a young survivor, grieving families and the country rally for justice and healing. Based on a true story.      2
                                                                                                                               

### 4. Visual Analysis - Univariate, Bivariate after pre-processing of the data

Note: Pre-processing involves unnesting the comma-separated values in columns such as `cast`, `director`, `country`, and `listed_in`.

4.1 For continuous variable(s): Distplot, countplot, histogram for univariate analysis (10 Points)

4.2 For categorical variable(s): Boxplot (10 Points)

4.3 For correlation: Heatmaps, Pairplots (10 Points)

In [193]:
# unnesting director column
director_df = df['director'].str.split(', ', expand=True).set_index(df['title']).stack().reset_index().rename(columns={0:"Directors"}).drop(['level_1'], axis=1)
director_df

Unnamed: 0,title,Directors
0,Sankofa,Haile Gerima
1,The Great British Baking Show,Andy Devonshire
2,The Starling,Theodore Melfi
3,Je Suis Karl,Christian Schwochow
4,Jeans,S. Shankar
...,...,...
5955,Zinzana,Majid Al Ansari
5956,Zodiac,David Fincher
5957,Zombieland,Ruben Fleischer
5958,Zoom,Peter Hewitt


In [218]:
# unnesting cast column
cast_df = df['cast'].str.split(", ", expand=True).set_index(df['title']).stack().reset_index().rename(columns={0:'casts'}).drop(['level_1'], axis=1)
cast_df

Unnamed: 0,title,casts
0,Sankofa,Kofi Ghanaba
1,Sankofa,Oyafunmike Ogunlano
2,Sankofa,Alexandra Duah
3,Sankofa,Nick Medley
4,Sankofa,Mutabaruka
...,...,...
42706,Zubaan,Manish Chaudhary
42707,Zubaan,Meghna Malik
42708,Zubaan,Malkeet Rauni
42709,Zubaan,Anita Shabdish


In [219]:
# unnesting country column
df_country = df['country'].str.split(', ', expand=True).set_index(df['title']).stack().reset_index().drop(['level_1'], axis=1).rename(columns={0:"country"})
df_country

Unnamed: 0,title,country
0,Sankofa,United States
1,Sankofa,Ghana
2,Sankofa,Burkina Faso
3,Sankofa,United Kingdom
4,Sankofa,Germany
...,...,...
6874,Zinzana,Jordan
6875,Zodiac,United States
6876,Zombieland,United States
6877,Zoom,United States


In [228]:
# unnesting listed_in column
listed_df = df['listed_in'].str.split(", ", expand=True).set_index(df['title']).stack().reset_index().drop(['level_1'], axis=1).rename(columns={0:'listed_in'})
listed_df

Unnamed: 0,title,listed_in
0,Sankofa,Dramas
1,Sankofa,Independent Movies
2,Sankofa,International Movies
3,The Great British Baking Show,British TV Shows
4,The Great British Baking Show,Reality TV
...,...,...
11853,Zoom,Children & Family Movies
11854,Zoom,Comedies
11855,Zubaan,Dramas
11856,Zubaan,International Movies
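
The four unnesting cells above all follow the same split, stack, and reset pattern. An equivalent and often shorter route in recent pandas versions is `str.split` followed by `DataFrame.explode`. The sketch below shows it on a two-row stand-in frame (`sample`), assuming values are separated by a comma and a single space as in this dataset; the country lists are shortened for illustration.

```python
import pandas as pd

# Two titles with comma-separated country lists, shortened for the example.
sample = pd.DataFrame({
    "title": ["Sankofa", "Je Suis Karl"],
    "country": ["United States, Ghana", "Germany, Czech Republic"],
})

# Turn each string into a list, then emit one row per list element.
unnested = (
    sample.assign(country=sample["country"].str.split(", "))
    .explode("country")
    .reset_index(drop=True)
)

print(unnested)
```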


In [230]:
new_df = cast_df.merge(director_df, on=['title'], how='inner').merge(df_country, on=['title'], how='inner').merge(listed_df, on=['title'], how='inner')
new_df

Unnamed: 0,title,casts,Directors,country,listed_in
0,Sankofa,Kofi Ghanaba,Haile Gerima,United States,Dramas
1,Sankofa,Kofi Ghanaba,Haile Gerima,United States,Independent Movies
2,Sankofa,Kofi Ghanaba,Haile Gerima,United States,International Movies
3,Sankofa,Kofi Ghanaba,Haile Gerima,Ghana,Dramas
4,Sankofa,Kofi Ghanaba,Haile Gerima,Ghana,Independent Movies
...,...,...,...,...,...
143087,Zubaan,Anita Shabdish,Mozez Singh,India,International Movies
143088,Zubaan,Anita Shabdish,Mozez Singh,India,Music & Musicals
143089,Zubaan,Chittaranjan Tripathy,Mozez Singh,India,Dramas
143090,Zubaan,Chittaranjan Tripathy,Mozez Singh,India,International Movies


In [231]:
new_df.isnull().sum()

title        0
casts        0
Directors    0
country      0
listed_in    0
dtype: int64
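
The merged frame holds one row for every combination of cast member, director, country, and genre of a title, so a single title appears many times. Counting rows per country therefore over-counts titles. Counting distinct titles avoids this, as the sketch below shows on a small stand-in frame (`merged`, a simplified imitation of `new_df` with fewer columns).

```python
import pandas as pd

# Simplified imitation of the merged frame: each title repeats per country and genre.
merged = pd.DataFrame({
    "title": ["Sankofa"] * 4 + ["Zubaan"] * 2,
    "country": ["United States", "United States", "Ghana", "Ghana", "India", "India"],
    "listed_in": ["Dramas", "Independent Movies", "Dramas", "Independent Movies",
                  "Dramas", "International Movies"],
})

# Row counts are inflated by the repeated combinations.
row_counts = merged["country"].value_counts()

# Distinct titles per country give the figure that is usually wanted.
title_counts = merged.groupby("country")["title"].nunique().sort_values(ascending=False)

print(row_counts)
print(title_counts)
```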

In [237]:
# drop the nested columns from df; their unnested versions live in new_df
df.drop(['director', 'cast', 'country', 'listed_in'], axis=1, inplace=True)
df.head()

Unnamed: 0,show_id,type,title,date_added,release_year,rating,duration,description
0,s1,Movie,Dick Johnson Is Dead,"September 25, 2021",2020,PG-13,90 min,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,"September 24, 2021",2021,TV-MA,2 Seasons,"After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,"September 24, 2021",2021,TV-MA,1 Season,To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,"September 24, 2021",2021,TV-MA,1 Season,"Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,"September 24, 2021",2021,TV-MA,2 Seasons,In a city of coaching centers known to train I...


In [239]:
final_df = new_df.merge(df, on=['title'], how='left')
final_df.head()

Unnamed: 0,title,casts,Directors,country,listed_in,show_id,type,date_added,release_year,rating,duration,description
0,Sankofa,Kofi Ghanaba,Haile Gerima,United States,Dramas,s8,Movie,"September 24, 2021",1993,TV-MA,125 min,"On a photo shoot in Ghana, an American model s..."
1,Sankofa,Kofi Ghanaba,Haile Gerima,United States,Independent Movies,s8,Movie,"September 24, 2021",1993,TV-MA,125 min,"On a photo shoot in Ghana, an American model s..."
2,Sankofa,Kofi Ghanaba,Haile Gerima,United States,International Movies,s8,Movie,"September 24, 2021",1993,TV-MA,125 min,"On a photo shoot in Ghana, an American model s..."
3,Sankofa,Kofi Ghanaba,Haile Gerima,Ghana,Dramas,s8,Movie,"September 24, 2021",1993,TV-MA,125 min,"On a photo shoot in Ghana, an American model s..."
4,Sankofa,Kofi Ghanaba,Haile Gerima,Ghana,Independent Movies,s8,Movie,"September 24, 2021",1993,TV-MA,125 min,"On a photo shoot in Ghana, an American model s..."


In [240]:
final_df.isnull().sum()

title           0
casts           0
Directors       0
country         0
listed_in       0
show_id         0
type            0
date_added      0
release_year    0
rating          0
duration        0
description     0
dtype: int64

4.1 For continuous variable(s): Distplot, countplot, histogram for univariate analysis (10 Points)

4.2 For categorical variable(s): Boxplot (10 Points)

4.3 For correlation: Heatmaps, Pairplots (10 Points)
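
A heatmap needs a numeric matrix, but `release_year` is the only numeric column in this dataset. One possible starting point is a cross-tabulation of two categorical columns, which yields a table of counts that a plotting call such as seaborn's `heatmap` could then display. The sketch below builds only the table, on a small stand-in frame (`sample`); pairing `type` with `rating` is an assumption chosen for illustration.

```python
import pandas as pd

# Stand-in rows with the two categorical columns to compare.
sample = pd.DataFrame({
    "type": ["Movie", "TV Show", "Movie", "Movie", "TV Show"],
    "rating": ["PG-13", "TV-MA", "TV-MA", "TV-14", "TV-MA"],
})

# Count how often each (type, rating) pair occurs; missing pairs become 0.
type_by_rating = pd.crosstab(sample["type"], sample["rating"])

print(type_by_rating)
```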

### Hints

1. The exploration should have a goal. As you explore the data, keep in mind that you want to answer which type of shows to produce and how to grow the business.
1. Ensure each recommendation is backed by data. The company is looking for data-driven insights, not personal opinions or anecdotes.
1. Assume that you are presenting your findings to business executives who have only a basic understanding of data science. Avoid unnecessary technical jargon.
1. Start by exploring a few questions:
    1. What type of content is available in different countries?
    1. How has the number of movies released per year changed over the last 20-30 years?
    1. Comparison of TV shows vs. movies.
    1. What is the best time to launch a TV show?
    1. Analysis of actors/directors of different types of shows/movies.
    1. Does Netflix have more focus on TV shows than movies in recent years?
    1. Understanding what content is available in different countries.
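
The question about the best time to launch a TV show could be explored by parsing `date_added` and counting TV shows per month. The sketch below assumes dates follow the `"September 24, 2021"` pattern seen in this notebook, and strips surrounding whitespace as a precaution. The frame `sample` is illustrative, and the `type` assigned to the January row is made up.

```python
import pandas as pd

# Stand-in rows; dates use the same text format as the `date_added` column.
sample = pd.DataFrame({
    "type": ["Movie", "TV Show", "TV Show", "TV Show", "Movie"],
    "date_added": ["September 25, 2021", "September 24, 2021", "September 24, 2021",
                   "January 1, 2020", "September 21, 2021"],
})

# Parse the text dates; strip() guards against stray leading or trailing spaces.
added = pd.to_datetime(sample["date_added"].str.strip(), format="%B %d, %Y")

# Attach the month name and count TV shows added in each month.
shows = sample.assign(month_added=added.dt.month_name())
tv_by_month = shows.loc[shows["type"] == "TV Show", "month_added"].value_counts()

print(tv_by_month)
```

The same grouping by `added.dt.year` and `type` would give a first look at whether TV shows have gained ground on movies in recent years.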