# Exploratory Data Analysis on the Netflix Dataset
![](https://wallpapercave.com/wp/wp1917118.jpg)

**What is Exploratory Data Analysis?**

Exploratory data analysis (EDA for short) is what data analysts do with large sets of data, looking for patterns and summarizing the dataset’s main characteristics beyond what they learn from modeling and hypothesis testing. EDA is a philosophy that allows data analysts to approach a database without assumptions. When a data analyst employs EDA, it’s like they’re asking the data to tell them what they don’t know.

It is an approach to data analysis that uses a variety of techniques to:

1. Maximize insights into a dataset.
2. Uncover underlying structures.
3. Extract important variables.
4. Detect outliers and anomalies.
5. Test underlying assumptions.
6. Determine optimal factor settings.

## **Outline Of the Project**

1. **Download and Load the Dataset**
2. **Import and Install required libraries**
3. **Data Preparation and Cleaning**
4. **Ask and answer questions about the data through visualization**
5. **Summarize and write a conclusion**

## Select and Download the dataset

#### You can download the dataset from https://www.kaggle.com/shivamb/netflix-shows

In [1]:
!pip install jovian --upgrade --quiet

In [2]:
import jovian

In [3]:
# Execute this to save new versions of the notebook
jovian.commit(project="eda")

[jovian] Updating notebook "phegde/eda" on https://jovian.ai
[jovian] Committed successfully! https://jovian.ai/phegde/eda


'https://jovian.ai/phegde/eda'

In [4]:
!pip install plotly folium --quiet

### Importing all the required libraries

In [5]:
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.figure_factory as ff
import seaborn as sns
import matplotlib.pyplot as plt
import folium

### Load the Data set using pandas

In [6]:
netflix_df = pd.read_csv("netflix_titles.csv")

In [7]:
# Let's check the first 5 rows to get an idea of the kind of data we are dealing with
netflix_df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...


In [8]:
# Get the data type and the number of non-null values of each column with the DataFrame's info() method

In [9]:
netflix_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8807 entries, 0 to 8806
Data columns (total 12 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       8807 non-null   object
 1   type          8807 non-null   object
 2   title         8807 non-null   object
 3   director      6173 non-null   object
 4   cast          7982 non-null   object
 5   country       7976 non-null   object
 6   date_added    8797 non-null   object
 7   release_year  8807 non-null   int64 
 8   rating        8803 non-null   object
 9   duration      8804 non-null   object
 10  listed_in     8807 non-null   object
 11  description   8807 non-null   object
dtypes: int64(1), object(11)
memory usage: 825.8+ KB


In [10]:
# Converting date_added to datetime format
# (stray whitespace is stripped first so every value has the same layout)
netflix_df["date_added"] = pd.to_datetime(netflix_df["date_added"].str.strip())

In [11]:
jovian.commit()

[jovian] Updating notebook "phegde/eda" on https://jovian.ai
[jovian] Committed successfully! https://jovian.ai/phegde/eda


'https://jovian.ai/phegde/eda'

In [12]:
## Checking for null values
netflix_df.isnull().sum()

show_id            0
type               0
title              0
director        2634
cast             825
country          831
date_added        10
release_year       0
rating             4
duration           3
listed_in          0
description        0
dtype: int64

**We can see that the director, cast, country, date_added, rating and duration columns have missing values. We can either replace the missing values or drop the affected rows or columns. Here I replace the missing values in the director and country columns with a placeholder string.**
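
Before choosing between filling and dropping, it can help to look at the share of missing values in each column rather than only the raw counts. The sketch below is a minimal illustration on a made-up DataFrame (`toy_df` and its values are hypothetical, not the Netflix data); it also shows filling a column by assignment, which is one common way to write the replacement.

```python
import pandas as pd

# Hypothetical toy data standing in for netflix_df
toy_df = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "director": ["X", None, None, "Y"],
    "country": ["India", "Japan", None, "Canada"],
})

# Percentage of missing values per column
missing_pct = toy_df.isnull().mean().mul(100).round(1)
print(missing_pct)

# Fill one column with a placeholder string by assigning the result back
toy_df["director"] = toy_df["director"].fillna("missing")
print(toy_df["director"].tolist())
```

A column that is mostly empty is usually a candidate for a placeholder or for being dropped entirely, while a column with only a handful of gaps can often be handled by dropping those few rows.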

In [13]:
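# Replacing missing values with a placeholder string
# (the country placeholder is spelled "missisng"; a later cell removes it by that exact key)
netflix_df["director"] = netflix_df["director"].fillna("missing")
netflix_df["country"] = netflix_df["country"].fillna("missisng")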

In [14]:
netflix_df.isnull().sum()

show_id           0
type              0
title             0
director          0
cast            825
country           0
date_added       10
release_year      0
rating            4
duration          3
listed_in         0
description       0
dtype: int64

In [15]:
netflix_df.dropna(inplace=True)

In [16]:
netflix_df.isnull().sum()

show_id         0
type            0
title           0
director        0
cast            0
country         0
date_added      0
release_year    0
rating          0
duration        0
listed_in       0
description     0
dtype: int64

In [17]:
## Renaming the column listed_in to genre
netflix_df.rename(columns={"listed_in":"genre"}, inplace=True)

In [18]:
## Keeping only the first listed genre for each title
netflix_df['genre'] = netflix_df['genre'].apply(lambda x: x.split(",")[0])

In [19]:
netflix_df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genre,description
1,s2,TV Show,Blood & Water,missing,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,2021-09-24,2021,TV-MA,2 Seasons,International TV Shows,"After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",missisng,2021-09-24,2021,TV-MA,1 Season,Crime TV Shows,To protect his family from a powerful drug lor...
4,s5,TV Show,Kota Factory,missing,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,2021-09-24,2021,TV-MA,2 Seasons,International TV Shows,In a city of coaching centers known to train I...
5,s6,TV Show,Midnight Mass,Mike Flanagan,"Kate Siegel, Zach Gilford, Hamish Linklater, H...",missisng,2021-09-24,2021,TV-MA,1 Season,TV Dramas,The arrival of a charismatic young priest brin...
6,s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha","Vanessa Hudgens, Kimiko Glenn, James Marsden, ...",missisng,2021-09-24,2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...


In [20]:
# Creating year and month column
netflix_df["Year_added"] = netflix_df['date_added'].dt.year
netflix_df["Month_added"] = netflix_df['date_added'].dt.month

In [21]:
netflix_df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 7965 entries, 1 to 8806
Data columns (total 14 columns):
 #   Column        Non-Null Count  Dtype         
---  ------        --------------  -----         
 0   show_id       7965 non-null   object        
 1   type          7965 non-null   object        
 2   title         7965 non-null   object        
 3   director      7965 non-null   object        
 4   cast          7965 non-null   object        
 5   country       7965 non-null   object        
 6   date_added    7965 non-null   datetime64[ns]
 7   release_year  7965 non-null   int64         
 8   rating        7965 non-null   object        
 9   duration      7965 non-null   object        
 10  genre         7965 non-null   object        
 11  description   7965 non-null   object        
 12  Year_added    7965 non-null   int64         
 13  Month_added   7965 non-null   int64         
dtypes: datetime64[ns](1), int64(3), object(10)
memory usage: 933.4+ KB


In [22]:
jovian.commit()

[jovian] Updating notebook "phegde/eda" on https://jovian.ai
[jovian] Committed successfully! https://jovian.ai/phegde/eda


'https://jovian.ai/phegde/eda'

In [23]:
netflix_df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,genre,description,Year_added,Month_added
1,s2,TV Show,Blood & Water,missing,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,2021-09-24,2021,TV-MA,2 Seasons,International TV Shows,"After crossing paths at a party, a Cape Town t...",2021,9
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",missisng,2021-09-24,2021,TV-MA,1 Season,Crime TV Shows,To protect his family from a powerful drug lor...,2021,9
4,s5,TV Show,Kota Factory,missing,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,2021-09-24,2021,TV-MA,2 Seasons,International TV Shows,In a city of coaching centers known to train I...,2021,9
5,s6,TV Show,Midnight Mass,Mike Flanagan,"Kate Siegel, Zach Gilford, Hamish Linklater, H...",missisng,2021-09-24,2021,TV-MA,1 Season,TV Dramas,The arrival of a charismatic young priest brin...,2021,9
6,s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha","Vanessa Hudgens, Kimiko Glenn, James Marsden, ...",missisng,2021-09-24,2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...,2021,9


In [24]:
# We will not use cast in any statistics or visualization, and date_added is now
# covered by the Year_added and Month_added columns, so we drop both columns
netflix_df.drop(columns=["cast","date_added"], inplace=True)

In [25]:
netflix_df.head(9)

Unnamed: 0,show_id,type,title,director,country,release_year,rating,duration,genre,description,Year_added,Month_added
1,s2,TV Show,Blood & Water,missing,South Africa,2021,TV-MA,2 Seasons,International TV Shows,"After crossing paths at a party, a Cape Town t...",2021,9
2,s3,TV Show,Ganglands,Julien Leclercq,missisng,2021,TV-MA,1 Season,Crime TV Shows,To protect his family from a powerful drug lor...,2021,9
4,s5,TV Show,Kota Factory,missing,India,2021,TV-MA,2 Seasons,International TV Shows,In a city of coaching centers known to train I...,2021,9
5,s6,TV Show,Midnight Mass,Mike Flanagan,missisng,2021,TV-MA,1 Season,TV Dramas,The arrival of a charismatic young priest brin...,2021,9
6,s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha",missisng,2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...,2021,9
7,s8,Movie,Sankofa,Haile Gerima,"United States, Ghana, Burkina Faso, United Kin...",1993,TV-MA,125 min,Dramas,"On a photo shoot in Ghana, an American model s...",2021,9
8,s9,TV Show,The Great British Baking Show,Andy Devonshire,United Kingdom,2021,TV-14,9 Seasons,British TV Shows,A talented batch of amateur bakers face off in...,2021,9
9,s10,Movie,The Starling,Theodore Melfi,United States,2021,PG-13,104 min,Comedies,A woman adjusting to life after a loss contend...,2021,9
11,s12,TV Show,Bangkok Breaking,Kongkiat Komesiri,missisng,2021,TV-MA,1 Season,Crime TV Shows,"Struggling to earn a living in Bangkok, a man ...",2021,9


In [26]:
# Calculating the total number of movies and TV shows for each country

In [27]:
country_number = {}

In [28]:
def count_countries(one_row):
    country_list = one_row.split(",")

    for value in country_list:
        value = value.lower()
        if value in country_number.keys():
            count = country_number[value] + 1
            country_number[value] = count
        else:
            country_number[value] = 1

In [29]:
netflix_df["country"].apply(count_countries)

1       None
2       None
4       None
5       None
6       None
        ... 
8801    None
8802    None
8804    None
8805    None
8806    None
Name: country, Length: 7965, dtype: object

In [30]:
country_number.pop('missisng')

675

In [31]:
country = []
number =[]
for key,val in country_number.items():
    country.append(key)
    number.append(val)

In [32]:
country_n = {"country":country,
            "number":number}

In [33]:
country_df = pd.DataFrame.from_dict(country_n)

In [34]:
country_df

Unnamed: 0,country,number
0,south africa,39
1,india,976
2,united states,2839
3,ghana,1
4,burkina faso,1
...,...,...
180,sudan,1
181,panama,1
182,slovenia,2
183,east germany,1
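
One thing to watch in the counting function above: splitting on `","` alone leaves a leading space on every country after the first, so `" ghana"` and `"ghana"` end up under different keys. The sketch below shows one possible way to count with the whitespace stripped, using `collections.Counter`. The list `sample_countries` is a hypothetical stand-in for the `country` column, and the resulting numbers are for illustration only.

```python
from collections import Counter

# Hypothetical sample of values from the "country" column
sample_countries = [
    "United States, Ghana, Burkina Faso",
    "United States",
    "India",
    "Ghana, India",
]

country_counts = Counter()
for row in sample_countries:
    for name in row.split(","):
        # strip() removes the space that follows each comma
        country_counts[name.strip().lower()] += 1

print(dict(country_counts))
```

With real data, the same loop could be fed from `netflix_df["country"]`; the counts would then differ from the table above because co-producing countries are merged into a single key.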


In [35]:
location = pd.read_csv("location_countries.csv")

In [36]:
location = location.loc[:,["latitude","longitude","name"]]

In [37]:
location["country"] = location["name"].str.lower()

In [38]:
location

Unnamed: 0,latitude,longitude,name,country
0,42.546245,1.601554,Andorra,andorra
1,23.424076,53.847818,United Arab Emirates,united arab emirates
2,33.939110,67.709953,Afghanistan,afghanistan
3,17.060816,-61.796428,Antigua and Barbuda,antigua and barbuda
4,18.220554,-63.068615,Anguilla,anguilla
...,...,...,...,...
240,15.552727,48.516388,Yemen,yemen
241,-12.827500,45.166244,Mayotte,mayotte
242,-30.559482,22.937506,South Africa,south africa
243,-13.133897,27.849332,Zambia,zambia


In [39]:
country_df_map =location.merge(country_df, on="country",how="inner")

In [40]:
country_df_map

Unnamed: 0,latitude,longitude,name,country,number
0,23.424076,53.847818,United Arab Emirates,united arab emirates,20
1,-38.416097,-63.616672,Argentina,argentina,70
2,47.516231,14.550072,Austria,austria,7
3,-25.274398,133.775136,Australia,australia,100
4,23.684994,90.356331,Bangladesh,bangladesh,3
...,...,...,...,...,...
77,-32.522779,-55.765835,Uruguay,uruguay,7
78,6.423750,-66.589730,Venezuela,venezuela,1
79,14.058324,108.277199,Vietnam,vietnam,7
80,-30.559482,22.937506,South Africa,south africa,39
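
The inner join keeps only the countries whose lowercase name appears in both tables, so any country without a matching row in the location file silently disappears from the map. A minimal sketch of how to list those unmatched countries is shown below. The frames `counts` and `coords` and all their values are hypothetical stand-ins for `country_df` and `location`.

```python
import pandas as pd

# Hypothetical stand-ins for country_df and location
counts = pd.DataFrame({"country": ["india", "east germany", "japan"],
                       "number": [10, 1, 5]})
coords = pd.DataFrame({"country": ["india", "japan", "andorra"],
                       "latitude": [20.59, 36.20, 42.55],
                       "longitude": [78.96, 138.25, 1.60]})

# A left join with indicator=True keeps every counted country and
# marks the ones that found no coordinates as "left_only"
checked = counts.merge(coords, on="country", how="left", indicator=True)
unmatched = checked.loc[checked["_merge"] == "left_only", "country"].tolist()
print(unmatched)
```

Inspecting the unmatched list is a quick way to spot names that differ only in spelling or whitespace between the two sources.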


In [41]:
# Map of the countries listed for Netflix titles; each marker's popup shows the number of titles

In [42]:
m = folium.Map(location=[0, 0], zoom_start=2)
tooltip = "Click me!"
for index, row in country_df_map.iterrows():
    folium.Marker([row['latitude'], row['longitude']], popup=row["number"],tooltip=tooltip).add_to(m)
    
m

In [43]:
netflix_df.head()

Unnamed: 0,show_id,type,title,director,country,release_year,rating,duration,genre,description,Year_added,Month_added
1,s2,TV Show,Blood & Water,missing,South Africa,2021,TV-MA,2 Seasons,International TV Shows,"After crossing paths at a party, a Cape Town t...",2021,9
2,s3,TV Show,Ganglands,Julien Leclercq,missisng,2021,TV-MA,1 Season,Crime TV Shows,To protect his family from a powerful drug lor...,2021,9
4,s5,TV Show,Kota Factory,missing,India,2021,TV-MA,2 Seasons,International TV Shows,In a city of coaching centers known to train I...,2021,9
5,s6,TV Show,Midnight Mass,Mike Flanagan,missisng,2021,TV-MA,1 Season,TV Dramas,The arrival of a charismatic young priest brin...,2021,9
6,s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha",missisng,2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...,2021,9


In [44]:
# Finding the number of seasons for TV shows

In [45]:
def season_split(season):
    # "1 Season" and "2 Seasons" both contain "Season"; movies ("90 min") get 0
    if "Season" in season:
        # Store the count as an integer so it sorts and plots numerically
        season_list.append(int(season.split(" ")[0]))
    else:
        season_list.append(0)

In [46]:
season_list = []
netflix_df["duration"].apply(season_split)

1       None
2       None
4       None
5       None
6       None
        ... 
8801    None
8802    None
8804    None
8805    None
8806    None
Name: duration, Length: 7965, dtype: object

In [47]:
netflix_df["seasons"] = season_list

In [48]:
netflix_df.head()

Unnamed: 0,show_id,type,title,director,country,release_year,rating,duration,genre,description,Year_added,Month_added,seasons
1,s2,TV Show,Blood & Water,missing,South Africa,2021,TV-MA,2 Seasons,International TV Shows,"After crossing paths at a party, a Cape Town t...",2021,9,2
2,s3,TV Show,Ganglands,Julien Leclercq,missisng,2021,TV-MA,1 Season,Crime TV Shows,To protect his family from a powerful drug lor...,2021,9,1
4,s5,TV Show,Kota Factory,missing,India,2021,TV-MA,2 Seasons,International TV Shows,In a city of coaching centers known to train I...,2021,9,2
5,s6,TV Show,Midnight Mass,Mike Flanagan,missisng,2021,TV-MA,1 Season,TV Dramas,The arrival of a charismatic young priest brin...,2021,9,1
6,s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha",missisng,2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...,2021,9,0
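
As an alternative to appending to a shared list, the season count can be computed by a function that returns a value, which makes the step easier to test on its own. The sketch below is one possible version. The name `parse_seasons` is hypothetical, and it assumes durations look like `"2 Seasons"`, `"1 Season"` or `"90 min"`, as in the rows shown above.

```python
import re

def parse_seasons(duration):
    """Return the season count for values like "2 Seasons", or 0 otherwise."""
    match = re.match(r"(\d+) Seasons?$", duration.strip())
    return int(match.group(1)) if match else 0

examples = ["2 Seasons", "1 Season", "90 min"]
print([parse_seasons(d) for d in examples])  # [2, 1, 0]
```

A function like this could be passed straight to `netflix_df["duration"].apply(...)` and the result assigned to the `seasons` column.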


In [None]:
jovian.commit()

# Exploratory Data Analysis

In [None]:
netflix_df.head()

# 1. What is the ratio of Movies to TV Shows on Netflix?

In [None]:
distribution = netflix_df["type"].value_counts()
# Take the labels from the index so they always match the counted values
fig = px.pie(values=distribution.values, names=distribution.index)
fig.show()

**Insight From the Visualization:**
From the visualization we can see that about 71% of the titles on Netflix are movies and about 29% are TV shows, a gap of roughly 42 percentage points.
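
The shares behind a pie chart like this can also be computed directly, which makes it easy to state the gap in percentage points or as a ratio. The sketch below uses a small made-up series (`types` is hypothetical, with 7 movies and 3 TV shows) rather than the real column.

```python
import pandas as pd

# Hypothetical sample; the notebook's real values are in netflix_df["type"]
types = pd.Series(["Movie"] * 7 + ["TV Show"] * 3)

# Share of each type as a percentage of all titles
share = types.value_counts(normalize=True).mul(100)

gap_points = share["Movie"] - share["TV Show"]  # difference in percentage points
ratio = share["Movie"] / share["TV Show"]       # movies per TV show

print(share.round(1).to_dict())
print(round(gap_points, 1), round(ratio, 2))
```

Reporting the gap in percentage points avoids the ambiguity of saying one share is "X% more" than the other.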


In [None]:
# 2. Distribution of Rating in Netflix

In [None]:
# rename_axis + reset_index(name=...) yields the columns "rating" and "count"
# without depending on the default column names of value_counts()
df_rating = netflix_df["rating"].value_counts().rename_axis("rating").reset_index(name="count")

fig = px.bar(df_rating, y='rating', x='count', title='Distribution of Rating',
             color_discrete_sequence=['red'], text='count')
fig.update_xaxes(showgrid=False)
fig.update_yaxes(showgrid=False, categoryorder='total ascending', ticksuffix=' ', showline=False)
fig.update_traces(hovertemplate=None, marker=dict(line=dict(width=0)))
fig.update_layout(margin=dict(t=80, b=0, l=70, r=40),
                  hovermode="y unified",
                  xaxis_title=' ', yaxis_title=" ", height=400,
                  plot_bgcolor='#333', paper_bgcolor='#333',
                  title_font=dict(size=25, color='#8a8d93', family="Lato, sans-serif"),
                  font=dict(color='#8a8d93'),
                  legend=dict(orientation="h", yanchor="bottom", y=1, xanchor="center", x=0.5),
                  hoverlabel=dict(bgcolor="black", font_size=13, font_family="Lato, sans-serif"))
fig.show()

**Insight From the Visualization:**
From the visualization we can see that the largest group of Netflix content is rated for mature audiences, with 2,879 titles.

In [None]:
# 3. Distribution of Movies and TV shows over USA, India, Canada, South Africa, United Kingdom, Japan

In [None]:
list_of_countries = ["united states","india","japan","canada","south africa","united kingdom"]
filt = country_df["country"].isin(list_of_countries)
df_country_list = country_df.loc[filt]

fig = px.bar(df_country_list, x="number",y="country", orientation="h")
fig.show()

In [None]:
df_country_list

**Insight From the Visualization:**
From the visualization we can see that the USA has the most content, with more than 2,800 titles, followed by India with close to 1,000.

In [None]:
jovian.commit()

In [None]:
# 4. Year in which the most shows or movies were added
df_years = netflix_df.groupby(["Year_added"]).size().reset_index(name='counts')
fig = px.line(df_years,x="Year_added",y="counts", title="Content Added Over the Years")
fig.show()

**Insight From the Visualization:**
From the visualization we can see that the number of titles added was highest in 2019, when around 1,850 new titles were added. Note that 2019 is just before the pandemic, not during it.

In [None]:
jovian.commit()

In [None]:
# 5. Difference in the growth of movies and TV shows over the years

In [None]:
d1 = netflix_df[netflix_df["type"] == "TV Show"]
d2 = netflix_df[netflix_df["type"] == "Movie"]

col = "Year_added"

# Titles added per year for TV shows, with each year's share of the total
vc1 = d1[col].value_counts().rename_axis(col).reset_index(name="count")
vc1['percent'] = 100 * vc1['count'] / vc1['count'].sum()
vc1 = vc1.sort_values(col)

# The same table for movies
vc2 = d2[col].value_counts().rename_axis(col).reset_index(name="count")
vc2['percent'] = 100 * vc2['count'] / vc2['count'].sum()
vc2 = vc2.sort_values(col)

trace1 = go.Scatter(x=vc1[col], y=vc1["count"], name="TV Shows", marker=dict(color="#a678de"))
trace2 = go.Scatter(x=vc2[col], y=vc2["count"], name="Movies", marker=dict(color="#6ad49b"))
data = [trace1, trace2]
layout = go.Layout(title="Difference of Content added over the years", legend=dict(x=0.1, y=1.1, orientation="h"))
fig = go.Figure(data, layout=layout)
fig.show()


**Insight From the Visualization:**
From the visualization we can see that from 2017 onward there is a large gap between the number of new movies and the number of new TV shows added.
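
To put a number on the gap seen in the chart, the yearly counts for both types can be placed side by side in one table. The sketch below does this with `pd.crosstab` on a small made-up frame (`sample` and its rows are hypothetical, not the Netflix data).

```python
import pandas as pd

# Hypothetical sample rows; the notebook's data lives in netflix_df
sample = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show",
             "Movie", "Movie", "Movie", "TV Show"],
    "Year_added": [2017, 2017, 2017,
                   2018, 2018, 2018, 2018],
})

# Rows are years, columns are content types, cells are title counts
by_year = pd.crosstab(sample["Year_added"], sample["type"])
by_year["difference"] = by_year["Movie"] - by_year["TV Show"]
print(by_year)
```

The `difference` column shows for each year how many more movies than TV shows were added, which is the quantity the two lines in the chart are being compared on.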

In [None]:
# 6. Distribution among the genres

In [None]:
df_genre = netflix_df["genre"].value_counts().rename_axis("genre").reset_index(name="count")

df_genre["content"] = "All"

# Size each genre tile by its number of titles
fig = px.treemap(df_genre,
                 path=["content", "genre"], values="count")

# The data holds title counts, not viewing figures, so the title refers to how common a genre is
fig.update_layout(title='Most Common Genres on Netflix',
                  title_font=dict(size=25, color='#fff', family="Lato, sans-serif"),
                  plot_bgcolor='#333', paper_bgcolor='#333')
fig.show()

**Insight From the Visualization:**
From the visualization we can see that Netflix has a large number of titles in the Crime and Stand-Up genres, with a count of 333.

In [None]:
jovian.commit()

In [None]:
# 7. Which is the best month to release the content

In [None]:
fig = px.histogram(netflix_df, x="Month_added")
fig.update_layout(bargap=0.2,title="Content Added per Month")
fig.show()

**Insight From the Visualization:**
From the visualization we can see that new content is spread fairly evenly across the months, with December and July seeing the most additions.
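
Month numbers are harder to read than month names when reporting the busiest months. The sketch below shows one way to tabulate additions per month with names from the standard `calendar` module. The list `months_added` is a hypothetical stand-in for the `Month_added` column.

```python
import calendar
from collections import Counter

# Hypothetical month numbers, standing in for netflix_df["Month_added"]
months_added = [12, 7, 7, 1, 12, 12, 3]

counts = Counter(months_added)

# Report in calendar order, using month names instead of numbers
for month in range(1, 13):
    if counts[month]:
        print(f"{calendar.month_name[month]:<10} {counts[month]}")

# Name of the month with the most additions in the sample
busiest = calendar.month_name[counts.most_common(1)[0][0]]
print("Busiest month:", busiest)
```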

In [None]:
jovian.commit()

In [None]:
# 8. Count the number of seasons of TV shows 

In [None]:
import matplotlib

matplotlib.rcParams['font.size'] = 14
matplotlib.rcParams['figure.figsize'] = (12, 8)
matplotlib.rcParams['figure.facecolor'] = '#00000000'

sns.set_style(style="darkgrid")

filt = netflix_df["seasons"] != 0

netflix_season = netflix_df.loc[filt]

sns.countplot(x="seasons", data=netflix_season)
plt.title("Distribution of Seasons of TV show");

**Insight From the Visualization:**
From the visualization we can see that around 1,500 TV shows have only one season, and that Netflix also has TV shows with more than 8 seasons.

In [None]:
jovian.commit()

In [None]:
# 9. Top 15 director for Movie and TV shows.

In [None]:
netflix_df

In [None]:
# Split the titles by type
filt_movie = netflix_df["type"] == "Movie"
filt_tv = netflix_df["type"] == "TV Show"

df_movie_director = netflix_df.loc[filt_movie]
df_tv_director = netflix_df.loc[filt_tv]

# Count titles per director
df_movie = df_movie_director["director"].value_counts().rename_axis("director").reset_index(name="count")
df_tv = df_tv_director["director"].value_counts().rename_axis("director").reset_index(name="count")

# Leave out the "missing" placeholder that was filled in for unknown directors
df_movie = df_movie[df_movie["director"] != "missing"]
df_tv = df_tv[df_tv["director"] != "missing"]

df_top_movie = df_movie.sort_values(by="count", ascending=False).head(15)
df_top_tv = df_tv.sort_values(by="count", ascending=False).head(15)

fig = px.bar(data_frame=df_top_movie,x="count",y="director")
fig.show()

**Insight From the Visualization:**
From the visualization we can see that the director with the most movies on Netflix is Raul Campos, with 18 movies.

In [None]:
fig = px.bar(data_frame=df_top_tv,x="count",y="director")
fig.show()

**Insight From the Visualization:**
From the visualization we can see that the director with the most TV shows on Netflix is Alastair Fothergill, with 3 shows.

In [None]:
jovian.commit()

## Conclusion

1. Netflix has content from many countries; the USA and India account for the most titles.
2. Netflix added the most content in 2019 (around 1,850 titles), more than in any other year in the dataset.
3. Netflix has more movies than TV shows.
4. Netflix has TV shows with more than 8 seasons.

## Future Scope:

1. We can combine this dataset with one that has country-wise views and ratings to identify the popular shows in each country.
2. We can combine it with another dataset to check Netflix's revenue per country.

In [None]:
jovian.commit()