<a href="https://www.kaggle.com/code/anjusukumaran4/netflix-movies-and-tv-shows-eda-cutecharts?scriptVersionId=136202797" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Netflix Movies and TV Shows: Exploratory Data Analysis with **cutecharts**

cutecharts is a Python visualization package that draws charts in a hand-drawn style.

In [1]:
#install cutecharts
!pip install cutecharts

Collecting cutecharts
  Downloading cutecharts-1.2.0-py3-none-any.whl (17 kB)
Installing collected packages: cutecharts
Successfully installed cutecharts-1.2.0

### Import libraries

In [2]:
import pandas as pd
import cutecharts.charts as ctc

### Load data

In [3]:
df= pd.read_csv('../input/netflix-shows/netflix_titles.csv')

In [4]:
df.head()

,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...


In [5]:
df.shape

(8807, 12)

In [6]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8807 entries, 0 to 8806
Data columns (total 12 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       8807 non-null   object
 1   type          8807 non-null   object
 2   title         8807 non-null   object
 3   director      6173 non-null   object
 4   cast          7982 non-null   object
 5   country       7976 non-null   object
 6   date_added    8797 non-null   object
 7   release_year  8807 non-null   int64 
 8   rating        8803 non-null   object
 9   duration      8804 non-null   object
 10  listed_in     8807 non-null   object
 11  description   8807 non-null   object
dtypes: int64(1), object(11)
memory usage: 825.8+ KB


### Checking null values

In [7]:
df.isnull().sum()

show_id            0
type               0
title              0
director        2634
cast             825
country          831
date_added        10
release_year       0
rating             4
duration           3
listed_in          0
description        0
dtype: int64
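
Raw null counts are easier to compare as a share of all rows. The following is a minimal sketch on a tiny stand-in frame (the values in `sample` are hypothetical, not taken from the dataset); filling missing categories with `"Unknown"` is one possible treatment, not necessarily what this notebook needs.

```python
import pandas as pd

# Hypothetical stand-in for df so the sketch runs without the Kaggle file
sample = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "director": ["X", None, None, "Y"],
    "country": ["India", None, "Japan", "Spain"],
})

# Share of missing values per column, in percent, largest first
missing_pct = (sample.isnull().mean() * 100).sort_values(ascending=False)
print(missing_pct)

# One option: label missing categories explicitly instead of dropping rows
filled = sample.fillna({"director": "Unknown", "country": "Unknown"})
```

On the real data, the same expression with `df` in place of `sample` would rank `director`, `country`, and `cast` by how incomplete they are.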

## **Donut chart**

### Top 10 countries with the most content

In [8]:
top_countries=df['country'].value_counts()[:10].to_frame(name='count')
top_countries

country,count
United States,2818
India,972
United Kingdom,419
Japan,245
South Korea,199
Canada,181
Spain,145
France,124
Mexico,110
Egypt,106
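
Note that `value_counts()` treats each distinct string in `country` as one category, so a title listed under several countries (a comma-separated value) is counted separately from its individual countries. The sketch below shows one way to count every listed country once per title; the rows in `sample` are hypothetical, and the empty-string filter is a precaution against stray commas.

```python
import pandas as pd

# Hypothetical stand-in for df['country'], including multi-country entries
sample = pd.DataFrame({
    "country": [
        "United States",
        "United States, India",
        "India",
        None,
        "Japan, United States,",
    ]
})

per_country = (
    sample["country"]
    .dropna()                      # skip titles with no country listed
    .str.split(",")                # one list of countries per title
    .explode()                     # one row per (title, country) pair
    .str.strip()                   # remove spaces left by the split
    .loc[lambda s: s != ""]        # drop empty pieces from stray commas
    .value_counts()
)

# Same shape as top_countries, ready for the chart cell
top10 = per_country.head(10).to_frame(name="count")
print(top10)
```

With this variant a co-production contributes to each of its countries, so the totals can exceed the number of titles.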


In [9]:
donut=ctc.Pie('Countries with most content',width='720px',height='720px')  #add title

donut.set_options(labels=list(top_countries.index),inner_radius=0.4)  #country names as labels, inner radius set 0.4
donut.add_series(list(top_countries['count']))

donut.render_notebook()  #display the chart

## **Pie Chart**

### Content type on Netflix

In [10]:
content_type=df['type'].value_counts().to_frame(name='count')
content_type

type,count
Movie,6131
TV Show,2676
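
The two counts are easier to read as shares of the catalogue. A small sketch, rebuilding `content_type` from the counts shown above; the `labels` list is a hypothetical addition that could be passed as chart labels instead of the bare type names.

```python
import pandas as pd

# Counts copied from the output above
content_type = pd.DataFrame({"count": [6131, 2676]}, index=["Movie", "TV Show"])

# Percentage of all titles per content type, rounded to one decimal
share = (content_type["count"] / content_type["count"].sum() * 100).round(1)

# Labels that carry the percentage, e.g. for a pie chart legend
labels = [f"{name} ({pct:.1f}%)" for name, pct in share.items()]
print(labels)
```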


In [11]:
pie=ctc.Pie('Content type on Netflix', width='720px',height='720px')  # add title

pie.set_options(labels=list(content_type.index),inner_radius=0,colors=['#FFF1C9','#EA5F89'])    #content types as labels, inner radius 0 gives a full pie
pie.add_series(list(content_type['count']))     #values that size each slice

pie.render_notebook()   #display the chart

## **Bar Chart**

### Distribution of content ratings

In [12]:
rating=df['rating'].value_counts().to_frame(name='count')[:10]
rating

rating,count
TV-MA,3207
TV-14,2160
TV-PG,863
R,799
PG-13,490
TV-Y7,334
TV-Y,307
PG,287
TV-G,220
NR,80


In [13]:
bar=ctc.Bar('Content ratings')    #add title

bar.set_options(labels=list(rating.index),x_label='Rating',y_label='Count')  #set the chart options
bar.add_series('Count',list(rating['count']))

bar.render_notebook()   #display the chart

## **Line Chart**

### Number of titles by release year (2000-2020)

In [14]:
content = df.groupby('release_year')['show_id'].count().loc[2000:2020].to_frame(name="count")  # label-based slice, both ends inclusive
content

release_year,count
2000,37
2001,45
2002,51
2003,61
2004,64
2005,80
2006,96
2007,88
2008,136
2009,152


In [15]:
line = ctc.Line('Growth in Content over the Years',width='720px',height='720px')  #add title

line.set_options(labels=list(content.index),x_label='Years',y_label='Count',colors=['#00FFFF'])     #set the chart options
line.add_series('No. of releases', list(content['count']))

line.render_notebook()   #display the chart
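
The chart above counts titles by `release_year`, which is when a title was first released, not when it joined Netflix. To look at catalogue growth instead, the `date_added` strings (such as "September 25, 2021" in the `df.head()` output) can be parsed into dates. This is a minimal sketch on hypothetical rows; the date format and the whitespace stripping are assumptions about how the column is written.

```python
import pandas as pd

# Hypothetical stand-in for df['date_added'], with a padded and a missing value
sample = pd.DataFrame({
    "date_added": ["September 25, 2021", "September 24, 2021", " August 4, 2017", None]
})

# Parse "Month day, year"; values that do not match become NaT
added = pd.to_datetime(
    sample["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
)

# Titles added per year, in chronological order
per_year = added.dropna().dt.year.value_counts().sort_index()
print(per_year)
```

`list(per_year.index)` and `list(per_year)` could then be passed to `set_options(labels=...)` and `add_series(...)` in the same way as `content` above.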