Data Source - [Netflix Data](https://www.kaggle.com/datasets/shivamb/netflix-shows)

This analysis works through the following steps:
1. Data Cleaning & Prep
- Handle missing values (e.g., director, cast, country).
- Convert date_added from string to datetime format.
- Extract useful features (e.g., month/year added, duration in minutes).

2. Exploratory Data Analysis (EDA)
- Content Distribution: Movies vs. TV shows over time.
- Release Trends: When were most shows/movies added to Netflix?
- Country Analysis: Which countries produce the most content?
- Ratings Analysis: What’s the most common rating (TV-MA, PG-13, etc.)?

3. Visualizations (Use Matplotlib/Seaborn or Plotly)
- 📈 Bar Chart: Number of Movies vs. TV Shows by year.
- 🌍 Map Visualization: Countries producing the most content (using geopandas or Plotly).
- 📅 Time Series Plot: Monthly additions of content over the years.
- 📊 Pie Chart: Distribution of ratings (TV-MA, PG-13, etc.).

4. Bonus (If You Want More Challenge)
- Text Analysis: Analyze the description column for common keywords.
- Recommendation System (Basic): Suggest similar content based on genre/director.

In [5]:
# Packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
print('Happy Coding 😊')

Happy Coding 😊


In [6]:
data = pd.read_csv('./netflix_titles.csv') # importing the dataset

## Data Cleaning

In [7]:
data.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
0,s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
1,s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
2,s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
3,s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
4,s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...


In [9]:
data.sample(5)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
3732,s3733,Movie,The Edge of Democracy,Petra Costa,,Brazil,"June 19, 2019",2019,TV-14,122 min,"Documentaries, International Movies",Political documentary and personal memoir coll...
4911,s4912,Movie,The Rachel Divide,Laura Brownson,,United States,"April 27, 2018",2018,TV-MA,105 min,Documentaries,"Rachel Dolezal, her family and her critics rec..."
4855,s4856,Movie,Catching Feelings,Kagiso Lediga,"Kagiso Lediga, Pearl Thusi, Akin Omotoso, Andr...",South Africa,"May 18, 2018",2018,TV-MA,117 min,"Comedies, International Movies, Romantic Movies","Amid growing tensions in their marriage, a Joh..."
4513,s4514,Movie,Feminists: What Were They Thinking?,Johanna Demetrakas,,United States,"October 12, 2018",2018,TV-MA,86 min,"Documentaries, LGBTQ Movies",Revisiting 1970s photos of women that captured...
7903,s7904,Movie,Runaway Bride,Garry Marshall,"Julia Roberts, Richard Gere, Joan Cusack, Hect...",United States,"December 1, 2020",1999,PG,116 min,"Comedies, Romantic Movies",Sparks fly when a newspaper columnist writes a...


In [12]:
data.isnull().sum()

show_id            0
type               0
title              0
director        2634
cast             825
country          831
date_added        10
release_year       0
rating             4
duration           3
listed_in          0
description        0
dtype: int64
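
Raw counts are easier to judge next to each column's share of missing rows. The following is a minimal sketch on a hypothetical toy frame (`toy` and its values are made up for illustration and are not the Netflix data); the same expression could be applied to `data`.

```python
import pandas as pd

# Hypothetical toy frame standing in for the Netflix data
toy = pd.DataFrame({
    "title": ["A", "B", "C", "D"],
    "director": ["X", None, None, "Y"],
    "country": ["India", None, "Japan", "Brazil"],
})

# Count and percentage of missing values per column, largest first
missing = pd.DataFrame({
    "n_missing": toy.isnull().sum(),
    "pct_missing": (toy.isnull().mean() * 100).round(1),
}).sort_values("n_missing", ascending=False)

print(missing)
```

A column with a high missing share, such as `director` in the counts above, is usually better filled with a placeholder than dropped.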

In [33]:
data[(data['director'].isnull()) & (data['type'] != 'TV Show')]

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
404,s405,Movie,9to5: The Story of a Movement,,,,"July 22, 2021",2021,TV-MA,85 min,Documentaries,"In this documentary, female office workers in ..."
470,s471,Movie,Bridgerton - The Afterparty,,"David Spade, London Hughes, Fortune Feimster",,"July 13, 2021",2021,TV-14,39 min,Movies,"""Bridgerton"" cast members share behind-the-sce..."
483,s484,Movie,Last Summer,,"Fatih Şahin, Ece Çeşmioğlu, Halit Özgür Sarı, ...",,"July 9, 2021",2021,TV-MA,102 min,"Dramas, International Movies, Romantic Movies","During summer vacation in a beachside town, 16..."
641,s642,Movie,Sisters on Track,,,,"June 24, 2021",2021,PG,97 min,"Documentaries, Sports Movies",Three track star sisters face obstacles in lif...
717,s718,Movie,Headspace: Unwind Your Mind,,"Andy Puddicombe, Evelyn Lewis Prieto, Ginger D...",,"June 15, 2021",2021,TV-G,273 min,Documentaries,"Do you want to relax, meditate or sleep deeply..."
...,...,...,...,...,...,...,...,...,...,...,...,...
8231,s8232,Movie,The Bund,,Chow Yun Fat,Hong Kong,"September 20, 2018",1983,TV-14,103 min,"Action & Adventure, Dramas, International Movies","After losing everything, a young man rebuilds ..."
8268,s8269,Movie,The Darkest Dawn,,,United Kingdom,"June 23, 2018",2016,TV-MA,75 min,"Action & Adventure, Independent Movies, Intern...",An aspiring filmmaker records the chaos of an ...
8330,s8331,Movie,The Great Battle,,"Zo In-sung, Nam Joo-hyuk, Park Sung-woong, Bae...",South Korea,"April 8, 2019",2018,TV-MA,136 min,"Action & Adventure, Dramas, International Movies","In seventh-century Korea, the commander of Ans..."
8647,s8648,Movie,"Twisted Trunk, Big Fat Body",,"Vijay Maurya, Naman Jain, Usha Nadkarni, Mukes...",India,"January 15, 2017",2015,TV-14,89 min,"Dramas, International Movies",After terrorists place a bomb inside a toy Lor...


In [None]:
print(data['type'].unique()) 

['Movie' 'TV Show']


In [38]:
data.loc[data['duration'].isnull(), 'rating'] = 'TV-MA' # The three rows with missing duration are movies (the Louis C.K. titles fixed in the next cell); their rating is set to TV-MA

In [41]:
data.loc[data['title'] == 'Louis C.K. 2017', 'duration'] = '74 min' # The duration of the movie is set to 74 min
data.loc[data['title'] == 'Louis C.K.: Hilarious', 'duration'] = '84 min' # The duration of the movie is set to 84 min
data.loc[data['title'] == 'Louis C.K.: Live at the Comedy Store', 'duration'] = '66 min' # The duration of the movie is set to 66 min
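
Hard-coding each title works for three rows, but the same repair can be expressed as a rule. The sketch below assumes that the missing durations were entered in the `rating` column by mistake (an assumption, since the notebook does not display those rows before overwriting them). The toy frame and its titles are hypothetical.

```python
import pandas as pd

# Hypothetical toy rows; the first mimics a duration stored in the rating column
toy = pd.DataFrame({
    "title": ["Special A", "Film B", "Show C"],
    "rating": ["74 min", "PG-13", "TV-MA"],
    "duration": [None, "90 min", "2 Seasons"],
})

# Rows where duration is missing and rating looks like "<number> min"
misplaced = toy["duration"].isnull() & toy["rating"].str.fullmatch(r"\d+ min", na=False)

# Move the value into duration, then blank the rating so it can be filled deliberately
toy.loc[misplaced, "duration"] = toy.loc[misplaced, "rating"]
toy.loc[misplaced, "rating"] = None

print(toy)
```

This keeps the original duration values instead of retyping them, and it leaves the rating empty until a value is chosen for it.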

In [43]:
data['director'] = data['director'].fillna('Unknown') # filling the null values with 'Unknown'
data['country'] = data['country'].fillna('Unknown') # filling the null values with 'Unknown'
data['cast'] = data['cast'].fillna('Unknown') # filling the null values with 'Unknown'

In [46]:
data['rating'] = data['rating'].fillna('Not Rated') # filling the remaining null ratings with 'Not Rated'
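
The four separate `fillna` calls can also be written as a single call that takes a column-to-value mapping. This is a minimal sketch on a hypothetical toy frame; the placeholder strings are the ones used above.

```python
import pandas as pd

# Hypothetical toy frame with gaps in the same four columns
toy = pd.DataFrame({
    "director": ["X", None],
    "cast": [None, "Y"],
    "country": [None, None],
    "rating": ["PG", None],
})

# One placeholder per column, applied in a single call
fill_values = {
    "director": "Unknown",
    "cast": "Unknown",
    "country": "Unknown",
    "rating": "Not Rated",
}
toy = toy.fillna(value=fill_values)

print(toy)
```

Keeping the placeholders in one dictionary makes it easier to see, and later change, how each column is treated.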

In [48]:
data[data['date_added'].isnull()]

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
6066,s6067,TV Show,A Young Doctor's Notebook and Other Stories,Unknown,"Daniel Radcliffe, Jon Hamm, Adam Godley, Chris...",United Kingdom,,2013,TV-MA,2 Seasons,"British TV Shows, TV Comedies, TV Dramas","Set during the Russian Revolution, this comic ..."
6174,s6175,TV Show,Anthony Bourdain: Parts Unknown,Unknown,Anthony Bourdain,United States,,2018,TV-PG,5 Seasons,Docuseries,This CNN original series has chef Anthony Bour...
6795,s6796,TV Show,Frasier,Unknown,"Kelsey Grammer, Jane Leeves, David Hyde Pierce...",United States,,2003,TV-PG,11 Seasons,"Classic & Cult TV, TV Comedies",Frasier Crane is a snooty but lovable Seattle ...
6806,s6807,TV Show,Friends,Unknown,"Jennifer Aniston, Courteney Cox, Lisa Kudrow, ...",United States,,2003,TV-14,10 Seasons,"Classic & Cult TV, TV Comedies",This hit sitcom follows the merry misadventure...
6901,s6902,TV Show,Gunslinger Girl,Unknown,"Yuuka Nanri, Kanako Mitsuhashi, Eri Sendai, Am...",Japan,,2008,TV-14,2 Seasons,"Anime Series, Crime TV Shows","On the surface, the Social Welfare Agency appe..."
7196,s7197,TV Show,Kikoriki,Unknown,Igor Dmitriev,Unknown,,2010,TV-Y,2 Seasons,Kids' TV,A wacky rabbit and his gang of animal pals hav...
7254,s7255,TV Show,La Familia P. Luche,Unknown,"Eugenio Derbez, Consuelo Duval, Luis Manuel Áv...",United States,,2012,TV-14,3 Seasons,"International TV Shows, Spanish-Language TV Sh...","This irreverent sitcom featues Ludovico, Feder..."
7406,s7407,TV Show,Maron,Unknown,"Marc Maron, Judd Hirsch, Josh Brener, Nora Zeh...",United States,,2016,TV-MA,4 Seasons,TV Comedies,"Marc Maron stars as Marc Maron, who interviews..."
7847,s7848,TV Show,Red vs. Blue,Unknown,"Burnie Burns, Jason Saldaña, Gustavo Sorola, G...",United States,,2015,NR,13 Seasons,"TV Action & Adventure, TV Comedies, TV Sci-Fi ...","This parody of first-person shooter games, mil..."
8182,s8183,TV Show,The Adventures of Figaro Pho,Unknown,"Luke Jurevicius, Craig Behenna, Charlotte Haml...",Australia,,2015,TV-Y7,2 Seasons,"Kids' TV, TV Comedies","Imagine your worst fears, then multiply them: ..."


In [54]:
# Parse strings such as "September 25, 2021"; missing or unparseable values become NaT
data['date_added'] = pd.to_datetime(data['date_added'].str.strip(), format="%B %d, %Y", errors='coerce')

In [56]:
data['date_added'].dtype

dtype('<M8[ns]')
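
To illustrate what `.str.strip()` and `errors='coerce'` contribute to the conversion, here is a small sketch on a hypothetical series (the values are made up). Stripping removes stray surrounding whitespace that might otherwise fail to match the format, and coercion turns anything that still cannot be parsed into `NaT` instead of raising an error.

```python
import pandas as pd

# Hypothetical raw values: a clean date, a date with a leading space,
# a missing value, and a string that is not a date
raw = pd.Series(["September 25, 2021", " August 4, 2017", None, "not a date"])

parsed = pd.to_datetime(raw.str.strip(), format="%B %d, %Y", errors="coerce")

print(parsed)
print(parsed.isna().sum(), "values could not be parsed")
```

Comparing the null count before and after the conversion is a quick check that `errors='coerce'` did not silently discard valid dates.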

In [58]:
data.isnull().sum() # re-checking null values; only the 10 rows without date_added should remain

show_id          0
type             0
title            0
director         0
cast             0
country          0
date_added      10
release_year     0
rating           0
duration         0
listed_in        0
description      0
dtype: int64
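
With `date_added` parsed, the feature extraction step from the plan (month/year added, duration in minutes) could be sketched as follows. The toy frame and the new column names `year_added`, `month_added`, `duration_min`, and `seasons` are hypothetical choices for illustration, not part of the dataset.

```python
import pandas as pd

# Hypothetical toy rows, including one with a missing date_added
toy = pd.DataFrame({
    "type": ["Movie", "TV Show", "TV Show"],
    "date_added": pd.to_datetime(["2021-09-25", "2021-09-24", None]),
    "duration": ["90 min", "2 Seasons", "1 Season"],
})

# Nullable Int64 keeps rows with a missing date as <NA> instead of floats
toy["year_added"] = toy["date_added"].dt.year.astype("Int64")
toy["month_added"] = toy["date_added"].dt.month.astype("Int64")

# Pull the leading number out of strings like "90 min" or "2 Seasons"
digits = toy["duration"].str.extract(r"(\d+)", expand=False)
duration_num = pd.to_numeric(digits).astype("Int64")

# Minutes only make sense for movies, season counts only for TV shows
is_movie = toy["type"] == "Movie"
toy["duration_min"] = duration_num.where(is_movie)
toy["seasons"] = duration_num.where(~is_movie)

print(toy)
```

Splitting the number into two columns avoids mixing minutes and seasons in one numeric field, which would distort any average or histogram of duration.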