![Movie popcorn on red background](redpopcorn.jpg)

**Netflix**! What started in 1997 as a DVD rental service has since exploded into one of the largest entertainment and media companies.

Given the large number of movies and series available on the platform, it is a perfect opportunity to flex your exploratory data analysis skills and dive into the entertainment industry. Your friend has also been brushing up on their Python skills and has taken a first crack at a CSV file containing Netflix data. They believe that the average duration of movies has been declining. Using your friend's initial research, you'll delve into the Netflix data to see if you can determine whether movie lengths are actually getting shorter and explain some of the contributing factors, if any.

You have been supplied with the dataset `netflix_data.csv`, along with the following table detailing the column names and descriptions. This data does contain null values and some outliers, but handling these is out of scope for the project. Feel free to experiment after submitting!

## The data
### **netflix_data.csv**
| Column | Description |
|--------|-------------|
| `show_id` | The ID of the show |
| `type` | Type of show |
| `title` | Title of the show |
| `director` | Director of the show |
| `cast` | Cast of the show |
| `country` | Country of origin |
| `date_added` | Date added to Netflix |
| `release_year` | Year the show was originally released |
| `duration` | Duration of the show in minutes |
| `description` | Description of the show |
| `genre` | Show genre |

In [1]:
# Importing pandas and matplotlib
import pandas as pd
import matplotlib.pyplot as plt


In [2]:
# Load the Netflix dataset into a DataFrame
df = pd.read_csv('netflix_data.csv')

In [3]:
df.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,duration,description,genre
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",2020,4,In a future where the elite inhabit an island ...,International TV
1,s2,Movie,7:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",2016,93,After a devastating earthquake hits Mexico Cit...,Dramas
2,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",2011,78,"When an army recruit is found dead, his fellow...",Horror Movies
3,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",2009,80,"In a postapocalyptic world, rag-doll robots hi...",Action
4,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",2008,123,A brilliant group of students become card-coun...,Dramas


In [4]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 7787 entries, 0 to 7786
Data columns (total 11 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       7787 non-null   object
 1   type          7787 non-null   object
 2   title         7787 non-null   object
 3   director      5398 non-null   object
 4   cast          7069 non-null   object
 5   country       7280 non-null   object
 6   date_added    7777 non-null   object
 7   release_year  7787 non-null   int64 
 8   duration      7787 non-null   int64 
 9   description   7787 non-null   object
 10  genre         7787 non-null   object
dtypes: int64(2), object(9)
memory usage: 669.3+ KB


In [5]:
# Most frequent duration across all titles
# Caveat: df still contains TV shows, so this is not restricted to movies
most_frequent_movie_duration = df['duration'].mode()[0]

print(f"Most_frequent_movie_duration: {most_frequent_movie_duration}")

Most_frequent_movie_duration: 1
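
The mode of 1 is probably driven by TV shows rather than movies: in the rows previewed in this notebook, every `TV Show` has a single-digit `duration` (which looks more like a season count than minutes), while the movies run for tens of minutes or more. Below is a minimal sketch of restricting the calculation to movies. It uses a small hand-built sample copied from rows shown in this notebook, not the full dataset, and the names `sample`, `movies`, and `movie_duration_mode` are illustrative assumptions.

```python
import pandas as pd

# Hand-built sample of rows that appear in this notebook's outputs.
# In the notebook itself you would use the full `df` instead.
sample = pd.DataFrame({
    "show_id": ["s1", "s27", "s30", "s3335", "s2", "s3", "s2026", "s3718", "s31"],
    "type": ["TV Show", "TV Show", "TV Show", "TV Show",
             "Movie", "Movie", "Movie", "Movie", "Movie"],
    "duration": [4, 1, 1, 1, 93, 78, 88, 88, 90],
})

# Mode over every title: the TV-show values dominate in this sample.
overall_mode = sample["duration"].mode()[0]

# Keep only rows labelled as movies before looking at durations.
movies = sample[sample["type"] == "Movie"]

# mode() returns a Series because ties are possible; [0] takes the first value.
movie_duration_mode = movies["duration"].mode()[0]

print(f"Mode over all titles (sample): {overall_mode}")
print(f"Mode over movies only (sample): {movie_duration_mode}")
```

The same filter, `df[df['type'] == 'Movie']`, could be applied to the full DataFrame before any of the duration calculations in the cells that follow.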


In [6]:
# Titles released in 1990
df_1990 = df[df['release_year'] == 1990]
df_1990.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,duration,description,genre
343,s344,Movie,Agneepath,Mukul Anand,"Amitabh Bachchan, Mithun Chakraborty, Danny De...",India,"April 1, 2020",1990,174,A boy grows up to become a gangster in pursuit...,Dramas
1756,s1757,Movie,Dil,Indra Kumar,"Aamir Khan, Madhuri Dixit, Saeed Jaffrey, Deve...",India,"October 12, 2020",1990,165,A miser’s scheme to set his son up with a mill...,Comedies
2025,s2026,Movie,"Escape from the ""Liberty"" Cinema",Wojciech Marczewski,"Janusz Gajos, Zbigniew Zamachowski, Teresa Mar...",Poland,"October 1, 2019",1990,88,Artistic rebellion ignites at the movies when ...,Comedies
2393,s2394,Movie,Ghayal,Rajkumar Santoshi,"Sunny Deol, Meenakshi Sheshadri, Amrish Puri, ...",India,"December 31, 2019",1990,163,"Framed for his older brother's murder, a boxer...",Action
2493,s2494,Movie,GoodFellas,Martin Scorsese,"Robert De Niro, Ray Liotta, Joe Pesci, Lorrain...",United States,"January 1, 2021",1990,145,Former mobster Henry Hill recounts his colorfu...,Classic Movies


In [7]:
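# Most frequent duration among titles released in 1990 (TV shows are not filtered out)
# A separate variable name avoids overwriting the overall result from the earlier cell
most_frequent_duration_1990 = df_1990['duration'].mode()[0]
print(f"Most_frequent_movie_duration_in_1990: {most_frequent_duration_1990}")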

Most_frequent_movie_duration_in_1990: 88


In [8]:
# Titles released in 1990 with a duration of at most 90
# Caveat: this also matches TV shows, as the output shows
df_short_movie = df_1990[df_1990['duration'] <= 90]
df_short_movie.head(20)

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,duration,description,genre
2025,s2026,Movie,"Escape from the ""Liberty"" Cinema",Wojciech Marczewski,"Janusz Gajos, Zbigniew Zamachowski, Teresa Mar...",Poland,"October 1, 2019",1990,88,Artistic rebellion ignites at the movies when ...,Comedies
3334,s3335,TV Show,Ken Burns: The Civil War,Ken Burns,"Sam Waterston, Julie Harris, Jason Robards, Mo...",United States,"February 22, 2017",1990,1,Ken Burns's documentary depicts the action of ...,Docuseries
3717,s3718,Movie,"Look Out, Officer",Sze Yu Lau,"Stephen Chow, Bill Tung, Stanley Sui-Fan Fung,...",Hong Kong,"August 16, 2018",1990,88,An officer killed on the job returns to Earth ...,Action
4776,s4777,Movie,Paris Is Burning,Jennie Livingston,,United States,"February 1, 2017",1990,77,This Sundance prize-winning documentary is an ...,Classic Movies
4811,s4812,TV Show,Pee-wee's Playhouse,,Paul Reubens,United States,"December 18, 2014",1990,5,Pee-wee Herman brings his stage show to the ma...,Classic
7089,s7090,Movie,Tim Allen: Men Are Pigs,Ellen Brown,Tim Allen,United States,"December 31, 2018",1990,30,Standup comedian Tim Allen delivers a set dedi...,Stand-Up
7280,s7281,TV Show,Twin Peaks,,"Kyle MacLachlan, Michael Ontkean, Mädchen Amic...",United States,"July 1, 2017",1990,2,"""Who killed Laura Palmer?"" is the question on ...",Classic


In [9]:
# Count the rows in the 1990 subset (TV shows included)
number_of_short_movie = len(df_short_movie)

print(f"Number_of_short_movie_in_1990: {number_of_short_movie}")

Number_of_short_movie_in_1990: 7
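
Of the seven rows listed for 1990, three are labelled `TV Show`, so the count of 7 overstates the number of short movies. One way to guard against this is to fold the type check into a small helper. The function `count_short_movies` and its `max_minutes` parameter are hypothetical names introduced here for illustration, and the 90-minute threshold simply mirrors the `<= 90` filter used in this notebook.

```python
import pandas as pd

def count_short_movies(data: pd.DataFrame, year: int, max_minutes: int = 90) -> int:
    """Count rows of type 'Movie' released in `year` that last at most `max_minutes`."""
    mask = (
        (data["type"] == "Movie")
        & (data["release_year"] == year)
        & (data["duration"] <= max_minutes)
    )
    return int(mask.sum())

# The seven 1990 rows with duration <= 90 that are shown in this notebook.
rows_1990 = pd.DataFrame({
    "show_id": ["s2026", "s3335", "s3718", "s4777", "s4812", "s7090", "s7281"],
    "type": ["Movie", "TV Show", "Movie", "Movie", "TV Show", "Movie", "TV Show"],
    "release_year": [1990] * 7,
    "duration": [88, 1, 88, 77, 5, 30, 2],
})

print(count_short_movies(rows_1990, 1990))
```

On the full dataset the call would be `count_short_movies(df, 1990)`, and the same helper can be reused for any other year.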


In [10]:
# Titles released in 2020 with a duration of at most 90
# Caveat: TV shows are included, as the output shows
df_2020 = df[df['release_year'] == 2020]
df_short_movies = df_2020[df_2020['duration'] <= 90]
df_short_movies.head()

Unnamed: 0,show_id,type,title,director,cast,country,date_added,release_year,duration,description,genre
0,s1,TV Show,3%,,"João Miguel, Bianca Comparato, Michel Gomes, R...",Brazil,"August 14, 2020",2020,4,In a future where the elite inhabit an island ...,International TV
24,s25,TV Show,​SAINT SEIYA: Knights of the Zodiac,,"Bryson Baugus, Emily Neves, Blake Shepard, Pat...",Japan,"January 23, 2020",2020,2,Seiya and the Knights of the Zodiac rise again...,Anime Series
26,s27,TV Show,(Un)Well,,,United States,"August 12, 2020",2020,1,This docuseries takes a deep dive into the luc...,Reality TV
29,s30,TV Show,#blackAF,,"Kenya Barris, Rashida Jones, Iman Benson, Genn...",United States,"April 17, 2020",2020,1,Kenya Barris and his family navigate relations...,TV Comedies
30,s31,Movie,#cats_the_mewvie,Michael Margolis,,Canada,"February 5, 2020",2020,90,This pawesome documentary explores how our fel...,Documentaries


In [11]:
# Count the rows in the 2020 subset (TV shows included)
number_of_short_movies = len(df_short_movies)

print(f"Number_of_short_movie_in_2020: {number_of_short_movies}")

Number_of_short_movie_in_2020: 631
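
Comparing the raw counts for 1990 and 2020 does not by itself show that movies are getting shorter: both counts include TV shows, and the two years may contribute very different numbers of titles to the catalog. A per-year mean duration and a share of short movies are easier to compare. The sketch below is one possible approach, not the project's reference solution. `yearly_movie_summary` and its output column names are assumptions, and the toy data is copied from rows shown in this notebook, so its numbers are not results for the real dataset.

```python
import pandas as pd

def yearly_movie_summary(data: pd.DataFrame, max_minutes: int = 90) -> pd.DataFrame:
    """Per release year: number of movies, mean duration, and share of short movies."""
    movies = data[data["type"] == "Movie"]
    # Flag movies at or below the threshold; the mean of a boolean column is a share.
    flagged = movies.assign(is_short=movies["duration"] <= max_minutes)
    return flagged.groupby("release_year").agg(
        n_movies=("duration", "size"),
        mean_duration=("duration", "mean"),
        short_share=("is_short", "mean"),
    )

# Toy data: rows for 1990 and 2020 that appear in this notebook's outputs.
toy = pd.DataFrame({
    "type": ["Movie"] * 8 + ["TV Show"] * 3 + ["Movie"] + ["TV Show"] * 4,
    "release_year": [1990] * 11 + [2020] * 5,
    "duration": [174, 165, 88, 163, 145, 88, 77, 30, 1, 5, 2, 90, 4, 2, 1, 1],
})

summary = yearly_movie_summary(toy)
print(summary)
```

Run against the full `df`, the `mean_duration` column could then be plotted against `release_year` with the `matplotlib.pyplot` module imported at the top of the notebook.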
