# Investigating Netflix Movies

## Introduction

Netflix started in 1997 as a DVD rental service and has grown into a major entertainment company. This workbook applies exploratory data analysis to Netflix data, focusing on films from the 1990s. Working for a production company that specializes in nostalgic styles, you’ll look for trends in that decade's movies by answering two questions:

- What was the most frequent movie duration in the 1990s? Save an approximate answer as an integer called `duration` (use 1990 as the decade's start year).

- A movie is considered short if it runs for less than 90 minutes. Count the number of short action movies released in the 1990s and save this integer as `short_movie_count`.
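
As a rough sketch of what the two questions ask for, the example below computes both values on a small made-up DataFrame. The `toy` frame and its rows are hypothetical and are not taken from `netflix_data.csv`; the column names follow the dataset. It also assumes that "action movie" means the `genre` column equals `"Action"`.

```python
import pandas as pd

# Toy data for illustration only; these rows are NOT from netflix_data.csv.
toy = pd.DataFrame({
    "type": ["Movie", "Movie", "Movie", "Movie", "TV Show"],
    "release_year": [1991, 1995, 1999, 2005, 1994],
    "duration": [100, 85, 100, 85, 1],
    "genre": ["Dramas", "Action", "Action", "Action", "Dramas"],
})

# Keep only movies released from 1990 through 1999 (between() is inclusive).
is_90s_movie = (toy["type"] == "Movie") & toy["release_year"].between(1990, 1999)
movies_90s = toy[is_90s_movie]

# mode() returns a Series because ties are possible; take the first value.
duration = int(movies_90s["duration"].mode().iloc[0])

# Short action movies: genre is "Action" (assumed) and runtime under 90 minutes.
is_short_action = (movies_90s["genre"] == "Action") & (movies_90s["duration"] < 90)
short_movie_count = int(is_short_action.sum())

print(duration, short_movie_count)
```

Using `mode()` rather than reading the value off a histogram gives an exact most-frequent duration; a histogram is the more natural choice when only an approximate answer is needed.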


In [1]:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("Dataset/netflix_data.csv")

In [2]:
df.head()

| | show_id | type | title | director | cast | country | date_added | release_year | duration | description | genre |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | s2 | Movie | 7:19 | Jorge Michel Grau | Demián Bichir, Héctor Bonilla, Oscar Serrano, ... | Mexico | December 23, 2016 | 2016 | 93 | After a devastating earthquake hits Mexico Cit... | Dramas |
| 1 | s3 | Movie | 23:59 | Gilbert Chan | Tedd Chan, Stella Chung, Henley Hii, Lawrence ... | Singapore | December 20, 2018 | 2011 | 78 | When an army recruit is found dead, his fellow... | Horror Movies |
| 2 | s4 | Movie | 9 | Shane Acker | Elijah Wood, John C. Reilly, Jennifer Connelly... | United States | November 16, 2017 | 2009 | 80 | In a postapocalyptic world, rag-doll robots hi... | Action |
| 3 | s5 | Movie | 21 | Robert Luketic | Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar... | United States | January 1, 2020 | 2008 | 123 | A brilliant group of students become card-coun... | Dramas |
| 4 | s6 | TV Show | 46 | Serdar Akar | Erdal Beşikçioğlu, Yasemin Allen, Melis Birkan... | Turkey | July 1, 2017 | 2016 | 1 | A genetics professor experiments with a treatm... | International TV |


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 4812 entries, 0 to 4811
Data columns (total 11 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       4812 non-null   object
 1   type          4812 non-null   object
 2   title         4812 non-null   object
 3   director      4812 non-null   object
 4   cast          4812 non-null   object
 5   country       4812 non-null   object
 6   date_added    4812 non-null   object
 7   release_year  4812 non-null   int64 
 8   duration      4812 non-null   int64 
 9   description   4812 non-null   object
 10  genre         4812 non-null   object
dtypes: int64(2), object(9)
memory usage: 413.7+ KB


## Filter Data

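First, we filter the data to rows whose `release_year` falls between 1990 and 1999, inclusive. The filter does not check the `type` column, so any TV shows released in that decade stay in the result.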

In [10]:
release_90 = df[(df["release_year"] >= 1990) & (df["release_year"] < 2000)]
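
Since `df.head()` shows a `TV Show` row in the data, a stricter variant of this filter would also require `type == "Movie"`. The sketch below compares the two masks on a made-up frame. The `toy` rows and the names `in_90s`, `is_movie`, `year_only`, and `movies_only` are hypothetical, and the example makes no claim about how many rows of the real dataset the extra condition would remove.

```python
import pandas as pd

# Toy rows for illustration only; these are NOT from netflix_data.csv.
toy = pd.DataFrame({
    "type": ["Movie", "TV Show", "Movie", "Movie"],
    "title": ["A", "B", "C", "D"],
    "release_year": [1993, 1996, 1999, 2000],
})

# Same year condition as the cell above: 1990 <= release_year < 2000.
in_90s = (toy["release_year"] >= 1990) & (toy["release_year"] < 2000)
# Extra condition: keep only rows labelled as movies.
is_movie = toy["type"] == "Movie"

year_only = toy[in_90s]               # keeps the TV show released in 1996
movies_only = toy[in_90s & is_movie]  # drops it

print(len(year_only), len(movies_only))
```

Each condition needs its own parentheses when combined inline, because `&` binds more tightly than comparison operators in Python.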

In [22]:
release_90["release_year"].unique()

array([1997, 1993, 1998, 1996, 1990, 1999, 1991, 1994, 1995, 1992])

In [20]:
release_counts = release_90["release_year"].value_counts()

In [21]:
release_counts

release_year
1997    26
1998    26
1999    26
1993    16
1995    16
1992    16
1996    15
1990    15
1991    14
1994    14
Name: count, dtype: int64
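
`value_counts()` orders its result by count, which is why the years above appear out of sequence. For a chronological view, one option is to sort the result by its index. The sketch below shows this on a short made-up Series; the `years` values are hypothetical and are not the dataset's.

```python
import pandas as pd

# Made-up release years for illustration only.
years = pd.Series([1997, 1990, 1997, 1993, 1990, 1997], name="release_year")

by_count = years.value_counts()   # most frequent year first
by_year = by_count.sort_index()   # reordered chronologically by year

print(by_year)
```

The counts still sum to the number of rows, so `by_year.sum()` can double as a quick check against the length of the filtered DataFrame.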

In [27]:
print(f"There are {release_90.shape[0]} movies released in the 1990s.")

There are 184 movies released in the 1990s.


The result is a new DataFrame, ***release_90***, with 184 rows covering release years 1990 through 1999.