# Netflix 90s Movies EDA

This notebook stores and explains my Python script for DataCamp's project exploring Netflix's database of 1990s movies.

In [19]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os 

In [20]:
# Read in the Netflix CSV as a DataFrame
netflix_df = pd.read_csv("netflix_data.csv")

## EDA parameters

Explore the Netflix dataset to learn more about movies from the 1990s, and find the following:

* The most frequent movie duration, saved as an integer variable named 'duration'
* The number of short Action movies (under 90 minutes) released in the 1990s, saved as 'short_movie_count'

In [21]:
# Note: .mean() returns the average duration; the most frequent value would come from .mode()
duration = int(netflix_df['duration'].mean())
print(duration)

117
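
The goal stated earlier asks for the most frequent duration, which is the mode rather than the mean. A minimal sketch of the difference follows, using a hypothetical toy Series (`toy_durations` is made up for illustration and is not drawn from netflix_data.csv):

```python
import pandas as pd

# Hypothetical toy durations in minutes (assumption: not real Netflix data)
toy_durations = pd.Series([90, 100, 100, 100, 150, 180])

# Arithmetic mean, truncated to an integer
mean_duration = int(toy_durations.mean())

# Most frequent value: mode() returns a Series because ties are possible,
# sorted ascending, so .iloc[0] picks the smallest of the tied values
mode_duration = int(toy_durations.mode().iloc[0])

print("Mean:", mean_duration)
print("Mode:", mode_duration)
```

On skewed data the two can differ noticeably, so it may be worth running the same comparison on netflix_df['duration'] before settling on a value for 'duration'.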


In [22]:
netflix_df.dtypes # Check structure and data types

show_id         object
type            object
title           object
director        object
cast            object
country         object
date_added      object
release_year     int64
duration         int64
description     object
genre           object
dtype: object
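
The dtype listing matters because the later filters compare 'release_year' and 'duration' numerically. One way to make that check explicit is sketched below on a hypothetical toy frame (`toy_df` and its values are assumptions for illustration only):

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical toy frame mirroring three of the columns above
toy_df = pd.DataFrame({
    "release_year": [1995, 1999],
    "duration": [88, 120],
    "genre": ["Action", "Drama"],
})

# Columns that the later comparisons (<, >) rely on being numeric
expected_numeric = ["release_year", "duration"]
numeric_cols = [c for c in expected_numeric if is_numeric_dtype(toy_df[c])]

# Count missing values per column before filtering
missing_counts = toy_df.isna().sum()

print(numeric_cols)
print(missing_counts)
```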

In [23]:
# Filter for Action movies released in the 1990s.
# Note: `> 1990` excludes films released in 1990 itself; `>= 1990` would cover the full decade.
nineties_action = netflix_df[
    (netflix_df['release_year'] > 1990)
    & (netflix_df['release_year'] < 2000)
    & (netflix_df['genre'] == 'Action')
    & (netflix_df['type'] == 'movie')
]
print(nineties_action)
nineties_action.head()

    show_id   type      title           director  \
2     s0002  movie    Movie 2  Quentin Tarantino   
8     s0008  movie    Movie 8   Steven Spielberg   
11    s0011  movie   Movie 11   Steven Spielberg   
18    s0018  movie   Movie 18   Steven Spielberg   
24    s0024  movie   Movie 24        Nora Ephron   
27    s0027  movie   Movie 27   Steven Spielberg   
31    s0031  movie   Movie 31  Quentin Tarantino   
36    s0036  movie   Movie 36   Steven Spielberg   
38    s0038  movie   Movie 38   Steven Spielberg   
41    s0041  movie   Movie 41          Spike Lee   
55    s0055  movie   Movie 55   Steven Spielberg   
70    s0070  movie   Movie 70  Quentin Tarantino   
71    s0071  movie   Movie 71   Steven Spielberg   
88    s0088  movie   Movie 88  Quentin Tarantino   
89    s0089  movie   Movie 89   Steven Spielberg   
93    s0093  movie   Movie 93      James Cameron   
96    s0096  movie   Movie 96        Nora Ephron   
97    s0097  movie   Movie 97      James Cameron   
99    s0099 

| | show_id | type | title | director | cast | country | date_added | release_year | duration | description | genre |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 2 | s0002 | movie | Movie 2 | Quentin Tarantino | Jodie Foster, Robin Williams, Morgan Freeman | France | July 14, 2000 | 1999 | 137 | A compelling story of love, loss, and redemption. | Action |
| 8 | s0008 | movie | Movie 8 | Steven Spielberg | Jodie Foster, Tom Hanks, Denzel Washington | USA | December 20, 2005 | 1998 | 129 | A compelling story of love, loss, and redemption. | Action |
| 11 | s0011 | movie | Movie 11 | Steven Spielberg | Tom Hanks, Julia Roberts, Morgan Freeman | USA | July 14, 2000 | 1999 | 100 | A compelling story of love, loss, and redemption. | Action |
| 18 | s0018 | movie | Movie 18 | Steven Spielberg | Tom Hanks, Denzel Washington, Jodie Foster | Australia | July 14, 2000 | 1995 | 139 | A compelling story of love, loss, and redemption. | Action |
| 24 | s0024 | movie | Movie 24 | Nora Ephron | Denzel Washington, Will Smith, Morgan Freeman | USA | July 14, 2000 | 1998 | 130 | A compelling story of love, loss, and redemption. | Action |
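
The decade filter can also be written with named boolean masks and `Series.between`, which is inclusive at both ends by default and so keeps 1990 in range. This is a sketch on a hypothetical toy frame (`toy_df` and its rows are assumptions, not rows from netflix_data.csv):

```python
import pandas as pd

# Hypothetical toy data covering the boundary years
toy_df = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "release_year": [1989, 1990, 1995, 1999, 2000],
    "genre": ["Action", "Action", "Drama", "Action", "Action"],
})

# between() includes both endpoints, so 1990 and 1999 are both kept
in_nineties = toy_df["release_year"].between(1990, 1999)
is_action = toy_df["genre"] == "Action"

# Combine the masks with & to keep rows that satisfy both conditions
toy_nineties_action = toy_df[in_nineties & is_action]

print(toy_nineties_action["title"].tolist())
```

Naming each mask keeps the combined condition short and makes the boundary choice easy to see at a glance.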


In [24]:
# Count the 90s Action movies that run for less than 90 minutes
short_movie_count = 0

for _, row in nineties_action.iterrows():
    if row['duration'] < 90:
        short_movie_count += 1

print("Duration:", duration)
print("Count of short movies:", short_movie_count)

Duration: 117
Count of short movies: 6
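
The same count can be obtained without an explicit loop by summing a boolean mask, which is usually the more idiomatic pandas form. Below is a sketch comparing both approaches on a hypothetical toy frame (`toy_action` is an assumption for illustration):

```python
import pandas as pd

# Hypothetical toy durations in minutes
toy_action = pd.DataFrame({"duration": [85, 90, 120, 60, 89]})

# Loop version, mirroring the cell above
loop_count = 0
for _, row in toy_action.iterrows():
    if row["duration"] < 90:
        loop_count += 1

# Vectorized version: True counts as 1 when summed
vectorized_count = int((toy_action["duration"] < 90).sum())

print(loop_count, vectorized_count)
```

Both versions use a strict `< 90` comparison, so a movie of exactly 90 minutes is not counted as short.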


In [25]:
print(nineties_action['title'])

print(netflix_df['genre'].unique())
print(netflix_df['type'].unique())
netflix_df.info()

2        Movie 2
8        Movie 8
11      Movie 11
18      Movie 18
24      Movie 24
27      Movie 27
31      Movie 31
36      Movie 36
38      Movie 38
41      Movie 41
55      Movie 55
70      Movie 70
71      Movie 71
88      Movie 88
89      Movie 89
93      Movie 93
96      Movie 96
97      Movie 97
99      Movie 99
104    Movie 104
107    Movie 107
134    Movie 134
144    Movie 144
166    Movie 166
179    Movie 179
182    Movie 182
186    Movie 186
193    Movie 193
204    Movie 204
208    Movie 208
232    Movie 232
240    Movie 240
245    Movie 245
283    Movie 283
285    Movie 285
286    Movie 286
288    Movie 288
289    Movie 289
294    Movie 294
297    Movie 297
Name: title, dtype: object
['Romance' 'Thriller' 'Action' 'Adventure' 'Comedy' 'Drama']
['movie']
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 300 entries, 0 to 299
Data columns (total 11 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       300 non-null  