# Investigating Netflix Movies
I work for a production company that specializes in nostalgic styles, and I want to research movies released in the 1990s. I'll perform exploratory data analysis on Netflix data to better understand movies from this decade.

### Visualize
- Build a word cloud from the movie and TV show descriptions. Make sure to remove stop words!
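
The stop word step can be sketched with the standard library alone. This is a minimal illustration, not the notebook's implementation: the sample descriptions, the tiny `STOP_WORDS` set, and the `word_frequencies` helper are all hypothetical. In the notebook the texts would come from the `description` column.

```python
import re
from collections import Counter

# Hypothetical sample descriptions; in the notebook these would come from nf["description"].
descriptions = [
    "A woman adjusting to life after a loss contends with a feisty bird.",
    "A talented batch of amateur bakers face off in a baking contest.",
    "After a loss, a young baker finds a new life in the city.",
]

# A deliberately small stop word list (an assumption); a real analysis would use a fuller one.
STOP_WORDS = {"a", "an", "the", "of", "in", "to", "with", "after", "and", "off"}


def word_frequencies(texts, stop_words):
    """Lowercase each text, tokenize on letters/apostrophes, drop stop words, and count."""
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(token for token in tokens if token not in stop_words)
    return counts


freqs = word_frequencies(descriptions, STOP_WORDS)
print(freqs.most_common(3))
```

A frequency mapping like `freqs` is the usual input to a word cloud renderer. For example, the third-party `wordcloud` package can draw one from such a dictionary, assuming that package is installed.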

### Analyze
- Has Netflix invested more in certain genres (see listed_in) in recent years? What about certain age groups (see rating)?
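
One possible way to approach the genre question is to count titles per genre for each year they were added. The sketch below uses a small hypothetical `sample` frame that mimics the `date_added` and `listed_in` columns. The `year_added` and `genre` column names are introduced here for illustration only.

```python
import pandas as pd

# Hypothetical rows mimicking the date_added and listed_in columns.
sample = pd.DataFrame({
    "date_added": ["September 25, 2021", "September 24, 2021", "March 1, 2019", "July 4, 2019"],
    "listed_in": ["Documentaries", "Comedies, Dramas", "Dramas", "Dramas, Thrillers"],
})

# Parse the date strings and keep only the year each title was added.
sample["year_added"] = pd.to_datetime(
    sample["date_added"].str.strip(), format="%B %d, %Y"
).dt.year

# One row per (title, genre) pair, then count titles per year and genre.
genres = sample.assign(genre=sample["listed_in"].str.split(", ")).explode("genre")
genre_by_year = genres.groupby(["year_added", "genre"]).size().unstack(fill_value=0)
print(genre_by_year)
```

The same grouping should work for age groups by replacing the exploded genre with the `rating` column, which holds a single value per title and so needs no splitting.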

In [68]:
# Import Pandas and Matplotlib
import pandas as pd
import matplotlib.pyplot as plt

# Import data
nf = pd.read_csv("data/netflix_titles.csv", index_col = 0)
nf.head(10)

show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
s1,Movie,Dick Johnson Is Dead,Kirsten Johnson,,United States,"September 25, 2021",2020,PG-13,90 min,Documentaries,"As her father nears the end of his life, filmm..."
s2,TV Show,Blood & Water,,"Ama Qamata, Khosi Ngema, Gail Mabalane, Thaban...",South Africa,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, TV Dramas, TV Mysteries","After crossing paths at a party, a Cape Town t..."
s3,TV Show,Ganglands,Julien Leclercq,"Sami Bouajila, Tracy Gotoas, Samuel Jouy, Nabi...",,"September 24, 2021",2021,TV-MA,1 Season,"Crime TV Shows, International TV Shows, TV Act...",To protect his family from a powerful drug lor...
s4,TV Show,Jailbirds New Orleans,,,,"September 24, 2021",2021,TV-MA,1 Season,"Docuseries, Reality TV","Feuds, flirtations and toilet talk go down amo..."
s5,TV Show,Kota Factory,,"Mayur More, Jitendra Kumar, Ranjan Raj, Alam K...",India,"September 24, 2021",2021,TV-MA,2 Seasons,"International TV Shows, Romantic TV Shows, TV ...",In a city of coaching centers known to train I...
s6,TV Show,Midnight Mass,Mike Flanagan,"Kate Siegel, Zach Gilford, Hamish Linklater, H...",,"September 24, 2021",2021,TV-MA,1 Season,"TV Dramas, TV Horror, TV Mysteries",The arrival of a charismatic young priest brin...
s7,Movie,My Little Pony: A New Generation,"Robert Cullen, José Luis Ucha","Vanessa Hudgens, Kimiko Glenn, James Marsden, ...",,"September 24, 2021",2021,PG,91 min,Children & Family Movies,Equestria's divided. But a bright-eyed hero be...
s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
s9,TV Show,The Great British Baking Show,Andy Devonshire,"Mel Giedroyc, Sue Perkins, Mary Berry, Paul Ho...",United Kingdom,"September 24, 2021",2021,TV-14,9 Seasons,"British TV Shows, Reality TV",A talented batch of amateur bakers face off in...
s10,Movie,The Starling,Theodore Melfi,"Melissa McCarthy, Chris O'Dowd, Kevin Kline, T...",United States,"September 24, 2021",2021,PG-13,104 min,"Comedies, Dramas",A woman adjusting to life after a loss contend...


## Explore
- What was the most frequent movie duration in the 1990s? Save an approximate answer as an integer called 'duration' (use 1990 as the decade's start year).
- Count the number of **short action movies** (duration less than 90 minutes) and save this as an integer 'short_movie_count'.
- How much variety exists in Netflix's offering? Base this on three variables: type, country, and listed_in.
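
For the variety question, one rough measure is the number of distinct values in each of the three columns, after splitting cells that hold several comma-separated entries. The sketch below assumes that measure. The `sample` frame and the `count_distinct` helper are hypothetical.

```python
import pandas as pd

# Hypothetical rows mimicking the type, country, and listed_in columns.
sample = pd.DataFrame({
    "type": ["Movie", "TV Show", "Movie"],
    "country": ["United States, Mexico", None, "India"],
    "listed_in": [
        "Action & Adventure, Dramas",
        "Docuseries, Reality TV",
        "Dramas, International Movies",
    ],
})


def count_distinct(series, sep=", "):
    """Count distinct values in a column whose cells may hold several sep-joined entries."""
    return series.dropna().str.split(sep).explode().str.strip().nunique()


variety = {col: count_distinct(sample[col]) for col in ["type", "country", "listed_in"]}
print(variety)
```

Missing values are dropped before counting, so a title with no listed country does not add a category of its own.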

In [71]:
# Explore

# Subset dataframe to only movies from the 1990s
nf_movies_90s = nf[(nf['type'] == 'Movie') & (nf['release_year'] >= 1990) & (nf['release_year'] <= 1999)]
nf_movies_90s.head(10)


show_id,type,title,director,cast,country,date_added,release_year,rating,duration,listed_in,description
s8,Movie,Sankofa,Haile Gerima,"Kofi Ghanaba, Oyafunmike Ogunlano, Alexandra D...","United States, Ghana, Burkina Faso, United Kin...","September 24, 2021",1993,TV-MA,125 min,"Dramas, Independent Movies, International Movies","On a photo shoot in Ghana, an American model s..."
s23,Movie,Avvai Shanmughi,K.S. Ravikumar,"Kamal Hassan, Meena, Gemini Ganesan, Heera Raj...",,"September 21, 2021",1996,TV-PG,161 min,"Comedies, International Movies",Newly divorced and denied visitation rights wi...
s25,Movie,Jeans,S. Shankar,"Prashanth, Aishwarya Rai Bachchan, Sri Lakshmi...",India,"September 21, 2021",1998,TV-14,166 min,"Comedies, International Movies, Romantic Movies",When the father of the man she loves insists t...
s27,Movie,Minsara Kanavu,Rajiv Menon,"Arvind Swamy, Kajol, Prabhu Deva, Nassar, S.P....",,"September 21, 2021",1997,TV-PG,147 min,"Comedies, International Movies, Music & Musicals",A tangled love triangle ensues when a man fall...
s115,Movie,Anjaam,Rahul Rawail,"Madhuri Dixit, Shah Rukh Khan, Tinnu Anand, Jo...",India,"September 2, 2021",1994,TV-14,143 min,"Dramas, International Movies, Thrillers",A wealthy industrialist’s dangerous obsession ...
s135,Movie,Clear and Present Danger,Phillip Noyce,"Harrison Ford, Willem Dafoe, Anne Archer, Joaq...","United States, Mexico","September 1, 2021",1994,PG-13,142 min,"Action & Adventure, Dramas","When the president's friend is murdered, CIA D..."
s136,Movie,Cliffhanger,Renny Harlin,"Sylvester Stallone, John Lithgow, Michael Rook...","United States, Italy, France, Japan","September 1, 2021",1993,R,113 min,Action & Adventure,Ranger Gabe Walker and his partner are called ...
s145,Movie,House Party,Reginald Hudlin,"Christopher Reid, Christopher Martin, Robin Ha...",United States,"September 1, 2021",1990,R,104 min,"Comedies, Cult Movies","Grounded by his strict father, Kid risks life ..."
s146,Movie,House Party 2,"George Jackson, Doug McHenry","Christopher Reid, Christopher Martin, Martin L...",United States,"September 1, 2021",1991,R,94 min,"Comedies, Cult Movies, Music & Musicals",Kid goes off to college with scholarship money...
s147,Movie,House Party 3,Eric Meza,"Christopher Reid, Christopher Martin, Tisha Ca...",United States,"September 1, 2021",1994,R,94 min,"Comedies, Music & Musicals","After Kid gets engaged, Play plans to throw th..."


In [66]:
# Most frequent movie duration

# Strip the ' min' suffix from the duration strings and convert to integer minutes
nf_movies_90s = nf_movies_90s.copy()  # Independent copy, so adding 'duration_mins' does not write to a slice of nf
nf_movies_90s['duration_mins'] = nf_movies_90s['duration'].str.replace(' min', '', regex=False).astype(int)

# mode() returns a Series (there can be ties), so take the first value as a plain int
duration = int(nf_movies_90s['duration_mins'].mode().iloc[0])
print("The most frequent movie duration is: " + str(duration))

The most frequent movie duration is: 94
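
`Series.mode()` returns a Series rather than a scalar because several values can tie for most frequent. The sketch below uses hypothetical durations with a deliberate tie to show one way to reduce the result to a single integer, as the task asks.

```python
import pandas as pd

# Hypothetical durations with a tie between 94 and 104 minutes.
durations = pd.Series([94, 104, 94, 104, 125], name="duration_mins")

modes = durations.mode()        # Series holding every tied mode, sorted ascending
duration = int(modes.iloc[0])   # take the smallest tied mode as a plain int

print("All modes:", modes.tolist())
print("The most frequent movie duration is: " + str(duration))
```

Taking the first element is an arbitrary tie-break. If ties matter for the analysis, it may be better to inspect the full `modes` Series.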


In [65]:
# Count of short action movies (duration under 90 minutes)
nf_movies_90s = nf_movies_90s.copy()  # Independent copy, so the new column is not written to a slice of nf

# Split the comma-separated genre string into a list of genres per movie
nf_movies_90s['genre_list'] = nf_movies_90s['listed_in'].str.split(', ')

# Keep movies tagged 'Action & Adventure', then count those shorter than 90 minutes
action_movies = nf_movies_90s[nf_movies_90s['genre_list'].apply(lambda genres: 'Action & Adventure' in genres)]
short_movie_count = len(action_movies[action_movies['duration_mins'] < 90])

print("Number of short Action & Adventure movies: " + str(short_movie_count))

Number of short Action & Adventure movies: 10
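
The same count can also be written as two boolean masks combined with `&`, without building a per-row genre list. This is an alternative sketch on a hypothetical `movies` frame, not the notebook's data. Its result is unrelated to the count printed above.

```python
import pandas as pd

# Hypothetical movies with a genre string and a numeric duration in minutes.
movies = pd.DataFrame({
    "listed_in": [
        "Action & Adventure",
        "Action & Adventure, Dramas",
        "Comedies",
        "Action & Adventure",
    ],
    "duration_mins": [85, 142, 80, 89],
})

# Literal substring match on the genre string; na=False treats missing genres as non-matches.
is_action = movies["listed_in"].str.contains("Action & Adventure", regex=False, na=False)
is_short = movies["duration_mins"] < 90

# Summing a boolean mask counts the rows where both conditions hold.
short_movie_count = int((is_action & is_short).sum())
print("Number of short Action & Adventure movies: " + str(short_movie_count))
```

A substring match is simpler but would also match any longer genre label that contains the same text, so the list-based membership test is stricter.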


In [27]:
# Explore



In [None]:
# Visualize

In [None]:
# Analyze