**Netflix**! What started in 1997 as a DVD rental service has since exploded into one of the largest entertainment and media companies.

Given the large number of movies and series available on the platform, it is a perfect opportunity to flex your exploratory data analysis skills and dive into the entertainment industry.

You work for a production company that specializes in nostalgic styles. You want to do some research on movies released in the 1990s. You'll delve into Netflix data and perform exploratory data analysis to better understand this awesome movie decade!

You have been supplied with the dataset `netflix_data.csv`, along with the following table detailing the column names and descriptions. Feel free to experiment further after submitting!

## The data
### **netflix_data.csv**
| Column | Description |
|--------|-------------|
| `show_id` | The ID of the show |
| `type` | Type of show |
| `title` | Title of the show |
| `director` | Director of the show |
| `cast` | Cast of the show |
| `country` | Country of origin |
| `date_added` | Date added to Netflix |
| `release_year` | Year of Netflix release |
| `duration` | Duration of the show in minutes |
| `description` | Description of the show |
| `genre` | Show genre |

Perform exploratory data analysis on the `netflix_data.csv` data to understand more about movies from the 1990s.

What was the most frequent movie duration in the 1990s? Save an approximate answer as an integer called `duration` (use 1990 as the decade's start year).

A movie is considered short if it is less than 90 minutes long. Count the number of short action movies released in the 1990s and save this integer as `short_movie_count`.

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

In [2]:
df = pd.read_csv('netflix_data.csv', index_col = 0)
df.head()

index,show_id,type,title,director,cast,country,date_added,release_year,duration,description,genre
0,s2,Movie,7:19,Jorge Michel Grau,"Demián Bichir, Héctor Bonilla, Oscar Serrano, ...",Mexico,"December 23, 2016",2016,93,After a devastating earthquake hits Mexico Cit...,Dramas
1,s3,Movie,23:59,Gilbert Chan,"Tedd Chan, Stella Chung, Henley Hii, Lawrence ...",Singapore,"December 20, 2018",2011,78,"When an army recruit is found dead, his fellow...",Horror Movies
2,s4,Movie,9,Shane Acker,"Elijah Wood, John C. Reilly, Jennifer Connelly...",United States,"November 16, 2017",2009,80,"In a postapocalyptic world, rag-doll robots hi...",Action
3,s5,Movie,21,Robert Luketic,"Jim Sturgess, Kevin Spacey, Kate Bosworth, Aar...",United States,"January 1, 2020",2008,123,A brilliant group of students become card-coun...,Dramas
4,s6,TV Show,46,Serdar Akar,"Erdal Beşikçioğlu, Yasemin Allen, Melis Birkan...",Turkey,"July 1, 2017",2016,1,A genetics professor experiments with a treatm...,International TV


In [3]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 4812 entries, 0 to 4811
Data columns (total 11 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   show_id       4812 non-null   object
 1   type          4812 non-null   object
 2   title         4812 non-null   object
 3   director      4812 non-null   object
 4   cast          4812 non-null   object
 5   country       4812 non-null   object
 6   date_added    4812 non-null   object
 7   release_year  4812 non-null   int64 
 8   duration      4812 non-null   int64 
 9   description   4812 non-null   object
 10  genre         4812 non-null   object
dtypes: int64(2), object(9)
memory usage: 451.1+ KB


In [4]:
# Unique values
df.nunique()

show_id         4812
type               2
title           4812
director        3615
cast            4690
country           72
date_added      1292
release_year      71
duration         193
description     4807
genre             31
dtype: int64
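
The counts above show only 2 distinct values for `type` and 31 for `genre`, so it can help to list the actual labels before filtering on an exact string such as `'Action'`. A minimal sketch, assuming a made-up frame `toy` that stands in for the real dataset (its values are illustrative only):

```python
import pandas as pd

# Hypothetical toy data, not rows from netflix_data.csv.
toy = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie"],
    "genre": ["Action", "Dramas", "International TV", "Action"],
})

# value_counts() lists each label with its frequency, most common first.
type_counts = toy["type"].value_counts()
genre_counts = toy["genre"].value_counts()

print(type_counts)
print(genre_counts)
```

Running the same two calls on `df` would show the exact spelling of every label, which is what an equality filter has to match.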

In [5]:
# Null Values
df.isnull().sum()

show_id         0
type            0
title           0
director        0
cast            0
country         0
date_added      0
release_year    0
duration        0
description     0
genre           0
dtype: int64

In [37]:
# Solve: What was the most frequent movie duration in the 1990s?
# Save an approximate answer as an integer called duration (use 1990 as the decade's start year).

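# The 1990s run from 1990 through 1999 inclusive, so the upper bound is < 2000.
# The question is about movies, so rows of other types are excluded as well.
filtered_movies = df[
    (df['type'] == 'Movie') &
    (df['release_year'] >= 1990) &
    (df['release_year'] < 2000)
]

# mode() returns a Series because ties are possible; take the first value as a plain int.
duration = int(filtered_movies['duration'].mode().iloc[0])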
print(duration)

94
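
Since the task asks for an approximate answer, it may be worth cross-checking the mode against a frequency count. The sketch below is one way to do that, assuming a made-up frame `toy` in place of the real data and a hypothetical helper `most_common_duration`; neither comes from the original notebook.

```python
import pandas as pd

# Hypothetical toy data standing in for netflix_data.csv (values are made up).
toy = pd.DataFrame({
    "type": ["Movie", "Movie", "Movie", "TV Show", "Movie", "Movie"],
    "release_year": [1990, 1995, 1999, 1994, 2000, 1989],
    "duration": [94, 94, 101, 1, 94, 120],
})

def most_common_duration(frame: pd.DataFrame) -> int:
    """Return the most frequent duration among movies released 1990-1999."""
    is_movie = frame["type"] == "Movie"
    # between() is inclusive on both ends by default, so 1999 is kept.
    in_nineties = frame["release_year"].between(1990, 1999)
    counts = frame.loc[is_movie & in_nineties, "duration"].value_counts()
    return int(counts.idxmax())

toy_duration = most_common_duration(toy)
print(toy_duration)
```

Printing `counts` instead of only its top entry would also reveal near-ties, which is useful when an approximate figure is acceptable.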


In [40]:
# A movie is considered short if it is less than 90 minutes.
# Count the number of short action movies released in the 1990s and save this integer as short_movie_count

# Solution: use the '&' operator and wrap each condition in parentheses
short_movie_count = len(filtered_movies[
    (filtered_movies['duration'] < 90) &
    (filtered_movies['genre'] == 'Action')
])
print(short_movie_count)

7
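
The same count can be written as the sum of a boolean mask, which avoids building an intermediate DataFrame. This is a sketch under assumptions: the frame `toy` and the helper `count_short_action` are hypothetical and only illustrate the filter logic described above.

```python
import pandas as pd

# Hypothetical toy data, not rows from netflix_data.csv.
toy = pd.DataFrame({
    "type": ["Movie"] * 5 + ["TV Show"],
    "release_year": [1991, 1996, 1999, 1993, 2001, 1995],
    "duration": [85, 89, 90, 120, 80, 2],
    "genre": ["Action", "Action", "Action", "Dramas", "Action", "Action"],
})

def count_short_action(frame: pd.DataFrame, max_minutes: int = 90) -> int:
    """Count 1990s action movies strictly shorter than max_minutes."""
    mask = (
        (frame["type"] == "Movie")
        & frame["release_year"].between(1990, 1999)  # inclusive bounds
        & (frame["genre"] == "Action")
        & (frame["duration"] < max_minutes)  # exactly 90 minutes is not short
    )
    # True counts as 1, so summing the mask gives the number of matching rows.
    return int(mask.sum())

toy_count = count_short_action(toy)
print(toy_count)
```

Note the strict `<` comparison: a movie that runs exactly 90 minutes does not count as short under the definition given in the task.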


In [38]:
print(filtered_movies)

      show_id   type                            title            director  \
index                                                                       
6          s8  Movie                              187      Kevin Reynolds   
118      s167  Movie                A Dangerous Woman  Stephen Gyllenhaal   
145      s211  Movie           A Night at the Roxbury    John Fortenberry   
167      s239  Movie  A Thin Line Between Love & Hate     Martin Lawrence   
194      s274  Movie                     Aashik Awara         Umesh Mehra   
...       ...    ...                              ...                 ...   
4566    s7363  Movie                 Unspeakable Acts          Linda Otto   
4597    s7415  Movie                 Victim of Beauty         Roger Young   
4689    s7571  Movie      What's Eating Gilbert Grape     Lasse Hallström   
4746    s7682  Movie                       Wyatt Earp     Lawrence Kasdan   
4756    s7695  Movie                      Yaar Gaddar         Umesh Mehra   