<center><img src="redpopcorn.jpg"></center>

**Netflix**! What started in 1997 as a DVD rental service has since exploded into one of the largest entertainment and media companies.

The large number of movies and series available on the platform makes it a perfect opportunity to flex your exploratory data analysis skills and dive into the entertainment industry.

You work for a production company that specializes in nostalgic styles. You want to do some research on movies released in the 1990s. You'll delve into Netflix data and perform exploratory data analysis to better understand this awesome movie decade!

You have been supplied with the dataset `netflix_data.csv`, along with the following table detailing the column names and descriptions. Feel free to experiment further after submitting!

## The data
### **netflix_data.csv**
| Column | Description |
|--------|-------------|
| `show_id` | The ID of the show |
| `type` | Type of show |
| `title` | Title of the show |
| `director` | Director of the show |
| `cast` | Cast of the show |
| `country` | Country of origin |
| `date_added` | Date added to Netflix |
| `release_year` | Year the show was released |
| `duration` | Duration of the show in minutes |
| `description` | Description of the show |
| `genre` | Show genre |

In [43]:
# Importing pandas and matplotlib
import pandas as pd
import matplotlib.pyplot as plt

# Read in the Netflix CSV as a DataFrame
netflix_df = pd.read_csv("netflix_data.csv")

In [44]:
# Pull out the three columns we need
movie_durations = netflix_df['duration']        
movie_years     = netflix_df['release_year']
movie_types     = netflix_df['type']

# Gather just the durations for 1990–1999 movies
durations_90s = []
for i in range(len(movie_years)):
    if movie_types.iloc[i] == 'Movie' and 1990 <= movie_years.iloc[i] <= 1999:
        durations_90s.append(movie_durations.iloc[i])

# Count frequencies
freq = {}
for d in durations_90s:
    freq[d] = freq.get(d, 0) + 1

# Pick the most common duration
duration = None
max_count = 0
for d, count in freq.items():
    if count > max_count:
        max_count = count
        duration = d

print("Most frequent movie duration in the 1990s:", duration)

Most frequent movie duration in the 1990s: 94
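
The manual frequency dictionary above can also be expressed with `collections.Counter` from the standard library. The sketch below is only an illustration: `sample_durations_90s` is a made-up list, not the real contents of `netflix_data.csv`. When several durations tie, `most_common` keeps the one encountered first, which matches the strict `count > max_count` comparison in the loop.

```python
from collections import Counter

# Hypothetical sample of 1990s movie durations in minutes (not real Netflix data)
sample_durations_90s = [94, 101, 94, 88, 101, 94, 120]

# Counter tallies how often each duration appears
duration_counts = Counter(sample_durations_90s)

# most_common(1) returns a one-element list of (value, count) pairs
most_common_duration, occurrences = duration_counts.most_common(1)[0]

print("Most frequent duration:", most_common_duration)
print("Occurrences:", occurrences)
```

In the notebook itself, the same call could be applied to the `durations_90s` list built in the cell above.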


In [45]:
# Pull out the genre column
movie_genres = netflix_df['genre']

# Initialize the counter
short_movie_count = 0

# Loop over every row by index
for i in range(len(netflix_df)):
    # 1) Must be a Movie
    if movie_types.iloc[i] != 'Movie':
        continue
    
    # 2) Must be in the 1990s
    year = movie_years.iloc[i]
    if year < 1990 or year > 1999:
        continue

    # 3) Must contain "Action" in its genre string
    #    (some rows might have "Action, Thriller" etc.)
    if 'Action' not in movie_genres.iloc[i]:
        continue

    # 4) Must be shorter than 90 minutes
    if movie_durations.iloc[i] < 90:
        short_movie_count += 1

print("Number of short action movies in the 1990s:", short_movie_count)


Number of short action movies in the 1990s: 7
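
The four conditions checked in the loop above (movie, 1990s, action genre, under 90 minutes) can also be written as pandas boolean masks combined with `&`. This is a minimal sketch on a hypothetical `sample_df` that borrows the column names from the table above; the rows are invented for illustration. Passing `na=False` to `str.contains` is an added assumption so that rows with a missing genre count as non-matches instead of raising an error.

```python
import pandas as pd

# Hypothetical rows for illustration only; not taken from netflix_data.csv
sample_df = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie", "Movie"],
    "release_year": [1995, 1998, 1994, 2005, 1991],
    "duration": [85, 110, 45, 80, 89],
    "genre": ["Action", "Action", "Action", "Action", "Dramas"],
})

# One boolean mask per condition from the loop
is_movie = sample_df["type"] == "Movie"
is_90s = sample_df["release_year"].between(1990, 1999)  # inclusive on both ends
is_action = sample_df["genre"].str.contains("Action", na=False)
is_short = sample_df["duration"] < 90

# A row counts only if all four masks are True; summing booleans counts the Trues
short_action_count = int((is_movie & is_90s & is_action & is_short).sum())

print("Short 1990s action movies in the sample:", short_action_count)
```

Only the first sample row satisfies every condition, so the sketch reports a single match. Replacing `sample_df` with `netflix_df` would apply the same filters to the full dataset.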
