# Netflix Genres Analysis

This notebook adapts the analysis from Chapter 1 of "Data Science from Scratch" by Joel Grus to Netflix content. It counts how often each genre appears across Netflix's catalog of films and TV shows and reports the most common ones.

## Import Required Libraries

Load pandas for data manipulation and analysis.

In [None]:
import pandas as pd

## Load Netflix Dataset

Read the Netflix titles dataset from `netflix_titles.csv`, which is expected to be in the working directory. Each row describes one title, and the `listed_in` column holds that title's genres.

In [None]:
df_netflix = pd.read_csv('netflix_titles.csv')
df_netflix.head()

## Extract Genre Strings

Convert the 'listed_in' column into a Python list. Each entry is a single string holding one or more genres, separated by a comma followed by a space.

In [None]:
all_genre_strings = df_netflix['listed_in'].tolist()
# Preview the first few entries instead of printing the whole list
all_genre_strings[:5]
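
If `listed_in` contained missing values, the later call to `.lower()` would fail on `NaN`, which pandas stores as a float. The following is a minimal sketch of a more defensive extraction. It uses a small made-up DataFrame; `df_example` and its rows are hypothetical and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical miniature stand-in for netflix_titles.csv (made-up rows)
df_example = pd.DataFrame({
    'title': ['A', 'B', 'C'],
    'listed_in': ['Dramas, International Movies', None, 'Comedies'],
})

# Drop missing entries before converting the column to a plain Python list
example_genre_strings = df_example['listed_in'].dropna().tolist()
print(example_genre_strings)
```

Whether this step is needed depends on the data; `df_netflix['listed_in'].isna().sum()` reports how many entries are missing.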

## Count Genre Frequencies

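Split each genre string into its individual genres and count how many titles list each one. Genre names are lowercased first so that differences in capitalization do not produce separate counts.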

In [None]:
from collections import Counter

# Flatten every title's genre string into individual genres and tally them
genre_counts = Counter(
    individual_genre
    for genre_string in all_genre_strings
    for individual_genre in genre_string.lower().split(', ')
)
genre_counts
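
The same split-and-count step can also be expressed with pandas alone. This is a sketch of one possible alternative, not the approach used in this notebook; `sample_listed_in` is a hypothetical, made-up sample rather than real rows from the dataset.

```python
import pandas as pd

# Hypothetical sample of `listed_in` values (made up for illustration)
sample_listed_in = pd.Series([
    'Dramas, International Movies',
    'Comedies, Dramas',
    'Dramas',
])

sample_genre_counts = (
    sample_listed_in
    .str.lower()
    .str.split(', ')   # each row becomes a list of genres
    .explode()         # one row per (title, genre) pair
    .value_counts()    # tallies sorted by descending frequency
)
print(sample_genre_counts)
```

The result is a pandas Series indexed by genre, which can be convenient for plotting or further filtering, whereas `Counter` keeps the analysis in plain Python, in the spirit of the book.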

## Display Top 5 Genres

Show the five most common genres, capitalizing each word of the genre name for display.

In [None]:
print('The top five most common genres on Netflix\n')
for rank, (genre_name, count) in enumerate(genre_counts.most_common(5), start=1):
    # Capitalize each word for display, keeping the acronym "TV" fully uppercase
    formatted_genre_name = ' '.join(
        word.upper() if word.lower() == 'tv' else word.capitalize()
        for word in genre_name.split()
    )
    print(f'{rank}. {formatted_genre_name} ({count} titles)\n')
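
Raw counts can be easier to interpret as a share of the catalog. The sketch below shows one way to compute that share, assuming every entry in the list corresponds to exactly one title. `sample_genre_strings` is a hypothetical stand-in for `all_genre_strings`, with made-up values.

```python
from collections import Counter

# Hypothetical genre strings standing in for `all_genre_strings`
sample_genre_strings = [
    'Dramas, International Movies',
    'Comedies, Dramas',
    'Dramas',
    'Documentaries',
]

sample_counts = Counter(
    genre
    for genre_string in sample_genre_strings
    for genre in genre_string.split(', ')
)

# Each list entry is one title, so the list length is the number of titles
total_titles = len(sample_genre_strings)
for genre, count in sample_counts.most_common(2):
    share = count / total_titles
    print(f'{genre}: {count} of {total_titles} titles ({share:.0%})')
```

Because a title can list several genres, these shares are not expected to sum to 100%.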