# Netflix Movies and Shows Comparison - Group 8

# Project Description:

- This project analyzes which type of content has the greatest viewership over the time period of 6/15/21 to 11/1/22 on the Netflix streaming service. This project attempts to find which movies and TV shows are most popular and other factors that drive increased viewership.

1. Impact of genre vs English and Non-English movies and TV Shows
2. Which TV Shows and Movies drive increased viewership over this period of time
3. Impact of popularity between English and Non-English Movies and TV Shows

# Solution Approach:

1) Identify Data Sources for all required analysis. Data sources below:

- “Netflix Movies and TV Shows” dataset from Netflix.
- Merged with “Netflix Titles” dataset from Kaggle to populate specific information for each movie and TV show.
- API call from OMDB to fill in all “NaN” values that were not in the “Netflix Titles” merge.
- The data provided captures from the week of 6/15/21 to 11/1/22.

2) Load and standardize all inputs identified above into pandas DataFrames and produce the working .csv files used for all data visualizations.

- See team8_netflix_proj1_dataCleaning.ipynb for cleaning
- See team8_netflix_proj1_Analysis.ipynb for analysis
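
The OMDb fallback from step 1 could be sketched roughly as below. This is an illustration, not the code from the notebooks: the helper names (`build_omdb_url`, `parse_omdb_record`, `fetch_omdb_record`) are hypothetical, and the sketch assumes OMDb's title lookup (`t` and `apikey` query parameters) and its convention of returning `"N/A"` for unknown fields. An API key is required to call the service.

```python
import json
import urllib.parse
import urllib.request

OMDB_URL = "https://www.omdbapi.com/"


def build_omdb_url(title, api_key):
    """Build a title-lookup URL (assumes OMDb's `t` and `apikey` parameters)."""
    query = urllib.parse.urlencode({"t": title, "apikey": api_key})
    return f"{OMDB_URL}?{query}"


def parse_omdb_record(payload):
    """Keep only the fields this project needs; return None if no match."""
    if payload.get("Response") != "True":
        return None

    def clean(value):
        # OMDb reports unknown fields as the string "N/A".
        return None if value in (None, "", "N/A") else value

    votes = clean(payload.get("imdbVotes"))
    return {
        "genre": clean(payload.get("Genre")),
        "country": clean(payload.get("Country")),
        "rating": clean(payload.get("Rated")),
        # Vote counts arrive as strings such as "1,234".
        "imdb_votes": int(votes.replace(",", "")) if votes else None,
    }


def fetch_omdb_record(title, api_key):
    """Network call: look up one title and return the cleaned fields."""
    url = build_omdb_url(title, api_key)
    with urllib.request.urlopen(url, timeout=10) as response:
        return parse_omdb_record(json.load(response))
```

Keeping the URL building and the parsing separate from the network call makes it possible to check the cleaning logic on saved responses without spending API quota.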


# Process

We used "Netflix Movies and TV Shows" from Netflix as our main dataset, then merged it with "Netflix Titles" from Kaggle to gather content information for each title. We then used OMDB API calls to fill the NaN values left after the merge. Once all data had been merged, we dropped any remaining NaN values to produce our completed dataset. We cleaned the countries, ratings, and genre columns. Finally, we created columns that count each genre as it appears in the genre string.
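
As a rough illustration of the last step, the sketch below splits a comma-separated genre string and adds one indicator column per genre. It uses plain Python rather than the pandas code in the notebooks, and the column names (`genre`, `genre_drama`, ...) and sample rows are made up for the example.

```python
from collections import Counter


def split_genres(genre_field):
    """Split a comma-separated genre string into clean, lowercase names."""
    if not genre_field:
        return []
    return [g.strip().lower() for g in genre_field.split(",") if g.strip()]


def add_genre_columns(rows, genres):
    """Add one 0/1 column per genre of interest to each title row."""
    for row in rows:
        present = set(split_genres(row.get("genre")))
        for genre in genres:
            row[f"genre_{genre}"] = int(genre in present)
    return rows


# Hypothetical rows standing in for the merged dataset.
titles = [
    {"title": "Example A", "genre": "Drama, Crime"},
    {"title": "Example B", "genre": "Comedy"},
    {"title": "Example C", "genre": "Drama, Romance, Comedy"},
]

add_genre_columns(titles, ["drama", "comedy"])

# Total number of titles listing each genre.
totals = Counter(g for row in titles for g in split_genres(row["genre"]))
print(totals.most_common())
```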

# Exploratory Analysis

- After combining all datasets, we were left with 644 unique titles, which made up about 2069 rows of data.
- We found that 37% of titles across all categories stay in the Top 10 for only 1 week before falling off.
- Most Netflix hits in the Top 10 do not remain popular for a long time.
- Only 4 titles remained in the Top 10 for more than 20 weeks, which makes up 1.6% of all the Netflix titles given for this period of time.


![](Outputs/output_data/Bar_CountofCumulativeWeeks_Top10.png)
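
The weeks-in-Top-10 figures above can be derived along these lines. This is a minimal sketch with made-up rows and hypothetical field names (`week`, `title`); the notebooks work on the real weekly data.

```python
from collections import Counter


def weeks_in_top10(weekly_rows):
    """Count the distinct weeks each title appears in the weekly Top 10."""
    seen = {(row["week"], row["title"]) for row in weekly_rows}
    return Counter(title for _week, title in seen)


def share_with_exactly(weeks_by_title, n_weeks):
    """Fraction of titles that spent exactly n_weeks in the Top 10."""
    if not weeks_by_title:
        return 0.0
    hits = sum(1 for weeks in weeks_by_title.values() if weeks == n_weeks)
    return hits / len(weeks_by_title)


def titles_over(weeks_by_title, n_weeks):
    """Titles that spent more than n_weeks in the Top 10, sorted by name."""
    return sorted(t for t, weeks in weeks_by_title.items() if weeks > n_weeks)


# Hypothetical weekly rows, not taken from the dataset.
sample_rows = [
    {"week": "2021-07-04", "title": "Show A"},
    {"week": "2021-07-11", "title": "Show A"},
    {"week": "2021-07-18", "title": "Show A"},
    {"week": "2021-07-04", "title": "Film B"},
    {"week": "2021-07-11", "title": "Film C"},
    {"week": "2021-07-11", "title": "Film C"},  # duplicate row, counted once
]

weeks = weeks_in_top10(sample_rows)
one_week_share = share_with_exactly(weeks, 1)
long_runners = titles_over(weeks, 2)
```

Counting distinct (week, title) pairs keeps a title from being counted twice when the same week appears in more than one row.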

# Dataset Overview

- When dissecting the data, we found that most movies and TV shows fall under the genre "drama".
- Contrary to our belief, we found that Non-English TV shows outpaced English TV shows in our top 3 genres.

![](Outputs/total_genre_bar.png)


## Additional Overview

- After cleaning the data, we were left with a total of 644 unique titles, broken down into Non-English and English movies and TV shows.
- English content has more titles than Non-English content: a difference of 171 movie titles and 13 show titles.

![](Outputs/native_language_bar.png)

# Genre Breakdown

- We performed 2 Chi-Square tests, one on films and one on TV shows, to see how different the Non-English and English datasets we are comparing are.
- Non-English films vs. English films, with a critical value of 11.07, proved to be very different, with a test statistic of 43.53 and a P value of 2.8832517822160287e-08.
- Non-English shows vs. English shows, with a critical value of 11.07, also proved to be different, with a test statistic of 11.33 and a P value of 0.045170554402416296.
- Although the TV show statistic of 11.33 is greater than 11.07, the difference is far less pronounced than for movies, where the statistic is 43.53.
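
The reported P values can be sanity-checked with a short dependency-free sketch. It assumes 5 degrees of freedom, which is what a critical value of 11.07 corresponds to at the 0.05 significance level; the function names are illustrative, and the notebooks may well have used a statistics library instead. The statistic below is the plain Pearson Chi-Square without any continuity correction.

```python
import math


def chi_square_statistic(table):
    """Pearson Chi-Square statistic and degrees of freedom for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi2_sf(x, dof):
    """Upper-tail probability P(X > x) for integer degrees of freedom."""
    half = x / 2.0
    if dof % 2 == 0:
        # Even dof: exp(-x/2) * sum_{j < dof/2} (x/2)^j / j!
        term = math.exp(-half)
        total = term
        for j in range(1, dof // 2):
            term *= half / j
            total += term
        return total
    # Odd dof: start from dof = 1 and step up by 2 with the gamma recurrence.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(half) * math.exp(-half) / math.gamma(1.5)
    for j in range((dof - 1) // 2):
        total += term
        term *= half / (1.5 + j)
    return total


# Test statistics reported above, with the assumed 5 degrees of freedom.
p_movies = chi2_sf(43.53, 5)
p_shows = chi2_sf(11.33, 5)
print(f"movies: p = {p_movies:.3e}, shows: p = {p_shows:.4f}")

# Tiny made-up 2x2 table to show the statistic itself.
demo_stat, demo_dof = chi_square_statistic([[10, 20], [20, 10]])
```

Because the statistics are rounded to two decimals here, the resulting P values should agree with the reported ones only approximately.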

# Who Created the most Impactful Content

- From the data provided, the content that stays in the Top 10 the longest is mostly TV shows.
- With only 2 movies making the list, we can assume that movies only create a short impact.
- TV shows have a lasting impact on viewership and retain viewers more than movies.


![](Outputs/output_data/Bar_TopShowsMostWeeksTop10.png)

# Top 10 Analysis

- Large spikes in hours viewed align with some of the most highly watched TV show content.

![](Outputs/output_data/Average_Hours_TopTenShowsbyWeek.png)

The popularity of movie genres remains consistent throughout the year.
- Spikes come from popular releases over this time period and not from specific genres.



![](Outputs/Picture1.png)

# Drivers

- The release dates of certain TV shows correlate with spikes in viewership that make up the top 10 TV shows.
- The show “Manifest” proved to have a lasting effect on viewers due to the longevity of viewer participation.
- The show “Squid Game” proved to have the greatest impact on viewer participation.

![](Outputs/output_data/Hours_TopTenShowsbyWeek.png)

- This chart only includes titles with over 1,000 votes so that our analysis was not skewed by titles with high ratings but low viewership.
- From the Top 10 vs Bottom 10 viewership by Genre, we can assume that covering more genres will net the film a better score.
- We can also assume that, since all but one film cover drama, drama is a genre that most viewers are searching for.
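
A minimal sketch of the vote filter and Top/Bottom ranking described above, assuming hypothetical field names (`imdb_score`, `imdb_votes`) and made-up rows; the 1,000-vote threshold is the one stated in the list.

```python
def top_and_bottom_by_score(rows, min_votes=1000, n=10):
    """Drop titles with too few votes, then return the n best and n worst by score."""
    eligible = [
        row for row in rows
        if row.get("imdb_votes") is not None
        and row.get("imdb_score") is not None
        and row["imdb_votes"] > min_votes
    ]
    ranked = sorted(eligible, key=lambda row: row["imdb_score"], reverse=True)
    # If fewer than 2 * n titles are eligible, the two slices overlap.
    return ranked[:n], ranked[-n:]


# Hypothetical rows, not taken from the dataset.
sample = [
    {"title": "A", "imdb_score": 9.5, "imdb_votes": 40},       # too few votes
    {"title": "B", "imdb_score": 8.1, "imdb_votes": 250000},
    {"title": "C", "imdb_score": 6.4, "imdb_votes": 12000},
    {"title": "D", "imdb_score": 4.9, "imdb_votes": 3100},
    {"title": "E", "imdb_score": 7.2, "imdb_votes": None},     # missing votes
]

top, bottom = top_and_bottom_by_score(sample, n=1)
```

Title "A" has the highest score in the sample but is excluded, which is the kind of skew the threshold is meant to prevent.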

![](Outputs/output_data/HorzBar_TopandBottomGenrebyIMDBScore.png)

![](Outputs/output_data/Bar_TopShowsMost.png)

# Questions Answered

1) The most popular genres are drama, comedy, action, crime, romance, and family, with drama TV shows being the most popular.
2) The 2 TV shows that stayed in the Top 10 the longest are “Yo soy Betty, La Fea” and “Squid Game”.
- The 2 movies that stayed in the Top 10 the longest are “Red Notice” and “Through My Window”.
3) After performing Chi-Square tests, the data indicates that Non-English and English content are statistically different, for both movies and TV shows.


# More data for US
- The USA is the largest producer of content on the Netflix streaming service.
- The show Dahmer has had the greatest impact on viewership among native English TV shows.
- The movie Red Notice peaks almost as high as Don't Look Up, but has a longer-lasting viewership impact on audiences.

![](Outputs/country_production_map.png)

![](Outputs/output_data/Hours_TopTenShows_English_byWeek.png)

![](Outputs/output_data/Hours_TopTenFilms_English_byWeek.png)

# Next Steps

- How do ratings vary with the number of votes each film or TV show receives?
- Are there films or TV shows that were in the Top 10 before our dataset's start date?
- How old were the TV shows or movies that made the Top 10 list on a consistent basis?
- With another year of data, could we see clearer seasonal trends?
