**For this project I practiced with the Netflix Daily Top 10 dataset, inspired by Rabiat Ahmed.**

I analyzed the Netflix daily top 10 in the United States during the pandemic. Here are a few things I wanted to find out:

* Number of titles
* Top 10 titles
* 10 titles with the fewest days in the top 10
* How many titles were Netflix exclusives and how many were not
* Most watched type
* Top 10 TV shows
* Top 10 movies
* Top 6 stand-up comedy titles


The data covers 2020-04-01 to 2022-03-11. It is a public dataset made available by Prasert Kanawattanachai on Kaggle.

In [1]:
# Install tidyverse, ggplot2 and dplyr, then load them (installing tidyverse already brings in ggplot2, dplyr, readr and forcats)

install.packages("tidyverse")
install.packages("ggplot2")
install.packages("dplyr")

library(tidyverse)
library(magrittr)
library(readr)
library(dplyr)
library(ggplot2)

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)



“unable to access index for repository http://cran.rstudio.com/src/contrib:
  cannot open URL 'http://cran.rstudio.com/src/contrib/PACKAGES'”
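
The warning above suggests the session could not reach the package repository, so running `install.packages()` on every execution can fail even when the packages are already present. A minimal sketch of one way to avoid that, assuming the packages may already be installed: check which ones are missing and install only those. The helper names `missing_packages` and `install_if_missing` are made up for illustration.

```r
# Hypothetical helper: return the packages that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical helper: contact the repository only when something is missing.
install_if_missing <- function(pkgs) {
  to_install <- missing_packages(pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }
  invisible(to_install)
}

# Usage (left commented out so the sketch has no side effects):
# install_if_missing(c("tidyverse", "magrittr"))
```

With this pattern, a session that already has the packages never needs network access for the setup step.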


In [None]:
# Load the data (I downloaded it as a zip file and uploaded the CSV to R)
library(readr)

netflix_data <- read_csv("netflix daily top 10.csv")
View(netflix_data)
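
Several titles in this dataset show up with garbled characters (for example "The Queenâs Gambit"), which the cleaning step corrects one by one. The sketch below shows a possible bulk repair, under the assumption that the garbling came from UTF-8 bytes being decoded as Latin-1. That assumption is not confirmed for this file, and `iconv()` behaviour can differ between platforms. The helper name `fix_mojibake` is made up for illustration, and titles truncated with "…" are not restored by it.

```r
# Assumption: the text was UTF-8 but got decoded as Latin-1.
# Converting back to Latin-1 recovers the original bytes, which are
# then re-marked as UTF-8.
fix_mojibake <- function(x) {
  bytes <- iconv(x, from = "UTF-8", to = "latin1")
  Encoding(bytes) <- "UTF-8"
  # Keep the original value wherever the round trip is not possible
  # or does not produce valid UTF-8.
  ifelse(is.na(bytes) | !validUTF8(bytes), x, bytes)
}

# Apostrophe U+2019 misread as three separate Latin-1 characters
garbled <- "The Queen\u00e2\u0080\u0099s Gambit"
repaired <- fix_mojibake(garbled)
```

Values that are plain ASCII, or that cannot be converted, come back unchanged, so the function can be applied to a whole column before any manual corrections.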

## Cleaning

In [None]:
# cleaning the data
netflix_data <- netflix_data %>% rename_with(tolower) %>%
  rename(date = "as of",
         year_to_date_rank = "year to date rank",
         last_week_rank = "last week rank",
         production = "netflix exclusive",
         netflix_release_date = "netflix release date",
         days_in_top_10 = "days in top 10",
         viewership_score = "viewership score")
    
View(netflix_data)

# Replace NA values in the production column with "Others"
netflix_data <- netflix_data %>% mutate(production = fct_explicit_na(production, na_level = "Others"))
View(netflix_data)

#Check titles
unique(netflix_data$title)

# Relabel "Yes" in production as "Netflix Exclusive" and "Others" as "Not Netflix Exclusive"
netflix_data <- mutate(netflix_data, production = recode(.x=production, "Yes" = "Netflix Exclusive"))
netflix_data <- mutate(netflix_data, production = recode(.x=production, "Others" = "Not Netflix Exclusive"))
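
The NA replacement and the two relabelling steps can also be folded into a single `case_when()` call. This is a minimal sketch on toy data, assuming the column holds only "Yes" or NA; any other value is passed through unchanged. The data frame `toy` is invented for the example.

```r
library(dplyr)

# Toy stand-in for the production column: "Yes" or missing.
toy <- tibble(
  title = c("a", "b", "c"),
  production = c("Yes", NA, "Yes")
)

toy <- toy %>%
  mutate(production = case_when(
    production == "Yes" ~ "Netflix Exclusive",
    is.na(production)   ~ "Not Netflix Exclusive",
    TRUE                ~ production   # leave anything unexpected as it is
  ))
```

This keeps the column as character and avoids the intermediate "Others" label.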

In [None]:
# Merge duplicate titles, correct misspellings,
# and restore truncated titles to their full spelling

netflix_data <- mutate(netflix_data, title = recode(.x=title, "Tiger King"="Tiger King: Murder, Mayhem, and Madness"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Tiger King: Murder, Mayhem …"="Tiger King: Murder, Mayhem, and Madness"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Jerry Seinfeld: 23 Hours to…"="Jerry Seinfeld: 23 Hours to Kill"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "George Lopez: Weâll Do It f…"="George Lopez: We'll Do It for Half"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Queenâs Gambit"="The Queen's Gambit"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Workinâ Moms"="Workin' Moms"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Greyâs Anatomy"="Grey's Anatomy"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Cloudy with a Chance of Mea…"="Cloudy with a Chance of Meatballs"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Have a Good Trip: Adventure…"="Have a Good Trip: Adventures in Psychedelics"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Last Days of American C…"="The Last Days of American Crime"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "PokÃ©mon Journeys: The Series"="Pokemon Journeys: The Series"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Eurovision Song Contest: Th…"="Eurovision Song Contest: The Story of Fire Saga"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "I Now Pronounce You Chuck a…"="I Now Pronounce You Chuck and Larry"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Murder to Mercy: The Cyntoi…"="Murder to Mercy: The Cyntoia Brown Story"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Dr. Seussâ The Lorax"="Dr. Seuss' The Lorax"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Epic Tales of Captain U…"="The Epic Tales of Captain Underpants"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Fear City: New York vs. The…"="Fear City: New York vs The Mafia"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Transformers: War for Cyber…"="Transformers: War for Cybertron"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Worldâs Most Wanted"="World's Most Wanted"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Game On: A Comedy Crossover…"="Game On: A Comedy Crossover Event"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Chefâs Table: BBQ"="Chef's Table: BBQ"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Jurassic World: Camp Cretac…"="Jurassic World Camp Cretaceous"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "American Murder: The Family…"="American Murder: The Family Next Door"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Schittâs Creek"="Schitt's Creek"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "American Pie Presents: Girl…"="American Pie Presents: Girls' Rules"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "A Babysitterâs Guide to Mon…"="A Babysitter's Guide to Monster Hunting"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "My Next Guest Needs No Intr…"="My Next Guest Needs No Introduction with David Letterman"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Chappelleâs Show"="Chappelle's Show"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Illumination Presents The G…"="Illumination Presents The Grinch"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Jingle Jangle: A Christmas …"="Jingle Jangle: A Christmas Journey"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Boss Baby: Back in Busi…"="The Boss Baby: Back in Business"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Princess Switch: Switch…"="The Princess Switch: Switched Again"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Ma Raineyâs Black Bottom"="Ma Rainey's Black Bottom"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Ariana Grande: Excuse Me, I…"="Ariana Grande: Excuse Me, I Love You"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Jenni Rivera: Mariposa de B…"="Jenni Rivera: Mariposa de Barrio"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Night Stalker: The Hunt for…"="Night Stalker: The Hunt for a Serial Killer"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Finding âOhana"="Finding 'Ohana"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Crime Scene: The Vanishing …"="Crime Scene: The Vanishing at the Cecil Hotel"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "To All the Boys Always and …"="To All the Boys: Always and Forever"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Animals on the Loose: A You…"="Animals on the Loose: A You vs. Wild Movie"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Age of Samurai: Battle for …"="Age of Samurai: Battle for Japan"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "This is a Robbery: The Worl…"="This is a Robbery: The World's Biggest Art Heist"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Life in Color with David At…"="Life in Color with David Attenborough"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Madagascar 3: Europe's Most…"="Madagascar 3: Europe's Most Wanted"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Sons of Sam: A Descent …"="The Sons of Sam: A Descent Into Darkness"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Jupiterâs Legacy"="Jupiter's Legacy"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Ãlite"="Elite"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Worldâs Most Amazing Va…"="The World's Most Amazing Vacation Rentals"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Talladega Nights: The Balla…"="Talladega Nights: The Ballad of Ricky Bobby"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Resident Evil: Infinite Dar…"="Resident Evil: Infinite Darkness"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Twilight Saga: Breaking…"="The Twilight Saga: Breaking Dawn"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Trollhunters: Rise of the T…"="Trollhunters: Rise of the Titans"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Masters of the Universe: Re…"="Masters of the Universe: Revelation"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Last Letter From Your L…"="The Last Letter From Your Lover"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Top Secret UFO Projects: De…"="Top Secret UFO Projects: Declassified"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Cocaine Cowboys: The Kings …"="Cocaine Cowboys: The Kings of Miami"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Witcher: Nightmare of t…"="The Witcher: Nightmare of the Wolf"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Bob Ross: Happy Accidents, …"="Bob Ross: Happy Accidents, Betrayal & Greed"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Heâs All That"="He's All That"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Turning Point: 9/11 and the…"="Turning Point: 9/11 and the War on Terror"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Monsters Inside: The 24 Fac…"="Monsters Inside: The 24 Faces of Billy Milligan"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "My Little Pony: A New Gener…"="My Little Pony: A New Generation"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Thereâs Someone Inside Your…"="There's Someone Inside Your House"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "King Arthur: Legend of the …"="King Arthur: Legend of the Sword"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Princess Switch 3: Roma…"="The Princess Switch 3: Romancing the Star"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Jojoâs Bizarre Adventure"="Jojo's Bizarre Adventure"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "A California Christmas: Cit…"="A California Christmas: City Lights"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Donât Look Up"="Don't Look Up"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Crime Scene: The Times Squa…"="Crime Scene: The Times Square Killer"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Journey 2: The Mysterious I…"="Journey 2: The Mysterious Island"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Scary Stories to Tell in th…"="Scary Stories to Tell in the Dark"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Puppet Master: Hunting …"="The Puppet Master: Hunting the Ultimate Conman"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Transformers: Revenge of th…"="Transformers: Revenge of the Fallen"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Tyler Perryâs A Madea Homec…"="Tyler Perry's A Madea Homecoming"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Get Organized with The Home…"="Get Organized with The Home Edit"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "I Care a Lot."="I Care a Lot"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "The Woman in the House Acro…"="The Woman in the House Across the Street from the Girl in the Window"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Single all the Way"="Single All the Way"))
netflix_data <- mutate(netflix_data, type = recode(.x=type, "Concert/Perf…"="Concert/Performance"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Chris DâElia: No Pain"="Chris D'Elia: No Pain"))
netflix_data <- mutate(netflix_data, title = recode(.x=title, "Bunkâd"="Bunk'd"))

View(netflix_data)

#Change the title, type and production columns to lowercase
netflix_data <- netflix_data %>% mutate(title = tolower(title)) %>% mutate(type = tolower(type)) %>%
  mutate(production = tolower(production))
View(netflix_data)
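
The long run of one-line `recode()` calls in the cleaning step can be condensed by keeping all corrections in a single named vector and splicing it into one `recode()` call. This is a minimal sketch on toy data that lists only three of the corrections; the names `title_fixes` and `toy` are invented for the example.

```r
library(dplyr)

# Named vector: names are the values found in the data, values are the fixes.
title_fixes <- c(
  "Tiger King"         = "Tiger King: Murder, Mayhem, and Madness",
  "I Care a Lot."      = "I Care a Lot",
  "Single all the Way" = "Single All the Way"
)

toy <- tibble(title = c("Tiger King", "I Care a Lot.", "Lucifer"))

# !!! splices the named vector into recode() as old = new pairs;
# titles without an entry are left unchanged.
toy <- toy %>% mutate(title = recode(title, !!!title_fixes))
```

Adding a new correction then means adding one entry to the vector rather than another full `mutate()` line.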

## Analyze

In [None]:
#Number of distinct titles
n_distinct(netflix_data$title)

#Top 10 titles
title_count_top <- netflix_data %>% count(title) %>% rename(days_in_top_10 = n) %>%
  arrange(-days_in_top_10) %>% head(10)
View(title_count_top)

# 10 titles with the fewest days in the top 10
title_count_least <- netflix_data %>% count(title) %>% rename(days_in_top_10 = n) %>%
  arrange(days_in_top_10) %>% head(10)
View(title_count_least)

#To get the type count and production count, keep one row per title; unique() eliminates duplicate rows
netflix_unique <- netflix_data %>% select(title, production, type)
View(netflix_unique)

netflix_unique <- unique(netflix_unique) %>% arrange(title)
View(netflix_unique)

#Type count
type_count <- netflix_unique %>% count(type) %>% rename(count = n)
View(type_count)

#Production count
production_count <- netflix_unique %>% count(production) %>% rename(count = n)
View(production_count)

# Top 10 TV Shows
tv_show <- filter(netflix_data,type=="tv show") %>% count(title) %>% rename(days_in_top_10 = n) %>%
  arrange(-days_in_top_10) %>% head(10)
View(tv_show)

# Top 10 Movies
movies <- filter(netflix_data,type=="movie") %>% count(title) %>% rename(days_in_top_10 = n) %>%
  arrange(-days_in_top_10) %>% head(10)
View(movies)

# Top 6 Stand-Up Comedy
comedy <- filter(netflix_data,type=="stand-up comedy") %>% count(title) %>% rename(days_in_top_10 = n) %>%
  arrange(-days_in_top_10) %>% head(10)
View(comedy)

# 1 concert/performance
concert_and_performance <- filter(netflix_data,type=="concert/performance") %>% count(title) %>% rename(days_in_top_10 = n) %>%
  arrange(-days_in_top_10) %>% head(10)
View(concert_and_performance)
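
The four per-type rankings above repeat the same filter, count, arrange and head steps. A small helper could express that pattern once. This is a sketch under the assumption that the data has `title` and `type` columns as in the cleaned data; the function name `top_titles_by_type` and the toy data are invented for the example.

```r
library(dplyr)

# Hypothetical helper: count the days each title of one type appears
# in the data and keep the n titles with the most days.
top_titles_by_type <- function(data, wanted_type, n = 10) {
  data %>%
    filter(type == wanted_type) %>%
    count(title, name = "days_in_top_10") %>%
    arrange(desc(days_in_top_10)) %>%
    head(n)
}

# Toy data: one row per title per day in the top 10.
toy <- tibble(
  title = c("a", "a", "a", "b", "b", "c", "d"),
  type  = c("movie", "movie", "movie", "movie", "movie", "tv show", "movie")
)

top_movies <- top_titles_by_type(toy, "movie", n = 2)
```

With the real data, a call such as `top_titles_by_type(netflix_data, "tv show")` would replace the corresponding block.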

## Visualize

In [None]:
# Visualization

#Title visualization. fct_reorder orders the titles by days in the top 10 and fct_rev reverses the levels so the largest bar comes first.
ggplot(data = title_count_top) +
  geom_col(mapping = aes(x=fct_rev(fct_reorder(title, days_in_top_10)), y=days_in_top_10, fill=days_in_top_10)) +
  labs(title = "Top 10 Titles", x=NULL, y="Number of days" ) +
  theme(legend.position = "none") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  geom_text(aes(x= title, y = days_in_top_10, label = days_in_top_10, vjust = 0))
    
ggplot(data = title_count_least) +
  geom_col(mapping = aes(x=fct_rev(fct_reorder(title, days_in_top_10)), y=days_in_top_10, fill=days_in_top_10)) +
  labs(title = "10 Titles with the Fewest Days in the Top 10", x=NULL, y="Number of days" ) +
  theme(legend.position = "none") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  geom_text(aes(x= title, y = days_in_top_10, label = days_in_top_10, vjust = 0))
    
#Type visualization
ggplot(data=type_count) +
  geom_col(mapping = aes(x= fct_rev(fct_reorder(type, count)), y=count, fill=count)) +
  labs(title = "Types Count", x = "Types", y = "Count")+
  theme(legend.position = "none") +
  geom_text(aes(x= type, y = count, label = count, vjust = -0.2))

#Production visualization
ggplot(data=production_count) +
  geom_col(mapping = aes(x= fct_rev(fct_reorder(production, count)), y=count, fill=count)) +
  labs(title = "Production Count", x = NULL, y = "Count") +
  theme(legend.position = "none") +
  geom_text(aes(x=production, y = count, label = count, vjust = -0.2))

#Top 10 TV Shows visualization
ggplot(data = tv_show) +
  geom_col(mapping = aes(x=fct_rev(fct_reorder(title, days_in_top_10)), y=days_in_top_10, fill=days_in_top_10)) +
  labs(title = "Top 10 TV Shows", x=NULL, y="Number of days" ) +
  theme(legend.position = "none") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  geom_text(aes(x= title, y = days_in_top_10, label = days_in_top_10, vjust = 0))
    
#Top 10 Movies
ggplot(data = movies) +
  geom_col(mapping = aes(x=fct_rev(fct_reorder(title, days_in_top_10)), y=days_in_top_10, fill=days_in_top_10)) +
  labs(title = "Top 10 Movies", x=NULL, y="Number of days" ) +
  theme(legend.position = "none") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  geom_text(aes(x= title, y = days_in_top_10, label = days_in_top_10, vjust = 0))
    
#Top 6 Stand-up Comedy
ggplot(data = comedy) +
  geom_col(mapping = aes(x=fct_rev(fct_reorder(title, days_in_top_10)), y=days_in_top_10, fill=days_in_top_10)) +
  labs(title = "Top 6 Stand-Up Comedy", x=NULL, y="Number of days" ) +
  theme(legend.position = "none") +
  theme(axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1)) +
  geom_text(aes(x= title, y = days_in_top_10, label = days_in_top_10, vjust = 0))
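
The title charts above share the same layers, so they could be produced by one plotting function. This is a sketch, not the code used for the charts in this notebook: it assumes a data frame with `title` and `days_in_top_10` columns, uses `fct_reorder(..., .desc = TRUE)` in place of `fct_rev(fct_reorder(...))`, and puts the shared aesthetics in `ggplot()` so the bars and the labels use the same ordered axis. The function name `plot_title_days` and the toy data are invented for the example.

```r
library(ggplot2)
library(forcats)

# Hypothetical helper: bar chart of titles ordered by days in the top 10.
plot_title_days <- function(data, plot_title) {
  ggplot(data, aes(x = fct_reorder(title, days_in_top_10, .desc = TRUE),
                   y = days_in_top_10)) +
    geom_col(aes(fill = days_in_top_10)) +
    geom_text(aes(label = days_in_top_10), vjust = -0.2) +
    labs(title = plot_title, x = NULL, y = "Number of days") +
    theme(legend.position = "none",
          axis.text.x = element_text(angle = 45, vjust = 1, hjust = 1))
}

toy <- data.frame(title = c("a", "b", "c"), days_in_top_10 = c(5, 9, 2))
p <- plot_title_days(toy, "Top titles (toy data)")
```

Calling `plot_title_days(tv_show, "Top 10 TV Shows")` would then give the same kind of chart for any of the ranking tables.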

## Findings
1. There are 595 titles.
2. Cocomelon spent the most days in the top 10 during the pandemic. Every other title in the top 10 had far fewer days than Cocomelon.
3. 377 of the titles are Netflix exclusive.
4. Movie is the most watched type, measured by the number of distinct titles in the top 10.
5. The top 10 TV shows were Cocomelon, Manifest, The Queen's Gambit, Outer Banks, Squid Game, All American, Bridgerton, Cobra Kai, Lucifer, and Virgin River.
6. The top 10 movies were The Mitchells vs. The Machines, How the Grinch Stole Christmas, Vivo, 365 Days, Illumination Presents The Grinch, The Christmas Chronicles 2, We Can Be Heroes, Red Notice, The Unforgivable, and Home.
7. The top 6 stand-up comedy titles included Dave Chappelle: The Closer, Kevin Hart: Zero F**ks Given, George Lopez: We'll Do It for Half, Chris D'Elia: No Pain, and Bo Burnham: Inside.