## YouTube Trending Project
Analyzing data from the YouTube trending page in English-speaking countries (US, CA, GB)
over a span of five days (October 23-27, 2020).

Goals:
* To understand common characteristics of trending videos in different countries

* To predict engagement (likes or comments) on a video in English-speaking countries

## Table of Contents:
* 1. Data Overview
    * 1.1 Data Analysis 
* 2. Cleaning
* 3. Modeling

### 1. Data Overview
* 1.1 Data Analysis

In [None]:
import pandas as pd
import numpy as np

import matplotlib.pyplot as plt

# Reading the individual data
us_data = pd.read_csv("../YouTube-Trending/Data/US_10.23-27.20.csv")
ca_data = pd.read_csv("../YouTube-Trending/Data/CA_10.23-27.20.csv")
gb_data = pd.read_csv("../YouTube-Trending/Data/GB_10.23-27.20.csv")

# Reading the stitched data
trend_data = pd.read_csv("../YouTube-Trending/Data/INT_10.23-27.20.csv")

trend_data.head()

Unnamed: 0,video_id,title,publishedAt,channelId,channelTitle,categoryId,trending_date,tags,view_count,likes,dislikes,comment_count,thumbnail_link,comments_disabled,ratings_disabled,description,duration,country
0,bPiofmZGb8o,Second 2020 Presidential Debate between Donald...,2020-10-23T02:49:33Z,UCb--64Gl51jIEVE-GLDAVTg,C-SPAN,25,20.23.10,C-SPAN|CSPAN|2020|Donald Trump|Republican|Whit...,6641600,94601,6209,59293,https://i.ytimg.com/vi/bPiofmZGb8o/default.jpg,False,False,President Donald Trump and former Vice Preside...,1H59M15S,US
1,tcYodQoapMg,Ariana Grande - positions (official video),2020-10-23T04:00:10Z,UC0VOyT2OCBKdQhF3BAbZ-1g,ArianaGrandeVevo,10,20.23.10,ariana grande positions|positions ariana grand...,7516529,1485130,10810,140549,https://i.ytimg.com/vi/tcYodQoapMg/default.jpg,False,False,The official “positions” music video by Ariana...,2M58S,US
2,np9Ub1LilKU,Jack Harlow - Tyler Herro [Official Video],2020-10-22T19:00:14Z,UC6vZl7Qj7JglLDmN_7Or-ZQ,Jack Harlow,10,20.23.10,jack harlow|jack rapper|harlow rapper|private ...,1499338,153028,2006,11013,https://i.ytimg.com/vi/np9Ub1LilKU/default.jpg,False,False,Jack Harlow - Tyler HerroListen now: https://J...,3M,US
3,5S4bm3bAt9Y,SURPRISING BEST FRIEND WITH BORAT!!,2020-10-21T19:56:24Z,UCef29bYGgUSoJjVkqhcAPkw,David Dobrik Too,22,20.23.10,[none],5320147,596894,7044,33648,https://i.ytimg.com/vi/5S4bm3bAt9Y/default.jpg,False,False,Thank you Borat for coming over!! I like youWa...,5M55S,US
4,GuEkHIgR46k,Bryson Tiller - Always Forever (Official Video),2020-10-22T16:00:08Z,UCwhe-6skwaZxLomc-U6Wy1w,BrysonTillerVEVO,10,20.23.10,Bryson Tiller 2020|Bryson Tiller Serenity|Brys...,862087,82059,657,4459,https://i.ytimg.com/vi/GuEkHIgR46k/default.jpg,False,False,A N N I V E R S A R Y OUT NOW!Stream/Download:...,2M59S,US


In [None]:
# Descriptive Statistics of the dataset
trend_data.describe()

Unnamed: 0,categoryId,view_count,likes,dislikes,comment_count
count,3000.0,3000.0,3000.0,3000.0,3000.0
mean,19.094667,2359126.0,140185.9,2750.479,13516.542333
std,7.349493,4063331.0,285631.4,5968.530741,38313.036653
min,1.0,0.0,0.0,0.0,0.0
25%,10.0,464569.5,17962.0,335.0,1558.75
50%,22.0,1019797.0,50548.5,776.0,4117.0
75%,24.0,2503263.0,138024.2,2332.75,10550.0
max,29.0,47984270.0,2934776.0,76339.0,667717.0


In [None]:
trend_data.shape

(3000, 18)

In [None]:
trend_data.dtypes

video_id             object
title                object
publishedAt          object
channelId            object
channelTitle         object
categoryId            int64
trending_date        object
tags                 object
view_count            int64
likes                 int64
dislikes              int64
comment_count         int64
thumbnail_link       object
comments_disabled      bool
ratings_disabled       bool
description          object
duration             object
country              object
dtype: object
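
The `dtypes` output shows that `publishedAt`, `trending_date`, and `duration` are all stored as plain strings. A minimal sketch of how they might be converted is given below. It assumes that `trending_date` follows a `YY.DD.MM` layout (inferred from values such as `20.23.10`) and that `duration` only ever contains hour, minute, and second parts. The helper `parse_time_columns` and the column `duration_seconds` are hypothetical names, not part of the original notebook.

```python
import pandas as pd

# Small stand-in frame; the values are copied from the head() output above.
sample = pd.DataFrame({
    "publishedAt": ["2020-10-23T02:49:33Z", "2020-10-22T19:00:14Z"],
    "trending_date": ["20.23.10", "20.23.10"],
    "duration": ["1H59M15S", "3M"],
})


def parse_time_columns(df):
    """Return a copy of df with parsed dates and a numeric duration column."""
    out = df.copy()
    # ISO 8601 timestamps with a trailing "Z" parse directly as UTC.
    out["publishedAt"] = pd.to_datetime(out["publishedAt"], utc=True)
    # Assumption: trending_date is year.day.month with a two-digit year.
    out["trending_date"] = pd.to_datetime(out["trending_date"], format="%y.%d.%m")
    # Assumption: duration looks like "1H59M15S", with every part optional.
    parts = out["duration"].str.extract(r"^(?:(\d+)H)?(?:(\d+)M)?(?:(\d+)S)?$")
    parts = parts.apply(pd.to_numeric).fillna(0).astype(int)
    out["duration_seconds"] = parts[0] * 3600 + parts[1] * 60 + parts[2]
    return out


parsed = parse_time_columns(sample)
print(parsed.dtypes)
```

Any duration that does not match the assumed pattern ends up as 0 seconds in this sketch, so it would be worth checking for zeros before relying on the column.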

In [None]:
# Checking for missing data (only the first 10 columns are printed)
missing_val = trend_data.isnull().sum()
print(missing_val[0:10])

video_id         0
title            0
publishedAt      0
channelId        0
channelTitle     0
categoryId       0
trending_date    0
tags             0
view_count       0
likes            0
dtype: int64
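
The check above prints only the first 10 of the 18 columns, so columns such as `description` are not covered by the output. One possible way to summarize every column at once is sketched below. The helper `missing_summary` is a hypothetical name, and the small `demo` frame is made-up illustration data, not a sample of the real dataset.

```python
import numpy as np
import pandas as pd

# Made-up frame for illustration only; not taken from the trending data.
demo = pd.DataFrame({
    "title": ["a", "b", "c"],
    "description": ["x", None, None],
    "likes": [1, 2, np.nan],
})


def missing_summary(df):
    """List only the columns that contain missing values, worst first."""
    counts = df.isnull().sum()
    summary = pd.DataFrame({"missing": counts, "share": counts / len(df)})
    return summary[summary["missing"] > 0].sort_values("missing", ascending=False)


summary = missing_summary(demo)
print(summary)
```

Calling `missing_summary(trend_data)` would return an empty frame if no column has missing values.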


#### Category Dictionary
- 1 - Film & Animation
- 2 - Autos & Vehicles
- 10 - Music
- 17 - Sports
- 19 - Travel & Events
- 20 - Gaming
- 22 - People & Blogs
- 23 - Comedy
- 24 - Entertainment
- 25 - News & Politics
- 26 - Howto & Style
- 27 - Education
- 28 - Science & Technology
- 29 - Nonprofits & Activism
<br> More at: https://gist.github.com/dgp/1b24bf2961521bd75d6c
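
The category dictionary above can be turned into a lookup so that `categoryId` values read as names in later tables and plots. This is a minimal sketch: `CATEGORY_NAMES`, `add_category_name`, and the `category_name` column are hypothetical names, and any ID missing from the list above is labeled `"Unknown"` rather than guessed.

```python
import pandas as pd

# IDs and names copied from the category dictionary above.
CATEGORY_NAMES = {
    1: "Film & Animation",
    2: "Autos & Vehicles",
    10: "Music",
    17: "Sports",
    19: "Travel & Events",
    20: "Gaming",
    22: "People & Blogs",
    23: "Comedy",
    24: "Entertainment",
    25: "News & Politics",
    26: "Howto & Style",
    27: "Education",
    28: "Science & Technology",
    29: "Nonprofits & Activism",
}


def add_category_name(df):
    """Return a copy of df with a readable category_name column."""
    out = df.copy()
    out["category_name"] = out["categoryId"].map(CATEGORY_NAMES).fillna("Unknown")
    return out


# Illustrative IDs; 15 is deliberately one that the list above does not cover.
demo = pd.DataFrame({"categoryId": [25, 10, 22, 15]})
labeled = add_category_name(demo)
print(labeled)
```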



In [None]:
# Figuring out the popular categories
trend_data.categoryId.value_counts()

24    656
10    622
17    408
28    303
22    233
23    168
20    167
26    117
2      92
1      70
25     63
27     60
29     30
19     11
Name: categoryId, dtype: int64
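
The counts above pool all three countries. Since one goal is to compare trending videos across countries, a per-country breakdown may be more informative. The sketch below uses `pd.crosstab` with row normalization. The helper `category_share_by_country` is a hypothetical name, and the `demo` rows are made up purely to show the output shape; they are not results from the dataset.

```python
import pandas as pd

# Made-up rows for illustration only; not taken from the trending data.
demo = pd.DataFrame({
    "country": ["US", "US", "US", "US", "CA", "CA", "GB", "GB"],
    "categoryId": [10, 10, 24, 25, 24, 24, 10, 17],
})


def category_share_by_country(df):
    """Share of each categoryId within each country (every row sums to 1)."""
    return pd.crosstab(df["country"], df["categoryId"], normalize="index")


shares = category_share_by_country(demo)
print(shares)
```

On the real data this would be called as `category_share_by_country(trend_data)`, which relies on the `country` column shown in the `head()` output.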

In [None]:
trend_data

Unnamed: 0,video_id,title,publishedAt,channelId,channelTitle,categoryId,trending_date,tags,view_count,likes,dislikes,comment_count,thumbnail_link,comments_disabled,ratings_disabled,description,duration,country
0,bPiofmZGb8o,Second 2020 Presidential Debate between Donald...,2020-10-23T02:49:33Z,UCb--64Gl51jIEVE-GLDAVTg,C-SPAN,25,20.23.10,C-SPAN|CSPAN|2020|Donald Trump|Republican|Whit...,6641600,94601,6209,59293,https://i.ytimg.com/vi/bPiofmZGb8o/default.jpg,False,False,President Donald Trump and former Vice Preside...,1H59M15S,US
1,tcYodQoapMg,Ariana Grande - positions (official video),2020-10-23T04:00:10Z,UC0VOyT2OCBKdQhF3BAbZ-1g,ArianaGrandeVevo,10,20.23.10,ariana grande positions|positions ariana grand...,7516529,1485130,10810,140549,https://i.ytimg.com/vi/tcYodQoapMg/default.jpg,False,False,The official “positions” music video by Ariana...,2M58S,US
2,np9Ub1LilKU,Jack Harlow - Tyler Herro [Official Video],2020-10-22T19:00:14Z,UC6vZl7Qj7JglLDmN_7Or-ZQ,Jack Harlow,10,20.23.10,jack harlow|jack rapper|harlow rapper|private ...,1499338,153028,2006,11013,https://i.ytimg.com/vi/np9Ub1LilKU/default.jpg,False,False,Jack Harlow - Tyler HerroListen now: https://J...,3M,US
3,5S4bm3bAt9Y,SURPRISING BEST FRIEND WITH BORAT!!,2020-10-21T19:56:24Z,UCef29bYGgUSoJjVkqhcAPkw,David Dobrik Too,22,20.23.10,[none],5320147,596894,7044,33648,https://i.ytimg.com/vi/5S4bm3bAt9Y/default.jpg,False,False,Thank you Borat for coming over!! I like youWa...,5M55S,US
4,GuEkHIgR46k,Bryson Tiller - Always Forever (Official Video),2020-10-22T16:00:08Z,UCwhe-6skwaZxLomc-U6Wy1w,BrysonTillerVEVO,10,20.23.10,Bryson Tiller 2020|Bryson Tiller Serenity|Brys...,862087,82059,657,4459,https://i.ytimg.com/vi/GuEkHIgR46k/default.jpg,False,False,A N N I V E R S A R Y OUT NOW!Stream/Download:...,2M59S,US
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2995,MwOpX9pWsdU,The Sims 4 Snowy Escape: Official Reveal Trailer,2020-10-20T15:00:10Z,UCFXKLSrT-4-Mf5TRqv40rgw,The Sims,20,20.27.10,The Sims 4|The Sims 4 Trailer|The Sims 4 Gamep...,1388968,87260,4820,16863,https://i.ytimg.com/vi/MwOpX9pWsdU/default.jpg,False,False,Experience the chilly thrills of mountain life...,1M30S,CA
2996,DqhMZ6WSR2c,Gold iPhone 12 Pro & 12 Unboxing + MagSafe Acc...,2020-10-22T02:22:01Z,UCx8ZK3Ke8az_sRfGDn0CTOg,TechMe0ut,28,20.27.10,iPhone 12|iPhone 12 pro|iPhone 12 unboxing|iPh...,227054,8898,176,1064,https://i.ytimg.com/vi/DqhMZ6WSR2c/default.jpg,False,False,Unboxing the new iPhone 12 in product red and ...,9M29S,CA
2997,JLcdjysXPgs,Koffee - Pressure (Remix) [Official Video] ft....,2020-10-20T10:00:04Z,UCIOoP9FirTzjvgeAYTJWLcg,KoffeeVEVO,10,20.27.10,koffee|koffee pressure|koffee lockdown|koffee ...,1395821,69083,755,3450,https://i.ytimg.com/vi/JLcdjysXPgs/default.jpg,False,False,Directed by Alicia K. HarrisKoffee feat. Buju ...,4M,CA
2998,k6c6M0rKW7s,Destiny 2 – Beyond Light – Story Reveal Trailer,2020-10-20T13:59:34Z,UC52XYgEExV9VG6Rt-6vnzVA,destinygame,20,20.27.10,Destiny|destiny 2|beyond light|games|gming|bun...,976957,44495,831,6039,https://i.ytimg.com/vi/k6c6M0rKW7s/default.jpg,False,False,"Be careful, Guardians. You aren’t the only one...",2M12S,CA
