# Diving into YouTube Analytics
Deepnote presented a live stream showcasing how to use Google's YouTube API. A few Deepnote coders and a Deepnote user, Allan, helped crack the API and start pulling data. Below is a project inspired by that stream!

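In this notebook we are going to dive in and check out the popular YouTube channel Good Mythical Morning (GMM). Note that the `CHANNEL_ID` set below, and the sample outputs shown throughout, come from the sentdex channel; swap in the ID of whichever channel you want to analyze.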

In [1]:
from googleapiclient.discovery import build
import os
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

In [10]:
CHANNEL_ID = "UCfzlCWGWYyIQ0aLC5w48gBQ"
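# Read the key from the environment instead of hardcoding it in the notebook.
# YOUTUBE_API_KEY is an assumed variable name; use whatever your environment defines.
API_KEY = os.environ["YOUTUBE_API_KEY"]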
youtube = build('youtube', 'v3', developerKey=API_KEY)

## API Functions
In this section we set up some API functions to help us gather data: one for channel stats and one for video stats. To do this efficiently and save API calls, we do not use the search endpoint. Instead we pull in the channel's uploads playlist (which contains all of its videos), use it to build a list of video IDs, and then grab the data for each video.
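To make the saving concrete, here is a rough sketch of the quota arithmetic. It assumes the per-call costs listed in the YouTube Data API quota documentation at the time of writing (100 units for `search.list`, 1 unit each for `playlistItems.list` and `videos.list`); `estimate_quota` is a hypothetical helper, not part of the API client.

```python
import math

# Assumed quota costs per call; check the current quota documentation before relying on them.
PAGE_SIZE = 50
SEARCH_LIST_COST = 100
PLAYLIST_ITEMS_LIST_COST = 1
VIDEOS_LIST_COST = 1


def estimate_quota(n_videos):
    """Estimate quota units needed to list n_videos and fetch their details."""
    pages = math.ceil(n_videos / PAGE_SIZE)
    return {
        # One search page plus one videos.list call per 50 videos.
        "search_then_videos": pages * (SEARCH_LIST_COST + VIDEOS_LIST_COST),
        # One playlistItems page plus one videos.list call per 50 videos.
        "uploads_playlist_then_videos": pages * (PLAYLIST_ITEMS_LIST_COST + VIDEOS_LIST_COST),
    }


print(estimate_quota(1000))
```

Under those assumed costs, a channel with 1,000 videos needs 20 pages either way, but the uploads-playlist route spends a small fraction of the units.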

In [11]:
# Function to get the channel's stats.
# The response also contains the uploads playlist ID we can use to grab videos.
def get_channel_stats(youtube, channel_id):
    request = youtube.channels().list(
        part="snippet,contentDetails,statistics",
        id=channel_id
    )
    response = request.execute()
    
    return response['items']

In [12]:
# This will get us a list of videos from a playlist.
# Note a page of results has a max value of 50 so we will
# need to loop over our results with a pageToken

def get_video_list(youtube, upload_id):
    video_list = []
    request = youtube.playlistItems().list(
        part="snippet,contentDetails",
        playlistId=upload_id,
        maxResults=50
    )
    next_page = True
    while next_page:
        response = request.execute()
        data = response['items']

        for video in data:
            video_id = video['contentDetails']['videoId']
            if video_id not in video_list:
                video_list.append(video_id)

        # Do we have more pages?
        if 'nextPageToken' in response.keys():
            next_page = True
            request = youtube.playlistItems().list(
                part="snippet,contentDetails",
                playlistId=upload_id,
                pageToken=response['nextPageToken'],
                maxResults=50
            )
        else:
            next_page = False

    return video_list
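As an alternative sketch of the same paging loop, the Google API Python client exposes a `list_next(previous_request, previous_response)` method on paginated collections, which builds the next request or returns `None` when there are no more pages. The function name `get_video_ids` is made up for this example, and it requests only the `contentDetails` part since that is where the video ID lives.

```python
def get_video_ids(youtube, upload_id):
    """Collect unique video IDs from a playlist, following pages with list_next."""
    playlist_items = youtube.playlistItems()
    request = playlist_items.list(
        part="contentDetails",
        playlistId=upload_id,
        maxResults=50,
    )

    video_ids = []
    seen = set()  # set lookups stay fast even for channels with thousands of videos
    while request is not None:
        response = request.execute()
        for item in response.get("items", []):
            video_id = item["contentDetails"]["videoId"]
            if video_id not in seen:
                seen.add(video_id)
                video_ids.append(video_id)
        # list_next returns None once the response has no nextPageToken.
        request = playlist_items.list_next(request, response)

    return video_ids
```

It would be called the same way as `get_video_list`, for example `get_video_ids(youtube, upload_id)`.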

In [18]:
# Once we have our video list we can pass it to this function to get details.
# Again we have a max of 50 at a time so we will use a for loop to break up our list. 

def get_video_details(youtube, video_list):
    stats_list = []

    # Can only get 50 videos at a time.
    for i in range(0, len(video_list), 50):
        request = youtube.videos().list(
            part="snippet,contentDetails,statistics",
            # The id parameter takes a comma-separated string of video IDs.
            id=",".join(video_list[i:i + 50])
        )

        data = request.execute()
        for video in data['items']:
            snippet = video['snippet']
            stats = video['statistics']
            stats_dict = dict(
                title=snippet['title'],
                description=snippet['description'],
                published=snippet['publishedAt'],
                # Counts come back as strings and may be missing, so default to 0.
                view_count=stats.get('viewCount', 0),
                like_count=stats.get('likeCount', 0),
                # Dislike counts are no longer public, so this is normally 0.
                dislike_count=stats.get('dislikeCount', 0),
                comment_count=stats.get('commentCount', 0),
            )
            stats_list.append(stats_dict)

    return stats_list
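The batching step is easy to check on its own. The sketch below uses a hypothetical `chunked` helper and made-up video IDs to show how a list splits into groups of at most 50, and how one group becomes the comma-separated string that the `id` parameter of `videos().list` accepts.

```python
def chunked(items, size=50):
    """Yield consecutive slices of items, each with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Made-up IDs purely for illustration.
ids = [f"video{n}" for n in range(120)]

batch_sizes = [len(batch) for batch in chunked(ids)]
print(batch_sizes)

# One batch joined into a single comma-separated string for the id parameter.
first_id_param = ",".join(next(chunked(ids)))
print(first_id_param[:30])
```

With 120 IDs this produces two full batches and one partial batch, so three `videos().list` calls would be needed.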

## Create our Channel Stats

In [19]:
channel_stats = get_channel_stats(youtube, CHANNEL_ID)

## Find our Upload Playlist (contains all video uploads)

In [20]:
upload_id = channel_stats[0]['contentDetails']['relatedPlaylists']['uploads']
upload_id

'UUfzlCWGWYyIQ0aLC5w48gBQ'
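The same response item also carries the channel's headline numbers. Below is a minimal sketch of pulling them out; `summarize_channel` is a hypothetical helper, and `example_item` is a made-up stand-in shaped like one entry of `channel_stats`, not real data.

```python
def summarize_channel(channel_item):
    """Pull a few headline numbers out of one item returned by channels().list."""
    stats = channel_item["statistics"]
    return {
        "title": channel_item["snippet"]["title"],
        # The API returns counts as strings; subscriberCount can be absent if hidden.
        "views": int(stats.get("viewCount", 0)),
        "subscribers": int(stats.get("subscriberCount", 0)),
        "videos": int(stats.get("videoCount", 0)),
        "uploads_playlist": channel_item["contentDetails"]["relatedPlaylists"]["uploads"],
    }


# Made-up item for illustration only; real values come from get_channel_stats.
example_item = {
    "snippet": {"title": "Example Channel"},
    "statistics": {"viewCount": "12345", "subscriberCount": "678", "videoCount": "90"},
    "contentDetails": {"relatedPlaylists": {"uploads": "UUexample"}},
}

summary = summarize_channel(example_item)
print(summary)
```

In the notebook this would be called as `summarize_channel(channel_stats[0])`.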

## Get our Video List

In [21]:
video_list = get_video_list(youtube, upload_id)

## Get our Video Details
Finally we will get the details for all of our videos, returned as a list of dictionaries (one per video).

In [22]:
video_data = get_video_details(youtube, video_list)

## Creating Visualizations
In this section we will convert our data to a pandas DataFrame and start visualizing. The cell below also saves the data to a CSV, which you can load instead if you don't want to make the API calls yourself.
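As a rough sketch of the CSV round trip, the example below uses a tiny made-up sample and an in-memory buffer in place of a real file. Passing `index_col=0` to `pd.read_csv` treats the saved index as the index again, so it does not reappear as an `Unnamed: 0` column.

```python
import io

import pandas as pd

# Made-up rows in the same shape as the API output (counts arrive as strings).
sample_rows = [
    {"title": "Video A", "published": "2022-03-23T15:50:54Z", "view_count": "100", "like_count": "10", "comment_count": "2"},
    {"title": "Video B", "published": "2022-03-21T14:01:08Z", "view_count": "250", "like_count": "20", "comment_count": "5"},
]
sample_df = pd.DataFrame(sample_rows)

# Convert all count columns to numbers in one step.
count_cols = ["view_count", "like_count", "comment_count"]
sample_df[count_cols] = sample_df[count_cols].apply(pd.to_numeric)

# Write to an in-memory CSV and read it back, as you would with a saved file.
buffer = io.StringIO()
sample_df.to_csv(buffer)
buffer.seek(0)
reloaded = pd.read_csv(buffer, index_col=0, parse_dates=["published"])

print(reloaded.dtypes)
```

With a file on disk, the last step would be `pd.read_csv("GMM-Data.csv", index_col=0)`.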

In [26]:
df=pd.DataFrame(video_data)
df['title_length'] = df['title'].str.len()
df["view_count"] = pd.to_numeric(df["view_count"])
df["like_count"] = pd.to_numeric(df["like_count"])
df["dislike_count"] = pd.to_numeric(df["dislike_count"])
df["comment_count"] = pd.to_numeric(df["comment_count"])
# reactions (used later): likes + dislikes + comments
df["reactions"] = df["like_count"] + df["dislike_count"] + df["comment_count"]
df.to_csv("GMM-Data.csv")
df.head()

Unnamed: 0,title,description,published,view_count,like_count,dislike_count,comment_count,title_length,reactions
0,Better tracking for your deep learning trainin...,Introduction and overview of Weights and Biase...,2022-03-23T15:50:54Z,11946,323,0,30,77,353
1,Why & how two or more hidden layers w/ nonline...,More information: https://nnfs.io/mvp\n\nChann...,2022-03-21T14:01:08Z,9471,284,0,32,99,316
2,WHERE NNFS?! Signed 3090 & Upcoming Content,GTC Signup: https://nvda.ws/3oi1vfE\n\nTo be e...,2022-03-09T16:31:50Z,12083,455,0,74,43,529
3,Attacking and Defeating the Enemy - Starcraft ...,Welcome to part 3 of the Starcraft AI with Pyt...,2022-03-07T15:21:37Z,11826,304,0,43,68,347
4,Starcraft 2 AI with Python - Building Defenses...,Adding logic to the Starcraft 2 artificial int...,2022-03-06T16:05:21Z,15185,465,0,36,52,501

## Most Viewed Videos

In [27]:
df_highest_views = df.nlargest(10, 'view_count')
df_highest_views['title'] = df_highest_views['title'].str[:40]
df_highest_views['view_count_millions'] = df_highest_views['view_count'] / 1000000
df_highest_views

Unnamed: 0,title,description,published,view_count,like_count,dislike_count,comment_count,title_length,reactions,view_count_millions
488,Practical Machine Learning Tutorial with,The objective of this course is to give you a ...,2016-04-11T00:18:14Z,2456599,23347,0,1005,57,24352,2.456599
320,Self driving car neural network in the c,"In this self-driving car with Python video, I ...",2017-04-21T18:16:48Z,1549485,22202,0,1534,84,23736,1.549485
487,Regression Intro - Practical Machine Lea,"To begin, what is regression in terms of us us...",2016-04-11T00:18:35Z,1291968,9929,0,1451,70,11380,1.291968
509,Introduction - Django Web Development wi,Welcome to a Django web development with Pytho...,2016-01-19T15:39:38Z,1078482,9024,0,740,51,9764,1.078482
703,How to download and install Python Packa,This tutorial covers how to download and insta...,2015-01-21T22:29:46Z,1070970,4964,0,1163,64,6127,1.07097
193,"Deep Learning with Python, TensorFlow, a",An updated deep learning introduction using Py...,2018-08-11T14:17:31Z,1060731,17820,0,1448,57,19268,1.060731
798,Game Development in Python 3 With PyGame,"In this video, we introduce how to make video ...",2014-08-27T18:34:41Z,930475,8679,0,1154,52,9833,0.930475
419,What I do for a living - Q&A #1,"Sentdex Q&A. To start, I answer how I learned ...",2016-11-05T13:28:35Z,820113,13925,0,978,31,14903,0.820113
52,Neural Networks from Scratch - P.1 Intro,Building neural networks from scratch in Pytho...,2020-04-11T13:49:09Z,817014,23098,0,1520,56,24618,0.817014
385,Intro and Getting Stock Price Data - Pyt,Welcome to a Python for Finance tutorial serie...,2017-01-17T18:19:51Z,773862,8897,0,617,71,9514,0.773862
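One way to turn this table into a chart is a horizontal bar plot. The sketch below is only an illustration: it rebuilds the first three rows of the table above by hand and uses plain matplotlib, so it runs without the API. In the notebook you would pass `df_highest_views` instead of `top_videos`.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend for scripts; drop this line in a notebook

import matplotlib.pyplot as plt
import pandas as pd

# First three rows of the most-viewed table, copied by hand for illustration.
top_videos = pd.DataFrame(
    {
        "title": [
            "Practical Machine Learning Tutorial with",
            "Self driving car neural network in the c",
            "Regression Intro - Practical Machine Lea",
        ],
        "view_count_millions": [2.456599, 1.549485, 1.291968],
    }
)

fig, ax = plt.subplots(figsize=(8, 3))
ax.barh(top_videos["title"], top_videos["view_count_millions"])
ax.invert_yaxis()  # put the most viewed video at the top
ax.set_xlabel("Views (millions)")
ax.set_title("Most viewed videos")
fig.tight_layout()
```

Truncating the titles to 40 characters, as the cell above does, keeps the axis labels from crowding out the bars.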
