# Introduction

The [YouTube Data API](https://developers.google.com/youtube/v3) lets users retrieve video data by search term, topic, location, publication date, and more.

Follow [this guide](https://developers.google.com/youtube/v3/getting-started) to create your own API key.

# Google API

The code below builds a service object (the `youtube` variable, a resource object) with the Google API Python client. Its built-in methods wrap the API endpoints, so no request URLs need to be written by hand.

In [None]:
#!pip install google-api-python-client

# import library
import googleapiclient.discovery

# connect API
api_service_name = 'youtube'
api_version = 'v3'
developer_key = 'yourkey'

youtube = googleapiclient.discovery.build(
        api_service_name, api_version, developerKey = developer_key)
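Hard-coding the key in a notebook makes it easy to leak when the notebook is shared. The following is a minimal sketch of reading the key from an environment variable instead. The variable name `YOUTUBE_API_KEY` and the helper `load_api_key` are assumptions made for illustration; neither is part of the Google client library.

```python
import os

def load_api_key(env_var='YOUTUBE_API_KEY'):
    """Return the API key stored in the given environment variable.

    The default variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, '').strip()
    if not key:
        raise RuntimeError(f'Set the {env_var} environment variable to your API key')
    return key
```

With the variable set, `developer_key = load_api_key()` could replace the placeholder string above.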

# Get Channel Overview
Given a channel id, the code below gets overview data such as total subscriber and view counts. Use [this website](https://commentpicker.com/youtube-channel-id.php) to find the id of a specific channel.

In [None]:
# function to get channel data
def getChannel(channel_id):
  request = youtube.channels().list(
      part='snippet,contentDetails,statistics',
      id=channel_id)
  response = request.execute()
  return response['items'][0]

# function to get channel stats
def getChannelStats(channel_id):
  result = getChannel(channel_id)
  return result['statistics']
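The API returns the count fields in `statistics` as strings, so they usually need converting before any arithmetic. The sketch below shows one way to do that on a dictionary shaped like the `statistics` part. The helper name `parse_channel_stats` and the sample values are made up for illustration.

```python
def parse_channel_stats(stats):
    """Convert numeric count fields from strings to ints; leave other fields as-is."""
    parsed = {}
    for key, value in stats.items():
        if key.endswith('Count') and isinstance(value, str) and value.isdigit():
            parsed[key] = int(value)
        else:
            # e.g. hiddenSubscriberCount is a boolean, not a count
            parsed[key] = value
    return parsed

# Made-up values shaped like the 'statistics' part of a channel resource.
sample_stats = {
    'viewCount': '123456',
    'subscriberCount': '7890',
    'hiddenSubscriberCount': False,
    'videoCount': '42',
}
print(parse_channel_stats(sample_stats))
```

In the notebook this could be applied as `parse_channel_stats(getChannelStats(channel_id))`.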

# Get Video Metadata

Given a channel id, the code below gets the uploads playlist id. The uploads playlist contains all uploaded videos on a channel.

In [None]:
# function to get uploads playlist id
def getUploadsId(channel_id):
  result = getChannel(channel_id)
  return result['contentDetails']['relatedPlaylists']['uploads']

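Given a channel id, the code below gets the playlists on that channel.

In [None]:
# function to get playlists
def getPlaylists(channel_id):
  # channelId filters playlists by channel; the id parameter expects playlist ids
  request = youtube.playlists().list(
      part='snippet',
      channelId=channel_id)
  response = request.execute()
  return response

In [None]:
getPlaylists('UCLtREJY21xRfCuEKvdki1Kw')

The response below was recorded with the channel id passed as `id`, which matches no playlist and therefore returns an empty `items` list:

{'kind': 'youtube#playlistListResponse',
 'etag': 'RuuXzTIr0OoDqI4S0RU6n4FqKEM',
 'pageInfo': {'totalResults': 0, 'resultsPerPage': 5},
 'items': []}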

Given an uploads playlist id, the code below gets a list of video ids of all videos in the playlist.

In [None]:
# helper function to get one page of up to 50 playlist items
def getVideoIdsPage(uploads_id, page_token):
  request = youtube.playlistItems().list(
      part='snippet',
      playlistId = uploads_id,
      pageToken = page_token,
      maxResults = 50)
  response = request.execute()
  return response

# function to get list of all video ids
def getVideoIds(uploads_id):
  # output list
  output = []
  # variable for page token (None requests the first page)
  nextPageToken = None
  # loop through pages
  while True:
    temp = getVideoIdsPage(uploads_id,nextPageToken)
    # extract video ids into list
    video_id_list = [i['snippet']['resourceId']['videoId'] for i in temp['items']]
    # append to output list
    output.extend(video_id_list)
    # the last page carries no nextPageToken
    nextPageToken = temp.get('nextPageToken')
    if nextPageToken is None:
      print('no more pages')
      break
    print('added 50 videos')
  return output
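The `nextPageToken` loop above can be tried without network access or quota. The sketch below factors the loop into a generator and runs it against made-up pages that stand in for `playlistItems().list` responses. The names `iter_pages` and `fake_pages` are hypothetical and exist only for this illustration.

```python
def iter_pages(fetch_page):
    """Yield each page returned by fetch_page(page_token) until no token remains."""
    page_token = None  # None requests the first page
    while True:
        page = fetch_page(page_token)
        yield page
        page_token = page.get('nextPageToken')
        if page_token is None:
            break

# Made-up pages keyed by the token that requests them.
fake_pages = {
    None: {'items': ['a', 'b'], 'nextPageToken': 'p2'},
    'p2': {'items': ['c'], 'nextPageToken': 'p3'},
    'p3': {'items': ['d']},  # last page: no nextPageToken
}

all_items = [item for page in iter_pages(fake_pages.get) for item in page['items']]
print(all_items)  # ['a', 'b', 'c', 'd']
```

With the real client, `fetch_page` could be `lambda token: getVideoIdsPage(uploads_id, token)`.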

Given a list of video ids, the code below builds a DataFrame with one row per video and its metadata.

In [None]:
# import library
import pandas as pd

# helper function to get metadata for a batch of up to 50 video ids
def getVideoStatsPage(video_ids):
  # videos().list accepts a comma-separated list of ids
  request = youtube.videos().list(
      part='snippet,contentDetails,liveStreamingDetails,statistics',
      id=','.join(video_ids))
  response = request.execute()
  output = pd.json_normalize(response,record_path=['items'])
  return output

# function to get video metadata
def getVideoStats(video_id_list):
  # one request per batch of 50 ids instead of one request per video
  pages = [getVideoStatsPage(video_id_list[i:i + 50])
           for i in range(0, len(video_id_list), 50)]
  output = pd.concat(pages, ignore_index=True) if pages else pd.DataFrame()
  # reindex keeps the column order and fills columns absent from the response with NaN
  return output.reindex(columns=[
                 'id','snippet.publishedAt','snippet.title',
                 'snippet.description','snippet.tags','contentDetails.duration',
                 'liveStreamingDetails.scheduledStartTime','liveStreamingDetails.actualStartTime',
                 #'liveStreamingDetails.scheduledEndTime','liveStreamingDetails.actualEndTime',
                 #'liveStreamingDetails.concurrentViewers','liveStreamingDetails.activeLiveChatId',
                 'statistics.viewCount','statistics.favoriteCount',
                 'statistics.likeCount','statistics.commentCount'])
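The `contentDetails.duration` column holds ISO 8601 duration strings such as `PT15M33S`, which are awkward to sort or aggregate. The sketch below converts them to seconds with a regular expression. It assumes only day, hour, minute and second components appear; the helper name `duration_to_seconds` is hypothetical.

```python
import re

# Matches durations like 'PT1H2M3S', 'PT45S' or 'P1DT2H'.
# Week, month and year components are not handled in this sketch.
_DURATION_RE = re.compile(
    r'^P(?:(?P<days>\d+)D)?'
    r'(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?$'
)

def duration_to_seconds(duration):
    """Convert an ISO 8601 duration string to a number of seconds."""
    match = _DURATION_RE.match(duration)
    if match is None:
        raise ValueError(f'Unsupported duration: {duration!r}')
    parts = {name: int(value or 0) for name, value in match.groupdict().items()}
    return (parts['days'] * 86400 + parts['hours'] * 3600
            + parts['minutes'] * 60 + parts['seconds'])

print(duration_to_seconds('PT1H2M3S'))  # 3723
```

On the DataFrame returned by `getVideoStats`, this could be applied with `df['contentDetails.duration'].map(duration_to_seconds)`.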

# Get Subscribers Metadata

In [None]:
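# function to get the subscriptions of the authenticated user
# note: mine=True requires OAuth 2.0 authorization; an API key alone is not sufficient
def getSubscriptions():
  request = youtube.subscriptions().list(
      part='snippet,contentDetails,subscriberSnippet',
      mine = True)
  response = request.execute()
  return response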

In [None]:
getSubscriptions()

# Example implementation
The code below chains the functions above to collect metadata for every video uploaded by an example channel.

In [None]:
h3 = getVideoStats(getVideoIds(getUploadsId('UCLtREJY21xRfCuEKvdki1Kw')))

added 50 videos
added 50 videos
added 50 videos
added 50 videos
added 50 videos
added 50 videos
added 50 videos
added 50 videos
added 50 videos
added 50 videos
added 50 videos
no more pages


# Export Data
The code below writes the DataFrame to a CSV file and downloads it from Google Colab to the local machine.

In [None]:
# import library
from google.colab import files

# index=False leaves the DataFrame index out of the file
h3.to_csv('h3Videos.csv', index=False)
files.download('h3Videos.csv')
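The `google.colab` import only works inside Colab. The sketch below is one possible way to make the export step run elsewhere too. It writes the file in every environment and triggers the browser download only when the `google.colab` package is found. The helper names `running_in_colab` and `export_csv` are hypothetical.

```python
import importlib.util

def running_in_colab():
    """Return True when the google.colab package can be found."""
    try:
        return importlib.util.find_spec('google.colab') is not None
    except (ImportError, ValueError):
        # the parent 'google' package is missing or has no usable spec
        return False

def export_csv(df, path):
    """Write df to path; in Colab, also download the file through the browser."""
    df.to_csv(path)
    if running_in_colab():
        from google.colab import files
        files.download(path)
    return path
```

Outside Colab, `export_csv(h3, 'h3Videos.csv')` would simply leave the file in the working directory.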
