# How to use the YouTube Data API: YoutubeDataApi
To find this demo: https://github.com/SMAPPNYU/krn_tools_demo

Megan Brown<br>
Center for Social Media and Politics at NYU<br>
October 13, 2021
<hr>


## Agenda
Today we will discuss:

1. A brief overview of data available in the YouTube Data API
2. How to install the package
3. How to create an API key
4. A brief look at how to use the package

## FAQs

### So what kind of data can you get?

* Short answer: a lot

* Comprehensive answer: [here](https://developers.google.com/youtube/v3/docs/)

* What is included in the package:
    * video metadata
    * channel metadata
    * playlist metadata
    * subscription metadata
    * featured channel metadata
    * comment metadata
    * search results
    
    
#### [Package Reference](https://youtube-data-api.readthedocs.io/en/latest/youtube_api.html)



### What is the difference between a user and a channel?
* Essentially: how YouTube stores the data internally.<br>
* A user is the name that a content creator registers (ex: **LastWeekTonight**). You cannot use this value directly to request more data about the channel.<br>
* The channel id is the internal ID for a given user (ex: **UC3XTzVzaHQEd30rQbuvCtTQ**). You can use this value to get more data about a channel. <br>

###### But fear not, there is a solution!
Use `yt.get_channel_id_from_user(username)` to get the channel id for a given user.

### What is the difference between a featured channel and a subscription?
* A subscription is a channel that a user has opted into getting updates from.
* A featured channel is a channel that another channel chooses to display on its page, to direct its viewers towards it.

## How to Install

   The software is on PyPI, so you can download it via `pip`
   
   
   `pip install youtube-data-api`
   
   If you are following along with the tutorial, run `pip install pandas` too.
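If you want to confirm that both installs worked before continuing, a quick check along these lines may help. This is only a sketch: `is_installed` is a hypothetical helper written for this tutorial, and `youtube_api` is the import name used in the cells later on.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be found on the import path."""
    return importlib.util.find_spec(module_name) is not None


# `youtube_api` is the import name used later in this tutorial.
for name in ("youtube_api", "pandas"):
    status = "installed" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If either line reports `missing`, re-run the matching `pip install` command in the same environment that your notebook uses.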

## How to get an API key

### A quick guide: https://developers.google.com/youtube/v3/getting-started

1. You need a Google Account to access the Google API Console, request an API key, and register your application. You can use your Gmail account for this if you have one.

2. Create a project in the <a href="https://console.developers.google.com/apis/">Google Developers Console</a> and <a href="https://developers.google.com/youtube/registering_an_application">obtain authorization credentials</a> so your application can submit API requests.

3. After creating your project, make sure the YouTube Data API is one of the services that your application is registered to use.

    a. Go to the <a href="https://console.developers.google.com/apis/">API Console</a> and select the project that you just registered.

    b. Visit the <a href="https://console.developers.google.com/apis/enabled">Enabled APIs page</a>. In the list of APIs, make sure the status is ON for the YouTube Data API v3. You do not need to enable OAuth 2.0 since there are no methods in the package that require it.
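Once you have a key, it is usually safer to keep it out of the notebook itself. The cells below read it from an environment variable named `YT_KEY`. A minimal sketch of a loader that fails early when the variable is missing might look like this; `load_api_key` is a hypothetical helper, not part of the package.

```python
import os


def load_api_key(env_var: str = "YT_KEY") -> str:
    """Read the API key from an environment variable, failing loudly if it is unset."""
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Environment variable {env_var!r} is not set. "
            "Export your YouTube Data API key before starting the notebook."
        )
    return key


# Hypothetical usage, once YT_KEY has been exported in your shell:
# yt = YoutubeDataApi(load_api_key())
```

Failing at this point gives a clearer error than passing `None` to the client and only finding out on the first request.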

## A brief overview of how to use the package

In [1]:
import os
import datetime
import pandas as pd

In [2]:
from youtube_api import YoutubeDataApi
from youtube_api.youtube_api_utils import *

yt = YoutubeDataApi(os.environ.get('YT_KEY'))

### Starting with a channel name and getting some basic metadata
https://www.youtube.com/user/LastWeekTonight

In [3]:
channel_id = yt.get_channel_id_from_user('LastWeekTonight')
print(channel_id)

UC3XTzVzaHQEd30rQbuvCtTQ


You can get more information from this `channel_id`:

In [4]:
yt.get_channel_metadata(channel_id)

{'channel_id': 'UC3XTzVzaHQEd30rQbuvCtTQ',
 'title': 'LastWeekTonight',
 'account_creation_date': 1395178899.0,
 'keywords': None,
 'description': 'Breaking news on a weekly basis. Sundays at 11PM - only on HBO.\nSubscribe to the Last Week Tonight channel for the latest videos from John Oliver and the LWT team.',
 'view_count': '3123253533',
 'video_count': '362',
 'subscription_count': '8780000',
 'playlist_id_likes': '',
 'playlist_id_uploads': 'UU3XTzVzaHQEd30rQbuvCtTQ',
 'topic_ids': 'https://en.wikipedia.org/wiki/Television_program|https://en.wikipedia.org/wiki/Film|https://en.wikipedia.org/wiki/Society|https://en.wikipedia.org/wiki/Entertainment',
 'country': None,
 'collection_date': datetime.datetime(2021, 10, 12, 17, 56, 3, 728509)}
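Notice that the counts in the parsed output are strings and the creation date is a Unix timestamp. A small sketch of tidying those types is shown here, using a subset of the values above. `tidy_channel_meta` is a hypothetical helper, and reading the timestamp as UTC is an assumption: the value may instead reflect the timezone of the machine that collected it.

```python
from datetime import datetime, timezone

# Subset of the parsed output shown above.
channel_meta = {
    "channel_id": "UC3XTzVzaHQEd30rQbuvCtTQ",
    "account_creation_date": 1395178899.0,
    "view_count": "3123253533",
    "video_count": "362",
    "subscription_count": "8780000",
}

COUNT_FIELDS = ("view_count", "video_count", "subscription_count")


def tidy_channel_meta(meta: dict) -> dict:
    """Return a copy with count strings cast to int and the timestamp to datetime."""
    tidy = dict(meta)
    for field in COUNT_FIELDS:
        if tidy.get(field) is not None:
            tidy[field] = int(tidy[field])
    ts = tidy.get("account_creation_date")
    if ts is not None:
        # Assumption: interpret the timestamp as seconds since the epoch in UTC.
        tidy["account_creation_date"] = datetime.fromtimestamp(ts, tz=timezone.utc)
    return tidy


tidy = tidy_channel_meta(channel_meta)
# Integer counts can now be used in arithmetic, e.g. average views per video.
print(tidy["view_count"] // tidy["video_count"])
print(tidy["account_creation_date"].date())
```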

The default parser returns the items in the JSON as an `OrderedDict`. Passing `parser=None` returns the raw JSON.

In [5]:
yt.get_channel_metadata(channel_id, parser=None)

{'kind': 'youtube#channel',
 'etag': 'YWXVCLxohx4I-Yu7W6n4pIMT2aQ',
 'id': 'UC3XTzVzaHQEd30rQbuvCtTQ',
 'snippet': {'title': 'LastWeekTonight',
  'description': 'Breaking news on a weekly basis. Sundays at 11PM - only on HBO.\nSubscribe to the Last Week Tonight channel for the latest videos from John Oliver and the LWT team.',
  'publishedAt': '2014-03-18T17:41:39Z',
  'thumbnails': {'default': {'url': 'https://yt3.ggpht.com/ytc/AKedOLQ1OolNuEHCzxypefsS-cmOuBMaRoS3bbgkkN2Ocw=s88-c-k-c0x00ffffff-no-rj',
    'width': 88,
    'height': 88},
   'medium': {'url': 'https://yt3.ggpht.com/ytc/AKedOLQ1OolNuEHCzxypefsS-cmOuBMaRoS3bbgkkN2Ocw=s240-c-k-c0x00ffffff-no-rj',
    'width': 240,
    'height': 240},
   'high': {'url': 'https://yt3.ggpht.com/ytc/AKedOLQ1OolNuEHCzxypefsS-cmOuBMaRoS3bbgkkN2Ocw=s800-c-k-c0x00ffffff-no-rj',
    'width': 800,
    'height': 800}},
  'localized': {'title': 'LastWeekTonight',
   'description': 'Breaking news on a weekly basis. Sundays at 11PM - only on HBO.\nSubsc
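The raw JSON is nested, so pulling out a single value means walking through several levels of keys. A minimal sketch of doing that safely is shown here, using a hand-copied subset of the response above. `dig` is a hypothetical helper, not something the package provides.

```python
from typing import Any

# Hand-copied subset of the raw response shown above.
raw = {
    "kind": "youtube#channel",
    "id": "UC3XTzVzaHQEd30rQbuvCtTQ",
    "snippet": {
        "title": "LastWeekTonight",
        "publishedAt": "2014-03-18T17:41:39Z",
        "thumbnails": {
            "default": {"width": 88, "height": 88},
            "high": {"width": 800, "height": 800},
        },
    },
}


def dig(obj: Any, *keys: str, default: Any = None) -> Any:
    """Follow `keys` through nested dicts, returning `default` if any key is missing."""
    for key in keys:
        if not isinstance(obj, dict) or key not in obj:
            return default
        obj = obj[key]
    return obj


print(dig(raw, "snippet", "title"))
print(dig(raw, "snippet", "thumbnails", "high", "width"))
# A key that is absent from this subset falls back to the default.
print(dig(raw, "statistics", "viewCount", default="n/a"))
```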

In [6]:
pd.DataFrame(yt.get_subscriptions(channel_id)[:5])

| | subscription_title | subscription_channel_id | subscription_kind | subscription_publish_date | collection_date |
|---|---|---|---|---|---|
| 0 | trueblood | UCPnlBOg4_NU9wdhRN-vzECQ | youtube#channel | 1395357000.0 | 2021-10-12 17:56:04.386340 |
| 1 | GameofThrones | UCQzdMyuz0Lf4zo4uGcEujFw | youtube#channel | 1395357000.0 | 2021-10-12 17:56:04.386361 |
| 2 | HBO | UCVTQuK2CaWaTgSsoNkn5AiQ | youtube#channel | 1395357000.0 | 2021-10-12 17:56:04.386376 |
| 3 | HBOBoxing | UCWPQB43yGKEum3eW0P9N_nQ | youtube#channel | 1395357000.0 | 2021-10-12 17:56:04.386391 |
| 4 | Cinemax | UCYbinjMxWwjRpp4WqgDqEDA | youtube#channel | 1424812000.0 | 2021-10-12 17:56:04.386406 |


In [7]:
yt.get_featured_channels(channel_id)[:2]

[{'UC3XTzVzaHQEd30rQbuvCtTQ': []}]

You can convert the `channel_id` into a playlist id, which lets you get all the videos ever posted by a channel. The conversion function lives in the package's `youtube_api_utils` module.

In [8]:
from youtube_api.youtube_api_utils import get_upload_playlist_id

playlist_id = get_upload_playlist_id(channel_id)
print(playlist_id)

UU3XTzVzaHQEd30rQbuvCtTQ
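Comparing the two ids, the upload playlist id looks like the channel id with its `UC` prefix swapped for `UU`. The sketch below reproduces that pattern by hand. It is inferred from the output above rather than taken from the package, so treat `upload_playlist_id_sketch` as a hypothetical illustration and keep using `get_upload_playlist_id` in real code.

```python
def upload_playlist_id_sketch(channel_id: str) -> str:
    """Swap the 'UC' channel prefix for 'UU' (pattern inferred from the example)."""
    if not channel_id.startswith("UC"):
        raise ValueError(f"Expected a channel id starting with 'UC', got {channel_id!r}")
    return "UU" + channel_id[2:]


print(upload_playlist_id_sketch("UC3XTzVzaHQEd30rQbuvCtTQ"))
# prints UU3XTzVzaHQEd30rQbuvCtTQ
```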


You can now get the videos from this `playlist_id`:

In [9]:
videos = yt.get_videos_from_playlist_id(playlist_id)
pd.DataFrame(videos[:5])

| | video_id | channel_id | publish_date | collection_date |
|---|---|---|---|---|
| 0 | l5jtFqWq5iU | UC3XTzVzaHQEd30rQbuvCtTQ | 1633924000.0 | 2021-10-12 17:56:04.619552 |
| 1 | 9W74aeuqsiU | UC3XTzVzaHQEd30rQbuvCtTQ | 1633318000.0 | 2021-10-12 17:56:04.619573 |
| 2 | bl-ABuxeWrE | UC3XTzVzaHQEd30rQbuvCtTQ | 1632948000.0 | 2021-10-12 17:56:04.619590 |
| 3 | EN9OdruH_qM | UC3XTzVzaHQEd30rQbuvCtTQ | 1632716000.0 | 2021-10-12 17:56:04.619606 |
| 4 | 27FpoRiStgk | UC3XTzVzaHQEd30rQbuvCtTQ | 1631510000.0 | 2021-10-12 17:56:04.619621 |
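Because `publish_date` is a Unix timestamp, you can filter videos by date with a plain numeric comparison. A minimal sketch is shown here, using the five rows above with the timestamps copied as displayed (they are rounded). `published_since` is a hypothetical helper, and treating the timestamps as UTC is an assumption.

```python
from datetime import datetime, timezone

# The five rows shown above, with timestamps copied as displayed (rounded).
videos = [
    {"video_id": "l5jtFqWq5iU", "publish_date": 1633924000.0},
    {"video_id": "9W74aeuqsiU", "publish_date": 1633318000.0},
    {"video_id": "bl-ABuxeWrE", "publish_date": 1632948000.0},
    {"video_id": "EN9OdruH_qM", "publish_date": 1632716000.0},
    {"video_id": "27FpoRiStgk", "publish_date": 1631510000.0},
]


def published_since(rows: list, cutoff: datetime) -> list:
    """Keep rows whose `publish_date` is at or after `cutoff`."""
    cutoff_ts = cutoff.timestamp()
    return [row for row in rows if row["publish_date"] >= cutoff_ts]


# Assumption: compare against a timezone-aware UTC cutoff.
recent = published_since(videos, datetime(2021, 10, 1, tzinfo=timezone.utc))
print([row["video_id"] for row in recent])
```

The same comparison works on a DataFrame column, for example with a boolean mask on `df.publish_date`.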


In [10]:
df = pd.DataFrame(videos)

From here we can get the full metadata for these videos.

In [11]:
video_meta = yt.get_video_metadata(df.video_id.tolist()[:5])
pd.DataFrame(video_meta[:2])

| | video_id | channel_title | channel_id | video_publish_date | video_title | video_description | video_category | video_view_count | video_comment_count | video_like_count | video_dislike_count | video_thumbnail | video_tags | collection_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | l5jtFqWq5iU | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 1633948000.0 | Misinformation: Last Week Tonight with John Ol... | John Oliver discusses how misinformation sprea... | 24 | 2593208 | 10032 | 92165 | 4254 | https://i.ytimg.com/vi/l5jtFqWq5iU/hqdefault.jpg |  | 2021-10-12 17:56:05.162524 |
| 1 | 9W74aeuqsiU | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 1633343000.0 | PFAS: Last Week Tonight with John Oliver (HBO) | John Oliver discusses PFAS — a class of chemic... | 24 | 3483363 | 8824 | 106997 | 2324 | https://i.ytimg.com/vi/9W74aeuqsiU/hqdefault.jpg |  | 2021-10-12 17:56:05.162548 |


It is also possible to get the search results from YouTube!

In [12]:
pd.DataFrame(yt.search(q='john oliver', max_results=2))

| | video_id | channel_title | channel_id | video_publish_date | video_title | video_description | video_category | video_thumbnail | collection_date |
|---|---|---|---|---|---|---|---|---|---|
| 0 | l5jtFqWq5iU | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 1633948000.0 | Misinformation: Last Week Tonight with John Ol... | John Oliver discusses how misinformation sprea... |  | https://i.ytimg.com/vi/l5jtFqWq5iU/hqdefault.jpg | 2021-10-12 17:56:05.533022 |
| 1 | 9W74aeuqsiU | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 1633343000.0 | PFAS: Last Week Tonight with John Oliver (HBO) | John Oliver discusses PFAS — a class of chemic... |  | https://i.ytimg.com/vi/9W74aeuqsiU/hqdefault.jpg | 2021-10-12 17:56:05.533045 |


For videos, you can get the comments as well.

In [13]:
comments = yt.get_video_comments('l5jtFqWq5iU', max_results=10)
pd.DataFrame(comments[:5])

| | video_id | commenter_channel_url | commenter_channel_id | commenter_channel_display_name | comment_id | comment_like_count | comment_publish_date | text | commenter_rating | comment_parent_id | collection_date | reply_count |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | l5jtFqWq5iU | http://www.youtube.com/channel/UCczD4NbC6xYgwy... | UCczD4NbC6xYgwyl7omx1JBw | Mayank Arora | UgyuNPDriYURYt4yJ1V4AaABAg | 0 | 1634090000.0 | HE’S NOT WRONG ABOUT INDIANS LMFAOOOOOO | none |  | 2021-10-12 17:56:05.734476 | 0 |
| 1 | l5jtFqWq5iU | http://www.youtube.com/channel/UCfzlGh0-ptX3T6... | UCfzlGh0-ptX3T6h1zNsSwhQ | marcbwilson | UgziXJAyIT_h7BKntAF4AaABAg | 0 | 1634090000.0 | "or as it's known in America, the Marvel Cinem... | none |  | 2021-10-12 17:56:05.734501 | 0 |
| 2 | l5jtFqWq5iU | http://www.youtube.com/channel/UCJGL2cLxE0kcU8... | UCJGL2cLxE0kcU8as1Cdgkmw | KasonWhitsell | UgzFy7PYJYuSVteT1HV4AaABAg | 0 | 1634090000.0 | A large problem is the quantity of what bullsh... | none |  | 2021-10-12 17:56:05.734519 | 0 |
| 3 | l5jtFqWq5iU | http://www.youtube.com/channel/UC8_aUc9vnTE_ue... | UC8_aUc9vnTE_uehi8IbYMjQ | amit nagpal | UgxL_YXfDhOl6Bxpurl4AaABAg | 0 | 1634090000.0 | One of my uncles got so offended because I nev... | none |  | 2021-10-12 17:56:05.734537 | 0 |
| 4 | l5jtFqWq5iU | http://www.youtube.com/channel/UCuXH1QkoSrXxxa... | UCuXH1QkoSrXxxad2FVKqrXA | manb00 | UgwjIaTFV6XY7A8zyZl4AaABAg | 0 | 1634090000.0 | Stopped listening after the drug that won a no... | none |  | 2021-10-12 17:56:05.734553 | 0 |


Thank you!

You can find out more about the package at https://github.com/SMAPPNYU/youtube-data-api.git