# How to use the YouTube Data API: YoutubeDataApi
[YouTube Package Documentation](https://youtube-data-api.readthedocs.io/en/latest/index.html) <br>

[Slides](https://www.leonyin.org/presentations/yt-pt-1.slides.html#/) | [GitHub](https://github.com/mabrownnyu/youtube-api-demo/blob/master/nbs/demo_notebook.ipynb) | [NBViewer](https://nbviewer.jupyter.org/github/mabrownnyu/youtube-api-demo/blob/master/nbs/demo_notebook.ipynb)

Authors: Megan Brown & Leon Yin <br>
Presented on: 2018/10/09
<hr>


## Agenda
Today we will discuss:

1. A brief overview of data available in the YouTube Data API
2. How to install the package
3. How to create an API key
4. A brief look at how to use the package
5. A cool example that I have yet to come up with

The packages needed for this tutorial are listed in `requirements.txt`:
```
pip install -r requirements.txt
```


# FAQs

### So what kind of data can you get?

* Short answer: a lot

* Comprehensive answer: [here](https://developers.google.com/youtube/v3/docs/)

* What is included in the package:
    * video metadata
    * channel metadata
    * playlist metadata
    * subscription metadata
    * featured channel metadata
    * comment metadata
    * captions metadata
    * search results
    * recommended video results
#### [Package Reference](https://youtube-data-api.readthedocs.io/en/latest/youtube_api.html)



### What is the difference between a user and a channel?
* Essentially: how YouTube stores the data internally.<br>
* A user is the name that a content creator registers (ex: **LastWeekTonight**). You cannot use this value directly to get more information about a user.<br>
* The channel id is the internal ID for a given user (ex: **UC3XTzVzaHQEd30rQbuvCtTQ**). You can use this value to get more data about a channel. <br>

###### But fear not, there is a solution!
Use `yt.get_channel_id_from_user(username)` to get the channel id for a given user.

### What is the difference between a featured channel and a subscription?
* A subscription is a channel that a user opts in to receiving updates from.
* A featured channel is one that a channel owner chooses to display on their own channel to direct viewers towards other channels.

## How to Install

   The software is on PyPI, so you can download it via `pip`
   
   
   `pip install youtube-data-api`

## How to get an API key

### A quick guide: https://developers.google.com/youtube/v3/getting-started

1. You need a Google Account to access the Google API Console, request an API key, and register your application. You already have this as an NYU student/affiliate.

2. Create a project in the <a href="https://console.developers.google.com/apis/">Google Developers Console</a> and <a href="https://developers.google.com/youtube/registering_an_application">obtain authorization credentials</a> so your application can submit API requests.

3. After creating your project, make sure the YouTube Data API is one of the services that your application is registered to use.

    a. Go to the <a href="https://console.developers.google.com/apis/">API Console</a> and select the project that you just registered.

    b. Visit the <a href="https://console.developers.google.com/apis/enabled">Enabled APIs page</a>. In the list of APIs, make sure the status is ON for the YouTube Data API v3. You do not need to enable OAuth 2.0, since there are no methods in the package that require it.
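
Once you have a key, a common approach is to keep it out of the notebook by storing it in an environment variable. The sketch below assumes the variable is called `YT_KEY` (the name used later in this tutorial). `load_api_key` is a hypothetical helper written for illustration, not part of the package.

```python
import os


def load_api_key(env=None, name="YT_KEY"):
    """Return the API key stored under `name`, or raise a clear error.

    `env` defaults to the process environment; passing a plain dict
    makes the helper easy to try out without touching real settings.
    """
    env = os.environ if env is None else env
    key = env.get(name)
    if not key:
        raise RuntimeError(
            f"No API key found. Set the {name} environment variable first."
        )
    return key


# Example with an explicit mapping instead of the real environment.
demo_key = load_api_key({"YT_KEY": "not-a-real-key"})
```

Failing early like this avoids passing `None` as the key, which is what `os.environ.get('YT_KEY')` returns when the variable is not set.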

## A brief overview of how to use the package

In [2]:
import os
import datetime
import pandas as pd

In [3]:
from youtube_api import YoutubeDataApi
from youtube_api.youtube_api_utils import *

yt = YoutubeDataApi(os.environ.get('YT_KEY'))

### Starting with a channel name and getting some basic metadata
https://www.youtube.com/user/LastWeekTonight

In [4]:
channel_id = yt.get_channel_id_from_user('LastWeekTonight')
print(channel_id)

UC3XTzVzaHQEd30rQbuvCtTQ


You can get more information from this `channel_id`

In [5]:
yt.get_channel_metadata(channel_id)

OrderedDict([('channel_id', 'UC3XTzVzaHQEd30rQbuvCtTQ'),
             ('title', 'LastWeekTonight'),
             ('account_creation_date',
              datetime.datetime(2014, 3, 18, 17, 41, 39)),
             ('keywords', None),
             ('description',
              'Breaking news on a weekly basis. Sundays at 11PM - only on HBO.\nSubscribe to the Last Week Tonight channel for the latest videos from John Oliver and the LWT team.'),
             ('view_count', '1716448004'),
             ('video_count', '252'),
             ('subscription_count', '6479056'),
             ('playlist_id_likes', 'LL3XTzVzaHQEd30rQbuvCtTQ'),
             ('playlist_id_uploads', 'UU3XTzVzaHQEd30rQbuvCtTQ'),
             ('topic_ids',
              'https://en.wikipedia.org/wiki/Entertainment|https://en.wikipedia.org/wiki/Television_program|https://en.wikipedia.org/wiki/Humour'),
             ('country', None),
             ('collection_date',
              datetime.datetime(2018, 10, 9, 11, 47, 57, 36
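
Note that the count fields in the output above (`view_count`, `video_count`, `subscription_count`) are strings rather than numbers. If you plan to do arithmetic on them, a small conversion step helps. The following is a minimal sketch; `cast_counts` is a hypothetical helper, and the sample values are copied from the output above.

```python
def cast_counts(metadata, suffix="_count"):
    """Return a copy of `metadata` with digit-only count strings cast to int."""
    cleaned = dict(metadata)
    for key, value in metadata.items():
        # Only touch fields that look like counts and hold plain digits.
        if key.endswith(suffix) and isinstance(value, str) and value.isdigit():
            cleaned[key] = int(value)
    return cleaned


# A few fields copied from the channel metadata shown above.
sample = {
    "title": "LastWeekTonight",
    "view_count": "1716448004",
    "video_count": "252",
    "subscription_count": "6479056",
    "country": None,
}
channel = cast_counts(sample)
```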

The default parser returns the items in the JSON as an `OrderedDict`. Passing `parser=None` returns the raw JSON.

In [6]:
yt.get_channel_metadata(channel_id, parser=None)

{'kind': 'youtube#channel',
 'etag': '"XI7nbFXulYBIpL0ayR_gDh3eu1k/LVe_y6zl5BF7KPI-kBB3PFReyvw"',
 'id': 'UC3XTzVzaHQEd30rQbuvCtTQ',
 'snippet': {'title': 'LastWeekTonight',
  'description': 'Breaking news on a weekly basis. Sundays at 11PM - only on HBO.\nSubscribe to the Last Week Tonight channel for the latest videos from John Oliver and the LWT team.',
  'customUrl': 'LastWeekTonight',
  'publishedAt': '2014-03-18T17:41:39.000Z',
  'thumbnails': {'default': {'url': 'https://yt3.ggpht.com/a-/AN66SAxIEUI6f-101_t2Dy8703mNjD8eikQOVffxBw=s88-mo-c-c0xffffffff-rj-k-no',
    'width': 88,
    'height': 88},
   'medium': {'url': 'https://yt3.ggpht.com/a-/AN66SAxIEUI6f-101_t2Dy8703mNjD8eikQOVffxBw=s240-mo-c-c0xffffffff-rj-k-no',
    'width': 240,
    'height': 240},
   'high': {'url': 'https://yt3.ggpht.com/a-/AN66SAxIEUI6f-101_t2Dy8703mNjD8eikQOVffxBw=s800-mo-c-c0xffffffff-rj-k-no',
    'width': 800,
    'height': 800}},
  'localized': {'title': 'LastWeekTonight',
   'description': 'Breaking
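
If neither the default `OrderedDict` nor the raw JSON fits your needs, one option is to write your own function over the raw structure. The sketch below pulls a few fields out of one raw channel item shaped like the output above. `parse_channel_summary` is a hypothetical name, and passing it as `parser=` assumes the package accepts any callable that takes a single raw item, which is worth confirming in the package reference.

```python
def parse_channel_summary(item):
    """Extract a few fields from one raw channel item (a nested dict)."""
    snippet = item.get("snippet", {})
    thumbnails = snippet.get("thumbnails", {})
    return {
        "channel_id": item.get("id"),
        "title": snippet.get("title"),
        "published_at": snippet.get("publishedAt"),
        # Missing thumbnail sizes yield None instead of raising.
        "thumbnail_width": thumbnails.get("high", {}).get("width"),
    }


# Abbreviated copy of the raw JSON shown above.
raw_item = {
    "kind": "youtube#channel",
    "id": "UC3XTzVzaHQEd30rQbuvCtTQ",
    "snippet": {
        "title": "LastWeekTonight",
        "publishedAt": "2014-03-18T17:41:39.000Z",
        "thumbnails": {"high": {"width": 800, "height": 800}},
    },
}
summary = parse_channel_summary(raw_item)
```

You can also call the function directly on the result of `yt.get_channel_metadata(channel_id, parser=None)`.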

In [21]:
pd.DataFrame(yt.get_subscriptions(channel_id)[:2])

|  | subscription_title | subscription_channel_id | subscription_kind | subscription_publish_date | collection_date |
|---|---|---|---|---|---|
| 0 | HBOBoxing | UCWPQB43yGKEum3eW0P9N_nQ | youtube#channel | 2014-03-20 19:05:54 | 2018-10-09 11:58:40.405346 |
| 1 | Real Time with Bill Maher | UCy6kyFxaMqGtpE3pQTflK8A | youtube#channel | 2014-12-11 18:55:41 | 2018-10-09 11:58:40.405346 |


In [8]:
yt.get_featured_channels(channel_id)[:2]

[{'UC3XTzVzaHQEd30rQbuvCtTQ': ['UCVTQuK2CaWaTgSsoNkn5AiQ',
   'UCYbinjMxWwjRpp4WqgDqEDA',
   'UCWPQB43yGKEum3eW0P9N_nQ',
   'UCbKo3HsaBOPhdRpgzqtRnqA',
   'UCy6kyFxaMqGtpE3pQTflK8A',
   'UCQzdMyuz0Lf4zo4uGcEujFw',
   'UCPnlBOg4_NU9wdhRN-vzECQ',
   'UCeKum6mhlVAjUFIW15mVBPg']}]
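
The result above is a list of dictionaries that map a channel id to the ids of the channels it features. For network-style analysis, it can be handy to flatten this into (source, target) pairs. A minimal sketch, where `featured_edges` is a hypothetical helper and the sample is a shortened copy of the output above:

```python
def featured_edges(featured):
    """Flatten [{source: [target, ...]}, ...] into (source, target) pairs."""
    edges = []
    for mapping in featured:
        for source, targets in mapping.items():
            for target in targets:
                edges.append((source, target))
    return edges


# Shortened copy of the featured-channel output shown above.
featured = [
    {
        "UC3XTzVzaHQEd30rQbuvCtTQ": [
            "UCVTQuK2CaWaTgSsoNkn5AiQ",
            "UCYbinjMxWwjRpp4WqgDqEDA",
            "UCWPQB43yGKEum3eW0P9N_nQ",
        ]
    }
]
edges = featured_edges(featured)
```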

You can convert the `channel_id` into a playlist id to get all the videos ever posted by a channel, using a helper function from the package's `youtube_api_utils` module.

In [10]:
from youtube_api.youtube_api_utils import get_upload_playlist_id

playlist_id = get_upload_playlist_id(channel_id)
print(playlist_id)

UU3XTzVzaHQEd30rQbuvCtTQ
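
Comparing the two ids shows the pattern: the channel id starts with `UC` and the upload playlist id starts with `UU`, with the rest unchanged. The sketch below reproduces that observed pattern. It is an illustration based on the values printed in this tutorial, not necessarily how `get_upload_playlist_id` is implemented, and `upload_playlist_id_sketch` is a hypothetical name.

```python
def upload_playlist_id_sketch(channel_id):
    """Swap the 'UC' prefix of a channel id for 'UU' (observed pattern)."""
    if not channel_id.startswith("UC"):
        raise ValueError(f"Expected a channel id starting with 'UC': {channel_id!r}")
    return "UU" + channel_id[2:]


derived_id = upload_playlist_id_sketch("UC3XTzVzaHQEd30rQbuvCtTQ")
```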


You can now get the videos from this `playlist_id`

In [20]:
videos = yt.get_videos_from_playlist_id(playlist_id)
pd.DataFrame(videos[:5])

|  | video_id | channel_id | publish_date | collection_date |
|---|---|---|---|---|
| 0 | FsZ3p9gOkpY | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-08 06:30:00 | 2018-10-09 11:58:28.811250 |
| 1 | opi8X9hQ7q8 | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-01 06:30:01 | 2018-10-09 11:58:28.811250 |
| 2 | OjPYmEZxACM | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-09-24 06:30:00 | 2018-10-09 11:58:28.811250 |
| 3 | NpPyLcQ2vdI | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-09-10 06:30:02 | 2018-10-09 11:58:28.811250 |
| 4 | 2nXYbGmF3_Q | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-08-27 03:00:00 | 2018-10-09 11:58:28.811250 |


In [12]:
df = pd.DataFrame(videos)
df.head(2)

|  | video_id | channel_id | publish_date | collection_date |
|---|---|---|---|---|
| 0 | FsZ3p9gOkpY | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-08 06:30:00 | 2018-10-09 11:48:09.641390 |
| 1 | opi8X9hQ7q8 | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-01 06:30:01 | 2018-10-09 11:48:09.641390 |
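
Before requesting full metadata, you may want to narrow the list to a date range. The sketch below filters plain records by publish date. It assumes each record is a dictionary with a `publish_date` holding a `datetime`, as the output above suggests. `videos_since` is a hypothetical helper, and the sample rows are copied from the table above.

```python
import datetime


def videos_since(videos, cutoff):
    """Keep only the records published on or after `cutoff`."""
    return [video for video in videos if video["publish_date"] >= cutoff]


# Three rows copied from the playlist output above.
sample_videos = [
    {"video_id": "FsZ3p9gOkpY", "publish_date": datetime.datetime(2018, 10, 8, 6, 30, 0)},
    {"video_id": "opi8X9hQ7q8", "publish_date": datetime.datetime(2018, 10, 1, 6, 30, 1)},
    {"video_id": "OjPYmEZxACM", "publish_date": datetime.datetime(2018, 9, 24, 6, 30, 0)},
]
recent = videos_since(sample_videos, datetime.datetime(2018, 10, 1))
recent_ids = [video["video_id"] for video in recent]
```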


From here, we can get the full metadata for these videos.

In [19]:
video_meta = yt.get_video_metadata(df.video_id.tolist()[:5])
pd.DataFrame(video_meta[:2])

|  | video_id | channel_title | channel_id | video_publish_date | video_title | video_description | video_category | video_view_count | video_comment_count | video_like_count | video_dislike_count | video_thumbnail | video_tags | collection_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | FsZ3p9gOkpY | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-08 06:30:00 | Brazilian Elections: Last Week Tonight with Jo... | Brazil is about to elect a new president durin... | 24 | 2775977 | 41417 | 80019 | 102526 | https://i.ytimg.com/vi/FsZ3p9gOkpY/hqdefault.jpg | last week tonight brazilian elections\|last wee... | 2018-10-09 11:58:13.120384 |
| 1 | opi8X9hQ7q8 | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-01 06:30:01 | Brett Kavanaugh: Last Week Tonight with John O... | John Oliver discusses the ongoing controversy ... | 24 | 7365403 | 33141 | 146449 | 25071 | https://i.ytimg.com/vi/opi8X9hQ7q8/hqdefault.jpg | last week tonight brett kavanaugh\|john oliver ... | 2018-10-09 11:58:13.120384 |
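
With like and dislike counts in hand, one simple derived measure is the share of reactions that are likes. The following is a minimal sketch that works on one metadata record at a time. `like_share` is a hypothetical helper, and it casts with `int()` because the counts may arrive as strings. The sample values are copied from the second row above.

```python
def like_share(row):
    """Return likes / (likes + dislikes), or None when there are no reactions."""
    likes = int(row["video_like_count"])
    dislikes = int(row["video_dislike_count"])
    total = likes + dislikes
    return likes / total if total else None


# Counts copied from the second row of the metadata output above.
sample_row = {
    "video_id": "opi8X9hQ7q8",
    "video_like_count": "146449",
    "video_dislike_count": "25071",
}
share = like_share(sample_row)
```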


It is also possible to get the search results from YouTube!

In [18]:
pd.DataFrame(yt.search(q='john oliver', max_results=2))

|  | video_id | channel_title | channel_id | video_publish_date | video_title | video_description | video_category | video_thumbnail | collection_date |
|---|---|---|---|---|---|---|---|---|---|
| 0 | FsZ3p9gOkpY | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-08 06:30:00 | Brazilian Elections: Last Week Tonight with Jo... | Brazil is about to elect a new president durin... |  | https://i.ytimg.com/vi/FsZ3p9gOkpY/hqdefault.jpg | 2018-10-09 11:57:57.435617 |
| 1 | opi8X9hQ7q8 | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-01 06:30:01 | Brett Kavanaugh: Last Week Tonight with John O... | John Oliver discusses the ongoing controversy ... |  | https://i.ytimg.com/vi/opi8X9hQ7q8/hqdefault.jpg | 2018-10-09 11:57:57.435617 |


In [17]:
recommendations = yt.get_recommended_videos(df.video_id.tolist()[0], max_results = 2)
pd.DataFrame(recommendations)

|  | video_id | channel_title | channel_id | video_publish_date | video_title | video_description | video_category | video_thumbnail | collection_date |
|---|---|---|---|---|---|---|---|---|---|
| 0 | opi8X9hQ7q8 | LastWeekTonight | UC3XTzVzaHQEd30rQbuvCtTQ | 2018-10-01 02:10:34 | Brett Kavanaugh: Last Week Tonight with John O... | John Oliver discusses the ongoing controversy ... |  | https://i.ytimg.com/vi/opi8X9hQ7q8/hqdefault.jpg | 2018-10-09 11:57:48.430988 |
| 1 | oC4VXwFibWY | Saturday Night Live | UCqFzWxSCi39LnW1JKFR3efg | 2016-10-02 06:37:32 | Celebrity Family Feud: Political Edition - SNL | Kellyanne Conway (Kate McKinnon), Ivanka Trump... |  | https://i.ytimg.com/vi/oC4VXwFibWY/hqdefault.jpg | 2018-10-09 11:57:48.430988 |
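
Search results and recommendations can overlap: `opi8X9hQ7q8` appears in both outputs above. If you combine several result lists, you may want to drop duplicates by `video_id`. A minimal sketch, where `merge_unique` is a hypothetical helper and the sample records keep only the ids from the outputs above:

```python
def merge_unique(*result_lists, key="video_id"):
    """Concatenate result lists, keeping the first record seen for each key."""
    seen = set()
    merged = []
    for results in result_lists:
        for row in results:
            if row[key] not in seen:
                seen.add(row[key])
                merged.append(row)
    return merged


# Video ids copied from the search and recommendation outputs above.
search_rows = [{"video_id": "FsZ3p9gOkpY"}, {"video_id": "opi8X9hQ7q8"}]
recommended_rows = [{"video_id": "opi8X9hQ7q8"}, {"video_id": "oC4VXwFibWY"}]
combined = merge_unique(search_rows, recommended_rows)
combined_ids = [row["video_id"] for row in combined]
```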


[Part 2](https://www.leonyin.org/presentations/yt-pt-2.slides.html#/)