# Accessing the YouTube API
This notebook explores convenience functions for accessing the YouTube API.
Written by Leon Yin and Megan Brown

In [6]:
import os
import sys
import json
import datetime
import pandas as pd

# add the local youtube-data-api directory to the path so youtube_api can be imported
sys.path.append(os.path.abspath('../youtube-data-api')) 
import youtube_api as yt

from runtimestamp.runtimestamp import runtimestamp
runtimestamp()

Updated 2018-07-01 15:24:22.767867
By None
Using Python 3.6.5
On Windows-10-10.0.17134-SP0


In [8]:
key = os.environ.get('YT_API')
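
`os.environ.get` returns `None` when the variable is unset, so a missing key would likely only surface later, on the first API call. A minimal sketch of an early check, assuming the key is stored in the `YT_API` environment variable as above; `load_api_key` is a hypothetical helper and not part of `youtube_api`:

```python
import os

def load_api_key(var_name='YT_API'):
    """Return the API key stored in an environment variable, or fail early.

    Hypothetical helper for illustration; not part of youtube_api.
    """
    key = os.environ.get(var_name)
    if not key:
        raise RuntimeError(f'Environment variable {var_name} is not set')
    return key
```

Calling `key = load_api_key()` in place of the bare `os.environ.get` makes a misconfigured environment obvious at the top of the notebook.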

We can go from a username to a `channel_id`. The `channel_id` is required to get uploaded videos, user metadata, and relationships like subscriptions and featured channels.

In [7]:
yt.get_channel_id_from_user('munchies', key)

'UCaLfMkkHhSA_LaCta0BzyhQ'

In [8]:
channel_id = 'UCaLfMkkHhSA_LaCta0BzyhQ'

We can collect channel-level metrics and metadata:

In [9]:
channel_meta = yt.get_channel_metadata(channel_id, key)
channel_meta

OrderedDict([('id', 'UCaLfMkkHhSA_LaCta0BzyhQ'),
             ('title', 'Munchies'),
             ('publish_date', datetime.datetime(2014, 3, 24, 21, 21, 29)),
             ('keywords',
              'food vice "how to" cooking recipe "fresh off the boat" munchies eating "epic mealtime" tutorial "cooking show"'),
             ('description',
              'MUNCHIES is a website and digital video channel from VICE dedicated to food and its global purpose. Launched in 2014, MUNCHIES offers groundbreaking content from a youth driven perspective. In today\'s modern world, the formerly tangible pleasures of music, film, and emerging media are just one click away. Food and the events that manifest around it are one of the everlasting experiences that cannot be replicated by arcs and zeroes. MUNCHIES chronicles the wide spectrum of the global culinary experience and the diverse voices that are pulling us forward: chefs and home cooks, makers and consumers, the politics and policies of food, "

Note that `topic_ids` is a JSON-serialized list. For this channel it is the string `'null'`, which deserializes to `None`.

In [10]:
channel_meta['topic_ids']

'null'

In [11]:
channel_meta['topic_ids_json'] = json.loads(channel_meta['topic_ids'])
channel_meta['topic_ids_json']

In [12]:
with open('../data/channel_topic.json', 'r') as f:
    topics = json.load(f)

In [16]:
[topics.get(c) for c in channel_meta['topic_ids_json']]

TypeError: 'NoneType' object is not iterable
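
The `TypeError` arises because `json.loads('null')` returns `None`, which cannot be iterated. A minimal sketch of a guarded lookup follows; `lookup_topics` is a hypothetical helper, and `example_topics` is a made-up mapping standing in for the contents of `../data/channel_topic.json`:

```python
import json

def lookup_topics(topic_ids_serialized, topics):
    """Map JSON-serialized topic IDs to names, treating 'null' as an empty list.

    Hypothetical helper for illustration; not part of youtube_api.
    """
    topic_ids = json.loads(topic_ids_serialized) or []
    return [topics.get(topic_id) for topic_id in topic_ids]

# Illustrative mapping only; the real one is loaded from channel_topic.json.
example_topics = {'/m/hypothetical_id': 'Example topic'}

print(lookup_topics('null', example_topics))                    # []
print(lookup_topics('["/m/hypothetical_id"]', example_topics))  # ['Example topic']
```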

Note that API calls requiring a "playlist ID" need the ID of a playlist (such as uploads or likes) rather than the `channel_id`.

In [17]:
playlist_id = channel_meta['playlist_id_uploads']
playlist_id

'UUaLfMkkHhSA_LaCta0BzyhQ'

In [18]:
channel_id == playlist_id

False

The uploads and likes playlist IDs are derived from the channel ID by replacing its first two letters with "UU" (user uploads) or "LL" (likes).<br>Two helper functions capture these relationships:<br> `yt.get_upload_playlist_id()` and `yt.get_liked_playlist_id()`

In [19]:
yt.get_upload_playlist_id(channel_id)

'UUaLfMkkHhSA_LaCta0BzyhQ'
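
The prefix substitution described above can be sketched in a few lines. This is an illustration of the relationship, not the library's implementation; `swap_channel_prefix` is a hypothetical name, and the check for a leading "UC" is an assumption based on the channel ID shown in this notebook:

```python
def swap_channel_prefix(channel_id, prefix):
    """Replace the first two characters of a channel ID with a playlist prefix.

    Hypothetical helper; assumes channel IDs start with 'UC'.
    """
    if not channel_id.startswith('UC'):
        raise ValueError('expected a channel ID starting with "UC"')
    return prefix + channel_id[2:]

channel_id = 'UCaLfMkkHhSA_LaCta0BzyhQ'
uploads = swap_channel_prefix(channel_id, 'UU')  # uploads playlist ID
likes = swap_channel_prefix(channel_id, 'LL')    # likes playlist ID
print(uploads)  # UUaLfMkkHhSA_LaCta0BzyhQ
```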

We can use the following function to get all the video IDs from any playlist ID.<br>
`cutoff_date` filters videos by publish date (in the example below, the rows shown were all published after the cutoff), and `stop_after_n_iterations` can be used for testing to return only the first N * 50 video IDs.

In [20]:
video_ids = yt.get_video_urls_from_playlist_id(playlist_id, key,
                                               cutoff_date=datetime.datetime(2017,1,1))

>> 50 Videos parsed. Next Token = CDIQAA
>> 100 Videos parsed. Next Token = CGQQAA
>> 125 Videos parsed. Next Token = CJYBEAA


In [21]:
df = pd.DataFrame(video_ids)
df.head()

| | publish_date | video_id | channel_id |
|---|---|---|---|
| 0 | 2018-05-19 15:00:03 | iA9CFJeftJU | UCaLfMkkHhSA_LaCta0BzyhQ |
| 1 | 2018-05-11 15:30:00 | fyNcJlwXurk | UCaLfMkkHhSA_LaCta0BzyhQ |
| 2 | 2018-05-01 11:00:03 | cThgwSs584U | UCaLfMkkHhSA_LaCta0BzyhQ |
| 3 | 2018-05-02 11:00:02 | fPHjX6p-RbQ | UCaLfMkkHhSA_LaCta0BzyhQ |
| 4 | 2018-05-10 11:00:01 | GAKp-QOa7fo | UCaLfMkkHhSA_LaCta0BzyhQ |
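
If a different cutoff is needed after collection, records like these can be filtered again on the client side. The sketch below uses the five rows above and keeps those published after a chosen date; `filter_after` is a hypothetical helper, and whether the library itself uses a strict or inclusive comparison is not something this notebook shows:

```python
import datetime

def filter_after(records, cutoff_date):
    """Keep records whose publish_date is strictly later than cutoff_date."""
    return [r for r in records if r['publish_date'] > cutoff_date]

# The five rows shown above, as plain dictionaries.
records = [
    {'publish_date': datetime.datetime(2018, 5, 19, 15, 0, 3), 'video_id': 'iA9CFJeftJU'},
    {'publish_date': datetime.datetime(2018, 5, 11, 15, 30, 0), 'video_id': 'fyNcJlwXurk'},
    {'publish_date': datetime.datetime(2018, 5, 1, 11, 0, 3), 'video_id': 'cThgwSs584U'},
    {'publish_date': datetime.datetime(2018, 5, 2, 11, 0, 2), 'video_id': 'fPHjX6p-RbQ'},
    {'publish_date': datetime.datetime(2018, 5, 10, 11, 0, 1), 'video_id': 'GAKp-QOa7fo'},
]

recent = filter_after(records, datetime.datetime(2018, 5, 10))
recent_ids = [r['video_id'] for r in recent]
print(recent_ids)  # ['iA9CFJeftJU', 'fyNcJlwXurk', 'GAKp-QOa7fo']
```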


Let's look at the data we can collect on a video level...

In [22]:
video_id = df['video_id'].tolist()
video_id[:2]

['iA9CFJeftJU', 'fyNcJlwXurk']

In [23]:
yt.get_video_metadata(video_id[0], key)

OrderedDict([('video_id', 'iA9CFJeftJU'),
             ('channel_title', 'Munchies'),
             ('channel_id', 'UCaLfMkkHhSA_LaCta0BzyhQ'),
             ('video_publish_date', datetime.datetime(2018, 5, 19, 15, 0, 3)),
             ('video_title',
              'Savory Pancakes and a Taco Spread with Claw Money: The Hangover Show'),
             ('video_description',
              'Cara Nicoletti sets up a taco spread with savory pancakes for graffiti artist Claw Money and her crew.\n\nSubscribe to Munchies here: http://bit.ly/Subscribe-to-MUNCHIES\n\nCheck out http://munchies.tv for more!\n\nFollow Munchies here:\nFacebook: http://facebook.com/munchies\nTwitter: http://twitter.com/munchies\nTumblr: http://munchies.tumblr.com\nInstagram: http://instagram.com/munchies\nPinterest: https://www.pinterest.com/munchies\nFoursquare: https://foursquare.com/munchies\nMore videos from the VICE network: https://www.fb.com/vicevideo'),
             ('video_category', '24'),
             ('video

The function also accepts a list of up to 50 video IDs. Passing a longer list raises an exception:

In [24]:
video_meta = yt.get_video_metadata(video_id, key)

Exception: Max length of list is 50!

To get around this, I suggest breaking the input into chunks:

In [25]:
def chunks(list_, n):
    """Yield successive n-sized chunks from l."""
    for i in range(0, len(list_), n):
        yield list_[i:i + n]

In [26]:
video_meta = []
for chunk in chunks(video_id, n=40):
    vm_ = yt.get_video_metadata(chunk, key)
    video_meta.extend(vm_)
len(video_id)

125
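
The chunk-and-extend pattern above can be wrapped in a reusable function. The sketch below is self-contained and uses a stand-in `fake_fetch` instead of a real API call; `fetch_in_batches` and `fake_fetch` are hypothetical names, and the sketch assumes the fetch function returns a list for list input, as the loop above implies:

```python
def chunks(list_, n):
    """Yield successive n-sized chunks from list_."""
    for i in range(0, len(list_), n):
        yield list_[i:i + n]

def fetch_in_batches(ids, fetch, batch_size=50):
    """Call fetch on batches of at most batch_size IDs and collect the results."""
    if not 1 <= batch_size <= 50:
        raise ValueError('batch_size must be between 1 and 50')
    results = []
    for batch in chunks(ids, batch_size):
        results.extend(fetch(batch))
    return results

def fake_fetch(batch):
    """Stand-in for an API call; returns one record per ID."""
    return [{'video_id': video_id} for video_id in batch]

ids = [f'id{i}' for i in range(125)]
batch_sizes = [len(batch) for batch in chunks(ids, 50)]
results = fetch_in_batches(ids, fake_fetch)
print(batch_sizes)   # [50, 50, 25]
print(len(results))  # 125
```

In the notebook, `fetch` could be something like `lambda batch: yt.get_video_metadata(batch, key)`.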

In [27]:
df_video_meta = pd.DataFrame(video_meta)
df_video_meta.head(2)

| | video_id | channel_title | channel_id | video_publish_date | video_title | video_description | video_category | video_view_count | video_comment_count | video_like_count | video_dislike_count | video_thumbnail | collection_date |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | iA9CFJeftJU | Munchies | UCaLfMkkHhSA_LaCta0BzyhQ | 2018-05-19 15:00:03 | Savory Pancakes and a Taco Spread with Claw Mo... | Cara Nicoletti sets up a taco spread with savo... | 24 | 22080 | 157 | 273 | 754 | https://i.ytimg.com/vi/iA9CFJeftJU/hqdefault.jpg | 2018-05-21 16:02:39.296718 |
| 1 | fyNcJlwXurk | Munchies | UCaLfMkkHhSA_LaCta0BzyhQ | 2018-05-11 15:30:00 | How-To Make a BLT with Matty Matheson | Leave it to Matty Matheson to find a way to ma... | 24 | 1091165 | 2080 | 26235 | 1032 | https://i.ytimg.com/vi/fyNcJlwXurk/hqdefault.jpg | 2018-05-21 16:02:39.296782 |


To establish relationships between channels, you can list featured channels and subscriptions:

In [28]:
yt.get_featured_channels(channel_id, key)

{'UCaLfMkkHhSA_LaCta0BzyhQ': ['UCn8zNIfYAQNdrFRrr8oibKw',
  'UCWF0PiUvUi3Jma2oFgaiX2w',
  'UCfQDD-pbllOCXHYwiXxjJxA',
  'UCZaT_X_mc0BI-djXOlfhqWQ',
  'UCB6PV0cvJpzlcXRG7nz6PpQ',
  'UC0iwHRFpv2_fpojZgQhElEQ',
  'UC_NaA2HkWDT6dliWVcvnkuQ',
  'UCS6R2iiAJ1FvEYl4B3zmljw',
  'UC8C8WuWSsFjWFaTHcUQeQxA',
  'UC9ISPZsMaBi5mutsgX6LC1g',
  'UCiZCX1R1F3xYGbeXq1JscKA',
  'UCVfmHpXONv-LVACBV68tq5Q',
  'UC5e0xSqwDGlRg3sdvGQh7lg',
  'UClW2OsdCa2E_KkLZNpm_9VQ',
  'UC9XpoCBNvStSmp3gVf_jG1g',
  'UCflb1gG-X1dy1Ru5JIk5sPw',
  'UCNDUud96oGK5xQ9gyg913vw']}

You can save time by passing a list of inputs to some API calls:

In [29]:
channel_ids = ['UCaLfMkkHhSA_LaCta0BzyhQ', 'UC6MFZAOHXlKK1FI7V0XQVeA']

In [30]:
yt.get_featured_channels(channel_ids, key)

[{'UCaLfMkkHhSA_LaCta0BzyhQ': ['UCn8zNIfYAQNdrFRrr8oibKw',
   'UCWF0PiUvUi3Jma2oFgaiX2w',
   'UCfQDD-pbllOCXHYwiXxjJxA',
   'UCZaT_X_mc0BI-djXOlfhqWQ',
   'UCB6PV0cvJpzlcXRG7nz6PpQ',
   'UC0iwHRFpv2_fpojZgQhElEQ',
   'UC_NaA2HkWDT6dliWVcvnkuQ',
   'UCS6R2iiAJ1FvEYl4B3zmljw',
   'UC8C8WuWSsFjWFaTHcUQeQxA',
   'UC9ISPZsMaBi5mutsgX6LC1g',
   'UCiZCX1R1F3xYGbeXq1JscKA',
   'UCVfmHpXONv-LVACBV68tq5Q',
   'UC5e0xSqwDGlRg3sdvGQh7lg',
   'UClW2OsdCa2E_KkLZNpm_9VQ',
   'UC9XpoCBNvStSmp3gVf_jG1g',
   'UCflb1gG-X1dy1Ru5JIk5sPw',
   'UCNDUud96oGK5xQ9gyg913vw']},
 {'UC6MFZAOHXlKK1FI7V0XQVeA': ['UCSHsNH4FZXFeSQMJ56AdrBA']}]
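
Because a single channel returns a dictionary and a list of channels returns a list of dictionaries, it can help to normalize both shapes before analysis. A minimal sketch that flattens either shape into `(channel, featured_channel)` pairs; `to_edges` is a hypothetical helper, and the sample data is an abbreviated copy of the output above:

```python
def to_edges(featured):
    """Flatten featured-channel results into (source, target) pairs.

    Accepts either one {channel_id: [featured_ids]} dictionary or a list of them.
    """
    if isinstance(featured, dict):
        featured = [featured]
    edges = []
    for mapping in featured:
        for source, targets in mapping.items():
            edges.extend((source, target) for target in targets)
    return edges

# Abbreviated copy of the output above (first channel truncated to two entries).
sample = [
    {'UCaLfMkkHhSA_LaCta0BzyhQ': ['UCn8zNIfYAQNdrFRrr8oibKw',
                                  'UCWF0PiUvUi3Jma2oFgaiX2w']},
    {'UC6MFZAOHXlKK1FI7V0XQVeA': ['UCSHsNH4FZXFeSQMJ56AdrBA']},
]

edges = to_edges(sample)
print(len(edges))  # 3
```

A list of pairs like this can be passed directly to `pd.DataFrame(edges, columns=['source', 'target'])` or to a graph library.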

Subscriptions can only be fetched one channel at a time:

In [31]:
yt.get_subscriptions(channel_id, key)

['UCIZ3xweMcx1XUlcwRESbmBw',
 'UCPD_bxCRGpmmeQcbe2kpPaA',
 'UCroeDtD1dtd1leuxUHDMTXQ',
 'UCfwHP1M0AFSPqTdjzXhV0Zg',
 'UCWu9QuHF-dcakBmhullIH6w',
 'UCNDUud96oGK5xQ9gyg913vw',
 'UCYB6UxeSTHJyEMq-clobnrg',
 'UCaSF0d06nxqCfeuxge5TWxw',
 'UC2BFx9glnPZ-fK3UPDo3V3A',
 'UCWF0PiUvUi3Jma2oFgaiX2w',
 'UCuKKZcFYDeI6ovM6sujV_zg',
 'UCflb1gG-X1dy1Ru5JIk5sPw',
 'UCfQDD-pbllOCXHYwiXxjJxA',
 'UCIEv3lZ_tNXHzL3ox-_uUGQ',
 'UC5e0xSqwDGlRg3sdvGQh7lg',
 'UC2utQmYluWGlhV-W0rTA2wg',
 'UCVfmHpXONv-LVACBV68tq5Q',
 'UCTTMZrH1FNnE285uaOCVFwg',
 'UCUD4yDVyM54QpfqGJX4S7ng',
 'UCzH5n3Ih5kgQoiDAQt2FwLw',
 'UC9ISPZsMaBi5mutsgX6LC1g',
 'UCZaT_X_mc0BI-djXOlfhqWQ',
 'UC8C8WuWSsFjWFaTHcUQeQxA',
 'UCiZCX1R1F3xYGbeXq1JscKA',
 'UCS6R2iiAJ1FvEYl4B3zmljw',
 'UC_NaA2HkWDT6dliWVcvnkuQ',
 'UCB6PV0cvJpzlcXRG7nz6PpQ',
 'UCn8zNIfYAQNdrFRrr8oibKw',
 'UC0iwHRFpv2_fpojZgQhElEQ']

Subscription results can be made more descriptive by setting the `descriptive` flag to `True`.

In [32]:
yt.get_subscriptions(channel_id, key, descriptive=True)[:2]

[OrderedDict([('subscription_title', 'VICE Arabia'),
              ('subscription_channel_id', 'UCIZ3xweMcx1XUlcwRESbmBw'),
              ('subscription_kind', 'youtube#channel'),
              ('subscription_publish_date',
               datetime.datetime(2017, 11, 27, 23, 21, 22))]),
 OrderedDict([('subscription_title', 'First We Feast'),
              ('subscription_channel_id', 'UCPD_bxCRGpmmeQcbe2kpPaA'),
              ('subscription_kind', 'youtube#channel'),
              ('subscription_publish_date',
               datetime.datetime(2017, 8, 2, 17, 37, 4))])]

You can also get the comments for a given video:

In [33]:
yt.get_video_comments(video_id[0], key)[:2]

[OrderedDict([('commenter_channel_url',
               'http://www.youtube.com/channel/UCkiB3DOSRYWuYQrTVAds_Eg'),
              ('commenter_channel_display_name', 'MButtlicious'),
              ('comment_id', 'UgxOiYOje-p9VRMYjDx4AaABAg'),
              ('comment_like_count', 0),
              ('comment_publish_date',
               datetime.datetime(2018, 5, 21, 10, 26, 30)),
              ('text',
               "I do not understand the people that comment on Munchies videos - what is so offensive about this vid? The actual pancake taco dish looks delicious and is an interesting idea, the guests while having strong personalities are interesting and outside-the-box, the vibe is fun and relaxed. I don't want to say that it's because there are no men in this vid, but I can't help but notice that videos with women having fun attract negative comments on this channel for no reason. I don't think people are intentionally sexist, but maybe they should just relax and try and see the video f

For more text we can get closed captions!

In [34]:
vid = 'hEDK3tC43SQ'

In [35]:
captions = yt.get_captions(vid, verbose=False)

In [36]:
captions

OrderedDict([('video_id', 'hEDK3tC43SQ'),
             ('caption',
              'Samantha: Ready to check out the Flavor Graveyard? Isaac: Yeah. There\'s a bunch of tombstones in here to our dearly de-pinted flavors. We retire flavors that just aren\'t selling. What do we got right here? Economic Crunch. "A delightful mash. This flavor we remember for the stock market crash on the 6th of November. We hardly knew you." Have you ever seen someone just fully break down here and just start sobbing? I have not, but I wouldn\'t be surprised. Yeah, I mean, it\'s a sad place. ♪♪ ♪♪ Damn! You\'re too young! ♪♪ ♪♪ ♪♪ ♪♪ I\'m Isaac Lappert. And as an ice-cream maker and businessman myself, I am very interested in the business of ice cream. We\'re starting in Sausalito, where I\'ll introduce you to my family\'s own Lappert\'s Ice Cream. We\'re on the smaller side with 10 shops in 2 states, but I\'m curious about the pros and cons of expanding. So, I\'m going to Jeni\'s Splendid Ice Creams, a pion

You can also get the recommended videos for any given video:

In [37]:
recommended_vids = yt.get_recommended_videos(vid, key)

In [38]:
recommended_vids[:2]

[OrderedDict([('video_id',
               {'kind': 'youtube#video', 'videoId': 'MeWMwwRfwFI'}),
              ('channel_title', 'Munchies'),
              ('channel_id', 'UCaLfMkkHhSA_LaCta0BzyhQ'),
              ('video_publish_date',
               datetime.datetime(2015, 4, 16, 15, 12, 17)),
              ('video_title',
               'Making Cold-Stoned Sundaes with the Cannabis Creamery: BONG APPÉTIT'),
              ('video_description',
               'Watch the first episode of SMOKEABLES: How to Make a Gravity Bong - http://bit.ly/28XSWBi\n\nIn this episode of Bong Appetit, host Abdullah Saeed checks out Cannabis Creamery, a Sausalito, CA-based ice cream company that is producing sweet THC-infused treats in a range of fantastic flavors. From classic mint-chip to a grapefruit sorbet originally designed for the Grateful Dead, this ice cream is dankly delicious.\n\nOwner Isaac Lappert takes us on a visit to the family’s original business—Lappert’s Ice Cream—to hear Cannabis Crea