## YouTube

![2000px-YouTube_Logo_2017.svg.png](attachment:2000px-YouTube_Logo_2017.svg.png)

### Getting Developer Key

* follow this [video tutorial](https://www.youtube.com/watch?v=pP4zvduVAqo)
* or text instructions at https://developers.google.com/youtube/v3/getting-started

### Getting Started

* YouTube provides 1 million free quota credits per day
* Typical operations cost between 1 and 10 credits, so the quota covers at least ca. 100k API calls per day
* [API Documentation](https://developers.google.com/youtube/v3/docs/videos/list)
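
The quota arithmetic above can be sketched in a few lines. This is only an illustration: `estimate_calls` is a hypothetical helper, and the 1 million figure is the one quoted above, which may differ from the quota assigned to your own project.

```python
def estimate_calls(daily_credits, cost_per_call):
    """Return how many calls of a given credit cost fit into the daily quota."""
    if cost_per_call <= 0:
        raise ValueError("cost_per_call must be positive")
    return daily_credits // cost_per_call


# Figure quoted in the list above; check your own project's quota.
DAILY_CREDITS = 1_000_000

for cost in (1, 10):
    print(cost, "credits per call:", estimate_calls(DAILY_CREDITS, cost), "calls per day")
```

The more expensive end of the range (10 credits per call) gives the conservative estimate of 100k calls per day.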

In [34]:
import os

from dotenv import load_dotenv
load_dotenv()

DEVELOPER_KEY = os.environ['YOUTUBE_KEY']

### Utility Functions

We need some utility functions to wrap the YouTube API.
The API has lots of error cases, timeouts, recoverable errors, etc. that we need to take care of.

In [35]:
import logging
from googleapiclient.discovery import build
from googleapiclient.errors import HttpError

logging.getLogger('googleapiclient.discovery_cache').setLevel(logging.ERROR)

YOUTUBE_API_SERVICE_NAME = 'youtube'
YOUTUBE_API_VERSION = 'v3'

youtube = build(YOUTUBE_API_SERVICE_NAME,
                YOUTUBE_API_VERSION,
                developerKey=DEVELOPER_KEY,
                cache_discovery=False)

def runner(func, max_results=50, perPage=50, iterate=True, **kwargs):
    """
    Call a list-style API method and yield its items page by page.
    Stops after `max_results` items, on the last page, or on an HTTP error.
    """
    results_rem = max_results

    nexttok = None
    while results_rem > 0:
        # never request more items than are still needed
        kwargs = dict(kwargs,
                      pageToken=nexttok,
                      maxResults=min(perPage, results_rem))

        try:
            search_response = func(**kwargs).execute()
        except HttpError as e:
            logging.warning('content: %r', e.content)
            break

        items = search_response.get('items', [])
        for search_result in items:
            yield search_result

        nexttok = search_response.get('nextPageToken')
        if nexttok is None or not iterate or not items:
            break
        results_rem -= len(items)


def search_videos(q, max_results=200, order='relevance'):
    """
    search youtube videos
    order: date, rating, relevance, title, videoCount, viewCount
    """
    return runner(
        youtube.search().list,
        max_results=max_results,
        q=q,
        type='video',
        order=order,
        part='id,snippet',
    )

def get_videos(video_ids):
    """ get details for comma-separated video_ids """
    return runner(
        youtube.videos().list,
        part='id,snippet,topicDetails',
        id=video_ids)
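
The `runner` above gives up on the first `HttpError`. For recoverable errors, one common approach is to retry with exponential backoff. The following is a minimal sketch under that assumption: `call_with_retries` is a hypothetical helper that is not part of the Google client library, and deciding which errors are actually worth retrying is left to the caller.

```python
import random
import time


def call_with_retries(call, retries=3, base_delay=1.0,
                      retriable=(Exception,), sleep=time.sleep):
    """
    Run `call()` and retry on `retriable` exceptions with exponential backoff.
    Hypothetical helper for illustration only.
    """
    for attempt in range(retries + 1):
        try:
            return call()
        except retriable:
            if attempt == retries:
                raise  # out of attempts: surface the last error
            # wait base_delay, 2*base_delay, 4*base_delay, ... plus a little jitter
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

Inside `runner`, the request could then be issued as `call_with_retries(lambda: func(**kwargs).execute(), retriable=(HttpError,))`. The client library's `execute()` also accepts a `num_retries` argument, which may be sufficient on its own.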

### 200 Machine Learning videos

Let's find the 200 most recent videos about "machine learning".

In [36]:
import json

res = list(search_videos('machine learning', max_results=200, order='date'))

video_id = res[198]['id']['videoId']
detailed = list(get_videos(video_id))
print(json.dumps(detailed, indent=4))

[
    {
        "kind": "youtube#video",
        "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/sMz-AHXgshf60l_Shi048YivjZ4\"",
        "id": "vG8etlO2uq8",
        "snippet": {
            "publishedAt": "2018-04-06T12:07:15.000Z",
            "channelId": "UCXgGY0wkgOzynnHvSEVmE3A",
            "title": "Machine Learning and Data Science",
            "description": "link: https://courses.learncodeonline.in/learn/Machine-Learning-Bootcamp\n\nMachine Learning and Data Science\n\nCompanies like Facebook, Google and Amazon have got a lot of data about us. Even the small companies have got a lot of data like signup information, number of logins, Product purchase, products that we are looking for. All this data can be processed and can give any company a boost in productivity and increase in sale.\nThat is why machine learning is growing so fast.\n\nCompanies can offer amazing features like quick replies that are context based in Gmail, Uber driver arrival time or time to reach at the destination

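Since `get_videos` takes a comma-separated string of IDs, details for all search results can be fetched in batches rather than one video at a time. The sketch below assumes a cap of 50 IDs per request, which should be checked against the API documentation. `batch_ids` is a hypothetical helper and the IDs in the example are made up.

```python
def batch_ids(video_ids, size=50):
    """Yield comma-separated strings containing at most `size` IDs each."""
    ids = list(video_ids)
    for start in range(0, len(ids), size):
        yield ','.join(ids[start:start + size])


# Made-up IDs, for illustration only.
example_ids = ['id%03d' % i for i in range(120)]
for batch in batch_ids(example_ids):
    print(len(batch.split(',')), 'ids in this batch')
```

With the search results from above, each batch produced by `batch_ids(r['id']['videoId'] for r in res)` could be passed to `get_videos`.
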
### Video Comments

Let's fetch comments for the above video, even though it is a very recent upload.

In [37]:
def video_comments(videoId, q=None, max_results=50):
    """ fetch comments for video `videoId` """
    return runner(
        youtube.commentThreads().list,
        max_results=max_results,
        part='id,replies,snippet',
        order='relevance',
        videoId=videoId,
        searchTerms=q,
        textFormat='plainText'
    )

comments = list(video_comments(video_id))
print(json.dumps(comments, indent=4))

[
    {
        "kind": "youtube#commentThread",
        "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/etM43T2BxA9vuBSwJki0zMVR4B0\"",
        "id": "UgwdaO70_2yrWzapGx54AaABAg",
        "snippet": {
            "videoId": "vG8etlO2uq8",
            "topLevelComment": {
                "kind": "youtube#comment",
                "etag": "\"XI7nbFXulYBIpL0ayR_gDh3eu1k/0tqXSb5QJELgMgXAtY6RnvLtgHs\"",
                "id": "UgwdaO70_2yrWzapGx54AaABAg",
                "snippet": {
                    "authorDisplayName": "Hitesh Choudhary",
                    "authorProfileImageUrl": "https://yt3.ggpht.com/-4q0GjyjZjvo/AAAAAAAAAAI/AAAAAAAAAAA/fJcCMYx3TII/s28-c-k-no-mo-rj-c0xffffff/photo.jpg",
                    "authorChannelUrl": "http://www.youtube.com/channel/UCXgGY0wkgOzynnHvSEVmE3A",
                    "authorChannelId": {
                        "value": "UCXgGY0wkgOzynnHvSEVmE3A"
                    },
                    "videoId": "vG8etlO2uq8",
                    "textDisplay": "Cou

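The comment thread structure is deeply nested, so it can help to flatten each item into a small dictionary. This is a minimal sketch based on the fields visible in the output above. `flatten_comment` is a hypothetical helper and the sample thread uses placeholder values.

```python
def flatten_comment(thread):
    """Extract a few fields from one commentThread item."""
    top = thread['snippet']['topLevelComment']
    snippet = top['snippet']
    return {
        'comment_id': top['id'],
        'video_id': thread['snippet'].get('videoId'),
        'author': snippet.get('authorDisplayName'),
        'text': snippet.get('textDisplay'),
    }


# Placeholder data shaped like the API response, not real output.
sample_thread = {
    'id': 'thread-1',
    'snippet': {
        'videoId': 'video-1',
        'topLevelComment': {
            'id': 'thread-1',
            'snippet': {
                'authorDisplayName': 'Some Author',
                'textDisplay': 'placeholder comment text',
            },
        },
    },
}
print(flatten_comment(sample_thread))
```

Applied to the real data, this would be `[flatten_comment(c) for c in comments]`.
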
## Other Capabilities

* find related videos
* fetch channel information and videos
* fetch closed captions
* with OAuth permissions, it is also possible to fetch user history, likes, subscriptions, etc.
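
As an example of one of these capabilities, channel information can be fetched through the `channels().list` endpoint. The following is a rough sketch, not a tested recipe: `get_channel` is a hypothetical helper, it expects a client object like the `youtube` instance built earlier, and the choice of `part` values is an assumption about which fields are of interest.

```python
def get_channel(youtube, channel_id):
    """Return the channel resource for `channel_id`, or None if not found."""
    response = youtube.channels().list(
        part='snippet,statistics',
        id=channel_id,
    ).execute()
    items = response.get('items', [])
    return items[0] if items else None
```

A call such as `get_channel(youtube, 'UCXgGY0wkgOzynnHvSEVmE3A')` would look up the channel of the video used above.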