# Dazbo's YouTube and Video Demos - with Google Gemini

## Overview

Welcome to notebook #2 in this tutorial guide. This notebook follows on from [YouTube and Video Demos #1](youtube-demos.ipynb). In the previous notebook I demonstrated:

- Multiple methods for downloading videos and extracting audio
- How to transcribe audio to text using a free speech-to-text API
- How to extract existing transcripts and translate to different languages

In this part we'll strip out parts of the first notebook we don't need, and add some smarts using Google technology.

## How to Launch and Run this Notebook

- The source for this notebook lives in my GitHub repo, <a href="https://github.com/derailed-dash/youtube-and-video" target="_blank">Youtube-and-Video</a>.
- Check out further guidance - including tips on how to run the notebook - in the project's `README.md`.
- For example, you could...
  - Run the notebook locally, in your own Jupyter environment.
  - Run the notebook in a cloud-based Jupyter environment, with no setup required on your part! For example, with **Google Colab**: <br><br><a href="https://colab.research.google.com/github/derailed-dash/youtube-and-video/blob/main/src/notebooks/youtube-demos-with-google-gemini.ipynb" target="_blank"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Google Colab"/></a><br><br>It looks like this:<br><br><img src="static/images/collab-view.png" width="640px"></img>
- For more ways to run Jupyter Notebooks, check out [my guide](https://medium.com/python-in-plain-english/five-ways-to-run-jupyter-labs-and-notebooks-23209f71e5c0).

**When running this notebook, first execute the cells in the [Setup](#Setup) section, as described below.** Then you can experiment with any of the subsequent cells.


## Setup

### Packages

First, let's install any dependent packages:

In [None]:
%pip install --upgrade --no-cache-dir python-dotenv \
                                      dazbo-commons \
                                      pytubefix

In [22]:
import IPython
from IPython.display import display, Markdown

import io
import logging
import os
import re
import sys
from pathlib import Path
from dataclasses import dataclass
from dotenv import load_dotenv
import dazbo_commons as dc


In [None]:
# Colab requires an older version of ipykernel, so only upgrade it elsewhere
if "google.colab" not in sys.modules:
    %pip install --upgrade --no-cache-dir ipykernel

### Logging

Now we'll set up logging. Here I'm using coloured logging from my [dazbo-commons](https://pypi.org/project/dazbo-commons/) package. Feel free to change the logging level.

In [14]:
# Setup logging
APP_NAME="dazbo-yt-demos"
logger = dc.retrieve_console_logger(APP_NAME)
logger.setLevel(logging.DEBUG)
logger.info("Logger initialised.")
logger.debug("DEBUG level logging enabled.")

14:06:55.440:dazbo-yt-demos - INF: Logger initialised.
14:06:55.443:dazbo-yt-demos - DBG: DEBUG level logging enabled.


### File Locations

Here we initialise some file path locations, e.g. an output folder.

In [15]:
locations = dc.get_locations(APP_NAME)
for attribute, value in vars(locations).items():
    logger.debug(f"{attribute}: {value}")

14:06:58.128:dazbo-yt-demos - DBG: script_name: dazbo-yt-demos
14:06:58.129:dazbo-yt-demos - DBG: script_dir: /mnt/c/Users/djl/localdev/python/youtube-and-video/src/notebooks/dazbo-yt-demos
14:06:58.130:dazbo-yt-demos - DBG: input_dir: /mnt/c/Users/djl/localdev/python/youtube-and-video/src/notebooks/dazbo-yt-demos/input
14:06:58.130:dazbo-yt-demos - DBG: output_dir: /mnt/c/Users/djl/localdev/python/youtube-and-video/src/notebooks/dazbo-yt-demos/output
14:06:58.131:dazbo-yt-demos - DBG: input_file: /mnt/c/Users/djl/localdev/python/youtube-and-video/src/notebooks/dazbo-yt-demos/input/input.txt


### Utility Functions

In [16]:
def clean_filename(filename):
    """ Create a clean filename by removing unallowed characters. """
    pattern = r'[^a-zA-Z0-9._\s-]'
    return  re.sub(pattern, '_', filename)
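
To see what `clean_filename` does in practice, here is a small self-contained check using two of the video titles that appear later in this notebook. The function body is repeated so the snippet runs on its own.

```python
import re

def clean_filename(filename: str) -> str:
    """ Replace every character outside letters, digits, '.', '_', '-' and whitespace with '_'. """
    pattern = r'[^a-zA-Z0-9._\s-]'
    return re.sub(pattern, '_', filename)

titles = [
    "Wolfenstein: The New Order - I Believe - Melissa Hollick (Official Ending Song)",
    "Jim Carrey Commencement Speech His Father's Inspiration",
]
cleaned_titles = [clean_filename(title) for title in titles]
for cleaned in cleaned_titles:
    print(cleaned)
```

Colons, brackets and apostrophes become underscores, while spaces and hyphens are kept.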

### Videos to Work With

We start by defining a list of videos to test our application with, along with a function that takes a full YouTube URL and returns just the id portion.

I’ve used these videos because…

- The first is the fantastic [Burning Bridges](https://www.youtube.com/watch?v=udRAIF6MOm8) by Sigrid. The video has no embedded transcript.
- The second is the beautiful song [I Believe](https://www.youtube.com/watch?v=CiTn4j7gVvY) by Melissa Hollick. It’s one of my favourite songs of all time. When I get a migraine, I turn off the lights, and listen to this to feel better! And for those who enjoy gaming, this song is the end titles to the amazing Wolfenstein: The New Order game. This video has an embedded transcript.
- Then we have a short [Jim Carrey speech](https://www.youtube.com/watch?v=nLgHNu2N3JU), which gives us dialog without music or other ambient noise. It has an embedded transcript.
- And finally, a [Ukrainian song](https://www.youtube.com/watch?v=d4N82wPpdg8) from Eurovision 2024, by Jerry Heil and Alyona Alyona. This gives us an opportunity to test translation. It also has an embedded transcript.

In [17]:
# Videos to download
urls = [
    "https://www.youtube.com/watch?v=udRAIF6MOm8",  # Sigrid - Burning Bridges (English)
    "https://www.youtube.com/watch?v=CiTn4j7gVvY",  # Melissa Hollick - I Believe (English)
    "https://www.youtube.com/watch?v=nLgHNu2N3JU",  # Jim Carrey - Motivational speech (English)
    "https://www.youtube.com/watch?v=d4N82wPpdg8",  # Jerry Heil & Alyona Alyona - Teresa & Maria (Ukrainian)
]

def get_video_id(url: str) -> str:
    """ Return the video ID, which is the part after 'v=' """
    return url.split("v=")[-1]
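
Splitting on `v=` works for the URLs above, but it would keep any trailing parameters such as `&t=10s`. As a hedged alternative, the sketch below parses the query string with the standard library instead. The name `get_video_id_from_query` is mine, purely for illustration, and it only handles `watch?v=...` style URLs.

```python
from urllib.parse import parse_qs, urlparse

def get_video_id_from_query(url: str) -> str:
    """ Return the value of the 'v' query parameter (hypothetical helper). """
    query = parse_qs(urlparse(url).query)
    if "v" not in query:
        raise ValueError(f"No 'v' parameter found in: {url}")
    return query["v"][0]

# A sample URL with an extra parameter after the video ID
sample_url = "https://www.youtube.com/watch?v=udRAIF6MOm8&t=10s"
print(get_video_id_from_query(sample_url))
```

For the four URLs in the list above, both approaches return the same ID.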

## Downloading Videos and Extracting Audio

Let's use the [pytubefix](https://github.com/JuanBindez/pytubefix) library to download YouTube videos, and then to download mp3 audio-only streams as files.

This library is a community-maintained fork of `pytube`. It was created to provide quick fixes for issues in the original `pytube` library, particularly when YouTube updates break it.

In [53]:

from pytubefix import YouTube
from pytubefix.cli import on_progress

output_locn = f"{locations.output_dir}/pytubefix"

def process_yt_videos():
    for i, url in enumerate(urls):
        logger.info(f"Downloads progress: {i+1}/{len(urls)}")

        try:
            yt = YouTube(url, on_progress_callback=on_progress)
            logger.info(f"Getting: {yt.title}")
            video_stream = yt.streams.get_highest_resolution()
            if not video_stream:
                raise Exception("Stream not available.")
            
            # YouTube resource titles may contain special characters which 
            # can't be used when saving the file. So we need to clean the filename.
            cleaned = clean_filename(yt.title)
            
            video_file = f"{cleaned}.mp4"
            logger.info(f"Downloading video {video_file} ...")
            # Pass the .mp4 extension explicitly so the saved file matches the name logged above
            video_stream.download(output_path=output_locn, filename=video_file)
        
            logger.info("Creating audio...")
            audio_stream = yt.streams.get_audio_only()
            audio_stream.download(output_path=output_locn, filename=cleaned, mp3=True)
            
            logger.info("Done")
            
        except Exception as e:        
            logger.error(f"Error processing URL '{url}'.")
            logger.debug(f"The cause was: {e}") 
            
    logger.info(f"Downloads finished. See files in {output_locn}.")
    
process_yt_videos()


00:46:52.975:dazbo-yt-demos - INF: Downloads progress: 1/4
00:46:53.145:dazbo-yt-demos - INF: Getting: Sigrid - Burning Bridges
00:47:19.145:dazbo-yt-demos - INF: Downloading video Sigrid - Burning Bridges.mp4 ...
 ↳ |████████████████████████████████████████████| 100.0%
00:47:39.932:dazbo-yt-demos - INF: Creating audio...
 ↳ |████████████████████████████████████████████| 100.0%
00:48:03.593:dazbo-yt-demos - INF: Done
00:48:14.211:dazbo-yt-demos - INF: Downloads progress: 2/4
00:48:17.390:dazbo-yt-demos - INF: Getting: Wolfenstein: The New Order - I Believe - Melissa Hollick (Official Ending Song)
00:48:39.972:dazbo-yt-demos - INF: Downloading video Wolfenstein_ The New Order - I Believe - Melissa Hollick _Official Ending Song_.mp4 ...
 ↳ |████████████████████████████████████████████| 100.0%
00:48:42.392:dazbo-yt-demos - INF: Creating audio...
 ↳ |████████████████████████████████████████████| 100.0%
00:48:45.745:dazbo-yt-demos - INF: Done
00:48:47.943:dazbo-yt-demos - INF: Downloads progress: 3/4
00:48:51.097:dazbo-yt-demos - INF: Getting: Jim Carrey Commencement Speech His Father's Inspiration
00:48:55.716:dazbo-yt-demos - INF: Downloading video Jim Carrey Commencement Speech His Father_s Inspiration.mp4 ...
 ↳ |████████████████████████████████████████████| 100.0%
00:48:56.247:dazbo-yt-demos - INF: Creating audio...
 ↳ |████████████████████████████████████████████| 100.0%
00:48:56.513:dazbo-yt-demos - INF: Done
00:48:56.515:dazbo-yt-demos - INF: Downloads progress: 4/4
00:48:56.671:dazbo-yt-demos - INF: Getting: alyona alyona & Jerry Heil - Teresa & Maria (LIVE) | Ukraine 🇺🇦 | Grand Final | Eurovision 2024
00:48:59.104:dazbo-yt-demos - INF: Downloading video alyona alyona _ Jerry Heil - Teresa _ Maria _LIVE_ _ Ukraine __ _ Grand Final _ Eurovision 2024.mp4 ...
 ↳ |████████████████████████████████████████████| 100.0%
00:49:00.349:dazbo-yt-demos - INF: Creating audio...
 ↳ |████████████████████████████████████████████| 100.0%
00:49:00.718:dazbo-yt-demos - INF: Done
00:49:00.719:dazbo-yt-demos - INF: Downloads finished. See files in /mnt/c/Users/djl/localdev/python/youtube-and-video/src/notebooks/dazbo-yt-demos/output/pytubefix.


## Extract Existing Transcripts from Videos

Now I'm going to use the [youtube-transcript-api](https://github.com/jdepoix/youtube-transcript-api) to extract existing transcripts from YouTube videos. Not only will it return the transcript, but it can also translate those transcripts into other languages. So now I can download my Ukrainian song, and see both the Ukrainian transcript and the English translation. This is pretty awesome!

However, some videos do not contain transcripts.

In [None]:
%pip install --upgrade --no-cache-dir youtube_transcript_api

In [None]:
import youtube_transcript_api as yt_api
from pytubefix import YouTube
from pytubefix.cli import on_progress

def get_transcripts():
    """ Extract existing transcript data from videos """
    for url in urls:
        try: # Just so we can get the video title
            yt = YouTube(url, on_progress_callback=on_progress)
        except Exception as e:        
            logger.error(f"Error processing URL '{url}'.")
            logger.debug(f"The cause was: {e}") 
            continue
        
        logger.info(f"Processing '{yt.title}'...")
        video_id = get_video_id(url)
        
        try:
            # By default, we get a list of 1: only get the preferred language transcript
            transcript_list = yt_api.YouTubeTranscriptApi.list_transcripts(video_id)
        except Exception as e:
            logger.error(f"Unable to extract transcript for '{yt.title}'.")
            logger.debug(e)
            continue
        
        # iterate over all available transcripts
        for transcript in transcript_list:
            # The Transcript object provides metadata properties. Here are some...
            properties = {
                "video_id": transcript.video_id,
                "language": transcript.language,
                "language_code": transcript.language_code,
                "is_generated": transcript.is_generated,  # Whether it has been manually created or generated by YouTube
                "is_translatable": transcript.is_translatable,  # Whether this transcript can be translated or not
                "translation_languages": transcript.translation_languages,
            }
            
            for prop, value in properties.items():
                logger.info(f"{prop}: {value}")

            # Fetch the actual transcript data
            transcript_data = transcript.fetch() # returns a list of dicts
            logger.info(f"Raw transcript:\n{transcript_data}") 
            
            processed_transcript = process_transcript(transcript_data)
            logger.info(f"Processed transcript:\n{processed_transcript}")
            
            # Translate to en if we can
            if (transcript.language_code != "en" and 
                    transcript.is_translatable and 
                    any(lang['language_code'] == 'en' for lang in transcript.translation_languages)):
                transcript_data = transcript.translate('en').fetch() # translate to en
                processed_transcript = process_transcript(transcript_data)
                logger.info(f"Processed translated transcript:\n{processed_transcript}")

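def process_transcript(transcript_data):
    """ Join the text of all entries, skipping any entry whose text starts with [ """
    # startswith() also copes with empty text, where indexing [0] would raise an IndexError
    return "\n".join([entry['text'] for entry in transcript_data 
                                     if not entry['text'].startswith("[")])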
                
get_transcripts()

How cool is this!?
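
To make the filtering step concrete, here is a minimal sketch that runs the same idea against hand-written sample data, so it needs no network access. The entries below are made up for illustration; they only mimic the list-of-dicts shape that the code above expects, with non-speech markers in square brackets. The name `filter_transcript` is hypothetical.

```python
def filter_transcript(entries: list[dict]) -> str:
    """ Join the text of all entries, skipping bracketed markers (illustrative helper). """
    return "\n".join(entry["text"] for entry in entries
                     if not entry["text"].startswith("["))

# Made-up sample entries, not real transcript data
sample_entries = [
    {"text": "[Music]", "start": 0.0, "duration": 2.5},
    {"text": "first line of speech", "start": 2.5, "duration": 3.0},
    {"text": "[Applause]", "start": 5.5, "duration": 1.5},
    {"text": "second line of speech", "start": 7.0, "duration": 2.0},
]

filtered = filter_transcript(sample_entries)
print(filtered)
```

Only the two spoken lines survive; the bracketed markers are dropped.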

## Adding Google Cloud Smarts

Now we're going to use Google Cloud APIs. To use these services, you'll first need to have created a Google Cloud project.

Then, in order to give your notebook access to the Google Cloud APIs, you broadly have three options:

1. You can build and run your notebook locally.
1. You can build and run your notebook in Google Colab.
1. You can build and run your notebook in the Google Vertex AI Workbench environment.

A bit more detail...

### Local Notebook

For local development you will need to:

1. Have the Google Cloud `gcloud CLI` installed. See instructions [here](https://cloud.google.com/sdk/docs/install).
1. Authenticate to your gcloud environment.
1. Use [Application Default Credentials](https://cloud.google.com/docs/authentication/application-default-credentials) from within the notebook.

The process looks like this:

```python
from google.auth import default
!gcloud auth application-default login
credentials, _ = default()

PROJECT = !gcloud config get-value project
PROJECT_ID = PROJECT[0]
REGION = "europe-west2"

import vertexai
vertexai.init(project=PROJECT_ID, location=REGION)
```

### Google Colab

The great thing about this approach is that Colab provides native integration to authenticate your user account and provide your Google project details to the Colab environment.

Notes:

- No need to install Google `gcloud` locally.
- Share notebooks using Google Drive. Google-drive based access control.
- There are limitations for notebook size, and for notebook runtime instance size.

Check out [this guide](https://github.com/GoogleCloudPlatform/devrel-demos/blob/main/other/colab/Using%20Google%20Cloud%20from%20Colab.ipynb).

For example:

```python
import sys

import google.auth
import vertexai

PROJECT = !gcloud config get-value project
PROJECT_ID = PROJECT[0]
REGION = "europe-west2"

# Additional authentication is required for Google Colab
if "google.colab" in sys.modules:
    from google.colab import auth
    # Authenticate user
    auth.authenticate_user()

# Load whichever credentials are now available to the environment
credentials, _ = google.auth.default()

vertexai.init(
    project=PROJECT_ID,
    location=REGION,
    credentials=credentials
)
```

### Vertex AI Workbench

[Vertex AI Workbench](https://cloud.google.com/vertex-ai/docs/workbench/introduction) is Google's managed enterprise Jupyter notebook hosting service. It is fully integrated with the Google Cloud and Vertex AI ecosystem.

Notes:

- The JupyterLab environment is pre-installed.
- Access control and sharing is managed by Google Cloud IAM, rather than Google Drive.
- Because it is natively integrated with the Google Cloud environment, you don't need to provide any credentials or authenticate. You just need to provide your Google Cloud project ID and region to any services that require this information. For example:

```python
import vertexai

PROJECT = !gcloud config get-value project
PROJECT_ID = PROJECT[0]
REGION = "europe-west2"

vertexai.init(project=PROJECT_ID, location=REGION)
```

### Retrieve Environment Variables

In [40]:
import sys
from getpass import getpass

# Retrieve PROJECT_ID and other variables from any .env we can find
try:
    dc.get_envs_from_file()
except ValueError as e:
    logger.error(f"Problem reading env file:\n{e}")

env_vars = ["PROJECT_ID", "REGION"] # The vars we want to retrieve
for env_var in env_vars:
    if not os.getenv(env_var):
        # If not retrieved from .env we'll need to input the value
        os.environ[env_var] = getpass(f"Enter {env_var}: ")

    # Set Python variable of the same name as the env var, e.g. PROJECT_ID
    globals()[env_var] = os.environ[env_var]
    val = globals()[env_var]
    logger.info(f"{env_var} retrieved: {val}")
    

00:09:51.418:dazbo-yt-demos - INF: PROJECT_ID retrieved: video-smarts-442000
00:09:51.419:dazbo-yt-demos - INF: REGION retrieved: europe-west2
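
The same fall-back pattern can be written as a small function, which makes it easier to test. This is only a sketch: `require_env` and its `prompt` parameter are names I've made up here, and the default prompt uses `getpass` as in the cell above.

```python
import os
from getpass import getpass
from typing import Callable

def require_env(name: str, prompt: Callable[[str], str] = getpass) -> str:
    """ Return the env var's value, asking for it (and storing it) if unset or empty. """
    value = os.getenv(name)
    if not value:
        value = prompt(f"Enter {name}: ")
        os.environ[name] = value
    return value

# Demo with a made-up variable name and a fake prompt, so nothing blocks on input
os.environ.pop("DEMO_REGION", None)
first = require_env("DEMO_REGION", prompt=lambda _msg: "europe-west2")
second = require_env("DEMO_REGION", prompt=lambda _msg: "should-not-be-used")
print(first, second)
```

The second call never reaches the prompt, because the first call has already stored the value in `os.environ`.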


### Clear Environment Variables

Only run the next cell if you want to manually clear the environment variables and then input new values. In this scenario, you'll also want to comment out any variables in your .env file.

In [39]:
# Optionally run this if we want to clear env vars
for env_var in env_vars:
    if env_var in os.environ:
        del os.environ[env_var]
        logger.info(f"Cleared environment variable: {env_var}")

00:09:45.875:dazbo-yt-demos - INF: Cleared environment variable: PROJECT_ID
00:09:45.876:dazbo-yt-demos - INF: Cleared environment variable: REGION


In [None]:
%pip install google-auth google-auth-oauthlib google-auth-httplib2

In [37]:
# If we're running Google Colab, authenticate
if "google.colab" in sys.modules:
    from google.colab import auth
    auth.authenticate_user()
else:
    # If not running in Vertex AI Workbench, set up ADC first: !gcloud auth application-default login
    pass

from google.auth import default
credentials, _ = default()


### Video Transcription Using the Video Intelligence API

In [None]:
%pip install google-cloud-videointelligence

In [None]:
from google.cloud import videointelligence

video_client = videointelligence.VideoIntelligenceServiceClient()
features = [videointelligence.Feature.SPEECH_TRANSCRIPTION]

config = videointelligence.SpeechTranscriptionConfig(
    language_code="en-US", enable_automatic_punctuation=True
)
video_context = videointelligence.VideoContext(speech_transcription_config=config)

# Fortunately, this API natively supports mp4 without any conversion
for video in Path(output_locn).glob("*.mp4"):
    logger.info(f"Processing {video}...")

    try:
        with open(video, "rb") as file:
            input_content = file.read()
            
        operation = video_client.annotate_video(
            request={
                "features": features,
                "input_content": input_content,
                "video_context": video_context,
            }
        )

        logger.info("Processing video for speech transcription.")
        result = operation.result(timeout=600)

        # There is only one annotation_result since only one video is processed.
        annotation_results = result.annotation_results[0]
        for speech_transcription in annotation_results.speech_transcriptions:
            # The number of alternatives for each transcription is limited by
            # SpeechTranscriptionConfig.max_alternatives.
            # Each alternative is a different possible transcription
            # and has its own confidence score.
            for alternative in speech_transcription.alternatives:
                print("Alternative level information:")

                print("Transcript: {}".format(alternative.transcript))
                print("Confidence: {}\n".format(alternative.confidence))

                print("Word level information:")
                for word_info in alternative.words:
                    word = word_info.word
                    start_time = word_info.start_time
                    end_time = word_info.end_time
                    print(
                        "\t{}s - {}s: {}".format(
                            start_time.seconds + start_time.microseconds * 1e-6,
                            end_time.seconds + end_time.microseconds * 1e-6,
                            word,
                        )
                    )
    except Exception as e:
        logger.error(e)
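
The word timings above are converted to seconds by adding `seconds` and `microseconds` by hand. Assuming the offsets behave like `datetime.timedelta` objects (which is what those two attributes suggest), `total_seconds()` gives the same number in one call. The sketch below uses hand-built `timedelta` values rather than real API results, and `format_word_timing` is a hypothetical helper name.

```python
from datetime import timedelta

def format_word_timing(word: str, start: timedelta, end: timedelta) -> str:
    """ Format one word with its start and end offsets in seconds (illustrative helper). """
    return f"{start.total_seconds():.1f}s - {end.total_seconds():.1f}s: {word}"

# Hand-built offsets standing in for word_info.start_time / word_info.end_time
line = format_word_timing(
    "believe",
    timedelta(seconds=12, microseconds=300_000),
    timedelta(seconds=12, microseconds=900_000),
)
print(line)
```

Rounding to one decimal place is a presentation choice for this sketch; drop the `:.1f` format spec to print the full value.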

## Vertex AI

Let's integrate some Google Cloud Vertex AI smarts. Start by installing the **Google Cloud Vertex AI SDK for Python**. 

From [Introduction to the Vertex AI SDK for Python](https://cloud.google.com/vertex-ai/docs/python-sdk/use-vertex-ai-python-sdk#sdk-vs-client-library):

When you install the Vertex AI SDK for Python (`google.cloud.aiplatform`), the Vertex AI Python client library (`google.cloud.aiplatform.gapic`) is also installed. The Vertex AI SDK and the Vertex AI Python client library provide similar functionality with different levels of granularity. The Vertex AI SDK operates at a higher level of abstraction than the client library and is suitable for most common data science workflows. If you need lower-level functionality, then use the Vertex AI Python client library.

In [None]:
# Install the Vertex AI SDK for Python and the Google Gemini API SDK for Python
%pip install --upgrade google-cloud-aiplatform \
                       google-generativeai

In [None]:
from google.cloud import aiplatform # Google Cloud Vertex AI SDK for Python
# import vertexai   # Google Cloud Vertex Generative AI SDK for Python
import google.generativeai as genai  # Google Gemini API (GenAI)
from vertexai.generative_models import GenerativeModel