
In [4]:
!gcloud auth application-default login --no-launch-browser

Go to the following link in your browser, and complete the sign-in prompts:

    https://accounts.google.com/o/oauth2/auth?response_type=code&client_id=764086051850-6qr4p6gpi6hn506pt8ejuq83di341hur.apps.googleusercontent.com&redirect_uri=https%3A%2F%2Fsdk.cloud.google.com%2Fapplicationdefaultauthcode.html&scope=openid+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fuserinfo.email+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fcloud-platform+https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fsqlservice.login&state=phw1GPSu1ZAinRBOOVy5sGPp9rnh8b&prompt=consent&token_usage=remote&access_type=offline&code_challenge=nSqFYyVLNwdXixUBGikfd5qAAk--luZb6mLO11Pa_Ak&code_challenge_method=S256

Once finished, enter the verification code provided in your browser: 

Credentials saved to file: [/content/.config/application_default_credentials.json]

These credentials will be used by any library that requests Application Default Credentials (ADC).
Cannot find a quota project to add to ADC. You might receive a "quota excee

# Task
Summarize the video located at "gs://mrc-quant-ml-video-analysis/videoplayback.mp4" using temporal chunking, Whisper for captioning, vector encoding, and a memory-augmented LLM, and evaluate the summary based on faithfulness, temporal coherence, and semantic coverage.

## Setup and authentication

### Subtask:
Install necessary libraries and authenticate to Google Cloud Storage.


**Reasoning**:
Install the necessary Python libraries using pip.
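
Before running the install, it can help to see what is already present. The following is a minimal sketch, not part of the original notebook: `missing_requirements` is a hypothetical helper, and the module list is an assumption about which import names correspond to the pip packages used here. Note that `ffmpeg-python` is only a wrapper, so the `ffmpeg` executable itself must also be on the `PATH`.

```python
import importlib.util
import shutil


def is_importable(name):
    """Return True if a module with this import name can be located."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # Raised when a parent package of a dotted name is missing,
        # or when an already-loaded module has no spec.
        return False


def missing_requirements(modules, binaries):
    """Return (modules that cannot be imported, executables not on PATH)."""
    missing_modules = [m for m in modules if not is_importable(m)]
    missing_binaries = [b for b in binaries if shutil.which(b) is None]
    return missing_modules, missing_binaries


# Import names differ from pip package names (e.g. openai-whisper -> whisper).
# This list is an assumption based on the packages installed in this notebook.
modules = ["google.cloud.storage", "ffmpeg", "whisper",
           "sentence_transformers", "torch", "numpy", "scipy"]
missing_modules, missing_binaries = missing_requirements(modules, ["ffmpeg"])
print("Missing modules:", missing_modules)
print("Missing executables:", missing_binaries)
```

Empty lists suggest the environment is ready; anything listed still needs to be installed.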



In [5]:
%pip install google-cloud-storage google-cloud-aiplatform ffmpeg-python openai-whisper numpy torch transformers sentence-transformers scipy

Collecting ffmpeg-python
  Downloading ffmpeg_python-0.2.0-py3-none-any.whl.metadata (1.7 kB)
Collecting openai-whisper
  Downloading openai_whisper-20250625.tar.gz (803 kB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting nvidia-cuda-nvrtc-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_nvrtc_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-runtime-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_runtime_cu12-12.4.127-py3-none-manylinux2014_x86_64.whl.metadata (1.5 kB)
Collecting nvidia-cuda-cupti-cu12==12.4.127 (from torch)
  Downloading nvidia_cuda_cupti_cu12-12.4.

## Download video

### Subtask:
Download the video from the provided GCS path.


**Reasoning**:
Download the video from the specified GCS path by importing the necessary library, creating a client, defining bucket and blob names, and using the download_to_filename method.
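
The bucket and blob names used for the download correspond to the two halves of the `gs://` path given in the task. As a small sketch, they could be derived from that path instead of being typed by hand. `split_gcs_uri` is a hypothetical helper introduced here for illustration; it is not part of `google-cloud-storage`.

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_name)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Expected a gs:// URI, got {uri!r}")
    # Everything up to the first '/' is the bucket; the rest is the object name.
    bucket, _, blob = uri[len(prefix):].partition("/")
    if not bucket or not blob:
        raise ValueError(f"URI must name both a bucket and an object: {uri!r}")
    return bucket, blob


bucket_name, source_blob_name = split_gcs_uri(
    "gs://mrc-quant-ml-video-analysis/videoplayback.mp4"
)
print(bucket_name, source_blob_name)
```

The resulting values can be passed to `storage_client.bucket(...)` and `bucket.blob(...)` exactly as the hard-coded strings are.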



In [6]:
from google.cloud.storage import Client

# Instantiate a Client object
storage_client = Client()

# Define the bucket name and source blob name
bucket_name = "mrc-quant-ml-video-analysis"
source_blob_name = "videoplayback.mp4"
destination_file_name = "videoplayback.mp4"

# Get the bucket object
bucket = storage_client.bucket(bucket_name)

# Get the blob object
blob = bucket.blob(source_blob_name)

# Download the blob to the specified destination file
blob.download_to_filename(destination_file_name)

print(f"Downloaded {source_blob_name} from bucket {bucket_name} to {destination_file_name}")



OSError: Project was not passed and could not be determined from the environment.

**Reasoning**:
The previous command failed because the client could not determine a project ID from the environment. To fix this, pass the project ID explicitly when instantiating the `Client`. Setting the same project as the ADC quota project should also address the earlier "Cannot find a quota project" warning.



In [9]:
from google.cloud.storage import Client
import os

# Read the project ID from the environment, if it is set there
project_id = os.environ.get("GOOGLE_CLOUD_PROJECT")

if not project_id:
    # Fallback: replace with your own project ID, or set GOOGLE_CLOUD_PROJECT
    project_id = "mrc-quant-ml"

# Use the same project as the ADC quota project to avoid the earlier warning
!gcloud auth application-default set-quota-project {project_id}

# Instantiate a Client object with the project ID
storage_client = Client(project=project_id)

# Define the bucket name and source blob name
bucket_name = "mrc-quant-ml-video-analysis"
source_blob_name = "videoplayback.mp4"
destination_file_name = "videoplayback.mp4"

# Get the bucket object
bucket = storage_client.bucket(bucket_name)

# Get the blob object
blob = bucket.blob(source_blob_name)

# Download the blob to the specified destination file
blob.download_to_filename(destination_file_name)

print(f"Downloaded {source_blob_name} from bucket {bucket_name} to {destination_file_name}")


Credentials saved to file: [/content/.config/application_default_credentials.json]

These credentials will be used by any library that requests Application Default Credentials (ADC).

Quota project "mrc-quant-ml" was added to ADC which can be used by Google client libraries for billing and quota. Note that some services may still bill the project owning the resource.
Downloaded videoplayback.mp4 from bucket mrc-quant-ml-video-analysis to videoplayback.mp4


## Process video into chunks

### Subtask:
Split the video into 2-5 minute chunks and save them.


**Reasoning**:
Define variables for the input video path, output directory, and chunk duration, then create the output directory and use ffmpeg to split the video into chunks of the specified duration.



In [8]:
import ffmpeg
import os

# Input video, output directory, and target chunk length
input_video_path = "videoplayback.mp4"
output_dir = "video_chunks"
chunk_duration = 300  # seconds (5 minutes, the upper end of the 2-5 minute range)

# Create the output directory if it doesn't exist
os.makedirs(output_dir, exist_ok=True)

# Split with ffmpeg's segment muxer. With c="copy" the streams are not
# re-encoded, so cuts land on keyframes and chunk lengths are approximate.
(
    ffmpeg
    .input(input_video_path)
    .output(f"{output_dir}/chunk_%04d.mp4",
            c="copy",
            segment_time=chunk_duration,
            f="segment")
    .run()
)

print(f"Video split into chunks in the directory: {output_dir}")

Video split into chunks in the directory: video_chunks
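
Since the task later evaluates temporal coherence, it may be useful to record where each chunk sits on the original timeline. The sketch below assumes zero-padded chunk names like those produced above and a fixed nominal chunk length; `nominal_chunk_windows` is a hypothetical helper. The windows are nominal only: with stream copy the real cut points fall on keyframes, and the last chunk is usually shorter than the others.

```python
def nominal_chunk_windows(chunk_files, chunk_duration):
    """Map each chunk file name to its nominal (start, end) offset in seconds."""
    windows = {}
    # Zero-padded names (chunk_0000.mp4, chunk_0001.mp4, ...) sort chronologically.
    for index, name in enumerate(sorted(chunk_files)):
        start = index * chunk_duration
        windows[name] = (start, start + chunk_duration)
    return windows


# Sample names for illustration; in the notebook they could come from
# os.listdir("video_chunks") instead.
sample_files = ["chunk_0001.mp4", "chunk_0000.mp4", "chunk_0002.mp4"]
for name, (start, end) in nominal_chunk_windows(sample_files, 300).items():
    print(f"{name}: {start}s to {end}s")
```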


## Caption video chunks

### Subtask:
Transcribe each video chunk using Whisper.


**Reasoning**:
Import necessary libraries and load the Whisper model for transcribing the video chunks.



In [None]:
import os
import whisper

# Load the Whisper model
model = whisper.load_model("base")

# Define the directory containing the video chunks
video_chunks_dir = "video_chunks"

# Get a list of video chunk files
chunk_files = [f for f in os.listdir(video_chunks_dir) if f.endswith(".mp4")]
chunk_files.sort() # Ensure consistent order

# Initialize a list to store transcriptions
transcriptions = []

# Iterate through the video chunks and transcribe
for chunk_file in chunk_files:
    chunk_path = os.path.join(video_chunks_dir, chunk_file)
    print(f"Transcribing {chunk_file}...")
    # Transcribe the audio from the video chunk
    result = model.transcribe(chunk_path)
    transcriptions.append({"filename": chunk_file, "transcription": result["text"]})
    print(f"Finished transcribing {chunk_file}")

# Display the transcriptions (optional)
# for transcription in transcriptions:
#     print(f"File: {transcription['filename']}")
#     print(f"Transcription: {transcription['transcription']}")
#     print("-" * 20)


100%|███████████████████████████████████████| 139M/139M [00:01<00:00, 95.0MiB/s]


Transcribing chunk_0000.mp4...
Finished transcribing chunk_0000.mp4
Transcribing chunk_0001.mp4...
Finished transcribing chunk_0001.mp4
Transcribing chunk_0002.mp4...
Finished transcribing chunk_0002.mp4
Transcribing chunk_0003.mp4...
Finished transcribing chunk_0003.mp4
Transcribing chunk_0004.mp4...
Finished transcribing chunk_0004.mp4
Transcribing chunk_0005.mp4...
Finished transcribing chunk_0005.mp4
Transcribing chunk_0006.mp4...
Finished transcribing chunk_0006.mp4
Transcribing chunk_0007.mp4...
Finished transcribing chunk_0007.mp4
Transcribing chunk_0008.mp4...
Finished transcribing chunk_0008.mp4
Transcribing chunk_0009.mp4...
Finished transcribing chunk_0009.mp4
Transcribing chunk_0010.mp4...
Finished transcribing chunk_0010.mp4
Transcribing chunk_0011.mp4...
Finished transcribing chunk_0011.mp4
Transcribing chunk_0012.mp4...
Finished transcribing chunk_0012.mp4
Transcribing chunk_0013.mp4...
Finished transcribing chunk_0013.mp4
Transcribing chunk_0014.mp4...
Finished transcribing chunk_0014.mp4
Transcribing chunk_0015.mp4...
Finished transcribing chunk_0015.mp4
Transcribing chunk_0016.mp4...
Finished transcribing chunk_0016.mp4
Transcribing chunk_0017.mp4...
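
The transcription loop keeps only `result["text"]`, which discards the per-segment timing that Whisper also returns under `result["segments"]` (each segment carries `start`, `end`, and `text`). As a hedged sketch, those timestamps could be shifted onto the full video's timeline by adding the chunk's nominal offset. `to_global_segments` is a hypothetical helper, and the offset assumes every chunk is exactly `chunk_duration` seconds long, which is only approximately true for stream-copied segments.

```python
def to_global_segments(result, chunk_index, chunk_duration):
    """Shift a chunk's Whisper segment times onto the full-video timeline."""
    offset = chunk_index * chunk_duration  # nominal start of this chunk, in seconds
    return [
        {
            "start": segment["start"] + offset,
            "end": segment["end"] + offset,
            "text": segment["text"].strip(),
        }
        for segment in result.get("segments", [])
    ]


# Stand-in for the dict returned by model.transcribe(); values are made up.
fake_result = {
    "text": " Hello there. Welcome back.",
    "segments": [
        {"start": 0.0, "end": 4.5, "text": " Hello there."},
        {"start": 4.5, "end": 9.0, "text": " Welcome back."},
    ],
}
for segment in to_global_segments(fake_result, chunk_index=2, chunk_duration=300):
    print(segment)
```

Inside the loop, the chunk index could come from `enumerate(chunk_files)`, and the shifted segments could be stored next to the plain transcription text.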


