# Analyze video using Gemini

### Overview

* Use Google Gemini LLM to analyze and describe a video

* **NOTE**
  - This is experimental, and there are no clear guidelines yet on which prompting strategies work best. Results will likely vary with the type of video.

#### Steps

1. Upload or download a video in a format such as MP4.
2. Upload the video through the Gemini File API (`genai.upload_file`) so that Gemini can analyze it.
3. Edit the prompt string to get the kind of description you want.

-----

### Setup

In [None]:
import os
import time

import google.generativeai as genai
from dotenv import load_dotenv
from IPython.display import Markdown, Video
from pytube import YouTube

_ = load_dotenv()

#### Setup Gemini API access

In [None]:
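# With no arguments, the API key is read from the environment
# (e.g. GOOGLE_API_KEY, loaded from .env above).
genai.configure()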


### Download YouTube video

* To download a video from YouTube:
  1. Copy the video's URL and assign it to a string variable named `YT_VIDEO_URL`
  2. Run the `download_youtube_video()` function with these arguments:
    - `url` - the URL of the video
    - `output_folder` - the destination folder; defaults to `data/videos` but can be changed
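
The `download_youtube_video()` function is referenced above but not defined in this notebook. The following is a minimal sketch of what it might look like, assuming pytube's `YouTube(...).streams` query interface. The `youtube_cls` parameter is a hypothetical addition, not part of the original function, that allows a stand-in class to be swapped in when working offline.

```python
import os


def download_youtube_video(url, output_folder="data/videos", youtube_cls=None):
    """Download the highest-resolution progressive MP4 stream; return its path.

    `youtube_cls` is a hypothetical hook (not in the original notebook) that
    lets a stand-in class replace pytube.YouTube, e.g. for offline testing.
    """
    if youtube_cls is None:
        # Imported lazily so this cell loads even where pytube is not installed.
        from pytube import YouTube as youtube_cls

    os.makedirs(output_folder, exist_ok=True)

    # Progressive streams carry audio and video in a single file.
    stream = (
        youtube_cls(url)
        .streams.filter(progressive=True, file_extension="mp4")
        .order_by("resolution")
        .desc()
        .first()
    )
    if stream is None:
        raise ValueError(f"No progressive MP4 stream found for {url}")

    # download() returns the path of the saved file.
    return stream.download(output_path=output_folder)
```

With `YT_VIDEO_URL` set, `video_file_name = download_youtube_video(YT_VIDEO_URL)` would give a path that can be passed to the cells below.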

### View video

In [None]:
video_file_name = 'videos/Football_Anthem_Film_adidas.mp4'

In [None]:
Video(video_file_name)

### Upload video via the Gemini File API so it can be analyzed by Gemini

In [None]:
print("Uploading file...")
video_file = genai.upload_file(path=video_file_name)
print(f"Completed upload: {video_file.uri}")

In [None]:

# Check whether the file is ready to be used.
while video_file.state.name == "PROCESSING":
    print('.', end='')
    time.sleep(10)
    video_file = genai.get_file(video_file.name)

if video_file.state.name == "FAILED":
    raise ValueError(f"Video processing failed: {video_file.name}")
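
The polling loop above runs until the file leaves the `PROCESSING` state, so it would wait indefinitely if that never happens. One possible variant adds a deadline. This is a sketch only: `wait_until_processed` and its parameters are hypothetical names, and `get_file` is assumed to be any callable that takes a file name and returns a fresh file object, such as `genai.get_file`.

```python
import time


def wait_until_processed(file, get_file, poll_seconds=10, timeout_seconds=600):
    """Poll until `file` leaves the PROCESSING state; raise on timeout or failure.

    `get_file` is assumed to accept a file name and return an updated file
    object with `.name` and `.state.name` attributes (e.g. genai.get_file).
    """
    deadline = time.monotonic() + timeout_seconds
    while file.state.name == "PROCESSING":
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"{file.name} still processing after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
        file = get_file(file.name)

    if file.state.name == "FAILED":
        raise ValueError(f"Processing failed for {file.name}")
    return file
```

In this notebook the call would be `video_file = wait_until_processed(video_file, genai.get_file)`.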

### Analyze the video and produce text description

In [None]:

# Create the prompt.
prompt = '''
    Summarize this video in detail, scene by scene, describing the action, characters, themes
    so that someone who cannot see the video would be able to get the same full sense and message
    that a person viewing the video does. Include timestamps for each scene and elements described.

    Separately produce a transcript of the audio narration with the video also with timestamps.
    '''

# Choose a Gemini model.
model = genai.GenerativeModel(model_name="gemini-1.5-pro-latest")

# Make the LLM request.
print("Making LLM inference request...")
response = model.generate_content([video_file, prompt],
                                  request_options={"timeout": 600})

# Display the response, rendering any Markdown it contains.
Markdown(response.text)
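
Since the best prompt is likely to differ between videos, it can help to assemble the prompt from optional parts rather than editing one long string each time. The following is a minimal sketch; `build_prompt` and its flags are hypothetical and not part of the original notebook. It uses `textwrap.dedent` so the indentation in the source does not end up in the text sent to the model.

```python
import textwrap


def build_prompt(include_timestamps=True, include_transcript=True):
    """Assemble a video-description prompt from optional parts."""
    parts = [
        textwrap.dedent("""\
            Summarize this video in detail, scene by scene, describing the action,
            characters, and themes so that someone who cannot see the video gets
            the same full sense and message as a viewer.""")
    ]
    if include_timestamps:
        parts.append("Include timestamps for each scene and the elements described.")
    if include_transcript:
        # Only ask for transcript timestamps when timestamps are requested at all.
        ending = ", also with timestamps." if include_timestamps else "."
        parts.append(
            "Separately, produce a transcript of the audio narration" + ending
        )
    return "\n\n".join(parts)
```

For example, `prompt = build_prompt(include_transcript=False)` would request only the scene-by-scene description, which can then be passed to `model.generate_content([video_file, prompt])` as above.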