<a href="https://colab.research.google.com/github/ywchanna2001/Gemini-Experiments/blob/main/Video_Understanding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [11]:
#@title Building with Gemini 2.0: Video understanding
%%html
<iframe width="560" height="315" src="https://www.youtube.com/embed/Mot-JEU26GQ?si=pcb7-_MZTSi_1Zkw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

In [12]:
# Install SDK
%pip install -U -q 'google-genai'

In [13]:
from google.colab import userdata
GOOGLE_API_KEY = userdata.get('GOOGLE_API_KEY')

Initialize SDK client

In [14]:
from google import genai
from google.genai import types

client = genai.Client(api_key=GOOGLE_API_KEY)

Select a Gemini model

In [15]:
model_name = "gemini-2.5-pro-exp-03-25" # @param ["gemini-1.5-flash-latest","gemini-2.0-flash-lite","gemini-2.0-flash","gemini-2.5-pro-exp-03-25"] {"allow-input":true, isTemplate: true}

Get sample videos

In [16]:
# Download sample videos
!wget https://storage.googleapis.com/generativeai-downloads/videos/Pottery.mp4 -O Pottery.mp4 -q
!wget https://storage.googleapis.com/generativeai-downloads/videos/Jukin_Trailcam_Videounderstanding.mp4 -O Trailcam.mp4 -q
!wget https://storage.googleapis.com/generativeai-downloads/videos/post_its.mp4 -O Post_its.mp4 -q
!wget https://storage.googleapis.com/generativeai-downloads/videos/user_study.mp4 -O User_study.mp4 -q

Upload the videos

In [17]:
import time

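def upload_video(video_file_name):
  """Upload a video and wait until the Files API has finished processing it."""
  video_file = client.files.upload(file=video_file_name)

  # Poll every 10 seconds until the file leaves the PROCESSING state.
  while video_file.state == "PROCESSING":
    print('Waiting for video to be processed.')
    time.sleep(10)
    video_file = client.files.get(name=video_file.name)

  if video_file.state == "FAILED":
    raise ValueError(f'Video processing failed for {video_file_name}: {video_file.state}')
  print(f'Video processing complete: {video_file.uri}')

  return video_file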

pottery_video = upload_video('Pottery.mp4')
trailcam_video = upload_video('Trailcam.mp4')
post_its_video = upload_video('Post_its.mp4')
user_study_video = upload_video('User_study.mp4')

Waiting for video to be processed.
Video processing complete: https://generativelanguage.googleapis.com/v1beta/files/jk3k8848eslf
Waiting for video to be processed.
Video processing complete: https://generativelanguage.googleapis.com/v1beta/files/dtimwr0gfu85
Waiting for video to be processed.
Video processing complete: https://generativelanguage.googleapis.com/v1beta/files/sm9ly0w9vapw
Waiting for video to be processed.
Video processing complete: https://generativelanguage.googleapis.com/v1beta/files/ttc53pvwa0mk
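
The polling loop in `upload_video` waits indefinitely if a file never leaves the `PROCESSING` state. A minimal sketch of the same wait with an upper time limit follows. The helper name `wait_for_state` and its parameters are hypothetical (they are not part of the `google-genai` SDK), and the `sleep` and `clock` arguments exist only so the loop can be exercised without real delays.

```python
import time


def wait_for_state(fetch, is_pending, timeout_s=300.0, poll_s=10.0,
                   sleep=time.sleep, clock=time.monotonic):
    """Call fetch() repeatedly until is_pending(result) is False.

    Raises TimeoutError if the result is still pending once timeout_s
    seconds have elapsed. All names here are illustrative assumptions.
    """
    deadline = clock() + timeout_s
    result = fetch()
    while is_pending(result):
        if clock() >= deadline:
            raise TimeoutError(f"still pending after {timeout_s} seconds")
        sleep(poll_s)
        result = fetch()
    return result


# Hypothetical usage with the notebook's client (not executed here):
# uploaded = client.files.upload(file="Pottery.mp4")
# uploaded = wait_for_state(
#     fetch=lambda: client.files.get(name=uploaded.name),
#     is_pending=lambda f: f.state == "PROCESSING",
# )
```

Passing the fetch and pending checks as callables keeps the helper independent of the SDK, so the same loop could wrap any other long-running operation.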


Imports

In [18]:
import json
from PIL import Image
from IPython.display import display, Markdown, HTML

Search within the video

In [20]:
prompt = "Organize all scenes from this video in a table, along with timecode, a short description, a list of objects visible in the scene (with representative emojis) and an estimation of the level of excitement on a scale of 1 to 10"  # @param ["For each scene in this video, generate captions that describe the scene along with any spoken text placed in quotation marks. Place each caption into an object with the timecode of the caption in the video.", "Organize all scenes from this video in a table, along with timecode, a short description, a list of objects visible in the scene (with representative emojis) and an estimation of the level of excitement on a scale of 1 to 10"] {"allow-input":true}

video = trailcam_video # @param ["trailcam_video", "pottery_video", "post_its_video", "user_study_video"] {"type":"raw","allow-input":true}

response = client.models.generate_content(
    model=model_name,
    contents=[
        video,
        prompt,
    ]
)

Markdown(response.text)

Okay, here is a table organizing the scenes from the video, including timecodes, descriptions, visible objects with emojis, and an estimated excitement level.

| Timecode | Description | Visible Objects | Excitement Level (1-10) |
|---|---|---|---|
| 00:00 | Camera view obscured by fur, then clears. Two gray foxes appear in a rocky, leafy area, sniffing around. One jumps onto a rock. | Gray Fox 🦊, Rocks 🪨, Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 4 |
| 00:17 | Night vision (B&W): A mountain lion sniffs the ground in a wooded area, looks around, and walks away. | Mountain Lion 🐆, Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 6 |
| 00:34 | Night vision: Two gray foxes interact. One digs while the other watches, then pounces playfully/aggressively. They scatter, bumping the camera. | Gray Fox 🦊 (x2), Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 7 |
| 00:50 | Night vision: Two gray foxes scramble over rocks, interact briefly, and move away. | Gray Fox 🦊 (x2), Rocks 🪨, Trees/Shrubs 🌳, Ground/Dirt 🌱 | 5 |
| 01:04 | Night vision: A mountain lion stands alert near rocks, then slowly walks away up a slope. | Mountain Lion 🐆, Rocks 🪨, Trees/Shrubs 🌳, Ground/Dirt 🌱 | 5 |
| 01:17 | Night vision: An adult mountain lion walks past, closely followed by a cub. | Mountain Lion (Adult) 🐆, Mountain Lion (Cub) 🐾, Rocks 🪨, Trees/Shrubs 🌳, Ground/Dirt 🌱 | 8 |
| 01:29 | Night vision: A bobcat stands alert, looking towards the camera and sniffing the ground. | Bobcat 😼, Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 6 |
| 01:51 | Daytime: A large black bear walks through the woods towards the camera, then turns and walks away. | Black Bear 🐻, Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 7 |
| 01:56 | Night vision: A mountain lion walks past the camera. | Mountain Lion 🐆, Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 5 |
| 02:04 | Daytime: Camera is bumped by fur, then clears to show two bear cubs exploring and foraging. | Black Bear Cub 🐻🐾 (x2), Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 7 |
| 02:23 | Night vision: A gray fox explores a rocky ledge overlooking distant city lights. | Gray Fox 🦊, Rocks 🪨, Grass/Shrubs 🌱, City Lights 🌃 | 5 |
| 02:35 | Night vision: A black bear appears on the same rocky ledge, sniffing the ground where the fox was. | Black Bear 🐻, Rocks 🪨, Grass/Shrubs 🌱, City Lights 🌃 | 6 |
| 02:42 | Night vision: A mountain lion walks onto the same rocky ledge overlooking the city. | Mountain Lion 🐆, Rocks 🪨, Grass/Shrubs 🌱, City Lights 🌃 | 7 |
| 02:52 | Night vision: A mountain lion scent-marks against a tree. | Mountain Lion 🐆, Tree Trunk 🌳, Leaves 🍂, Ground/Dirt 🌱 | 6 |
| 03:05 | Daytime: An adult black bear stands alert, making mouth movements. Another bear passes very close. | Black Bear (Adult) 🐻, Black Bear (Young?) 🐻🐾, Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 8 |
| 03:22 | Daytime: A cinnamon-colored black bear stands alert and sniffs the ground. | Black Bear (Cinnamon phase) 🐻, Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 7 |
| 03:32 | Daytime: The cinnamon mother bear is joined by three cubs. They forage, she seems alert. One cub scratches itself. They eventually walk off together. | Black Bear (Mother, Cinnamon) 🐻, Black Bear Cubs 🐻🐾 (x3), Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 9 |
| 04:03 | Daytime: Two young bears (cubs/yearlings) forage together and approach the camera. | Black Bear Cubs/Yearlings 🐻🐾 (x2), Trees/Shrubs 🌳, Leaves 🍂, Ground/Dirt 🌱 | 7 |
| 04:22 | Night vision: A bobcat sits, looks around, then walks over a fallen log. | Bobcat 😼, Trees/Shrubs 🌳, Fallen Log🪵, Ground/Dirt 🌱 | 5 |
| 04:30 | Night vision: Two gray foxes make brief, alert appearances before running off. | Gray Fox 🦊 (x2), Trees/Shrubs 🌳, Fallen Log🪵, Ground/Dirt 🌱 | 4 |
| 04:57 | Night vision: A mountain lion sniffs the ground near a mound/log, looks around, and walks away. | Mountain Lion 🐆, Tree Trunk 🌳, Leaves 🍂, Ground/Dirt 🌱, Log/Mound 🪵 | 5 |
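
Because the model returns the scene list as a Markdown table, the response can be turned into structured rows and then searched, for example to jump to the most exciting scenes. The sketch below is one possible approach, not part of the original notebook: `parse_markdown_table` and `timecode_to_seconds` are hypothetical helpers, and they assume the text contains a single pipe table whose cells do not themselves contain `|`. The sample rows are shortened from the output above.

```python
def parse_markdown_table(text):
    """Return the rows of a pipe table in `text` as a list of dicts."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip().startswith("|")]
    if len(lines) < 2:
        return []

    def split_row(line):
        return [cell.strip() for cell in line.strip("|").split("|")]

    header = split_row(lines[0])
    rows = []
    for line in lines[2:]:  # lines[1] is the |---|---| separator row
        cells = split_row(line)
        if len(cells) == len(header):
            rows.append(dict(zip(header, cells)))
    return rows


def timecode_to_seconds(timecode):
    """Convert 'MM:SS' or 'HH:MM:SS' into a number of seconds."""
    seconds = 0
    for part in timecode.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


sample = """
| Timecode | Description | Excitement Level (1-10) |
|---|---|---|
| 00:17 | A mountain lion sniffs the ground. | 6 |
| 01:17 | An adult mountain lion walks past with a cub. | 8 |
"""

rows = parse_markdown_table(sample)
# Keep only scenes rated 7 or higher and report where they start.
exciting = [
    (timecode_to_seconds(row["Timecode"]), row["Description"])
    for row in rows
    if int(row["Excitement Level (1-10)"]) >= 7
]
print(exciting)
```

In the notebook, `sample` would be replaced by `response.text`. Model output is not guaranteed to follow the same column names on every run, so the header lookup may need adjusting.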

In [21]:
prompt = "Transcribe the sticky notes, organize them and put them in a table. Can you come up with a few more ideas?" # @param ["Transcribe the sticky notes, organize them and put them in a table. Can you come up with a few more ideas?", "Which of those names would fit an AI product that can resolve complex questions using its thinking abilities?"] {"allow-input":true}

video = post_its_video # @param ["trailcam_video", "pottery_video", "post_its_video", "user_study_video"] {"type":"raw","allow-input":true}

response = client.models.generate_content(
    model=model_name,
    contents=[
        video,
        prompt,
    ]
)

Markdown(response.text)

Okay, here are the transcribed project names from the sticky notes, organized into a table, followed by a few new suggestions based on the apparent themes (astronomy, mythology, science/math concepts).

**Transcribed Project Names from Sticky Notes**

|                    |                     |                     |
| :----------------- | :------------------ | :------------------ |
| Aether             | Convergence         | Lynx                |
| Andromeda's Reach  | Delphinus           | Lyra                |
| Astral Forge       | Draco               | Medusa              |
| Athena             | Echo                | Odin                |
| Athena's Eye       | Equilibrium         | Orion's Belt        |
| Bayes Theorem*     | Euler's Path        | Orion's Sword       |
| Canis Major        | Fractal             | Pandora's Box       |
| Centaurus          | Galactic Core       | Persius Shield      |
| Cerberus           | Golden Ratio        | Phoenix             |
| Celestial Drift    | Hera                | Prometheus Rising   |
| Chaos Field        | Infinity Loop       | Riemann's Hypothesis|
| Chaos Theory       | Leo Minor           | Sagitta             |
| Chimera Dream      | Lunar Eclipse       | Serpens             |
| Comet's Tail       | Stellar Nexus       | Stokes Theorem      |
|                    | Supernova Echo      | Symmetry            |
|                    | Taylor Series       | Titan               |
|                    | Vector              | Zephyr              |

*(Note: One sticky note appears to read "Bayes Therorem", likely a misspelling of "Theorem")*

**New Project Name Ideas (Following Themes)**

1.  **Event Horizon:** (Astronomy/Physics) - The boundary around a black hole, sounds cool and cutting-edge.
2.  **Asgard:** (Mythology - Norse) - The realm of the gods, powerful and evocative.
3.  **Singularity:** (Physics/Math) - A point of infinite density/complexity, often related to black holes or the Big Bang.
4.  **Quantum Leap:** (Physics/Concept) - Implies a sudden, significant advance or change.
5.  **Cygnus X-1:** (Astronomy) - A well-known galactic X-ray source, widely accepted to be a black hole.
6.  **Paradigm Shift:** (Concept) - A fundamental change in approach or underlying assumptions.
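
The transcribed names come back as a three-column grid with an empty header row, which is convenient to read but awkward to reuse. As a rough sketch (the helper `flatten_name_grid` is hypothetical and assumes cells never contain `|`), the grid can be flattened into one alphabetized list, dropping empty cells, the alignment row, and the trailing `*` footnote marker.

```python
def flatten_name_grid(table_text):
    """Collect every non-empty cell of a pipe table into one sorted list."""
    names = []
    for line in table_text.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue
        for cell in line.strip("|").split("|"):
            cell = cell.strip().rstrip("*")
            # Skip empty cells and alignment cells such as ":-----".
            if cell and set(cell) - set(":- "):
                names.append(cell)
    return sorted(names, key=str.casefold)


grid = """
|                    |                     |                     |
| :----------------- | :------------------ | :------------------ |
| Aether             | Convergence         | Lynx                |
| Bayes Theorem*     | Euler's Path        | Orion's Sword       |
|                    | Vector              | Zephyr              |
"""

names = flatten_name_grid(grid)
print(names)
```

A flat list like this could then be passed back to the model in a follow-up prompt, or deduplicated before adding new ideas.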