<a href="https://colab.research.google.com/github/ywchanna2001/AI-ML-Football-Analysis-system/blob/main/Video_Understanding.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [1]:
#@title Building with Gemini 2.0: Video understanding
%%html
<iframe width="560" height="315" src="https://www.youtube.com/embed/Mot-JEU26GQ?si=pcb7-_MZTSi_1Zkw" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

## Install SDK

In [4]:
%pip install -U -q 'google-genai'

## Setup API key

In [3]:
from google.colab import userdata

GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
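The cell above depends on `google.colab.userdata`, which can only be imported inside Colab. A minimal sketch of a fallback follows, assuming the key may instead be stored in an environment variable of the same name. The helper `resolve_api_key` is hypothetical and not part of the SDK.

```python
import os


def resolve_api_key(name: str = "GOOGLE_API_KEY") -> str:
    """Return the key from Colab secrets if possible, else from the environment.

    Hypothetical helper for illustration; `name` is both the Colab secret
    name and the environment variable name (an assumption of this sketch).
    """
    key = None
    try:
        # Only importable inside a Colab runtime.
        from google.colab import userdata
        key = userdata.get(name)
    except Exception:
        # Not running in Colab, or the secret is missing or not accessible.
        key = None

    key = key or os.environ.get(name)
    if not key:
        raise RuntimeError(f"No API key found in Colab secrets or ${name}")
    return key


# Usage (left commented so the cell does not fail when no key is configured):
# GOOGLE_API_KEY = resolve_api_key()
```

Failing early with a clear message is usually easier to debug than passing `None` to the client and getting an authentication error later.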

## Initialize SDK client

In [16]:
from google import genai
from google.genai import types

client = genai.Client(api_key=GOOGLE_API_KEY)

In [14]:
print(type(client))

<class 'google.genai.client.Client'>


## Select the Gemini model

Video understanding works best with the Gemini 2.5 Pro model. Earlier models can also be selected to compare their behavior, but using at least a 2.0 model is recommended.

Official comparison:
https://deepmind.google/technologies/gemini/pro/

In [6]:
model_name = "gemini-2.5-pro-exp-03-25" # @param ["gemini-1.5-flash-latest","gemini-2.0-flash-lite","gemini-2.0-flash","gemini-2.5-pro-exp-03-25"] {"allow-input":true, isTemplate: true}

## Analyze YouTube videos

Until we have built a dedicated video dataset of rabid dogs, we analyze and process existing videos that contain footage of rabid dogs.

In [8]:
import json
from PIL import Image
from IPython.display import display, Markdown, HTML

response = client.models.generate_content(
    model=model_name,
    contents=types.Content(
        parts=[
            types.Part(text="Find all the instances where Sundar says \"AI\". Provide timestamps and broader context for each instance."),
            types.Part(
                file_data=types.FileData(file_uri='https://www.youtube.com/watch?v=ixRanV-rdAQ')
            )
        ]
    )
)

Markdown(response.text)

Okay, here are all the instances where Sundar Pichai says "AI" in the provided video clip, along with timestamps and context:

1.  **0:29** - "**AI** is having a very busy year."
    *   **Context:** Sundar is opening the keynote, welcoming the audience, and acknowledging the significant recent developments and public interest in AI, setting the stage for the event's focus.
2.  **0:38** - "...as an **AI** first company..."
    *   **Context:** He's reflecting on Google's journey, stating they've been an "AI first" company for seven years and are now at an exciting inflection point.
3.  **0:45** - "...opportunity to make **AI** even more helpful..."
    *   **Context:** Sundar is outlining Google's goal to leverage the current advancements to make AI more beneficial for everyone (people, businesses, communities).
4.  **0:54** - "We've been applying **AI** to make our products radically more helpful..."
    *   **Context:** He explains that Google has already been using AI for some time to improve its products, setting up the transition to discuss *generative* AI's impact.
5.  **1:17** - "...how generative **AI** is helping to evolve our products..."
    *   **Context:** He specifically mentions generative AI as the next step in evolving Google's products and introduces Gmail as the first example.
6.  **1:40** - "...more advanced writing features powered by **AI**."
    *   **Context:** Sundar is tracing the evolution of Gmail features from Smart Reply to Smart Compose, noting that the latter led to more sophisticated AI-powered writing tools in Workspace.
7.  **3:02** - "**AI** has stitched together billions of panoramic images..."
    *   **Context:** Introducing Google Maps as the next example, he explains how AI was fundamental to creating Street View from the beginning.
8.  **3:13** - "...which uses **AI** to create a high fidelity representation of a place..."
    *   **Context:** He's describing Immersive View in Maps, explaining that AI is used to build the detailed 3D models of locations.
9.  **5:07** - "Another product made better by **AI** is Google Photos."
    *   **Context:** Transitioning to the third product example, highlighting AI's role in enhancing Google Photos.
10. **5:15** - "...one of our first **AI** native products."
    *   **Context:** Sundar is emphasizing that Google Photos was designed with AI at its core from its launch in 2015.
11. **5:38** - "**AI** advancements give us more powerful ways to do this."
    *   **Context:** He's discussing photo editing capabilities in Google Photos and how AI progress enables more advanced features.
12. **5:47** - "...uses **AI** powered computational photography..."
    *   **Context:** Explaining the technology behind the Magic Eraser feature in Google Photos.
13. **5:58** - "...semantic understanding and generative **AI**..."
    *   **Context:** Introducing the upcoming Magic Editor feature and mentioning the combination of AI techniques (semantic understanding and generative AI) that power it.
14. **7:40** - "...how **AI** can help you in moments that matter."
    *   **Context:** Summarizing the product examples (Gmail, Maps, Photos) just shown, framing them as ways AI provides practical help.
15. **7:47** - "...full potential of **AI** across the products you know and love."
    *   **Context:** Stating Google's overarching goal to integrate AI capabilities throughout its product suite.
16. **8:22** - "...making **AI** helpful for everyone..."
    *   **Context:** He introduces this phrase as the core theme for Google's AI strategy and the most profound way to advance their mission. This phrase appears on the screen.
17. **8:53** - "...building and deploying **AI** responsibly..."
    *   **Context:** Sundar lists the fourth key way Google is making AI helpful, focusing on responsible development and deployment.
18. **9:02** - "...ability to make **AI** helpful for everyone relies on..."
    *   **Context:** Connecting the goal of helpful AI directly to the need for advancing Google's underlying foundation models.
19. **11:26** - "It uses **AI** to better detect malicious scripts..."
    *   **Context:** Describing the capabilities of Sec-PaLM, a specialized version of PaLM 2 fine-tuned for security use cases.
20. **12:45** - "...bring **AI** in responsible ways to billions of people."
    *   **Context:** Framing the development of PaLM 2 as the latest step in Google's long-term commitment to responsible AI deployment.
21. **12:57** - "...defining **AI** breakthroughs over the last decade..."
    *   **Context:** Crediting the Google Brain and DeepMind teams (now merged) for many significant AI advancements.
22. **14:09** - "...deeply investing in **AI** responsibility."
    *   **Context:** Introducing the topic of tools for identifying synthetic content as part of Google's commitment to responsible AI.
23. **15:04** - "...one of our **AI** generated images has that metadata."
    *   **Context:** Committing to including metadata in all AI-generated images produced by Google's tools to aid identification.
24. **15:11** - "...responsible approach to **AI** later."
    *   **Context:** Mentioning that James Manyika will discuss Google's responsible AI approach in more detail later in the keynote.
25. **15:29** - "...experiment for conversational **AI**."
    *   **Context:** Describing Google Bard as an experiment in the field of conversational AI.
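The response above is free-form Markdown, so the timestamps have to be extracted from it before they can be used programmatically. A minimal sketch follows, assuming the model keeps the `**M:SS**` bold format seen in this output (which is not guaranteed) and that the timestamps are accurate enough to be worth linking to. The helper names `to_seconds`, `extract_timestamps`, and `youtube_link_at` are hypothetical.

```python
import re
from typing import List

# Matches bold timestamps such as **0:29**, **11:26**, or **1:02:03**.
# Bold words like **AI** or **Context:** do not match.
TIMESTAMP_RE = re.compile(r"\*\*(\d{1,2}(?::\d{2}){1,2})\*\*")


def to_seconds(stamp: str) -> int:
    """Convert 'M:SS' or 'H:MM:SS' to a number of seconds."""
    seconds = 0
    for part in stamp.split(":"):
        seconds = seconds * 60 + int(part)
    return seconds


def extract_timestamps(markdown_text: str) -> List[int]:
    """Return every bold timestamp in the text, in order, as seconds."""
    return [to_seconds(s) for s in TIMESTAMP_RE.findall(markdown_text)]


def youtube_link_at(video_url: str, seconds: int) -> str:
    """Append a start-time parameter to a YouTube watch URL."""
    separator = "&" if "?" in video_url else "?"
    return f"{video_url}{separator}t={seconds}s"


sample = (
    '1.  **0:29** - "**AI** is having a very busy year."\n'
    '19. **11:26** - "It uses **AI** to better detect malicious scripts..."'
)
for t in extract_timestamps(sample):
    print(youtube_link_at("https://www.youtube.com/watch?v=ixRanV-rdAQ", t))
```

Opening the generated links is a quick way to spot-check whether the reported timestamps line up with the video.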

In [30]:
from google import genai
from google.genai import types
from IPython.display import display, Markdown  # Used to render the result below

# --- Assume these are already defined from your notebook setup ---
# from google.colab import userdata
# GOOGLE_API_KEY=userdata.get('GOOGLE_API_KEY')
# client = genai.Client(api_key=GOOGLE_API_KEY)
# model_name = "gemini-2.5-pro-exp-03-25" # Or any other model selected above
# ---------------------------------------------------------------

def analyze_youtube_video(prompt: str, youtube_url: str, client: genai.Client, model: str) -> str:
    """
    Analyzes a YouTube video using the Gemini API with a custom prompt.

    Args:
        prompt: The text prompt/instructions for the AI model.
        youtube_url: The URL of the YouTube video to analyze.
        client: The initialized google.genai.Client instance.
        model: The name of the Gemini model to use (e.g., "gemini-2.5-pro-exp-03-25").

    Returns:
        The text response generated by the Gemini model. If the response
        contains no text, a message describing the response is returned.
        If the API call fails, the exception is caught and a string starting
        with "Error analyzing video:" is returned instead of being raised.
    """
    print(f"Attempting to analyze YouTube video: {youtube_url}")
    print(f"Using Model: {model}")
    print(f"Prompt: '{prompt}'")

    try:
        # Create the video part using FileData with the URI
        video_part = types.Part(
            file_data=types.FileData(file_uri=youtube_url)
        )

        # Create the text prompt part
        prompt_part = types.Part(text=prompt)

        # Construct the content object with both parts
        # The order of parts generally doesn't matter, but prompt first is common
        content_to_send = types.Content(parts=[prompt_part, video_part])

        # Make the API call
        response = client.models.generate_content( # In google-genai, generate_content lives on client.models
            model=model,
            contents=content_to_send
        )

        # response.text always exists as an attribute, so test its value:
        # it is None or empty when no text was generated (e.g., a blocked response)
        if response.text:
            print("Analysis successful.")
            return response.text
        else:
            # Handle cases where the response might be empty or blocked
            print("Warning: Response did not contain text. It might have been blocked.")
            # You might want to inspect response.prompt_feedback or other attributes
            return f"Analysis generated no text output. Response details: {response}"


    except Exception as e:
        # Handle potential errors during the API call
        print(f"An error occurred during Gemini API call: {e}")
        # You might want to log the full exception traceback here
        # import traceback
        # traceback.print_exc()
        return f"Error analyzing video: {e}"

# --- Example Usage ---

# **Important:** Make sure 'client' and 'model_name' are defined and valid before running this.
if 'client' in locals() and 'model_name' in locals():

    # Define your custom prompt and video URL
    my_custom_prompt = "Describe the main visual elements and the overall mood of this video clip in three sentences."
    video_to_analyze_url = 'https://www.youtube.com/watch?v=ixRanV-rdAQ' # Google I/O Keynote example

    # Call the function
    analysis_result = analyze_youtube_video(
        prompt=my_custom_prompt,
        youtube_url=video_to_analyze_url,
        client=client,      # Pass the initialized client
        model=model_name    # Pass the desired model name
    )

    # Display the result using Markdown
    print("\n--- Analysis Result ---")
    if analysis_result.startswith("Error"):
        print(analysis_result) # Print error messages directly
    else:
        display(Markdown(analysis_result))

else:
    print("Error: Please ensure 'client' (genai.Client) and 'model_name' are defined.")

Attempting to analyze YouTube video: https://www.youtube.com/watch?v=ixRanV-rdAQ
Using Model: gemini-2.5-pro-exp-03-25
Prompt: 'Describe the main visual elements and the overall mood of this video clip in three sentences.'
Analysis successful.

--- Analysis Result ---


This clip features Google CEO Sundar Pichai presenting at the Google I/O conference on a large, modern outdoor stage characterized by warm wood tones, minimalist curves, and accents of Google's signature colors. The primary visuals alternate between close-ups of Pichai and wide shots encompassing the stage, a large central screen displaying various graphics and product demonstrations (like Gmail, Maps, and Photos AI features), and the large, engaged audience. The overall mood is energetic, informative, and forward-looking, highlighting Google's focus on artificial intelligence advancements and product innovation.
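`analyze_youtube_video` passes the URL straight to the API, so a malformed link only surfaces as an API error. A minimal sketch of a local pre-check follows, assuming only the common `youtube.com/watch?v=...` and `youtu.be/...` URL shapes need to be recognized. The helper `youtube_video_id` is hypothetical, and which URL forms the API itself accepts is not verified here.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hostnames treated as YouTube watch pages in this sketch (an assumption).
WATCH_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com"}


def youtube_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube URL, or None if it is not recognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()

    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None

    if host in WATCH_HOSTS and parsed.path == "/watch":
        # Standard links carry the ID in the "v" query parameter.
        ids = parse_qs(parsed.query).get("v")
        return ids[0] if ids else None

    return None


url = "https://www.youtube.com/watch?v=ixRanV-rdAQ"
if youtube_video_id(url) is None:
    print(f"Skipping unrecognized URL: {url}")
else:
    print(f"Video ID: {youtube_video_id(url)}")
```

Checking the URL before the call avoids spending a request on input that could never succeed and keeps the error message specific.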