## Overview
In this notebook, we will analyse multiple brand marketing YouTube ads against the following criteria:
1. **Emotional Engagement**: Which ad evokes stronger emotions in viewers, either positive or negative?
2. **Memorability**: Which ad is more memorable based on its visuals, storyline, and audio elements?
3. **Clarity of Messaging**: Which ad better communicates the intended message to the audience?
4. **Creative Elements**: Which ad has more innovative or creative elements that capture attention?
5. **Call-to-Action (CTA)**: Which ad has a stronger or clearer CTA, and is it more likely to prompt viewer action?


## Install Vertex AI SDK and other required packages

In [1]:
%pip install --upgrade --user --quiet google-cloud-aiplatform itables

Note: you may need to restart the kernel to use updated packages.


## Restart runtime
To use the newly installed packages in this Jupyter runtime, you must restart the runtime. You can do this by running the cell below, which restarts the current kernel.

The restart might take a minute or longer. After it's restarted, continue to the next step.

In [2]:
import IPython

app = IPython.Application.instance()
app.kernel.do_shutdown(True)

{'status': 'ok', 'restart': True}

## Authenticate your notebook environment 
If you're running this notebook on Google Colab, run the cell below to authenticate your environment.

In [1]:

import sys

if "google.colab" in sys.modules:
    from google.colab import auth

    auth.authenticate_user()

## Set Google Cloud project information and initialize Vertex AI SDK
To get started using Vertex AI, you must have an existing Google Cloud project and enable the Vertex AI API.

In [2]:
# Use the environment variable if the user doesn't provide Project ID.
import os

import vertexai

PROJECT_ID = "[your-project-id]"  # @param {type:"string", isTemplate: true}
if PROJECT_ID == "[your-project-id]":
    PROJECT_ID = str(os.environ.get("GOOGLE_CLOUD_PROJECT"))

LOCATION = os.environ.get("GOOGLE_CLOUD_REGION", "us-central1")

vertexai.init(project=PROJECT_ID, location=LOCATION)
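
One detail of the cell above: `str(os.environ.get("GOOGLE_CLOUD_PROJECT"))` evaluates to the string `"None"` when the variable is unset, so a missing project only shows up later as an API error. The sketch below is one possible guard, not part of the Vertex AI SDK. The helper name `resolve_project_id` and the constant `PLACEHOLDER` are hypothetical.

```python
import os

# Hypothetical constant: the template value used in the cell above.
PLACEHOLDER = "[your-project-id]"


def resolve_project_id(project_id: str = PLACEHOLDER) -> str:
    """Return an explicit project ID, else GOOGLE_CLOUD_PROJECT, else raise."""
    # An explicitly provided ID wins over the environment.
    if project_id and project_id != PLACEHOLDER:
        return project_id
    # Fall back to the environment variable, failing loudly if it is unset.
    env_value = os.environ.get("GOOGLE_CLOUD_PROJECT")
    if not env_value:
        raise ValueError(
            "Set PROJECT_ID or the GOOGLE_CLOUD_PROJECT environment variable."
        )
    return env_value
```

With this in place, `PROJECT_ID = resolve_project_id(PROJECT_ID)` could replace the `if` block before `vertexai.init` is called.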

## Importing all the packages

In [3]:
import os
import json
import pandas as pd
import asyncio
from itables import show
import itables.options as itable_opts
from IPython.display import HTML, Markdown, display
from vertexai.generative_models import (
    GenerationConfig,
    GenerativeModel,
    FunctionDeclaration,
    Tool,
    Part,
)
from tenacity import retry, stop_after_attempt, wait_random_exponential

## Set up for analyzing multiple video files

We look at multiple marketing ads and extract the key elements from each one: visual elements such as facial expressions and colors, and audio elements such as tone and music, each linked to its timestamp in the video.

In [24]:
# Step 3: Initialize the Model
gemini_pro_model = GenerativeModel("gemini-1.5-pro")

# Define the system instruction text and the response schema
multiple_video_extraction_system_instruction_text = (
    "You are a video analyst that carefully looks through all frames of provided videos, "
    "extracting key elements necessary to respond to user prompts. These include visual elements "
    "like facial expressions, colors, and audio elements like tone and music, each linked to their "
    "timestamps in the video. Generate a well-structured response for each video in the given schema."
)

video_comparison_response_schema = {
    "type": "ARRAY",
    "items": {
        "type": "OBJECT",
        "properties": {
            "video_id": {"type": "STRING", "description": "Unique identifier or name for the video being analyzed"},
            "timestamp": {"type": "STRING", "description": "Timestamp of the key element in mm:ss format"},
            "element_type": {"type": "STRING", "enum": ["facial_expression", "gesture", "color_scheme", "music", "voice_tone"], "description": "Type of visual or audio element"},
            "description": {"type": "STRING", "description": "Description of the key visual or audio element"},
            "impact": {"type": "STRING", "enum": ["Positive", "Neutral", "Negative"], "description": "Impact of the element on viewer engagement"},
            "emotional_engagement_score": {"type": "NUMBER", "description": "Score indicating the emotional engagement level of the element (e.g., 1-10 scale)"},
            "memorability_score": {"type": "NUMBER", "description": "Score indicating how memorable this element is (e.g., 1-10 scale)"},
            "clarity_score": {"type": "NUMBER", "description": "Score indicating how clearly the intended message is conveyed by the element (e.g., 1-10 scale)"},
            "creative_score": {"type": "NUMBER", "description": "Score indicating the level of creativity displayed by the element (e.g., 1-10 scale)"},
            "overall_engagement_score": {"type": "NUMBER", "description": "Calculated engagement score for the entire video based on key features"}
        },
    },
}
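
The schema above declares no required fields, so it can help to check each parsed item before using it. The following is a minimal sketch under stated assumptions. The helper `find_schema_problems` is hypothetical, and the 1-10 range check is an assumption, since the schema only mentions that scale as an example.

```python
# Enum values copied from the response schema above.
ALLOWED_ELEMENT_TYPES = {"facial_expression", "gesture", "color_scheme", "music", "voice_tone"}
ALLOWED_IMPACTS = {"Positive", "Neutral", "Negative"}
SCORE_FIELDS = (
    "emotional_engagement_score",
    "memorability_score",
    "clarity_score",
    "creative_score",
)


def find_schema_problems(item: dict) -> list:
    """Return human-readable problems for one parsed response item (hypothetical helper)."""
    problems = []
    if item.get("element_type") not in ALLOWED_ELEMENT_TYPES:
        problems.append(f"unexpected element_type: {item.get('element_type')!r}")
    if item.get("impact") not in ALLOWED_IMPACTS:
        problems.append(f"unexpected impact: {item.get('impact')!r}")
    for field in SCORE_FIELDS:
        value = item.get(field)
        # bool is a subclass of int, so it is excluded explicitly.
        if not isinstance(value, (int, float)) or isinstance(value, bool):
            problems.append(f"{field} is missing or not numeric")
        elif not 1 <= value <= 10:  # Assumed scale; the schema only suggests 1-10.
            problems.append(f"{field}={value} is outside the 1-10 range")
    return problems
```

An empty list means the item passed every check; anything else can be logged or used to decide whether to rerun the request.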



## Prompt and configurations

Here we define a detailed prompt for the marketing analysis, along with the generation configuration.

In [25]:
from vertexai.generative_models import ToolConfig

# tool_config = ToolConfig(
#     function_calling_config=ToolConfig.FunctionCallingConfig(
#         mode=ToolConfig.FunctionCallingConfig.Mode.ANY,
#     ))
generation_config = GenerationConfig(
    temperature=1,
    top_p=0.95,
    max_output_tokens=8192,
    response_mime_type="application/json",
    response_schema=video_comparison_response_schema,  # Constrain the output to the schema defined above
)

# Step 4: Define a Prompt for Analysis and YouTube URLs
video_analysis_prompt = ("""
You are a marketing analyst tasked with comparing multiple video advertisements and determining which one is the most effective for a marketing campaign. Please analyze the following aspects for each video:
1. **Emotional Engagement**: Which ad evokes stronger emotions in viewers, either positive or negative?
2. **Memorability**: Which ad is more memorable based on its visuals, storyline, and audio elements?
3. **Clarity of Messaging**: Which ad better communicates the intended message to the audience?
4. **Creative Elements**: Which ad has more innovative or creative elements that capture attention?
5. **Call-to-Action (CTA)**: Which ad has a stronger or clearer CTA, and is it more likely to prompt viewer action?

Based on these criteria, provide a ranking for the ads, with a justification for why one ad is more effective than the others.

The videos for comparison are as follows:
- Video 1: [https://www.youtube.com/watch?v=video1]
- Video 2: [https://www.youtube.com/watch?v=video2]
- Video 3: [https://www.youtube.com/watch?v=video3]

Please provide a detailed analysis and final ranking with a summary explaining your decision.
"""
)

## Video URLs to analyse

Be very careful with your quota limits. I could analyse at most 3 videos per run, and at times you will have to rerun the cell after hitting a quota error.

In [26]:
youtube_video_urls = [
    "https://www.youtube.com/watch?v=VGa1imApfdg",
    "https://www.youtube.com/watch?v=u3ybWiEUaUU",
    "https://www.youtube.com/watch?v=keOaQm6RpBg",
    # "https://www.youtube.com/watch?v=z2T-Rh838GA"
    # Add more video URLs here...
]

# Step 5: Embed YouTube Videos for Display
for YOUTUBE_VIDEO_URL in youtube_video_urls:
    youtube_video_embed_url = YOUTUBE_VIDEO_URL.replace("/watch?v=", "/embed/")
    youtube_video_embed_html_code = f"""
    <iframe width="560" height="315" src="{youtube_video_embed_url}" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen></iframe>
    """
    display(HTML(youtube_video_embed_html_code))
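
The string replacement above works for plain `watch?v=` links, but it would carry extra query parameters (such as `&t=10s`) into the embed URL and would not handle `youtu.be` short links. A minimal sketch of a more tolerant conversion, using only the standard library, is shown here. The helper name `to_embed_url` is hypothetical.

```python
from urllib.parse import parse_qs, urlparse


def to_embed_url(url: str) -> str:
    """Build a YouTube embed URL from a watch or youtu.be link (hypothetical helper)."""
    parsed = urlparse(url)
    if parsed.netloc.endswith("youtu.be"):
        # Short links carry the video ID in the path.
        video_id = parsed.path.lstrip("/")
    else:
        # Watch links carry the video ID in the "v" query parameter.
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    if not video_id:
        raise ValueError(f"Could not find a video ID in {url!r}")
    return f"https://www.youtube.com/embed/{video_id}"
```

In the loop above, `youtube_video_embed_url = to_embed_url(YOUTUBE_VIDEO_URL)` would be a drop-in replacement for the `.replace(...)` call.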

In [50]:
import time
# Step 6: Function for Asynchronous Video Analysis
@retry(wait=wait_random_exponential(multiplier=1, max=240), stop=stop_after_attempt(2))
async def async_generate(prompt, yt_link):
    try:
        youtube_video_contents = [
            multiple_video_extraction_system_instruction_text,
            prompt,
            Part.from_uri(mime_type="video/webm", uri=yt_link)
        ]
        
        response = await gemini_pro_model.generate_content_async(
            youtube_video_contents,
            generation_config=generation_config,
            stream=False
        )

        return response.to_dict()
    except Exception as e:
        print("Something failed, retrying")
        print(e)
        raise  # Re-raise the exception for tenacity to handle

## Main running loop

When running this cell, you may see two types of errors:
1. Quota limit error. Solution: Wait for some time and rerun the cell.
2. Bad Request Error. Solution: Rerun the cell
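
One way to make quota errors less likely is to cap how many requests are in flight at once instead of launching them all together. The sketch below uses `asyncio.Semaphore` for that. The helper `run_with_limit` and its parameters are hypothetical, and whether a given limit or pause actually stays under your quota is an assumption you would need to verify for your project.

```python
import asyncio


async def run_with_limit(coroutine_factories, max_concurrent=1, pause_seconds=0.0):
    """Run zero-argument coroutine factories with at most `max_concurrent` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(factory):
        async with semaphore:
            result = await factory()
            # Optional pause before releasing the slot, to space out requests.
            await asyncio.sleep(pause_seconds)
            return result

    # gather() preserves the order of the input factories in its results.
    return await asyncio.gather(*(guarded(factory) for factory in coroutine_factories))


# Possible use in this notebook (names taken from the cells above):
# responses = await run_with_limit(
#     [lambda link=link: async_generate(video_analysis_prompt, link)
#      for link in youtube_video_urls],
#     max_concurrent=1,
#     pause_seconds=30,
# )
```

Factories are used rather than coroutine objects so that each request is only created once a slot is free.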

In [54]:

import asyncio


#time.sleep(30)
start_time = asyncio.get_event_loop().time()
get_responses = [
    async_generate(video_analysis_prompt, yt_link)  # One request per video; all are launched concurrently
    for yt_link in youtube_video_urls
]

multiple_video_extraction_responses = await asyncio.gather(*get_responses)
end_time = asyncio.get_event_loop().time()

elapsed_time = end_time - start_time
print(f"Elapsed time: {elapsed_time:.2f} seconds")



Something failed, retrying
429 Quota exceeded for aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model with base model: gemini-1.5-pro. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.
Something failed, retrying
429 Quota exceeded for aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model with base model: gemini-1.5-pro. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.
Elapsed time: 38.89 seconds


In [55]:
len(multiple_video_extraction_responses)

3

## Raw output of the generated responses

In [57]:
multiple_video_extraction_responses

[{'candidates': [{'content': {'role': 'model',
     'parts': [{'text': '[{"timestamp": "0:08", "element_type": "facial_expression", "description": "Man appears bored and uninterested while sitting in an art museum", "impact": "Negative", "emotional_engagement_score": 3, "memorability_score": 6, "clarity_score": 7, "creative_score": 5}, {"timestamp": "0:18", "element_type": "facial_expression", "description": "Woman looking at art skeptically, unsure of what to make of it.", "impact": "Neutral", "emotional_engagement_score": 4, "memorability_score": 5, "clarity_score": 6, "creative_score": 6}, {"timestamp": "0:19", "element_type": "color_scheme", "description": "Painting comes to life with vibrant colors, grabbing attention.", "impact": "Positive", "emotional_engagement_score": 7, "memorability_score": 8, "clarity_score": 6, "creative_score": 9}, {"timestamp": "0:27", "element_type": "facial_expression", "description": "Man remains bored despite lively events unfolding around him.", "im

## Extract and analyze video results

In [56]:
import pandas as pd
import json
from itables import show
import itables.options as itable_opts

# Step 8: Add Structured Outputs Back to Original Table, Show Full Extraction Results

flattened_results = []

# Iterate over each video response, including the video link or identifier
for idx, response in enumerate(multiple_video_extraction_responses):
    response_parts = response.get("candidates", [{}])[0].get("content", {}).get("parts", [{}])[0].get("text", "[]")
    analysis_items = json.loads(response_parts)  # Assuming response_parts is a JSON array of objects

    # Loop through each item in the analysis response and add the video identifier
    for item in analysis_items:
        flattened_results.append({
            "Video ID": f"Video {idx + 1}",  # Unique identifier for each video, can also use a video link or name
            "Timestamp": item.get("timestamp", ""),
            "Element Type": item.get("element_type", ""),
            "Description": item.get("description", ""),
            "Impact": item.get("impact", ""),
            "Emotional Engagement Score": item.get("emotional_engagement_score", ""),
            "Memorability Score": item.get("memorability_score", ""),
            "Clarity Score": item.get("clarity_score", ""),
            "Creative Score": item.get("creative_score", ""),
            "Overall Engagement Score": item.get("overall_engagement_score", "")
        })

# Convert the flattened results into a DataFrame
video_analysis_df = pd.DataFrame(flattened_results)

# Configure display options for the DataFrame
itable_opts.maxBytes = 1e9
itable_opts.maxColumns = 50
itable_opts.order = []
itable_opts.column_filters = "header"

# Show the DataFrame in a clean format
show(video_analysis_df)






In [60]:
!pip install gradio


In [61]:
video_analysis_df.to_csv("video_analysis_results.csv", index=False)
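
The prompt asks for a ranking of the ads, but the schema returns scores per element rather than per video. One possible way to derive a ranking is to average the per-element scores for each video. This is a sketch only: the unweighted mean across all four score types is an assumption, not something the notebook prescribes, and `rank_videos` is a hypothetical helper.

```python
from collections import defaultdict
from statistics import mean

# Column names as used in flattened_results above.
SCORE_KEYS = [
    "Emotional Engagement Score",
    "Memorability Score",
    "Clarity Score",
    "Creative Score",
]


def rank_videos(rows):
    """Return (video_id, mean_score) pairs, highest first, ignoring non-numeric scores."""
    scores_by_video = defaultdict(list)
    for row in rows:
        for key in SCORE_KEYS:
            value = row.get(key)
            # Missing scores default to "" in flattened_results, so skip non-numbers.
            if isinstance(value, (int, float)) and not isinstance(value, bool):
                scores_by_video[row["Video ID"]].append(value)
    ranking = [(video_id, mean(values)) for video_id, values in scores_by_video.items()]
    return sorted(ranking, key=lambda pair: pair[1], reverse=True)
```

Calling `rank_videos(flattened_results)` would give one averaged score per video; videos with no numeric scores at all are left out of the result.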


In [62]:
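def display_analysis(video_url):
    # Return the precomputed analysis for one video as an HTML table.
    # The DataFrame has no "Video URL" column: rows are labelled "Video 1", "Video 2", ...
    # in the order of youtube_video_urls, so map the URL back to that label first.
    video_url = video_url.strip()
    if video_url not in youtube_video_urls:
        return "No precomputed analysis is available for this video."

    video_id = f"Video {youtube_video_urls.index(video_url) + 1}"
    filtered_df = video_analysis_df[video_analysis_df["Video ID"] == video_id]

    if filtered_df.empty:
        return "No precomputed analysis is available for this video."

    return filtered_df.to_html()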

In [63]:
import gradio as gr
import pandas as pd
import json

interface = gr.Interface(
    fn=display_analysis,
    inputs="text",  # User inputs a video URL
    outputs="html",  # The output is the HTML representation of the DataFrame
    title="Marketing Video Analysis (Precomputed)",
    description="Enter the URL of a marketing video to view the precomputed engagement analysis."
)

In [65]:
interface.launch(share=True)

Rerunning server... use `close()` to stop if you need to change `launch()` parameters.
----
* Running on public URL: https://aa13e4756f596cfdd9.gradio.live

This share link expires in 72 hours. For free permanent hosting and GPU upgrades, run `gradio deploy` from the terminal in the working directory to deploy to Hugging Face Spaces (https://huggingface.co/spaces)


