# Video Understanding with Amazon Nova Omni

This notebook demonstrates the video analysis capabilities of the Amazon Nova Omni model, showcasing five distinct capabilities: video summarization, temporal event detection, structured action timeline generation, content moderation, and logo recognition.

## What You'll Learn

In this notebook, you will explore how Amazon Nova Omni analyzes video content across multiple dimensions:

1. **Video Summarization** - Generate concise summaries of video content
2. **Structured Action Timeline** - Create JSON-formatted chronological action lists
3. **Content Moderation** - Detect explicit visual and toxic audio content
4. **Logo Recognition** - Identify brand logos and trademarks in video frames

You'll also learn how to:
- Process video content using S3 URI references
- Track API costs for each model invocation
- Generate structured outputs for programmatic processing
- Handle errors gracefully during video analysis

## Environment Setup

Let's begin by setting up our environment with the necessary dependencies and configurations.

### Import Required Dependencies

First, let's import all necessary Python packages for handling video and AWS services.

In [52]:
import boto3
import json
import os
import sagemaker
from botocore.config import Config
from IPython.display import Video
from botocore.exceptions import ClientError, NoCredentialsError, PartialCredentialsError

### Initialize AWS Service Clients

Next, we'll initialize boto3 clients for S3 and Bedrock services, and configure our AWS environment.

In [None]:
# Configure AWS environment
REGION_ID = "us-west-2"
MODEL_ID = "us.amazon.nova-2-omni-v1:0"
BEDROCK_ENDPOINT_URL = "https://bedrock-runtime.us-west-2.amazonaws.com"

READ_TIMEOUT_SEC = 3 * 60
MAX_RETRIES = 1

# Additional imports used later in the notebook
# (boto3, json, Config, and ClientError are already imported above)
import base64
import timeit
from IPython.display import Image, display

In [54]:
# Create a SageMaker session
sess = sagemaker.Session()

def get_bedrock_runtime():
    """Returns a Bedrock Runtime client."""
    # Create Bedrock client
    config = Config(
        read_timeout=READ_TIMEOUT_SEC,
        retries={"max_attempts": MAX_RETRIES},
    )
    
    bedrock = boto3.client(
        service_name="bedrock-runtime",
        region_name=REGION_ID,
        config=config,
    )
    return bedrock

try:
    boto3.setup_default_session(region_name=REGION_ID)
    
    # Initialize AWS service clients
    sts_client = boto3.client("sts")
    account_id = sts_client.get_caller_identity().get("Account")
    s3_client = boto3.client("s3")
    bedrock_client = get_bedrock_runtime()
    
    print(f"✓ AWS clients initialized successfully")
    print(f"✓ Region: {REGION_ID}")
    print(f"✓ Account ID: {account_id}")
    
except Exception as e:
    print(f"✗ Unexpected error during AWS initialization: {str(e)}")
    raise

✓ AWS clients initialized successfully
✓ Region: us-west-2
✓ Account ID: 149536455198


### Configure S3 Bucket and Model Settings

Define the S3 bucket for video storage and the Nova Omni model identifier.

In [None]:
# Define the model ID for Amazon Nova Omni
MODEL_ID = "us.amazon.nova-2-omni-v1:0"

# Get the default S3 bucket name and subfolder for video storage
BUCKET_NAME = sess.default_bucket()
VIDEO_SUBFOLDER = "video_understanding/"

print(f"✓ Model ID: {MODEL_ID}")
print(f"✓ S3 Bucket: {BUCKET_NAME}")
print(f"✓ Video Subfolder: {VIDEO_SUBFOLDER}")

# Validate S3 bucket access
try:
    s3_client.head_bucket(Bucket=BUCKET_NAME)
    print(f"✓ S3 bucket access validated successfully")
except ClientError as e:
    error_code = e.response['Error']['Code']
    if error_code == '404':
        print(f"✗ Error: S3 bucket '{BUCKET_NAME}' does not exist.")
        print("  Please create the bucket or update BUCKET_NAME to an existing bucket.")
    elif error_code == '403':
        print(f"✗ Error: Access denied to S3 bucket '{BUCKET_NAME}'.")
        print("  Please check your IAM permissions for S3 access.")
    else:
        print(f"✗ AWS Error [{error_code}]: {e.response['Error']['Message']}")
    raise
except Exception as e:
    print(f"✗ Unexpected error validating S3 bucket access: {str(e)}")
    raise

✓ Model ID: us.amazon.nova-2-omni-v1:0
✓ S3 Bucket: sagemaker-us-west-2-149536455198
✓ Video Subfolder: video_understanding/
✓ S3 bucket access validated successfully


### Define Helper Functions

Let's create reusable helper functions for video upload, model invocation, result display, and JSON parsing.

In [56]:
def upload_video_to_s3(local_path, bucket_name, s3_key, s3_client):
    """
    Upload a video file to S3 and return its URI.
    
    Args:
        local_path (str): Local file path to video
        bucket_name (str): Target S3 bucket
        s3_key (str): S3 object key (path within bucket)
        s3_client: Initialized S3 client
        
    Returns:
        str: S3 URI in format s3://bucket/key
        
    Raises:
        FileNotFoundError: If the local video file doesn't exist
        ClientError: If S3 upload fails due to AWS service errors
        Exception: For other unexpected errors
    """
    try:
        # Validate input parameters
        if not local_path:
            raise ValueError("local_path cannot be empty")
        if not bucket_name:
            raise ValueError("bucket_name cannot be empty")
        if not s3_key:
            raise ValueError("s3_key cannot be empty")
        
        # Check if file exists
        if not os.path.exists(local_path):
            raise FileNotFoundError(
                f"Video file not found: {local_path}\n"
                f"  Please verify the file path and ensure the video file exists."
            )
        
        # Check if file is readable
        if not os.access(local_path, os.R_OK):
            raise PermissionError(
                f"Cannot read video file: {local_path}\n"
                f"  Please check file permissions."
            )
        
        # Get file size for validation
        file_size = os.path.getsize(local_path)
        if file_size == 0:
            raise ValueError(f"Video file is empty: {local_path}")
        
        print(f"  Uploading {os.path.basename(local_path)} ({file_size:,} bytes)...")
        
        # Upload file to S3
        s3_client.upload_file(local_path, bucket_name, s3_key)
        
        # Verify upload by checking if object exists
        try:
            s3_client.head_object(Bucket=bucket_name, Key=s3_key)
        except ClientError:
            raise RuntimeError(
                f"Upload appeared to succeed but object not found in S3: {s3_key}\n"
                f"  This may indicate a transient error. Please try again."
            )
        
        # Generate S3 URI
        s3_uri = f"s3://{bucket_name}/{s3_key}"
        
        print(f"✓ Successfully uploaded {os.path.basename(local_path)} to S3")
        print(f"  URI: {s3_uri}")
        
        return s3_uri
        
    except FileNotFoundError as e:
        print(f"✗ File Error: {e}")
        raise
    except PermissionError as e:
        print(f"✗ Permission Error: {e}")
        raise
    except ValueError as e:
        print(f"✗ Validation Error: {e}")
        raise
    except ClientError as e:
        error_code = e.response['Error']['Code']
        error_message = e.response['Error']['Message']
        print(f"✗ AWS S3 Error [{error_code}]: {error_message}")
        if error_code == 'NoSuchBucket':
            print(f"  The bucket '{bucket_name}' does not exist.")
        elif error_code == 'AccessDenied':
            print(f"  Access denied. Please check your IAM permissions for s3:PutObject.")
        raise
    except Exception as e:
        print(f"✗ Unexpected error during S3 upload: {str(e)}")
        print(f"  File: {local_path}")
        print(f"  Bucket: {bucket_name}")
        print(f"  Key: {s3_key}")
        raise
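
The upload helper returns URIs of the form `s3://bucket/key`. As a minimal sketch of that convention, the two functions below build and split such URIs using plain string operations. The names `build_s3_uri` and `split_s3_uri` and the bucket `example-bucket` are illustrative assumptions, not part of the notebook or of boto3.

```python
def build_s3_uri(bucket_name, s3_key):
    """Join a bucket name and an object key into an s3:// URI."""
    return f"s3://{bucket_name}/{s3_key.lstrip('/')}"


def split_s3_uri(s3_uri):
    """Split an s3:// URI into (bucket, key); raise ValueError if malformed."""
    prefix = "s3://"
    if not s3_uri.startswith(prefix):
        raise ValueError(f"Not an S3 URI: {s3_uri}")
    # Everything up to the first "/" is the bucket; the rest is the key
    bucket, _, key = s3_uri[len(prefix):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI must contain both a bucket and a key: {s3_uri}")
    return bucket, key


# Hypothetical bucket name, for illustration only
uri = build_s3_uri("example-bucket", "video_understanding/the-sea.mp4")
print(uri)
print(split_s3_uri(uri))
```

String partitioning is used instead of a URL parser because S3 keys may contain characters such as `?` or `#` that a URL parser would treat as query or fragment delimiters.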

In [57]:
def invoke_nova_video(video_uri, prompt, temperature=0.3, bedrock_client=None):
    """
    Invoke Nova Omni model with video content.
    
    Args:
        video_uri (str): S3 URI of video file
        prompt (str): User prompt for analysis
        temperature (float): Model temperature parameter
        bedrock_client: Initialized Bedrock client

    Returns:
        tuple: (full_response, content_text)

    Raises:
        ValueError: If required parameters are invalid
        ClientError: If Bedrock API call fails
        KeyError: If response format is unexpected
        Exception: For other unexpected errors
    """
    try:
        # Validate input parameters
        if not video_uri:
            raise ValueError("video_uri cannot be empty")
        if not prompt:
            raise ValueError("prompt cannot be empty")
        if not video_uri.startswith('s3://'):
            raise ValueError(f"Invalid S3 URI format: {video_uri}. Must start with 's3://'")
        if bedrock_client is None:
            raise ValueError("bedrock_client cannot be None")
        if not isinstance(temperature, (int, float)) or temperature < 0 or temperature > 1:
            raise ValueError(f"temperature must be between 0 and 1, got: {temperature}")
        
        # Format user message with video and prompt
        request = {
            "modelId": MODEL_ID,
            "messages": [
                {
                    "role": "user",
                    "content": [
                        {
                            "video": {
                                "format": "mp4",
                                "source": {
                                    "s3Location": {
                                        "uri": video_uri
                                    }
                                },
                            }
                        },
                        {"text": prompt},
                    ],
                }
            ],
            "inferenceConfig": {"temperature": temperature, "topP": 0.95, "maxTokens": 10000},
        }
        
        # Invoke the model
        response = bedrock_client.converse(**request)
        
        # Validate response structure
        if "output" not in response:
            raise KeyError("Response missing 'output' field")
        if "message" not in response["output"]:
            raise KeyError("Response missing 'output.message' field")
        if "content" not in response["output"]["message"]:
            raise KeyError("Response missing 'output.message.content' field")
        
        response_content_list = response["output"]["message"]["content"]
        
        if not response_content_list:
            raise ValueError("Response content is empty")
        
        # Extract text content block
        text_content = next(
            (item for item in response_content_list if "text" in item),
            None,
        )

        content_text = ""
        if text_content:
            print("== Text Output ==")
            content_text = text_content["text"]
            print(content_text)
        
        return response, content_text
        
    except ValueError as e:
        print(f"✗ Validation Error: {e}")
        raise
    except KeyError as e:
        print(f"✗ Response Format Error: {e}")
        print("  The model response structure was unexpected.")
        print(f"  Response keys: {list(response.keys()) if 'response' in locals() else 'N/A'}")
        raise
    except ClientError as e:
        error_code = e.response['Error']['Code']
        error_message = e.response['Error']['Message']
        print(f"✗ AWS Bedrock Error [{error_code}]: {error_message}")
        if error_code == 'AccessDeniedException':
            print("  Please check your IAM permissions for bedrock:InvokeModel.")
        elif error_code == 'ResourceNotFoundException':
            print(f"  The model '{MODEL_ID}' may not be available in region '{REGION_ID}'.")
            print("  Please verify model availability and enable it in the Bedrock console.")
        elif error_code == 'ValidationException':
            print("  The request parameters may be invalid.")
            print(f"  Video URI: {video_uri}")
        elif error_code == 'ThrottlingException':
            print("  Request rate exceeded. Please wait and try again.")
        raise
    except json.JSONDecodeError as e:
        print(f"✗ JSON Parsing Error: {e}")
        print("  Failed to parse model response as JSON.")
        raise
    except Exception as e:
        print(f"✗ Unexpected error during model invocation: {str(e)}")
        print(f"  Video URI: {video_uri}")
        print(f"  Model ID: {MODEL_ID}")
        raise
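
When a `ThrottlingException` occurs, the helper above only advises waiting and trying again. One way to automate that advice is exponential backoff with jitter, sketched below under stated assumptions: `call_with_backoff` and `is_throttling_error` are hypothetical names, and the error check simply reads the same `response['Error']['Code']` field that the handler above inspects.

```python
import random
import time


def is_throttling_error(exc):
    """True for botocore-style errors whose error code is ThrottlingException."""
    response = getattr(exc, "response", None) or {}
    return response.get("Error", {}).get("Code") == "ThrottlingException"


def call_with_backoff(fn, is_retryable, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise non-retryable errors, or any error on the final attempt
            if not is_retryable(exc) or attempt == max_attempts - 1:
                raise
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

In this notebook the call could be wrapped as `call_with_backoff(lambda: invoke_nova_video(video_uri, prompt, bedrock_client=bedrock_client), is_throttling_error)`. The `sleep` parameter is injectable so the retry logic can be tested without actually waiting.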


In [58]:
def display_response(content_text, show_full_response=False, full_response=None):
    """
    Display formatted model response.
    
    Args:
        content_text (str): Extracted text content from response
        show_full_response (bool): Whether to show full JSON response
        full_response (dict): Full model response dictionary
    """
    print("\n" + "="*80)
    print("RESPONSE CONTENT")
    print("="*80)
    print(content_text)
    
    if show_full_response and full_response:
        print("\n" + "="*80)
        print("FULL JSON RESPONSE")
        print("="*80)
        print(json.dumps(full_response, indent=2))
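
The full response printed by `display_response` also carries a `usage` block with `inputTokens` and `outputTokens` counts, which is the natural starting point for the per-invocation cost tracking mentioned in the introduction. The sketch below is one possible approach, not the notebook's own cost helper: `estimate_cost` is a hypothetical function, and the prices are placeholders that must be replaced with the current rates for your model and region.

```python
def estimate_cost(usage, input_price_per_1k, output_price_per_1k):
    """Estimate the cost of one call from a Converse-style usage block."""
    input_tokens = usage.get("inputTokens", 0)
    output_tokens = usage.get("outputTokens", 0)
    input_cost = input_tokens / 1000 * input_price_per_1k
    output_cost = output_tokens / 1000 * output_price_per_1k
    return {
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
        "total_cost": input_cost + output_cost,
    }


# Hypothetical token counts and placeholder prices, for illustration only
sample_usage = {"inputTokens": 12000, "outputTokens": 500, "totalTokens": 12500}
print(estimate_cost(sample_usage, input_price_per_1k=0.001, output_price_per_1k=0.004))
```

With a real response, the call would look like `estimate_cost(full_response["usage"], ...)`.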

In [59]:
def parse_json_response(content_text):
    """
    Parse JSON from model response, handling code blocks.
    
    Args:
        content_text (str): Text content from model response
        
    Returns:
        dict: Parsed JSON data
        
    Raises:
        ValueError: If content_text is empty or None
        json.JSONDecodeError: If JSON parsing fails
    """
    try:
        # Validate input
        if not content_text:
            raise ValueError("content_text cannot be empty or None")
        
        # Remove code block markers if present
        json_text = content_text.strip()
        
        # Handle various markdown code block formats
        if json_text.startswith("```json"):
            json_text = json_text.replace("```json", "", 1)
        elif json_text.startswith("```"):
            json_text = json_text.replace("```", "", 1)
        
        if json_text.endswith("```"):
            json_text = json_text.rsplit("```", 1)[0]
        
        json_text = json_text.strip()
        
        # Validate that we have JSON-like content
        if not json_text:
            raise ValueError("No content remaining after removing code block markers")
        
        if not (json_text.startswith('{') or json_text.startswith('[')):
            raise ValueError(
                f"Content does not appear to be JSON. "
                f"Expected to start with '{{' or '[', but starts with: {json_text[:50]}..."
            )
        
        # Parse JSON
        data = json.loads(json_text)
        
        # Validate that we got a dict or list
        if not isinstance(data, (dict, list)):
            raise ValueError(f"Parsed JSON is not a dict or list, got: {type(data)}")
        
        return data
        
    except json.JSONDecodeError as e:
        # Must come before ValueError: JSONDecodeError is a ValueError subclass
        print(f"✗ JSON Parsing Error: {e}")
        print(f"  Line {e.lineno}, Column {e.colno}: {e.msg}")
        print(f"\nProblematic JSON text (first 500 chars):\n{json_text[:500]}...")
        print("\nThe model may have returned text instead of pure JSON.")
        print("Consider adjusting the prompt to request JSON-only output.")
        raise
    except ValueError as e:
        print(f"✗ Validation Error: {e}")
        # content_text may be None here, so fall back to an empty string
        print(f"\nRaw response (first 500 chars):\n{(content_text or '')[:500]}...")
        raise
    except Exception as e:
        print(f"✗ Unexpected error parsing JSON: {str(e)}")
        print(f"\nRaw response (first 500 chars):\n{(content_text or '')[:500]}...")
        raise
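
`parse_json_response` expects the JSON to be the whole response, optionally wrapped in a code fence. If the model adds prose before or after the JSON, a more tolerant alternative is to scan for the first position where a JSON value decodes cleanly. The sketch below uses the standard library's `json.JSONDecoder.raw_decode` for this. `extract_first_json` is a hypothetical helper, not part of the notebook.

```python
import json


def extract_first_json(text):
    """Return the first JSON object or array embedded anywhere in text."""
    decoder = json.JSONDecoder()
    for index, char in enumerate(text):
        if char in "{[":
            try:
                # raw_decode parses one JSON value starting at index and ignores what follows
                value, _ = decoder.raw_decode(text, index)
                return value
            except json.JSONDecodeError:
                continue  # Not valid JSON from this position; keep scanning
    raise ValueError("No JSON object or array found in text")


fence = "`" * 3  # Built programmatically to keep this example readable
reply = f'Here is the timeline:\n{fence}json\n{{"recipe_name": "demo", "actions": []}}\n{fence}\nDone.'
print(extract_first_json(reply))
```

This trades strictness for tolerance: it returns the first decodable value even when the response contains other text, so it suits exploratory use better than validation.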

## Video Upload to S3

Now let's upload our test videos to S3 so that Nova Omni can access them via S3 URIs.

### About Our Sample Videos

For this demonstration, we'll use four video files:

1. **Meridian_Clip.mp4** - A clip from "Meridian," a short film from Netflix Open Content featuring a vintage car on a mountain road
2. **the-sea.mp4** - A video depicting sea scenery
3. **Andy-Jassy-on-Amazon-Nova.mp4** - Amazon CEO Andy Jassy talks about Amazon Nova models
4. **Cheesecake.mp4** - A video showing how to make a cheesecake

These videos will be uploaded to the S3 bucket in the `video_understanding/` subfolder.
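
All four samples are MP4 files, which matches the `"format": "mp4"` value hard-coded in `invoke_nova_video`. If you add videos in other containers, the format field would need to change with them. The sketch below infers it from the file extension. `infer_video_format` is a hypothetical helper, and the mapping is an assumption about the format names the Converse API accepts, so verify it against the Bedrock API reference before relying on it.

```python
import os

# Assumed mapping from file extension to the Converse API video "format" value.
# Check these names against the Bedrock API reference before use.
EXTENSION_TO_FORMAT = {
    ".mp4": "mp4",
    ".mov": "mov",
    ".mkv": "mkv",
    ".webm": "webm",
}


def infer_video_format(path_or_uri):
    """Map a file path or S3 URI to a video format string based on its extension."""
    extension = os.path.splitext(path_or_uri)[1].lower()
    if extension not in EXTENSION_TO_FORMAT:
        raise ValueError(f"Unsupported or unknown video extension: {extension!r}")
    return EXTENSION_TO_FORMAT[extension]


print(infer_video_format("media/Cheesecake.mp4"))
```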

In [60]:
import glob

# Discover all MP4 files in the media directory
print("Discovering videos in media directory...\n")
video_files = sorted(glob.glob("media/*.mp4"))

print(f"Found {len(video_files)} video file(s):")
for video_file in video_files:
    filename = os.path.basename(video_file)
    file_size = os.path.getsize(video_file)
    print(f"  - {filename} ({file_size:,} bytes)")

print("\n" + "="*80)
print("Uploading all videos to S3...")
print("="*80 + "\n")

# Dictionary to store video URIs (filename -> S3 URI)
video_uris = {}
upload_errors = []

# Upload each video to S3
for video_path in video_files:
    filename = os.path.basename(video_path)
    s3_key = f"{VIDEO_SUBFOLDER}{filename}"
    
    try:
        # Upload video and store URI
        uri = upload_video_to_s3(
            local_path=video_path,
            bucket_name=BUCKET_NAME,
            s3_key=s3_key,
            s3_client=s3_client
        )
        video_uris[filename] = uri
        
    except FileNotFoundError as e:
        error_msg = f"File not found: {filename}"
        upload_errors.append(error_msg)
        video_uris[filename] = None
        print(f"✗ {error_msg}")
        print("⚠ Continuing with remaining uploads...\n")
        
    except ClientError as e:
        error_code = e.response['Error']['Code']
        error_msg = f"AWS error uploading {filename}: [{error_code}] {e.response['Error']['Message']}"
        upload_errors.append(error_msg)
        video_uris[filename] = None
        print(f"✗ {error_msg}")
        print("⚠ Continuing with remaining uploads...\n")
        
    except Exception as e:
        error_msg = f"Unexpected error uploading {filename}: {str(e)}"
        upload_errors.append(error_msg)
        video_uris[filename] = None
        print(f"✗ {error_msg}")
        print("⚠ Continuing with remaining uploads...\n")

# Report upload results
print("\n" + "="*80)
print("UPLOAD SUMMARY")
print("="*80)

successful_uploads = [f for f, uri in video_uris.items() if uri is not None]
failed_uploads = [f for f, uri in video_uris.items() if uri is None]

print(f"\n✓ Successfully uploaded: {len(successful_uploads)} video(s)")
for filename in successful_uploads:
    print(f"  - {filename}")
    print(f"    URI: {video_uris[filename]}")

if failed_uploads:
    print(f"\n✗ Failed uploads: {len(failed_uploads)} video(s)")
    for filename in failed_uploads:
        print(f"  - {filename}")

print("\n" + "="*80)

# Check if we have at least one video to continue
if not successful_uploads:
    raise RuntimeError(
        "Failed to upload any videos. Cannot proceed with video analysis.\n"
        "Please check the error messages above and resolve the issues."
    )
else:
    print(f"\n✓ Ready to proceed with {len(successful_uploads)} video(s)")
    print("="*80)


Discovering videos in media directory...

Found 4 video file(s):
  - Andy-Jassy-on-Amazon-Nova.mp4 (43,646,472 bytes)
  - Cheesecake.mp4 (11,080,848 bytes)
  - Meridian_Clip.mp4 (5,504,861 bytes)
  - the-sea.mp4 (4,632,458 bytes)

Uploading all videos to S3...

  Uploading Andy-Jassy-on-Amazon-Nova.mp4 (43,646,472 bytes)...
✓ Successfully uploaded Andy-Jassy-on-Amazon-Nova.mp4 to S3
  URI: s3://sagemaker-us-west-2-149536455198/video_understanding/Andy-Jassy-on-Amazon-Nova.mp4
  Uploading Cheesecake.mp4 (11,080,848 bytes)...
✓ Successfully uploaded Cheesecake.mp4 to S3
  URI: s3://sagemaker-us-west-2-149536455198/video_understanding/Cheesecake.mp4
  Uploading Meridian_Clip.mp4 (5,504,861 bytes)...
✓ Successfully uploaded Meridian_Clip.mp4 to S3
  URI: s3://sagemaker-us-west-2-149536455198/video_understanding/Meridian_Clip.mp4
  Uploading the-sea.mp4 (4,632,458 bytes)...
✓ Successfully uploaded the-sea.mp4 to S3
  URI: s3://sagemaker-us-west-2-149536455198/video_understanding/the-sea.mp4

## Capability 1: Video Summarization

Let's demonstrate Nova Omni's ability to generate concise summaries of video content, identifying key moments and events.

### Custom Prompt Strategy for Different Video Types

To get the most out of Nova Omni's video understanding capabilities, we use **content-aware prompting**: tailoring each analysis prompt to the characteristics of the video type, so that the model focuses on the aspects that matter for that content.

#### Why Custom Prompts Matter

Different video types have fundamentally different characteristics and purposes:
- **Cinematic content** emphasizes visual storytelling, mood, and cinematography
- **Nature footage** focuses on environmental elements, aesthetics, and natural beauty
- **Executive presentations** highlight key messages, product announcements, and business value

By customizing our prompts, we can:
1. **Extract domain-specific insights** that generic prompts might miss
2. **Demonstrate versatility** of Nova Omni across diverse content types
3. **Provide more actionable results** tailored to each video's purpose
4. **Showcase the model's adaptability** to different analysis contexts

#### Our Custom Prompt Approach

For each video, we'll create prompts that:
- **Meridian_Clip.mp4** (Cinematic): Focus on visual storytelling, dramatic moments, cinematographic techniques, and narrative arc
- **the-sea.mp4** (Nature): Emphasize natural elements, environmental transitions, aesthetic qualities, and visual composition
- **Andy-Jassy-on-Amazon-Nova.mp4** (Executive Presentation): Extract key announcements, strategic messages, business value propositions, and presenter communication style
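
The per-video focus areas above can also be expressed as data, so that every prompt shares one template. The following is a minimal sketch of that idea rather than the notebook's actual prompts. `FOCUS_POINTS` and `build_summary_prompt` are hypothetical names, and the focus points paraphrase the list above.

```python
# Focus points per content type, paraphrased from the list above
FOCUS_POINTS = {
    "cinematic": [
        "Visual storytelling and narrative arc",
        "Dramatic moments",
        "Cinematographic techniques",
    ],
    "nature": [
        "Natural elements and environmental transitions",
        "Aesthetic qualities",
        "Visual composition",
    ],
    "executive presentation": [
        "Key announcements and strategic messages",
        "Business value propositions",
        "Presenter communication style",
    ],
}


def build_summary_prompt(content_type, focus_points, max_points=5):
    """Assemble a summarization prompt from a content type and its focus points."""
    bullets = "\n".join(f"- {point}" for point in focus_points)
    return (
        f"Analyze this {content_type} video.\n\n"
        f"Please provide a summary that captures:\n{bullets}\n\n"
        f"Limit your summary to {max_points} main points in bullet format."
    )


print(build_summary_prompt("nature", FOCUS_POINTS["nature"]))
```

Keeping the focus points in a dictionary makes it easy to add a new content type without writing another prompt by hand.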

Let's see this strategy in action!

In [61]:
# Custom prompt for cinematic content - emphasizes visual storytelling
prompt_meridian = """Analyze this cinematic video with a focus on visual storytelling and cinematography.

Please provide a summary that captures:
- The narrative arc and dramatic progression
- Key visual moments and their emotional impact
- Cinematographic techniques (camera work, lighting, composition)
- Mood and atmosphere created through visuals
- Character actions and their significance to the story

Limit your summary to 5 main points in bullet format, emphasizing the cinematic qualities."""

prompt = prompt_meridian

print("Analyzing Video 1: Meridian_Clip.mp4\n")

# Get video URI from the video_uris dictionary - use just the filename!
video1_filename = "Meridian_Clip.mp4"
video1_uri = video_uris.get(video1_filename)

# Check if video1 was uploaded successfully
if video1_uri is None:
    print("⚠ Skipping Video 1 analysis - video was not uploaded successfully.\n")
    response1, content1 = None, None
else:
    try:
        # Invoke model for first video - pass the S3 URI directly!
        response1, content1 = invoke_nova_video(
            video_uri=video1_uri,
            prompt=prompt,
            bedrock_client=bedrock_client
        )

        # Display response
        display_response(content1)
        
    except Exception as e:
        print(f"✗ Error analyzing Video 1: {str(e)}")
        print("⚠ Continuing with remaining analyses...\n")
        response1, content1 = None, None

Analyzing Video 1: Meridian_Clip.mp4\n
== Text Output ==
- **Narrative Arc & Progression**: The video follows a man in a suit driving a vintage car through a desolate, mountainous landscape, transitioning from a sunny day to a rainy night, suggesting a journey marked by emotional or psychological turmoil.  
- **Key Visual Moments**: The opening aerial shot of the car on a winding road sets a solitary tone, while the man’s focused expression and the rearview mirror revealing a woman’s face in the rain heighten tension and mystery.  
- **Cinematographic Techniques**: The use of wide aerial shots contrasts with tight interior framing, emphasizing isolation. Dim lighting and rain-streaked windows create a moody, introspective atmosphere.  
- **Mood & Atmosphere**: The barren landscape and shifting weather—from clear to stormy—mirror the protagonist’s inner state, evoking loneliness and impending conflict.  
- **Character Actions**: The man’s steady grip on the steering wheel and occasional

In [62]:
print("Analyzing Video 2: the-sea.mp4")

# Get video URI from the video_uris dictionary
video2_filename = "the-sea.mp4"
video2_uri = video_uris.get(video2_filename)

# Custom prompt for nature content - emphasizes natural elements and aesthetics
prompt_sea = """Analyze this nature video with a focus on environmental elements and visual aesthetics.

Please provide a summary that captures:
- Natural elements and scenery depicted (ocean, coastline, wildlife, etc.)
- Visual transitions and environmental changes throughout the video
- Aesthetic qualities and compositional beauty
- Color palette and lighting in the natural setting
- Overall mood and atmosphere of the natural environment. 

Limit your summary to 5 main points in bullet format, emphasizing the natural beauty and environmental aspects."""
prompt = prompt_sea

# Check if video2 was uploaded successfully
if video2_uri is None:
    print("⚠ Skipping Video 2 analysis - video was not uploaded successfully.\n")
    response2, content2 = None, None
else:
    try:
        # Invoke model for second video
        response2, content2 = invoke_nova_video(
            video_uri=video2_uri,
            prompt=prompt,
            bedrock_client=bedrock_client
        )
        
        # Display response
        display_response(content2)
        
    except Exception as e:
        print(f"✗ Error analyzing Video 2: {str(e)}")
        print("⚠ Continuing with remaining analyses...\n")
        response2, content2 = None, None

Analyzing Video 2: the-sea.mp4
== Text Output ==
- The video showcases a rugged coastline with rocky cliffs and the ocean, highlighting the interplay between land and sea.  
- It transitions from a wide aerial view of the rocky shoreline to a close-up of a seashell on a sunlit beach at sunset.  
- The composition emphasizes natural beauty through the contrast of dark, textured rocks against the deep blue-green ocean and the warm golden hues of the sunset.  
- The color palette features deep ocean blues, earthy tones in the cliffs, and warm golden light reflecting off the sand and seashell.  
- The overall mood is serene and tranquil, evoking a sense of peace and connection with the natural environment.

RESPONSE CONTENT
- The video showcases a rugged coastline with rocky cliffs and the ocean, highlighting the interplay between land and sea.  
- It transitions from a wide aerial view of the rocky shoreline to a close-up of a seashell on a sunlit beach at sunset.  
- The composition emph

In [63]:
print("Analyzing Video 3: Andy-Jassy-on-Amazon-Nova.mp4\n")

# Custom prompt for executive presentation - emphasizes key announcements and business value
prompt_andy = """ Analyze this executive presentation video with a focus on key messages and strategic announcements.

Please provide a summary that captures:
- Main product announcements or strategic initiatives discussed
- Key business value propositions and benefits highlighted
- Presenter's communication style and emphasis points
- Visual aids, demonstrations, or examples shown
- Target audience and intended business impact

Limit your summary to 5 main points in bullet format, emphasizing the executive messaging and business value.
"""
prompt = prompt_andy

# Get video URI from the video_uris dictionary
video3_filename = "Andy-Jassy-on-Amazon-Nova.mp4"
video3_uri = video_uris.get(video3_filename)

# Check if video3 was uploaded successfully
if video3_uri is None:
    print("⚠ Skipping Video 3 analysis - video was not uploaded successfully.\n")
    response3, content3 = None, None
else:
    try:
        # Invoke model for third video
        response3, content3 = invoke_nova_video(
            video_uri=video3_uri,
            prompt=prompt,
            bedrock_client=bedrock_client
        )
        
        # Display response
        display_response(content3)
        
    except Exception as e:
        print(f"✗ Error analyzing Video 3: {str(e)}")
        print("⚠ Continuing with remaining analyses...\n")
        response3, content3 = None, None

Analyzing Video 3: Andy-Jassy-on-Amazon-Nova.mp4

== Text Output ==
### Summary of Executive Presentation at AWS re:Invent  

#### **1. Main Product Announcements & Strategic Initiatives**  
- **New AI/ML Services**: Introduced advanced machine learning tools for real-time data analysis and predictive modeling, enabling businesses to derive actionable insights faster.  
- **Enhanced Cloud Infrastructure**: Launched scalable compute and storage solutions optimized for high-performance workloads, supporting enterprise-grade applications.  
- **Security Enhancements**: Expanded security offerings, including automated threat detection and compliance automation, to protect data across hybrid environments.  

#### **2. Key Business Value Propositions & Benefits**  
- **Cost Efficiency**: Demonstrated how AWS services reduce operational costs by up to 40% through automated resource management and pay-as-you-go pricing.  
- **Agility & Scalability**: Highlighted the ability to rapidly deploy a

## Capability 2: Structured Action Timeline

Beyond simple event detection, Nova Omni can generate structured, machine-readable timelines of all actions occurring in a video. This capability is particularly valuable for programmatic processing, enabling automated video indexing, content analysis pipelines, and integration with other systems.

### Use Cases for Structured Action Timelines:

- **Video Indexing**: Create searchable databases of video content
- **Content Management Systems**: Automatically tag and categorize video segments
- **Automated Editing**: Identify and extract specific action sequences
- **Training Data Generation**: Create labeled datasets for machine learning
- **Compliance & Auditing**: Document all activities in surveillance or training videos
- **Sports Analytics**: Track player movements and game events systematically
- **Accessibility**: Generate detailed video descriptions for visually impaired users

The structured JSON format makes it easy to:
- Parse and process results programmatically
- Store action data in databases
- Generate reports and visualizations
- Integrate with downstream applications
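
As a small illustration of that kind of downstream processing, the sketch below converts timeline timestamps to seconds and sorts the actions by start time. The helper names are hypothetical. The sample data follows the JSON schema requested from the model in this section, and the parser accepts both a single `MM:SS` value and a `MM:SS-MM:SS` range, on the assumption that the model may return either.

```python
import json


def timestamp_to_seconds(timestamp):
    """Convert "MM:SS" to seconds; for a range "MM:SS-MM:SS", use the start time."""
    start = timestamp.split("-", 1)[0].strip()
    minutes, seconds = start.split(":")
    return int(minutes) * 60 + int(seconds)


def index_actions(timeline):
    """Return (start_seconds, step_number, action) tuples sorted by start time."""
    rows = [
        (timestamp_to_seconds(item["timestamp"]), item["step_number"], item["action"])
        for item in timeline.get("actions", [])
    ]
    return sorted(rows)


# Illustrative timeline in the schema used by this notebook
sample = json.loads("""
{
  "recipe_name": "5-Minute Cheesecake",
  "actions": [
    {"step_number": 2, "action": "Crush graham crackers and mix with butter", "timestamp": "00:06-00:14"},
    {"step_number": 1, "action": "Melt butter in a small bowl", "timestamp": "00:00-00:04"}
  ]
}
""")

for start, step, action in index_actions(sample):
    print(f"{start:>4}s  step {step}: {action}")
```

Rows in this form can be written to a database table or a CSV file to make the video searchable by time offset.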

### Generating Action Timeline for Cheesecake.mp4

Let's analyze the Cheesecake clip and generate a chronological timeline of all actions and activities.

In [64]:
print("Analyzing Video 4: Cheesecake.mp4\n")

# Get video URI from the video_uris dictionary - use just the filename!
video4_filename = "Cheesecake.mp4"
video4_uri = video_uris.get(video4_filename)

print("Generating Structured Action Timeline\n")
print("Video: Cheesecake.mp4\n")

# Define JSON schema for action timeline in prompt
action_timeline_prompt = """ Analyze this cooking video and create a detailed timeline of all culinary actions, steps, and techniques shown.

Follow these guidelines:
1. Identify each cooking step with its corresponding timestamp
2. Describe ingredients being added or prepared
3. Note cooking techniques and methods used (mixing, baking, chilling, etc.)
4. Capture equipment or tools being used
5. Include temperature settings or timing instructions if visible
6. Note any visual cues (texture changes, color changes, consistency)
7. Output timestamp in MM:SS format
8. Focus on actionable recipe steps that someone could follow

Output in the following JSON schema:
{
  "recipe_name": "name of the dish being prepared",
  "actions": [
    {
      "step_number": 1,
      "action": "detailed description of the cooking step",
      "timestamp": "MM:SS",
      "ingredients": ["list of ingredients used in this step"],
      "technique": "cooking technique or method used",
      "equipment": ["tools or equipment used"]
    }
  ],
  "key_tips": ["any important tips or techniques highlighted in the video"]
}

Provide ONLY the JSON output, no additional text. """

# Invoke model with structured output request
response_timeline, content_timeline = invoke_nova_video(
    video_uri=video4_uri,
    prompt=action_timeline_prompt,
    bedrock_client=bedrock_client
)

print("\n" + "="*80)
print("RAW JSON RESPONSE")
print("="*80)
print(content_timeline)
print("="*80)

Analyzing Video 4: Cheesecake.mp4\n
Generating Structured Action Timeline

Video: Cheesecake.mp4

== Text Output ==
{
  "recipe_name": "5-Minute Cheesecake",
  "actions": [
    {
      "step_number": 1,
      "action": "Melt 1 tablespoon of butter in a small bowl using a microwave or stovetop",
      "timestamp": "00:00-00:04",
      "ingredients": ["butter"],
      "technique": "melting",
      "equipment": ["microwave", "small bowl"]
    },
    {
      "step_number": 2,
      "action": "Crush 2 graham crackers into crumbs and mix with melted butter",
      "timestamp": "00:06-00:14",
      "ingredients": ["graham crackers", "butter"],
      "technique": "crushing and mixing",
      "equipment": ["bowl", "spoon"]
    },
    {
      "step_number": 3,
      "action": "Combine 4 ounces cream cheese, 2 spoonfuls sugar, and vanilla in a bowl",
      "timestamp": "00:16-00:20",
      "ingredients": ["cream cheese", "sugar", "vanilla"],
      "technique": "mixing",
      "equipment": ["bowl"

### Parsing and Displaying the Action Timeline

Now let's invoke the model on Cheesecake.mp4 again, parse the JSON response, and display the timeline in a formatted, human-readable way.

In [65]:
print("Generating Structured Action Timeline\n")
print("Video: Cheesecake.mp4\n")

# Invoke the model again on the same video with the same prompt
response_timeline4, content_timeline4 = invoke_nova_video(
    video_uri=video4_uri,
    prompt=action_timeline_prompt,
    bedrock_client=bedrock_client
)

# Parse and display
try:
    action_data4 = parse_json_response(content_timeline4)
    
    print("\n" + "="*80)
    print("STRUCTURED ACTION TIMELINE")
    print("="*80)
    print("\nVideo: Cheesecake.mp4")
    print(f"Total Actions Detected: {len(action_data4.get('actions', []))}\n")
    
    for idx, action_entry in enumerate(action_data4.get('actions', []), 1):
        timestamp = action_entry.get('timestamp', 'N/A')
        action = action_entry.get('action', 'No description')
        print(f"{idx:2d}. [{timestamp}] {action}")
    
    print("\n" + "="*80)
    
except Exception as e:
    print(f"\n✗ Error parsing JSON response: {e}")

Generating Structured Action Timeline

Video: Cheesecake.mp4

== Text Output ==
{
  "recipe_name": "5-Minute Cheesecake",
  "actions": [
    {
      "step_number": 1,
      "action": "Place 1 tablespoon of butter into a small white bowl and melt it in the microwave",
      "timestamp": "00:00-00:03",
      "ingredients": ["butter"],
      "technique": "melting",
      "equipment": ["microwave"]
    },
    {
      "step_number": 2,
      "action": "Crush graham crackers into crumbs and mix with melted butter",
      "timestamp": "00:06-00:15",
      "ingredients": ["graham crackers"],
      "technique": "crushing and mixing",
      "equipment": ["spoon"]
    },
    {
      "step_number": 3,
      "action": "Add 4 ounces of cream cheese, 2 spoonfuls of sugar, and vanilla to a bowl",
      "timestamp": "00:16-00:20",
      "ingredients": ["cream cheese", "sugar", "vanilla"],
      "technique": "combining ingredients",
      "equipment": ["spoon"]
    },
    {
      "step_number": 4,
     

### Structured Action Timeline Summary

We've successfully demonstrated Nova Omni's ability to generate structured, JSON-formatted action timelines. This capability provides:

**Key Benefits:**
- **Machine-Readable Format**: JSON output can be easily parsed and processed by applications
- **Chronological Organization**: Actions are listed in temporal order with precise timestamps
- **Programmatic Integration**: Results can be stored in databases, used in APIs, or fed into analytics pipelines
- **Scalability**: Automated processing enables analysis of large video libraries

**Practical Applications:**
- Building searchable video databases
- Automating content moderation workflows
- Generating training datasets for ML models
- Creating detailed video documentation
- Enabling advanced video analytics

The structured format makes it straightforward to integrate video understanding into larger systems and workflows, opening up possibilities for automated video processing at scale.
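One practical detail when integrating these timelines: the prompt asks for `MM:SS` timestamps, but the responses shown above return ranges such as `00:06-00:14`. The sketch below handles both forms by converting them to seconds. The function names are hypothetical and are not part of the notebook's helpers.

```python
def mmss_to_seconds(value):
    """Convert an 'MM:SS' string to integer seconds."""
    minutes, seconds = value.strip().split(":")
    return int(minutes) * 60 + int(seconds)

def parse_timestamp(value):
    """Return (start, end) in seconds for 'MM:SS' or 'MM:SS-MM:SS'.

    A single timestamp is treated as a zero-length range.
    """
    start_text, _, end_text = value.partition("-")
    start = mmss_to_seconds(start_text)
    end = mmss_to_seconds(end_text) if end_text else start
    return start, end

print(parse_timestamp("00:06-00:14"))  # (6, 14)
print(parse_timestamp("01:05"))        # (65, 65)
```

Numeric start and end values are easier to sort, index, and join against other per-second data than the raw strings.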

## Capability 3: Content Moderation

Content moderation is a critical capability for platforms that host user-generated content, media companies, and organizations that need to ensure their video content meets safety and compliance standards. Nova Omni can analyze both visual and audio content to detect potentially inappropriate material.

### Use Cases for Content Moderation:

- **Social Media Platforms**: Automatically flag inappropriate user-uploaded videos
- **E-Learning Platforms**: Ensure educational content is appropriate for target age groups
- **Corporate Training**: Verify training videos meet workplace standards
- **Broadcasting**: Pre-screen content before airing
- **User-Generated Content**: Moderate submissions before publication
- **Compliance & Legal**: Document content for regulatory requirements
- **Brand Safety**: Ensure advertising appears alongside appropriate content

### What We'll Detect:

1. **Explicit Visual Content**: Nudity, violence, graphic imagery, or other adult material
2. **Toxic Audio Content**: Profanity, hate speech, threats, or harmful language

For each detected issue, we'll receive:
- Type of content detected
- Severity level (low/medium/high)
- Precise timestamp in MM:SS format
- Detailed description of the finding

Let's analyze our test videos for content moderation purposes.

### Visual Content Analysis

First, let's analyze the video for explicit visual content including nudity, violence, or other adult material.

In [66]:
print("Analyzing Visual Content for Moderation\n")
print("Video: Meridian_Clip.mp4\n")

# Create prompt for explicit visual content detection
visual_moderation_prompt = """
Analyze this video for explicit visual content that may be inappropriate or require content warnings.

Examine the video for:
- Nudity or sexually explicit content
- Violence or graphic imagery
- Disturbing or shocking visual content
- Other adult material that may require age restrictions

For any detected content, provide:
1. Type of content detected
2. Severity level: low (mild), medium (moderate), or high (severe)
3. Timestamp in MM:SS format when the content appears
4. Brief description of what was detected

If no explicit visual content is detected, clearly state that the video appears to be safe for general audiences from a visual content perspective.

Be specific and factual in your assessment.
"""

# Invoke model for visual content analysis
response_visual, content_visual = invoke_nova_video(
    video_uri=video1_uri,
    prompt=visual_moderation_prompt,
    temperature=0.1,  # Lower temperature for more consistent moderation
    bedrock_client=bedrock_client
)

# Display response
display_response(content_visual)

Analyzing Visual Content for Moderation

Video: Meridian_Clip.mp4

== Text Output ==

RESPONSE CONTENT


### Audio Content Analysis

Now let's analyze the audio track for toxic content including profanity, hate speech, or harmful language.

In [67]:
print("Analyzing Audio Content for Moderation\n")
print("Video: Meridian_Clip.mp4\n")

# Create prompt for toxic audio content detection
audio_moderation_prompt = """
Analyze the audio content in this video for toxic or inappropriate language.

Listen for:
- Profanity or vulgar language
- Hate speech or discriminatory language
- Threats or violent language
- Harassment or bullying language
- Other harmful or inappropriate speech

For any detected content, provide:
1. Type of toxic content detected
2. Severity level: low (mild), medium (moderate), or high (severe)
3. Timestamp in MM:SS format when the language occurs
4. Context or description (without repeating explicit language verbatim)

If no toxic audio content is detected, clearly state that the audio appears to be appropriate for general audiences.

If there is no spoken audio or dialogue in the video, note that as well.

Be specific and factual in your assessment.
"""

# Invoke model for audio content analysis
response_audio, content_audio = invoke_nova_video(
    video_uri=video1_uri,
    prompt=audio_moderation_prompt,
    temperature=0.1,  # Lower temperature for more consistent moderation
    bedrock_client=bedrock_client
)

# Display response
display_response(content_audio)

Analyzing Audio Content for Moderation

Video: Meridian_Clip.mp4

== Text Output ==
No toxic audio content was detected in this video.

RESPONSE CONTENT
No toxic audio content was detected in this video.


### Content Moderation Analysis for the-sea.mp4 Video

Let's also analyze our second video (the-sea.mp4) to demonstrate the moderation capability across different content types.

In [68]:
print("Analyzing Visual Content for Moderation\n")
print("Video: the-sea.mp4\n")

# Invoke model for visual content analysis on second video
response_visual2, content_visual2 = invoke_nova_video(
    video_uri=video2_uri,
    prompt=visual_moderation_prompt,
    temperature=0.1,
    bedrock_client=bedrock_client
)

# Display response
display_response(content_visual2)

Analyzing Visual Content for Moderation

Video: the-sea.mp4

== Text Output ==
The video features a serene coastal landscape with rocky cliffs and the ocean, followed by a close-up of a seashell on a beach at sunset. There are no elements of nudity, violence, or other inappropriate content. The visuals are peaceful and natural, making the video suitable for general audiences.

RESPONSE CONTENT
The video features a serene coastal landscape with rocky cliffs and the ocean, followed by a close-up of a seashell on a beach at sunset. There are no elements of nudity, violence, or other inappropriate content. The visuals are peaceful and natural, making the video suitable for general audiences.


In [69]:
print("Analyzing Audio Content for Moderation\n")
print("Video: the-sea.mp4\n")

# Invoke model for audio content analysis on second video
response_audio2, content_audio2 = invoke_nova_video(
    video_uri=video2_uri,
    prompt=audio_moderation_prompt,
    temperature=0.1,
    bedrock_client=bedrock_client
)

# Display response
display_response(content_audio2)

Analyzing Audio Content for Moderation

Video: the-sea.mp4

== Text Output ==
The audio features a serene coastal scene with waves crashing against rocks and a distant view of the ocean. There is no speech or dialogue present in this clip. The predominant sounds are those of nature, creating a peaceful and calming atmosphere.

RESPONSE CONTENT
The audio features a serene coastal scene with waves crashing against rocks and a distant view of the ocean. There is no speech or dialogue present in this clip. The predominant sounds are those of nature, creating a peaceful and calming atmosphere.


### Content Moderation Summary

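We've demonstrated Nova Omni's content moderation capabilities for both visual and audio content across two different videos. None of the responses shown above flagged any content, so the severity levels and timestamps requested in the prompts were not exercised by these particular clips.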

**Key Capabilities Demonstrated:**

1. **Visual Content Analysis**: Detection of explicit imagery, violence, and inappropriate visual content
2. **Audio Content Analysis**: Detection of profanity, hate speech, and toxic language
3. **Severity Assessment**: Classification of findings by severity level (low/medium/high)
4. **Temporal Precision**: Exact timestamps for when inappropriate content appears
5. **Detailed Reporting**: Comprehensive descriptions of detected issues

**Practical Applications:**

- **Automated Moderation**: Scale content review across large video libraries
- **Pre-Publication Screening**: Flag content before it goes live
- **Compliance Documentation**: Create audit trails for regulatory requirements
- **Age-Appropriate Filtering**: Classify content for different audience segments
- **Brand Safety**: Ensure advertising appears alongside appropriate content
- **Human Review Prioritization**: Focus human moderators on flagged content

**Important Considerations:**

- Content moderation should be used as part of a comprehensive moderation strategy
- AI-based moderation works best when combined with human review for edge cases
- Different platforms and contexts may have different standards for what constitutes inappropriate content
- Regular calibration and testing ensures moderation accuracy over time

The combination of visual and audio analysis provides comprehensive coverage for content safety, helping organizations maintain appropriate standards while scaling their content operations.
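To make the "human review prioritization" idea concrete, here is a minimal sketch that ranks findings by severity and keeps only those at or above a threshold. It assumes the moderation prompts have been adapted to return structured findings (the prompts in this notebook return free text), and the `Finding` class, `needs_human_review` function, and sample data are all hypothetical.

```python
from dataclasses import dataclass

# Severity labels match the low/medium/high scale requested in the prompts.
SEVERITY_RANK = {"low": 1, "medium": 2, "high": 3}

@dataclass
class Finding:
    content_type: str
    severity: str
    timestamp: str
    description: str

def needs_human_review(findings, threshold="medium"):
    """Return findings at or above the threshold, most severe first."""
    minimum = SEVERITY_RANK[threshold]
    flagged = [f for f in findings if SEVERITY_RANK.get(f.severity, 0) >= minimum]
    return sorted(flagged, key=lambda f: SEVERITY_RANK[f.severity], reverse=True)

# Made-up findings for illustration only; not output from the model.
findings = [
    Finding("profanity", "low", "00:12", "mild language"),
    Finding("violence", "high", "01:40", "graphic fight scene"),
    Finding("threat", "medium", "00:55", "threatening remark"),
]
queue = needs_human_review(findings)
print([f.content_type for f in queue])  # ['violence', 'threat']
```

The threshold is a policy decision; a stricter platform might route every finding, including low-severity ones, to a reviewer.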

## Capability 4: Logo Recognition

Logo recognition is a powerful capability for brand monitoring, competitive analysis, and understanding product placement in video content. Nova Omni can identify brand logos and trademarks appearing in video frames, providing detailed information about when and how brands appear in your content.

### Use Cases for Logo Recognition:

- **Brand Monitoring**: Track your brand's visibility across video content
- **Competitive Intelligence**: Identify competitor brands in market research videos
- **Product Placement Analysis**: Measure brand exposure in entertainment content
- **Sponsorship Verification**: Confirm sponsor logos appear as contracted
- **Advertising Analytics**: Quantify brand visibility in commercial content
- **Content Rights Management**: Identify unauthorized use of branded content
- **Market Research**: Analyze brand presence in user-generated content
- **Sports Analytics**: Track sponsor visibility during sporting events

### What We'll Extract:

For each detected logo, we'll receive:
- **Logo/Brand Name**: The identified brand or trademark
- **Timestamp**: When the logo first appears (MM:SS format)
- **Duration**: How long the logo remains visible
- **Visual Description**: Characteristics of the logo (colors, placement, size)

The structured JSON format makes it easy to:
- Generate brand visibility reports
- Calculate total exposure time for each brand
- Create searchable brand appearance databases
- Integrate with analytics and reporting systems
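As a sketch of the exposure-time calculation mentioned above, the snippet below sums visible seconds per brand. It assumes each logo entry carries a numeric `duration_seconds` field, which is not part of the schema used in this notebook (that schema returns a free-text `duration`), so the prompt would need to be adjusted to request it. Entries without a numeric duration are skipped.

```python
from collections import defaultdict

def total_exposure(logos):
    """Sum visible seconds per brand, ignoring entries without a numeric duration."""
    totals = defaultdict(float)
    for entry in logos:
        seconds = entry.get("duration_seconds")  # hypothetical field
        if isinstance(seconds, (int, float)):
            totals[entry.get("logo_name", "Unknown")] += seconds
    return dict(totals)

# Made-up entries for illustration only; not output from the model.
sample_logos = [
    {"logo_name": "BrandA", "timestamp": "00:00", "duration_seconds": 12},
    {"logo_name": "BrandB", "timestamp": "00:20", "duration_seconds": 5},
    {"logo_name": "BrandA", "timestamp": "01:10", "duration_seconds": 8},
    {"logo_name": "BrandC", "timestamp": "01:30", "duration": "throughout presentation"},
]
exposure = total_exposure(sample_logos)
print(exposure)  # {'BrandA': 20.0, 'BrandB': 5.0}
```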

Let's analyze our test video for logo recognition.

### Logo Detection in Andy-Jassy-on-Amazon-Nova.mp4

Let's analyze the Andy-Jassy-on-Amazon-Nova clip to identify any brand logos or trademarks that appear in the video.

In [70]:
print("Analyzing Video for Logo Recognition\n")
print("Video: Andy-Jassy-on-Amazon-Nova.mp4\n")

# Define JSON schema for logo detection results
logo_detection_prompt = """
Analyze this executive presentation video and identify all brand logos, company branding, and visual brand elements.

For tech presentations, pay special attention to:
- Company logos (AWS, Amazon, partner companies)
- Product logos and branding (Amazon Nova, AWS services, etc.)
- Presentation slide branding and watermarks
- Screen graphics and UI elements showing branded products
- Background branding (stage backdrops, banners, displays)
- Clothing or badge branding
- Demo screens showing branded interfaces

For each logo or brand element detected, provide:
1. **Brand/Logo Name**: Specific name (e.g., "AWS", "Amazon", "Amazon Nova", "Amazon Bedrock")
2. **Brand Category**: Type of branding (company logo, product logo, service name, presentation graphic, etc.)
3. **Timestamp**: First clear appearance in MM:SS format
4. **Duration**: How long it remains visible ("throughout presentation", "5-10 seconds", "appears multiple times", etc.)
5. **Placement & Context**: Where it appears (slide header, product demo, speaker backdrop, clothing, etc.)
6. **Visual Description**: Colors, size, prominence, and styling
7. **Prominence Level**: How prominently featured (primary/secondary/background)

Output in JSON format:
{
  "presentation_context": "brief description of the presentation setting",
  "primary_brand": "main company/product being presented",
  "logos": [
    {
      "logo_name": "specific brand or product name",
      "brand_category": "company logo | product logo | service name | presentation element",
      "timestamp": "MM:SS",
      "duration": "duration description",
      "placement": "where the logo appears in frame",
      "description": "detailed visual description including colors, size, and styling",
      "prominence": "primary | secondary | background",
      "context": "what's happening when this logo appears (e.g., 'shown during product announcement', 'visible on demo screen')"
    }
  ],
  "brand_mentions": [
    {
      "brand_name": "brand mentioned verbally or in text",
      "timestamp": "MM:SS",
      "mention_type": "visual text | spoken | both",
      "context": "context of the mention"
    }
  ]
}

If no recognizable logos are detected, return empty arrays.
Provide ONLY the JSON output, no additional text.
"""

# Invoke model with logo detection prompt
response_logo, content_logo = invoke_nova_video(
    video_uri=video3_uri,
    prompt=logo_detection_prompt,
    temperature=0.2,  # Lower temperature for more consistent detection
    bedrock_client=bedrock_client
)

print("\n" + "="*80)
print("RAW JSON RESPONSE")
print("="*80)
print(content_logo)
print("="*80)

Analyzing Video for Logo Recognition

Video: Andy-Jassy-on-Amazon-Nova.mp4

== Text Output ==
```json
{
  "presentation_context": "A speaker presents at AWS re:Invent, using hand gestures and addressing an engaged audience in a blue-lit conference hall.",
  "primary_brand": "AWS",
  "logos": [
    {
      "logo_name": "AWS",
      "brand_category": "company logo",
      "timestamp": "00:00",
      "duration": "throughout presentation",
      "placement": "bottom right corner of the screen",
      "description": "Small white AWS logo in the lower-right corner of the frame, consistently visible throughout the presentation.",
      "prominence": "background",
      "context": "Brand watermark during the entire presentation"
    }
  ],
  "brand_mentions": []
}
```

RAW JSON RESPONSE
```json
{
  "presentation_context": "A speaker presents at AWS re:Invent, using hand gestures and addressing an engaged audience in a blue-lit conference hall.",
  "primary_brand": "AWS",
  "logos": [
    {
   

### Parsing and Displaying Logo Detection Results

Now let's parse the JSON response and display the logo timeline in a formatted, human-readable way.

In [71]:
# Parse JSON response for logo detections
try:
    logo_data = parse_json_response(content_logo)
    
    # Display logo timeline with descriptions
    print("\n" + "="*80)
    print("LOGO RECOGNITION RESULTS")
    print("="*80)
    print("\nVideo: Andy-Jassy-on-Amazon-Nova.mp4")
    
    logos_detected = logo_data.get('logos', [])
    print(f"Total Logos Detected: {len(logos_detected)}\n")
    
    if len(logos_detected) > 0:
        # Display each logo with details
        for idx, logo_entry in enumerate(logos_detected, 1):
            logo_name = logo_entry.get('logo_name', 'Unknown')
            timestamp = logo_entry.get('timestamp', 'N/A')
            duration = logo_entry.get('duration', 'Not specified')
            description = logo_entry.get('description', 'No description')
            
            print(f"{idx}. {logo_name}")
            print(f"   First Appearance: {timestamp}")
            print(f"   Duration: {duration}")
            print(f"   Description: {description}")
            print()
    else:
        print("No recognizable brand logos were detected in this video.\n")
    
    print("="*80)
    
except Exception as e:
    print(f"\n✗ Error parsing JSON response: {e}")
    print("The model may have returned text instead of pure JSON.")


LOGO RECOGNITION RESULTS

Video: Andy-Jassy-on-Amazon-Nova.mp4
Total Logos Detected: 1

1. AWS
   First Appearance: 00:00
   Duration: throughout presentation
   Description: Small white AWS logo in the lower-right corner of the frame, consistently visible throughout the presentation.
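Note that the raw response above is wrapped in a Markdown code fence even though the prompt asked for JSON only. The `parse_json_response` helper defined earlier in this notebook handles that case; its implementation is not repeated here, but a minimal sketch of one possible approach looks like the following. The `extract_json` name is hypothetical and the notebook's actual helper may work differently.

```python
import json

# Built programmatically so the fence characters do not end this code block.
FENCE = "`" * 3

def extract_json(text):
    """Parse JSON from model text that may be wrapped in a Markdown code fence."""
    cleaned = text.strip()
    if cleaned.startswith(FENCE):
        # Drop the opening fence line (for example, the one carrying the "json" tag).
        cleaned = cleaned.split("\n", 1)[1] if "\n" in cleaned else ""
        # Drop the closing fence if present.
        if cleaned.rstrip().endswith(FENCE):
            cleaned = cleaned.rstrip()[: -len(FENCE)]
    return json.loads(cleaned)

wrapped = FENCE + 'json\n{"primary_brand": "AWS", "logos": []}\n' + FENCE
print(extract_json(wrapped))     # {'primary_brand': 'AWS', 'logos': []}
print(extract_json('{"a": 1}'))  # {'a': 1}
```

If the text is not valid JSON after the fence is removed, `json.loads` raises `json.JSONDecodeError`, which the `try`/`except` pattern used in the cells above would catch.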



### Logo Recognition Summary

We've successfully demonstrated Nova Omni's logo recognition capabilities on the Andy-Jassy-on-Amazon-Nova.mp4 video.

**Key Capabilities Demonstrated:**

1. **Brand Identification**: Recognition of logos and trademarks in video frames
2. **Temporal Tracking**: Precise timestamps for when logos first appear
3. **Duration Estimation**: Understanding how long logos remain visible
4. **Visual Description**: Detailed characteristics including colors, placement, and context
5. **Structured Output**: JSON format for easy integration with analytics systems

**Practical Applications:**

- **Brand Visibility Reports**: Quantify brand exposure across video content
- **Competitive Analysis**: Track competitor brand appearances in market research
- **Sponsorship ROI**: Measure sponsor logo visibility and calculate exposure value
- **Product Placement Verification**: Confirm contracted brand appearances
- **Content Rights Management**: Identify unauthorized brand usage
- **Advertising Analytics**: Measure brand presence in commercial content
- **Market Intelligence**: Analyze brand trends in user-generated content

**Integration Opportunities:**

The structured JSON output enables:
- Automated brand monitoring dashboards
- Integration with marketing analytics platforms
- Database storage for historical brand tracking
- API endpoints for real-time logo detection services
- Reporting systems for sponsorship and advertising teams

**Considerations:**

- Logo recognition accuracy depends on factors like video quality, logo size, and visibility
- Partial or obscured logos may be more difficult to identify
- Context and placement can affect recognition confidence
- Regular testing with diverse content ensures consistent performance

Logo recognition provides valuable insights for brand management, competitive intelligence, and content monetization strategies, making it an essential tool for modern video analytics.

### Capabilities Summary

Throughout this notebook, we've successfully demonstrated five comprehensive video understanding capabilities:

**1. Video Summarization**
- Generated concise summaries with key points for both test videos
- Identified main events and moments in video content
- Provided structured bullet-point summaries

**2. Temporal Event Detection**
- Detected environmental changes with precise timestamps
- Identified character and object appearances
- Recognized cinematographic elements and camera movements
- All timestamps provided in human-readable MM:SS format

**3. Structured Action Timeline**
- Generated JSON-formatted chronological action lists
- Created machine-readable timelines for programmatic processing
- Enabled integration with downstream analytics systems

**4. Content Moderation**
- Analyzed visual content for explicit material
- Examined audio content for toxic language
- Provided severity assessments and detailed descriptions
- Generated comprehensive moderation reports

**5. Logo Recognition**
- Identified brand logos and trademarks in video frames
- Tracked logo appearances with timestamps and duration
- Provided detailed visual descriptions of detected logos
- Delivered structured JSON output for brand analytics

### Key Achievements

- **Environment Setup**: Successfully initialized AWS clients and configured S3 storage
- **Helper Functions**: Created reusable utilities for video processing and cost tracking
- **Video Upload**: Uploaded test videos to S3 for analysis
- **Cost Tracking**: Monitored and calculated costs for all API invocations
- **Error Handling**: Implemented robust error handling for AWS services and JSON parsing
- **Structured Outputs**: Generated both human-readable and machine-readable results

### Next Steps

Now that you've seen these capabilities in action, consider:

1. **Experimenting with Your Own Videos**: Upload your own content to test these capabilities
2. **Customizing Prompts**: Adjust prompts to focus on specific aspects relevant to your use case
3. **Building Pipelines**: Integrate these capabilities into automated video processing workflows
4. **Combining Capabilities**: Use multiple capabilities together for comprehensive video analysis
5. **Scaling Up**: Process larger video libraries using batch processing techniques
6. **Cost Optimization**: Use the cost tracking data to optimize your video analysis workflows
7. **Integration**: Connect these capabilities to your existing systems via APIs
8. **Advanced Use Cases**: Explore domain-specific applications like sports analytics, security monitoring, or content creation
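For the "Scaling Up" step, one possible starting point is a thread pool that analyzes several videos concurrently and collects failures separately. This is only a sketch: `analyze_video` is a stand-in that returns a fake result, and in practice it would wrap a call such as this notebook's `invoke_nova_video`. The URIs are placeholders, and a real workload would also need to respect service quotas and throttling.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed

def analyze_video(video_uri):
    """Stand-in for a real model call; returns a fake result for illustration."""
    if not video_uri.startswith("s3://"):
        raise ValueError(f"Not an S3 URI: {video_uri}")
    return {"video_uri": video_uri, "status": "ok"}

def analyze_batch(video_uris, max_workers=4):
    """Run analyze_video concurrently; return (results, errors) keyed by URI."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(analyze_video, uri): uri for uri in video_uris}
        for future in as_completed(futures):
            uri = futures[future]
            try:
                results[uri] = future.result()
            except Exception as exc:
                # Record the failure and keep processing the remaining videos.
                errors[uri] = str(exc)
    return results, errors

results, errors = analyze_batch(
    ["s3://example-bucket/a.mp4", "s3://example-bucket/b.mp4", "bad-uri"]
)
print(sorted(results), sorted(errors))
```

Keeping errors in their own dictionary means one bad input does not abort the whole batch, and failed videos can be retried later.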

### Additional Resources

- [Amazon Nova Omni Documentation](https://docs.aws.amazon.com/bedrock/latest/userguide/models-nova.html)
- [Amazon Bedrock User Guide](https://docs.aws.amazon.com/bedrock/latest/userguide/what-is-bedrock.html)
- [Amazon Nova Samples Repository](https://github.com/aws-samples/amazon-nova-samples)
- [Bedrock Pricing](https://aws.amazon.com/bedrock/pricing/)
- [Amazon Bedrock API Reference](https://docs.aws.amazon.com/bedrock/latest/APIReference/welcome.html)

### Conclusion

Amazon Nova Omni provides powerful video understanding capabilities that can transform how you analyze and process video content. From simple summarization to complex logo recognition and content moderation, these capabilities enable you to extract valuable insights from video at scale.

The combination of precise temporal understanding, structured outputs, and comprehensive cost tracking makes Nova Omni an excellent choice for production video analysis workflows. Whether you're building content moderation systems, brand monitoring tools, or automated video indexing solutions, Nova Omni provides the foundation for sophisticated video intelligence applications.

Thank you for exploring Amazon Nova Omni's video understanding capabilities!