# Prerequisites

## Workshop overview

Welcome to the Video Understanding on AWS workshop!

The workshop is organized into two main parts: (1) Media Analysis using Bedrock Data Automation and (2) Media Analysis using Amazon Nova. Read on for an overview of each section. After running this notebook, you can run Part 1 and Part 2 independently of each other.

**Prerequisites**

Before running the main workshop, you'll set up the notebook environment using this notebook.

**Part 1: Media Analysis using Bedrock Data Automation (BDA):**

The notebooks in this section give an overview of BDA APIs and use cases.  They can be run in any order.  

1. [Extract and analyze a movie with BDA](1-media-analysis-using-bda/01-extract-analyze-a-movie.ipynb)
2. [Contextual Ad overlay](1-media-analysis-using-bda/02-contextual-ad-overlay.ipynb)

**Part 2: Media Analysis using Amazon Nova:**

In the foundation notebooks, you'll set up the notebook environment, prepare the sample video by breaking it down into clips, and experiment with using foundation models to generate insights about video clips. In the use case notebooks, you'll build on the foundations to solve different video understanding use cases. The use cases are independent and can be run in any order.

**Foundation (required before running use cases)**
1. [Visual video segments: frames, shots and scenes](2-media-analysis-using-amazon-nova/01A-visual-segments-frames-shots-scenes.ipynb) (20 minutes)
2. [Audio segments](2-media-analysis-using-amazon-nova/01B-audio-segments.ipynb) (10 minutes)

**Use cases (optional, run in any order):**

After running the foundation notebooks, you can choose any use case. If you are running at an AWS Workshop event, you will be able to complete the foundation notebooks plus one use case in a 2-hour session:

* [Ad break detection and contextual ad targeting](2-media-analysis-using-amazon-nova/02-ad-breaks-and-contextual-ad-targeting.ipynb) (20 minutes) - identify opportunities for ad insertion and use a standard taxonomy to match video content to ad content
* [Video summarization](2-media-analysis-using-amazon-nova/03-video-summarization.ipynb) (20 minutes) - generate short-form videos from a longer video
* [Semantic video search](2-media-analysis-using-amazon-nova/04-semantic-video-search.ipynb) (20 minutes) - search video using images and natural language to find relevant clips

**Resources**

The activities in this workshop are based on AWS Solution Guidance.  The [Additional Resources](./09-resources.ipynb) lab contains links to relevant reference architectures, code samples and blog posts.

# Install ffmpeg and Python packages

- ffmpeg for video and image processing
- faiss for the vector store
- webvtt-py for parsing subtitle files
- termcolor for formatting output

In [None]:
## install ffmpeg (Linux/SageMaker only)
# On macOS, install with: brew install ffmpeg
import platform
if platform.system() == 'Linux':
    !sudo apt update -y && sudo apt-get -y install ffmpeg
else:
    print(f"Skipping apt install on {platform.system()}. Install ffmpeg manually if needed.")

In [None]:
## Check if ffmpeg is installed
import shutil
if shutil.which('ffmpeg'):
    print(f"✓ ffmpeg found at: {shutil.which('ffmpeg')}")
else:
    print("✗ ffmpeg not found. Please install it:")
    print("  macOS: brew install ffmpeg")
    print("  Linux: sudo apt-get install ffmpeg")
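 
Finding `ffmpeg` on the `PATH` does not guarantee that the binary actually runs from this kernel. The sketch below goes one step further and asks the binary for its version banner. The helper name `ffmpeg_version` is hypothetical and not part of the workshop code.

```python
import shutil
import subprocess


def ffmpeg_version(binary="ffmpeg"):
    """Return the first line of `<binary> -version`, or None if it cannot be run."""
    path = shutil.which(binary)
    if path is None:
        return None
    try:
        result = subprocess.run(
            [path, "-version"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        )
    except (OSError, subprocess.SubprocessError):
        # Covers a non-executable file, a non-zero exit code, and a timeout
        return None
    lines = result.stdout.splitlines()
    return lines[0] if lines else None


print(ffmpeg_version() or "ffmpeg is not runnable from this kernel")
```

A `None` result means the later notebooks that call ffmpeg would likely fail, so it is worth resolving before continuing.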

In [None]:
%pip install -r requirements.txt
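 
After the install finishes, a quick check can confirm that the packages listed above are visible to the current kernel. This is a minimal sketch: the import names `faiss`, `webvtt`, and `termcolor` are assumed to match what `requirements.txt` installs, and `missing_modules` is a hypothetical helper.

```python
import importlib.util


def missing_modules(module_names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


# Assumed import names for the packages listed above
required = ["faiss", "webvtt", "termcolor"]

missing = missing_modules(required)
if missing:
    print(f"Missing modules: {missing}. Re-run the install cell or restart the kernel.")
else:
    print("All required modules are importable")
```

If a module is still reported missing right after `%pip install`, restarting the kernel is usually the first thing to try.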

## Get SageMaker default resources

In [None]:
import boto3

sagemaker_resources = {}

# Try to get SageMaker execution role, fallback to boto3 session if not in SageMaker
try:
    import sagemaker
    sagemaker_resources["role"] = sagemaker.get_execution_role()
    sagemaker_resources["region"] = sagemaker.Session().boto_region_name
    print("Running in SageMaker environment")
except Exception:
    # Not in SageMaker, use boto3 session
    session = boto3.Session()
    sagemaker_resources["role"] = None  # Not needed outside SageMaker
    sagemaker_resources["region"] = session.region_name
    print("Running in local/non-SageMaker environment")

print(sagemaker_resources)

# Set up session AWS resources

The cell below loads AWS resources from the CloudFormation stack outputs. This works for both:
- AWS hosted events (using the full `workshop.yaml` stack)
- Your own AWS account (using the minimal `workshop-customer.yaml` stack)

Both stacks use the same stack name (`workshop`) and provide the same outputs, so this notebook works in either environment.

### Deployment Options

**Automatic deployment from notebook (recommended)**

Run the code cell below. If the stack doesn't exist, it will automatically deploy `workshop-customer.yaml` with your current user/role ARN for OpenSearch access.

**Note:** The `UserOrRoleArn` parameter is optional but recommended if you're running outside of SageMaker or need OpenSearch access from a specific IAM identity.
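
When the notebook runs with role credentials, `sts.get_caller_identity()` returns an STS assumed-role session ARN rather than the IAM role ARN. Whether `workshop-customer.yaml` accepts the session ARN for `UserOrRoleArn` is not shown here, so treat the following as a sketch under the assumption that an IAM role ARN is preferred. The helper name `to_iam_role_arn` is hypothetical. Note that a role path (for example `/service-role/`) is not present in the session ARN and cannot be recovered this way.

```python
import re

# Matches ARNs such as arn:aws:sts::123456789012:assumed-role/RoleName/SessionName
_ASSUMED_ROLE = re.compile(
    r"^arn:(?P<partition>[^:]+):sts::(?P<account>\d{12}):"
    r"assumed-role/(?P<role>[^/]+)/[^/]+$"
)


def to_iam_role_arn(arn):
    """Convert an assumed-role session ARN to an IAM role ARN; return other ARNs unchanged."""
    match = _ASSUMED_ROLE.match(arn)
    if match is None:
        return arn
    return f"arn:{match['partition']}:iam::{match['account']}:role/{match['role']}"


# Example with a made-up account ID and role name
print(to_iam_role_arn("arn:aws:sts::123456789012:assumed-role/WorkshopRole/SageMaker"))
```

IAM user ARNs pass through untouched, so the same call could be applied to whatever identity is detected.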

## Get CloudFormation stack outputs

In [None]:
import boto3
from IPython.display import JSON
from botocore.exceptions import ClientError

cf = boto3.client(service_name="cloudformation")
sts = boto3.client(service_name="sts")

try:
    stack = cf.describe_stacks(StackName='workshop')
    print("✓ Stack found")
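except ClientError as err:
    # Deploy only when the stack is missing; re-raise other errors (for example, access denied)
    if "does not exist" not in err.response["Error"]["Message"]:
        raise
    print("Stack not found. Deploying workshop-customer.yaml...")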
    
    # Get current user/role ARN for OpenSearch access
    try:
        caller_identity = sts.get_caller_identity()
        current_arn = caller_identity['Arn']
        print(f"  Detected identity: {current_arn}")
    except Exception as e:
        print(f"  Warning: Could not detect current ARN: {e}")
        current_arn = ""
    
    with open('workshop-customer.yaml', 'r') as f:
        template_body = f.read()
    
    # Create stack with current user ARN for OpenSearch access
    stack_params = {
        'StackName': 'workshop',
        'TemplateBody': template_body,
        'Capabilities': ['CAPABILITY_NAMED_IAM']
    }
    
    if current_arn:
        stack_params['Parameters'] = [
            {'ParameterKey': 'UserOrRoleArn', 'ParameterValue': current_arn}
        ]
    
    cf.create_stack(**stack_params)
    
    print("  Waiting for stack creation (5-10 minutes)...")
    cf.get_waiter('stack_create_complete').wait(StackName='workshop')
    
    stack = cf.describe_stacks(StackName='workshop')
    print("✓ Stack deployed successfully")

In [None]:
JSON(stack)

In [None]:
session = {}
session['bucket'] = next(item["OutputValue"] for item in stack['Stacks'][0]['Outputs'] if item["OutputKey"] == "S3BucketName")
session['MediaConvertRole'] = next(item["OutputValue"] for item in stack['Stacks'][0]['Outputs'] if item["OutputKey"] == "MediaConvertRole")
session["AOSSCollectionEndpoint"] = next(item["OutputValue"] for item in stack['Stacks'][0]['Outputs'] if item["OutputKey"] == "AOSSCollectionEndpoint")

print("\nWorkshop resources loaded:")
print(f"  S3 Bucket: {session['bucket']}")
print(f"  MediaConvert Role: {session['MediaConvertRole']}")
print(f"  OpenSearch Collection: {session['AOSSCollectionEndpoint']}")
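 
The three lookups above each scan the outputs list and raise a bare `StopIteration` if a key is absent. One possible alternative, sketched here, builds a dictionary once and reports any missing keys by name. `stack_outputs` is a hypothetical helper, and the example response is hand-written with made-up values in the shape that `cf.describe_stacks` returns.

```python
def stack_outputs(describe_stacks_response, required_keys=()):
    """Map OutputKey to OutputValue for the first stack in a describe_stacks response."""
    outputs = describe_stacks_response["Stacks"][0].get("Outputs", [])
    by_key = {item["OutputKey"]: item["OutputValue"] for item in outputs}
    missing = [key for key in required_keys if key not in by_key]
    if missing:
        raise KeyError(f"Stack is missing expected outputs: {missing}")
    return by_key


# Hand-written example response (values are placeholders, not real resources)
example = {
    "Stacks": [
        {
            "Outputs": [
                {"OutputKey": "S3BucketName", "OutputValue": "example-bucket"},
                {"OutputKey": "MediaConvertRole", "OutputValue": "example-role-arn"},
                {"OutputKey": "AOSSCollectionEndpoint", "OutputValue": "example-endpoint"},
            ]
        }
    ]
}

outputs = stack_outputs(
    example, required_keys=("S3BucketName", "MediaConvertRole", "AOSSCollectionEndpoint")
)
print(outputs["S3BucketName"])
```

In the notebook, the same helper could be called with the real `stack` variable in place of `example`.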

# Find Amazon Q Developer

Jupyter notebooks in SageMaker Studio have Amazon Q Developer enabled.  

1. To use Q Developer, click the Q Developer chat icon in the left sidebar menu. The active side panel should now be Amazon Q Developer.
<br></br>
<img src="static/images/00-qdev-sidebar1.png" alt="Q Developer Sidebar" style="width: 600px;"/>
<br></br>
2. Try it out by asking a question. For example, you could ask: `What kinds of questions can Q developer answer? Be brief.` You should get a response like this:
<br></br>
<img src="static/images/00-qdev-skills1.png" alt="Q Developer Skills" style="width: 600px;"/>
<br></br>

Throughout this workshop, you can use Q when you encounter errors or have questions about the code.  

# Save variables we will use in other notebooks

We store these variables so that the notebooks in the next labs can load and reuse them.

In [None]:
%store sagemaker_resources
%store session