# Contextual ad overlays


Contextual ad targeting is an advertising strategy that selects ads based on the content and context in which they will appear, rather than on users' personal data or behavior. This approach has gained renewed importance in the post-GDPR era as the digital advertising industry moves away from third-party cookies and personal data collection. It offers an alternative to behavioral targeting that keeps ads relevant and effective.

Amazon Bedrock Data Automation (BDA) provides features to help automate the process of contextual ad targeting for video content. With BDA, you can analyze the video's content to identify different scenes and generate contextual metadata for each scene. This metadata includes:

* Classification of each scene using the IAB Content Taxonomy - The IAB Content Taxonomy is a standard used by advertisers to categorize content for automated ad placement through Ad Decision Servers.

By leveraging the IAB Content Taxonomy, you can match advertisements to the contextual information of each video scene. This allows you to create more effective and relevant advertising experiences that are tailored to the video's content.  The image below is a screenshot of a sample video with ad overlays.

![Example of an ad overlay](static/images/ad-overlay-example.png)



## Architecture

This hands-on workflow uses AWS services from SageMaker.  You will use BDA to analyze a sample video to identify scenes in the video where ads can be placed that adhere to brand safety guidelines.  Then you will select the best ad for each opportunity by matching the IAB category for an ad to the IAB category for the scene.

![Workflow diagram](static/images/ad-overlays.drawio.png)


## Prerequisites

### Retrieve saved values from previous notebooks

To run this notebook, you need to have run the previous notebook: [00_prerequisites.ipynb](./00-prequisites.ipynb), where you installed package dependencies and gathered some information from the SageMaker environment.

In [None]:
%store -r

In [None]:
%pip install itables==2.2.4 PyPDF2==3.0.1 --upgrade -qq
%pip install moviepy

In [None]:
data_bucket = session["bucket"]
region = sagemaker_resources["region"]
data_prefix = "bda/video"
output_prefix = "bda/video/output"

In [None]:
import boto3
import json
import uuid
import utils
from IPython.display import Video, Image, display

bda_client = boto3.client('bedrock-data-automation')
bda_runtime_client = boto3.client('bedrock-data-automation-runtime')
s3_client = boto3.client('s3')

# Look up the AWS account ID, which is needed to build the profile ARN
sts_client = boto3.client('sts')
account_id = sts_client.get_caller_identity()["Account"]

default_profile_arn = f"arn:aws:bedrock:{region}:{account_id}:data-automation-profile/us.data-automation-v1"

## Create a BDA project
To start a BDA job, you need a BDA project, which organizes both standard and custom output configurations. This project is reusable, allowing you to apply the same configuration to process multiple videos that share the same settings.

In the code snippet below, we create a BDA project with standard output configurations for the video modality. These configurations can be tailored to extract only the specific information you need. In this lab, we will enable the following video outputs:
- Scene summary
- Content moderation (visual and audio)
- Transcript
- IAB taxonomy classification of scenes

For a complete API reference for creating a BDA project, refer to this [document](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-data-automation/client/create_data_automation_project.html).

In [None]:
response = bda_client.create_data_automation_project(
    projectName=f'bda-workshop-video-project-moderation-{str(uuid.uuid4())[0:4]}',
    projectDescription='BDA workshop video sample project - content moderation',
    projectStage='DEVELOPMENT',
    standardOutputConfiguration={
        'video': {
            'extraction': {
                'category': {
                    'state': 'ENABLED',
                    'types': ['CONTENT_MODERATION', 'TRANSCRIPT']
                },
                'boundingBox': {
                    'state': 'DISABLED'
                }
            },
            'generativeField': {
                'state': 'ENABLED',
                'types': [
                    'CHAPTER_SUMMARY', 'IAB'
                ]
            }
        }
    }
)
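
If you create several projects that differ only in which outputs are enabled, it can help to build the `standardOutputConfiguration` dictionary in one place. The sketch below is one way to do that. `build_video_standard_output` is a hypothetical helper, not part of boto3, and it only reproduces the dictionary structure used in the cell above.

```python
def build_video_standard_output(extraction_types, generative_types, bounding_box=False):
    """Build the `standardOutputConfiguration` dict for the video modality.

    Hypothetical helper: it mirrors the structure passed to
    `create_data_automation_project` in this notebook and nothing more.
    """
    return {
        "video": {
            "extraction": {
                "category": {"state": "ENABLED", "types": list(extraction_types)},
                "boundingBox": {"state": "ENABLED" if bounding_box else "DISABLED"},
            },
            "generativeField": {"state": "ENABLED", "types": list(generative_types)},
        }
    }


# Same settings as the project created in this lab.
video_config = build_video_standard_output(
    extraction_types=["CONTENT_MODERATION", "TRANSCRIPT"],
    generative_types=["CHAPTER_SUMMARY", "IAB"],
)
```

The resulting `video_config` could be passed as `standardOutputConfiguration=video_config` when creating a project.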

The `create_data_automation_project` API returns the project ARN, which we will use to invoke the video analysis task.

In [None]:
video_project_arn = response.get("projectArn")
print("BDA video project ARN:", video_project_arn)

## Start an asynchronous BDA task to extract and analyze a video
In this section, we will use a sample video that contains unsafe content, and extract and analyze it with BDA using the configuration defined in the BDA project.

### Prepare the sample video

In [None]:
sample_video_path = './NetflixMeridian.mp4'
url = "https://ws-assets-prod-iad-r-pdx-f3b3f9f1a7d6a3d0.s3.us-west-2.amazonaws.com/7db2455e-0fa6-4f6d-9973-84daccd6421f/Netflix_Open_Content_Meridian.mp4"
# IPython expands {url} and {sample_video_path} before running the shell command
!curl "{url}" --output {sample_video_path}

Let's display the video. 

In [None]:
from IPython.display import Video
Video(sample_video_path, width=800)

To analyze the video using BDA, we need to upload it to an S3 bucket that BDA can access. 

In [None]:
s3_key = f'{data_prefix}/{sample_video_path.split("/")[-1]}'
s3_client.upload_file(sample_video_path, data_bucket, s3_key)

### Start BDA task
We will now invoke the BDA API to process the uploaded video. You need to provide the BDA project ARN that we created at the beginning of the lab and specify an S3 location where BDA will store the output results.

For a complete API reference for invoking a BDA async task, refer to this [document](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-data-automation-runtime/client/invoke_data_automation_async.html).

In [None]:
response = bda_runtime_client.invoke_data_automation_async(
    inputConfiguration={
        's3Uri': f's3://{data_bucket}/{s3_key}'
    },
    outputConfiguration={
        's3Uri': f's3://{data_bucket}/{output_prefix}'
    },
    dataAutomationConfiguration={
        'dataAutomationProjectArn': video_project_arn,
        'stage': 'DEVELOPMENT'
    },
    notificationConfiguration={
        'eventBridgeConfiguration': {
            'eventBridgeEnabled': False
        }
    },
    dataAutomationProfileArn=default_profile_arn
)

The `invoke_data_automation_async` API is asynchronous. It returns an invocation task identifier, `invocationArn`. We can then use another API `get_data_automation_status` to monitor the task's status until it completes.

> In production workloads, an event-driven pattern is recommended. Allow BDA to trigger the next step once the task is complete. This can be achieved by configuring the notificationConfiguration in the invoke task, which will send a notification to a subscribed AWS service, such as a Lambda function. Alternatively, you can set up an S3 trigger on the bucket where BDA will drop the results.
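
As a rough illustration of the S3 trigger option, the sketch below shows a Lambda handler that reacts to S3 object-created events and picks out BDA output configuration files. It assumes the output configuration file is named `job_metadata.json`; check the `outputConfiguration` S3 URI returned for your own job before relying on that name. The handler only collects URIs and leaves the follow-up processing as a placeholder.

```python
from urllib.parse import unquote_plus

# Assumption: BDA writes its output configuration under this file name.
# Verify it against the S3 URI returned by get_data_automation_status.
RESULT_SUFFIX = "job_metadata.json"


def lambda_handler(event, context):
    """Collect the S3 URIs of BDA output configuration files from an S3 event."""
    result_uris = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.endswith(RESULT_SUFFIX):
            result_uris.append(f"s3://{bucket}/{key}")
    # A real function would start the ad-matching step for each URI here.
    return {"result_uris": result_uris}
```

Filtering on the file name keeps the function from running its logic once for every result file that BDA writes under the output prefix.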

In [None]:
invocation_arn = response.get("invocationArn")
print("BDA task started:", invocation_arn)

In this lab, we will use the loop below to monitor the task by calling the `get_data_automation_status` API every 5 seconds until the task is complete.

This video will take less than 5 minutes to process.

In [None]:
import time
from IPython.display import clear_output
from datetime import datetime

status, status_response = None, None
while status not in ["Success","ServiceError","ClientError"]:
    status_response = bda_runtime_client.get_data_automation_status(
        invocationArn=invocation_arn
    )
    status = status_response.get("status")
    clear_output(wait=True)
    print(f"{datetime.now().strftime('%H:%M:%S')} : BDA video task: {status}")
    time.sleep(5)

output_config = status_response.get("outputConfiguration",{}).get("s3Uri")
print("Output configuration file:", output_config)
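
The loop above runs until BDA reports a terminal status, so a task that never finishes would keep the cell running indefinitely. A minimal sketch of the same polling logic with a timeout follows. `wait_for_bda_task` and its parameters are hypothetical names introduced for illustration; the status function is passed in so the logic can be exercised without calling AWS.

```python
import time

# Terminal statuses, taken from the polling loop in this notebook.
TERMINAL_STATUSES = {"Success", "ServiceError", "ClientError"}


def wait_for_bda_task(get_status, invocation_arn, poll_seconds=5,
                      timeout_seconds=900, sleep=time.sleep):
    """Poll until the task reaches a terminal status or the timeout elapses.

    `get_status` is any callable that accepts `invocationArn=...` and returns
    a dict with a "status" key, for example
    `bda_runtime_client.get_data_automation_status`.
    """
    waited = 0
    while True:
        response = get_status(invocationArn=invocation_arn)
        if response.get("status") in TERMINAL_STATUSES:
            return response
        if waited >= timeout_seconds:
            raise TimeoutError(
                f"BDA task still {response.get('status')} after {waited}s"
            )
        sleep(poll_seconds)
        waited += poll_seconds
```

In this notebook it could be called as `status_response = wait_for_bda_task(bda_runtime_client.get_data_automation_status, invocation_arn)`.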

## Access the BDA analysis result
The `get_data_automation_status` API returns the S3 URI of an output configuration file, which records where BDA wrote the extraction results. We will read this file to retrieve the result path.

In [None]:
config_data = utils.read_json_on_s3(output_config,s3_client)
print(json.dumps(config_data, indent=4))

As shown above, the BDA output configuration file contains metadata about the BDA result, including the job ID, status, modality, and the S3 location of the actual result JSON. We will now download this result file to verify the output.

In [None]:
from IPython.display import JSON

result_uri = config_data["output_metadata"][0]["segment_metadata"][0]["standard_output_path"]
result_data = utils.read_json_on_s3(result_uri,s3_client)

JSON(result_data)
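
Both cells above rely on `utils.read_json_on_s3`, a workshop helper whose implementation is not shown in this notebook. The sketch below is a guess at what such a helper might do, not the workshop's actual code: it splits an `s3://bucket/key` URI and loads the object as JSON. `read_json_from_s3_uri` is a hypothetical name, and the client is any object that offers boto3's `get_object` call.

```python
import json


def read_json_from_s3_uri(s3_uri, s3_client):
    """Download the object behind an `s3://bucket/key` URI and parse it as JSON."""
    scheme = "s3://"
    if not s3_uri.startswith(scheme):
        raise ValueError(f"Not an S3 URI: {s3_uri}")
    # Everything up to the first "/" is the bucket; the rest is the object key.
    bucket, _, key = s3_uri[len(scheme):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI needs both a bucket and a key: {s3_uri}")
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    return json.loads(body)
```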

## Review the result
The BDA video analysis results provide a detailed breakdown of information, organized by video and scene levels. 
> A video scene is a sequence of shots that form a coherent unit of action or narrative within the video.

Take a moment to view the details of the metadata for one of the scenes.  Scenes are video segments, so they have a start time, an end time and a duration.  These segment timestamps can be used to determine when to display an ad to coincide with a scene.


In [None]:
JSON(result_data['chapters'][9])

In [None]:
start = result_data['chapters'][9]['start_timestamp_millis']/1000
end = result_data['chapters'][9]['end_timestamp_millis']/1000
shot_url = f'{url}#t={start},{end}'
Video(url=shot_url, width=640, height=360)
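
To see the timing of every scene at a glance, the millisecond timestamps can be turned into readable timecodes. The sketch below assumes each scene carries the `chapter_index`, `start_timestamp_millis`, and `end_timestamp_millis` fields used above. `millis_to_timecode` and `scene_windows` are hypothetical helper names.

```python
def millis_to_timecode(millis):
    """Format a millisecond offset as HH:MM:SS.mmm."""
    seconds, ms = divmod(int(millis), 1000)
    minutes, seconds = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


def scene_windows(chapters):
    """Return one (index, start, end, duration_seconds) tuple per scene."""
    windows = []
    for chapter in chapters:
        start = chapter["start_timestamp_millis"]
        end = chapter["end_timestamp_millis"]
        windows.append((
            chapter["chapter_index"],
            millis_to_timecode(start),
            millis_to_timecode(end),
            (end - start) / 1000,
        ))
    return windows
```

Calling `scene_windows(result_data['chapters'])` would list the display window available for an ad in each scene.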

### IAB Categories

The IAB categories, `iab_categories`, have a label and a level, so that items in the same label hierarchy can be matched at different levels of detail.  For example, here are the first two levels (aka Tiers) of the IAB taxonomy for Attractions:

![IAB Attractions](static/images/IAB-Attractions-1and2.png)


The BDA standard output for scenes contains the IAB category classification of each scene for which a valid categorization could be found. If no suitable category is found, BDA leaves `iab_categories` empty. The IAB categories and levels BDA generated for the scenes in the sample video are displayed below.

In [None]:
print("IAB Categories")
for chapter in result_data['chapters']:
    chapter_str = f'==Scene {chapter["chapter_index"]}: '
    for iab_cat in chapter['iab_categories']:
        chapter_str += f' ({iab_cat["category"]}, {iab_cat["taxonomy_level"]})'
    print(chapter_str)
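
Before choosing ads, it can be useful to know which categories occur most often in the video. The sketch below counts the scenes carrying each category at a given taxonomy level, assuming the `category` and `taxonomy_level` fields printed above. `count_categories` is a hypothetical helper name.

```python
from collections import Counter


def count_categories(chapters, level=1):
    """Count how many scenes carry each IAB category at the given taxonomy level."""
    counts = Counter()
    for chapter in chapters:
        # A scene may have no categories, so default to an empty list.
        # The set ensures a category is counted once per scene.
        labels = {
            cat["category"]
            for cat in chapter.get("iab_categories", [])
            if cat["taxonomy_level"] == level
        }
        counts.update(labels)
    return counts
```

For example, `count_categories(result_data['chapters']).most_common(3)` would show the three level 1 categories that appear in the most scenes.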

## Use IAB Categories to match scenes to ads

It is straightforward to match ads to scenes if they share the same content taxonomy. Imagine we have the following list of ad images we can use for overlays.


In [None]:
ads = {
    "Automotive": {
        "overlay_image": "static/ads/CarAd.png",
        "level": 1
    },
    "Business and Finance": {
        "overlay_image": "static/ads/BankAd.png",
        "level": 1
    },
    "Travel": {
        "overlay_image": "static/ads/CruiseAd.png",
        "level": 1
    },
    "Sports": {
        "overlay_image": "static/ads/SportsAd.png", 
        "level": 1
    }
}


### Match ads to scenes


We'll loop over the scenes, looking for scenes that have categories from level 1 of the IAB taxonomy. If you have a lot of potential ads, you could choose to match on a deeper, more specific level of the taxonomy.

In [None]:
from IPython.display import HTML

def create_video_with_overlay(url, overlay_image):
    return HTML(f'''
        <div class="video-container" style="position: relative; width: 640px; height: 360px;">
            <video width="640" height="360" controls>
                <source src="{url}" type="video/mp4">
                Your browser does not support the video tag.
            </video>
            <img class="overlay" src="{overlay_image}" style="
                position: absolute;
                bottom: 0;
                left: 0;
                width: 100%;
                height: 25%;  /* Takes up bottom 25% of video height */
                opacity: 1;
                pointer-events: none;
                object-fit: cover;  /* Ensures image covers the area properly */
            ">
        </div>
    ''')

LEVEL = 1
for chapter in result_data['chapters']:
    if len(chapter['iab_categories']) > 0:
        print(f"\n===== SCENE: {chapter['chapter_index']}\n")
        start = chapter['start_timestamp_millis']/1000
        end = chapter['end_timestamp_millis']/1000
        shot_url = f'{url}#t={start},{end}'
        
        for iab_cat in chapter['iab_categories']:
            if iab_cat['taxonomy_level'] == LEVEL:
                # Only report a match when the ad inventory has this category.
                ad = ads.get(iab_cat['category'])
                if ad is None:
                    print(f"== No ad available for category: {iab_cat['category']}\n")
                    continue
                print(f"== Matching ad found for category: {iab_cat['category']}\n")
                display(create_video_with_overlay(shot_url, ad['overlay_image']))
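
The loop above matches on a single taxonomy level. If the ad inventory mixes levels, one option is to prefer the most specific category that has an ad and fall back to broader ones. The sketch below does that and returns a schedule of overlay cues instead of displaying videos. `pick_ad` and `build_ad_schedule` are hypothetical names, and the code assumes ad entries shaped like the `ads` dictionary above, keyed by category label with a `level` field.

```python
def pick_ad(iab_categories, ads):
    """Return (category, ad) for the most specific category that has an ad, else None."""
    # Sort deepest taxonomy level first so specific matches win over broad ones.
    ordered = sorted(iab_categories, key=lambda c: c["taxonomy_level"], reverse=True)
    for cat in ordered:
        ad = ads.get(cat["category"])
        # Require the ad to be registered at the same level as the scene category.
        if ad is not None and ad["level"] == cat["taxonomy_level"]:
            return cat["category"], ad
    return None


def build_ad_schedule(chapters, ads):
    """List one overlay cue for each scene that has a matching ad."""
    schedule = []
    for chapter in chapters:
        match = pick_ad(chapter.get("iab_categories", []), ads)
        if match is None:
            continue
        category, ad = match
        schedule.append({
            "scene": chapter["chapter_index"],
            "start_seconds": chapter["start_timestamp_millis"] / 1000,
            "end_seconds": chapter["end_timestamp_millis"] / 1000,
            "category": category,
            "overlay_image": ad["overlay_image"],
        })
    return schedule
```

A schedule like this could be handed to a player or an ad decision step, which keeps the matching logic separate from how the overlay is rendered.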


## Summary
In this lab, we used BDA to analyze a sample video, detect its scenes, and classify each scene with the IAB Content Taxonomy so that we could match ads to the content.

## Clean up

Delete the BDA project and the uploaded video from S3.

In [None]:
# delete BDA project
response = bda_client.delete_data_automation_project(
    projectArn=video_project_arn
)
response

In [None]:
# delete uploaded video from S3
s3_client.delete_object(Bucket=data_bucket, Key=s3_key)
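
The cells above leave the BDA result files in the bucket. A minimal sketch for removing everything under an S3 prefix follows. `delete_s3_prefix` is a hypothetical helper built on the boto3 `list_objects_v2` paginator and `delete_objects` call. It deletes every object whose key starts with the prefix, including results from other runs, so use it with care.

```python
def delete_s3_prefix(s3_client, bucket, prefix):
    """Delete every object whose key starts with `prefix`; return the number removed."""
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        # Each page holds at most 1,000 keys, which is also the delete_objects limit.
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted
```

In this notebook it could be called as `delete_s3_prefix(s3_client, data_bucket, output_prefix + "/")`. The trailing slash avoids matching sibling prefixes that merely start with the same characters.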

## Continue to the next section of the workshop

You can either go on to Part 2 to work with Amazon Nova workflows or conclude the workshop by moving to the Additional Resources and Cleanup sections.

1. Continue to [Start of Part 2: Visual video segments: frames, shots and scenes](../2-media-analysis-using-amazon-nova/01A-visual-segments-frames-shots-scenes.ipynb)
2. Continue to [Additional Resources](../09-resources.ipynb)