# Video Content Moderation

Video content moderation is a common use case across industries, including Social Media, Media & Entertainment, Advertising, Education, and more. Customers need to block unsafe or toxic content that could damage brand safety or violate regional regulations.

BDA Video offers features that help customers detect unsafe content in both the visual and audio components of videos. With blueprints, customers can extend their moderation policies to flexibly support their business needs.

In this lab, we will use BDA to analyze a sample video containing unsafe and toxic content:

![video moderation](../static/bda-video-cm.png)

## Prerequisites

In [None]:
%pip install "boto3>=1.35.76" itables==2.2.4 PyPDF2==3.0.1 --upgrade -qq
%pip install moviepy

For a self-hosted workshop, we recommend creating a new S3 bucket in the same region where you plan to run the workshop. You can name it `bda-workshop-YOUR_ACCOUNT_ID-YOUR_REGION`.
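
The suggested naming pattern can be generated rather than typed by hand. The sketch below is a minimal illustration: `workshop_bucket_name` is a hypothetical helper, and the account ID and region are placeholder values. It also checks the result against the basic S3 bucket naming rules (3 to 63 characters, lowercase letters, digits, dots, and hyphens, starting and ending with a letter or digit).

```python
import re


def workshop_bucket_name(account_id: str, region: str) -> str:
    """Build the suggested workshop bucket name and validate its basic shape."""
    name = f"bda-workshop-{account_id}-{region}".lower()
    # 3-63 characters; lowercase letters, digits, dots, hyphens;
    # must start and end with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]", name):
        raise ValueError(f"Not a valid S3 bucket name: {name!r}")
    return name


# Placeholder account ID and region, for illustration only.
print(workshop_bucket_name("123456789012", "us-west-2"))
```

In a real account, the account ID would typically come from STS and the region from your boto3 session instead of being hard-coded.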

In [None]:
data_bucket = "<Enter your bucket name here>"
data_prefix = "bda-workshop/video"
output_prefix = "bda-workshop/video/output"

In [None]:
import boto3
import json
import uuid
import utils

bda_client = boto3.client('bedrock-data-automation')
bda_runtime_client = boto3.client('bedrock-data-automation-runtime')
s3_client = boto3.client('s3')

## Create a BDA project
To start a BDA job, you need a BDA project, which organizes both standard and custom output configurations. This project is reusable, allowing you to apply the same configuration to process multiple videos that share the same settings.

In the code snippet below, we create a BDA project with standard output configurations for the video modality. These configurations can be tailored to extract only the specific information you need. In this lab, we will enable the following video outputs:
- Full video summary
- Content moderation (visual and audio)
- Text in video (without bounding boxes)
- Audio transcript

For a complete API reference for creating a BDA project, refer to this [document](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-data-automation/client/create_data_automation_project.html).

In [None]:
response = bda_client.create_data_automation_project(
    projectName=f'bda-workshop-video-project-moderation-{str(uuid.uuid4())[0:4]}',
    projectDescription='BDA workshop video sample project - content moderation',
    projectStage='DEVELOPMENT',
    standardOutputConfiguration={
        'video': {
            'extraction': {
                'category': {
                    'state': 'ENABLED',
                    'types': ['CONTENT_MODERATION','TEXT_DETECTION','TRANSCRIPT']
                },
                'boundingBox': {
                    'state': 'DISABLED'
                }
            },
            'generativeField': {
                'state': 'ENABLED',
                'types': [
                    'VIDEO_SUMMARY',
                ]
            }
        }
    }
)

The `create_data_automation_project` API returns the project ARN, which we will use to invoke the video analysis task.

In [None]:
video_project_arn = response.get("projectArn")
print("BDA video project ARN:", video_project_arn)

## Start an asynchronous BDA task to extract and analyze a video
In this section, we will take a sample video that contains unsafe content, then extract and analyze it using BDA, applying the configuration defined in the BDA project.

### Prepare the sample video

In [None]:
# Download sample video
sample_video_path = 'content-moderation-demo.mp4'
source_url = f'https://d1xvhy22zmw77y.cloudfront.net/tmp/{sample_video_path}'

!curl {source_url} --output {sample_video_path}

Let's display the video. This short sample contains unsafe visual scenes, including alcohol, tobacco, and suggestive content, as well as toxic audio components. It also features text in the visual portion, which is considered unsafe. Our mission is to extract and analyze this information for content moderation purposes.

In [None]:
from IPython.display import Video
Video(sample_video_path, width=800)

To analyze the video using BDA, we need to upload it to an S3 bucket that BDA can access. 

In [None]:
s3_key = f'{data_prefix}/{sample_video_path.split("/")[-1]}'
s3_client.upload_file(sample_video_path, data_bucket, s3_key)

### Start BDA task
We will now invoke the BDA API to process the uploaded video. You need to provide the BDA project ARN that we created at the beginning of the lab and specify an S3 location where BDA will store the output results.

For a complete API reference for invoking a BDA async task, refer to this [document](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-data-automation-runtime/client/invoke_data_automation_async.html).

In [None]:
response = bda_runtime_client.invoke_data_automation_async(
    inputConfiguration={
        's3Uri': f's3://{data_bucket}/{s3_key}'
    },
    outputConfiguration={
        's3Uri': f's3://{data_bucket}/{output_prefix}'
    },
    dataAutomationConfiguration={
        'dataAutomationArn': video_project_arn,
        'stage': 'DEVELOPMENT'
    },
    notificationConfiguration={
        'eventBridgeConfiguration': {
            'eventBridgeEnabled': False
        }
    }
)

The `invoke_data_automation_async` API is asynchronous. It returns an invocation task identifier, `invocationArn`. We can then use another API, `get_data_automation_status`, to monitor the task's status until it completes.

> In production workloads, an event-driven pattern is recommended. Allow BDA to trigger the next step once the task is complete. This can be achieved by configuring the notificationConfiguration in the invoke task, which will send a notification to a subscribed AWS service, such as a Lambda function. Alternatively, you can set up an S3 trigger on the bucket where BDA will drop the results.
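
As a rough illustration of the S3-trigger variant, the sketch below shows what a Lambda handler subscribed to object-created events on the output bucket might look like. It is an assumption-laden outline, not part of the workshop: the handler only collects the S3 URIs of `job_metadata.json` files (the file name used by the output configuration in this lab) and leaves the downstream step as a comment. The bucket and key in the sample event are placeholders.

```python
from urllib.parse import unquote_plus


def lambda_handler(event, context):
    """Collect S3 URIs of BDA job metadata files from an S3 event notification."""
    metadata_uris = []
    for record in event.get("Records", []):
        bucket = record["s3"]["bucket"]["name"]
        # Object keys arrive URL-encoded in S3 event notifications.
        key = unquote_plus(record["s3"]["object"]["key"])
        if key.endswith("job_metadata.json"):
            metadata_uris.append(f"s3://{bucket}/{key}")
    # A real handler would read each metadata file and start the next step here.
    return {"metadata_uris": metadata_uris}


# Placeholder event with the standard S3 notification shape.
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "output/job+1/job_metadata.json"}}},
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "output/job+1/other.json"}}},
    ]
}
print(lambda_handler(sample_event, None))
```

Filtering on the metadata file name keeps the function from firing its downstream step once per result object.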

In [None]:
invocation_arn = response.get("invocationArn")
print("BDA task started:", invocation_arn)

In this lab, we will use the loop below to monitor the task by calling the `get_data_automation_status` API every 5 seconds until the task is complete.

This video will take less than 5 minutes to process.

In [None]:
import time
from IPython.display import clear_output
from datetime import datetime

status, status_response = None, None
while status not in ["Success","ServiceError","ClientError"]:
    status_response = bda_runtime_client.get_data_automation_status(
        invocationArn=invocation_arn
    )
    status = status_response.get("status")
    clear_output(wait=True)
    print(f"{datetime.now().strftime('%H:%M:%S')} : BDA video task: {status}")
    time.sleep(5)

output_config = status_response.get("outputConfiguration",{}).get("s3Uri")
print("Output configuration file:", output_config)
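
The loop above runs until a terminal status arrives, so it would spin forever if the task stalled. One possible refinement, sketched below, wraps the polling in a function with a timeout. `wait_for_bda_task` and its parameters are hypothetical names; the status function is passed in so the logic can be exercised without calling AWS, and the non-terminal status in the demo is a placeholder value.

```python
import time

# Terminal statuses used by the polling loop in this lab.
TERMINAL_STATUSES = {"Success", "ServiceError", "ClientError"}


def wait_for_bda_task(get_status, invocation_arn, poll_seconds=5,
                      timeout_seconds=600, sleep=time.sleep):
    """Poll until the task reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = get_status(invocationArn=invocation_arn)
        status = response.get("status")
        if status in TERMINAL_STATUSES:
            return response
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"BDA task still {status!r} after {timeout_seconds}s")
        sleep(poll_seconds)


# Demo with a fake status function (placeholder responses, no AWS calls).
fake_responses = iter([{"status": "InProgress"}, {"status": "Success"}])
result = wait_for_bda_task(lambda invocationArn: next(fake_responses),
                           "arn:placeholder", sleep=lambda seconds: None)
print(result["status"])
```

In the notebook, the call would look like `wait_for_bda_task(bda_runtime_client.get_data_automation_status, invocation_arn)`.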

## Access the BDA analysis result
The `get_data_automation_status` API returns an S3 URI containing the result configuration, which provides the S3 location where BDA outputs the extraction results. We will then parse this file to retrieve the result path.

In [None]:
config_data = utils.read_json_on_s3(output_config,s3_client)
print(json.dumps(config_data, indent=4))

As shown above, the BDA output configuration file contains metadata about the BDA result, including the job ID, status, modality, and the S3 location of the actual result JSON. We will now download this result file to verify the output.

In [None]:
from IPython.display import JSON

result_uri = config_data["output_metadata"][0]["segment_metadata"][0]["standard_output_path"]
result_data = utils.read_json_on_s3(result_uri,s3_client)

JSON(result_data)
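
The `utils.read_json_on_s3` helper ships with the workshop and its source is not shown in this lab. The sketch below is one way such a helper could be written, not the workshop's actual implementation; `split_s3_uri`, `read_json_from_s3`, and `standard_output_paths` are hypothetical names. The last function collects every `standard_output_path` instead of only the first, which may be useful if the metadata lists more than one segment.

```python
import json


def split_s3_uri(s3_uri):
    """Split 's3://bucket/key' into (bucket, key)."""
    if not s3_uri.startswith("s3://"):
        raise ValueError(f"Not an S3 URI: {s3_uri!r}")
    bucket, _, key = s3_uri[len("s3://"):].partition("/")
    if not bucket or not key:
        raise ValueError(f"S3 URI needs a bucket and a key: {s3_uri!r}")
    return bucket, key


def read_json_from_s3(s3_uri, s3_client):
    """Download an object with the given boto3 S3 client and parse it as JSON."""
    bucket, key = split_s3_uri(s3_uri)
    body = s3_client.get_object(Bucket=bucket, Key=key)["Body"].read()
    return json.loads(body)


def standard_output_paths(config_data):
    """Collect every standard_output_path listed in the job metadata."""
    return [
        segment["standard_output_path"]
        for asset in config_data.get("output_metadata", [])
        for segment in asset.get("segment_metadata", [])
        if segment.get("standard_output_path")
    ]


print(split_s3_uri("s3://my-bucket/output/job_metadata.json"))
```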

## Review the result
The BDA video analysis results provide a detailed breakdown of information, organized by video and scene levels. 
> A video scene is a sequence of shots that form a coherent unit of action or narrative within the video.

In this lab, we are using a short sample video that contains only one scene, because our focus is on reviewing the moderation analysis output. For more detail on scene-level and frame-level metadata, such as summaries, IAB categories, and bounding boxes, refer to the previous lab.

### Full video transcription
At the video level, we receive the full transcript based on the video's audio, with speakers identified.

In [None]:
print(result_data["video"]["transcript"]["representation"]["text"])

### Audio transcripts with toxicity analysis
You can find two fields at the scene level containing audio transcripts and audio moderation data: `audio_segments` and `content_moderation`. The sequence of moderation (toxicity) analysis corresponds to the transcripts. Below, we display them side by side.

In [None]:
import matplotlib.pyplot as plt

for scene in result_data["scenes"]:
    for idx in range(0,len(scene["audio_segments"])):
        print(f'[{scene["audio_segments"][idx]["start_timestamp_millis"]/1000} - {scene["audio_segments"][idx]["end_timestamp_millis"]/1000}] {scene["audio_segments"][idx]["text"]}')
        
        data = scene["content_moderation"][idx]
        # Extract category names and their confidence values
        categories = [item["category"] for item in data["moderation_categories"]]
        confidence_values = [item["confidence"] for item in data["moderation_categories"]]
        
        # Plotting
        plt.figure(figsize=(10, 6))
        plt.bar(categories, confidence_values, color='skyblue')
        
        # Add title and labels
        plt.title(f'Moderation Confidence by Category - overall confidence {data["confidence"]*100}%')
        plt.xlabel('Moderation Categories')
        plt.ylabel('Confidence')
        plt.xticks(rotation=45, ha='right')
        
        # Display the plot
        plt.tight_layout()
        plt.show()

Each bar chart above corresponds to one audio segment, which is printed with its start and end timestamps. The chart visualizes the toxicity analysis of that segment's transcript: the x-axis shows the toxicity categories and the y-axis shows the corresponding confidence scores.

BDA video toxicity analysis supports the following built-in toxicity categories, each with a confidence score between 0 and 1:
- profanity
- hate_speech
- sexual
- insult
- violence_or_threat
- graphic
- harassment_or_abuse
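
For an automated moderation policy, the confidence scores are usually compared against a threshold instead of being plotted. The sketch below assumes the same `audio_segments` and `content_moderation` layout used in the cell above; `flag_toxic_segments`, the 0.5 threshold, and the sample data are illustrative choices, not BDA defaults.

```python
def flag_toxic_segments(result_data, threshold=0.5):
    """Return audio segments with at least one category at or above the threshold."""
    flagged = []
    for scene in result_data.get("scenes", []):
        segments = scene.get("audio_segments", [])
        moderation = scene.get("content_moderation", [])
        # zip stops at the shorter list, so a length mismatch cannot raise IndexError.
        for segment, mod in zip(segments, moderation):
            hits = {
                item["category"]: item["confidence"]
                for item in mod.get("moderation_categories", [])
                if item["confidence"] >= threshold
            }
            if hits:
                flagged.append({
                    "start_s": segment["start_timestamp_millis"] / 1000,
                    "end_s": segment["end_timestamp_millis"] / 1000,
                    "text": segment["text"],
                    "categories": hits,
                })
    return flagged


# Placeholder data shaped like the fields read in the cell above.
sample_result = {"scenes": [{
    "audio_segments": [
        {"start_timestamp_millis": 0, "end_timestamp_millis": 2500, "text": "hello"},
        {"start_timestamp_millis": 2500, "end_timestamp_millis": 5000, "text": "rude words"},
    ],
    "content_moderation": [
        {"confidence": 0.1, "moderation_categories": [
            {"category": "profanity", "confidence": 0.1}]},
        {"confidence": 0.9, "moderation_categories": [
            {"category": "profanity", "confidence": 0.9},
            {"category": "insult", "confidence": 0.3}]},
    ],
}]}
print(flag_toxic_segments(sample_result))
```

The right threshold depends on the policy: a lower value catches more borderline content at the cost of more false positives.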

### Visual content moderation at the frame level with confidence scores
BDA also analyzes the visual portion of the video and provides content moderation labels at the frame level. You can find the visual moderation output for each scene under the respective frames.

In [None]:
from moviepy.video.io.VideoFileClip import VideoFileClip
from PIL import Image, ImageDraw
import matplotlib.pyplot as plt

with VideoFileClip(sample_video_path) as video_clip:
    for scene in result_data["scenes"]:
        for frame in scene["frames"]:
            if frame.get("content_moderation"):
                for cm in frame["content_moderation"]:
                    timestamp = frame["timestamp_millis"]/1000
                    img_frame = video_clip.get_frame(timestamp)  
                    frame_image = Image.fromarray(img_frame)
    
                    plt.figure(figsize=(10, 6))
                    plt.imshow(frame_image)
                    plt.title(f"Frame at {timestamp} seconds, {cm['category']}  ({cm['confidence']*100}%) ")
                    plt.axis("off")
                    plt.show()
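
Plotting every flagged frame works for a short clip, but a longer video may need a compact summary. The sketch below groups the frame-level labels by category, assuming the same `frames`, `content_moderation`, `category`, `confidence`, and `timestamp_millis` fields read in the cell above. `summarize_visual_moderation` is a hypothetical helper, and the category names in the sample data are placeholders.

```python
from collections import defaultdict


def summarize_visual_moderation(result_data, min_confidence=0.0):
    """Group frame-level moderation labels by category."""
    summary = defaultdict(lambda: {"max_confidence": 0.0, "timestamps_s": []})
    for scene in result_data.get("scenes", []):
        for frame in scene.get("frames", []):
            # Frames without detections may omit the field entirely.
            for cm in frame.get("content_moderation") or []:
                if cm["confidence"] < min_confidence:
                    continue
                entry = summary[cm["category"]]
                entry["max_confidence"] = max(entry["max_confidence"], cm["confidence"])
                entry["timestamps_s"].append(frame["timestamp_millis"] / 1000)
    return dict(summary)


# Placeholder frames for illustration only.
sample_frames = {"scenes": [{"frames": [
    {"timestamp_millis": 1000,
     "content_moderation": [{"category": "Alcohol", "confidence": 0.8}]},
    {"timestamp_millis": 2000},
    {"timestamp_millis": 3000,
     "content_moderation": [{"category": "Alcohol", "confidence": 0.95},
                            {"category": "Tobacco", "confidence": 0.6}]},
]}]}
for category, info in summarize_visual_moderation(sample_frames).items():
    print(category, info)
```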

### Frame level text extraction and confidence scores
Text extraction, along with confidence scores, is available at the frame level. In the output JSON structure, frames are organized under each scene with defined start and end times. If text is detected at a given frame timestamp, you can find `text_words` and `text_lines` included at the frame level.

Let's plot the frames with detected text.

In [None]:
with VideoFileClip(sample_video_path) as video_clip:
    for scene in result_data["scenes"]:
        for frame in scene["frames"]:
            # Collect the text lines detected in this frame, if any
            txts = []
            if frame.get("text_lines"):
                for tl in frame["text_lines"]:
                    txts.append(tl["text"])
            if txts:
                # Compute the timestamp before grabbing the frame image
                timestamp = frame["timestamp_millis"]/1000
                img_frame = video_clip.get_frame(timestamp)
                frame_image = Image.fromarray(img_frame)

                plt.figure(figsize=(10, 6))
                plt.imshow(frame_image)
                plt.title(f"Frame at {timestamp} seconds: {'; '.join(txts)}")
                plt.axis("off")
                plt.show()
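
Once the text is extracted, a simple moderation check could compare it against a list of disallowed terms. The sketch below is only an illustration of that idea: `collect_text_lines` and `match_blocked_terms` are hypothetical helpers, the blocked terms and sample frames are placeholders, and plain substring matching is a deliberately naive stand-in for a real policy.

```python
def collect_text_lines(result_data):
    """Map each distinct text line to the timestamp (seconds) where it is first encountered."""
    first_seen = {}
    for scene in result_data.get("scenes", []):
        for frame in scene.get("frames", []):
            for tl in frame.get("text_lines") or []:
                first_seen.setdefault(tl["text"], frame["timestamp_millis"] / 1000)
    return first_seen


def match_blocked_terms(text_lines, blocked_terms):
    """Return the text lines containing any blocked term (case-insensitive)."""
    lowered = [term.lower() for term in blocked_terms]
    return [line for line in text_lines
            if any(term in line.lower() for term in lowered)]


# Placeholder frames and terms for illustration only.
sample_text = {"scenes": [{"frames": [
    {"timestamp_millis": 500, "text_lines": [{"text": "Welcome"}]},
    {"timestamp_millis": 1500, "text_lines": [{"text": "Welcome"},
                                              {"text": "Buy CHEAP stuff"}]},
    {"timestamp_millis": 2500},
]}]}
lines = collect_text_lines(sample_text)
print(lines)
print(match_blocked_terms(lines, ["cheap"]))
```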

### Analyze the extracted text with a blueprint - TODO when the feature is GA

## Summary
In this lab, we used BDA to extract and analyze a sample video containing unsafe and toxic content. The BDA output includes the visual and audio moderation detections, transcript, and detected text specified in the BDA project configuration.

## Clean up

Delete the BDA project, the uploaded video, and the results from S3.

In [None]:
# delete BDA project
response = bda_client.delete_data_automation_project(
    projectArn=video_project_arn
)
response

In [None]:
# delete uploaded video from S3
s3_client.delete_object(Bucket=data_bucket, Key=s3_key)

In [None]:
# delete results from S3
utils.delete_s3_folder(data_bucket, output_config.replace("job_metadata.json", ""), s3_client)
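
The `utils.delete_s3_folder` helper is part of the workshop utilities and is not shown here. For readers without that module, the sketch below is one possible way to delete everything under a key prefix with a boto3 S3 client. `delete_s3_prefix` is a hypothetical name, and unlike the call above it expects a plain key prefix, not an `s3://` URI.

```python
def delete_s3_prefix(bucket, prefix, s3_client):
    """Delete every object under the prefix and return how many were deleted."""
    deleted = 0
    paginator = s3_client.get_paginator("list_objects_v2")
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys = [{"Key": obj["Key"]} for obj in page.get("Contents", [])]
        if keys:
            # Each page holds at most 1,000 keys, which fits the
            # per-request limit of delete_objects.
            s3_client.delete_objects(Bucket=bucket, Delete={"Objects": keys})
            deleted += len(keys)
    return deleted
```

A call such as `delete_s3_prefix(data_bucket, output_prefix, s3_client)` would remove all results written under the output prefix, so it should only be pointed at a prefix dedicated to this lab.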