# TwelveLabs Pegasus on Amazon Bedrock Workshop

TwelveLabs builds multimodal AI models for video understanding and analysis, supporting video search, analysis, and content generation. Amazon Bedrock offers two TwelveLabs models:

- TwelveLabs Pegasus 1.2, which provides video understanding and analysis.
- TwelveLabs Marengo Embed 2.7, which generates embeddings for video, text, audio, and image content.

With these models, developers can build applications that process, analyze, and derive insights from video data at scale.

## Part 0: Setup

### Dependencies

In [1]:
%pip install -r requirements.txt -Uq

Note: you may need to restart the kernel to use updated packages.


In [2]:
import boto3, botocore
import json
import re
import uuid
import time
from IPython.display import clear_output, HTML, display, Image
from opensearchpy import AWSV4SignerAuth, OpenSearch, RequestsHttpConnection

### Configure boto3

In [3]:
AWS_REGION = "us-west-2" # TODO: Replace with your AWS region

In [7]:
# Initialize AWS session
session = boto3.Session(profile_name='tl-dev-oregon') # TODO: Replace with your AWS profile

# Initialize AWS clients
bedrock_client = session.client('bedrock-runtime', region_name=AWS_REGION)
s3_client = session.client('s3')

### Configure S3

In [8]:
# S3 Configuration
S3_BUCKET_NAME = "tl-bedrock-samples-us-west-2" # TODO: Replace with your S3 bucket name
S3_VIDEOS_PATH = "videos"

# Validate S3 bucket name
if S3_BUCKET_NAME == "<YOUR_S3_BUCKET_NAME>" or S3_BUCKET_NAME == "":
    raise ValueError("Please replace <YOUR_S3_BUCKET_NAME> with your S3 bucket name")

#### Set up the sample dataset in the S3 bucket

In [9]:
# AWS Account ID for S3 bucket ownership
aws_account_id = session.client('sts').get_caller_identity()["Account"]

print(f"AWS Account ID: {aws_account_id}")
print(f"S3 Bucket: {S3_BUCKET_NAME}")
print(f"S3 Videos Path: {S3_VIDEOS_PATH}")

# Verify bucket access
try:
    s3_client.head_bucket(Bucket=S3_BUCKET_NAME)
    print(f"✅ Successfully connected to S3 bucket: {S3_BUCKET_NAME}")
except Exception as e:
    print(f"❌ Error accessing S3 bucket: {e}")
    print("Please ensure the bucket exists and you have proper permissions.")

AWS Account ID: 026090552520
S3 Bucket: tl-bedrock-samples-us-west-2
S3 Videos Path: videos
✅ Successfully connected to S3 bucket: tl-bedrock-samples-us-west-2


#### Netflix Open Content

[Netflix Open Content](https://opencontent.netflix.com/) is openly available content released under the [Creative Commons Attribution 4.0 International Public License](https://creativecommons.org/licenses/by/4.0/legalcode).

The assets are available for download at: http://download.opencontent.netflix.com/

We use a subset of these videos to demonstrate the TwelveLabs models on Amazon Bedrock.

In [10]:
# Sample video S3 URIs
sample_videos = [
    # 's3://download.opencontent.netflix.com/TechblogAssets/CosmosLaundromat/encodes/CosmosLaundromat_2048x858_24fps_SDR.mp4',
    # 's3://download.opencontent.netflix.com/TechblogAssets/Meridian/encodes/Meridian_3840x2160_5994fps_SDR.mp4',
    's3://download.opencontent.netflix.com/TechblogAssets/Sparks/encodes/Sparks_4096x2160_5994fps_SDR.mp4'
]

In [11]:
# Unsigned S3 client
public_s3_client = boto3.client('s3', config=botocore.client.Config(signature_version=botocore.UNSIGNED))

In [12]:
def parse_s3_uri(s3_uri: str) -> tuple[str, str]:
    """
    Parses an S3 URI like s3://bucket-name/path/to/object and returns (bucket, key)

    Args:
        s3_uri (str): The S3 URI to parse
        
    Returns:
        tuple[str, str]: The bucket and key
    """
    pattern = r'^s3://([^/]+)/(.+)$'
    match = re.match(pattern, s3_uri)
    if not match:
        raise ValueError(f"Invalid S3 URI format: {s3_uri}")
    return match.group(1), match.group(2)


def copy_public_s3_object_to_private_bucket(public_s3_uri: str, dest_bucket: str, dest_key: str, aws_profile: str = 'default') -> None:
    """
    Copies a public S3 object to a private bucket

    The object is read with an anonymous client and written with the
    notebook's authenticated `s3_client`.

    Args:
        public_s3_uri (str): The S3 URI of the public object to copy
        dest_bucket (str): The name of the private bucket to copy to
        dest_key (str): The key of the object to copy to
        aws_profile (str): Currently unused; the upload goes through the
            notebook-level `s3_client`, which already carries the profile
    """

    # Parse source bucket and key
    source_bucket, source_key = parse_s3_uri(public_s3_uri)

    # Anonymous client to read public object
    anon_s3 = boto3.client('s3', config=botocore.client.Config(signature_version=botocore.UNSIGNED))

    print(f"Downloading from {public_s3_uri}...")
    response = anon_s3.get_object(Bucket=source_bucket, Key=source_key)

    # Stream the body to the destination instead of reading the whole video into memory
    print(f"Uploading to s3://{dest_bucket}/{dest_key} ...")
    s3_client.upload_fileobj(response['Body'], dest_bucket, dest_key)

    print("✅ Copy completed successfully!")

In [None]:
# Copy videos to the S3 bucket
for video_uri in sample_videos:
    # Extract the filename from the S3 key
    _, src_key = parse_s3_uri(video_uri)
    filename = src_key.split("/")[-1]
    dest_key = f"{S3_VIDEOS_PATH}/{filename}"
    copy_public_s3_object_to_private_bucket(
        public_s3_uri=video_uri,
        dest_bucket=S3_BUCKET_NAME,
        dest_key=dest_key
    )
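
The destination-key logic in the loop above can be checked without touching S3. The following is a minimal standalone sketch: `destination_key` is a hypothetical helper name that does not appear in the notebook, and it uses `urllib.parse.urlsplit` in place of the notebook's regular expression.

```python
import posixpath
from urllib.parse import urlsplit


def destination_key(public_s3_uri: str, dest_prefix: str) -> str:
    """Map a public S3 URI to '<dest_prefix>/<filename>' (hypothetical helper)."""
    parts = urlsplit(public_s3_uri)
    # The last path segment is the filename; it is empty for URIs ending in '/'
    filename = posixpath.basename(parts.path)
    if parts.scheme != "s3" or not parts.netloc or not filename:
        raise ValueError(f"Invalid S3 URI format: {public_s3_uri}")
    return f"{dest_prefix}/{filename}"


uri = "s3://download.opencontent.netflix.com/TechblogAssets/Sparks/encodes/Sparks_4096x2160_5994fps_SDR.mp4"
print(destination_key(uri, "videos"))
```

The printed key should match the object key that Part 1 later lists under the `videos` prefix.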

## Part 1: Using Pegasus on Bedrock

In [37]:
MODEL_ID_REGIONS = {
    "us-west-2": "us.twelvelabs.pegasus-1-2-v1:0",
    "eu-west-1": "eu.twelvelabs.pegasus-1-2-v1:0"
}

In [39]:
try:
    MODEL_ID = MODEL_ID_REGIONS[AWS_REGION]
except KeyError:
    raise ValueError(f"Pegasus 1.2 is not supported for {AWS_REGION}")
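
The lookup above can also be wrapped in a small function so the error message lists the regions the notebook knows about. This is a sketch only: `resolve_model_id` is a hypothetical name, and the mapping is copied from the cell above rather than from an authoritative list of supported regions.

```python
# Mapping copied from the notebook; it is not a complete list of Bedrock regions.
MODEL_ID_REGIONS = {
    "us-west-2": "us.twelvelabs.pegasus-1-2-v1:0",
    "eu-west-1": "eu.twelvelabs.pegasus-1-2-v1:0",
}


def resolve_model_id(region: str) -> str:
    """Return the Pegasus model ID mapped to `region` (hypothetical helper)."""
    try:
        return MODEL_ID_REGIONS[region]
    except KeyError:
        known = ", ".join(sorted(MODEL_ID_REGIONS))
        # `from None` hides the internal KeyError from the traceback
        raise ValueError(
            f"Pegasus 1.2 is not mapped for {region}; known regions: {known}"
        ) from None


print(resolve_model_id("us-west-2"))
```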

In [27]:
s3_response = s3_client.list_objects_v2(Bucket=S3_BUCKET_NAME, Prefix=S3_VIDEOS_PATH)

# List all object keys
if 'Contents' in s3_response:
    object_keys = [obj['Key'] for obj in s3_response['Contents']]
    for key in object_keys:
        print(key)
    print(f"\nTotal objects found: {len(object_keys)}")
else:
    print("No objects found in the specified bucket and prefix.")


videos/Sparks_4096x2160_5994fps_SDR.mp4

Total objects found: 1


In [28]:
video_s3_key = "videos/Sparks_4096x2160_5994fps_SDR.mp4" # TODO: Replace with your video S3 key

### Part 1a: Analyze with Pegasus on Bedrock

In [None]:
prompt = "What is the video about?"

request_body = {
    "inputPrompt": prompt,
    "mediaSource": {
        "s3Location": {
            "uri": f"s3://{S3_BUCKET_NAME}/{video_s3_key}",
            "bucketOwner": aws_account_id
        }
    },
    "temperature": 0
}

response = bedrock_client.invoke_model(
    modelId=MODEL_ID,
    body=json.dumps(request_body),
    contentType="application/json",
    accept="application/json"
)

response_body = json.loads(response.get("body").read())
print(response_body["message"])

The video showcases various activities at a construction site, beginning with a time-lapse of a sunrise over a forested area. It then follows a worker in safety gear as he exits a lift cage, adjusts his harness, and walks towards the edge of a building's roof. The worker engages in welding tasks, meticulously joining sections of beams while sparks fly. The video also captures other workers performing their duties, such as drinking water, discussing something outside the frame, sweeping debris, and operating machinery. Throughout the video, the construction site's environment is highlighted, including its structural elements and the natural backdrop of trees and mountains.
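
Parts 1a and 1b build the same request body, so it may be convenient to construct it in one place. The sketch below assumes only the fields used in this notebook (`inputPrompt`, `mediaSource.s3Location`, and `temperature`); `build_pegasus_request` is a hypothetical helper name, and the bucket, key, and account ID in the example are placeholders.

```python
import json


def build_pegasus_request(prompt: str, bucket: str, key: str,
                          bucket_owner: str, temperature: float = 0) -> str:
    """Serialize a Pegasus request body for a video stored in S3 (hypothetical helper)."""
    body = {
        "inputPrompt": prompt,
        "mediaSource": {
            "s3Location": {
                "uri": f"s3://{bucket}/{key}",
                "bucketOwner": bucket_owner,
            }
        },
        "temperature": temperature,
    }
    return json.dumps(body)


# Placeholder values for illustration only
payload = build_pegasus_request(
    prompt="What is the video about?",
    bucket="my-example-bucket",
    key="videos/example.mp4",
    bucket_owner="123456789012",
)
print(payload)
```

The returned string can be passed as `body=` to either `invoke_model` or `invoke_model_with_response_stream`.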


### Part 1b: Pegasus streaming response

In [54]:
prompt = "Summarize the video"

request_body = {
    "inputPrompt": prompt,
    "mediaSource": {
        "s3Location": {
            "uri": f"s3://{S3_BUCKET_NAME}/{video_s3_key}",
            "bucketOwner": aws_account_id
        }
    },
    "temperature": 0
}

streaming_response = bedrock_client.invoke_model_with_response_stream(
    modelId=MODEL_ID,
    body=json.dumps(request_body),
    contentType="application/json",
    accept="application/json"
)

# Print the response text as it arrives and accumulate the full message.
message = ""
for event in streaming_response["body"]:
    chunk = json.loads(event["chunk"]["bytes"])
    text = chunk.get("message", "")
    message += text
    print(text, end="")

print()  # Final newline after the streamed text



The video showcases a day at a construction site, beginning with a time-lapse of a sunrise over a forested landscape. A worker in safety gear exits a lift cage and adjusts his harness before walking towards the edge of a building's roof. He welds beams together, with close-ups capturing the sparks and intense light from the welding torch. Inside the building, the construction progresses with workers conversing near scaffolding and another sweeping debris. The scene transitions to a dirt road with water spraying as vehicles drive through, and concludes with silhouettes of people walking at night. Throughout the video, there are occasional unclear dialogues that do not significantly contribute to the overall narrative.
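
The chunk handling can be exercised offline by feeding it events shaped like the ones the streaming loop reads. This is a sketch under assumptions: `collect_stream_text` is a hypothetical helper, the events below are hand-made stand-ins rather than real Bedrock output, and skipping events without a `chunk` key is a defensive guess, not documented behavior.

```python
import json
from typing import Iterable


def collect_stream_text(events: Iterable[dict]) -> str:
    """Concatenate the 'message' field of each streamed chunk (hypothetical helper)."""
    parts = []
    for event in events:
        chunk = event.get("chunk")
        if chunk is None:
            # Assumption: ignore any event that carries no chunk payload
            continue
        payload = json.loads(chunk["bytes"])
        parts.append(payload.get("message", ""))
    return "".join(parts)


# Hand-made events that mimic the shape read in the loop above
fake_events = [
    {"chunk": {"bytes": json.dumps({"message": "The video "}).encode()}},
    {"chunk": {"bytes": json.dumps({"message": "shows welding."}).encode()}},
]
print(collect_stream_text(fake_events))
```

With a real response, `streaming_response["body"]` would take the place of `fake_events`.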


## Part 2: Video Analysis with Pegasus

### Part 2a: Summaries, hashtags, and highlights

### Part 2b: Structured outputs