# Amazon Bedrock Knowledge Bases - Audio and Video Data Preparation using Amazon Bedrock Data Automation

# Introduction

This module demonstrates how to build a multimodal Retrieval-Augmented Generation (RAG) application using Amazon Bedrock Data Automation (BDA) and Amazon Bedrock Knowledge Bases (KB). The application analyzes and generates insights from multimodal data, including video and audio. By incorporating contextual information from your own data sources with BDA, you can build accurate and secure intelligent-search generative AI applications.

This notebook covers the first step of building this intelligent search application: using BDA to process video and audio data efficiently and generate contextual outputs for KB embedding.

With the integration between BDA and Amazon Bedrock Knowledge Bases, you can specify BDA as the parser for a Knowledge Bases data source.

## Key Features

- Amazon Bedrock Data Automation (BDA): A managed service that automatically extracts content from multimodal data. BDA streamlines the generation of valuable insights from unstructured multimodal content such as documents, images, audio, and videos through a unified multi-modal inference API.
  
- Bedrock KB to build a RAG solution with BDA: Amazon Bedrock KB extracts multimodal content using BDA, generates semantic embeddings with the selected embedding model, and stores them in the chosen vector store. This enables users to retrieve and generate answers to questions derived not only from text but also from image, video, and audio data. Additionally, retrieved results include source attribution for visual data, enhancing transparency and building trust in the generated outputs.

## Prerequisites
Please make sure to enable model access for `Anthropic Claude 3 Sonnet`, `Amazon Nova Micro`, and `Titan Text Embeddings V2` in the Amazon Bedrock console.

You need an IAM role with suitable permissions to run this notebook. Choose an existing IAM role in your account or create a new one. The role must have the permissions needed to call the BDA, Bedrock KB, SageMaker, and S3 APIs and to create IAM roles.

Note: The AdministratorAccess IAM policy can be used, if allowed by security policies at your organization.

<div class="alert alert-block alert-info">
<b>Note:</b> Please run the notebook cells one at a time instead of using the "Run All Cells" option.
</div>

# Setup notebook and boto3 clients

In this step, we import the libraries used throughout this notebook. To use Amazon Bedrock Data Automation (BDA) with boto3, make sure you have a recent version of the AWS SDK for Python (boto3) installed. Boto3 version 1.35.96 or later is required.

Note: At time of Public Preview launch, BDA is available in us-west-2 only.
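
The version requirement above can be checked before any client is created. The following is a minimal sketch, not part of the workshop utilities: `parse_version` and `meets_minimum` are hypothetical helper names, and only the minimum version `1.35.96` comes from the text. Versions are compared as integer tuples because a plain string comparison would rank `1.35.100` below `1.35.96`.

```python
import re

# Minimum boto3 version stated in this notebook.
MIN_BOTO3 = "1.35.96"


def parse_version(version):
    """Turn a dotted version string such as '1.35.96' into (1, 35, 96)."""
    return tuple(int(part) for part in re.findall(r"\d+", version)[:3])


def meets_minimum(installed, minimum=MIN_BOTO3):
    """Return True when the installed version is at least the minimum."""
    return parse_version(installed) >= parse_version(minimum)


# In the notebook this would be called as (assumes boto3 is installed):
#   import boto3
#   if not meets_minimum(boto3.__version__):
#       raise RuntimeError(f"boto3 {MIN_BOTO3} or later is required")
```

If the check fails, rerunning the `%pip install` cell and restarting the kernel should pick up the newer SDK.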

In [None]:
%pip install --upgrade pip --quiet
%pip install -r ../requirements.txt --no-deps --quiet
%pip install -r ../requirements.txt --upgrade --quiet

In [None]:
# restart kernel
from IPython.core.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [None]:
%load_ext autoreload
%autoreload 2

import warnings
warnings.filterwarnings('ignore')

<div class="alert alert-block alert-info">
<b>Note:</b> In this workshop, a new S3 bucket following the naming convention "bedrock-kb-{suffix}-1" (where the suffix is a random number) will be used, and the input and output will be saved under a folder called "bda" in that bucket.
</div>

In [None]:
import boto3
from botocore.exceptions import ClientError
import json, uuid
from datetime import datetime
import time
from time import sleep
import random
from tqdm import tqdm
from pathlib import Path
# Import display as a function only; importing the module under the same
# name would shadow it and break later display(...) calls.
from IPython.display import JSON, IFrame, Audio, display, clear_output
import logging
import sys

# Get current path and go up two parent directories
current_path = Path().resolve()
parent_path = current_path.parent.parent  # Go up two levels

# Add to sys.path if not already there
if str(parent_path) not in sys.path:
    sys.path.append(str(parent_path))

# Now you can import from utils
from utils.knowledge_base import BedrockKnowledgeBase


In [None]:
#Clients
suffix = random.randrange(200, 900)

sts_client = boto3.client('sts')
account_id = sts_client.get_caller_identity()["Account"]

bucket_name_kb = f'bedrock-kb-{suffix}-1' # replace it with your first bucket name.
region_name = "us-west-2" # can be removed once BDA is GA and available in other regions.
region = region_name

s3_client = boto3.client('s3', region_name=region_name)

bda_client = boto3.client('bedrock-data-automation', region_name=region_name)
bda_runtime_client = boto3.client('bedrock-data-automation-runtime', region_name=region_name)

bedrock_agent_client = boto3.client('bedrock-agent', region_name=region_name)
bedrock_agent_runtime_client = boto3.client('bedrock-agent-runtime', region_name=region_name)

logging.basicConfig(format='[%(asctime)s] p%(process)s {%(filename)s:%(lineno)d} %(levelname)s - %(message)s', level=logging.INFO)
logger = logging.getLogger(__name__)

In [None]:
# Function to check if the bucket exists, if not, create the data_bucket
from utils.knowledge_base_operators import bucket_exists
            
# Create S3 bucket for the KB if it doesn't exist
if not bucket_exists(bucket_name_kb):
    print(f"Bucket '{bucket_name_kb}' does not exist. Creating it now...")
    if region == "us-east-1":
        s3_client.create_bucket(Bucket=bucket_name_kb)
    else:
        s3_client.create_bucket(
            Bucket=bucket_name_kb,
            CreateBucketConfiguration={'LocationConstraint': region}
        )
    print(f"Bucket '{bucket_name_kb}' created successfully.")
else:
    print(f"Bucket '{bucket_name_kb}' already exists.")


bucket_name_input = f's3://{bucket_name_kb}/bda/input'      # BDA input path
bucket_name_output = f's3://{bucket_name_kb}/bda/output'    # BDA output path

## Create a BDA project
To start a BDA job, you need a BDA project, which organizes both standard and custom output configurations. This project is reusable, allowing you to apply the same configuration to process multiple video/audio files that share the same settings.

In [None]:
project_name= f'bda-workshop-kb-project-{str(uuid.uuid4())[0:4]}'

# delete project if it already exists
projects_existing = [project for project in bda_client.list_data_automation_projects()["projects"] if project["projectName"] == project_name]
if len(projects_existing) >0:
    print(f"Deleting existing project: {projects_existing[0]}")
    bda_client.delete_data_automation_project(projectArn=projects_existing[0]["projectArn"])
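
The lookup above reads only the first page returned by `list_data_automation_projects`. In an account with many projects, a match could sit on a later page. The sketch below follows pages until none remain. It assumes the list call accepts and returns a `nextToken` field in the way other Bedrock list operations do, which should be confirmed against the boto3 reference. `find_projects_by_name` is a hypothetical helper, not part of the workshop utilities.

```python
def find_projects_by_name(client, project_name):
    """Collect every project whose projectName matches, across all pages.

    `client` is expected to behave like the notebook's bda_client.
    Assumption: responses carry an optional "nextToken" for further pages.
    """
    matches = []
    token = None
    while True:
        kwargs = {"nextToken": token} if token else {}
        page = client.list_data_automation_projects(**kwargs)
        matches.extend(
            project
            for project in page.get("projects", [])
            if project["projectName"] == project_name
        )
        token = page.get("nextToken")
        if not token:
            return matches
```

In the notebook this could be called as `find_projects_by_name(bda_client, project_name)` in place of the list comprehension.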

In [None]:
#BDA API standard output schema
response = bda_client.create_data_automation_project(
    projectName=project_name,
    projectDescription='BDA workshop sample project',
    projectStage='DEVELOPMENT',
    standardOutputConfiguration={
        "video": {
            "extraction": {
                "category": {
                    "state": "ENABLED",
                    "types": ["CONTENT_MODERATION", "TEXT_DETECTION", "TRANSCRIPT"]
                },
                "boundingBox": {"state": "ENABLED"}
            },
            "generativeField": {
                "state": "ENABLED",
                "types": ["VIDEO_SUMMARY", "SCENE_SUMMARY", "IAB"]
            }
        },
        "audio": {
            "extraction": {
                "category": {
                    "state": "ENABLED", 
                    "types": ["AUDIO_CONTENT_MODERATION", "CHAPTER_CONTENT_MODERATION", "TRANSCRIPT"]
                }
            },
            "generativeField": {
                "state": "ENABLED",
                "types": ["AUDIO_SUMMARY", "CHAPTER_SUMMARY", "IAB"]
            }
        }
    }
)

In [None]:
kb_project_arn = response.get("projectArn")
print("BDA kb project ARN:", kb_project_arn)

In [None]:
# Upload the sample audio and video files
from IPython.display import Audio, Video, display

file_name_audio = '../../synthetic_dataset/podcastdemo.mp3'
# Use only the file name in the S3 key so the key does not contain "../"
object_name_audio = f'bda/input/{Path(file_name_audio).name}'

s3_client.upload_file(file_name_audio, bucket_name_kb, object_name_audio)

file_name_video = '../../synthetic_dataset/bda.m4v'
object_name_video = f'bda/input/{Path(file_name_video).name}'

s3_client.upload_file(file_name_video, bucket_name_kb, object_name_video)

In [None]:
# Load and play an MP3 file
display(Audio(file_name_audio, autoplay=True))

In [None]:
# Load and play the video file (M4V)
#Video(file_name_video,embed=True)

from IPython.display import HTML
from utils.knowledge_base_operators import play


play(file_name_video)

### Start BDA tasks
We will now invoke the BDA API to process the uploaded audio and video files. You need to provide the ARN of the BDA project created earlier in this notebook and specify an S3 location where BDA will store the output results.

For the complete API reference for invoking a BDA async task, refer to this [document](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/bedrock-data-automation-runtime/client/invoke_data_automation_async.html).

In [None]:
# Start BDA task audio

input_name = object_name_audio
output_name = 'bda/output/'

response_aud = bda_runtime_client.invoke_data_automation_async(
    inputConfiguration={'s3Uri':  f"s3://{bucket_name_kb}/{input_name}"},
    outputConfiguration={'s3Uri': f"s3://{bucket_name_kb}/{output_name}"},
    dataAutomationConfiguration={
        'dataAutomationArn': kb_project_arn,
        'stage': 'DEVELOPMENT'
    })
response_aud

In [None]:
invocation_audio_arn = response_aud.get("invocationArn")
print("BDA audio task started:", invocation_audio_arn)

In [None]:
# Start BDA task video
input_name = object_name_video
output_name = 'bda/output/'

response_vid = bda_runtime_client.invoke_data_automation_async(
    inputConfiguration={'s3Uri':  f"s3://{bucket_name_kb}/{input_name}"},
    outputConfiguration={'s3Uri': f"s3://{bucket_name_kb}/{output_name}"},
    dataAutomationConfiguration={
        'dataAutomationArn': kb_project_arn,
        'stage': 'DEVELOPMENT'
    })
response_vid

In [None]:
invocation_video_arn = response_vid.get("invocationArn")
print("BDA video task started:", invocation_video_arn)
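
The audio and video invocations share the same request shape and differ only in the input key. As a minimal sketch, the keyword arguments could be built in one place. `build_invocation_args` is a hypothetical helper; the parameter names inside the returned dictionary are the ones used in the cells above.

```python
def build_invocation_args(bucket, input_key, output_prefix, project_arn,
                          stage="DEVELOPMENT"):
    """Build keyword arguments for invoke_data_automation_async.

    The structure mirrors the calls in this notebook; nothing here
    contacts AWS, so the result can be inspected before it is sent.
    """
    return {
        "inputConfiguration": {"s3Uri": f"s3://{bucket}/{input_key}"},
        "outputConfiguration": {"s3Uri": f"s3://{bucket}/{output_prefix}"},
        "dataAutomationConfiguration": {
            "dataAutomationArn": project_arn,
            "stage": stage,
        },
    }


# In the notebook (not executed here):
#   args = build_invocation_args(bucket_name_kb, object_name_audio,
#                                "bda/output/", kb_project_arn)
#   response_aud = bda_runtime_client.invoke_data_automation_async(**args)
```

Keeping the request construction separate from the call makes it easy to print and review the S3 URIs before a task is started.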

### Monitor the progress of the BDA tasks by running the code cell below

In [None]:
terminal_states = ["Success", "ServiceError", "ClientError"]
statusAudio, statusVideo, status_aud_response, status_vid_response = None, None, None, None
# Keep polling until BOTH tasks have reached a terminal state
while (statusAudio not in terminal_states) or (statusVideo not in terminal_states):
    status_aud_response = bda_runtime_client.get_data_automation_status(
        invocationArn=invocation_audio_arn
    )
    statusAudio = status_aud_response.get("status")

    status_vid_response = bda_runtime_client.get_data_automation_status(
        invocationArn=invocation_video_arn
    )
    statusVideo = status_vid_response.get("status")
    clear_output(wait=True)
    print(f"{datetime.now().strftime('%H:%M:%S')} : "\
          f"BDA kb video task: {statusVideo} "\
          f"BDA kb audio task: {statusAudio}")
    time.sleep(5)

output_aud_config = status_aud_response.get("outputConfiguration",{}).get("s3Uri")
print("Output configuration file (audio):", output_aud_config)

output_vid_config = status_vid_response.get("outputConfiguration",{}).get("s3Uri")
print("Output configuration file (video):", output_vid_config)
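
The polling pattern in this cell can be generalized to any number of tasks and given a timeout so that a stuck task does not block the notebook forever. The following is a sketch under stated assumptions: `wait_for_all` is a hypothetical helper, the timeout and interval defaults are arbitrary, and the terminal states are the three used in the loop above.

```python
import time

# Terminal states taken from the polling loop in this notebook.
TERMINAL_STATES = {"Success", "ServiceError", "ClientError"}


def wait_for_all(fetchers, interval=5.0, timeout=1800.0,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until every task reports a terminal state or the timeout expires.

    `fetchers` maps a label to a zero-argument callable that returns the
    current status string. Tasks that are already terminal are not polled again.
    """
    deadline = clock() + timeout
    statuses = {name: None for name in fetchers}
    while True:
        for name, fetch in fetchers.items():
            if statuses[name] not in TERMINAL_STATES:
                statuses[name] = fetch()
        if all(status in TERMINAL_STATES for status in statuses.values()):
            return statuses
        if clock() >= deadline:
            raise TimeoutError(f"Tasks still running after {timeout}s: {statuses}")
        sleep(interval)
```

In the notebook, each fetcher could be a small lambda such as `lambda: bda_runtime_client.get_data_automation_status(invocationArn=invocation_audio_arn)["status"]`. The `sleep` and `clock` parameters exist only so the loop can be exercised without real waiting.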

# Examine the BDA output for the processed video and audio files

In [None]:
out_vid_loc = status_vid_response['outputConfiguration']['s3Uri'].split("/job_metadata.json", 1)[0].split(bucket_name_kb+"/")[1]
out_vid_loc += "/0/standard_output/0/result.json"
print(out_vid_loc)
s3_client.download_file(bucket_name_kb, out_vid_loc, 'result_vid.json')
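
The cell above derives the result key with chained `split` calls, which depend on the bucket name appearing exactly once in the URI. A slightly more defensive sketch parses the URI with the standard library instead. `split_s3_uri` and `result_key` are hypothetical helpers; the `0/standard_output/0/result.json` suffix and the `job_metadata.json` file name are taken from the cell above, and the example URI in the comment is illustrative only.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/key' into ('bucket', 'some/key')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.lstrip("/")


def result_key(job_metadata_uri, suffix="0/standard_output/0/result.json"):
    """Derive the bucket and result.json key from a job_metadata.json URI."""
    bucket, key = split_s3_uri(job_metadata_uri)
    prefix = key.rsplit("/job_metadata.json", 1)[0]
    return bucket, f"{prefix}/{suffix}"


# Illustrative call with a made-up URI layout:
#   result_key("s3://my-bucket/bda/output/job-id/job_metadata.json")
```

The returned bucket and key can be passed directly to `s3_client.download_file`.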

In [None]:
# Use a context manager so the file handle is closed after loading
with open('result_vid.json') as f:
    data_vid = json.load(f)
print(data_vid["video"]["summary"])

In [None]:
# Download the audio result file
out_aud_loc = status_aud_response['outputConfiguration']['s3Uri'].split("/job_metadata.json", 1)[0].split(bucket_name_kb+"/")[1]
out_aud_loc += "/0/standard_output/0/result.json"
print(out_aud_loc)
s3_client.download_file(bucket_name_kb, out_aud_loc, 'result_aud.json')
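
The audio result is downloaded but not inspected in this notebook. As a minimal sketch, the summary could be read the same way as the video summary. This assumes the audio result mirrors the video layout, with a top-level `"audio"` object holding a `"summary"` field; that layout is an assumption and should be checked against the downloaded file. `read_summary` is a hypothetical helper that returns `None` rather than raising when the field is missing.

```python
import json


def read_summary(result_path, modality):
    """Return data[modality]["summary"] from a BDA result file, or None.

    Assumption: the result JSON has a top-level object named after the
    modality ("video" or "audio") that contains a "summary" string.
    """
    with open(result_path, encoding="utf-8") as f:
        data = json.load(f)
    return data.get(modality, {}).get("summary")


# In the notebook (not executed here):
#   print(read_summary("result_aud.json", "audio"))
#   print(read_summary("result_vid.json", "video"))
```

If `None` comes back, printing the top-level keys of the loaded JSON is a quick way to see how the result is actually organized.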

## Conclusion

Congratulations! You have finished processing the video and audio files with BDA and are ready to build a robust multimodal RAG application tailored to your needs in the next notebook: 02_audio_video_rag_kb.