# Amazon Bedrock Multimodal Knowledge Bases with S3 Vectors
This notebook provides sample code for building multimodal RAG applications using Amazon Bedrock Knowledge Bases with Amazon Nova Multimodal Embeddings and S3 Vectors. It is organized into the following sections:

1. Overview
2. Prerequisites
3. Uploading Product Catalog (Images/Videos)
4. Creating an S3 Vector Store and Index
5. Creating a Multimodal Knowledge Base
6. Creating and Syncing the Data Source
7. Testing with Text Queries
8. Testing with Image-based Visual Search
9. Cleanup

## Overview
Amazon Bedrock Knowledge Bases now supports multimodal retrieval, enabling you to search and retrieve information across text, images, audio, and video within a fully managed service.

### What's New?
Previously, Bedrock Knowledge Bases supported text documents and images, but video and audio required custom preprocessing pipelines. With multimodal retrieval, you can now:

- **Ingest multiple content types**: Process text, images, videos, and audio in a unified workflow
- **Preserve visual context**: Content is encoded using multimodal embeddings that maintain visual and audio characteristics
- **Enable cross-modal search**: Search using text to find videos, or upload an image to find visually similar content

### Amazon Nova Multimodal Embeddings
In this notebook, we'll use Amazon Nova Multimodal Embeddings, the first unified embedding model that encodes text, documents, images, video, and audio into a single shared vector space. This enables powerful use cases like:

- Visual product search in e-commerce
- Finding similar scenes in video content
- Matching products across different media types

### Use Case: Visual Product Search
We'll build a product catalog search system where customers can:
- Search using text descriptions
- Upload a photo to find similar products
- Query using natural language about product features

The system will retrieve visually similar items by comparing embedded representations across your product images and videos, stored efficiently in S3 Vectors.

## Prerequisites

To complete this notebook you should have:

1. An AWS account with appropriate permissions
2. A role with access to: Amazon S3, AWS STS, Amazon Bedrock, and S3 Vectors
3. Product images/videos in a local `product-catalog` folder

### About the Dataset
This notebook assumes you have product images (phone cases, accessories, etc.) in a local folder. The multimodal Knowledge Base will process these images and enable visual similarity search. You can also bring your own dataset of multimodal content.
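
Before uploading, it can help to see which modalities a local folder actually contains. The sketch below groups file names by the top-level MIME type that Python's standard `mimetypes` module guesses from the extension. The helper name `group_by_modality` and the sample file names are illustrative assumptions, not part of the notebook, and a guessed MIME type does not guarantee that the Knowledge Base supports that format.

```python
import mimetypes
from collections import defaultdict


def group_by_modality(filenames):
    """Group file names by the top-level MIME type guessed from their extension."""
    groups = defaultdict(list)
    for name in filenames:
        mime_type, _ = mimetypes.guess_type(name)
        # "image/png" -> "image"; files with no recognizable extension go under "unknown"
        modality = mime_type.split("/")[0] if mime_type else "unknown"
        groups[modality].append(name)
    return dict(groups)


# Hypothetical file names, used only to illustrate the grouping
sample_files = ["case-red.png", "case-blue.jpg", "demo.mp4", "README"]
print(group_by_modality(sample_files))
```

In the notebook you could pass `os.listdir("product-catalog")` instead of the sample list.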

### Setup

Let's first install the required dependencies and initialize the boto3 clients we'll need throughout this notebook.

In [None]:
# Install or update boto3 and pillow for image handling
!pip install -qU boto3 pillow matplotlib

In [None]:
import os
import sys
import json
import time
import uuid
import boto3
import base64
from io import BytesIO
from PIL import Image
from botocore.client import Config
from botocore.exceptions import ClientError

# Import utility functions
from utils import (
    generate_short_code, 
    create_bedrock_execution_role, 
    empty_and_delete_bucket, 
    create_s3_bucket
)

# Create boto3 session and get account information
boto3_session = boto3.session.Session()
region_name = boto3_session.region_name

# Verify we're in a supported region for Nova Multimodal Embeddings
if region_name != 'us-east-1':
    print(f"⚠️ Warning: Amazon Nova Multimodal Embeddings is currently only available in us-east-1.")
    print(f"   Your current region is: {region_name}")
    print(f"   Please switch to us-east-1 to use this feature.")
else:
    print(f"✅ Region check passed: {region_name}")

# Initialize AWS clients
iam_client = boto3.client('iam')
s3_client = boto3.client('s3')
sts_client = boto3.client('sts')
account_id = sts_client.get_caller_identity()['Account']

# Create s3vectors client for vector store operations
s3vectors = boto3.client('s3vectors', region_name=region_name)

# Create bedrock agent clients with extended timeouts for long-running operations
bedrock_config = Config(
    connect_timeout=120, 
    read_timeout=120, 
    retries={'max_attempts': 0}, 
    region_name=region_name
)
bedrock_agent_runtime_client = boto3_session.client("bedrock-agent-runtime", config=bedrock_config)
bedrock_agent_client = boto3.client('bedrock-agent', region_name=region_name)

# Generate unique identifier for resource names to avoid conflicts
unique_id = generate_short_code()

# Define resource names with unique identifiers
bucket_name = f"product-catalog-{unique_id}"
multimodal_storage_bucket_name = f"multimodal-product-catalog-{unique_id}"
vector_store_name = f"multimodal-vector-store-{unique_id}"
vector_index_name = f"multimodal-vector-index-{unique_id}"
kb_name = f"kb-product-catalog-{unique_id}"

print(f"\n{'='*60}")
print(f"Resource Configuration")
print(f"{'='*60}")
print(f"Unique Identifier:  {unique_id}")
print(f"AWS Account ID:     {account_id}")
print(f"AWS Region:         {region_name}")
print(f"S3 Bucket:          {bucket_name}")
print(f"S3 Multimodal Data: {multimodal_storage_bucket_name}")
print(f"Vector Store:       {vector_store_name}")
print(f"Vector Index:       {vector_index_name}")
print(f"Knowledge Base:     {kb_name}")
print(f"{'='*60}")

## Upload Product Catalog to S3

First, we'll create an S3 bucket and upload our product catalog. The catalog should contain images and/or videos of products (e.g., phone cases, accessories).

Make sure you have your product images/videos in a local folder named `product-catalog` in the same directory as this notebook.
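
S3 object keys always use forward slashes, regardless of the local operating system's path separator. As a minimal sketch of how a local file path can be mapped to an object key, the hypothetical helper `to_s3_key` below (not part of the notebook or of boto3) strips the catalog folder from the path and joins the remaining parts with `/`.

```python
import posixpath
from pathlib import PurePath


def to_s3_key(folder_path, local_path, prefix=""):
    """Build an S3 object key from a local file path relative to folder_path."""
    # Path of the file relative to the catalog folder, split into its components
    relative = PurePath(local_path).relative_to(folder_path)
    # Join with "/" so the key is identical on Windows, macOS, and Linux
    return posixpath.join(prefix, *relative.parts)


print(to_s3_key("product-catalog", "product-catalog/cases/red.png"))
print(to_s3_key("product-catalog", "product-catalog/cases/red.png", prefix="v1"))
```

Keeping the sub-folder structure in the key makes it easy to trace a retrieval result back to the local file it came from.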

In [None]:
# Define the local folder containing your product catalog
catalog_folder = "product-catalog"

# Verify the folder exists
if not os.path.exists(catalog_folder):
    print(f"❌ Error: Folder '{catalog_folder}' not found.")
    print(f"   Please create a folder named '{catalog_folder}' and add your product images/videos.")
    print(f"   Expected location: {os.path.abspath(catalog_folder)}")
else:
    # Count files
    files = [f for f in os.listdir(catalog_folder) 
             if os.path.isfile(os.path.join(catalog_folder, f)) and not f.startswith('.')]
    
    print(f"✅ Found {len(files)} files in {catalog_folder}")
    
    # Show file types
    extensions = {}
    for f in files:
        ext = os.path.splitext(f)[1].lower()
        extensions[ext] = extensions.get(ext, 0) + 1
    
    print(f"   File types: {dict(extensions)}")
    print(f"   Sample files: {files[:5]}")

In [None]:
# Create S3 bucket for product catalog
create_s3_bucket(bucket_name, region=region_name)
# Create S3 bucket for multimodal storage Location
create_s3_bucket(multimodal_storage_bucket_name, region=region_name)


In [None]:
def upload_folder_to_s3(folder_path, bucket_name, prefix=''):
    """
    Upload all files from a folder to an S3 bucket
    
    Args:
        folder_path: Path to the folder containing files to upload
        bucket_name: Name of the S3 bucket
        prefix: Prefix to add to the object names in S3 (optional)
    """
    upload_count = 0
    total_files = 0
    
    # Count total files first (excluding hidden files)
    for _, _, files in os.walk(folder_path):
        total_files += len([f for f in files if not f.startswith('.')])
    
    if total_files == 0:
        print(f"⚠️ No files found in {folder_path}")
        return
    
    print(f"\nUploading {total_files} files to S3 bucket '{bucket_name}'...")
    print("-" * 60)
    
    # Upload files
    for root, dirs, files in os.walk(folder_path):
        for file in files:
            # Skip hidden files and system files
            if file.startswith('.'):
                continue
                
            local_path = os.path.join(root, file)
            relative_path = os.path.relpath(local_path, folder_path)
            s3_path = os.path.join(prefix, relative_path).replace("\\", "/")
            
            try:
                s3_client.upload_file(local_path, bucket_name, s3_path)
                upload_count += 1
                
                # Show progress periodically
                if upload_count % 10 == 0 or upload_count == total_files:
                    print(f"Progress: {upload_count}/{total_files} files uploaded")
                    
            except ClientError as e:
                print(f"❌ Error uploading {local_path}: {e}")
    
    print("-" * 60)
    print(f"✅ Successfully uploaded {upload_count} files to S3 bucket '{bucket_name}'\n")

# Upload the product catalog
upload_folder_to_s3(catalog_folder, bucket_name)

## Create S3 Vector Store and Index

Now we'll create an S3 vector bucket (referred to as the vector store in this notebook) to hold our multimodal embeddings. S3 Vectors provides cost-effective and durable storage optimized for large-scale vector datasets with sub-second query performance.

In [None]:
# Define the dimensionality of our embedding vectors
# Amazon Nova Multimodal Embeddings uses 3072 dimensions by default
vector_dimension = 3072

print(f"Vector dimension: {vector_dimension}")
print(f"This matches Amazon Nova Multimodal Embeddings default output dimension")

In [None]:
def create_vector_bucket(vector_bucket_name):
    """Create an S3 Vector bucket and return its ARN"""
    try:
        print(f"Creating S3 Vector Store: {vector_bucket_name}")
        
        # Create the vector bucket
        s3vectors.create_vector_bucket(vectorBucketName=vector_bucket_name)
        print(f"✅ Vector bucket '{vector_bucket_name}' created successfully")
        
        # Get the vector bucket details
        response = s3vectors.get_vector_bucket(vectorBucketName=vector_bucket_name)
        bucket_info = response.get("vectorBucket", {})
        vector_store_arn = bucket_info.get("vectorBucketArn")
        
        if not vector_store_arn:
            raise ValueError("Vector bucket ARN not found in response")
            
        print(f"Vector bucket ARN: {vector_store_arn}")
        return vector_store_arn
        
    except ClientError as e:
        error_code = e.response.get('Error', {}).get('Code', 'Unknown')
        error_message = e.response.get('Error', {}).get('Message', 'Unknown error')
        print(f"❌ Error creating vector bucket: {error_code} - {error_message}")
        raise

# Create the vector bucket
vector_store_arn = create_vector_bucket(vector_store_name)

### Creating a Vector Index

Now that we have created the vector store, we need to create a vector index. The vector index is where:

1. Vector embeddings are stored and organized
2. Similarity searches are performed
3. Metadata about our documents is maintained

We'll specify parameters like dimension (3072 for Nova Multimodal), distance metric (cosine similarity), and data type (float32).
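
To make the distance metric concrete, here is a small, self-contained sketch of cosine similarity and cosine distance using toy 3-dimensional vectors. The vectors and product names are invented for illustration; real Nova embeddings have 3072 dimensions, and the way the service computes distances internally is not shown in this notebook.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def cosine_distance(a, b):
    """Distance form of the metric: 0 means same direction, larger means less similar."""
    return 1.0 - cosine_similarity(a, b)


# Toy embeddings (hypothetical values, not real model output)
query = [1.0, 0.0, 1.0]
catalog = {"red-case": [0.9, 0.1, 0.8], "charger": [0.0, 1.0, 0.1]}

# Rank catalog items from most to least similar to the query
ranked = sorted(catalog, key=lambda name: cosine_distance(query, catalog[name]))
print(ranked)
```

Because cosine similarity depends only on direction and not on vector length, it is a common choice for comparing embeddings.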

In [None]:
def create_and_get_index_arn(s3vectors_client, vector_store_name, vector_index_name, vector_dimension):
    """
    Create a vector index in the specified vector store and return its ARN
    
    Args:
        s3vectors_client: Boto3 client for S3 Vectors
        vector_store_name: Name of the vector store
        vector_index_name: Name for the new index
        vector_dimension: Dimension of the vectors (3072 for Nova Multimodal)
        
    Returns:
        str: ARN of the created index
    """
    # Define index configuration
    index_config = {
        "vectorBucketName": vector_store_name,
        "indexName": vector_index_name,
        "dimension": vector_dimension,
        "distanceMetric": "cosine",  # Using cosine similarity as our metric
        "dataType": "float32",       # Standard for most embedding models
        "metadataConfiguration": {
            "nonFilterableMetadataKeys": ["AMAZON_BEDROCK_TEXT"]  # Text content won't be used for filtering
        }
    }
    
    try:
        print(f"\nCreating vector index with configuration:")
        print(f"  - Dimension: {vector_dimension}")
        print(f"  - Distance Metric: cosine")
        print(f"  - Data Type: float32")
        
        # Create the index
        s3vectors_client.create_index(**index_config)
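        print(f"✅ Vector index '{vector_index_name}' created successfully")

        # Look up the ARN of the index we just created (match by name, not by list position)
        response = s3vectors_client.list_indexes(vectorBucketName=vector_store_name)
        index_arn = next(
            (idx.get("indexArn") for idx in response.get("indexes", [])
             if idx.get("indexName") == vector_index_name),
            None
        )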
        
        if not index_arn:
            raise ValueError("Index ARN not found in response")
            
        print(f"Vector index ARN: {index_arn}")
        return index_arn

    except ClientError as e:
        error_code = e.response.get('Error', {}).get('Code', 'Unknown')
        error_message = e.response.get('Error', {}).get('Message', 'Unknown error')
        print(f"❌ Failed to create or retrieve index: {error_code} - {error_message}")
        raise

# Create the vector index
vector_index_arn = create_and_get_index_arn(
    s3vectors,
    vector_store_name,
    vector_index_name,
    vector_dimension
)

print(f"\n{'='*60}")
print(f"✅ Vector index successfully created")
print(f"{'='*60}")

## Create a Multimodal Knowledge Base

Now we'll create a Knowledge Base configured for multimodal retrieval using Amazon Nova Multimodal Embeddings and S3 Vectors as the vector store. First, we need to set up the appropriate IAM permissions.
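
The `create_bedrock_execution_role` helper lives in `utils` and is not shown here, so its exact contents are unknown. As a rough sketch of what a Knowledge Base service role typically needs, the trust policy below allows the Bedrock service to assume the role and restricts it to Knowledge Bases in your account. The function name `build_kb_trust_policy` and the example account ID are assumptions for illustration only.

```python
import json


def build_kb_trust_policy(account_id, region_name):
    """Return a trust policy that lets Amazon Bedrock assume the role for Knowledge Bases."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "bedrock.amazonaws.com"},
                "Action": "sts:AssumeRole",
                # Conditions limit the role to Knowledge Bases owned by this account
                "Condition": {
                    "StringEquals": {"aws:SourceAccount": account_id},
                    "ArnLike": {
                        "aws:SourceArn": f"arn:aws:bedrock:{region_name}:{account_id}:knowledge-base/*"
                    },
                },
            }
        ],
    }


# Placeholder account ID, used only to show the resulting document
policy = build_kb_trust_policy("111122223333", "us-east-1")
print(json.dumps(policy, indent=2))
```

The role also needs permission policies for the embedding model, both S3 buckets, and the vector index; those are omitted from this sketch.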

In [None]:
# Create IAM role for Bedrock Knowledge Base
print(f"\nCreating IAM execution role for Knowledge Base...\n")

create_role = create_bedrock_execution_role(
    unique_id, 
    region_name, 
    bucket_name,
    multimodal_storage_bucket_name, 
    vector_store_name,
    vector_index_name,
    account_id
)

roleArn = create_role["Role"]["Arn"]
roleName = create_role["Role"]["RoleName"]

print(f"\n{'='*60}")
print(f"IAM Role Details")
print(f"{'='*60}")
print(f"Role Name: {roleName}")
print(f"Role ARN:  {roleArn}")
print(f"{'='*60}")

In [None]:
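# Re-initialize the Bedrock clients, keeping the region and extended timeouts from bedrock_config
bedrock_agent_runtime_client = boto3_session.client("bedrock-agent-runtime", config=bedrock_config)
bedrock_agent_client = boto3.client('bedrock-agent', region_name=region_name)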

In [None]:
# Wait for IAM role propagation
print("\nWaiting for IAM role propagation (30 seconds)...")
print("This ensures the role and policies are fully available before creating the Knowledge Base.")
time.sleep(30)
print("✅ Role propagation complete\n")

# Create the multimodal Knowledge Base with Nova embeddings and S3 Vectors
print(f"Creating Knowledge Base: {kb_name}")
print(f"  - Embedding Model: Amazon Nova Multimodal Embeddings v1")
print(f"  - Vector Dimensions: 3072")
print(f"  - Vector Store: S3 Vectors")

create_kb_response = bedrock_agent_client.create_knowledge_base(
    name=kb_name,
    description='Multimodal Product Catalog Knowledge Base with Amazon Nova Embeddings and S3 Vectors',
    roleArn=roleArn,
    knowledgeBaseConfiguration={
        'type': 'VECTOR',
        'vectorKnowledgeBaseConfiguration': {
            # Use Amazon Nova Multimodal Embeddings
            'embeddingModelArn': f'arn:aws:bedrock:{region_name}::foundation-model/amazon.nova-2-multimodal-embeddings-v1:0',
            'embeddingModelConfiguration': {
                'bedrockEmbeddingModelConfiguration': {
                    'audio': [{'segmentationConfiguration': {'fixedLengthDuration': 5}}],
                    'dimensions': 3072,  # Nova Multimodal default dimension
                    'embeddingDataType': 'FLOAT32',
                    'video': [{'segmentationConfiguration': {'fixedLengthDuration': 5}}]
                }
            },
            # Storage location for extracted images from multimodal documents
            'supplementalDataStorageConfiguration': {
                'storageLocations': [
                    {
                        'type': 'S3',
                        's3Location': {
                            'uri': f's3://{multimodal_storage_bucket_name}'  # Update with your bucket
                        }
                    }
                ]
            }
        },
    },
    storageConfiguration={
        'type': 'S3_VECTORS',
        's3VectorsConfiguration': {
            'indexArn': vector_index_arn,  # ARN returned when the vector index was created
        },
    }
)

knowledge_base_id = create_kb_response["knowledgeBase"]["knowledgeBaseId"]
print(f"\n✅ Knowledge Base created with ID: {knowledge_base_id}")

print(f"\nWaiting for Knowledge Base to become active...")
print("-" * 60)

# Poll for KB creation status
status = "CREATING"
start_time = time.time()

while status == "CREATING":
    response = bedrock_agent_client.get_knowledge_base(
        knowledgeBaseId=knowledge_base_id
    )
    
    status = response['knowledgeBase']['status']
    elapsed_time = int(time.time() - start_time)
    
    print(f"Status: {status} | Elapsed time: {elapsed_time}s")
    
    if status == "CREATING":
        print("Checking again in 30 seconds...")
        time.sleep(30)
    else:
        break

print("-" * 60)
if status == "ACTIVE":
    print(f"\n✅ Knowledge Base creation completed with status: {status}")
else:
    print(f"\n⚠️ Knowledge Base creation ended with status: {status}")
print(f"   Total time: {elapsed_time} seconds\n")

## Create and Sync the Data Source

Now we'll create a data source pointing to our S3 bucket with product images/videos, and configure it to use the Amazon Bedrock default parser which will handle multimodal content natively.
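
During ingestion, video and audio files are split according to the `fixedLengthDuration` value set when the Knowledge Base was created (5 seconds in this notebook), and each retrieved video chunk later reports its start and end time in milliseconds. The sketch below shows how fixed-length segmentation divides a clip. How the service handles a trailing partial segment is an assumption here, and `fixed_length_segments` is a hypothetical helper.

```python
def fixed_length_segments(duration_ms, segment_seconds=5):
    """Split a clip of duration_ms into consecutive (start_ms, end_ms) segments."""
    segment_ms = segment_seconds * 1000
    segments = []
    start = 0
    while start < duration_ms:
        # The last segment is shortened if the clip does not divide evenly (assumed behavior)
        end = min(start + segment_ms, duration_ms)
        segments.append((start, end))
        start = end
    return segments


# A hypothetical 12-second product video
for start_ms, end_ms in fixed_length_segments(12_000):
    print(f"{start_ms / 1000:.1f}s - {end_ms / 1000:.1f}s")
```

Shorter segments give more precise time ranges in search results but produce more vectors per video.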

In [None]:
# Create the multimodal data source
print(f"Creating data source for S3 bucket: {bucket_name}")

data_source_response = bedrock_agent_client.create_data_source(
    knowledgeBaseId=knowledge_base_id,
    name='product-catalog-ds',
    description='Product catalog with images and videos',
    dataDeletionPolicy='DELETE',
    dataSourceConfiguration={
        'type': 'S3',
        's3Configuration': {
            'bucketArn': f'arn:aws:s3:::{bucket_name}',
        },
    },
    vectorIngestionConfiguration={
        'chunkingConfiguration': {
            'chunkingStrategy': 'NONE',
        }
    }
)

datasource_id = data_source_response["dataSource"]["dataSourceId"]
print(f"✅ Data source created with ID: {datasource_id}")

### Sync the Data Source

Now we'll start the ingestion job to process our product catalog. This will:
1. Read images and videos from S3
2. Generate multimodal embeddings using Amazon Nova
3. Store embeddings in the S3 Vector Store

This process may take several minutes depending on the size and number of files.
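
Because ingestion is asynchronous, the notebook polls the job status until it leaves the in-progress states. A minimal sketch of a reusable polling helper with a timeout is shown below. `wait_for_status` is a hypothetical helper, not a boto3 API, and the demo uses a fake status sequence so that it runs without AWS access.

```python
import time


def wait_for_status(get_status, pending, interval_seconds=30, timeout_seconds=1800, sleep=time.sleep):
    """Call get_status() until it returns a value outside `pending`, or raise on timeout."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status()
        if status not in pending:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {status} after {timeout_seconds}s")
        sleep(interval_seconds)


# Demo with a fake status source standing in for the real API call
fake_statuses = iter(["STARTING", "IN_PROGRESS", "COMPLETE"])
final_status = wait_for_status(
    lambda: next(fake_statuses),
    pending={"STARTING", "IN_PROGRESS"},
    interval_seconds=0,
)
print(final_status)
```

With real resources, `get_status` would wrap a call such as `bedrock_agent_client.get_ingestion_job(...)` and return `response['ingestionJob']['status']`. The timeout keeps the cell from looping indefinitely if a job stalls.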

In [None]:
# Start the ingestion job
print(f"\nStarting ingestion job for Knowledge Base: {knowledge_base_id}")
print(f"Data Source: {datasource_id}\n")

response_ingestion = bedrock_agent_client.start_ingestion_job(
    dataSourceId=datasource_id,
    description='Initial product catalog sync',
    knowledgeBaseId=knowledge_base_id
)

ingestion_job_id = response_ingestion['ingestionJob']['ingestionJobId']
print(f"✅ Ingestion job started with ID: {ingestion_job_id}")

In [None]:
# Monitor the ingestion job progress
status = "STARTING"
start_time = time.time()

print("\nMonitoring ingestion job progress:")
print("=" * 70)

while status in ["STARTING", "IN_PROGRESS"]:
    response = bedrock_agent_client.get_ingestion_job(
        dataSourceId=datasource_id,
        knowledgeBaseId=knowledge_base_id,
        ingestionJobId=ingestion_job_id
    )
    
    status = response['ingestionJob']['status']
    elapsed_time = int(time.time() - start_time)
    
    stats = response['ingestionJob']['statistics']
    
    print(f"\nStatus: {status} | Elapsed: {elapsed_time}s")
    print(f"  Documents scanned:  {stats['numberOfDocumentsScanned']}")
    print(f"  Documents indexed:  {stats['numberOfNewDocumentsIndexed']}")
    print(f"  Documents failed:   {stats['numberOfDocumentsFailed']}")
    
    if status in ["STARTING", "IN_PROGRESS"]:
        print(f"\n⏳ Checking again in 30 seconds...")
        time.sleep(30)
    else:
        break

print("\n" + "=" * 70)

if status == "COMPLETE":
    print(f"\n✅ Ingestion job completed successfully!")
else:
    print(f"\n⚠️ Ingestion job ended with status: {status}")
    
print(f"\nFinal Statistics:")
print(f"  📄 Documents scanned:  {stats['numberOfDocumentsScanned']}")
print(f"  ✅ Documents indexed:  {stats['numberOfNewDocumentsIndexed']}")
print(f"  ❌ Documents failed:   {stats['numberOfDocumentsFailed']}")
print(f"  ⏱️  Total time:         {elapsed_time} seconds")
print("\n" + "=" * 70)

## Test the Knowledge Base with Text Queries

Now that our multimodal Knowledge Base is ready, let's test it with text-based queries. We'll use the Retrieve API to retrieve the most similar chunks.
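
Before rendering anything, it can be useful to reduce a Retrieve response to a compact table of content type, score, and source. The sketch below does that for a response shaped like the ones this notebook handles (`retrievalResults`, `content.type`, `score`, `location.s3Location.uri`). The `mock_response` is hand-written sample data with invented URIs and scores, not real service output, and `summarize_results` is a hypothetical helper.

```python
def summarize_results(response):
    """Return one row per retrieval result, sorted by descending score."""
    rows = []
    for result in response.get("retrievalResults", []):
        content = result.get("content", {})
        s3_location = result.get("location", {}).get("s3Location", {})
        rows.append(
            {
                "type": content.get("type", "UNKNOWN"),
                "score": round(result.get("score", 0.0), 3),
                "source": s3_location.get("uri"),
            }
        )
    return sorted(rows, key=lambda row: row["score"], reverse=True)


# Hand-written sample in the same shape as a Retrieve response (values are invented)
mock_response = {
    "retrievalResults": [
        {
            "content": {"type": "IMAGE"},
            "score": 0.4312,
            "location": {"s3Location": {"uri": "s3://example-bucket/case-silver.png"}},
        },
        {
            "content": {"type": "VIDEO"},
            "score": 0.5127,
            "location": {"s3Location": {"uri": "s3://example-bucket/demo.mp4"}},
        },
    ]
}

for row in summarize_results(mock_response):
    print(f"{row['type']:<6} {row['score']:.3f}  {row['source']}")
```

Using `.get()` with defaults keeps the summary from failing if a result lacks an S3 location.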

In [None]:
input_query = "A metallic phone cover"

response = bedrock_agent_runtime_client.retrieve(
    knowledgeBaseId=knowledge_base_id,
    retrievalQuery={
        "text": input_query
    },
    retrievalConfiguration={
        "vectorSearchConfiguration": {
            "numberOfResults": 5
        }
    }
)

In [None]:
from IPython.display import HTML, display, Image as IPImage
import boto3
import uuid
s3_client = boto3.client('s3')

def display_video_segment(s3_uri, start_time_ms, end_time_ms, score, width=400):
    """Display video segment with controls"""
    bucket = s3_uri.replace('s3://', '').split('/')[0]
    key = '/'.join(s3_uri.replace('s3://', '').split('/')[1:])
    
    presigned_url = s3_client.generate_presigned_url(
        'get_object',
        Params={'Bucket': bucket, 'Key': key},
        ExpiresIn=3600
    )
    
    start_sec = start_time_ms / 1000
    end_sec = end_time_ms / 1000
    video_id = f"video_{uuid.uuid4().hex[:8]}"
    
    html = f"""
    <div style="margin: 20px 0; padding: 10px; border: 2px solid #2196F3; border-radius: 5px; max-width: {width + 20}px;">
        <div style="background: #2196F3; color: white; padding: 8px; margin: -10px -10px 10px -10px;">
            <strong>📹 Video Segment: {start_sec:.1f}s - {end_sec:.1f}s | Score: {score:.3f}</strong>
        </div>
        
        <video id="{video_id}" width="{width}" controls preload="metadata" style="max-width: 100%;">
            <source src="{presigned_url}" type="video/mp4">
        </video>
        
        <div style="margin-top: 10px; padding: 8px; background: #e3f2fd; border-radius: 3px; font-family: monospace; font-size: 12px;">
            <span id="{video_id}_status">⏳ Loading...</span>
        </div>
    </div>
    
    <script>
        (function() {{
            var video = document.getElementById('{video_id}');
            var status = document.getElementById('{video_id}_status');
            var startTime = {start_sec};
            var endTime = {end_sec};
            var seeked = false;
            
            function seekToStart() {{
                if (video.readyState >= 2 && !seeked) {{
                    video.currentTime = startTime;
                    seeked = true;
                    status.innerHTML = '✅ Ready at ' + startTime.toFixed(1) + 's';
                }}
            }}
            
            video.addEventListener('loadedmetadata', seekToStart);
            video.addEventListener('loadeddata', seekToStart);
            video.addEventListener('canplay', seekToStart);
            setTimeout(seekToStart, 100);
            setTimeout(seekToStart, 500);
            
            video.addEventListener('timeupdate', function() {{
                if (video.currentTime >= endTime) {{
                    video.pause();
                    video.currentTime = startTime;
                    status.innerHTML = '⏹️ Segment ended';
                }} else if (video.currentTime >= startTime) {{
                    status.innerHTML = '▶️ ' + video.currentTime.toFixed(1) + 's';
                }}
            }});
        }})();
    </script>
    """
    
    display(HTML(html))

def display_image_result(image_b64, source_uri, score, width=400):
    """Display an image result with its relevance score and source URI"""
    html = f"""
    <div style="margin: 20px 0; padding: 10px; border: 2px solid #4CAF50; border-radius: 5px; max-width: {width + 20}px;">
        <div style="background: #4CAF50; color: white; padding: 8px; margin: -10px -10px 10px -10px;">
            <strong>🖼️ Image | Relevance Score: {score:.3f}</strong>
        </div>
        <div style="margin: 10px 0; text-align: center;">
            <img src="{image_b64}" style="max-width: {width}px; width: 100%; height: auto; border-radius: 5px;"/>
        </div>
        <div style="margin-top: 10px; padding: 8px; background: #f1f8e9; border-radius: 3px; font-family: monospace; font-size: 11px; word-break: break-all;">
            📁 Source: {source_uri}
        </div>
    </div>
    """
    display(HTML(html))

def display_all_retrieval_results(response, video_width=500, image_width=300):
    """Display all retrieval results with separate sizing for videos and images"""
    
    print(f"🔍 Found {len(response['retrievalResults'])} relevant results\n")
    
    for idx, result in enumerate(response['retrievalResults'], 1):
        score = result['score']
        
        print(f"\n{'='*80}")
        print(f"Result #{idx}")
        print(f"{'='*80}\n")
        
        if result['content']['type'] == 'VIDEO':
            video_uri = result['content']['video']['s3Uri']
            start_time = result['metadata']['x-amz-bedrock-kb-chunk-start-time-in-millis']
            end_time = result['metadata']['x-amz-bedrock-kb-chunk-end-time-in-millis']
            source_uri = result['location']['s3Location']['uri']
            
            print(f"Type: VIDEO")
            print(f"Source File: {source_uri}")
            print(f"Time Range: {start_time/1000:.1f}s - {end_time/1000:.1f}s")
            print(f"Score: {score:.3f}\n")
            
            display_video_segment(video_uri, start_time, end_time, score, video_width)
            
        elif result['content']['type'] == 'IMAGE':
            image_b64 = result['content']['byteContent']
            source_uri = result['location']['s3Location']['uri']
            
            print(f"Type: IMAGE")
            print(f"Source File: {source_uri}")
            print(f"Score: {score:.3f}\n")
            
            display_image_result(image_b64, source_uri, score, image_width)

# Use it with different sizes for videos vs images
display_all_retrieval_results(response, video_width=600, image_width=300)

## Test with Image-based Visual Search

Now for the powerful part: visual search! Provide a reference image to find visually similar products in your catalog. This demonstrates the cross-modal capability of Amazon Nova Multimodal Embeddings.

### Using the Retrieve API

First, let's use the Retrieve API to see the raw retrieval results with similarity scores.
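
An image query needs two things: the image format and the raw image bytes. The sketch below packages them into the `retrievalQuery` shape this notebook passes to `retrieve` (`type`, `image.format`, `image.inlineContent`). `build_image_query` is a hypothetical helper, and the demo writes a few placeholder bytes to a temporary file instead of using a real photo.

```python
import os
import tempfile


def build_image_query(image_path):
    """Read an image file and return a retrievalQuery dict for an image search."""
    extension = os.path.splitext(image_path)[1].lstrip(".").lower()
    # The notebook maps the 'jpg' extension to the 'jpeg' format name
    image_format = "jpeg" if extension == "jpg" else extension
    with open(image_path, "rb") as image_file:
        image_bytes = image_file.read()
    return {
        "type": "IMAGE",
        "image": {"format": image_format, "inlineContent": image_bytes},
    }


# Demo with a temporary file holding placeholder bytes (not a valid image)
with tempfile.NamedTemporaryFile(suffix=".JPG", delete=False) as tmp:
    tmp.write(b"\xff\xd8\xff")
query = build_image_query(tmp.name)
os.remove(tmp.name)

print(query["type"], query["image"]["format"], len(query["image"]["inlineContent"]))
```

Reading the file as bytes directly avoids the base64 encode and decode round trip.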

In [None]:
import base64
import os
from PIL import Image
from io import BytesIO
import matplotlib.pyplot as plt
from IPython.display import display
from botocore.exceptions import ClientError

def image_to_base64(image_path):
    """
    Convert an image file to base64 string
    """
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode('utf-8')

def display_image_from_base64(base64_str, title="Image"):
    """
    Decode a base64 string (or data URI) and return it as a PIL Image
    """
    if base64_str.startswith('data:image'):
        base64_str = base64_str.split(',')[1]
    
    image_bytes = base64.b64decode(base64_str)
    image = Image.open(BytesIO(image_bytes))
    return image

# Path to your reference image
reference_image_path = "test-image/phone.png"  # Update this path

print(f"\n{'='*70}")
print(f"Image-based Visual Search Test")
print(f"{'='*70}\n")

if not os.path.exists(reference_image_path):
    print(f"⚠️ Reference image not found: {reference_image_path}")
    print(f"\nTo test visual search:")
    print(f"1. Place a reference image at the expected location shown below")
    print(f"2. Or update 'reference_image_path' above to point to your own image")
    print(f"3. Run this cell again")
    print(f"\nExpected location: {os.path.abspath(reference_image_path)}")
    print(f"\n💡 Alternative: Download a product image from your S3 bucket first!")
    
else:
    # Display the reference image
    print("🔍 Reference Image:")
    reference_img = Image.open(reference_image_path)
    
    # Create figure for reference image
    fig, ax = plt.subplots(1, 1, figsize=(4, 4))
    ax.imshow(reference_img)
    ax.axis('off')
    ax.set_title("Search Query Image")
    plt.tight_layout()
    plt.show()
    
    print(f"\nImage size: {reference_img.size}")
    print(f"Image mode: {reference_img.mode}")
    
    # Convert image to base64
    image_base64 = image_to_base64(reference_image_path)
    
    # Determine image format from file extension
    image_format = reference_image_path.split('.')[-1].lower()
    if image_format == 'jpg':
        image_format = 'jpeg'
    
    print(f"\nSearching for visually similar products...")
    print(f"Query image format: {image_format}")
    
    # Query with image using Retrieve API
    response = bedrock_agent_runtime_client.retrieve(
        knowledgeBaseId=knowledge_base_id,  # Use the variable
        retrievalQuery={
            "type": "IMAGE",
            "image": {
                "format": image_format,
                "inlineContent": base64.b64decode(image_base64)
            }
        },
        retrievalConfiguration={
            "vectorSearchConfiguration": {
                "numberOfResults": 5,
            } 
        }
    )

    display_all_retrieval_results(response, video_width=600, image_width=300)
    

## Understanding the Results

The multimodal Knowledge Base with S3 Vectors provides several powerful capabilities:

### Search Methods

1. **Text Queries**: Searches across product images and videos using semantic understanding of your text description
   - Example: "metallic phone cover" finds products with metallic finishes
   
2. **Image Queries**: Finds visually similar products by comparing visual features
   - Colors, patterns, shapes, textures
   - No need for text descriptions

## Next Steps and Use Cases

This multimodal Knowledge Base architecture can be extended for various use cases:

### E-commerce
- **Visual product search**: Customers upload photos to find similar items
- **Style matching**: Find products that match a particular aesthetic
- **Cross-sell recommendations**: Suggest visually complementary products
- **Reverse image search**: Find products seen in social media or other sites

### Manufacturing
- **Equipment manuals**: Search through technical documentation with diagrams
- **Quality control**: Find similar defects in inspection images
- **Training materials**: Locate specific procedures in video tutorials
- **Parts identification**: Match components visually

### Media and Entertainment
- **Video libraries**: Find similar scenes across large video collections
- **Content discovery**: Search using text descriptions or reference images
- **Asset management**: Organize and retrieve visual content efficiently

### Further Enhancements

1. **Add metadata filtering**
   - Price ranges, categories, availability
   - Brand, color, size attributes
   
2. **Implement re-ranking**
   - Use Cohere Rerank for improved relevance
   - Combine with user preferences and behavior
   
3. **Combine with recommendations**
   - Purchase history
   - User preferences
   - Trending products
   
4. **Add audio/video processing**
   - Use Bedrock Data Automation (BDA) for speech transcription
   - Extract audio features from product videos
   - Combine visual and audio embeddings
5. **Send the results to a multimodal model to generate responses**
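
As a sketch of the metadata filtering enhancement listed above, the Retrieve API accepts a `filter` inside `vectorSearchConfiguration`. The example below builds such a configuration. The metadata keys `category` and `price` are hypothetical: they would have to be attached to your documents as metadata during ingestion, and operator support can vary by vector store, so treat this as a starting point rather than a tested setup.

```python
def build_filtered_retrieval_config(category, max_price, number_of_results=5):
    """Build a retrievalConfiguration that restricts results by metadata."""
    return {
        "vectorSearchConfiguration": {
            "numberOfResults": number_of_results,
            # Both conditions must hold for a result to be returned
            "filter": {
                "andAll": [
                    {"equals": {"key": "category", "value": category}},
                    {"lessThanOrEquals": {"key": "price", "value": max_price}},
                ]
            },
        }
    }


# Hypothetical values for illustration
config = build_filtered_retrieval_config("phone-case", 25)
print(config)
```

The resulting dictionary would be passed as the `retrievalConfiguration` argument of `retrieve`.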

## Clean up

To avoid ongoing charges, clean up all the resources we've created in this notebook.
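
Cleanup involves several independent steps, and one failure should not stop the rest. A minimal sketch of a step runner that keeps going and reports what failed is shown below. `run_cleanup_steps` and the demo steps are hypothetical stand-ins; in the notebook, each action would wrap one of the real delete calls.

```python
def run_cleanup_steps(steps):
    """Run (label, action) pairs in order and return a list of (label, error) failures."""
    failures = []
    for index, (label, action) in enumerate(steps, 1):
        print(f"[{index}/{len(steps)}] {label}")
        try:
            action()
        except Exception as error:
            # Record the failure and continue with the remaining steps
            failures.append((label, str(error)))
            print(f"      Error: {error}")
    return failures


def _failing_step():
    raise RuntimeError("bucket not empty")


# Stand-in steps: the first succeeds, the second fails
demo_steps = [
    ("Delete Knowledge Base", lambda: None),
    ("Delete S3 bucket", _failing_step),
]
failures = run_cleanup_steps(demo_steps)

if failures:
    print(f"Cleanup finished with {len(failures)} failed step(s)")
else:
    print("Cleanup completed")
```

Collecting failures lets the final message reflect what actually happened instead of always reporting success.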

In [None]:
print(f"\n{'='*70}")
print("Starting cleanup process...")
print(f"{'='*70}\n")

# Step 1: Delete Knowledge Base
print(f"[1/4] Deleting Knowledge Base: {knowledge_base_id}")
try:
    bedrock_agent_client.delete_knowledge_base(knowledgeBaseId=knowledge_base_id)
    print("      ✅ Knowledge Base deleted successfully\n")
except Exception as e:
    print(f"      ❌ Error deleting Knowledge Base: {str(e)}\n")

# Step 2: Delete the vector index, then the S3 vector bucket that contained it
print(f"[2/4] Deleting S3 Vector Store: {vector_store_name}")
try:
    s3vectors.delete_index(vectorBucketName=vector_store_name, indexName=vector_index_name)
    s3vectors.delete_vector_bucket(vectorBucketName=vector_store_name)
    print("      ✅ S3 Vector Store deleted successfully\n")
except Exception as e:
    print(f"      ❌ Error deleting Vector Store: {str(e)}\n")

# Step 3: Empty and delete both S3 buckets (product catalog and multimodal storage)
print(f"[3/4] Emptying and deleting S3 Buckets: {bucket_name}, {multimodal_storage_bucket_name}")
for name in (bucket_name, multimodal_storage_bucket_name):
    try:
        empty_and_delete_bucket(name)
    except Exception as e:
        print(f"      ❌ Error emptying and deleting S3 Bucket {name}: {str(e)}\n")

# Step 4: Delete IAM Role and policies
print(f"[4/4] Deleting IAM Role: {roleName}")
try:
    # List and detach all attached policies
    attached_policies = iam_client.list_attached_role_policies(RoleName=roleName).get('AttachedPolicies', [])
    
    for policy in attached_policies:
        print(f"      Detaching policy: {policy['PolicyName']}")
        iam_client.detach_role_policy(RoleName=roleName, PolicyArn=policy['PolicyArn'])
        
        # Delete the policy
        try:
            iam_client.delete_policy(PolicyArn=policy['PolicyArn'])
            print(f"      Deleted policy: {policy['PolicyName']}")
        except Exception as e:
            print(f"      Warning: Could not delete policy {policy['PolicyName']}: {e}")
    
    # Delete the role
    iam_client.delete_role(RoleName=roleName)
    print(f"      ✅ IAM Role deleted successfully\n")
    
except Exception as e:
    print(f"      ❌ Error deleting IAM Role: {str(e)}\n")

print(f"{'='*70}")
print("✅ Cleanup completed successfully!")
print(f"{'='*70}\n")