# File Name: simple_multimodal_data_prep.ipynb
### Location: Chapter 19
### Purpose: 
#####       1. Processes a dataset of images, saving them to a specified directory while generating and storing metadata descriptions.
#####       2. Generates multimodal embeddings from an image or a text description by invoking a model through the Bedrock runtime client and returning the resulting embeddings.
#####       3. Adds the multimodal embeddings to each dictionary in image_metadata_list.
##### Dependency: simple-sageMaker-bedrock.ipynb from Chapter 3 should run without errors before this notebook is used.
# <ins>-----------------------------------------------------------------------------------</ins>

# <ins>Amazon SageMaker Studio Classic</ins>
#### If you are new to Amazon SageMaker Studio Classic, follow this link for details: https://docs.aws.amazon.com/sagemaker/latest/dg/studio.html

# <ins>Environment setup of Kernel</ins>
##### Fill "Image" as "Data Science"
##### Fill "Kernel" as "Python 3"
##### Fill "Instance type" as "ml.t3.medium"
##### Fill "Start-up script" as "No Scripts"
##### Click "Select"

###### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-create-open.html for details.

# <ins>Mandatory installation on the kernel through pip</ins>

##### This lab works with the software versions listed below. If you use newer versions of boto3, awscli, and botocore, the code may fail, and you might need to update the corresponding API calls.

##### You will see pip dependency errors. You can safely ignore them and continue executing the remaining cells.

In [None]:
%pip install --no-build-isolation --force-reinstall -q \
    "boto3>=1.28.57" \
    "awscli>=1.29.57" \
    "botocore>=1.31.57" \
    "utils"

# <ins>Restart the kernel</ins>

In [None]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

# <ins>Python package import</ins>

##### boto3 provides clients for Amazon Bedrock that expose its API operations.
##### botocore is the low-level interface to AWS services; boto3 is built on top of botocore and adds higher-level features.

In [None]:
import json
import os
import boto3
import botocore
import warnings
import time
import pandas as pd
import re
import base64
import numpy as np
import seaborn as sns
from PIL import Image
from io import BytesIO
from tqdm import tqdm
import sagemaker
from utils import *
from datasets import load_dataset
from IPython.display import display
import matplotlib.pyplot as plt

### Ignore warnings

In [None]:
warnings.filterwarnings('ignore')

## Define important environment variables

In [None]:
# Try-except block to handle potential errors
try:
    # Create a new Boto3 session to interact with AWS services
    # This session is responsible for managing credentials and region configuration
    boto3_session = boto3.session.Session()

    # Retrieve the current AWS region from the session (e.g., 'us-east-1', 'us-west-2')
    aws_region_name = boto3_session.region_name
    
    # Initialize Bedrock and Bedrock Runtime clients using Boto3
    # These clients will allow interactions with Bedrock-related AWS services
    boto3_bedrock_client = boto3.client('bedrock', region_name=aws_region_name)
    boto3_bedrock_runtime_client = boto3.client('bedrock-runtime', region_name=aws_region_name)
    
    # Retrieve the execution role ARN of the current SageMaker environment
    # The role ARN is used to authorize SageMaker to perform tasks on behalf of the user
    sagemaker_role_arn = sagemaker.get_execution_role()
    
    # Select Amazon titan-embed-image-v1 as Embedding model for multimodal indexing
    multimodal_embed_model_id = "amazon.titan-embed-image-v1"

    # Store all relevant variables in a dictionary for easier access and management
    variables_store = {
        "aws_region_name": aws_region_name,                          # AWS region name
        "boto3_bedrock_client": boto3_bedrock_client,                # Bedrock client instance
        "boto3_bedrock_runtime_client": boto3_bedrock_runtime_client,  # Bedrock Runtime client instance
        "boto3_session": boto3_session,                               # Current Boto3 session object
        "sagemaker_role_arn" : sagemaker_role_arn,
        "multimodal_embed_model_id": multimodal_embed_model_id
    }

    # Print all stored variables for debugging and verification
    for var_name, value in variables_store.items():
        print(f"{var_name}: {value}")

# Handle any exceptions that occur during the execution
except Exception as e:
    # Print the error message if an unexpected error occurs
    print(f"An unexpected error occurred: {e}")


# Vector DB creation architecture

<img src="./vector_db_arc_diagram.png" style="width: 600px; height: 400px;">

# Dataset for this use case

### Refer to: https://huggingface.co/datasets/ashraq/fashion-product-images-small

### Metadata

    dataset_info:
      features:
        - name: id
          dtype: int64
        - name: gender
          dtype: string
        - name: masterCategory
          dtype: string
        - name: subCategory
          dtype: string
        - name: articleType
          dtype: string
        - name: baseColour
          dtype: string
        - name: season
          dtype: string
        - name: year
          dtype: float64
        - name: usage
          dtype: string
        - name: productDisplayName
          dtype: string
        - name: image
          dtype: image
      splits:
        - name: train
          num_bytes: 546202015.44
          num_examples: 44072
      download_size: 271496441
      dataset_size: 546202015.44


### Note: We use only the first 200 records of this dataset to save computation time
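
The processing cell further down selects rows with `dataset.select(range(max_images))`, which takes the leading rows of the split. If a random subset is preferred, one option is to draw the row indices first and pass them to `select`. The following is a minimal sketch using only the standard library; `sample_indices` and its `seed` parameter are hypothetical names and are not part of this notebook.

```python
import random


def sample_indices(total_rows, sample_size, seed=42):
    """Return `sample_size` distinct row indices drawn reproducibly from range(total_rows)."""
    if sample_size > total_rows:
        raise ValueError(f"Cannot draw {sample_size} rows from {total_rows}.")
    rng = random.Random(seed)  # local generator, so the global random state is untouched
    return sorted(rng.sample(range(total_rows), sample_size))


# 44072 is the size of the train split listed in the metadata above
indices = sample_indices(total_rows=44072, sample_size=200)
print(len(indices), indices[:5])
```

Once the dataset is loaded, `fashion_product_images["train"].select(indices)` should return the sampled rows.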

In [None]:
%%time
fashion_product_images = load_dataset("ashraq/fashion-product-images-small")

# Check the dataset structure and metadata
print(fashion_product_images)

# Look at one sample record
fashion_product_images["train"][5]

# Function to display the image

    The display_image function validates, resizes, saves, and displays an image, and prints an error message instead of raising an exception if any step fails.
    It first checks that the input is a PIL image and that it meets the minimum size requirement.
    If the image passes the checks, it is resized to the specified target size and saved to a file.
    The function then displays the saved image inline and prints a message with the original and resized image sizes.
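
Note that `image.resize(target_size)` stretches the image to exactly the target size, so a non-square image is distorted when the target is square. If that matters, the target size can be computed so that the aspect ratio is preserved. The sketch below shows only the size arithmetic; `fit_within` is a hypothetical helper, and the 60 x 80 input is an illustrative size rather than a value taken from the dataset.

```python
def fit_within(original_size, target_size):
    """Scale (width, height) to fit inside target_size while preserving the aspect ratio."""
    width, height = original_size
    max_width, max_height = target_size
    if width <= 0 or height <= 0:
        raise ValueError(f"Invalid image size: {original_size}")
    # The smaller of the two ratios is the largest scale that fits in both directions
    scale = min(max_width / width, max_height / height)
    return max(1, round(width * scale)), max(1, round(height * scale))


# Illustrative portrait image that should fit inside a 128 x 128 box
print(fit_within((60, 80), (128, 128)))
```

The returned tuple can be passed to `image.resize(...)` in place of the fixed `target_size`.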

In [None]:
%%time
def display_image(image, min_size=(64, 64), target_size=(128, 128), save_path="resized_image.png"):
    try:
        if image is None:
            raise ValueError("No image provided.")
        
        if not isinstance(image, Image.Image):
            raise TypeError("The input is not a valid PIL Image instance.")
        
        if image.size[0] < min_size[0] or image.size[1] < min_size[1]:
            raise ValueError(
                f"Image size {image.size} is smaller than the minimum required size {min_size}."
            )
        
        # Resize the image
        resized_image = image.resize(target_size)
        
        # Save resized image
        resized_image.save(save_path)
        print(f"Image resized and saved successfully. Original size: {image.size}, Resized size: {resized_image.size}")
        
        # Display inline
        saved_image = Image.open(save_path)
        display(saved_image)
        print(f"Image displayed inline from saved file: {save_path}")
    
    except Exception as e:
        print(f"Error displaying image: {e}")


# Example usage: the image's own size is passed as min_size, so the size check always passes
sample_image = fashion_product_images["train"][10]["image"]
display_image(sample_image, min_size=sample_image.size)

# Processing a dataset of images, saving them to a specified directory while generating and storing metadata descriptions

    The code below provides a modular approach to processing and saving images from a dataset. 
    It includes three main functions: save_image, generate_description, and process_images. 
    The save_image function ensures proper error handling and validation before saving the image to the specified path. 
    The generate_description function formats and returns a string description from the metadata associated with each image, including attributes like gender, category, color, and season. 
    The process_images function iterates through a subset of the dataset, saving each image and generating its metadata description. 

##### The example usage processes a limited number of images (200) for efficiency and stores them in the downloaded_images folder. You can run it on the entire dataset, but be aware of the execution cost of doing so.
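
`generate_description` calls `int(item['year'])`, which raises an exception when the year is `None` or NaN, and `process_images` then skips that image. The `float64` dtype of `year` suggests that missing values may occur, although that is an assumption here. A tolerant variant could write a placeholder instead. This is a minimal sketch; `is_missing`, `build_description`, and the sample record are hypothetical.

```python
import math

# (label in the description, key in the dataset item), in the order used by generate_description
DESCRIPTION_FIELDS = [
    ("gender", "gender"),
    ("master_category", "masterCategory"),
    ("sub_category", "subCategory"),
    ("article_type", "articleType"),
    ("base_colour", "baseColour"),
    ("season", "season"),
    ("year", "year"),
    ("usage", "usage"),
    ("productDisplayName", "productDisplayName"),
]


def is_missing(value):
    """True for None and float NaN, the two forms a missing field is assumed to take."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def build_description(item):
    """Format item metadata as 'label: value' pairs, writing 'unknown' for missing fields."""
    parts = []
    for label, key in DESCRIPTION_FIELDS:
        value = item.get(key)
        if is_missing(value):
            text = "unknown"
        elif key == "year":
            text = str(int(value))  # 2011.0 -> "2011"
        else:
            text = str(value)
        parts.append(f"{label}: {text}")
    return ", ".join(parts)


# Hypothetical record with a missing year and no 'usage' key
sample_item = {"gender": "Men", "masterCategory": "Apparel", "subCategory": "Topwear",
               "articleType": "Shirts", "baseColour": "Blue", "season": "Summer",
               "year": float("nan"), "productDisplayName": "Blue casual shirt"}
print(build_description(sample_item))
```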

In [None]:
%%time
def save_image(image, image_path):
    """
    Saves an image to the specified path with error handling.
    
    Args:
        image (PIL.Image.Image): The image to save.
        image_path (str): The path to save the image.
    
    Raises:
        ValueError: If no image is provided.
        TypeError: If the image is not a PIL Image instance.
    """
    try:
        if image is None:
            raise ValueError("No image provided for saving.")
        if not isinstance(image, Image.Image):
            raise TypeError("Provided image is not a valid PIL Image instance.")
        
        image.save(image_path)
        print(f"Image saved successfully to {image_path}")
    except Exception as e:
        print(f"Error saving image: {e}")
        raise

def generate_description(item):
    """
    Generates a description string from the dataset item.
    
    Args:
        item (dict): A dictionary containing image metadata.
    
    Returns:
        str: A formatted description string.
    """
    try:
        return (f"gender: {item['gender']}, master_category: {item['masterCategory']}, "
                f"sub_category: {item['subCategory']}, article_type: {item['articleType']}, "
                f"base_colour: {item['baseColour']}, season: {item['season']}, "
                f"year: {int(item['year'])}, usage: {item['usage']}, productDisplayName: {item['productDisplayName']}")
    except KeyError as e:
        print(f"Missing key in dataset item: {e}")
        raise
    except Exception as e:
        print(f"Error generating description: {e}")
        raise

def process_images(dataset, output_dir, max_images=100):
    """
    Processes images from the dataset and saves them to the specified directory.
    
    Args:
        dataset (datasets.Dataset): A dataset split of image items; it must support .select().
        output_dir (str): The directory to save images.
        max_images (int): The maximum number of images to process.
    
    Returns:
        list: A list of dictionaries containing image metadata and file paths.
    """
    try:
        os.makedirs(output_dir, exist_ok=True)
        print(f"Output directory ready: {output_dir}")

        image_data_list = []

        # Never request more rows than the dataset contains
        num_images = min(max_images, len(dataset))
        for index, item in enumerate(dataset.select(range(num_images))):
            try:
                image_id = item["id"]
                image = item["image"]
                image_filename = f"image_{image_id}.jpg"
                image_path = os.path.join(output_dir, image_filename)
                
                # Save image
                save_image(image, image_path)
                
                # Generate description
                description = generate_description(item)
                
                # Append metadata to the list
                image_data_list.append({
                    "ID": image_id,
                    "Description": description,
                    "Image_path": image_path
                })
            except Exception as e:
                print(f"Error processing image {index + 1}: {e}")
        
        print(f"Processed {len(image_data_list)} images successfully.")
        return image_data_list
    
    except Exception as e:
        print(f"Error processing dataset: {e}")
        raise



output_directory = "downloaded_images"
max_images = 200  # Use the first 200 records to reduce computation time; increase this to process more of the dataset.

try:
    image_metadata_list = process_images(fashion_product_images["train"], output_directory, max_images)
    print()
    print()
    print("First image metadata:", image_metadata_list[0])
except Exception as e:
    print(f"Unhandled error: {e}")

# Generates multimodal embeddings by accepting an image or text description, processes the input, and invokes a model via the Bedrock runtime client, returning the resulting embeddings

##### The get_titan_multimodal_embedding function generates a multimodal embedding from an image, a text description, or both. The embedding size is configurable and defaults to 1024. If an image path is provided, the function checks that the file exists, reads it, and adds it to the payload as a base64-encoded string; a missing file raises FileNotFoundError. If a description is provided, it is added to the payload as well. At least one of the two inputs is required, and an assertion fails if neither is given. The payload, together with the embedding configuration, is sent to the model through the Bedrock runtime client, and the response is returned as a parsed JSON object. If the invocation itself fails, the error is printed and the function returns None.
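
The request body described above can be assembled and inspected without calling Bedrock. The sketch below builds the same JSON structure from raw bytes and checks the inputs up front. `build_titan_request_body` is a hypothetical helper, the allowed sizes are the ones listed in the function's parameter comment, and the image bytes are placeholders rather than a real image.

```python
import base64
import json

# Sizes named in the notebook's parameter comment; treated here as the only valid options
ALLOWED_DIMENSIONS = {256, 384, 1024}


def build_titan_request_body(image_bytes=None, description=None, dimension=1024):
    """Return the JSON request body for an image, a text description, or both."""
    if dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"dimension must be one of {sorted(ALLOWED_DIMENSIONS)}, got {dimension}")
    if not image_bytes and not description:
        raise ValueError("Provide an image, a text description, or both.")

    body = {"embeddingConfig": {"outputEmbeddingLength": dimension}}
    if image_bytes:
        # Binary image data must be sent as a base64-encoded string
        body["inputImage"] = base64.b64encode(image_bytes).decode("utf8")
    if description:
        body["inputText"] = description
    return json.dumps(body)


# Placeholder bytes stand in for the contents of an image file
request_body = build_titan_request_body(
    image_bytes=b"placeholder image bytes",
    description="blue casual shirt",
    dimension=384,
)
print(sorted(json.loads(request_body).keys()))
```

Raising `ValueError` instead of using `assert` is a design choice in this sketch: assertions are skipped when Python runs with the `-O` flag, while an explicit exception is always raised.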

In [None]:
%%time
def get_titan_multimodal_embedding(
    image_path: str = None,  # Maximum image dimensions: 2048 x 2048 pixels
    description: str = None,  # Text description in English (max 128 tokens)
    dimension: int = 1024,  # Desired embedding dimension (default 1024, other options: 384, 256)
    model_id: str = multimodal_embed_model_id  # Predefined model ID for the multimodal embedding
):
    """
    Function to obtain multimodal embeddings by providing either an image or a text description.
    
    Args:
        image_path (str): Path to the image file (optional).
        description (str): Text description for embedding (optional).
        dimension (int): The dimensionality of the embedding output (default is 1024).
        model_id (str): Model identifier for the multimodal embedding model.
    
    Returns:
        dict: The parsed model response containing the multimodal embeddings, or None if the invocation fails.
    
    Raises:
        FileNotFoundError: If the image file does not exist at the given path.
        AssertionError: If neither image nor description is provided.
    """
    
    # Initialize the payload to send to the model
    payload_body = {}

    # Embedding configuration with the specified output dimension
    embedding_config = {
        "embeddingConfig": { 
            "outputEmbeddingLength": dimension
        }
    }
    
    # Process image input if provided
    if image_path:
        # Check if the provided image path exists locally
        if os.path.exists(image_path):
            # Open the image file in binary mode and encode it in base64
            with open(image_path, "rb") as image_file:
                encoded_image = base64.b64encode(image_file.read()).decode('utf8')
            # Add the base64 encoded image to the payload
            payload_body["inputImage"] = encoded_image
        else:
            # Raise an error if the image file does not exist
            raise FileNotFoundError(f"The image file at {image_path} does not exist.")
    
    # Process text description input if provided
    if description:
        payload_body["inputText"] = description

    # Ensure that an image, a text description, or both are provided for the request
    assert payload_body, "Please provide an image, a text description, or both."

    try:
        # Invoke the model using the Bedrock runtime client to get multimodal embeddings
        response = boto3_bedrock_runtime_client.invoke_model(
            body=json.dumps({**payload_body, **embedding_config}), 
            modelId=model_id,
            accept="application/json", 
            contentType="application/json"
        )
        # Return the parsed JSON response from the model
        return json.loads(response.get("body").read())

    except Exception as e:
        # Handle any exceptions that might occur during the model invocation
        print(f"An error occurred while invoking the model: {e}")
        return None


# Example usage: Processing a list of image metadata to obtain embeddings
multimodal_embeddings_img = []

# Assuming image_metadata_list contains metadata with image paths
for idx, image_metadata in enumerate(image_metadata_list):
    # Get the multimodal embedding for each image
    embedding = get_titan_multimodal_embedding(image_path=image_metadata["Image_path"], dimension=1024)
    # Store the obtained embedding
    multimodal_embeddings_img.append(embedding)

# Add the multimodal embeddings to each dictionary in image_metadata_list
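
Because `get_titan_multimodal_embedding` returns `None` when an invocation fails, the list of responses may contain entries without an embedding. One way to handle this is to attach the embeddings that exist and report the positions that failed, so they can be retried. This is a minimal sketch; `attach_embeddings` and the sample data are hypothetical.

```python
def attach_embeddings(metadata_items, responses, key="embedding_img"):
    """Copy each response's 'embedding' into its metadata item; return the indices that failed."""
    if len(metadata_items) != len(responses):
        raise ValueError(
            f"Expected {len(metadata_items)} responses, got {len(responses)}."
        )
    failed_indices = []
    for index, (item, response) in enumerate(zip(metadata_items, responses)):
        # A failed call yields None; an error payload may lack the 'embedding' key
        if response is None or "embedding" not in response:
            failed_indices.append(index)
            continue
        item[key] = response["embedding"]
    return failed_indices


# Hypothetical data: the second call failed and the third returned an error payload
items = [{"ID": 1}, {"ID": 2}, {"ID": 3}]
responses = [{"embedding": [0.1, 0.2]}, None, {"message": "error"}]
failed = attach_embeddings(items, responses)
print(f"Attached {len(items) - len(failed)} embeddings; failed indices: {failed}")
```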

In [None]:
%%time
try:
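    for idx, item in enumerate(image_metadata_list):
        # Ensure the index exists in the multimodal_embeddings_img list
        if idx >= len(multimodal_embeddings_img):
            raise IndexError(f"Index {idx} out of range in multimodal_embeddings_img.")
        # get_titan_multimodal_embedding returns None when the model invocation fails
        if multimodal_embeddings_img[idx] is None:
            raise ValueError(f"No embedding was returned for item {idx}; re-run the embedding cell.")
        item["embedding_img"] = multimodal_embeddings_img[idx]["embedding"]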
    
    # Example: Print the first entry with the embedding
    print(image_metadata_list[0])

except IndexError as e:
    # Handle case where the index is out of range
    print(f"IndexError: {e}")

except KeyError as e:
    # Handle case where expected keys are missing in the dictionaries
    print(f"KeyError: Missing key {e} in the item dictionary.")

except Exception as e:
    # Catch any other unforeseen errors
    print(f"An unexpected error occurred: {e}")

# Plot similarity heatmap

###### The plot_similarity_heatmap function visualizes the similarity between two sets of embeddings as a heatmap. Each cell holds the inner product (dot product) of one embedding from each set, a common measure of vector similarity; for unit-length vectors it equals the cosine similarity.
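
The heatmap fixes its colour scale at a maximum of 1, which fits the inner product only when the vectors have unit length. This notebook does not state whether the model's embeddings are normalized, so one cautious option is to normalize them before taking the inner product, which gives the cosine similarity. The sketch below uses plain Python lists to stay dependency-free; `l2_normalize` and `cosine_similarity_matrix` are hypothetical helpers and the vectors are toy values.

```python
import math


def l2_normalize(vector):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("Cannot normalize a zero vector.")
    return [x / norm for x in vector]


def cosine_similarity_matrix(set_a, set_b):
    """Return matrix[i][j] = cosine similarity between set_a[i] and set_b[j]."""
    unit_a = [l2_normalize(v) for v in set_a]
    unit_b = [l2_normalize(v) for v in set_b]
    # The inner product of unit vectors is their cosine similarity
    return [[sum(x * y for x, y in zip(u, v)) for v in unit_b] for u in unit_a]


# Toy vectors: the second is close to the first, the third points the opposite way
vectors = [[3.0, 4.0], [4.0, 3.0], [-3.0, -4.0]]
matrix = cosine_similarity_matrix(vectors, vectors)
for row in matrix:
    print([round(value, 2) for value in row])
```

With NumPy arrays, the same normalization can be done by dividing each row by `np.linalg.norm(embeddings, axis=1, keepdims=True)` before calling `np.inner`.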

In [None]:
%%time
def plot_similarity_heatmap(embeddings_set_a, embeddings_set_b):
    """
    Function to plot a heatmap showing the similarity between two sets of embeddings.

    Args:
        embeddings_set_a (list or np.array): First set of embeddings (e.g., image or text embeddings).
        embeddings_set_b (list or np.array): Second set of embeddings (e.g., image or text embeddings).

    Note:
        Invalid input (for example, embeddings that are not 2D arrays) is reported
        with a printed message rather than a raised exception.
    """
    try:
        # Ensure embeddings are numpy arrays (and are 2D)
        embeddings_set_a = np.array(embeddings_set_a)
        embeddings_set_b = np.array(embeddings_set_b)

        # Check if embeddings are 2D arrays
        if embeddings_set_a.ndim != 2 or embeddings_set_b.ndim != 2:
            raise ValueError("Both embeddings must be 2D arrays.")
        
        # Compute the inner product (dot product) between the embeddings
        similarity_matrix = np.inner(embeddings_set_a, embeddings_set_b)

        # Create a heatmap to visualize the similarity
        sns.set(font_scale=1.1)
        plt.figure(figsize=(10, 8))  # Optional: Adjust the figure size for better readability
        heatmap = sns.heatmap(
            similarity_matrix,
            vmin=np.min(similarity_matrix),
            vmax=1,
            cmap="OrRd",
            cbar_kws={'label': 'Similarity'}
        )
        plt.title('Embedding Similarity Heatmap')
        plt.show()

    except ValueError as ve:
        # Handle invalid data format for embeddings (not 2D)
        print(f"Error: {ve}")
    except Exception as e:
        # Handle any other unexpected errors
        print(f"An error occurred: {e}")

# Example usage (assuming 'embedding_img' contains embeddings)
try:
    
    image_metadata_list_df = pd.DataFrame(image_metadata_list)
    
    # Ensure 'embedding_img' is a list or array of embeddings, here applying the transformation
    embeddings_sample = image_metadata_list_df['embedding_img'][:20].apply(lambda x: np.array(x)).tolist()  # Convert embeddings to numpy arrays

    # Plot similarity heatmap using the same embeddings for both inputs
    plot_similarity_heatmap(embeddings_sample, embeddings_sample)

except KeyError as ke:
    print(f"Error: The specified column 'embedding_img' was not found in the dataframe. {ke}")
except Exception as e:
    print(f"An unexpected error occurred while preparing the data: {e}")


In [None]:
%store image_metadata_list_df aws_region_name sagemaker_role_arn multimodal_embed_model_id max_images image_metadata_list

# End of Notebook 

## Please make sure you shut down the kernel after using this notebook to avoid any potential charges to your account.

## Process: Open the "Kernel" menu at the top and choose "Shut Down Kernel" if you are not going on to run the next notebook, simple_multimodal_knwl_bases_building.ipynb.
##### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/studio-ui.html