# File Name: advanced_image_patterns_part1.ipynb
### Location: Chapter 18
### Purpose: 
#####       1. Perfecting Prompt for Image
#####       2. Image Embedding
#####       3. Image to Image
##### Dependency: simple-sageMaker-bedrock.ipynb at Chapter 3 should work properly.
# <ins>-----------------------------------------------------------------------------------</ins>

# <ins>Amazon SageMaker Classic</ins>
#### If you are new to Amazon SageMaker Classic, follow this link for details: https://docs.aws.amazon.com/sagemaker/latest/dg/studio.html

# <ins>Environment setup of Kernel</ins>
##### Fill "Image" as "Data Science"
##### Fill "Kernel" as "Python 3"
##### Fill "Instance type" as "ml.t3.medium"
##### Fill "Start-up script" as "No Scripts"
##### Click "Select"

###### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/notebooks-create-open.html for details.

# <ins>Mandatory installation on the kernel through pip</ins>

##### This lab works with the software versions it was tested against. If you use the latest versions of boto3, awscli, and botocore, the code may fail, and you might need to change the corresponding API calls.

##### You will see pip dependency errors. You can safely ignore these errors and continue executing the rest of the cells.

In [None]:
%%time 

%pip install --no-build-isolation --force-reinstall -q \
    "boto3" \
    "awscli" \
    "botocore" \
    "utils" \
    "matplotlib" \
    "sagemaker" \
    "numpy<2"

# <ins>Disclaimer</ins>

##### You will see pip dependency errors. You can safely ignore these errors and continue executing the rest of the cells.

# <ins>Restart the kernel</ins>

In [None]:
# restart kernel
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")
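
Once the kernel is back, it can help to confirm which package versions were actually installed, since the API calls in this notebook may behave differently across releases. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is illustrative and not part of the original notebook.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for the given distribution names."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # Not installed in this kernel
    return versions


# Package names taken from the pip install cell above
for name, version in installed_versions(
    ["boto3", "botocore", "awscli", "sagemaker", "numpy"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

Recording these versions alongside any error message makes it easier to tell whether a failure comes from an API change.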

# <ins>Python package import</ins>

##### boto3 offers various clients for Amazon Bedrock to execute various actions.
##### botocore is a low-level interface to AWS tools, while boto3 is built on top of botocore and provides additional, higher-level features.

In [None]:
# Standard library
import base64
import io
import json
import os
import random
import sys
import warnings
from io import BytesIO

# Third-party packages
import boto3
import botocore
import matplotlib.pyplot as plt
import sagemaker
from PIL import Image

### Ignore warnings

In [None]:
warnings.filterwarnings('ignore')

## Define important environment variables

In [None]:
%%time 

# Try-except block to handle potential errors during execution
try:
    # Create a new Boto3 session to interact with AWS services
    # This session manages credentials and region configuration for AWS interactions
    boto3_session = boto3.session.Session()

    # Retrieve the current AWS region from the session (e.g., 'us-east-1', 'us-west-2')
    aws_region_name = boto3_session.region_name

    # Initialize Bedrock and Bedrock Runtime clients using Boto3
    # These clients enable interactions with AWS Bedrock-related services
    boto3_bedrock_client = boto3.client('bedrock', region_name=aws_region_name)
    boto3_bedrock_runtime_client = boto3.client('bedrock-runtime', region_name=aws_region_name)

    # Retrieve the SageMaker execution role ARN
    # The role ARN authorizes SageMaker to perform tasks on behalf of the user
    sagemaker_role_arn = sagemaker.get_execution_role()

    # Specify the Amazon Titan image generator model ID for image generation
    amazon_titan_image_model_id = "amazon.titan-image-generator-v2:0"

    # Specify the Amazon Titan embedding model ID for multimodal indexing
    multimodal_embed_model_id = "amazon.titan-embed-image-v1"

    # Store all relevant variables in a dictionary for easier access and management
    variables_store = {
        "aws_region_name": aws_region_name,                           # AWS region name
        "boto3_bedrock_client": boto3_bedrock_client,                 # Bedrock client instance
        "boto3_bedrock_runtime_client": boto3_bedrock_runtime_client, # Bedrock Runtime client instance
        "boto3_session": boto3_session,                               # Current Boto3 session object
        "sagemaker_role_arn": sagemaker_role_arn,                     # SageMaker execution role ARN
        "multimodal_embed_model_id": multimodal_embed_model_id,       # Titan embedding model ID
        "amazon_titan_image_model_id": amazon_titan_image_model_id    # Titan image generator model ID
    }

    # Print all stored variables for debugging and verification
    for var_name, value in variables_store.items():
        print(f"{var_name}: {value}")

# Handle any exceptions that occur during the execution
except Exception as e:
    # Print an error message if an unexpected error occurs
    print(f"An unexpected error occurred: {e}")
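
One failure mode worth anticipating in the cell above: `boto3_session.region_name` is `None` when no region is configured, and the client calls then fail inside the `try` block. The sketch below shows one way to fall back to an explicit region. The helper name `resolve_region` and the default `"us-east-1"` are assumptions for illustration; choose a region where the Bedrock models you need are enabled for your account.

```python
def resolve_region(session_region, fallback_region="us-east-1"):
    """Return the session's region, or a fallback when none is configured.

    fallback_region is a hypothetical default, not a value from the notebook.
    """
    if session_region:
        return session_region
    print(f"No region configured; falling back to {fallback_region}")
    return fallback_region


# In this notebook the call would be resolve_region(boto3_session.region_name)
print(resolve_region(None))
print(resolve_region("us-west-2"))
```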


# Text-to-Image Generation with Amazon Bedrock: Payload Creation and Model Invocation

    The code below defines two functions that work together to generate images with an Amazon Bedrock image model. The create_text_to_image_payload function builds the JSON payload for an InvokeModel request: the prompt, an optional negative prompt, the number of images, image quality, dimensions, CFG scale, and a seed for reproducibility. When a Base64-encoded source image is supplied, it builds an IMAGE_VARIATION payload instead of a TEXT_IMAGE payload. The invoke_text_to_image_model function sends the payload to the model through the invoke_model method of the Boto3 Bedrock runtime client, parses the response body containing the generated image data, and returns it as a dictionary.

In [None]:
%%time
def create_text_to_image_payload(prompt, encoded_image=None, negative_prompts=None, num_images=1, quality="standard", 
                                  height=1024, width=1024, cfg_scale=7.5, seed=42):
    """
    Create the payload for the text-to-image Bedrock model invocation.

    This function constructs the JSON payload required for making an InvokeModel request to Amazon Bedrock
    to generate images based on a text prompt. When making an InvokeModel request, the body field is populated
    with a JSON object that specifies the task type (in this case, "TEXT_IMAGE") and various configuration parameters.
    The Amazon Titan models support the following parameters:
    
    - cfgScale: Controls how much the final image reflects the input prompt. A higher value means the image
      will more closely align with the prompt.
    - seed: A numeric value used to initialize the image generation process. The same seed, when used with the
      same prompt and settings, will produce identical results.
    - numberOfImages: Defines how many images the model should generate, ranging from 1 to 5.
    - quality: Specifies the quality of the output image. You can choose either "standard" or "premium".
    
    Args:
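        prompt (str): The main prompt that describes the image to be generated.
        encoded_image (str): Optional Base64-encoded source image. When provided, an IMAGE_VARIATION payload
                             is built instead of a TEXT_IMAGE payload.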
        negative_prompts (str): Optional negative prompts that help exclude certain features from the generated image.
        num_images (int): The number of images to generate, which can be between 1 and 5.
        quality (str): The quality of the generated image. It can either be "standard" or "premium".
        height (int): The height of the image in pixels (default is 1024).
        width (int): The width of the image in pixels (default is 1024).
        cfg_scale (float): The CFG scale value determines how closely the generated image matches the prompt. 
                            It ranges from 1.0 (exclusive) to 10.0 (default is 7.5).
        seed (int): A seed number to initialize the generation process. The same seed with the same prompt and settings 
                    will always produce the same image (default is 42).

    Returns:
        str: A JSON string containing the constructed payload to be sent in the InvokeModel request.
    """
    
    if not encoded_image:
        # No source image (None or empty string): build a text-to-image payload,
        # adding negativeText only when negative prompts are provided
        payload = {
            "taskType": "TEXT_IMAGE",  # Specifies the task type: text-to-image generation.
            "textToImageParams": {
                "text": prompt,  # The main prompt that describes the image to be generated (Required).
                **({"negativeText": negative_prompts} if negative_prompts else {})  # Add negativeText only if it's provided.
            },
            "imageGenerationConfig": {
                "numberOfImages": num_images,  # The number of images to generate (1 to 5).
                "quality": quality,  # Image quality: 'standard' or 'premium'.
                "height": height,  # Image height in pixels.
                "width": width,  # Image width in pixels.
                "cfgScale": cfg_scale,  # CFG scale (1.0 exclusive to 10.0, higher values create more prompt alignment).
                "seed": seed  # Seed for reproducibility of results (same seed gives the same image).
            }
        }
    else:
        payload = {
            "taskType": "IMAGE_VARIATION",  # Specifies the task type: variation of a source image.
            "imageVariationParams":{
                "text": prompt,  # The main prompt that describes the image to be generated (Required).
                "images":[encoded_image],
                **({"negativeText": negative_prompts} if negative_prompts else {})  # Add negativeText only if it's provided.
            },
            "imageGenerationConfig": {
                "numberOfImages": num_images,  # The number of images to generate (1 to 5).
                "quality": quality,  # Image quality: 'standard' or 'premium'.
                "height": height,  # Image height in pixels.
                "width": width,  # Image width in pixels.
                "cfgScale": cfg_scale,  # CFG scale (1.0 exclusive to 10.0, higher values create more prompt alignment).
                "seed": seed  # Seed for reproducibility of results (same seed gives the same image).
            }
        }
       
    return json.dumps(payload)


def invoke_text_to_image_model(runtime_client, model_id, payload):
    """
    Invoke the Bedrock text-to-image model.

    This function sends the constructed JSON payload to the Amazon Bedrock service to generate images.
    The payload contains the prompt and configuration details, and the request is processed by the 
    specified model (model_id). The response is parsed and returned as a dictionary.
    
    Args:
        runtime_client (boto3.client): The Boto3 Bedrock runtime client for invoking the model.
        model_id (str): The identifier for the Bedrock model to be used for text-to-image generation.
        payload (str): The JSON string containing the request payload that includes prompt and configuration.

    Returns:
        dict: The parsed response from the Bedrock model, containing the generated image data.
    """
    try:
        # Making the request to invoke the model with the provided payload
        response = runtime_client.invoke_model(
            body=payload,  # The payload for the model invocation.
            modelId=model_id,  # Model ID for the text-to-image generation.
            accept="application/json",  # Expected response format.
            contentType="application/json"  # The content type for the request body.
        )
        # Parsing the JSON response body and returning it as a dictionary
        return json.loads(response.get("body").read())
    except Exception as e:
        print(f"Error invoking text-to-image model: {e}")  # Handling any errors during invocation
        raise  # Raising the exception for further handling if necessary
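
The parameter ranges listed in the docstring above (1 to 5 images, "standard" or "premium" quality, CFG scale above 1.0 and up to 10.0) can be checked locally before a request is sent. The following is a minimal sketch under those stated ranges; `validate_image_generation_config` and `build_text_image_payload` are hypothetical helpers, not part of the notebook or of the Bedrock API.

```python
import json


def validate_image_generation_config(num_images, quality, cfg_scale):
    """Raise ValueError when a value falls outside the ranges named in the docstring."""
    if not 1 <= num_images <= 5:
        raise ValueError(f"num_images must be between 1 and 5, got {num_images}")
    if quality not in ("standard", "premium"):
        raise ValueError(f"quality must be 'standard' or 'premium', got {quality!r}")
    if not 1.0 < cfg_scale <= 10.0:
        raise ValueError(f"cfg_scale must be in (1.0, 10.0], got {cfg_scale}")


def build_text_image_payload(prompt, negative_prompt=None, num_images=1,
                             quality="standard", cfg_scale=7.5, seed=42):
    """Build a TEXT_IMAGE payload after validating the configuration values."""
    validate_image_generation_config(num_images, quality, cfg_scale)
    text_params = {"text": prompt}
    if negative_prompt:
        # negativeText is included only when a negative prompt is supplied
        text_params["negativeText"] = negative_prompt
    return json.dumps({
        "taskType": "TEXT_IMAGE",
        "textToImageParams": text_params,
        "imageGenerationConfig": {
            "numberOfImages": num_images,
            "quality": quality,
            "height": 1024,
            "width": 1024,
            "cfgScale": cfg_scale,
            "seed": seed,
        },
    })


# Inspect the payload locally; no AWS call is made here
print(json.dumps(json.loads(build_text_image_payload("A calm lake at sunset")), indent=2))
```

Validating before the call turns a remote validation error into an immediate, readable `ValueError`.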

# Display Base64-encoded Image Using Matplotlib in Headless Environments

    The display_base64_image function displays a Base64-encoded image and saves it to disk. It decodes the Base64 string into image data, loads the image with the Python Imaging Library (PIL), displays it with matplotlib, and writes it to data/generated_image/generated_image_<image_number>.png.

In [None]:
%%time
def display_base64_image(image_b64, image_number):
    """
    Display a Base64-encoded image using matplotlib (works in headless environments).
    """
    try:
        # Decode the Base64 string
        image_data = base64.b64decode(image_b64)
        
        # Open the image using PIL
        image = Image.open(io.BytesIO(image_data))
        
        # Display the image using matplotlib
        plt.imshow(image)
        plt.axis('off')  # Hide axis
        plt.show()
        
        save_path = "data/generated_image"
        os.makedirs(save_path, exist_ok=True)  # Create folder if it doesn't exist
        
        image_file_name = os.path.join(save_path, f"generated_image_{image_number}.png")
        image.save(image_file_name)
        print(f"Image saved at: {image_file_name}")

    except Exception as e:
        print(f"Error displaying the Base64 image: {e}")
        raise
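
Before handing decoded bytes to PIL, a quick sanity check can catch a truncated or wrong payload early. The sketch below assumes the model returns PNG data, which the notebook's `.png` file names suggest but do not guarantee; `decode_png_base64` is an illustrative helper, not part of the notebook.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # First eight bytes of every PNG file


def decode_png_base64(image_b64):
    """Decode a Base64 string and confirm the result starts with the PNG signature."""
    # validate=True rejects characters outside the Base64 alphabet
    image_bytes = base64.b64decode(image_b64, validate=True)
    if not image_bytes.startswith(PNG_SIGNATURE):
        raise ValueError("Decoded data does not look like a PNG image")
    return image_bytes


# Stand-in data: a PNG signature followed by placeholder bytes, not a real image
sample_b64 = base64.b64encode(PNG_SIGNATURE + b"rest-of-file").decode("utf8")
print(f"Decoded {len(decode_png_base64(sample_b64))} bytes")
```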

# Generate and Display Image Using Amazon Bedrock Text-to-Image Model

    The image_invoke_model function generates an image from a prompt and an optional negative prompt using an Amazon Bedrock image model. It first builds the payload with the prompt, negative prompt, image dimensions, and configuration settings; if a Base64-encoded source image is passed, the payload requests an image variation. It then invokes the model through invoke_text_to_image_model, takes the first generated image from the response in Base64 format, prints a preview of the Base64 string, and displays and saves the image. Errors are printed rather than re-raised, so a failed prompt does not stop a loop over many prompts.

In [None]:
%%time
def image_invoke_model(prompt, image_number, encoded_image, negative_prompts=None):
    try:
        # Create payload
        
        payload = create_text_to_image_payload(
            prompt=prompt,
            negative_prompts=negative_prompts,
            num_images=1,
            quality="standard",
            height=1024,
            width=1024,
            cfg_scale=7.5,
            seed=42,
            encoded_image = encoded_image
        )

        # Invoke model
        print("Invoking Bedrock text-to-image model...")
        response = invoke_text_to_image_model(boto3_bedrock_runtime_client, amazon_titan_image_model_id, payload)

        # Process response
        image_b64_format = response["images"][0]
        print(f"Generated image (Base64 preview): {image_b64_format[0:80]}...")

        # Display the image and save it under data/generated_image
        display_base64_image(image_b64_format, image_number)

    # Report any failure from payload creation, invocation, or display
    except Exception as e:
        print(f"An error occurred in the main execution: {e}")
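
The function above reads only `response["images"][0]`. When more than one image is requested, or when the model reports a problem, the response can be unpacked more defensively. This is a sketch only: the `fake_response` dictionary stands in for a parsed response, and the presence of an `error` field alongside `images` is an assumption to verify against the current Titan response format.

```python
def extract_images(response):
    """Return the list of Base64 images from a parsed response, or raise on failure."""
    error = response.get("error")  # Assumed field; absent or None when all is well
    if error:
        raise RuntimeError(f"Model reported an error: {error}")
    images = response.get("images") or []
    if not images:
        raise RuntimeError("Response contained no images")
    return images


# Stand-in for json.loads(response["body"].read()); the strings are not real images
fake_response = {"images": ["aGVsbG8=", "d29ybGQ="], "error": None}
for index, image_b64 in enumerate(extract_images(fake_response), start=1):
    print(f"image {index}: {len(image_b64)} Base64 characters")
```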

# Section 1: Perfecting Prompt for Image

#### For a detailed explanation, see the "Perfecting Prompt for Image" subsection of Section 18.6.

#### All prompt and negative prompt examples

In [None]:
# JSON data with an additional "image_number" field
sample_test_json_data = [
    {
        "image_number": 1,
        "type": "Type of Image",
        "sub_type": "Photograph",
        "prompt": "A clear photograph of a calm lake surrounded by pine trees during sunset, with vibrant orange and pink hues in the sky.",
        "negative_prompt": ""
    },
    {
        "image_number": 2,
        "type": "Type of Image",
        "sub_type": "Sketch",
        "prompt": "A pencil sketch of a cozy cottage with a chimney, nestled in a snowy forest, detailed with shading to create depth.",
        "negative_prompt": ""
    },
    {
        "image_number": 3,
        "type": "Type of Image",
        "sub_type": "Painting",
        "prompt": "An oil painting of a vibrant sunflower field under a bright blue sky, inspired by Van Gogh's expressive brushstrokes.",
        "negative_prompt": ""
    },
    {
        "image_number": 4,
        "type": "Type of Image",
        "sub_type": "Digital Art",
        "prompt": "A digital artwork of a futuristic city with flying cars zipping past towering skyscrapers, illuminated by glowing holographic billboards.",
        "negative_prompt": ""
    },
    {
        "image_number": 5,
        "type": "Description",
        "sub_type": "Subject",
        "prompt": "A majestic elephant walking across the African savannah, with the sun setting behind it, casting long shadows.",
        "negative_prompt": ""
    },
    {
        "image_number": 6,
        "type": "Description",
        "sub_type": "Object",
        "prompt": "A classic pocket watch with intricate engravings, resting on a velvet cushion.",
        "negative_prompt": ""
    },
    {
        "image_number": 7,
        "type": "Description",
        "sub_type": "Environment",
        "prompt": "A tranquil beach at dawn, with soft waves lapping against the shore and a golden glow from the rising sun.",
        "negative_prompt": ""
    },
    {
        "image_number": 8,
        "type": "Description",
        "sub_type": "Scene",
        "prompt": "A vibrant carnival scene with colorful tents, performers, and joyful crowds under the bright lights of a summer night.",
        "negative_prompt": ""
    },
    {
        "image_number": 9,
        "type": "Style Keywords",
        "sub_type": "Hyper-Realistic",
        "prompt": "A hyper-realistic depiction of a bustling city street during a rainy night, with neon lights reflecting off the wet pavement.",
        "negative_prompt": ""
    },
    {
        "image_number": 10,
        "type": "Style Keywords",
        "sub_type": "Artistic (Classical Painting)",
        "prompt": "An impressionist-style painting inspired by Claude Monet, featuring a serene water lily pond with soft, blended brushstrokes.",
        "negative_prompt": ""
    },
    {
        "image_number": 11,
        "type": "Style Keywords",
        "sub_type": "Futuristic (Anime Style)",
        "prompt": "A futuristic anime-style cityscape with glowing skyscrapers, flying vehicles, and a vibrant night sky filled with holographic advertisements.",
        "negative_prompt": ""
    },
    {
        "image_number": 12,
        "type": "Style Keywords",
        "sub_type": "Fantasy (Digital Art)",
        "prompt": "A fantasy digital art scene of a dragon perched on a cliff, overlooking a glowing enchanted forest under a starlit sky.",
        "negative_prompt": ""
    },
    {
        "image_number": 13,
        "type": "Style Keywords",
        "sub_type": "Minimalist",
        "prompt": "A minimalist artwork of a lone tree in a desert, with clean lines and a muted color palette of beige and brown tones.",
        "negative_prompt": ""
    },
    {
        "image_number": 14,
        "type": "Style Keywords",
        "sub_type": "Vintage Photography",
        "prompt": "A vintage sepia-toned photograph of a 1920s train station, with steam billowing from locomotives and passengers dressed in period attire.",
        "negative_prompt": ""
    },
    {
        "image_number": 15,
        "type": "Adjectives and Details",
        "sub_type": "Lighting",
        "prompt": "A dramatic scene lit by the cool, silvery glow of moonlight reflecting on a tranquil ocean, with soft shadows creating a sense of depth.",
        "negative_prompt": ""
    },
    {
        "image_number": 16,
        "type": "Adjectives and Details",
        "sub_type": "Lens Details",
        "prompt": "Captured with a 24mm ultra-wide-angle lens, showcasing the expansive view of a rugged canyon with intricate textures and layers of rock formations.",
        "negative_prompt": ""
    },
    {
        "image_number": 17,
        "type": "Adjectives and Details",
        "sub_type": "Framing",
        "prompt": "A close-up shot of a vibrant butterfly resting on a flower, perfectly framed by blurred wildflowers in the background, emphasizing the subject's delicate details.",
        "negative_prompt": ""
    },
    {
        "image_number": 18,
        "type": "Negative Prompts",
        "sub_type": "Lighting",
        "prompt": "Golden hour lighting illuminating a tranquil garden.",
        "negative_prompt": "A dark, dimly lit garden with harsh shadows, lacking the warmth of golden hour lighting."
    },
    {
        "image_number": 19,
        "type": "Negative Prompts",
        "sub_type": "Lens Details",
        "prompt": "Captured with an 85mm portrait lens for a cinematic effect.",
        "negative_prompt": "Shot with a distorted fisheye lens, causing the image to look warped and unnatural."
    },
    {
        "image_number": 20,
        "type": "Negative Prompts",
        "sub_type": "Framing",
        "prompt": "A close-up portrait of a young woman wearing traditional attire.",
        "negative_prompt": "A distant, full body shot of a person wearing modern casual clothing, with no focus on the face."
    }
]

In [None]:
%%time 

encoded_image = ""
# Print each entry's details, then generate and save an image for its prompt
for item in sample_test_json_data:
    
    print("-------------------------------\n\n")
    print(f"Image Number: {item['image_number']}")
    print(f"Type: {item['type']}")
    print(f"Sub Type: {item['sub_type']}")
    print(f"Prompt: {item['prompt']}")
    
    negative_prompt = None if item['negative_prompt'] == "" else item['negative_prompt']
            
    print(f"Negative Prompt: {negative_prompt}")
    print()
    image_invoke_model( item['prompt'] , item['image_number'], encoded_image, negative_prompt ) 
    print("\n\n\n")
    print("-------------------------------")
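
The sample prompts above combine the same building blocks: a type of image, a description, style keywords, and adjectives or details. The sketch below assembles a prompt from those parts. The helper `compose_prompt` is hypothetical, and the 512-character limit is an assumption to check against the current Titan Image Generator documentation.

```python
def compose_prompt(image_type, description, style=None, details=None, max_chars=512):
    """Join the prompt building blocks into one sentence and enforce a length limit."""
    parts = [f"{image_type} of {description}"]
    if style:
        parts.append(f"in a {style} style")
    if details:
        parts.append(details)
    prompt = ", ".join(parts) + "."
    if len(prompt) > max_chars:
        raise ValueError(f"Prompt is {len(prompt)} characters; limit is {max_chars}")
    return prompt


print(compose_prompt(
    "A pencil sketch",
    "a cozy cottage in a snowy forest",
    style="minimalist",
    details="with soft shading to create depth",
))
```

Keeping the parts separate makes it easy to vary one element, such as the style, while holding the rest of the prompt fixed.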

# Section 2: Image Embedding 

### Generate Multimodal Embeddings with Image Inputs

    The get_titan_multimodal_embedding function generates an embedding for an input image using the specified multimodal model. It accepts an image path and a desired embedding dimension (default 1024). The function checks that the file exists, encodes the image to Base64, builds a payload with the embedding configuration, and invokes the model.

In [None]:
%%time
def get_titan_multimodal_embedding(
    image_path: str = None,  # Maximum image dimensions: 2048 x 2048 pixels
    dimension: int = 1024,  # Desired embedding dimension (default 1024, other options: 384, 256)
    model_id: str = multimodal_embed_model_id  # Predefined model ID for the multimodal embedding
):
    """
    Obtain a multimodal embedding for an image.
    
    Args:
        image_path (str): Path to the image file (optional).
        dimension (int): The dimensionality of the embedding output (default is 1024).
        model_id (str): Model identifier for the multimodal embedding model.
    
    Returns:
        dict: The response from the Bedrock model containing the multimodal embeddings.
    
    Raises:
        FileNotFoundError: If the image file does not exist at the given path.
        AssertionError: If no image is provided.
    """
    
    # Initialize the payload to send to the model
    payload_body = {}

    # Embedding configuration with the specified output dimension
    embedding_config = {
        "embeddingConfig": { 
            "outputEmbeddingLength": dimension
        }
    }
    
    # Process image input if provided
    if image_path:
        # Check if the provided image path exists locally
        if os.path.exists(image_path):
            # Open the image file in binary mode and encode it in base64
            with open(image_path, "rb") as image_file:
                encoded_image = base64.b64encode(image_file.read()).decode('utf8')
            # Add the base64 encoded image to the payload
            payload_body["inputImage"] = encoded_image
        else:
            # Raise an error if the image file does not exist
            raise FileNotFoundError(f"The image file at {image_path} does not exist.")
    

    # Ensure that an image is provided for the request
    assert payload_body, "Please provide an image."

    try:
        # Invoke the model using the Bedrock runtime client to get multimodal embeddings
        response = boto3_bedrock_runtime_client.invoke_model(
            body=json.dumps({**payload_body, **embedding_config}), 
            modelId=model_id,
            accept="application/json", 
            contentType="application/json"
        )
        # Return the parsed JSON response from the model
        return json.loads(response.get("body").read())

    except Exception as e:
        # Handle any exceptions that might occur during the model invocation
        print(f"An error occurred while invoking the model: {e}")
        return None

In [None]:
%%time

print("-------------------------------\n\n")
print(f"Example of embedding")
    
random_png_file = "data/generated_image/generated_image_20.png"

print(f"Selected file: {random_png_file}\n")

embedding = get_titan_multimodal_embedding(image_path=random_png_file, dimension=1024)

print(embedding["embedding"])
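
An embedding on its own is just a list of floats; it becomes useful when compared with another embedding. A common comparison is cosine similarity, sketched below in plain Python. The short vectors are made-up stand-ins for two `embedding["embedding"]` lists, and `cosine_similarity` is an illustrative helper rather than part of the notebook.

```python
import math


def cosine_similarity(vector_a, vector_b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(vector_a) != len(vector_b):
        raise ValueError("Vectors must have the same length")
    dot_product = sum(a * b for a, b in zip(vector_a, vector_b))
    norm_a = math.sqrt(sum(a * a for a in vector_a))
    norm_b = math.sqrt(sum(b * b for b in vector_b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("Cosine similarity is undefined for a zero vector")
    return dot_product / (norm_a * norm_b)


# Made-up 4-dimensional vectors; real Titan embeddings have 256, 384, or 1024 values
image_a = [0.1, 0.3, -0.2, 0.7]
image_b = [0.1, 0.25, -0.1, 0.8]
print(f"Similarity: {cosine_similarity(image_a, image_b):.4f}")
```

Values near 1 indicate vectors pointing in nearly the same direction, which for image embeddings usually means visually or semantically similar images.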

# Section 3: Image to Image

    The image_to_base64 function converts an image file into a Base64 encoded string. It attempts to open the file in binary read mode, encode its contents, and return the encoded string.

In [None]:
%%time

def image_to_base64(image_path):
    """
    Converts an image file to its Base64 encoded string representation.

    Args:
        image_path (str): The path to the image file.

    Returns:
        str: Base64 encoded string of the image.
             Returns None if an error occurs during encoding.
    """
    try:
        # Open the image file in binary read mode
        with open(image_path, "rb") as image_file:
            # Read the file's contents and encode it into Base64 format
            encoded_image = base64.b64encode(image_file.read()).decode('utf8')
        return encoded_image
    except FileNotFoundError:
        # Handle the case where the file path is invalid
        print(f"Error: The file '{image_path}' was not found.")
    except Exception as e:
        # Handle any other unexpected errors
        print(f"An error occurred: {e}")
    
    # Return None if an error occurs
    return None
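
Base64 represents every 3 bytes of input as 4 output characters, so the encoded string is roughly a third larger than the file it came from. That matters when a request has a payload size limit. The sketch below computes the encoded length and checks it against the standard library; `base64_encoded_length` is an illustrative helper.

```python
import base64
import math


def base64_encoded_length(num_bytes):
    """Length in characters of the padded Base64 encoding of num_bytes bytes."""
    return 4 * math.ceil(num_bytes / 3)


raw_bytes = b"\x00" * 1000  # Stand-in for the contents of an image file
encoded = base64.b64encode(raw_bytes)
print(f"raw: {len(raw_bytes)} bytes, encoded: {len(encoded)} characters")
print(f"predicted: {base64_encoded_length(len(raw_bytes))} characters")
```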

#### Example of prompt and negative prompt

In [None]:
extension_prompt = "An ancient lighthouse perched on a cliff, its beam piercing through the cool moonlight as the ocean glimmers below."
negative_prompt = "stormy weather, rough seas, modern or futuristic lighthouses, brightly lit surroundings, warm or golden lighting, visible human figures, cluttered landscapes, exaggerated fantasy elements, overly colorful skies, unrealistic structures"

### Implementing on top of the image data/generated_image/generated_image_7.png

##### Encoding a specified image file to Base64 format

In [None]:
%%time
print("-------------------------------")
print(f"Example of Image to Image")
    
random_png_file = "data/generated_image/generated_image_7.png"

print(f"Selected file: {random_png_file}")

encoded_image = image_to_base64(random_png_file)

#### The code measures the execution time of invoking an image generation model with a given extension prompt, negative prompt, and image number, while printing relevant details and displaying the result.

In [None]:
%%time 
image_number = 21 
# Print the inputs, then generate a variation of the encoded source image
print(f"Extension Prompt: {extension_prompt}")
print(f"Negative Prompt: {negative_prompt}")
print(f"Image Number: {image_number}")
print()
image_invoke_model( extension_prompt , image_number, encoded_image, negative_prompt ) 
print("\n\n\n")
print("-------------------------------")
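
How closely the result follows the source image can also be tuned. The sketch below adds a `similarityStrength` field to an IMAGE_VARIATION payload. The field name and its 0.2 to 1.0 range are taken from memory of the Titan Image Generator documentation and should be treated as assumptions to verify; `build_image_variation_payload` is a hypothetical helper, not part of the notebook.

```python
import json


def build_image_variation_payload(prompt, encoded_image, negative_prompt=None,
                                  similarity_strength=0.7, seed=42):
    """Build an IMAGE_VARIATION payload with an explicit similarity strength."""
    # Assumed valid range for similarityStrength: 0.2 (loose) to 1.0 (close to source)
    if not 0.2 <= similarity_strength <= 1.0:
        raise ValueError(
            f"similarity_strength must be between 0.2 and 1.0, got {similarity_strength}"
        )
    variation_params = {
        "text": prompt,
        "images": [encoded_image],
        "similarityStrength": similarity_strength,
    }
    if negative_prompt:
        variation_params["negativeText"] = negative_prompt
    return json.dumps({
        "taskType": "IMAGE_VARIATION",
        "imageVariationParams": variation_params,
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "height": 1024,
            "width": 1024,
            "cfgScale": 7.5,
            "seed": seed,
        },
    })


# "aGVsbG8=" is placeholder Base64 text, not a real image
payload = json.loads(build_image_variation_payload(
    "An ancient lighthouse on a cliff", "aGVsbG8=", similarity_strength=0.5
))
print(payload["imageVariationParams"]["similarityStrength"])
```

Lower values give the prompt more influence; higher values keep the output closer to the source image.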

# End of Notebook

#### <ins>Step 1</ins> 

##### Please ensure that you close the kernel after using this notebook to avoid any potential charges to your account.

##### Process: Go to "Kernel" in the top menu and choose "Shut Down Kernel".
##### Refer to https://docs.aws.amazon.com/sagemaker/latest/dg/studio-ui.html for details.