## Stable Diffusion (Text to Image) model deployment from SageMaker JumpStart

### SageMaker JumpStart
`Amazon SageMaker JumpStart` is a feature of the Amazon SageMaker machine learning platform that gives developers a hub of state-of-the-art (SOTA) deep learning models for language, vision, and other modalities. With more than 600 pre-trained models available, and the catalog still growing, SageMaker JumpStart lets developers bring current machine learning techniques into their production workflows quickly.

One key benefit of SageMaker JumpStart is access to hundreds of built-in algorithms and pre-trained models from leading model hubs and providers, built with popular machine learning frameworks such as PyTorch, Hugging Face, and TensorFlow. It also comes with a low-code user interface that makes it easy to get started with deep learning, even without extensive machine learning expertise. In addition, JumpStart provides solution templates for common use cases, as well as executable example notebooks that demonstrate best practices for machine learning with SageMaker.

#### SageMaker JumpStart Foundation Model Hub
`Amazon SageMaker Foundation Model Hub` is a feature of SageMaker JumpStart: a model hub (or model zoo) of SOTA deep learning models tailored for a wide range of advanced text and image generation use cases. The hub includes both public and proprietary models, from AWS partners such as Stability AI, Cohere, and AI21, as well as Amazon's own models such as AlexaTM, with more coming soon.

These foundation models perform well on standard benchmarks and can solve a wide range of problems such as text-to-image generation, text summarization, abstractive question answering, sentiment analysis, and entity extraction, among others. They come with a playground that lets developers interactively test different flavors of the models and generate outputs with different generation configurations.

You can access these models through SageMaker Studio or via APIs. In Studio, you can fine-tune or deploy them for your domain-specific use cases with a few clicks and no code; the APIs offer the same operations if you prefer to work in code. The models come with all the benefits of SageMaker training and hosting, so the endpoints you create are automatically enabled for resiliency, scalability, load balancing, and fault tolerance. They also integrate tightly with other SageMaker components and AWS services, so they fit into your existing workflows.

As the number of models continues to grow, SageMaker Foundation Model Hub will remain an essential resource for those seeking to stay at the forefront of the field of generative AI and deep learning.

### Deploy a pre-trained Stable Diffusion model from the SageMaker JumpStart console
In the navigation pane, under **SageMaker JumpStart**, choose **Models, notebooks, solutions**. You’re presented with a range of solutions, foundation models, and other artifacts that can help you get started with a specific model, business problem, or use case. If you want to experiment in a particular area, use the search function, or browse the artifacts to find the model or business solution that fits your needs. To start exploring the Stable Diffusion models, complete the following steps:

1. Go to the `Foundation Models` section, select the **Stable Diffusion 2.1 base** model, and choose **View model**.
<div>
    <img src="./img/1.png" alt="Image jumpstart" width="1000" style="display:inline-block">
</div>
<br>

2. A new tab opens with options to train, deploy, and view model details, as shown below.

3. In the Deploy Model section, choose **Deploy** for a one-click deployment of the JumpStart model.

<div>
    <img src="./img/2.png" alt="Image sb2.1" width="1000" style="display:inline-block">
</div>
<br>



The deploy action opens a new tab showing the model creation status and the model deployment status.
The endpoint status starts as "Creating".
<div>
    <img src="./img/5.png" alt="Image preparemodel" width="1000" style="display:inline-block">
</div>
<br>
<br> The status then changes to "Model is ready".<br>
<br> Creating and deploying the endpoint so that it is ready to accept inference requests takes around 10-15 minutes.
<br><br>
<div>
    <img src="./img/4.png" alt="Image preparemodel" width="1000" style="display:inline-block">
</div>
<br>


4. When the endpoint is deployed, choose **Open Notebook** to open a Jupyter notebook with Python code. Alternatively, use the code at the end of this section to invoke the endpoint.

<div>
    <img src="./img/6.png" alt="Image opennotebook" width="700" style="display:inline-block">
</div>
<br><br>
You will be prompted to select a notebook environment. Accept the defaults:
<br>
<br>Image: Data Science 2.0
<br>Kernel: Python 3
<br>Instance type: ml.t3.medium
<br><br>
<div>
    <img src="./img/7.png" alt="Image opennotebook" width="700" style="display:inline-block">
</div>
<br>
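
The status shown in the console can also be checked from code. The following is a minimal sketch, assuming a client that behaves like `boto3.client("sagemaker")` and its `describe_endpoint` call. The helper name `wait_until_in_service`, the polling interval, and the timeout are illustrative choices and are not part of the JumpStart workflow.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint leaves the 'Creating' state.

    sm_client is expected to behave like boto3.client("sagemaker").
    Returns the final status string (for example 'InService' or 'Failed').
    Raises TimeoutError if the endpoint is still 'Creating' at the deadline.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status != "Creating":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{endpoint_name} is still 'Creating' after {timeout_seconds}s")
        time.sleep(poll_seconds)
```

A call might look like `wait_until_in_service(boto3.client("sagemaker"), "<your-endpoint-name>")`, where the endpoint name is the one shown in the deployment tab. boto3 also ships a built-in waiter, `get_waiter("endpoint_in_service")`, which may be preferable if you do not need custom timing.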


### Executing the Sample JumpStart Notebook
Following similar steps, you can deploy other pre-trained models from JumpStart, such as the **Stable Diffusion 2 Inpainting** model from [Stability AI](https://stability.ai/blog/stable-diffusion-public-release). This model takes an image, a mask image, and a text prompt as input, and generates a new image in which the masked area of the original is replaced with content described by the prompt. Follow the same steps as above to deploy this model from the JumpStart model hub to a SageMaker endpoint (with instance type `ml.g5.2xlarge`). Once the model is successfully deployed, you can use the code below to invoke the endpoint.
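
If you prefer code over the console, the deployment can probably be done with the SageMaker Python SDK as well. The sketch below assumes the SDK's `JumpStartModel` class is available in your environment and that the model ID shown is the JumpStart identifier for this model; the ID is an assumption and should be checked against the model details page. The function name `deploy_inpainting_model` is illustrative.

```python
# Assumed JumpStart model ID; verify it on the model details page before use.
MODEL_ID = "model-inpainting-stabilityai-stable-diffusion-2-inpainting"
# Instance type named in the text above.
INSTANCE_TYPE = "ml.g5.2xlarge"


def deploy_inpainting_model(model_id=MODEL_ID, instance_type=INSTANCE_TYPE):
    """Deploy a JumpStart model to a real-time endpoint and return its predictor."""
    # Imported inside the function so this module still loads
    # in environments where the SageMaker SDK is not installed.
    from sagemaker.jumpstart.model import JumpStartModel

    model = JumpStartModel(model_id=model_id, instance_type=instance_type)
    return model.deploy()
```

Calling `deploy_inpainting_model()` from a SageMaker Studio notebook creates a billable endpoint, so remember to delete it when you are done.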

```python
import base64
import json
from io import BytesIO

import boto3
import matplotlib.pyplot as plt
import numpy as np
from IPython.display import HTML
from PIL import Image
```

```python
# Download the sample input image and its mask from the JumpStart assets bucket.
region = boto3.Session().region_name
s3_bucket = f"jumpstart-cache-prod-{region}"
key_prefix = "model-metadata/assets"
input_img_file_name = "dog_suit.jpg"
input_img_mask = "dog_suit_mask.jpg"
s3 = boto3.client("s3")

s3.download_file(s3_bucket, f"{key_prefix}/{input_img_file_name}", input_img_file_name)
s3.download_file(s3_bucket, f"{key_prefix}/{input_img_mask}", input_img_mask)

# Display the input image.
HTML(f'<table><tr><td> <img src="{input_img_file_name}" alt="input image" style="height: 700px;"/> <figcaption>Input Image</figcaption>'
     '</td></tr></table>')
```

```python
# Display the mask image.
HTML(f'<table><tr><td> <img src="{input_img_mask}" alt="mask image" style="height: 700px;"/> <figcaption>Mask Image</figcaption>'
     '</td></tr></table>')
```

### Query endpoint

***
Next, we query the endpoint to inpaint an image: the masked region of the input image is replaced with content generated from the prompt. You can supply any image, together with a mask whose dimensions match those of the original image.
***

```python
# Replace with the name of your deployed endpoint if it differs.
endpoint_name = 'jumpstart-dft-stable-diffusion-2-inpainting'


def encode_img(img_name):
    """Read an image file and return its contents as a base64-encoded string."""
    with open(img_name, 'rb') as f:
        img_bytes = f.read()
    return base64.b64encode(img_bytes).decode()


encoded_input_image = encode_img(input_img_file_name)
encoded_mask_image = encode_img(input_img_mask)

payload = {
    "prompt": "a white cat, blue eyes, wearing a sweater, lying in park",
    "image": encoded_input_image,
    "mask_image": encoded_mask_image,
    "num_inference_steps": 50,
    "guidance_scale": 7.5,
    "seed": 1,
}


def query_endpoint(payload):
    """Query the endpoint with the JSON payload encoded in UTF-8."""
    encoded_payload = json.dumps(payload).encode('utf-8')
    client = boto3.client('sagemaker-runtime')
    # Accept='application/json;jpeg' returns each image as base64-encoded JPEG bytes.
    # To receive the raw image as RGB values, set Accept='application/json'.
    # To send a raw image, set ContentType='application/json' and pass the image as
    # np.array(Image.open('low_res_image.jpg')).tolist().
    # Note that sending or receiving raw RGB values may hit the default limits
    # for the input payload and the response size.
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType='application/json;jpeg',
        Accept='application/json;jpeg',
        Body=encoded_payload,
    )
    return response


def display_image(img, title):
    """Show a single image with a title and no axes."""
    plt.figure(figsize=(12, 12))
    plt.imshow(np.array(img))
    plt.axis('off')
    plt.title(title)
    plt.show()


def parse_and_display_response(query_response):
    """Parse the endpoint response and display the generated images."""
    response_dict = json.loads(query_response['Body'].read())
    generated_images = response_dict['generated_images']

    for generated_image in generated_images:
        # Each entry is a base64 string; decode it and open it as an image.
        with BytesIO(base64.b64decode(generated_image.encode())) as generated_image_decoded:
            with Image.open(generated_image_decoded) as generated_image_np:
                generated_image_rgb = generated_image_np.convert("RGB")
                display_image(generated_image_rgb, "Inpainted Image")


query_response = query_endpoint(payload)
parse_and_display_response(query_response)
```
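
The comments in `query_endpoint` note that large payloads may hit size limits. One way to catch this before calling the endpoint is to measure the encoded request body. The sketch below is illustrative: the helper names are hypothetical, and the limit is passed in by the caller because the applicable quota is not stated here and should be looked up for your account.

```python
import json


def payload_size_bytes(payload):
    """Return the size in bytes of the JSON-encoded request body."""
    return len(json.dumps(payload).encode("utf-8"))


def check_payload_size(payload, limit_bytes):
    """Return the encoded size, or raise ValueError if it exceeds limit_bytes.

    limit_bytes is supplied by the caller; it is not a value taken
    from this guide or hard-coded from the service documentation.
    """
    size = payload_size_bytes(payload)
    if size > limit_bytes:
        raise ValueError(f"payload is {size} bytes, limit is {limit_bytes} bytes")
    return size
```

Because base64 encoding grows binary data by roughly one third, an image file close to the limit on disk will exceed it once encoded.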

### Supported parameters

***
This model supports many parameters while performing inference. They include:

* **prompt**: prompt to guide the image generation. Must be specified and can be a string or a list of strings.
* **num_inference_steps**: number of denoising steps during image generation. More steps lead to a higher-quality image. If specified, it must be a positive integer.
* **guidance_scale**: a higher guidance scale results in an image more closely related to the prompt, at the expense of image quality. If specified, it must be a float. guidance_scale<=1 is ignored.
* **negative_prompt**: guide image generation against this prompt. If specified, it must be a string or a list of strings and is used together with guidance_scale. If guidance_scale is disabled, this is also disabled. Moreover, if prompt is a list of strings, then negative_prompt must also be a list of strings.
* **num_images_per_prompt**: number of images returned per prompt. If specified, it must be a positive integer.
* **seed**: fix the randomized state for reproducibility. If specified, it must be an integer.
* **batch_size**: number of images to generate in a single forward pass. If you are using a smaller instance or generating many images, reduce batch_size to a small number (1-2). The number of images is the number of prompts multiplied by num_images_per_prompt.

***
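
The rules above can be checked on the client before a request is sent. The following is a minimal sketch of such a check; `validate_payload` is a hypothetical helper, not part of the model's API. It assumes that `num_images_per_prompt` defaults to 1 when omitted, that `batch_size` must be a positive integer, and that an integer is acceptable wherever a float is expected.

```python
def validate_payload(payload):
    """Check a request payload against the parameter rules listed above.

    Returns the number of images the request asks for
    (number of prompts * num_images_per_prompt).
    Raises ValueError on the first rule that is violated.
    """
    def is_str_or_str_list(value):
        return isinstance(value, str) or (
            isinstance(value, list) and len(value) > 0
            and all(isinstance(v, str) for v in value)
        )

    def is_int(value):
        # bool is a subclass of int, so exclude it explicitly.
        return isinstance(value, int) and not isinstance(value, bool)

    prompt = payload.get("prompt")
    if not is_str_or_str_list(prompt):
        raise ValueError("prompt is required and must be a string or a list of strings")

    # batch_size as a positive integer is an assumption of this sketch.
    for key in ("num_inference_steps", "num_images_per_prompt", "batch_size"):
        if key in payload and not (is_int(payload[key]) and payload[key] > 0):
            raise ValueError(f"{key} must be a positive integer")

    if "guidance_scale" in payload:
        scale = payload["guidance_scale"]
        if not (isinstance(scale, float) or is_int(scale)):
            raise ValueError("guidance_scale must be a number")

    if "seed" in payload and not is_int(payload["seed"]):
        raise ValueError("seed must be an integer")

    if "negative_prompt" in payload:
        negative = payload["negative_prompt"]
        if not is_str_or_str_list(negative):
            raise ValueError("negative_prompt must be a string or a list of strings")
        if isinstance(prompt, list) and not isinstance(negative, list):
            raise ValueError("negative_prompt must be a list when prompt is a list")

    num_prompts = len(prompt) if isinstance(prompt, list) else 1
    # Default of 1 image per prompt is an assumption of this sketch.
    return num_prompts * payload.get("num_images_per_prompt", 1)
```

For a payload with one prompt and `num_images_per_prompt` set to 4, the function returns 4; with `batch_size` set to 2, those four images would be generated in two forward passes.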

```python
payload = {
    "prompt": "a white cat, blue eyes, wearing a sweater, lying in park",
    "image": encoded_input_image,
    "mask_image": encoded_mask_image,
    "num_inference_steps": 30,
    "guidance_scale": 7.5,
    "num_images_per_prompt": 4,
    "seed": 1,
    "negative_prompt": "poorly drawn feet",
    "batch_size": 2,
}
query_response = query_endpoint(payload)
parse_and_display_response(query_response)
```
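
When a request returns several images, it can be handy to keep them on disk instead of only displaying them. The sketch below assumes the response body has already been parsed into a dictionary with a `generated_images` list of base64-encoded JPEG strings, as returned with `Accept='application/json;jpeg'`. The helper name `save_generated_images` and the file naming scheme are illustrative.

```python
import base64
from pathlib import Path


def save_generated_images(response_dict, output_dir, prefix="inpainted"):
    """Decode each base64 string in response_dict['generated_images'] and write it to a .jpg file.

    Returns the list of paths written, in the order the images were returned.
    """
    out = Path(output_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for index, encoded in enumerate(response_dict["generated_images"]):
        path = out / f"{prefix}_{index}.jpg"
        path.write_bytes(base64.b64decode(encoded))
        paths.append(path)
    return paths
```

Note that the response body stream can only be read once, so parse it with `json.loads(query_response['Body'].read())` a single time and pass the resulting dictionary to whichever helpers need it.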

## Clean up

Before we move on, don’t forget to delete your endpoints when you’re finished. On the previous tab, under **Delete Endpoint**, choose **Delete**. Do the same for any other endpoints that you created during the lab.

<div>
    <img src="./img/delete.png" alt="Image delete" width="800" style="display:inline-block">
</div>
<br>
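
Endpoints can also be removed from code. The following is a minimal sketch, assuming a client that behaves like `boto3.client("sagemaker")` and that each production variant of the endpoint configuration references a model by name. The helper name `delete_endpoint_resources` is hypothetical.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint together with its endpoint configuration and models.

    sm_client is expected to behave like boto3.client("sagemaker").
    Returns the names of the models that were deleted.
    """
    # Look up the dependent resources before deleting anything.
    endpoint = sm_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = endpoint["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [variant["ModelName"] for variant in config["ProductionVariants"]]

    # Delete the endpoint first, then the resources it depended on.
    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for model_name in model_names:
        sm_client.delete_model(ModelName=model_name)
    return model_names
```

Deleting the endpoint is what stops the instance charges; removing the endpoint configuration and model afterwards simply keeps the account tidy.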