# Image Captioning: Run EXAONE Atelier from AWS Marketplace

---

This demo notebook shows how to use the SageMaker Python SDK to deploy the EXAONE Atelier Image Captioning model from AWS Marketplace.

---

## Setup

***


To subscribe to the model package:

1. Open the model package **listing page**.
1. On the AWS Marketplace listing, click the **Continue to Subscribe** button.
1. On the **Subscribe to this software** page, review the terms and click **Accept Offer** if you and your organization agree with the EULA, pricing, and support terms.
1. Click the **Continue to configuration** button and choose a **region**. A **Product Arn** is then displayed. This is the model package ARN that you need to specify when creating a deployable model with the SageMaker Python SDK. Copy the ARN corresponding to your region and specify it in the following cell.

---
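
Because the ARN embeds a region, checking that a copied value matches the region of your session can catch copy-and-paste mistakes before deployment. The following is a minimal sketch and not part of the original workflow: `arn_region` is a hypothetical helper, and the sample ARN is the `us-east-1` entry from the map in the next cell.

```python
def arn_region(arn):
    """Return the region field of a SageMaker model package ARN (hypothetical helper)."""
    # Expected layout: arn:<partition>:sagemaker:<region>:<account-id>:model-package/<name>
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn" or parts[2] != "sagemaker":
        raise ValueError(f"Not a SageMaker ARN: {arn!r}")
    if not parts[5].startswith("model-package/"):
        raise ValueError(f"Not a model package ARN: {arn!r}")
    return parts[3]


sample_arn = (
    "arn:aws:sagemaker:us-east-1:865070037744:model-package/"
    "exaone-atelier-i2t-76c77246a8343a23a36b2ce80c06f4f6"
)
print(arn_region(sample_arn))
```

Comparing the returned value with `boto3.Session().region_name` is one way to confirm that the ARN and the session agree.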

In [None]:
%pip install --upgrade --quiet sagemaker accelerate datasets tritonclient[all]

In [None]:
import boto3

#model_package = "exaone-atelier-i2t-limited-85f3f0d181593a10b7aef9bea522a333" # EXAONE Atelier - Image to Text - Limited
model_package = "exaone-atelier-i2t-76c77246a8343a23a36b2ce80c06f4f6" # EXAONE Atelier - Image to Text

model_package_map = {
    "us-east-1": f"arn:aws:sagemaker:us-east-1:865070037744:model-package/{model_package}",
    "us-east-2": f"arn:aws:sagemaker:us-east-2:057799348421:model-package/{model_package}",
    "us-west-1": f"arn:aws:sagemaker:us-west-1:382657785993:model-package/{model_package}",
    "us-west-2": f"arn:aws:sagemaker:us-west-2:594846645681:model-package/{model_package}",
    "ca-central-1": f"arn:aws:sagemaker:ca-central-1:470592106596:model-package/{model_package}",
    "eu-central-1": f"arn:aws:sagemaker:eu-central-1:446921602837:model-package/{model_package}",
    "eu-west-1": f"arn:aws:sagemaker:eu-west-1:985815980388:model-package/{model_package}",
    "eu-west-2": f"arn:aws:sagemaker:eu-west-2:856760150666:model-package/{model_package}",
    "eu-west-3": f"arn:aws:sagemaker:eu-west-3:843114510376:model-package/{model_package}",
    "eu-north-1": f"arn:aws:sagemaker:eu-north-1:136758871317:model-package/{model_package}",
    "ap-southeast-1": f"arn:aws:sagemaker:ap-southeast-1:192199979996:model-package/{model_package}",
    "ap-southeast-2": f"arn:aws:sagemaker:ap-southeast-2:666831318237:model-package/{model_package}",
    "ap-northeast-2": f"arn:aws:sagemaker:ap-northeast-2:745090734665:model-package/{model_package}",
    "ap-northeast-1": f"arn:aws:sagemaker:ap-northeast-1:977537786026:model-package/{model_package}",
    "ap-south-1": f"arn:aws:sagemaker:ap-south-1:077584701553:model-package/{model_package}",
    "sa-east-1": f"arn:aws:sagemaker:sa-east-1:270155090741:model-package/{model_package}",
}
region = boto3.Session().region_name
if region not in model_package_map:
    raise ValueError(f"Current boto3 session region {region} is not supported.")

model_package_arn = model_package_map[region]

In [None]:
model_name = 'exaone-i2t'

### Changing instance type
---


The models are supported on the following instance types:
 - EXAONE Atelier - Image to Text - Limited: `ml.g5.12xlarge`
 - EXAONE Atelier - Image to Text: `ml.g5.xlarge`, `ml.g5.12xlarge`, `ml.g5.48xlarge`, `ml.p4d.24xlarge`

EXAONE Atelier - Image to Text - Limited offers a 5-day free trial.

The table below lists the average inference time to process a single image request on each supported instance type. Actual response times may differ for reasons such as network conditions.


|Instance Type|Inference Time (sec)|
|---|---|
|ml.p4d.24xlarge|1.85|
|ml.g5.48xlarge|3.68|
|ml.g5.12xlarge|6.59|
|ml.g5.xlarge|25.85|

---
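
When choosing an instance type, it can help to filter the table above by a latency budget. The sketch below copies the published averages into a dictionary and lists the instance types that fit a given budget, fastest first. `AVG_INFERENCE_SEC` and `instances_within_budget` are hypothetical names introduced here for illustration, and the figures are averages, so real latencies may be higher.

```python
# Average single-image inference times in seconds, copied from the table above.
AVG_INFERENCE_SEC = {
    "ml.p4d.24xlarge": 1.85,
    "ml.g5.48xlarge": 3.68,
    "ml.g5.12xlarge": 6.59,
    "ml.g5.xlarge": 25.85,
}


def instances_within_budget(max_seconds):
    """Return instance types whose average latency is at most max_seconds, fastest first."""
    fits = [(sec, name) for name, sec in AVG_INFERENCE_SEC.items() if sec <= max_seconds]
    return [name for sec, name in sorted(fits)]


print(instances_within_budget(7.0))
```

The result only narrows the choice by latency. Pricing and the instance types supported by your subscribed package still need to be checked separately.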

In [None]:
from sagemaker import ModelPackage
from sagemaker import get_execution_role
import sagemaker

role = get_execution_role()
sagemaker_session = sagemaker.Session()


model = ModelPackage(
    role=role,
    model_package_arn=model_package_arn,
    sagemaker_session=sagemaker_session,
)

model.deploy(
    initial_instance_count=1,
    instance_type='ml.g5.12xlarge', # choose preferred instance type
    endpoint_name=model_name,
    container_startup_health_check_timeout=3600,
)
model.endpoint_name

## Invoke the endpoint
---

This section shows how to encode an image into bytes, send a request, and decode the response.

***
### Notes
- This model receives an image as base64-encoded bytes and returns four captions, each with a confidence score.
- The confidence score indicates how confident the model is that the caption describes the given image well.
***
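
Since each caption comes with a confidence score, a common follow-up step is to keep only the highest-scoring caption. The notes above do not specify the exact layout of the parsed response, so the sketch below assumes a list of `(caption, score)` tuples. That layout, the helper name `best_caption`, and the sample values are all assumptions made for illustration and are not actual model output.

```python
def best_caption(captions):
    """Return the (caption, score) pair with the highest confidence score.

    Assumes `captions` is a list of (caption, score) tuples; adapt the key
    function if the endpoint returns a different structure.
    """
    if not captions:
        raise ValueError("no captions to choose from")
    return max(captions, key=lambda pair: pair[1])


# Made-up sample in the assumed layout, not real model output.
sample = [
    ("a dog running on a beach", 0.82),
    ("a brown dog on sand", 0.64),
    ("an animal near the sea", 0.41),
    ("a beach at sunset", 0.17),
]
caption, score = best_caption(sample)
print(caption, score)
```

Print the real response once and adjust the key function to match before relying on this helper.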


In [None]:
import base64
from PIL import Image
from io import BytesIO
import numpy as np
import tritonclient.http as httpclient
import requests
import json
import boto3

smr_client = boto3.client("sagemaker-runtime")

def encode_image(image):
    buffer = BytesIO()
    image.save(buffer, format="JPEG")
    img_str = base64.b64encode(buffer.getvalue())
    return img_str

def get_sample_binary(payload):

    inputs = []
    outputs = []
    for idx, dic in enumerate(payload["inputs"]):
        input_name = dic["name"]
        input_value = dic["data"][0]

        input_value = np.array([input_value.encode('utf-8')], dtype=np.object_)

        input_value = np.expand_dims(input_value, axis=0)
        inputs.append(httpclient.InferInput(input_name, [1, 1], "BYTES"))
        inputs[idx].set_data_from_numpy(input_value)

    outputs.append(httpclient.InferRequestedOutput("generated_caption", binary_data=True))

    request_body, header_length = httpclient.InferenceServerClient.generate_request_body(
        inputs, outputs=outputs
    )
    return request_body, header_length


def invoke_endpoint(endpoint_name, payload):
    import ast
    import re

    request_body, header_length = get_sample_binary(payload)
    response = smr_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/vnd.sagemaker-triton.binary+json;json-header-size={}".format(
            header_length
        ),
        Body=request_body,
    )
    data = response["Body"].read()
    # The JSON header reports how many trailing bytes hold the binary output tensor.
    ptn = re.compile(rb'\{"binary_data_size":[0-9]*\}')
    match = json.loads(ptn.search(data).group().decode('utf-8'))
    binary_data_size = match['binary_data_size']
    # Take the binary tensor (skipping its first byte) and strip \x00 and \x01
    # bytes before decoding the remainder as UTF-8.
    binary_data = data[len(data)-binary_data_size+1:]
    binary_data = binary_data.replace(b'\x00', b'')
    binary_data = binary_data.replace(b'\x01', b'').decode('utf-8')

    # Parse the text as a Python literal rather than executing it with eval().
    return ast.literal_eval(binary_data)
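
The byte stripping in `invoke_endpoint` is a shortcut around the way Triton serializes `BYTES` tensors in binary form: each element is written as a 4-byte little-endian length followed by that many raw bytes. The sketch below decodes that layout explicitly using only the standard library. It assumes the buffer passed in contains nothing but length-prefixed elements, and `decode_bytes_tensor` is a hypothetical helper introduced here, not a function used by this notebook.

```python
import struct


def decode_bytes_tensor(raw):
    """Split a buffer of length-prefixed elements into a list of byte strings."""
    elements = []
    offset = 0
    while offset < len(raw):
        # Each element starts with its length as a 4-byte little-endian unsigned int.
        (length,) = struct.unpack_from("<I", raw, offset)
        offset += 4
        elements.append(raw[offset:offset + length])
        offset += length
    return elements


# Hand-built buffer with two elements, used only to exercise the decoder.
sample = struct.pack("<I", 5) + b"hello" + struct.pack("<I", 3) + b"cat"
print(decode_bytes_tensor(sample))
```

Applied to the last `binary_data_size` bytes of a response, a decoder of this kind would avoid removing `\x00` or `\x01` bytes that might legitimately appear elsewhere, provided the assumption about the buffer contents holds.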

### Prepare Image

In [None]:
def verify_image(image):
    # Upscale images with a side under 256 px and downscale images with a side
    # over 4096 px, keeping the aspect ratio.
    width, height = image.size
    if width < 256 or height < 256:
        if width > height:
            ratio = width / height
            image = image.resize((int(256 * ratio), 256))
        else:
            ratio = height / width
            image = image.resize((256, int(256 * ratio)))
    elif width > 4096 or height > 4096:
        if width > height:
            ratio = height / width
            image = image.resize((4096, int(4096 * ratio)))
        else:
            ratio = width / height
            image = image.resize((int(4096 * ratio), 4096))
    return image
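
The resizing rule in `verify_image` can be checked without Pillow by computing only the target dimensions. The sketch below mirrors the same branches on plain integers. `target_size` is a hypothetical helper written for this illustration and is not called anywhere in the notebook.

```python
MIN_SIDE = 256
MAX_SIDE = 4096


def target_size(width, height):
    """Return the (width, height) that verify_image would resize to."""
    if width < MIN_SIDE or height < MIN_SIDE:
        # Upscale so that the shorter side becomes MIN_SIDE.
        if width > height:
            return int(MIN_SIDE * (width / height)), MIN_SIDE
        return MIN_SIDE, int(MIN_SIDE * (height / width))
    if width > MAX_SIDE or height > MAX_SIDE:
        # Downscale so that the longer side becomes MAX_SIDE.
        if width > height:
            return MAX_SIDE, int(MAX_SIDE * (height / width))
        return int(MAX_SIDE * (width / height)), MAX_SIDE
    return width, height


for size in [(100, 50), (8192, 4096), (1000, 800), (100, 5000)]:
    print(size, "->", target_size(*size))
```

The last case shows a limitation of this rule: a 100 x 5000 image is upscaled to 256 x 12800, which still exceeds the 4096 upper bound, so images with extreme aspect ratios may need cropping or padding first.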

### Example 1 (single image from web)

In [None]:
url = 'https://github.com/LGAI-Research/EXAONE-Atelier/blob/main/example.png?raw=true'
image = Image.open(requests.get(url, stream=True).raw).convert('RGB')
# Resize if needed to avoid an out-of-range image size error (supported: 256 x 256 to 4096 x 4096)
image = verify_image(image)
display(image)
input_image = encode_image(image)

In [None]:
inputs = dict(
    image=input_image,
)

payload = {
    "inputs": [
        {"name": name, "shape": [1, -1], "datatype": "BYTES", "data": [data.decode('utf8')]}
        for name, data in inputs.items()
    ]
}
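
Both examples in this notebook build the same request dictionary, so the construction can be wrapped in a small function. The sketch below does that with the same field values as the cell above. `build_payload` is a hypothetical helper, and the sample bytes stand in for the output of `encode_image` so the snippet runs without an image.

```python
import base64


def build_payload(encoded_image, input_name="image"):
    """Wrap base64-encoded image bytes in the request layout used by this notebook."""
    return {
        "inputs": [
            {
                "name": input_name,
                "shape": [1, -1],
                "datatype": "BYTES",
                # The payload carries text, so the base64 bytes are decoded to str.
                "data": [encoded_image.decode("utf8")],
            }
        ]
    }


# Placeholder bytes standing in for encode_image(image).
fake_encoded = base64.b64encode(b"not a real image")
example_payload = build_payload(fake_encoded)
print(example_payload["inputs"][0]["name"], example_payload["inputs"][0]["datatype"])
```

With such a helper, each request reduces to `invoke_endpoint(endpoint_name, build_payload(encode_image(image)))`.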

In [None]:
import time

endpoint_name = model.endpoint_name
start_time = time.time()
captions = invoke_endpoint(endpoint_name, payload)
print('{:.2f} sec'.format(time.time()-start_time))
print(captions)

### Example 2 (batch inference on images in local directory)

In [None]:
from pathlib import Path
import time

image_dir = 'image_data/'

path = Path(image_dir)

image_files = [
    *path.glob('**/*.png'),
    *path.glob('**/*.jpg'),
    *path.glob('**/*.jpeg'),
]

endpoint_name = model.endpoint_name


In [None]:
for file in image_files:
    image = Image.open(file).convert('RGB')
    # Resize if needed so the image size stays within the supported range
    image = verify_image(image)
    display(image)
    input_image = encode_image(image)
    inputs = dict(
        image=input_image,
    )

    payload = {
        "inputs": [
            {"name": name, "shape": [1, -1], "datatype": "BYTES", "data": [data.decode('utf8')]}
            for name, data in inputs.items()
        ]
    }
    # Time each request individually
    start_time = time.time()
    captions = invoke_endpoint(endpoint_name, payload)
    print('{:.2f} sec'.format(time.time() - start_time))
    print(captions)


## Clean up the endpoint

In [None]:
# Delete the SageMaker endpoint, its endpoint configuration, and the model
model.sagemaker_session.delete_endpoint(model.endpoint_name)
model.sagemaker_session.delete_endpoint_config(model.endpoint_name)
model.delete_model()