### Model Deployment with LLaVA-1.6 on Inf2

#### Init SageMaker Runtime

In [54]:
import sagemaker
from sagemaker import image_uris
import boto3
import os
import time
import json
from pathlib import Path
from sagemaker.utils import name_from_base

role = sagemaker.get_execution_role()  # execution role for the endpoint
sess = sagemaker.session.Session()  # sagemaker session for interacting with different AWS APIs
default_bucket = sess.default_bucket()  # bucket to house artifacts
region = sess.boto_region_name  # public attribute; avoids relying on the private _region_name


In [55]:
%%writefile serving.properties
engine=Python
option.model_id=cszhzleo/LLaVA-1.6-Mistral-7B-nc2-bs1-token4096-neuron-219
option.tensor_parallel_degree=2

Writing serving.properties
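The file above is a plain `key=value` properties file. As a minimal sketch for sanity-checking it before packaging, the snippet below parses the same three lines and reads back the tensor parallel degree. The helper name `parse_properties` is hypothetical and not part of DJL Serving or the SageMaker SDK. The `nc2` in the model ID presumably refers to the two NeuronCores the model was compiled for, which would match `option.tensor_parallel_degree=2`; that reading is an assumption.

```python
def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        props[key.strip()] = value.strip()
    return props


# Same content as the serving.properties cell above.
SERVING_PROPERTIES = """\
engine=Python
option.model_id=cszhzleo/LLaVA-1.6-Mistral-7B-nc2-bs1-token4096-neuron-219
option.tensor_parallel_degree=2
"""

props = parse_properties(SERVING_PROPERTIES)
tp_degree = int(props["option.tensor_parallel_degree"])
print(props["engine"], tp_degree)
```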


In [56]:
# Construct code artifacts tar
code_tarname = 'code'

!mkdir -p {code_tarname}
!rm -rf {code_tarname}.tar.gz
!rm -rf {code_tarname}/.ipynb_checkpoints

!cp model.py {code_tarname}/
!cp utils.py {code_tarname}/
!cp requirements.txt {code_tarname}/
!mv serving.properties {code_tarname}/
!tar czvf {code_tarname}.tar.gz {code_tarname}/

code/
code/utils.py
code/model.py
code/requirements.txt
code/serving.properties
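The shell commands above can also be expressed in pure Python, which may be convenient outside a notebook. The following is a sketch using the standard `tarfile` module; `build_code_tarball` is a hypothetical helper, and the demo writes placeholder files into a temporary directory rather than using the real `model.py` and `utils.py`. Unlike `tar czvf`, it does not add a separate `code/` directory entry to the archive.

```python
import tarfile
import tempfile
from pathlib import Path


def build_code_tarball(files, tar_path, arcdir="code"):
    """Pack the given files under arcdir/ in a gzip-compressed tarball."""
    with tarfile.open(tar_path, "w:gz") as tar:
        for f in map(Path, files):
            tar.add(f, arcname=f"{arcdir}/{f.name}")
    # Re-open the archive to report what was actually written.
    with tarfile.open(tar_path, "r:gz") as tar:
        return sorted(tar.getnames())


# Demo with placeholder files (contents are dummies, for illustration only).
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    sources = []
    for name in ("model.py", "utils.py", "requirements.txt", "serving.properties"):
        path = tmp / name
        path.write_text("# placeholder\n")
        sources.append(path)
    names = build_code_tarball(sources, tmp / "code.tar.gz")

print(names)
```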


In [57]:
s3_code_artifact = sess.upload_data(f"{code_tarname}.tar.gz", 
                                    default_bucket, 
                                    sagemaker.utils.name_from_base("tmp/v0"))

In [58]:
# Specify an inference container version; see the list of available images:
# - https://github.com/aws/deep-learning-containers/blob/master/available_images.md#large-model-inference-containers
inference_image_uri = f"763104351884.dkr.ecr.{region}.amazonaws.com/djl-inference:0.29.0-neuronx-sdk2.19.1"

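# Name the SageMaker endpoint. A fixed name is used here; uncomment the
# name_from_base() line instead to generate a unique, timestamped name.
#endpoint_name = sagemaker.utils.name_from_base(code_tarname)
endpoint_name = "llava16-endpoint-hf"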

In [59]:
from sagemaker.model import Model

model = Model(image_uri=inference_image_uri,
              model_data=s3_code_artifact, 
              role=role)

In [60]:
model.deploy(initial_instance_count = 1,
             instance_type = 'ml.inf2.xlarge', 
             endpoint_name = endpoint_name,
             container_startup_health_check_timeout = 900
            )

Your model is not compiled. Please compile your model before using Inferentia.


-------------------!
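The row of dashes ending in `!` is the SDK's progress indicator: `deploy()` blocks until the endpoint is ready. When the endpoint status needs to be checked from another process, or after deploying without waiting, a polling loop along the following lines could be used. This is a sketch: `wait_until_in_service` is a hypothetical helper, `sm_client` is assumed to be a boto3 `sagemaker` client, and the poll interval and timeout are arbitrary defaults.

```python
import time


def wait_until_in_service(sm_client, endpoint_name, poll_seconds=30, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint is InService, fails, or times out.

    sm_client is assumed to be a boto3 "sagemaker" client (not "sagemaker-runtime").
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"Endpoint {endpoint_name} still {status} after {timeout_seconds}s"
            )
        time.sleep(poll_seconds)
```

In the notebook this would be called as `wait_until_in_service(boto3.client("sagemaker"), endpoint_name)`. boto3 also ships a built-in waiter, `get_waiter("endpoint_in_service")`, which covers the same case with less code.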

In [61]:
from sagemaker import serializers, deserializers

predictor = sagemaker.Predictor(
            endpoint_name=endpoint_name,
            sagemaker_session=sess,
            serializer=serializers.JSONSerializer(),
            deserializer=deserializers.StringDeserializer(),
            )

In [62]:
from io import BytesIO
import base64

import requests
from PIL import Image


def load_image(image_file):
    """Load an image from a URL or a local path and convert it to RGB."""
    if image_file.startswith(("http://", "https://")):
        response = requests.get(image_file, timeout=30)
        response.raise_for_status()  # fail early on HTTP errors instead of decoding an error page
        image = Image.open(BytesIO(response.content)).convert("RGB")
    else:
        image = Image.open(image_file).convert("RGB")
    return image


def image_path_handler(image_path):
    """Return the image at image_path (URL or local file) as a base64-encoded PNG string."""
    img = load_image(image_path)
    byte_io = BytesIO()
    img.save(byte_io, format='PNG')
    encoded_image = base64.b64encode(byte_io.getvalue()).decode('utf-8')
    return encoded_image
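Because the image travels inside the JSON body as base64 text, it can help to check the encoded string before sending it. The sketch below uses only the standard library: it verifies that a string decodes to bytes starting with the PNG signature, and computes how long the base64 text is for a given number of raw bytes (about 4/3 of the input). Both helper names are hypothetical.

```python
import base64

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def b64_encoded_size(n_bytes):
    """Length of the padded base64 text produced for n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)


def looks_like_b64_png(encoded):
    """Return True if the string is valid base64 that decodes to PNG-signed bytes."""
    try:
        raw = base64.b64decode(encoded, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return False
    return raw.startswith(PNG_SIGNATURE)
```

For the notebook's data, `looks_like_b64_png(image_path_handler(image_file))` should hold, since the handler always re-encodes the image as PNG.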

In [63]:
image_file = "https://llava-vl.github.io/static/images/view.jpg"
prompt = "What are the things I should be cautious about when I visit here?"
byte_image = image_path_handler(image_file)  # base64-encoded PNG string, not raw bytes

In [64]:
result = predictor.predict(
    {
        "prompt": prompt,
        "image": byte_image,
        "parameters": {
            "top_k": 100,
            "top_p": 0.1,
            "temperature": 0.2,
        },
    }
)
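The request body is a JSON object with three keys: `prompt`, `image` (the base64 string), and `parameters`. A small builder keeps that shape in one place so it is not retyped for every call. This is a sketch: `build_payload` is a hypothetical helper, the defaults mirror the values used above, and the validation rules are assumptions about sensible input rather than checks known to be enforced by `model.py`.

```python
import json


def build_payload(prompt, image_b64, top_k=100, top_p=0.1, temperature=0.2):
    """Build the request dict in the shape used by the predict() call above."""
    if not isinstance(prompt, str) or not prompt.strip():
        raise ValueError("prompt must be a non-empty string")
    if not isinstance(image_b64, str):
        raise ValueError("image_b64 must be a base64 string, not raw bytes")
    if not 0.0 < top_p <= 1.0:  # assumed range for nucleus sampling
        raise ValueError("top_p must be in (0, 1]")
    return {
        "prompt": prompt,
        "image": image_b64,
        "parameters": {"top_k": top_k, "top_p": top_p, "temperature": temperature},
    }


payload = build_payload("Describe this image.", "QUJD")  # "QUJD" is a dummy base64 string
body = json.dumps(payload)  # the JSON text that is sent over the wire
```

With the `JSONSerializer` configured on the predictor, passing the dict is enough: `predictor.predict(build_payload(prompt, byte_image))`.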

In [65]:
print(result)

 When visiting a location like the one shown in the image, which appears to be a serene lake with a dock and surrounded by forest and mountains, here are several things to be cautious about:

1. **Water Safety**: If you plan to swim or engage in water activities, make sure you are aware of the water's depth and currents. Lakes can have unseen hazards like underwater rocks or sudden drop-offs.

2. **Weather Conditions**: Mountain weather can change rapidly. Check the forecast before you go and be prepared for sudden changes in weather.

3. **Wildlife**: Forested areas can be home to wildlife. Be aware of your surroundings and know what to do if you encounter wildlife.

4. **Leave No Trace**: Practice Leave No Trace principles to protect the environment. This includes packing out all trash, staying on designated trails, and not disturbing the natural habitat.

5. **Navigation**: Have a map or GPS device to navigate the area, especially if you plan to hike or explore the surrounding fores

In [66]:
smr_client = boto3.client("sagemaker-runtime")

response_model = smr_client.invoke_endpoint(
    EndpointName=endpoint_name,
    Body=json.dumps(
        {
            "prompt": prompt,
            "image": byte_image,
            "parameters": {
                "top_k": 100,
                "top_p": 0.1,
                "temperature": 0.2,
            },
        }
    ),
    ContentType="application/json",
)

result = response_model["Body"].read().decode("utf8")
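The `Body` returned by `invoke_endpoint` is a stream, so it can be read only once; a second `read()` returns empty bytes. A small helper makes the read-and-decode step explicit and closes the stream afterwards. This is a sketch: `read_text_body` is a hypothetical name, and the demo uses an in-memory `BytesIO` as a stand-in for a real response.

```python
import io


def read_text_body(response, encoding="utf-8"):
    """Read the response stream once, close it, and return the decoded text."""
    body = response["Body"]
    try:
        return body.read().decode(encoding)
    finally:
        body.close()


# Stand-in for an invoke_endpoint response, for illustration only.
fake_response = {"Body": io.BytesIO("caf\u00e9".encode("utf-8"))}
text = read_text_body(fake_response)
print(text)
```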

In [67]:
print(result)

 When visiting a location like the one shown in the image, which appears to be a serene lake with a dock and surrounded by forest and mountains, here are several things to be cautious about:

1. **Water Safety**: If you plan to swim or engage in water activities, make sure you are aware of the water's depth and currents. Lakes can have unseen hazards like underwater rocks or sudden drop-offs.

2. **Weather Conditions**: Mountain weather can change rapidly. Check the forecast before you go and be prepared for sudden changes in weather.

3. **Wildlife**: Forested areas can be home to wildlife. Be aware of your surroundings and know what to do if you encounter wildlife.

4. **Leave No Trace**: Practice Leave No Trace principles to protect the environment. This includes packing out all trash, staying on designated trails, and not disturbing the natural habitat.

5. **Navigation**: Have a map or GPS device to navigate the area, especially if you plan to hike or explore the surrounding fores

In [68]:
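# Clean up to stop incurring charges. Delete the model before the endpoint:
# delete_model() resolves the model name through the endpoint configuration,
# which delete_endpoint() removes by default.
predictor.delete_model()
predictor.delete_endpoint()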
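If the `predictor` object is no longer available (for example after a kernel restart), the same cleanup can be done with a low-level client. The sketch below assumes `sm_client` is a boto3 `sagemaker` client; `delete_endpoint_resources` is a hypothetical helper. It looks up the endpoint configuration and model names first, then deletes the endpoint, its configuration, and the models.

```python
def delete_endpoint_resources(sm_client, endpoint_name):
    """Delete an endpoint together with its endpoint config and model(s).

    sm_client is assumed to be a boto3 "sagemaker" client.
    Returns the names of the models that were deleted.
    """
    # Resolve everything before deleting, while the lookups still succeed.
    config_name = sm_client.describe_endpoint(EndpointName=endpoint_name)["EndpointConfigName"]
    config = sm_client.describe_endpoint_config(EndpointConfigName=config_name)
    model_names = [v["ModelName"] for v in config["ProductionVariants"] if "ModelName" in v]

    sm_client.delete_endpoint(EndpointName=endpoint_name)
    sm_client.delete_endpoint_config(EndpointConfigName=config_name)
    for name in model_names:
        sm_client.delete_model(ModelName=name)
    return model_names
```

For this notebook the call would be `delete_endpoint_resources(boto3.client("sagemaker"), "llava16-endpoint-hf")`.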