### Install packages

In [6]:
%pip install sagemaker --upgrade --quiet
%pip install -U huggingface_hub

Note: you may need to restart the kernel to use updated packages.
Note: you may need to restart the kernel to use updated packages.


### Imports & Setup

In [234]:
import sagemaker
from sagemaker.djl_inference import DJLModel
from sagemaker import image_uris
import boto3
import json

role = sagemaker.get_execution_role() # iam role for the endpoint
session = sagemaker.session.Session() # sagemaker session for interacting with aws APIs
bucket = session.default_bucket()

def get_aws_region():
    # Get the current AWS region from the default session
    session = boto3.session.Session()
    return session.region_name

region = get_aws_region()
prefix="Intern-vl2"

### Download the pretrained model

In [56]:
import shutil, os

#define model_id
model_id = "OpenGVLab/InternVL2-1B"

# clear any existing model artifacts
def create_folder(path):
    shutil.rmtree(path, ignore_errors=True)
    os.makedirs(path)
    
dir_path = "inference/pretrained"
create_folder(dir_path)

model_name = model_id.split('/')[-1]
local_dir = f"{dir_path}/{model_name}"

!huggingface-cli download --resume-download --local-dir-use-symlinks False {model_id} --local-dir {local_dir}

Fetching 19 files:   0%|                                 | 0/19 [00:00<?, ?it/s]Downloading 'configuration_internvl_chat.py' to 'inference/pretrained/InternVL2-1B/.cache/huggingface/download/configuration_internvl_chat.py.2b06ab6f2eddcdb9379c3e4effbbfdda8c538633.incomplete'
Downloading 'conversation.py' to 'inference/pretrained/InternVL2-1B/.cache/huggingface/download/conversation.py.2fe37ad08c18c49fd5a4d7e0aa9be10fbeead22c.incomplete'
Downloading '.gitattributes' to 'inference/pretrained/InternVL2-1B/.cache/huggingface/download/.gitattributes.3ecf72ff46e87246d8fc73fcaf99995ea09063b2.incomplete'
Downloading 'examples/image1.jpg' to 'inference/pretrained/InternVL2-1B/.cache/huggingface/download/examples/image1.jpg.fd9891ef7e00774157a9dcd726b2ea9fa0c5ecff.incomplete'
Downloading 'added_tokens.json' to 'inference/pretrained/InternVL2-1B/.cache/huggingface/download/added_tokens.json.92cb8e68c377d444a75b942a63f65408188bc25b.incomplete'
Downloading 'configuration_intern_vit.py' to 'inference

Push the parameter into `serving.properties` file

In [118]:
!sed -i 's@option.model_id=.*@option.model_id={model_name}@g' inference/serving.properties

### Upload model folder to S3

In [259]:
!rm -rf `find -type d -name .ipynb_checkpoints`

In [260]:
!find . | grep -E "(/__pycache__$|\.pyc$|\.pyo$)" | xargs rm -rf

In [261]:
model_s3_uri = f"s3://{bucket}/{prefix}/models/model-version-01/"
!aws s3 sync ./inference {model_s3_uri}

Completed 6.2 KiB/6.2 KiB (93.9 KiB/s) with 1 file(s) remainingupload: inference/model.py to s3://sagemaker-us-west-2-376678947624/Intern-vl2/models/model-version-01/model.py


### Get inference container image

In [262]:
inference_image_uri = image_uris.retrieve(
    framework="djl-deepspeed", region=session.boto_session.region_name, version="0.27.0"
)
inference_image_uri

'763104351884.dkr.ecr.us-west-2.amazonaws.com/djl-inference:0.27.0-deepspeed0.12.6-cu121'

In [263]:
# This is how we can specify uncompressed model artifacts
model_data = {
    "S3DataSource": {
        "S3Uri": model_s3_uri,
        'S3DataType': 'S3Prefix',
        'CompressionType': 'None'
    }
} 

### Create SageMaker Model

In [264]:
# create your SageMaker Model
model = sagemaker.Model(
    image_uri=inference_image_uri,
    model_data=model_data, 
    role=role)

In [265]:
# instance type you will deploy your model to
instance_type = "ml.g5.xlarge"
endpoint_name = sagemaker.utils.name_from_base(prefix)

# deploy your model
model.deploy(
    instance_type=instance_type,
    initial_instance_count=1,
    endpoint_name=endpoint_name,
)

-------------!

### Test Inference Endpoint

In [266]:
sm_runtime = boto3.client("sagemaker-runtime")

Upload and sync example images to S3

In [267]:
!rm -rf `find -type d -name .ipynb_checkpoints`

In [268]:
!find . | grep -E "(/__pycache__$|\.pyc$|\.pyo$)" | xargs rm -rf

In [269]:
examples_s3_uri = f"s3://{bucket}/{prefix}/examples/"
!aws s3 sync ./examples {examples_s3_uri}

### pure-text conversation

In [270]:
%time
prompt = "Hello, who are you?"
inputs = {"prompt": prompt}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)

print(response["Body"].read().decode('utf-8'))

CPU times: user 2 μs, sys: 1 μs, total: 3 μs
Wall time: 5.25 μs
I am an AI assistant whose name is InternVL, developed jointly by Shanghai AI Lab and SenseTime.


### single-image single-round conversation

In [271]:
%time

# single-image single-round conversation (单图单轮对话)
prompt = "<image>\nPlease describe the image shortly."
inputs = {"prompt": prompt, "images":[f"{examples_s3_uri}examples_image1.jpg"]}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)

print(response["Body"].read().decode('utf-8'))

CPU times: user 2 μs, sys: 0 ns, total: 2 μs
Wall time: 4.53 μs
The image shows a red panda, a small, furry mammal known for its distinctive red and white fur. The red panda is resting on a wooden structure, possibly a platform or a bench, with its head leaning over the edge. The animal has a calm and curious expression, with its eyes looking directly at the camera. The background features a blurred natural setting with greenery, suggesting that the photo was taken in a zoo or a wildlife park.


### single-image multi-round conversation

In [272]:
# single-image single-round conversation (单图单轮对话)
prompt = "<image>\nPlease describe the image shortly."
inputs = {"prompt": prompt, "images":[f"{examples_s3_uri}examples_image1.jpg"], "parameters":{"reset_history":True}}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)

print(response["Body"].read().decode('utf-8'))

print("=====\n")

prompt = 'Please write a poem according to the image.'

inputs = {"prompt": prompt}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)

print(response["Body"].read().decode('utf-8'))

The image shows a red panda, a small, furry mammal known for its distinctive red and white fur. The red panda is resting on a wooden structure, possibly a platform or a bench, with its head leaning over the edge. The animal's large, expressive eyes and gentle expression are evident, and it appears to be looking directly at the camera. The background is blurred but suggests an outdoor setting with greenery, indicating that the photo might have been taken in a zoo or a wildlife park.
=====

In the forest of green,
Where the red panda rests,
A gentle creature,
Its fur a red and white,
A sight to behold,
A peaceful presence,
A symbol of nature's grace.


### multi-image multi-round conversation, combined images

In [273]:
%time
# single-image single-round conversation (单图单轮对话)
prompt = "<image>\nDescribe the two images in detail."
inputs = {
    "prompt": prompt, 
    "images":[f"{examples_s3_uri}examples_image1.jpg",
              f"{examples_s3_uri}examples_image2.jpg",],
    "parameters":{"reset_history":True}
}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)

print(response["Body"].read().decode('utf-8'))
print("=====\n")

prompt = 'What are the similarities and differences between these two images.'

inputs = {"prompt": prompt}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)

print(response["Body"].read().decode('utf-8'))

CPU times: user 2 μs, sys: 0 ns, total: 2 μs
Wall time: 4.77 μs
The image depicts a panda cub, which is a species of Asian bear, resting in a natural setting. The cub is positioned in a somewhat relaxed manner, with its head resting on its paws and its front paws holding onto a wooden structure. The cub's fur is predominantly black with white patches, and it has a large, expressive face with large, dark eyes and a small, black nose. The background features lush green foliage, indicating that the cub is in a natural habitat, likely a zoo or wildlife park.

The environment is filled with various green plants and trees, suggesting a dense, forested area. The presence of the wooden structure, possibly a part of a wooden fence or platform, adds to the naturalistic setting. The overall scene is serene and peaceful, capturing the natural beauty and tranquility of the panda's habitat.
=====

The two images depict two different aspects of the same subject, the panda cub. The similarities betwee

### video understanding


In [275]:
prompt = 'What is the red panda doing?'
inputs = {
    "prompt": prompt, 
    "video":f"{examples_s3_uri}red-panda.mp4",
    "parameters":{"reset_history":True}
}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)

print(response["Body"].read().decode('utf-8'))
print("=====\n")

prompt = 'Describe this video in detail. Don\'t repeat.'

inputs = {"prompt": prompt,
    "video":f"{examples_s3_uri}red-panda.mp4",
    "parameters":{"reset_history":True}}

response = sm_runtime.invoke_endpoint(
    EndpointName=endpoint_name, Body=json.dumps(inputs)
)

print(response["Body"].read().decode('utf-8'))


The red panda is seen hanging upside down from a tree branch.
=====

In the video, we see two pandas hanging from a tree branch. The panda on the left is black and white, while the one on the right is black and red. They appear to be enjoying the view and seem to be in a relaxed state. The background shows a green forest, which adds to the natural setting of the scene. The camera angle is slightly elevated, giving us a clear view of their movements and expressions. The lighting is bright, and the colors are vivid, making the scene look lively and engaging. Overall, the video captures a peaceful moment between two pandas in their natural habitat.
