# SageMaker Example

## 1. Create your container repository

Open the AWS console and create an ECR repository for your container: https://us-west-2.console.aws.amazon.com/ecr/create-repository?region=us-west-2

The resulting repository URI looks like `236995464743.dkr.ecr.us-west-2.amazonaws.com/sagemaker_endpoint/vllm`.
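The repository URI is made up of the account ID, the region, and the repository name, with an image tag appended when pushing. A minimal sketch of assembling it in Python follows; `ecr_image_uri` is a hypothetical helper for illustration and is not part of the original notebook.

```python
def ecr_image_uri(account_id: str, region: str, repository: str, tag: str = "latest") -> str:
    """Assemble an ECR image URI from its parts (hypothetical helper)."""
    # Registry host: <account>.dkr.ecr.<region>.amazonaws.com
    registry = f"{account_id}.dkr.ecr.{region}.amazonaws.com"
    return f"{registry}/{repository}:{tag}"


# Values taken from the example repository above.
print(ecr_image_uri("236995464743", "us-west-2", "sagemaker_endpoint/vllm"))
```

Building the string this way keeps the account ID and region in one place if you reuse the notebook in another account.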

In [None]:
# Log in to the ECR registry so that Docker can push images to it
!aws ecr get-login-password --region us-west-2 | docker login --username AWS --password-stdin 236995464743.dkr.ecr.us-west-2.amazonaws.com

container = "236995464743.dkr.ecr.us-west-2.amazonaws.com/sagemaker_endpoint/vllm:latest"

## 2. Build the container

The demo code is in `app/`. Build and push the Docker image with the following commands:

In [None]:
!docker build -t sagemaker_endpoint/vllm .
!docker tag sagemaker_endpoint/vllm:latest {container}
!docker push {container}

## 3. Deploy on SageMaker

Define the model and deploy it on SageMaker.


In [None]:
import boto3
import sagemaker
from sagemaker import Model

In [None]:
sess = sagemaker.Session()
role = sagemaker.get_execution_role()

### Option 1: Deploy vLLM with scripts

In [None]:
%%bash
# Remove any stale archive (-f avoids an error if none exists), then package the scripts
rm -f vllm_by_scripts.tar.gz
tar czvf vllm_by_scripts.tar.gz vllm_by_scripts/

In [None]:
# Upload the archive built in the previous cell to the session's default S3 bucket
s3_code_prefix = "sagemaker_endpoint/vllm/mymodel"
bucket = sess.default_bucket()
code_artifact = sess.upload_data("vllm_by_scripts.tar.gz", bucket, s3_code_prefix)
print(f"S3 code or model tarball uploaded to ---> {code_artifact}")

In [None]:
model = Model(
    name="sagemaker-vllm",
    model_data=code_artifact,
    image_uri=container,
    role=role,
)

In [None]:
# Deploy the model to an endpoint
endpoint_name = sagemaker.utils.name_from_base("sagemaker-vllm")
print(f"endpoint_name: {endpoint_name}")
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.g5.2xlarge',
    endpoint_name=endpoint_name,
)

## 4. Test

You can invoke your model through the SageMaker runtime client in boto3:

In [None]:
# "sagemaker-runtime" is the canonical boto3 service name for endpoint invocation
runtime = boto3.client("sagemaker-runtime")
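With the runtime client in hand, a request can be sent using `invoke_endpoint`. The sketch below is a minimal example under one loud assumption: the request schema (`prompt`, `max_tokens`) is hypothetical, because the server code in `app/` is not shown here. Adapt `build_request` to whatever your container actually expects. The helper names `build_request` and `invoke` are also illustrative.

```python
import json


def build_request(prompt: str, max_tokens: int = 128) -> bytes:
    """Serialize a request body. The schema is an assumption, not the container's confirmed API."""
    return json.dumps({"prompt": prompt, "max_tokens": max_tokens}).encode("utf-8")


def invoke(runtime, endpoint_name: str, prompt: str) -> str:
    """Send one request to the endpoint and return the raw response text."""
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="application/json",
        Body=build_request(prompt),
    )
    # The response body is a stream; read it fully and decode it.
    return response["Body"].read().decode("utf-8")
```

In the notebook, this would be called as `print(invoke(runtime, endpoint_name, "Hello"))`, reusing the `runtime` client and the `endpoint_name` defined during deployment.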

