In [1]:
!pip install --upgrade pip
!pip -q install sagemaker awscli boto3 pandas --upgrade 



## Example: TorchServe Performance Tuning on Amazon SageMaker

In this example, we show how to tune TorchServe performance, build a TorchServe container, and host it on Amazon SageMaker. SageMaker hosting is fully managed: you specify the instance type and the minimum and maximum number of instances, and SageMaker takes care of the rest.

Performance tuning parameters in TorchServe (see the [configuration documentation](https://github.com/pytorch/serve/blob/master/docs/configuration.md#other-properties)):
* number_of_netty_threads
* netty_client_threads
* async_logging
* minWorkers
* maxWorkers
* batchSize 
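
The first three parameters are server-wide entries in `config.properties`, while `minWorkers`, `maxWorkers` and `batchSize` are set per model. As a minimal sketch, assuming the server-wide values are written as plain `key=value` lines, the snippet below renders them from a dictionary. `render_properties` is a hypothetical helper (not part of TorchServe), and the `netty_client_threads` and `async_logging` values are illustrative assumptions rather than settings taken from this example.

```python
def render_properties(settings):
    """Render a dict as key=value lines (hypothetical helper, not a TorchServe API)."""
    lines = []
    for key, value in settings.items():
        # Java-style properties files spell booleans in lowercase.
        if isinstance(value, bool):
            value = str(value).lower()
        lines.append(f"{key}={value}")
    return "\n".join(lines) + "\n"


server_settings = {
    "number_of_netty_threads": 32,  # value used in this example's config.properties
    "netty_client_threads": 32,     # illustrative value (assumption)
    "async_logging": True,          # illustrative value (assumption)
}

properties_text = render_properties(server_settings)
print(properties_text)
```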

## config.properties

In [18]:
!cat config.properties

inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
number_of_netty_threads=32
job_queue_size=1000
model_store=/opt/ml/model
load_models=all
install_py_dep_per_model=true
-XX:-UseContainerSupport -XX:+UnlockDiagnosticVMOptions -XX:+PrintActiveCpus
models={\
  "TransformerEn2Fr": {\
    "1.0": {\
        "defaultVersion": true,\
        "marName": "TransformerEn2Fr.mar",\
        "minWorkers": 1,\
        "maxWorkers": 4,\
        "batchSize": 4,\
        "maxBatchDelay": 500,\
        "responseTimeout": 120\
    }\
  }\
}
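
To build intuition for how `batchSize` and `maxBatchDelay` interact, the sketch below groups request arrival times into batches: a batch is sent once it holds `batchSize` requests, or once `maxBatchDelay` milliseconds have passed since its first request arrived. This is a deliberately simplified model, not TorchServe's implementation. It ignores inference time and multiple workers, `group_into_batches` is a hypothetical helper, and the arrival times are made up.

```python
def group_into_batches(arrivals_ms, batch_size, max_batch_delay_ms):
    """Group sorted request arrival times (ms) into batches.

    A batch closes when it holds batch_size requests or when
    max_batch_delay_ms has passed since its first request arrived.
    """
    batches = []
    i = 0
    while i < len(arrivals_ms):
        deadline = arrivals_ms[i] + max_batch_delay_ms
        batch = [arrivals_ms[i]]
        j = i + 1
        while (j < len(arrivals_ms)
               and len(batch) < batch_size
               and arrivals_ms[j] <= deadline):
            batch.append(arrivals_ms[j])
            j += 1
        batches.append(batch)
        i = j
    return batches


# Made-up arrival times; batchSize=4 and maxBatchDelay=500 as in config.properties.
arrivals = [0, 100, 200, 300, 400, 1200]
batches = group_into_batches(arrivals, batch_size=4, max_batch_delay_ms=500)
print(batches)
```

In this model, a burst of traffic fills batches immediately, while a lone request waits for the full delay before it is processed, which is the latency cost of a large `maxBatchDelay`.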



### Clone the TorchServe repository

In [None]:
!git clone https://github.com/pytorch/serve.git

In [None]:
!cd /home/ec2-user/SageMaker/torchserve_batch/serve && git checkout issue_1107

### Download a PyTorch model 

In [47]:
model_name = "TransformerEn2Fr"
mar_file = f'{model_name}.mar'
mar_url = f'https://torchserve.pytorch.org/mar_files/{mar_file}'
!wget -q {mar_url}
!ls *.mar

TransformerEn2Fr.mar


### Upload the TransformerEn2Fr.mar archive file to Amazon S3
Amazon SageMaker expects model artifacts to be packaged as a tar.gz file, so first create a compressed archive from the TransformerEn2Fr.mar file.
Then upload the archive to your default Amazon SageMaker S3 bucket under the `torchserve/models/` prefix.
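
The packaging step can also be done in Python instead of the shell. The sketch below uses the standard-library `tarfile` module; `package_model` is a hypothetical helper, and a placeholder file stands in for the real `.mar` archive so the snippet runs on its own.

```python
import os
import tarfile
import tempfile


def package_model(mar_path, archive_path):
    """Pack a single .mar file into a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "w:gz") as tar:
        # arcname keeps only the file name so the .mar sits at the archive root.
        tar.add(mar_path, arcname=os.path.basename(mar_path))
    return archive_path


# Demonstration with a placeholder file; substitute the real TransformerEn2Fr.mar.
workdir = tempfile.mkdtemp()
mar_path = os.path.join(workdir, "TransformerEn2Fr.mar")
with open(mar_path, "wb") as f:
    f.write(b"placeholder contents")

archive = package_model(mar_path, os.path.join(workdir, "TransformerEn2Fr.tar.gz"))
with tarfile.open(archive, "r:gz") as tar:
    members = tar.getnames()
print(members)
```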

### Create a boto3 session and specify a role with SageMaker access

In [19]:
import boto3, time, json
sess    = boto3.Session()
sm      = sess.client('sagemaker')
region  = sess.region_name
account = boto3.client('sts').get_caller_identity().get('Account')

In [20]:
import sagemaker
role = sagemaker.get_execution_role()
sagemaker_session = sagemaker.Session(boto_session=sess)

In [5]:
bucket_name = sagemaker_session.default_bucket()
prefix = 'torchserve'

!tar cvfz {model_name}.tar.gz {mar_file}
!aws s3 cp {model_name}.tar.gz s3://{bucket_name}/{prefix}/models/

TransformerEn2Fr.mar
upload: ./TransformerEn2Fr.tar.gz to s3://sagemaker-us-east-2-057122759684/torchserve/models/TransformerEn2Fr.tar.gz


### Create an Amazon ECR registry
Create a new Amazon ECR repository for your TorchServe container images.

In [6]:
registry_name = 'torchserve-perf'
!aws ecr create-repository --repository-name {registry_name}


An error occurred (RepositoryAlreadyExistsException) when calling the CreateRepository operation: The repository with name 'torchserve-perf' already exists in the registry with id '057122759684'
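
The `RepositoryAlreadyExistsException` above is harmless when the notebook is rerun. If you prefer a cell that does not raise, here is a sketch of an idempotent variant using the boto3 ECR client. `ensure_repository` is a hypothetical helper name, and the client is passed in as an argument so the function can be exercised without AWS credentials.

```python
def ensure_repository(ecr_client, name):
    """Return the repository URI, creating the repository only if it is missing."""
    try:
        response = ecr_client.create_repository(repositoryName=name)
        return response["repository"]["repositoryUri"]
    except ecr_client.exceptions.RepositoryAlreadyExistsException:
        # The repository is already there, so look up its URI instead.
        response = ecr_client.describe_repositories(repositoryNames=[name])
        return response["repositories"][0]["repositoryUri"]


# Usage in the notebook (requires AWS credentials):
#   import boto3
#   repository_uri = ensure_repository(boto3.client("ecr"), "torchserve-perf")
```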


### Build a TorchServe Docker container and push it to Amazon ECR

In [27]:
image_label = 'v1'
image = f'{account}.dkr.ecr.{region}.amazonaws.com/{registry_name}:{image_label}'

!docker build -t {registry_name}:{image_label} .
!$(aws ecr get-login --no-include-email --region {region})
!docker tag {registry_name}:{image_label} {image}
!docker push {image}

Sending build context to Docker daemon  4.969GB
Step 1/18 : FROM nvidia/cuda:10.2-cudnn7-runtime-ubuntu18.04
 ---> 096b22f1b242
Step 2/18 : ENV PYTHONUNBUFFERED TRUE
 ---> Running in 3dbd2afe0a54
Removing intermediate container 3dbd2afe0a54
 ---> 274f60ce63c3
Step 3/18 : RUN apt-get update &&     DEBIAN_FRONTEND=noninteractive apt-get install --no-install-recommends -y     fakeroot     ca-certificates     dpkg-dev     g++     python3-dev     openjdk-11-jdk     curl     vim     && rm -rf /var/lib/apt/lists/*     && cd /tmp     && curl -O https://bootstrap.pypa.io/get-pip.py     && python3 get-pip.py
 ---> Running in 7f4994769a54

  Downloading future-0.18.2.tar.gz (829 kB)
Building wheels for collected packages: torchserve, future
  Building wheel for torchserve (setup.py): started
  Building wheel for torchserve (setup.py): finished with status 'done'
  Created wheel for torchserve: filename=torchserve-0.4.0b20210711-py3-none-any.whl size=18086068 sha256=c43cda85825c9641d22dbf741e3645d12fcf9370a90fc197ce62343b552d1a51
  Stored in directory: /tmp/pip-ephem-wheel-cache-34bwe794/wheels/e0/2d/47/eeb1e34cdf27eebe4ed67c397489eaadf725da4a8e148eb347
  Building wheel for future (setup.py): started
  Building wheel for future (setup.py): finished with status 'done'
  Created wheel for future: filename=future-0.18.2-py3-none-any.whl size=491070 sha256=cd9b1e606c111935d12e7dcafe1ca31ff55c08dba977222ebca2b6d8cfda6202
  Stored in directory: /root/.cache/pip/wheels/6e/9c/ed/4499c9865ac1002697793e0ae05ba6be33553d098f3347fb94
Successfully built torchserve future
Installing collected packages: future, torchserve
Successfully in

v1: digest: sha256:9f11b068576c9ee01010546c773e374b87f59593f9c820d77bace661edf23095 size: 3888


### Deploy endpoint and make prediction using the Amazon SageMaker Python SDK

In [28]:
from sagemaker.model import Model
from sagemaker.predictor import Predictor

model_data = f's3://{bucket_name}/{prefix}/models/{model_name}.tar.gz'
sm_model_name = f'torchserve-{model_name}'

torchserve_model = Model(model_data = model_data, 
                         image_uri = image,
                         role  = role,
                         predictor_cls=Predictor,
                         name  = sm_model_name)

In [29]:
endpoint_name = 'torchserve-endpoint-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())

predictor = torchserve_model.deploy(instance_type='ml.g4dn.xlarge',
                                    initial_instance_count=1,
                                    endpoint_name = endpoint_name)

Using already existing model: torchserve-TransformerEn2Fr


-----------------!

### Test the TorchServe hosted model

In [34]:
payload = "Hi James, when are you coming back home? I am waiting for you. Please come as soon as possible."    
response = predictor.predict(data=payload)
print(response)

b'{"input": "Hi James, when are you coming back home? I am waiting for you.\\nPlease come as soon as possible.", "french_output": "Bonjour James, quand rentrerez-vous chez vous, je vous attends et je vous prie de venir le plus t\\u00f4t possible."}'
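
The endpoint returns raw bytes containing a JSON document. A short sketch of extracting the translation, using the response printed above as the input:

```python
import json

# Raw bytes exactly as printed by the prediction cell.
response = b'{"input": "Hi James, when are you coming back home? I am waiting for you.\\nPlease come as soon as possible.", "french_output": "Bonjour James, quand rentrerez-vous chez vous, je vous attends et je vous prie de venir le plus t\\u00f4t possible."}'

result = json.loads(response)  # json.loads accepts UTF-8 bytes directly
translation = result["french_output"]
print(translation)
```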


### Batch Transform Jobs

In [60]:
batch_input = f's3://{bucket_name}/{model_name}/batch_transform_torchserve_sagemaker_input/'
batch_output = f's3://{bucket_name}/{model_name}/batch_transform_torchserve_sagemaker_output/'
batch_job_name = f'{model_name}-batch-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
!aws s3 cp test.csv $batch_input

upload: ./test.csv to s3://sagemaker-us-east-2-057122759684/TransformerEn2Fr/batch_transform_torchserve_sagemaker_input/test.csv


In [61]:
from sagemaker.transformer import Transformer

# Pass the SageMaker session (not the raw boto3 session) so the SDK can
# resolve settings such as local_mode when the transform job starts.
sm_transformer = Transformer(model_name = sm_model_name,
                             instance_count = 1,
                             instance_type = "ml.p2.xlarge",
                             output_path = batch_output,
                             sagemaker_session = sagemaker_session,
                             strategy = "MultiRecord",
                             assemble_with = "Line")

In [59]:
sm_transformer.transform(data = f'{batch_input}test.csv', 
                         content_type = "text/csv", 
                         split_type = "Line")
sm_transformer.wait()

In [None]:
!aws s3 cp --recursive $sm_transformer.output_path ./

In [None]:
!head -c 10000 test.csv.out

### Deploy endpoint and make prediction using the AWS SDK for Python (Boto3)

In [None]:
model_data = f's3://{bucket_name}/{prefix}/models/{model_name}.tar.gz'
sm_model_name = f'torchserve-{model_name}-boto'

container = {
    'Image': image,
    'ModelDataUrl': model_data
}

create_model_response = sm.create_model(
    ModelName         = sm_model_name,
    ExecutionRoleArn  = role,
    PrimaryContainer  = container)

print(create_model_response['ModelArn'])

In [None]:
import time
endpoint_config_name = 'torchserve-endpoint-config-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
print(endpoint_config_name)

create_endpoint_config_response = sm.create_endpoint_config(
    EndpointConfigName = endpoint_config_name,
    ProductionVariants = [{
        'InstanceType'        : 'ml.g4dn.xlarge',
        'InitialVariantWeight': 1,
        'InitialInstanceCount': 1,
        'ModelName'           : sm_model_name,
        'VariantName'         : 'AllTraffic'}])

print("Endpoint Config Arn: " + create_endpoint_config_response['EndpointConfigArn'])

In [None]:
endpoint_name = 'torchserve-endpoint-' + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
print(endpoint_name)

create_endpoint_response = sm.create_endpoint(
    EndpointName         = endpoint_name,
    EndpointConfigName   = endpoint_config_name)
print(create_endpoint_response['EndpointArn'])

In [None]:
resp = sm.describe_endpoint(EndpointName=endpoint_name)
status = resp['EndpointStatus']
print("Status: " + status)

while status=='Creating':
    time.sleep(60)
    resp = sm.describe_endpoint(EndpointName=endpoint_name)
    status = resp['EndpointStatus']
    print("Status: " + status)

print("Arn: " + resp['EndpointArn'])
print("Status: " + status)
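
The polling loop above runs for as long as the endpoint stays in `Creating`. A sketch of the same idea with an upper time bound follows. `wait_for_endpoint` is a hypothetical helper, and the default poll interval and timeout are assumptions you should adjust to your deployment.

```python
import time


def wait_for_endpoint(sm_client, endpoint_name, poll_seconds=60, timeout_seconds=1800):
    """Poll describe_endpoint until the endpoint leaves 'Creating' or the timeout passes."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        response = sm_client.describe_endpoint(EndpointName=endpoint_name)
        status = response["EndpointStatus"]
        if status != "Creating":
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{endpoint_name} still Creating after {timeout_seconds}s")
        time.sleep(poll_seconds)


# Usage in the notebook (requires AWS credentials):
#   status = wait_for_endpoint(sm, endpoint_name)
#   if status != "InService":
#       raise RuntimeError("Endpoint creation ended with status " + status)
```

boto3 also provides a built-in waiter for this case, `sm.get_waiter('endpoint_in_service')`, which may be simpler if you do not need custom status handling.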

In [None]:
# The endpoint serves the TransformerEn2Fr translation model, so send English text
payload = "Hi James, when are you coming back home? I am waiting for you. Please come as soon as possible."

In [None]:
import json
client = boto3.client('sagemaker-runtime')

# application/octet-stream matches the default content type of the SageMaker SDK Predictor used earlier
response = client.invoke_endpoint(EndpointName=endpoint_name, 
                                   ContentType='application/octet-stream', 
                                   Body=payload.encode('utf-8'))

print(json.loads(response['Body'].read()))