# Async inference for Sklearn models

Asynchronous inference is an Amazon SageMaker inference option for near real-time workloads. Requests can take up to 15 minutes to process and can carry payloads of up to 1 GB, which makes this option suitable for workloads that do not need sub-second latency. For example, you might need to run inference on a large image of several MB within 5 minutes. In addition, asynchronous inference endpoints let you control costs by scaling the endpoint instance count down to zero when idle, so you pay only while your endpoints are processing requests.
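
As a rough illustration of the limits described above, the sketch below checks whether a workload fits within them. The `Workload` class and `fits_async_inference` helper are hypothetical names made up for this example, and reading "1 GB" as 2**30 bytes is an assumption.

```python
from dataclasses import dataclass

# Limits quoted in the paragraph above.
MAX_PAYLOAD_BYTES = 1 * 1024**3    # 1 GB, assumed here to mean 2**30 bytes
MAX_PROCESSING_SECONDS = 15 * 60   # 15 minutes


@dataclass
class Workload:
    """Hypothetical description of one inference request."""
    payload_bytes: int
    expected_seconds: float
    needs_subsecond_latency: bool


def fits_async_inference(w: Workload) -> bool:
    """Return True when the request stays within the limits above."""
    return (
        not w.needs_subsecond_latency
        and w.payload_bytes <= MAX_PAYLOAD_BYTES
        and w.expected_seconds <= MAX_PROCESSING_SECONDS
    )


# The example from the text: an image of several MB, processed within 5 minutes.
large_image = Workload(payload_bytes=5 * 1024**2,
                       expected_seconds=5 * 60,
                       needs_subsecond_latency=False)
print(fits_async_inference(large_image))
```

A request that needs sub-second latency, exceeds 1 GB, or runs longer than 15 minutes would return `False` and is better served by a different inference option.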

## Train a model locally or remotely

In [27]:
!pip install -U scikit-learn

Collecting scikit-learn
  Obtaining dependency information for scikit-learn from https://files.pythonhosted.org/packages/d0/0b/26ad95cf0b747be967b15fb71a06f5ac67aba0fd2f9cd174de6edefc4674/scikit_learn-1.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata
  Downloading scikit_learn-1.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (11 kB)
Downloading scikit_learn-1.3.2-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (10.8 MB)
Installing collected packages: scikit-learn
  Attempting uninstall: scikit-learn
    Found existing installation: scikit-learn 1.3.1
    Uninstalling scikit-learn-1.3.1:
      Successfully uninstalled scikit-learn-1.3.1
Successfully installed scikit-learn-1.3.2


In [28]:
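from sklearn import datasets, svm

# Load the 8x8 handwritten digits dataset (64 features per sample)
digits = datasets.load_digits()
# probability=True enables predict_proba at the cost of slower training
clf = svm.SVC(gamma=0.001, C=100.0, probability=True)
# Train on every sample except the last, then predict the held-out last sample
clf.fit(digits.data[:-1], digits.target[:-1])
clf.predict(digits.data[-1:])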

array([8])

In [29]:
!pip show scikit-learn

Name: scikit-learn
Version: 1.3.2
Summary: A set of python modules for machine learning and data mining
Home-page: http://scikit-learn.org
Author: 
Author-email: 
License: new BSD
Location: /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages
Requires: joblib, numpy, scipy, threadpoolctl
Required-by: shap


### Save model file

In [30]:
!pip install joblib
from joblib import dump
dump(clf, 'model.joblib')



['model.joblib']

## Step 1: Write a model transform script

#### Make sure you have a ...

- "load_model" function
    - input args are model path
    - returns loaded model object
    - model name is the same as what you saved the model file as (see above step)
<br><br>
- "predict" function
    - input args are the loaded model object and a payload
    - returns the result of model.predict
    - wrap the prediction in a list of strings: a single string for real-time requests, or multiple strings for mini-batch requests
    - a list, string, or np.array sent by a client arrives as bytes, so convert the payload back to the type your model expects
    - return the error message on failure to help with debugging
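
The "predict" contract listed above can be tried out without scikit-learn or NumPy. The following is a minimal, dependency-free sketch: `EchoModel` is a hypothetical stand-in for a fitted estimator, and the standard-library `array` module decodes 8-byte floats in native byte order, which is assumed to match the float64 bytes a client would send.

```python
from array import array


class EchoModel:
    """Hypothetical stand-in for a fitted estimator: 'predicts' the feature count."""

    def predict(self, rows):
        return [len(row) for row in rows]


def predict(model, payload):
    """Decode a bytes payload, run the model, and return a list of strings."""
    try:
        values = array("d")        # 'd' = 8-byte float, the same layout as float64
        values.frombytes(payload)  # raises ValueError if the length is not a multiple of 8
        out = [str(model.predict([values.tolist()]))]
    except Exception as e:
        # Return the payload type and the error text so the caller can debug
        out = [str(type(payload)), str(e)]
    return out


payload = array("d", [0.0, 1.5, 16.0]).tobytes()  # what a client would send
print(predict(EchoModel(), payload))  # success path: one string inside a list
print(predict(EchoModel(), b"abc"))   # error path: payload type plus error message
```

Both paths return a list of strings, so the caller always receives a value it can serialize, whether the prediction succeeded or failed.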


In [31]:
%%writefile modelscript_sklearn.py
import sklearn
from joblib import load
import numpy as np
import os

#Return loaded model
def load_model(modelpath):
    print(modelpath)
    clf = load(os.path.join(modelpath,'model.joblib'))
    print("loaded")
    return clf

# Return a prediction from the loaded model (see load_model above) for an input payload
def predict(model, payload):
    try:
        # The payload arrives as raw bytes: decode it as float64 and reshape to one 64-feature sample
        features = np.frombuffer(payload, dtype=np.float64).reshape((1, 64))
        out = [str(model.predict(features))]
    except Exception as e:
        out = [str(type(payload)), str(e)]  # useful for debugging!

    return out

Writing modelscript_sklearn.py


## Does this work locally? (not "_in a container locally_", but _actually_ locally)

In [32]:
from modelscript_sklearn import *
model = load_model('.')

.
loaded


In [33]:
predict(model,digits.data[-1:].tobytes())

['[8]']

### OK, great! Now let's install ezsmdeploy

In some cases, the install fails because of an already installed package called greenlet.
It is not a direct dependency of ezsmdeploy, but it interferes with the installation.
To fix this, either install ezsmdeploy in a virtualenv, or run:

    pip install ezsmdeploy[locust] --ignore-installed greenlet

In [35]:
%pip install -U ezsmdeploy

Note: you may need to restart the kernel to use updated packages.


In [36]:
!pip show joblib

Name: joblib
Version: 1.3.0
Summary: Lightweight pipelining with Python functions
Home-page: 
Author: 
Author-email: Gael Varoquaux <gael.varoquaux@normalesup.org>
License: BSD 3-Clause
Location: /home/ec2-user/anaconda3/envs/python3/lib/python3.10/site-packages
Requires: 
Required-by: nltk, scikit-learn


In [37]:
import ezsmdeploy

#### If you have been running other inference containers in local mode, stop existing containers to avoid conflict

In [38]:
# !docker container stop $(docker container ls -aq) >/dev/null

## Deploy on SageMaker

In [39]:
ezonsm = ezsmdeploy.Deploy(model = 'model.joblib', # the model file saved above
                  script = 'modelscript_sklearn.py',
                  requirements = ['Cython','scikit-learn==1.3.2','numpy==1.22.3','joblib==1.3.0'], # match the scikit-learn version used for training
                  asynchronous=True)

0:00:00.131735 | compressed model(s)
0:00:00.300300 | uploaded model tarball(s) ; check returned modelpath
0:00:00.300849 | added requirements file
0:00:00.301820 | added source file
0:00:00.302459 | added Dockerfile
0:00:00.303470 | added model_handler and docker utils
0:00:00.303511 | building docker container
0:00:37.037885 | built docker container
0:00:37.427664 | created model(s). Now deploying on ml.m5.xlarge
0:02:39.659656 | deployed model
0:02:40.971776 | set up autoscaling
0:02:40.971857 | Done! ✔


In [40]:
#!./src/build-docker.sh test

In [41]:
with open('inputfile.txt','wb') as f:
    f.write(digits.data[-1:].tobytes())

In [42]:
import sagemaker

In [43]:
!aws s3 cp inputfile.txt s3://{sagemaker.session.Session().default_bucket()}/asyncinput/

upload: ./inputfile.txt to s3://sagemaker-us-east-1-716845917484/asyncinput/inputfile.txt


In [44]:
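# Build the input path from the default bucket instead of hardcoding an account-specific name
bucket = sagemaker.session.Session().default_bucket()
out = ezonsm.predict(input_path=f's3://{bucket}/asyncinput/inputfile.txt')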

In [45]:
out.get_result()

b'[8]'
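
Because the endpoint is asynchronous, a result may not be available the moment a request is submitted. The sketch below shows one generic way to wait for it, by retrying a fetch with a fixed delay. `wait_for_result`, `ResultNotReady`, and `fake_fetch` are hypothetical names for illustration only and are not part of ezsmdeploy or the SageMaker SDK; the fetch is simulated so the example runs offline.

```python
import time


class ResultNotReady(Exception):
    """Hypothetical signal that the output has not been written yet."""


def wait_for_result(fetch, max_attempts=10, delay_seconds=1.0, sleep=time.sleep):
    """Call fetch() until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except ResultNotReady:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            sleep(delay_seconds)


# Simulated fetch: the result becomes available on the third call.
calls = {"n": 0}


def fake_fetch():
    calls["n"] += 1
    if calls["n"] < 3:
        raise ResultNotReady()
    return b"[8]"


print(wait_for_result(fake_fetch, delay_seconds=0))
```

In a real notebook, `fetch` would wrap whatever call retrieves the output and would translate its "not ready" error into `ResultNotReady`.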

### Clean up: delete the endpoint

In [46]:
ezonsm.predictor.delete_endpoint()