## Train a model locally or remotely

In [1]:
%cd ~/SageMaker/easy-amazon-sagemaker-deployments/dev/

/home/ec2-user/SageMaker/easy-amazon-sagemaker-deployments/dev


In [2]:
%pip uninstall -y sklearn scikit-learn

Found existing installation: scikit-learn 1.2.1
Uninstalling scikit-learn-1.2.1:
  Successfully uninstalled scikit-learn-1.2.1
Note: you may need to restart the kernel to use updated packages.


In [3]:
%pip install --upgrade pip
%pip install --upgrade scikit-learn==1.2.1 sagemaker

Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
Note: you may need to restart the kernel to use updated packages.
Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
Collecting scikit-learn==1.2.1
  Using cached scikit_learn-1.2.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (9.8 MB)
Installing collected packages: scikit-learn
Successfully installed scikit-learn-1.2.1
Note: you may need to restart the kernel to use updated packages.


In [4]:
import sklearn

### Make sure these versions match when you use ezsmdeploy

In [5]:
sklearn.show_versions()


System:
    python: 3.8.12 | packaged by conda-forge | (default, Oct 12 2021, 21:59:51)  [GCC 9.4.0]
executable: /home/ec2-user/anaconda3/envs/python3/bin/python
   machine: Linux-5.10.102-99.473.amzn2.x86_64-x86_64-with-glibc2.10

Python dependencies:
      sklearn: 1.2.1
          pip: 23.0.1
   setuptools: 59.4.0
        numpy: 1.20.3
        scipy: 1.5.3
       Cython: 0.29.24
       pandas: 1.3.4
   matplotlib: 3.5.0
       joblib: 1.2.0
threadpoolctl: 3.0.0

Built with OpenMP: True

threadpoolctl info:
       user_api: openmp
   internal_api: openmp
         prefix: libgomp
       filepath: /home/ec2-user/anaconda3/envs/python3/lib/python3.8/site-packages/scikit_learn.libs/libgomp-a34b3233.so.1.0.0
        version: None
    num_threads: 8

       user_api: blas
   internal_api: openblas
         prefix: libopenblas
       filepath: /home/ec2-user/anaconda3/envs/python3/lib/libopenblasp-r0.3.7.so
        version: 0.3.7
threading_layer: pthreads
   architecture: Haswell
    num_th
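One way to keep the pinned versions passed to ezsmdeploy in step with the kernel is to read them from the installed packages instead of typing them by hand. The following is a minimal sketch using the standard library's `importlib.metadata`; the helper name `pinned_requirements` is hypothetical and is not part of ezsmdeploy.

```python
from importlib.metadata import PackageNotFoundError, version


def pinned_requirements(packages):
    """Return 'name==version' strings for packages installed in this kernel."""
    pins = []
    for name in packages:
        try:
            pins.append(f"{name}=={version(name)}")
        except PackageNotFoundError:
            # Skip packages that are not installed rather than guessing a version
            print(f"{name} is not installed; skipping")
    return pins


# Hypothetical usage: build the list that would be passed as `requirements`
requirements = pinned_requirements(["scikit-learn", "numpy", "joblib"])
print(requirements)
```

The resulting list reflects whatever is installed in the current environment, so it only helps if the model was trained in that same environment.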

In [6]:
from sklearn import datasets, svm
digits = datasets.load_digits()
clf = svm.SVC(gamma=0.001, C=100.,probability=True)
clf.fit(digits.data[:-1], digits.target[:-1])
clf.predict(digits.data[-1:])

array([8])

In [7]:
digits.data[-1:]

array([[ 0.,  0., 10., 14.,  8.,  1.,  0.,  0.,  0.,  2., 16., 14.,  6.,
         1.,  0.,  0.,  0.,  0., 15., 15.,  8., 15.,  0.,  0.,  0.,  0.,
         5., 16., 16., 10.,  0.,  0.,  0.,  0., 12., 15., 15., 12.,  0.,
         0.,  0.,  4., 16.,  6.,  4., 16.,  6.,  0.,  0.,  8., 16., 10.,
         8., 16.,  8.,  0.,  0.,  1.,  8., 12., 14., 12.,  1.,  0.]])

### Save model file

In [8]:
!pip install joblib
from joblib import dump
dump(clf, 'model.joblib')

Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com


['model.joblib']

## Step 1 : Write a model transform script

#### Make sure you have a ...

- "load_model" function
    - takes the model path as its input argument
    - returns the loaded model object
    - loads the model using the same file name you saved it under (see the step above)
<br><br>
- "predict" function
    - takes the loaded model object and a payload as input arguments
    - returns the result of model.predict
    - format the result as a single string (for real time) or multiple strings (for mini batch), returned inside a list
    - a list, string, or np.array sent from a client for prediction arrives as bytes; convert it back to a list, string, or np.array as needed
    - on failure, return the error to help with debugging
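The bytes conversion mentioned in the list can be sketched with NumPy alone. The example below assumes a `float64` array of shape `(1, 64)`, matching one row of the digits dataset; the variable names are illustrative only. Raw bytes carry neither dtype nor shape, so both sides have to agree on them in advance.

```python
import numpy as np

# Client side: a (1, 64) float64 array, shaped like one row of the digits data
sample = np.arange(64, dtype=np.float64).reshape(1, 64)
payload = sample.tobytes()  # raw bytes; dtype and shape are not preserved

# Server side: rebuild the array using the agreed dtype and shape
restored = np.frombuffer(payload, dtype=np.float64).reshape(1, 64)

print(len(payload))                      # 64 values * 8 bytes each
print(np.array_equal(sample, restored))  # True if the round trip is lossless
```

If the client sends a different dtype (for example `float32`), `np.frombuffer` will still return an array, just with the wrong values, so it is worth fixing the dtype explicitly on both ends.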


In [66]:
%%writefile modelscript_sklearn.py
import sklearn
from joblib import load
import numpy as np
import os

#Return loaded model
def load_model(modelpath):
    print(modelpath)
    clf = load(os.path.join(modelpath,'model.joblib'))
    print("loaded")
    return clf

# Return a prediction from the loaded model (see load_model above) for an input payload
def predict(model, payload):
    print(type(payload))
    try:
        # The payload arrives as raw bytes; rebuild the (1, 64) float64 feature array once
        raw = np.frombuffer(payload)
        print(raw)
        features = raw.reshape((1, 64))
        print(features)
        prediction = model.predict(features)
        print(prediction)

        # Index the single prediction explicitly before converting it to a string
        out = str(int(prediction[0]))

    except Exception as e:
        out = [str(type(payload)), str(e)]  # useful for debugging!

    return out

Overwriting modelscript_sklearn.py


## Does this work locally? (not "_in a container locally_", but _actually_ in local)

In [67]:
from modelscript_sklearn import *
model = load_model('.')

.
loaded


In [68]:
predict(model,digits.data[-1:])

<class 'numpy.ndarray'>
[ 0.  0. 10. 14.  8.  1.  0.  0.  0.  2. 16. 14.  6.  1.  0.  0.  0.  0.
 15. 15.  8. 15.  0.  0.  0.  0.  5. 16. 16. 10.  0.  0.  0.  0. 12. 15.
 15. 12.  0.  0.  0.  4. 16.  6.  4. 16.  6.  0.  0.  8. 16. 10.  8. 16.
  8.  0.  0.  1.  8. 12. 14. 12.  1.  0.]
[[ 0.  0. 10. 14.  8.  1.  0.  0.  0.  2. 16. 14.  6.  1.  0.  0.  0.  0.
  15. 15.  8. 15.  0.  0.  0.  0.  5. 16. 16. 10.  0.  0.  0.  0. 12. 15.
  15. 12.  0.  0.  0.  4. 16.  6.  4. 16.  6.  0.  0.  8. 16. 10.  8. 16.
   8.  0.  0.  1.  8. 12. 14. 12.  1.  0.]]
[8]


array([8])
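The local check above passes an ndarray, whereas a deployed endpoint hands `predict` raw bytes. The sketch below exercises the same conversion with a bytes payload and also triggers the error branch. It is self-contained: `StubModel` is a hypothetical stand-in for the fitted classifier, and the `predict` shown here is a condensed copy of the script's function, not an ezsmdeploy API.

```python
import numpy as np


class StubModel:
    """Hypothetical stand-in for the fitted classifier; always predicts class 8."""

    def predict(self, features):
        return np.full(len(features), 8)


def predict(model, payload):
    """Condensed version of the script's predict: bytes in, string out."""
    try:
        features = np.frombuffer(payload, dtype=np.float64).reshape((1, 64))
        return str(int(model.predict(features)[0]))
    except Exception as e:
        return [str(type(payload)), str(e)]


# Send bytes, as the endpoint would, instead of the ndarray itself
good_payload = np.zeros((1, 64)).tobytes()
print(predict(StubModel(), good_payload))

# A payload of the wrong size exercises the error branch
bad_result = predict(StubModel(), b"\x00" * 8)
print(bad_result)
```

Swapping `StubModel()` for the model returned by `load_model('.')` would run the same check against the real classifier.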

### OK, great! Now let's install ezsmdeploy
In some cases, the install fails because of an already installed package called greenlet.
It is not a direct dependency of ezsmdeploy, but it interferes with the installation.
To fix this, either install in a virtualenv, or run:
`pip install ezsmdeploy[locust] --ignore-installed greenlet`

In [69]:
!pip uninstall -y ezsmdeploy

Found existing installation: ezsmdeploy 2.0.dev0
Uninstalling ezsmdeploy-2.0.dev0:
  Successfully uninstalled ezsmdeploy-2.0.dev0


### Install local dev version 

In [70]:
%pip install -e ../

Looking in indexes: https://pypi.org/simple, https://pip.repos.neuron.amazonaws.com
Obtaining file:///home/ec2-user/SageMaker/easy-amazon-sagemaker-deployments
  Preparing metadata (setup.py) ... done
Installing collected packages: ezsmdeploy
  Running setup.py develop for ezsmdeploy
Successfully installed ezsmdeploy-2.0.dev0
Note: you may need to restart the kernel to use updated packages.


### Note: you may need to restart the kernel to use updated packages.

In [71]:
import ezsmdeploy

#### If you have been running other inference containers in local mode, stop existing containers to avoid conflict

In [72]:
!docker container stop $(docker container ls -aq) >/dev/null

s2kkez1w9j-algo-1-bdah7 | [2023-03-09 14:44:48 +0000] [10] [INFO] Handling signal: term
s2kkez1w9j-algo-1-bdah7 exited with code 0
Aborting on container exit...


## Deploy locally

In [73]:
ez = ezsmdeploy.Deploy(model = 'model.joblib', # if you intend to add models later, pass model as list, otherwise str
                  script = 'modelscript_sklearn.py',
                  requirements = ['scikit-learn==1.2.1','numpy==1.22.0','joblib==1.2.0'], #or pass in the path to requirements.txt
                  instance_type = 'local',
                  autoscale = True,
                  wait = True)

0:00:00.173709 | compressed model(s)
0:00:00.256168 | uploaded model tarball(s) ; check returned modelpath
0:00:00.257109 | added requirements file
0:00:00.259105 | added source file
0:00:00.260516 | added Dockerfile
0:00:00.262394 | added model_handler and docker utils
0:00:00.262485 | building docker container

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

0:00:40.303684 | built docker container

update_endpoint is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.

0:00:40.743596 | created model(s). Now deploying on local
Attaching to wok4yk4c8k-algo-1-ezvdb
wok4yk4c8k-algo-1-ezvdb | Starting the inference server with 8 workers.
wok4yk4c8k-algo-1-ezvdb | [2023-03-09 14:45:31 +0000] [10] [INFO] Starting gunicorn 20.1.0
wok4yk4c8k-algo-1-ezvdb | [2023-03-09 14:45:31 +0000] [10] [INFO] Listening at: unix:/tmp/gunicorn.sock (10)
wok4yk4c8k-algo-1-ezvdb | [2023-03-09 14:45:31 +0000] [10] [INFO] Using worker: gevent
wok4yk4c8k-algo-1-ezvdb | [2023-03-09 14:45:31 +0000] [12] [INFO] Booting worker with pid: 12
wok4yk4c8k-algo-1-ezvdb | [2023-03-09 14:45:31 +0000] [13] [INFO] Booting worker with pid: 13
wok4yk4c8k-algo-1-ezvdb | [2023-03-09 14:45:31 +0000] [14] [INFO] Booting worker with pid: 14
wok4yk4c8k-algo-1-ezvdb | [2023-03-09 14:45:31 +0000] [22] [INFO] Booting worker with pid: 22
wok4yk4c8k

## Test containerized version locally

In [74]:
import sagemaker
ez.predictor.serializer = sagemaker.serializers.IdentitySerializer()

In [75]:
out = ez.predictor.predict(digits.data[-1:].tobytes())#.decode()
out

wok4yk4c8k-algo-1-ezvdb | received input data
wok4yk4c8k-algo-1-ezvdb | b'\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00$@\x00\x00\x00\x00\x00\x00,@\x00\x00\x00\x00\x00\x00 @\x00\x00\x00\x00\x00\x00\xf0?\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00@\x00\x00\x00\x00\x00\x000@\x00\x00\x00\x00\x00\x00,@\x00\x00\x00\x00\x00\x00\x18@\x00\x00\x00\x00\x00\x00\xf0?\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00.@\x00\x00\x00\x00\x00\x00.@\x00\x00\x00\x00\x00\x00 @\x00\x00\x00\x00\x00\x00.@\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x14@\x00\x00\x00\x00\x00\x000@\x00\x00\x00\x00\x00\x000@\x00\x00\x00\x00\x00\x00$@\x00\x00\x00\x00\x00\x00\x00\x00\x0

b'8'

wok4yk4c8k-algo-1-ezvdb | 172.21.0.1 - - [09/Mar/2023:14:46:05 +0000] "POST /invocations HTTP/1.1" 200 1 "-" "python-urllib3/1.26.8"


In [20]:
!docker container stop $(docker container ls -aq) >/dev/null

algo-1-sdnh9_1  | [2023-03-08 20:31:31 +0000] [10] [INFO] Handling signal: term
tmp1tanha9z_algo-1-sdnh9_1 exited with code 0
Aborting on container exit...


## Deploy on SageMaker

In [77]:
ezonsm = ezsmdeploy.Deploy(model = 'model.joblib', # if you intend to add models later, pass model as list, otherwise str
                  script = 'modelscript_sklearn.py',
                  requirements = ['scikit-learn==1.2.1','numpy==1.22.0','joblib==1.2.0'], #or pass in the path to requirements.txt
                  autoscale = True,
                  wait = True)

0:00:00.174026 | compressed model(s)
0:00:00.666720 | uploaded model tarball(s) ; check returned modelpath
0:00:00.667488 | added requirements file
0:00:00.669371 | added source file
0:00:00.670669 | added Dockerfile
0:00:00.672636 | added model_handler and docker utils
0:00:00.672759 | building docker container

https://docs.docker.com/engine/reference/commandline/login/#credentials-store

0:00:40.695327 | built docker container

update_endpoint is a no-op in sagemaker>=2.
See: https://sagemaker.readthedocs.io/en/stable/v2.html for details.

0:00:40.923249 | created model(s). Now deploying on ml.m5.xlarge
0:02:43.054103 | deployed model
0:02:44.215367 | set up autoscaling
0:02:44.216112 | estimated cost is $0.3 per hour
0:02:44.216282 | Done! ✔


To debug Docker build errors, try this:

In [76]:
# !./src/build-docker.sh test

In [78]:
out = ezonsm.predictor.predict(digits.data[-1:].tobytes())#.decode()
out

b'8'
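The endpoint returns the label as bytes, so a client usually needs to decode it before using it as a number. The following is a small sketch; `parse_label` is a hypothetical helper name, and it assumes the script returned a digit string rather than its debugging output, in which case `int()` would raise a `ValueError`.

```python
def parse_label(response: bytes) -> int:
    """Decode a response body such as b'8' into an integer class label."""
    return int(response.decode("utf-8").strip())


label = parse_label(b"8")
print(label)
```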

### Don't leave resources running

In [79]:
ezonsm.predictor.delete_endpoint()
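If the cell that calls `delete_endpoint()` is skipped after an error, the endpoint keeps running and accruing cost. One possible pattern is to wrap the predictor in a context manager so the deletion always runs. The sketch below is an assumption about how one might structure this, not something ezsmdeploy provides: `endpoint_cleanup` and `StubPredictor` are hypothetical names, and the stub only stands in for `ezonsm.predictor` so the example runs without AWS access.

```python
from contextlib import contextmanager


@contextmanager
def endpoint_cleanup(predictor):
    """Yield the predictor and always call delete_endpoint() afterwards."""
    try:
        yield predictor
    finally:
        predictor.delete_endpoint()


class StubPredictor:
    """Hypothetical stand-in for ezonsm.predictor, used only to show the flow."""

    def __init__(self):
        self.deleted = False

    def predict(self, payload):
        return b"8"

    def delete_endpoint(self):
        self.deleted = True


stub = StubPredictor()
with endpoint_cleanup(stub) as predictor:
    result = predictor.predict(b"")

print(result, stub.deleted)
```

With a real deployment, `endpoint_cleanup(ezonsm.predictor)` would be used in place of the stub.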