# Deploy a Serverless Model Server with Nuclio-KFServing
  --------------------------------------------------------------------

This notebook demonstrates how to deploy **any pickled model** using **[nuclio](https://github.com/nuclio/nuclio)** + **[KFServing](https://github.com/kubeflow/kfserving)** (a.k.a. <b>Nuclio-serving</b>).

#### **notebook how-to's**
* Write and test a model-serving (KFServing) class in a notebook.
* Deploy the model server as a Nuclio-serving function.
* Invoke and test the serving function.

<a id='top'></a>
#### **steps**
**[define a new function and its dependencies](#define-function)**<br>
**[test the model serving class locally](#test-locally)**<br>
**[deploy our serving class as a serverless function](#deploy)**<br>
**[test our model server using an HTTP request](#test-model-server)**<br>

In [1]:
# nuclio: ignore
# if the nuclio-jupyter package is not installed, run: !pip install nuclio-jupyter
import nuclio

In [19]:
%%nuclio cmd
python -m pip install kfserving
python -m pip install git+https://github.com/mlrun/mlrun.git@development
python -m pip install cloudpickle

Collecting kubernetes>=10.0.1
  Using cached kubernetes-10.0.1-py2.py3-none-any.whl (1.5 MB)
ERROR: kfp 0.2.1 has requirement kubernetes<=10.0.0,>=8.0.0, but you'll have kubernetes 10.0.1 which is incompatible.
Installing collected packages: kubernetes
  Attempting uninstall: kubernetes
    Found existing installation: kubernetes 10.0.0
    Uninstalling kubernetes-10.0.0:
      Successfully uninstalled kubernetes-10.0.0
Successfully installed kubernetes-10.0.1
Collecting git+https://github.com/mlrun/mlrun.git@development
  Cloning https://github.com/mlrun/mlrun.git (to revision development) to /tmp/pip-req-build-pd11u_4m
  Running command git clone -q https://github.com/mlrun/mlrun.git /tmp/pip-req-build-pd11u_4m
  Running command git checkout -b development --track origin/development
  Switched to a new branch 'development'
  Branch development set up to track remote branch development from origin.
Collecting kubernetes<=10.0.0,>=8.0.0
  Using cached kubernetes-10.0.0-py2.py3

<a id='define-function'></a>
### **define a new function and its dependencies**

In [20]:
# %nuclio config kind="nuclio:serving"
%nuclio env MODEL_CLASS=ClassifierModel

%nuclio: setting 'MODEL_CLASS' environment variable


In [21]:
# %nuclio config spec.build.baseImage = "yjbds/mlrun-serving:nobase"

In [22]:
import kfserving
import os
import numpy as np
from cloudpickle import load

In [23]:
TARGET_PATH = '/User/repos/demos/dask/artifacts'
MODEL_FILE = 'lgbm-model.pkl'

In [24]:
class ClassifierModel(kfserving.KFModel):
    def __init__(self, name: str, model_dir: str, model = None):
        super().__init__(name)
        self.name = name
        self.model_dir = model_dir
        if model is not None:
            self.classifier = model
            self.ready = True

    def load(self):
        model_file = os.path.join(
            kfserving.Storage.download(self.model_dir), MODEL_FILE)
        # use a context manager so the file handle is closed after loading
        with open(model_file, 'rb') as f:
            self.classifier = load(f)
        self.ready = True

    def predict(self, body):
        try:
            feats = np.asarray(body['instances'])
            result: np.ndarray = self.classifier.predict(feats)
            return result.tolist()
        except Exception as e:
            # chain the original exception so its traceback is preserved
            raise Exception("Failed to predict %s" % e) from e

The following end-code annotation tells ```nuclio``` to stop parsing the notebook from this cell. _**Please do not remove this cell**_:

In [25]:
# nuclio: end-code
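
The effect of the two annotations used so far (`# nuclio: ignore` and `# nuclio: end-code`) can be pictured with a toy filter over the notebook's cells. This is only an illustration of the rules as described here, not nuclio-jupyter's actual implementation; the function name `exported_cells` and the sample cells are made up.

```python
# Toy model of the export rules; NOT nuclio-jupyter's real parser.
IGNORE = "# nuclio: ignore"
END_CODE = "# nuclio: end-code"


def exported_cells(cells):
    """Return the cell sources that would end up in the function code."""
    kept = []
    for source in cells:
        stripped = source.strip()
        first_line = stripped.splitlines()[0] if stripped else ""
        if first_line == END_CODE:
            break  # everything from this cell on stays notebook-only
        if first_line == IGNORE:
            continue  # only this one cell is skipped
        kept.append(source)
    return kept


cells = [
    "# nuclio: ignore\nimport nuclio",
    "import os",
    "class ClassifierModel:\n    pass",
    "# nuclio: end-code",
    "my_server = ClassifierModel()",
]
print(exported_cells(cells))
```

In this picture, the local tests and the deployment cells that follow the marker never become part of the serving function.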

______________________________________________

<a id='test-locally'></a>
### **test the model serving class locally**
The class above can be tested locally: instantiate it and call `.load()`, which fetches the model file into a local directory and unpickles it.

In [None]:
# model = load(open(TARGET_PATH + '/' + MODEL_FILE, 'rb'))

# my_server = ClassifierModel('classifier', model_dir=TARGET_PATH, model = model)
# my_server.load()
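
When a model object is passed through the `model` argument, the constructor marks the server as ready and `.load()` is not needed. A minimal sketch of that path, assuming a hypothetical `StubClassifier` in place of the pickled LightGBM model and a simplified `LocalModel` that mirrors `ClassifierModel` without the `kfserving` base class or NumPy:

```python
class StubClassifier:
    """Hypothetical stand-in for a trained model: 1 when the row sum is positive."""

    def predict(self, rows):
        return [1 if sum(row) > 0 else 0 for row in rows]


class LocalModel:
    """Simplified mirror of ClassifierModel (no kfserving base class)."""

    def __init__(self, name, model_dir, model=None):
        self.name = name
        self.model_dir = model_dir
        self.ready = False
        if model is not None:
            # an in-memory model makes the server ready without load()
            self.classifier = model
            self.ready = True

    def predict(self, body):
        # same request shape as the notebook: {"instances": [[...], ...]}
        return list(self.classifier.predict(body["instances"]))


server = LocalModel("classifier", model_dir="unused", model=StubClassifier())
predictions = server.predict({"instances": [[1.0, 2.0], [-3.0, 0.5]]})
print(server.ready, predictions)
```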

### _data_
Load the test set from the artifacts directory, separate the `ArrDelay` label from the features, and build the request body (`event`):

In [None]:
# import dask
# import pandas as pd
# import dask.dataframe as dd
# import pyarrow.parquet as pq
# import pyarrow

# xtest = pd.read_parquet('/User/repos/demos/dask/artifacts/test_set')

# xtest.head()

# xtest.pop('index')

# ytest = xtest.pop('ArrDelay')

# event = {"instances": xtest.values.tolist()}

# ytest

# my_server.predict(event)
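
The shape of `event` can be sketched without pandas. The rows below are made-up stand-ins for the Parquet test set; only the `index` and `ArrDelay` column names come from the cell above, and the other column names and all values are hypothetical.

```python
import json

# Hypothetical rows standing in for the test set.
rows = [
    {"index": 0, "Distance": 500.0, "DepDelay": 12.0, "ArrDelay": 1},
    {"index": 1, "Distance": 1200.0, "DepDelay": -3.0, "ArrDelay": 0},
]

# Drop the index and the label, keeping the remaining columns as features.
feature_names = [name for name in rows[0] if name not in ("index", "ArrDelay")]
labels = [row["ArrDelay"] for row in rows]

# One inner list per row, in a fixed column order.
event = {"instances": [[row[name] for name in feature_names] for row in rows]}

payload = json.dumps(event)  # what is sent as the HTTP request body
print(payload)
```

Keeping `labels` aside makes it possible to compare the server's predictions against the true values later.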

<a id='deploy'></a>
### **deploy our serving class as a serverless function**

In [26]:
import mlrun

In [27]:
fn = mlrun.new_model_server('generic', 
                            models={'classifier_gen': TARGET_PATH}, 
                            model_class='ClassifierModel')
fn.apply(mlrun.mount_v3io())

<mlrun.runtimes.function.RemoteRuntime at 0x7fdd581cf198>

In [28]:
%nuclio show

%nuclio: notebook model-server exported
Config:
apiVersion: nuclio.io/v1
kind: Function
metadata:
  annotations:
    nuclio.io/generated_by: function generated at 17-02-2020 by admin from /User/repos/demos/dask/model_server.ipynb
  labels: {}
  name: model-server
spec:
  build:
    commands:
    - python -m pip install kfserving
    - python -m pip install git+https://github.com/mlrun/mlrun.git@development
    - python -m pip install cloudpickle
    functionSourceCode: IyBHZW5lcmF0ZWQgYnkgbnVjbGlvLmV4cG9ydC5OdWNsaW9FeHBvcnRlciBvbiAyMDIwLTAyLTE3IDE1OjUwCgppbXBvcnQga2ZzZXJ2aW5nCmltcG9ydCBvcwppbXBvcnQgbnVtcHkgYXMgbnAKZnJvbSBjbG91ZHBpY2tsZSBpbXBvcnQgbG9hZAoKVEFSR0VUX1BBVEggPSAnL1VzZXIvcmVwb3MvZGVtb3MvZGFzay9hcnRpZmFjdHMnCk1PREVMX0ZJTEUgPSAnbGdibS1tb2RlbC5wa2wnCgpjbGFzcyBDbGFzc2lmaWVyTW9kZWwoa2ZzZXJ2aW5nLktGTW9kZWwpOgogICAgZGVmIF9faW5pdF9fKHNlbGYsIG5hbWU6IHN0ciwgbW9kZWxfZGlyOiBzdHIsIG1vZGVsID0gTm9uZSk6CiAgICAgICAgc3VwZXIoKS5fX2luaXRfXyhuYW1lKQogICAgICAgIHNlbGYubmFtZSA9IG5hbWUKICAgICAgICBzZW
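
The `functionSourceCode` value is the exported notebook code encoded as base64, so it can be inspected with the standard library. A small sketch: the prefix below is copied from the output above, while `decode_source` and the sample string are illustrative names and data, not part of nuclio or mlrun.

```python
import base64

# First 20 characters of the functionSourceCode value shown above.
prefix = "IyBHZW5lcmF0ZWQgYnkg"
decoded_prefix = base64.b64decode(prefix).decode("utf-8")
print(repr(decoded_prefix))


def decode_source(function_source_code: str) -> str:
    """Decode a complete base64 functionSourceCode value to Python source."""
    return base64.b64decode(function_source_code).decode("utf-8")


# Round trip on a tiny sample to show the helper in use.
sample = base64.b64encode(b"import os\n").decode("ascii")
print(decode_source(sample))
```

A value that has been cut off, like the one printed above, only decodes cleanly if it is trimmed to a multiple of four characters.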

In [29]:
# print(fn.to_yaml())

In [None]:
# keep the returned endpoint address; it is used in the HTTP test below
addr = fn.deploy()

[mlrun] 2020-02-17 15:50:59,929 deploy started


<a id="test-model-server"></a>
### **test our model server using an HTTP request**

In [None]:
import json
import requests

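# `addr` is the endpoint address of the deployed function;
# `event` is the {"instances": [...]} body built in the local-test section
resp = requests.post(addr + '/classifier_gen/predict', json=event)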

In [None]:
resp.content  # raw response body (bytes)

In [None]:
json.loads(resp.content)
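
The request/response round trip can be rehearsed without a deployed function. The sketch below starts a throwaway local server that answers on the same `/classifier_gen/predict` path and posts an `instances` body to it using only the standard library. The handler, its stub "model", and the bare-list response are assumptions for illustration; the real serving function's response envelope may differ.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class MockPredictHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for the deployed serving function."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        body = json.loads(self.rfile.read(length))
        if self.path != "/classifier_gen/predict" or "instances" not in body:
            self.send_error(400, "bad request")
            return
        # Stub "model": label 1 when the row sum is positive.
        result = [1 if sum(row) > 0 else 0 for row in body["instances"]]
        payload = json.dumps(result).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, *args):
        pass  # keep the example output quiet


server = HTTPServer(("127.0.0.1", 0), MockPredictHandler)  # port 0: any free port
threading.Thread(target=server.serve_forever, daemon=True).start()
addr = "http://127.0.0.1:%d" % server.server_address[1]

event = {"instances": [[1.0, 2.0], [-3.0, 0.5]]}
request = urllib.request.Request(
    addr + "/classifier_gen/predict",
    data=json.dumps(event).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
# An empty ProxyHandler bypasses any proxy configured in the environment.
opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
with opener.open(request, timeout=5) as resp:
    status = resp.status
    predictions = json.loads(resp.read())

server.shutdown()
server.server_close()
print(status, predictions)
```

Against the real endpoint, checking the status code before parsing the body (for example with `resp.raise_for_status()` in `requests`) surfaces deployment or model-loading errors early.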

**[back to top](#top)**