# Deploy a Serverless XGBoost Model Server
  --------------------------------------------------------------------

The following notebook demonstrates how to deploy an XGBoost model server (a.k.a. <b>Nuclio-serving</b>).

#### **notebook how-to's**
* Write and test a model serving class in a notebook.
* Deploy the model server function.
* Invoke and test the serving function.

<a id="top"></a>
#### **steps**
**[define a new function and its dependencies](#define-function)**<br>
**[test the model serving class locally](#test-locally)**<br>
**[deploy our serving class as a serverless function](#deploy)**<br>
**[test our model server using an HTTP request](#test-model-server)**<br>

In [1]:
# nuclio: ignore
import nuclio 

<a id="define-function"></a>
### **define a new function and its dependencies**

In [2]:
%nuclio config kind="nuclio:serving"
%nuclio env MODEL_CLASS=XGBoostModel

%nuclio config spec.build.baseImage = "yjbds/ml-models:0.4.8"

%nuclio: setting kind to 'nuclio:serving'
%nuclio: setting 'MODEL_CLASS' environment variable
%nuclio: setting spec.build.baseImage to 'yjbds/ml-models:0.4.8'


## Function Code

In [3]:
# import kfserving
import os
import json
import numpy as np
import xgboost as xgb
from cloudpickle import load

### Model Serving Class

import mlrun
class XGBoostModel(mlrun.runtimes.MLModelServer):
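    def load(self):
        # Locate the pickled model file and deserialize it with cloudpickle.
        model_file, extra_data = self.get_model(".pkl")
        with open(str(model_file), "rb") as fp:
            self.model = load(fp)

    def predict(self, body):
        try:
            # Each instance is expected to carry 4 features.
            feats = np.asarray(body["instances"], dtype=np.float32).reshape(-1, 4)
            result = self.model.predict(feats, validate_features=False)
            return result.tolist()
        except Exception as e:
            # Chain the original exception so its traceback is preserved.
            raise Exception("Failed to predict %s" % e) from e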

The following end-code annotation tells ```nuclio``` to stop parsing the notebook from this cell. _**Please do not remove this cell**_:

In [4]:
# nuclio: end-code

### mlconfig

In [5]:
from mlrun import mlconf
import os
mlconf.dbpath = mlconf.dbpath or "http://mlrun-api:8080"
mlconf.artifact_path = mlconf.artifact_path or f"{os.environ['HOME']}/artifacts"

<a id="test-locally"></a>
## Test the function locally

The class above can be tested locally. Instantiate the class and call `.load()`, which loads the model to a local dir.

> **Verify there is a model file in the model_dir path (generated by the training notebook)**
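The check above can be done with a few lines of code before calling `.load()`. The following is a minimal sketch, not part of the original notebook: `find_model_files` is a hypothetical helper, and the `.pkl` default is taken from the suffix that `XGBoostModel.load()` passes to `get_model`.

```python
from pathlib import Path


def find_model_files(model_dir, suffix=".pkl"):
    """Return the sorted files directly inside model_dir whose names end with suffix.

    Hypothetical helper for illustration; returns an empty list when the
    directory does not exist.
    """
    root = Path(model_dir)
    if not root.is_dir():
        return []
    return sorted(p for p in root.iterdir() if p.is_file() and p.name.endswith(suffix))


# Example usage (model_dir is defined in the next cell):
# if not find_model_files(model_dir):
#     raise FileNotFoundError(f"no .pkl model file found in {model_dir}")
```

If the list comes back empty, rerun the training notebook or point `model_dir` at the directory that holds the trained model.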

In [6]:
model_dir = os.path.join(mlconf.artifact_path, "models")
print(model_dir)

my_server = XGBoostModel("my-model", model_dir=model_dir)
my_server.load()

/home/jovyan/data/models


In [7]:
REPO_URL = "https://raw.githubusercontent.com/yjb-ds/testdata/master"
DATA_PATH = "data/classifier-data.csv"
MODEL_PATH = "models/xgb_test"

In [8]:
import pandas as pd
xtest = pd.read_csv(f"{REPO_URL}/{DATA_PATH}")

We can use the `.predict(body)` method to test the model.

In [9]:
import json, numpy as np
preds = my_server.predict({"instances":xtest.values[:10,:-1].tolist()})

In [10]:
print("predicted class:", preds)

predicted class: [0, 0, 0, 1, 1, 1, 0, 0, 1, 0]
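The `predict` method reshapes `body["instances"]` with `reshape(-1, 4)`, so a row that holds 8 values would be silently split into two instances instead of being rejected. A stricter check before calling `.predict(body)` can catch this. The sketch below is an illustration only: `check_instances` is a hypothetical helper, and the default of 4 features mirrors the hard-coded reshape in the serving class.

```python
def check_instances(body, n_features=4):
    """Validate a request body and return the number of instances.

    Hypothetical helper: raises ValueError unless body["instances"] is a
    non-empty list of rows, each holding exactly n_features numbers.
    """
    instances = body.get("instances")
    if not isinstance(instances, list) or not instances:
        raise ValueError('body must contain a non-empty "instances" list')
    for i, row in enumerate(instances):
        if not isinstance(row, (list, tuple)) or len(row) != n_features:
            raise ValueError(f"row {i} must hold exactly {n_features} values")
        for value in row:
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                raise ValueError(f"row {i} contains a non-numeric value: {value!r}")
    return len(instances)


# Example usage with the body sent to my_server.predict above:
# check_instances({"instances": xtest.values[:10, :-1].tolist()})
```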


<a id="deploy"></a>
### **deploy our serving class as a serverless function**
In the following section we create a new model serving function which wraps our class, and specify the model and other resources.

The `models` dict stores model names and the associated model **dir** URL (the URL can start with `S3://` or use other blob store options). The faster way is to use a shared file volume: we use `.apply(mount_v3io())` to attach a v3io (Iguazio data fabric) volume to our function. By default, v3io mounts the current user's home into the `/User` function path.

**verify the model dir contains a valid model file (the `XGBoostModel` class above loads a file with the `.pkl` suffix)**
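Each key of the `models` dict also becomes part of the request path: the HTTP test later in this notebook calls `/xgb_serving_v2/predict` for the model registered as `xgb_serving_v2`. The sketch below maps model names to their predict URLs under that assumption. `predict_urls` is a hypothetical helper, and the base address and model path in the usage example are placeholders, not values from a real deployment.

```python
def predict_urls(base_addr, models):
    """Map each model name to its predict URL.

    Hypothetical helper; assumes every model in the dict is served at
    <base_addr>/<model_name>/predict, as in the request used later on.
    """
    base = base_addr.rstrip("/")
    return {name: f"{base}/{name}/predict" for name in models}


# Example usage with placeholder values:
urls = predict_urls("http://localhost:8080", {"xgb_serving_v2": "/User/artifacts/models"})
print(urls["xgb_serving_v2"])
```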

In [13]:
from mlrun import new_model_server
import requests

In [14]:
fn = new_model_server("xgb-test",
                      model_class="XGBoostModel",
                      models={"xgb_serving_v2": model_dir})
fn.spec.description = "xgboost test data classification server"
fn.metadata.categories = ["serving", "ml"]
fn.metadata.labels = {"author": "yaronh", "framework": "xgboost"}

fn.export("function.yaml")

[mlrun] 2020-05-25 02:20:07,135 function spec saved to path: function.yaml


<mlrun.runtimes.function.RemoteRuntime at 0x7fdfc885c0d0>

In [15]:
from mlutils import get_vol_mount
fn.apply(get_vol_mount())

<mlrun.runtimes.function.RemoteRuntime at 0x7fdfc885c0d0>

## tests

In [16]:
addr = fn.deploy(project="churn-project") # dashboard="http://172.17.0.66:8070")

[mlrun] 2020-05-25 02:20:14,286 deploy started
[nuclio] 2020-05-25 02:20:15,386 (info) Building processor image
[nuclio] 2020-05-25 02:20:19,542 (info) Build complete
[nuclio] 2020-05-25 02:20:26,169 (info) Function deploy complete
[nuclio] 2020-05-25 02:20:26,173 done updating churn-project-xgb-test, function address: 192.168.99.135:31253


In [None]:
addr

<a id="test-model-server"></a>
### **test our model server using an HTTP request**


We invoke our model serving function using test data. The data vector is specified in the `instances` attribute.

In [None]:
# KFServing protocol event
event_data = {"instances": xtest.values[:10,:-1].tolist()}

In [None]:
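import json
# Use the address returned by fn.deploy() instead of a hard-coded host and port
# (assumes `addr` already includes the http:// scheme).
resp = requests.put(addr + "/xgb_serving_v2/predict", json=json.dumps(event_data))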

# mlutils function for this?
tl = resp.text.replace("[","").replace("]","").split(",")
#assert preds == [int(i) for i in np.asarray(tl)]

In [None]:
tl

In [None]:
preds
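Stripping brackets by hand works for a flat list, but parsing the response as JSON is less fragile. The following sketch assumes the server returns a JSON list of class labels such as `[0, 0, 1]`, which matches the shape of the local `preds` but is not confirmed by the output shown here. `parse_predictions` is a hypothetical helper.

```python
import json


def parse_predictions(text):
    """Parse response text holding a JSON list of labels into a list of ints.

    Hypothetical helper; raises ValueError when the payload is not a JSON list.
    """
    values = json.loads(text)
    if not isinstance(values, list):
        raise ValueError("expected a JSON list of predictions")
    return [int(v) for v in values]


# Example usage against the response above:
# assert parse_predictions(resp.text) == preds
```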

**[back to top](#top)**