# XGBoost Boston Model

 * Wrap an XGBoost Python model for use as a prediction microservice in Seldon Core
   * Run locally on Docker to test
   * Deploy on Seldon Core running on a Kubernetes cluster
 
## Dependencies

 * [S2I](https://github.com/openshift/source-to-image) 

In [None]:
!pip install -r requirements.txt

## Train locally

In [None]:
#### Import Comet ####
from comet_ml import Experiment

#### Import Dependencies ####
import xgboost as xgb
# Note: load_boston was removed in scikit-learn 1.2, so this import needs scikit-learn < 1.2.
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
import pandas as pd
import os

Let's create a Comet Experiment to track the XGBoost hyperparameters and metrics and to save the trained model.

In [None]:
API_KEY = "TODO_REPLACE_ME"

experiment = Experiment(project_name="xgboost_seldon", api_key=API_KEY)

Then load and prepare the data:

In [None]:
#### Load and configure boston housing dataset ####
boston = load_boston()
data = pd.DataFrame(boston.data)
data.columns = boston.feature_names
data["Price"] = boston.target
X, y = data.iloc[:, :-1], data.iloc[:, -1]

#### Split data into train and test sets ####
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123
)

Then train the model:

In [None]:
#### Define hyperparameters for model ####
param = {
    "objective": "reg:squarederror",
    "colsample_bytree": 0.3,
    "learning_rate": 0.1,
    "max_depth": 5,
    "alpha": 10,
    "n_estimators": 50,
}

#### Initialize XGBoost Regressor ####
xg_reg = xgb.XGBRegressor(**param)

#### Train model ####
xg_reg.fit(
    X_train,
    y_train,
    eval_set=[(X_train, y_train), (X_test, y_test)],
    eval_metric="rmse",
)
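
The `rmse` metric reported for each evaluation set during training is the root mean squared error. As a reminder of what that number measures, here is a minimal sketch in plain Python; the `rmse` helper is hypothetical and not part of XGBoost or this notebook.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    # Square each residual, average them, then take the square root.
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))
```

After training, something like `rmse(list(y_test), list(xg_reg.predict(X_test)))` should give a value comparable to the test-set RMSE that XGBoost logs.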

Finally, save the model to Comet.ml:

In [None]:
os.makedirs("output", exist_ok=True)

xg_reg.save_model("output/0001.model")

model_name = "XGBoost Model (Boston)"

experiment.log_model(model_name, "output/0001.model")

experiment.end()

In [None]:
experiment.display(tab="assets")

You can check out the experiment assets above. To register an experiment model as a registry model, select the model and click `Register`. Refer to [the documentation](https://www.comet.ml/docs/user-interface/models/) for more information.

## Wrap model with Python Wrapper Class

Now that we have a trained model logged to Comet, let's see how to retrieve it and wrap it with Seldon.

To retrieve the model, use the following command, replacing the registry model name and version with your own:

In [None]:
%%bash -s "{experiment.workspace}" "{API_KEY}"
export WORKSPACE="$1"
export REGISTRY_MODEL_NAME="TODO_REPLACE_ME"
export REGISTRY_MODEL_VERSION="TODO_REPLACE_ME"

env COMET_API_KEY="$2" python download_model.py --workspace "$WORKSPACE" --registry_name "$REGISTRY_MODEL_NAME" --model_version "$REGISTRY_MODEL_VERSION"
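
The `download_model.py` script is not shown in this notebook. The sketch below is one way such a script could look, not its actual contents. It assumes the `comet_ml` `API.download_registry_model` call, that the API key comes from the `COMET_API_KEY` environment variable as in the command above, and a hypothetical `--output_path` flag defaulting to `model`, which is the directory `MyModel.py` loads from.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Command-line flags mirroring the ones used in the command above."""
    parser = argparse.ArgumentParser(description="Download a Comet registry model")
    parser.add_argument("--workspace", required=True)
    parser.add_argument("--registry_name", required=True)
    parser.add_argument("--model_version", required=True)
    # Hypothetical flag: "model" matches the directory MyModel.py reads from.
    parser.add_argument("--output_path", default="model")
    return parser


def main(argv=None) -> None:
    args = build_parser().parse_args(argv)

    # Imported lazily so argument parsing can be checked without comet_ml installed.
    from comet_ml import API

    api = API()  # picks up COMET_API_KEY from the environment
    api.download_registry_model(
        args.workspace,
        args.registry_name,
        version=args.model_version,
        output_path=args.output_path,
        expand=True,  # unpack the downloaded archive into output_path
    )
```

To run it as a script, call `main()` under the usual `if __name__ == "__main__":` guard.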

Then we need to define a few files for building a Seldon-compatible Docker container. First, the model Python file:

In [None]:
%%writefile MyModel.py
import xgboost as xgb
import numpy as np
from typing import Dict, List, Union, Iterable


class MyModel:

    def __init__(self):
        """
        Add any initialization parameters.
        These will be passed at runtime from the graph definition parameters defined in your seldondeployment kubernetes resource manifest.
        """
        self._model = xgb.Booster(model_file="model/0001.model")

    def predict(
        self, X: np.ndarray, names: Iterable[str], meta: Dict = None
    ) -> Union[np.ndarray, List, str, bytes]:
        """
        Return a prediction.

        Parameters
        ----------
        X : array-like
        names : iterable of feature names (optional)
        meta : dict of request metadata (optional)
        """

        dmatrix = xgb.DMatrix(X)
        result: np.ndarray = self._model.predict(dmatrix)
        return result


Checking the syntax of the model file:

In [None]:
!python MyModel.py

Then the Python dependencies:

In [None]:
%%writefile requirements.txt
xgboost
pip
seldon_core

And finally the definition file for Seldon:

In [None]:
%%bash
mkdir -p .s2i

In [None]:
%%writefile .s2i/environment
MODEL_NAME=MyModel
API_TYPE=REST
SERVICE_TYPE=MODEL
PERSISTENCE=0

## Build the Docker image

The recommended way of building Seldon images is to use [s2i](https://github.com/openshift/source-to-image) with official ready-to-use definition images. Please refer to [the Seldon documentation](https://docs.seldon.io/projects/seldon-core/en/latest/python/python_wrapping_s2i.html) for more information.

In [None]:
!s2i build . seldonio/seldon-core-s2i-python3:1.1.1-rc comet_ml/xgboost_seldon:0.1

## Test locally

Once the Docker image has been built, we can start it locally and test it:

In [None]:
!docker run --name "xgboost_predictor" -d --rm -p 5000:5000 comet_ml/xgboost_seldon:0.1

Send a sample feature vector that conforms to the contract:

In [None]:
!curl -g http://localhost:5000/predict --data-urlencode 'json={"data": {"names": ["message"], "ndarray": [[0.00632, 18.0, 2.31, 0.0, 0.538, 6.575, 65.2, 4.09, 1.0, 296.0, 15.3, 396.9, 4.98]]}}'
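
The same request can be sent from Python. The sketch below mirrors the `curl` call using only the standard library: it form-encodes the payload under a `json` field and posts it to the local container. The helper names `build_form_body` and `predict` are hypothetical, and treating `names` as optional is an assumption about the Seldon payload.

```python
import json
import urllib.parse
import urllib.request
from typing import List, Optional


def build_form_body(rows: List[List[float]], names: Optional[List[str]] = None) -> bytes:
    """Form-encode a Seldon-style payload under the `json` field, as curl does."""
    payload = {"data": {"ndarray": rows}}
    if names is not None:
        payload["data"]["names"] = names
    return urllib.parse.urlencode({"json": json.dumps(payload)}).encode("utf-8")


def predict(
    rows: List[List[float]],
    url: str = "http://localhost:5000/predict",
    timeout: float = 10.0,
) -> dict:
    """POST the rows to the running container and return the parsed JSON reply."""
    request = urllib.request.Request(url, data=build_form_body(rows), method="POST")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `predict([[0.00632, 18.0, 2.31, 0.0, 0.538, 6.575, 65.2, 4.09, 1.0, 296.0, 15.3, 396.9, 4.98]])` sends the same 13 features as the `curl` command.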

In [None]:
!docker rm xgboost_predictor --force

## Push to production

Once you have validated locally that your model predicts correctly, you need to push your Docker image to a Kubernetes cluster where Seldon is installed.

Installing and configuring Seldon is out of scope for this notebook, but you can refer to the [Seldon-Core installation page](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).

Once your Kubernetes cluster is ready, you can follow one of the [cloud-specific example notebooks](https://docs.seldon.io/projects/seldon-core/en/latest/examples/notebooks.html#cloud-specific-examples) to learn how to push the built Docker image and deploy it to your cluster.