# XGBoost Boston Model

 * Wrap an XGBoost Python model for use as a prediction microservice in seldon-core
   * Run locally on Docker to test
   * Deploy on seldon-core running on a Kubernetes cluster
 
## Dependencies

 * [S2I](https://github.com/openshift/source-to-image) 

First, we make sure we have an up-to-date comet_ml:

In [None]:
%pip install comet_ml --upgrade

Next, we install the necessary requirements to train the model:

In [None]:
%pip install -r train_requirements.txt

## Train locally

In [None]:
#### Import Comet ####
from comet_ml import Experiment

#### Import Dependencies ####
import xgboost as xgb
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
import pandas as pd
import os

Let's create a Comet Experiment so we can track the XGBoost hyperparameters and metrics, and save the trained model.

In [None]:
API_KEY = "YOUR-KEY-HERE"

experiment = Experiment(project_name="xgboost_seldon", api_key=API_KEY)

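Then load and prepare the data. Note that `load_boston` was removed in scikit-learn 1.2, so this cell needs an earlier scikit-learn release: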

In [None]:
#### Load and configure boston housing dataset ####
boston = load_boston()
data = pd.DataFrame(boston.data)
data.columns = boston.feature_names
data["Price"] = boston.target
X, y = data.iloc[:, :-1], data.iloc[:, -1]

#### Split data into train and test sets ####
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=123
)
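
With `test_size=0.2`, roughly one row in five is held out for evaluation. As a quick arithmetic sketch of the resulting set sizes, assuming the standard 506-row Boston housing dataset and scikit-learn's behaviour of rounding the test share up:

```python
import math

n_samples = 506   # rows in the Boston housing dataset
test_size = 0.2   # same fraction as in the train_test_split call

# scikit-learn rounds the test share up and gives the rest to training.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 404 102
```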

Then train the model:

In [None]:
#### Define hyperparameters for model ####
param = {
    "objective": "reg:squarederror",
    "colsample_bytree": 0.3,
    "learning_rate": 0.1,
    "max_depth": 5,
    "alpha": 10,
    "n_estimators": 50,
    # Set the metric here; passing eval_metric to fit() is deprecated in newer XGBoost releases
    "eval_metric": "rmse",
}

#### Initialize XGBoost Regressor ####
xg_reg = xgb.XGBRegressor(**param)

#### Train model and report the metric on both the train and test sets ####
xg_reg.fit(
    X_train,
    y_train,
    eval_set=[(X_train, y_train), (X_test, y_test)],
)
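
The `rmse` metric reported for each evaluation set is the root mean squared error between predictions and targets. As a minimal sketch of what that number measures (the helper `rmse` and the sample values are illustrative and not part of the notebook):

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between targets and predictions."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only: two predictions that are each off by about 2.0.
print(rmse([24.0, 21.6], [22.0, 23.6]))
```

A lower value on the test set than on the training set is unusual; a much higher one is a hint that the model is overfitting.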

Finally, save the model to Comet.ml:

In [None]:
os.makedirs("output", exist_ok=True)

xg_reg.save_model("output/0001.model")

model_name = "XGBoost Model (Boston)"

experiment.log_model(model_name, "output/0001.model")

experiment.end()

In [None]:
experiment.display(tab="assets")

Now you have a choice; **do one of the following**:


1. You can register the experiment model via the Comet User Interface:

You can check out the experiment assets above. To register an experiment model as a registry model, click on the `+ Register` link, then click `Register new model`, and note the exact name and version number of the registered model. Also, note the name of your workspace (usually your Comet ID). Refer to [the documentation](https://www.comet.ml/docs/user-interface/models/) for more information.

2. You can register the experiment model via the following code:

**BEGINNING OF OPTIONAL CODE**

To put an experiment model into the workspace registry, we will use the Comet API:

In [None]:
from comet_ml import API

api = API(API_KEY)

We get an `APIExperiment`:

In [None]:
api_experiment = api.get_experiment_by_id(experiment.id)

And remember the workspace we are in:

In [None]:
WORKSPACE = api_experiment.workspace

We can see the model that was logged to the experiment:

In [None]:
api_experiment.get_model_names()

We make sure this model isn't already among the workspace's registered models:

In [None]:
api.get_registry_model_names(WORKSPACE)

If not listed, then we can register the experiment model, using the experiment model name:

In [None]:
response = api_experiment.register_model('XGBoost Model (Boston)')

Finally, we see that the workspace registry has the model, albeit by a standardized name, 'xgboost-model-boston':

In [None]:
api.get_registry_model_names(WORKSPACE)
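
The registry name differs from the experiment model name, which matters because the download step needs the registry name. The exact normalization Comet applies is not shown in this notebook; the sketch below is only a guess that reproduces the single example here, and `standardize_name` is a hypothetical helper, not a Comet function:

```python
import re


def standardize_name(name: str) -> str:
    """Hypothetical helper: lowercase, then collapse non-alphanumerics to '-'."""
    slug = re.sub(r"[^a-z0-9]+", "-", name.lower())
    return slug.strip("-")


print(standardize_name("XGBoost Model (Boston)"))  # xgboost-model-boston
```

When in doubt, rely on the names returned by `api.get_registry_model_names(WORKSPACE)` rather than on a derived name.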

**END OF OPTIONAL CODE**

## Download the Model

Now that we have a trained model logged to Comet, let's see how to retrieve it and wrap it with Seldon.

To retrieve the model, you can use the following command:

In [None]:
! COMET_API_KEY="$API_KEY" comet download_model \
    --workspace "$WORKSPACE" \
    --registry_name 'xgboost-model-boston' \
    --model_version "1.0.0"

Then we need to define a few files for building a Seldon-compatible Docker container. First, the model's Python file:

In [None]:
%%writefile MyModel.py
import xgboost as xgb
import numpy as np
from typing import Dict, List, Union, Iterable


class MyModel:

    def __init__(self):
        """
        Add any initialization parameters.
        These will be passed at runtime from the graph definition parameters defined in your seldondeployment kubernetes resource manifest.
        """
        self._model = xgb.Booster(model_file="model/0001.model")

    def predict(
        self, X: np.ndarray, names: Iterable[str], meta: Dict = None
    ) -> Union[np.ndarray, List, str, bytes]:
        """
        Return a prediction.

        Parameters
        ----------
        X : array-like
        names : iterable of feature names (optional)
        meta : dictionary of request metadata (optional)
        """
    
        dmatrix = xgb.DMatrix(X)
        result: np.ndarray = self._model.predict(dmatrix)
        return result


Checking the syntax of the model file:

In [None]:
!python MyModel.py

Then the Python dependencies:

In [None]:
%%writefile requirements.txt
xgboost
pip
seldon_core

And finally the definition file for Seldon:

In [None]:
%%bash
mkdir -p .s2i

In [None]:
%%writefile .s2i/environment
MODEL_NAME=MyModel
API_TYPE=REST
SERVICE_TYPE=MODEL
PERSISTENCE=0
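
The environment file is a list of plain `KEY=VALUE` lines, and `MODEL_NAME` has to match the Python file and class written earlier (`MyModel.py` and `MyModel`). A small sketch of a pre-build sanity check follows; `parse_env` and `model_file_for` are hypothetical helpers, not part of s2i or Seldon:

```python
from pathlib import Path


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a KEY=VALUE line: {line!r}")
        settings[key.strip()] = value.strip()
    return settings


def model_file_for(settings: dict, root: str = ".") -> Path:
    """Path of the module that MODEL_NAME is expected to refer to."""
    return Path(root) / f"{settings['MODEL_NAME']}.py"


# Same content as the .s2i/environment file in this notebook.
env_text = """MODEL_NAME=MyModel
API_TYPE=REST
SERVICE_TYPE=MODEL
PERSISTENCE=0
"""
settings = parse_env(env_text)
print(settings["MODEL_NAME"], model_file_for(settings).exists())
```

In the notebook itself, the text could be read with `Path(".s2i/environment").read_text()` instead of the inline string.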

## Build the Docker image

The recommended way of building Seldon images is to use [s2i](https://github.com/openshift/source-to-image) with official ready-to-use definition images. Please refer to [the Seldon documentation](https://docs.seldon.io/projects/seldon-core/en/latest/python/python_wrapping_s2i.html) for more information.

In [None]:
!s2i build . seldonio/seldon-core-s2i-python3:1.1.1-rc comet_ml/xgboost_seldon:0.1

## Test locally

Once the Docker image has been built, we can start it locally and test it:

In [None]:
!docker run --name "xgboost_predictor" -d --rm -p 5000:5000 comet_ml/xgboost_seldon:0.1
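
Because the container is started detached (`-d`), the service may need a moment before it accepts requests. A minimal sketch of a readiness wait, assuming the port mapping used in this notebook (`wait_for_port` is a hypothetical helper, and an open port is only a rough signal that the model server is up):

```python
import socket
import time


def wait_for_port(host: str, port: int, timeout: float = 30.0) -> bool:
    """Return True once a TCP connection succeeds, False after `timeout` seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=1.0):
                return True
        except OSError:
            # Not accepting connections yet; retry shortly.
            time.sleep(0.5)
    return False


# Example: call wait_for_port("localhost", 5000) before sending a request.
```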

Send some random features that conform to the contract:

In [None]:
!curl -g http://localhost:5000/predict --data-urlencode 'json={"data": {"names": ["message"], "ndarray": [[0.00632, 18.0, 2.31, 0.0, 0.538, 6.575, 65.2, 4.09, 1.0, 296.0, 15.3, 396.9, 4.98]]}}'
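
The same request body can be assembled in Python, which makes it easier to check that each row carries the 13 features the model was trained on. This is a sketch rather than the notebook's method: `build_form_body` is a hypothetical helper, and using the Boston column names in `names` (instead of the `"message"` placeholder from the curl call) is an assumption.

```python
import json
from urllib.parse import parse_qs, urlencode

# Feature names of the Boston housing dataset, in column order.
FEATURE_NAMES = ["CRIM", "ZN", "INDUS", "CHAS", "NOX", "RM", "AGE",
                 "DIS", "RAD", "TAX", "PTRATIO", "B", "LSTAT"]


def build_form_body(rows, names=FEATURE_NAMES):
    """Build a form-encoded `json=...` body like the one the curl call sends."""
    for row in rows:
        if len(row) != len(names):
            raise ValueError(f"expected {len(names)} features, got {len(row)}")
    payload = {"data": {"names": list(names), "ndarray": [list(r) for r in rows]}}
    return urlencode({"json": json.dumps(payload)})


# Same feature values as in the curl example.
body = build_form_body(
    [[0.00632, 18.0, 2.31, 0.0, 0.538, 6.575, 65.2, 4.09, 1.0, 296.0, 15.3, 396.9, 4.98]]
)

# Decode the body again to confirm it round-trips.
decoded = json.loads(parse_qs(body)["json"][0])
print(len(decoded["data"]["ndarray"][0]))  # 13
```

With the container running, the body could be posted with `urllib.request.urlopen("http://localhost:5000/predict", data=body.encode())`.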

In [None]:
!docker rm xgboost_predictor --force

## Push to production

Once you have validated locally that your model predicts correctly, you need to push your Docker image to a Kubernetes cluster where Seldon is installed.

Installing and configuring Seldon is out of the scope of this notebook, but you can refer to the [Seldon-Core installation page](https://docs.seldon.io/projects/seldon-core/en/latest/workflow/install.html).

Once your Kubernetes cluster is ready, you can follow one of the [cloud-specific example notebooks](https://docs.seldon.io/projects/seldon-core/en/latest/examples/notebooks.html#cloud-specific-examples) to learn how to push the built Docker image and deploy it to your cluster.