# Install necessary packages

We can install the necessary packages either by running `pip install --user <package_name>` for each one, or by listing them all in a `requirements.txt` file and running `pip install --user -r requirements.txt`.

> NOTE: Do not forget the `--user` argument. It is required if you want to use Kale to transform this notebook into a Kubeflow pipeline.

In [None]:
!pip3 install --user -r requirements.txt

# Imports

In this section we import the packages we need for this example. Make it a habit to gather your imports in a single place; it will make your life easier if you later transform this notebook into a Kubeflow pipeline using Kale.

In [None]:
import numpy as np
import xgboost as xgb

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score, f1_score, accuracy_score

# Project hyper-parameters

In this cell, we define the hyper-parameter variables. Defining them in one place makes it easier to experiment with their values and to run hyper-parameter (HP) tuning experiments using Kale and Katib.

In [None]:
ETA = .3
MAX_DEPTH = 3
OBJECTIVE = "multi:softprob"
STEPS = 20

# Load and preprocess data

In this section, we load the dataset and process it into a form the model can use directly.

In [None]:
x, y = datasets.load_iris(return_X_y=True)

In [None]:
x_trn, x_tst, y_trn, y_tst = train_test_split(x, y, test_size=.2)

In [None]:
D_trn = xgb.DMatrix(x_trn, label=y_trn)
D_tst = xgb.DMatrix(x_tst, label=y_tst)
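
The split above is random and unseeded, so the test set changes on every run. As a minimal sketch of one way to make it reproducible, the variant below passes `random_state` and `stratify` to `train_test_split`. The seed value `42` and the choice to stratify are assumptions for illustration, not part of the original notebook.

```python
from collections import Counter

from sklearn import datasets
from sklearn.model_selection import train_test_split

x, y = datasets.load_iris(return_X_y=True)

# random_state fixes the shuffle; stratify=y keeps the class proportions
# the same in both subsets. The seed value 42 is an arbitrary choice.
x_trn, x_tst, y_trn, y_tst = train_test_split(
    x, y, test_size=0.2, random_state=42, stratify=y
)

# Number of test samples per class.
test_class_counts = Counter(y_tst.tolist())
```

With a fixed seed, metrics from different runs become comparable, which can help when tuning hyper-parameters.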

# Define and train the model

We are now ready to define our model. In this example, we use the Extreme Gradient Boosting algorithm implemented by [XGBoost](https://xgboost.ai/).

In [None]:
param = {"eta": float(ETA),
         "max_depth": int(MAX_DEPTH),
         "objective": OBJECTIVE,
         "num_class": 3}

steps = int(STEPS)
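
The `float(...)` and `int(...)` casts above are presumably there because a pipeline or tuning framework may hand the hyper-parameters back as strings. The notebook does not say so, so treat that as an assumption. A minimal sketch of the idea, using a hypothetical helper named `build_params`:

```python
def build_params(eta, max_depth, objective, num_class=3):
    """Coerce hyper-parameters to the types XGBoost expects.

    `build_params` is a hypothetical helper, not part of Kale or XGBoost.
    """
    return {
        "eta": float(eta),
        "max_depth": int(max_depth),
        "objective": str(objective),
        "num_class": int(num_class),
    }


# Values given as strings and as native numbers produce the same dictionary.
from_strings = build_params("0.3", "3", "multi:softprob")
from_numbers = build_params(0.3, 3, "multi:softprob")
```

Casting at the point of use keeps the hyper-parameter cell free of logic while still tolerating string inputs.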

In [None]:
model = xgb.train(param, D_trn, steps)

# Evaluate the model

Finally, we are ready to evaluate the model using the test set.

In [None]:
preds = model.predict(D_tst)
# multi:softprob yields one probability per class; keep the most likely class
max_preds = np.argmax(preds, axis=1)
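
With the `multi:softprob` objective, each prediction is a row of per-class probabilities rather than a single label, which is why the cell above takes an argmax. The sketch below shows the conversion on made-up numbers; the probability values are invented for illustration and are not model output.

```python
import numpy as np

# Made-up probabilities for three samples and three classes.
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])

# argmax along axis 1 picks the most probable class for each row.
labels = np.argmax(probs, axis=1)
```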

In [None]:
precision = precision_score(y_tst, max_preds, average='macro')
recall = recall_score(y_tst, max_preds, average='macro')
f1 = f1_score(y_tst, max_preds, average='macro')
accuracy = accuracy_score(y_tst, max_preds)
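
The `average='macro'` argument computes each metric per class and then takes the unweighted mean, so every class counts equally regardless of its size. The sketch below reproduces that calculation by hand on made-up labels. `macro_precision_recall` is a hypothetical helper written for illustration, and scoring an empty class as 0 is an assumption of this sketch.

```python
import numpy as np


def macro_precision_recall(y_true, y_pred, num_classes):
    """Unweighted mean of per-class precision and recall."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    precisions, recalls = [], []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        predicted = np.sum(y_pred == c)
        actual = np.sum(y_true == c)
        # A class with no predictions (or no samples) scores 0 here.
        precisions.append(tp / predicted if predicted else 0.0)
        recalls.append(tp / actual if actual else 0.0)
    return float(np.mean(precisions)), float(np.mean(recalls))


# Made-up labels, for illustration only.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
macro_p, macro_r = macro_precision_recall(y_true, y_pred, num_classes=3)
```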

# Serving

We can deploy the model as an inference server to KFServing, using the Kale `serve` API.

In [None]:
from kale.common.serveutils import serve
kfserver = serve(model)

In [None]:
import json

data = {"instances": [[6.8, 2.8, 4.8, 1.4], [5.1, 3.5, 1.4, 0.2]]}
res = kfserver.predict(json.dumps(data))
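
The request body above is a JSON object with an `instances` key holding one list of feature values per sample. As a minimal sketch, the hypothetical helper `make_payload` below builds such a body from any iterable of rows. It is not part of Kale and covers only the request side.

```python
import json


def make_payload(rows):
    """Wrap feature rows in an {"instances": [...]} JSON document.

    `make_payload` is a hypothetical helper, not part of Kale.
    """
    return json.dumps({"instances": [list(map(float, row)) for row in rows]})


# The two samples from the cell above: sepal length, sepal width,
# petal length, petal width (in cm).
payload = make_payload([[6.8, 2.8, 4.8, 1.4], [5.1, 3.5, 1.4, 0.2]])
decoded = json.loads(payload)
```

Converting every value with `float` also makes NumPy scalars serializable, since `json.dumps` does not accept them directly.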


In [None]:
print(res)

# Pipeline metrics

In the last cell of the notebook, we print the pipeline metrics. Kubeflow Pipelines picks them up and makes them available through its UI.

In [None]:
print(precision)
print(recall)
print(f1)
print(accuracy)