# Deploying an XGBoost model on Verta

Within Verta, a "Model" can be any arbitrary function: a traditional ML model (e.g., sklearn, PyTorch, TF); a plain function (e.g., squaring a number or making a DB call); or a mixture of the two (e.g., pre-processing code, a DB call, and then a model application). See the [registry concepts documentation](https://docs.verta.ai/verta/registry/concepts) for more.
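To make the "any arbitrary function" idea concrete, here is a minimal sketch of a model that only squares its inputs. The class name `Square` is hypothetical, and the fallback to `object` is there only so the sketch runs where the `verta` package is not installed.

```python
try:
    from verta.registry import VertaModelBase
except ImportError:
    # Fallback so this sketch still runs without the verta package.
    VertaModelBase = object


class Square(VertaModelBase):
    """Hypothetical function-style model: no trained weights, just arithmetic."""

    def __init__(self, artifacts=None):
        # Nothing to load here; a trained model would read its files instead.
        pass

    def predict(self, input):
        # Square every number in the request.
        return [x ** 2 for x in input]


print(Square().predict([1, 2, 3]))
```

The same two-method shape (`__init__` to load state, `predict` to answer a request) would also hold a pre-processing step or a database call in front of a trained model.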

This notebook shows how to deploy an XGBoost model on Verta as a Verta Standard Model, either via convenience functions or by extending [VertaModelBase](https://verta.readthedocs.io/en/master/_autogen/verta.registry.VertaModelBase.html?highlight=VertaModelBase#verta.registry.VertaModelBase).

<a href="https://colab.research.google.com/github/VertaAI/modeldb/blob/master/client/workflows/examples/xgboost.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 0. Imports

In [1]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

import numpy as np
import pandas as pd

import sklearn
from sklearn import datasets
from sklearn import model_selection

import xgboost as xgb

### 0.1 Verta import and setup

In [2]:
# restart your notebook if prompted on Colab
try:
    import verta
except ImportError:
    !pip install verta

In [3]:
import os

# Make sure your Verta credentials are set; if they are not, uncomment and fill in the lines below
# os.environ['VERTA_EMAIL'] = 
# os.environ['VERTA_DEV_KEY'] = 
# os.environ['VERTA_HOST'] = 

from verta import Client
client = Client(os.environ['VERTA_HOST'])

PROJECT_NAME = "Wine Multiclassification"
EXPERIMENT_NAME = "Boosted Trees"
proj = client.set_project(PROJECT_NAME)
expt = client.set_experiment(EXPERIMENT_NAME)
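If `VERTA_HOST` is unset, the `os.environ['VERTA_HOST']` lookup in the cell above fails with a bare `KeyError`. A small pre-flight check, run before that cell, can give a clearer message. This is only a sketch: the helper name `missing_verta_vars` is hypothetical, and the three variable names are the ones listed in the comments above.

```python
import os

REQUIRED_VARS = ("VERTA_EMAIL", "VERTA_DEV_KEY", "VERTA_HOST")


def missing_verta_vars(env=os.environ):
    """Return the names of required variables that are unset or empty."""
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_verta_vars()
if missing:
    print("Set these before creating the Client: " + ", ".join(missing))
```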

---

## 1. Model training

### 1.1 Prepare Data

In [4]:
data = datasets.load_wine()

X = data['data']
y = data['target']

dtrain = xgb.DMatrix(X, label=y)

In [5]:
df = pd.DataFrame(np.hstack((X, y.reshape(-1, 1))),
                  columns=data['feature_names'] + ['class'])

df.head()

### 1.2 Prepare Hyperparameters

In [6]:
grid = model_selection.ParameterGrid({
    'eta': [0.5, 0.7],
    'max_depth': [1, 2, 3],
    'num_class': [3],  # the wine dataset has three classes
})
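`ParameterGrid` expands a dict of lists into every combination of values. The sketch below mirrors that idea with the standard library only; `expand_grid` is a hypothetical helper, not part of scikit-learn. It leaves out the single-valued `num_class` entry, which does not change the number of combinations.

```python
import itertools


def expand_grid(grid):
    """Yield one dict per combination, iterating keys in sorted order."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


combos = list(expand_grid({
    'eta': [0.5, 0.7],
    'max_depth': [1, 2, 3],
}))
print(len(combos))  # 6
for combo in combos:
    print(combo)
```

Two values of `eta` times three values of `max_depth` gives six combinations, so the tuning loop in the next section creates six experiment runs.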

### 1.3 Train the model and tune hyperparameters

In [7]:
def run_experiment(hyperparams):
    run = client.set_experiment_run()
    
    # log hyperparameters
    run.log_hyperparameters(hyperparams)
    
    # run cross validation on hyperparameters
    cv_history = xgb.cv(hyperparams, dtrain,
                        nfold=5,
                        metrics=("merror", "mlogloss"))

    # log observations from each boosting round
    for _, iteration in cv_history.iterrows():
        # Series.iteritems() was removed in pandas 2.0; items() works on old and new versions
        for obs, val in iteration.items():
            run.log_observation(obs, val)

    # log error from the final boosting round
    final_val_error = cv_history['test-merror-mean'].iloc[-1]
    run.log_metric("val_error", final_val_error)
    print("{} Mean error: {:.4f}".format(hyperparams, final_val_error))
    
# NOTE: run_experiment() could also be defined in a module, and executed in parallel
for hyperparams in grid:
    run_experiment(hyperparams)
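The two metrics passed to `xgb.cv` above are the multiclass error rate (`merror`) and the multiclass log loss (`mlogloss`). As a rough illustration of what the logged numbers mean, here is a plain-Python sketch of both; the function names simply reuse the metric names and are not XGBoost APIs.

```python
import math


def merror(y_true, y_pred):
    """Fraction of examples whose predicted class differs from the label."""
    wrong = sum(1 for t, p in zip(y_true, y_pred) if t != p)
    return wrong / len(y_true)


def mlogloss(y_true, probs):
    """Mean negative log of the probability assigned to the true class."""
    return -sum(math.log(p[t]) for t, p in zip(y_true, probs)) / len(y_true)


labels = [0, 1, 2, 2]
print(merror(labels, [0, 1, 1, 2]))  # 0.25: one of four predictions is wrong
print(mlogloss(labels, [[0.5, 0.25, 0.25]] * 4))
```

Lower is better for both, which is why the run selection in the next section sorts `val_error` in ascending order.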

### 1.4 Select the best hyperparameters and train on the full dataset

In [8]:
best_run = expt.expt_runs.sort("metrics.val_error", descending=False)[0]
print("Validation Error: {:.4f}".format(best_run.get_metric("val_error")))

best_hyperparams = best_run.get_hyperparameters()
print("Hyperparameters: {}".format(best_hyperparams))

In [9]:
model = xgb.XGBClassifier(**best_hyperparams)
model.fit(X, y)

In [10]:
# Calculate and Log Accuracy on Full Training Set
train_acc = model.score(X, y)
best_run.log_metric("train_acc_full", train_acc)
print("Training accuracy: {:.4f}".format(train_acc))

## 2. Register Model for Deployment

In [11]:
registered_model = client.get_or_create_registered_model(
    name="wine", labels=["xgboost"])

In [12]:
from verta.environment import Python

# "scikit-learn" is the package name on PyPI; "sklearn" is only the import name
model_version = registered_model.create_standard_model_from_xgboost(
    model, environment=Python(requirements=["xgboost", "scikit-learn"]), name="v1")
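The introduction names extending `VertaModelBase` as the alternative to the convenience function used above. A rough sketch of that route follows. The class name `WineClassifier`, the artifact key `xgb_model`, the file name `wine.json`, and the version name `v2` are all hypothetical. The sketch assumes that Verta passes `__init__` a dict mapping artifact keys to local file paths, that `verify_io` is the decorator Verta expects on `predict`, and that your XGBoost version can reload a saved classifier with `load_model`; check each of these against the documentation for your installed versions.

```python
try:
    from verta.registry import VertaModelBase, verify_io
except ImportError:
    # Fallbacks so this sketch still runs without the verta package.
    VertaModelBase = object

    def verify_io(func):
        return func


class WineClassifier(VertaModelBase):
    MODEL_KEY = "xgb_model"  # hypothetical artifact key

    def __init__(self, artifacts):
        # Assumption: artifacts maps each key to a local file path.
        self.predict_rows = self.load_predictor(artifacts[self.MODEL_KEY])

    @staticmethod
    def load_predictor(path):
        # Imported lazily so defining the class needs no heavy dependencies.
        import numpy as np
        import xgboost as xgb

        clf = xgb.XGBClassifier()
        clf.load_model(path)
        return lambda rows: clf.predict(np.asarray(rows, dtype=float))

    @verify_io
    def predict(self, input):
        # Return plain ints so the response is JSON-serializable.
        return [int(label) for label in self.predict_rows(input)]


# Possible registration, commented out because it needs a live Verta client:
# model.save_model("wine.json")
# registered_model.create_standard_model(
#     WineClassifier,
#     environment=Python(requirements=["xgboost", "scikit-learn", "numpy"]),
#     artifacts={WineClassifier.MODEL_KEY: "wine.json"},
#     name="v2",
# )
```

Keeping the loading step in its own method means the class can be exercised locally with a stand-in predictor before anything is uploaded.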

---

## 3. Deploy model to endpoint

In [13]:
wine_endpoint = client.get_or_create_endpoint("wine")
wine_endpoint.update(model_version, wait=True)

In [14]:
deployed_model = wine_endpoint.get_deployed_model()
deployed_model.predict([X[0]])

---