# Registering an XGBoost model on Verta

Within Verta, a "Model" can be any arbitrary function: a traditional ML model (e.g., sklearn, PyTorch, TF); a function (e.g., squaring a number, making a DB call); or a mixture of the above (e.g., pre-processing code, a DB call, and then a model application). See more [here](https://docs.verta.ai/verta/registry/concepts).
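
As a rough illustration of those three cases, here is a minimal sketch in plain Python. None of it is Verta API: the names `square`, `scale_features`, `toy_model`, and `pipeline` are hypothetical and only show that each case boils down to a callable mapping an input to an output.

```python
# Hypothetical illustration in plain Python; nothing here is Verta API.

def square(x):
    """A plain function: the simplest thing that can act as a 'model'."""
    return x * x


def scale_features(row):
    """Stand-in pre-processing step: divide each feature by the row maximum."""
    peak = max(row)
    return [value / peak for value in row]


def toy_model(row):
    """Stand-in for a trained model: a fixed weighted sum of three features."""
    weights = [0.5, 0.25, 0.25]
    return sum(w * v for w, v in zip(weights, row))


def pipeline(row):
    """A mixture: pre-processing followed by a model application."""
    return toy_model(scale_features(row))


print(square(3))
print(pipeline([2.0, 4.0, 4.0]))
```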

This notebook provides an example of how to register an XGBoost model on Verta as a Verta Standard Model via convenience functions.

<a href="https://colab.research.google.com/github/VertaAI/examples/blob/registry_examples/registry/xgboost/xgboost-registered-model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## 0. Imports

In [None]:
!pip install wget  # you may need pip3
!pip install verta # restart colab if prompted
!pip install xgboost

In [None]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

import numpy as np
import pandas as pd

import xgboost as xgb

---

## 1. Register Model

### 1.1 (Optional) Model Training 

A model has to exist before we can register it, so we will also train one here in our notebook.

If you already have a trained XGBoost model pickled into a file, you can skip this step and register it directly on the catalog.

#### 1.1.1 Load Training Data

In [None]:
from sklearn import datasets

data = datasets.load_wine()

X = data['data']
y = data['target']

dtrain = xgb.DMatrix(X, label=y)

In [None]:
# Combine features and target into one DataFrame for a quick look at the data
df = pd.DataFrame(np.hstack((X, y.reshape(-1, 1))),
                  columns=data['feature_names'] + ['target'])

df.head()

#### 1.1.2 Train/Test code

**Model Info**

We'll be training an XGBoost Regressor on the [Wine-Quality](https://archive.ics.uci.edu/ml/datasets/wine+quality) dataset to predict a score from 0-10 based on the physicochemical features of each wine.

In [None]:
hyperparams = {
    'eta': 0.5,
    'max_depth': 2,
}

model = xgb.XGBRegressor(**hyperparams)
model.fit(X, y)

In [None]:
# Calculate the R^2 score on the full training set
# (for a regressor, `score` returns R^2, not classification accuracy)
train_r2 = model.score(X, y)
print("Training R^2: {:.4f}".format(train_r2))
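
For regressors that follow the scikit-learn estimator API, as `XGBRegressor` does, `score` returns the coefficient of determination R² rather than a classification accuracy. The sketch below computes the same quantity by hand with NumPy on a few made-up values, so you can see what the number means; the helper name `r2_score_manual` and the sample values are hypothetical.

```python
import numpy as np


def r2_score_manual(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total sum of squares
    return 1.0 - ss_res / ss_tot


# Made-up targets and predictions, for illustration only
y_true = [0, 1, 2, 2]
y_pred = [0.1, 0.9, 2.0, 1.8]
print("R^2: {:.4f}".format(r2_score_manual(y_true, y_pred)))
```

A value of 1.0 means the predictions match the targets exactly, while 0.0 means the model does no better than always predicting the mean. Because this notebook scores the model on the data it was trained on, the value is likely to be optimistic.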

### 1.2 Register Model to Verta Model Catalog

Now that the model is in good shape, we can register it on the Verta platform.

We'll connect to Verta through the [Verta Python Client](https://verta.readthedocs.io/en/main/_autogen/verta.Client.html),
create a [registered model](https://verta.readthedocs.io/en/master/_autogen/verta.registry.entities.RegisteredModel.html) for our XGBoost model,
and create a [version](https://verta.readthedocs.io/en/master/_autogen/verta.registry.entities.RegisteredModelVersion.html) under which this particular model is stored on the catalog.

All of these can be viewed in the Verta web app once they are created.

In [None]:
# Paste your credentials into this cell (or set them anywhere above it) before connecting to the Verta platform

from verta import Client

client = Client(
        #   host="app.verta.ai",
        #   email="user@verta.ai",
        #   dev_key="a765b2de-786d-466c-b2d8-thiye06f80d5",
        )
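
If you would rather not paste credentials into the notebook, one option is to read them from environment variables. The sketch below assumes the variable names `VERTA_HOST`, `VERTA_EMAIL`, and `VERTA_DEV_KEY`; check them against the documentation for your client version. The helper `verta_credentials_from_env` is hypothetical and uses only the standard library.

```python
import os


def verta_credentials_from_env(environ=os.environ):
    """Collect Verta connection settings from environment variables.

    The variable names below are an assumption; adjust them if your
    setup uses different ones.
    """
    mapping = {
        "host": "VERTA_HOST",
        "email": "VERTA_EMAIL",
        "dev_key": "VERTA_DEV_KEY",
    }
    missing = [var for var in mapping.values() if not environ.get(var)]
    if missing:
        raise KeyError("Missing environment variables: " + ", ".join(missing))
    return {arg: environ[var] for arg, var in mapping.items()}


# Usage, once the variables are set in your environment:
# from verta import Client
# client = Client(**verta_credentials_from_env())
```

Keeping credentials out of the notebook also makes it safer to share or commit.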

In [None]:
# Create/Get a Verta registered model

from verta.registry import data_type, task_type

registered_model = client.get_or_create_registered_model(
    name="wine-xgboost-example", # Name to identify on the catalog
    desc="Models trained to predict a score from 0-10 given physicochemical features of a wine", # Small description to show on Model Card
    data_type=data_type.Tabular(), # Data Type of the Model
    task_type=task_type.Regression(), # Task Type of the Model
    labels=["tabular", "xgboost"]) # tags/labels to filter, search and categorize

#### 1.2.1 Register from the model object
 
If the model object is available in the current notebook (it can also be loaded through joblib/cloudpickle if it was saved to a pickle file), use the code below to package the model.

In [None]:
from verta.environment import Python
from verta.utils import ModelAPI

# Uncomment the lines below if you want to load the model from a pickle file
# import cloudpickle
# model = cloudpickle.load(open("<filepath_to_model.pkl>", "rb"))

model_version = registered_model.create_standard_model_from_xgboost(
    model, 
    environment=Python(requirements=["xgboost", "sklearn"]), 
    name="v1", # Name to identify the version in the model versions tab
    labels=["std_model_xgboost"], # tags/labels to filter, search and categorize
    model_api=ModelAPI(X, y), # (Optional) To Populate the Model API
    )
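
The `requirements` list above names packages without versions. If you want the registered environment to record the exact versions installed in your notebook, one possible approach is to build pinned specifiers yourself. This is a sketch under the assumption that `requirements` accepts standard pip specifiers such as `name==version`; the helper `pin_requirements` is hypothetical, and Verta may already handle pinning for you.

```python
from importlib import metadata


def pin_requirements(packages):
    """Return pip-style specifiers pinned to the locally installed versions.

    Packages that are not installed are returned unpinned.
    """
    pinned = []
    for name in packages:
        try:
            pinned.append("{}=={}".format(name, metadata.version(name)))
        except metadata.PackageNotFoundError:
            pinned.append(name)
    return pinned


# Distribution names as published on PyPI (e.g. "scikit-learn", not "sklearn")
print(pin_requirements(["xgboost", "scikit-learn"]))
```

The resulting list could then be passed as `Python(requirements=...)` in place of the unpinned names.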

---

And that's it! You should now be able to see your model in the Model Catalog.