# Tracking XGBoost models in Verta using autologging

Verta's experiment management system lets data scientists track detailed information about their modeling experiments, including metrics, hyperparameters, confusion matrices, and examples of input and output data.

This notebook shows how to use Verta's experiment management system with models developed in XGBoost. See the Verta [documentation](https://docs.verta.ai/verta/experiment-management) for full details on Verta's experiment management capabilities.

Updated for Verta version: 0.18.2

## 0. Imports

In [1]:
from __future__ import print_function

from sklearn import datasets
from sklearn.model_selection import train_test_split
import xgboost as xgb

### 0.1 Verta import and setup

In [2]:
# restart your notebook if prompted on Colab
try:
    import verta
except ImportError:
    !pip install verta

In [3]:
# import os
# os.environ['VERTA_EMAIL'] = 
# os.environ['VERTA_DEV_KEY'] = 
# os.environ['VERTA_HOST'] =

In [4]:
from verta import Client
import os

client = Client(os.environ['VERTA_HOST'])
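
The `Client` call above raises a bare `KeyError` if `VERTA_HOST` is not set. A minimal sketch of a pre-flight check follows, assuming the three variables named in the commented cell are the ones your setup needs; the helper `missing_verta_vars` is hypothetical and not part of the Verta client.

```python
import os

# Variable names taken from the commented-out setup cell in this notebook.
REQUIRED_VARS = ("VERTA_EMAIL", "VERTA_DEV_KEY", "VERTA_HOST")


def missing_verta_vars(environ=None):
    """Return the names of required variables that are unset or empty."""
    environ = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not environ.get(name)]


missing = missing_verta_vars()
if missing:
    print("Set these before creating the Client:", ", ".join(missing))
```

Running this cell before `Client(...)` turns a missing variable into a readable message instead of a stack trace.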

---

## 1. Model Training

### 1.1 Prepare Data

In [5]:
digits = datasets.load_digits()
n_samples = len(digits.images)
data = digits.images.reshape((n_samples, -1))

X_train, X_test, y_train, y_test = train_test_split(
    data, digits.target, test_size=0.5, shuffle=False)

dtrain = xgb.DMatrix(X_train, label=y_train)
dtest = xgb.DMatrix(X_test, label=y_test)
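
Because the split uses `shuffle=False`, the training set is simply the leading rows of the dataset and the test set is the remainder. The sketch below reproduces that partition with plain Python, assuming the 1,797 samples of the scikit-learn digits dataset; `unshuffled_split_sizes` is a hypothetical helper written for illustration, not a scikit-learn function.

```python
import math


def unshuffled_split_sizes(n_samples, test_size):
    """Sizes of an unshuffled split: the test share is rounded up, the rest is training."""
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test


# The digits dataset contains 1797 samples.
n_train, n_test = unshuffled_split_sizes(1797, 0.5)

# Without shuffling, training rows come first and test rows follow.
train_idx = range(0, n_train)
test_idx = range(n_train, n_train + n_test)
```

An unshuffled split keeps the run reproducible without a random seed, at the cost of depending on how the source data happens to be ordered.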

### 1.2 Define hyperparameters

In [6]:
params = {
    'eta': 0.08,
    'max_depth': 6,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'objective': "multi:softmax",
    'eval_metric': "merror",
    'alpha': 8,
    'lambda': 2,
    'num_class': 10,
}
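num_rounds = 20      # number of boosting rounds
early_stopping = 50  # early-stopping patience; larger than num_rounds, so it cannot trigger here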
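
With `num_rounds = 20` and `early_stopping = 50`, the patience window is longer than the whole training run, so training always completes all 20 rounds. The following is a simplified model of patience-based early stopping, not XGBoost's actual implementation; `rounds_run` and the error values are made up for illustration.

```python
def rounds_run(eval_errors, early_stopping_rounds):
    """Return how many rounds complete before stopping.

    eval_errors holds one validation error per round (lower is better).
    Training stops once the error has failed to improve for
    early_stopping_rounds consecutive rounds.
    """
    best = float("inf")
    best_round = -1
    for i, err in enumerate(eval_errors):
        if err < best:
            best, best_round = err, i
        elif i - best_round >= early_stopping_rounds:
            return i + 1
    return len(eval_errors)


# Illustrative errors: the best value is at round 0, then no improvement.
errors = [0.3] + [0.4] * 19

full_run = rounds_run(errors, 50)  # patience exceeds the 20 rounds available
short_run = rounds_run(errors, 5)  # patience of 5 stops the run early
```

To see early stopping take effect in this notebook, either raise `num_rounds` well above the patience or lower `early_stopping` below it.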

### 1.3 Train model

In [7]:
proj = client.set_project("Digits classification: Autologging")
expt = client.set_experiment("Boosted Trees")
run = client.set_experiment_run()

In [8]:
from verta.integrations.xgboost import verta_callback
from sklearn.metrics import accuracy_score

bst = xgb.train(
    params, dtrain,
    num_rounds,
    evals=[(dtrain, "train"), (dtest, "eval")],
    early_stopping_rounds=early_stopping,
    verbose_eval=False,
    callbacks=[verta_callback(run)],
)
# accuracy_score expects (y_true, y_pred)
run.log_metric("accuracy", accuracy_score(y_test, bst.predict(dtest)))
# that's it! check your run below
# additional metadata for the run can be stored using regular API calls as documented here: https://verta.readthedocs.io/en/master/_autogen/verta.tracking.entities.ExperimentRun.html#verta.tracking.entities.ExperimentRun
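
If the hyperparameters from section 1.2 should also appear on the run, one option is to log them explicitly. This sketch assumes `ExperimentRun.log_hyperparameters` accepts a dictionary, as described in the reference linked above; `log_training_config` and `RecordingRun` are hypothetical names, and the stub exists only so the helper can be tried without a Verta connection.

```python
def log_training_config(run, params, num_rounds, early_stopping):
    """Record the booster parameters and training settings on a run."""
    hyperparams = dict(params)  # copy so the caller's dict is not modified
    hyperparams["num_rounds"] = num_rounds
    hyperparams["early_stopping_rounds"] = early_stopping
    run.log_hyperparameters(hyperparams)
    return hyperparams


class RecordingRun:
    """Hypothetical stand-in for an ExperimentRun, for offline use only."""

    def __init__(self):
        self.hyperparameters = {}

    def log_hyperparameters(self, hyperparams):
        self.hyperparameters.update(hyperparams)


stub = RecordingRun()
logged = log_training_config(stub, {"eta": 0.08, "max_depth": 6}, 20, 50)
```

In the notebook itself, the call would be `log_training_config(run, params, num_rounds, early_stopping)` with the real `run` object.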

In [9]:
run

---