# Arborium: Multi-Class Classification Example

This notebook demonstrates how to use Arborium to visualize the trees of a multi-class XGBoost classification model.

## Installation

If you're running this notebook in Colab or outside the arborium repository, uncomment and run the following cell to install the package:

In [None]:
# Uncomment if running in Colab or if you haven't installed arborium yet
# !pip install arborium[xgboost]

## Importing Libraries

First, let's import the necessary libraries:

In [None]:
from arborium import XGBTreeVisualizer
import numpy as np
import xgboost as xgb
from sklearn.datasets import load_iris

## Loading and Preparing Data

For multi-class classification, we'll use the classic Iris dataset:

In [None]:
# Load a multi-class dataset
iris = load_iris()
X, y = iris.data, iris.target
feature_names = iris.feature_names
target_names = iris.target_names

# Take a quick look at our data
print(f"Number of samples: {X.shape[0]}")
print(f"Number of features: {X.shape[1]}")
print(f"Feature names: {feature_names}")
print(f"Target classes: {target_names}")

## Training a Multi-Class XGBoost Model

Now, let's train an XGBoost classifier for this multi-class problem:

In [None]:
# Train a multi-class model
model = xgb.XGBClassifier(n_estimators=30, max_depth=3)
model.fit(X, y)

print(f"Model trained with {model.n_estimators} trees of max depth {model.max_depth}")

## Visualizing the Trees with Class Information

For multi-class problems, XGBoost creates separate trees for each class. Arborium can show which trees correspond to which classes:

In [None]:
# Create a visualizer with target names
visualizer = XGBTreeVisualizer(
    model,
    X,
    y,
    feature_names=feature_names,
    target_names=target_names,
)

# Show the trees
visualizer.show_tree()

## Exploring Trees for Specific Classes

You can also view trees associated with specific classes:

In [None]:
# Show a tree for the second class (versicolor)
# In multi-class XGBoost models, trees are organized by class in rounds
visualizer.show_tree(1)  # Tree index 1 should correspond to the second class
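
To list every tree that belongs to one class, the index arithmetic can be sketched in plain Python. This is a minimal sketch that assumes the round-major layout this notebook describes (index = round * num_classes + class_idx, both zero-based); `tree_indices_for_class` is a hypothetical helper written for illustration, not part of Arborium or XGBoost.

```python
def tree_indices_for_class(class_idx, num_classes, num_rounds):
    """Return the tree indices for one class, one per boosting round.

    Hypothetical helper: assumes trees are stored round by round,
    with one tree per class inside each round.
    """
    if not 0 <= class_idx < num_classes:
        raise ValueError("class_idx must be in [0, num_classes)")
    # Start at the class offset in round 0 and step one full round at a time
    return list(range(class_idx, num_rounds * num_classes, num_classes))


# Trees for class 1 (versicolor) in a 3-class model trained for 30 rounds
versicolor_trees = tree_indices_for_class(1, num_classes=3, num_rounds=30)
print(versicolor_trees[:4])
```

Any of the returned indices could then be passed to `visualizer.show_tree(...)` to inspect that class's tree in a given round.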

## Understanding Multi-Class Tree Structure

In multi-class XGBoost models:
- For K classes, each boosting round produces K trees (one per class)
- Trees are indexed as `round * num_classes + class_idx`, where both the round and the class index start at 0
- The class information is displayed in the tree header

In [None]:
# Get the number of classes in our model
num_classes = len(target_names)
print(f"Number of classes: {num_classes}")
print(f"Total number of trees: {model.n_estimators * num_classes}")

# Let's look at a tree from a later round
# Rounds are zero-indexed, so this is class 0 (setosa) in round index 5 (the sixth round)
tree_idx = 5 * num_classes + 0
visualizer.show_tree(tree_idx)
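
Going the other way, a tree index can be decoded back into its round and class. The sketch below assumes the same layout as the indexing rule above (index = round * num_classes + class_idx, zero-based); `decode_tree_index` is a hypothetical helper for illustration, not an Arborium or XGBoost function.

```python
def decode_tree_index(tree_idx, num_classes):
    """Split a flat tree index into (round_idx, class_idx).

    Hypothetical helper: inverts index = round * num_classes + class_idx.
    """
    if tree_idx < 0:
        raise ValueError("tree_idx must be non-negative")
    # divmod returns (quotient, remainder), i.e. (round, class)
    round_idx, class_idx = divmod(tree_idx, num_classes)
    return round_idx, class_idx


# Tree 15 in a 3-class model: round index 5, class 0
round_idx, class_idx = decode_tree_index(15, num_classes=3)
print(f"Tree 15 -> round {round_idx}, class {class_idx}")
```

This makes it easy to check which class a tree header should report before calling `show_tree`.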

## Conclusion

You've now learned how to use Arborium to visualize trees in multi-class XGBoost models. The visualizations help you understand how the model distinguishes between classes and which features matter most for each class.

In the next example, we'll explore how to create simplified tree representations of complex models.