# Arborium: Regression Example

This notebook demonstrates how to use Arborium to visualize trees in regression models.

## Installation

If you're running this notebook in Colab or outside the arborium repository, uncomment and run the following cell to install the package:

In [1]:
# Uncomment if running in Colab or if you haven't installed arborium yet
# !pip install arborium[xgboost]

## Importing Libraries

First, let's import the necessary libraries:

In [1]:
from arborium import XGBTreeVisualizer
import numpy as np
import xgboost as xgb
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

## Loading and Preparing Data

For this regression example, we'll use the California housing dataset:

In [2]:
housing = fetch_california_housing()
X, y = housing.data, housing.target

# Split the data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create DMatrix for XGBoost
dtrain_reg = xgb.DMatrix(X_train, label=y_train)
dtest_reg = xgb.DMatrix(X_test, label=y_test)

## Training a Regression XGBoost Model

Now, let's train an XGBoost regressor for this problem, using the squared-error objective and tracking RMSE as the evaluation metric:

In [3]:
# Set parameters for regression
params_reg = {
    'objective': 'reg:squarederror',
    'max_depth': 4,
    'learning_rate': 0.1,
    'eval_metric': 'rmse'
}

# Train the regression model
num_rounds = 50
reg_model = xgb.train(params_reg, dtrain_reg, num_rounds)

# Evaluate the model
y_pred = reg_model.predict(dtest_reg)
rmse = np.sqrt(mean_squared_error(y_test, y_pred))
print(f"Regression model RMSE: {rmse:.4f}")

Regression model RMSE: 0.5486
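
To make the RMSE figure above concrete, here is a minimal sketch of the same calculation in plain Python. The function name `rmse_manual` and the sample values are illustrative assumptions, not part of Arborium, XGBoost, or scikit-learn.

```python
import math


def rmse_manual(y_true, y_pred):
    """Root mean squared error of two equal-length sequences (hypothetical helper)."""
    if len(y_true) == 0 or len(y_true) != len(y_pred):
        raise ValueError("inputs must be non-empty and the same length")
    # Square each residual, average the squares, then take the square root
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Illustrative values only, not taken from the housing data
example_true = [2.0, 1.5, 3.0, 0.5]
example_pred = [2.5, 1.0, 2.0, 0.5]
print(f"Toy RMSE: {rmse_manual(example_true, example_pred):.4f}")
```

RMSE is in the same units as the target. The California housing target is the median house value in units of $100,000, so an RMSE of 0.5486 corresponds to a typical error of roughly $55,000.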


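## Visualizing the Trees

Finally, pass the trained model, the training data, and the feature names to `XGBTreeVisualizer`, then call `show_tree()` to render the visualization: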


In [4]:
reg_visualizer = XGBTreeVisualizer(reg_model, X_train, y_train, feature_names=housing.feature_names)

reg_visualizer.show_tree()
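
As background for reading a tree plot, it may help to see how a boosted regression ensemble turns individual trees into one prediction: each tree routes a sample to a single leaf, and the leaf values are summed on top of a constant starting score. The sketch below is a toy illustration in plain Python, not Arborium's or XGBoost's implementation. The dictionary layout, the helper names `leaf_value` and `predict`, the split thresholds, the leaf values, and the starting score of 0.5 are all assumptions made for the example.

```python
def leaf_value(node, x):
    """Walk one toy tree from the root to a leaf and return that leaf's value."""
    while "leaf" not in node:
        # Assumed convention: go left when the feature value is below the threshold
        branch = "left" if x[node["feature"]] < node["threshold"] else "right"
        node = node[branch]
    return node["leaf"]


def predict(trees, x, base_score=0.5):
    """Add every tree's leaf value to a constant starting score (assumed 0.5)."""
    return base_score + sum(leaf_value(tree, x) for tree in trees)


# Two hypothetical trees over the features [MedInc, HouseAge]
toy_trees = [
    {"feature": 0, "threshold": 5.0,
     "left": {"leaf": 0.10},
     "right": {"leaf": 0.40}},
    {"feature": 1, "threshold": 20.0,
     "left": {"leaf": -0.05},
     "right": {"feature": 0, "threshold": 3.0,
               "left": {"leaf": 0.02},
               "right": {"leaf": 0.08}}},
]

sample = [6.2, 35.0]  # made-up MedInc and HouseAge values
contributions = [leaf_value(tree, sample) for tree in toy_trees]
print("Per-tree contributions:", contributions)
print("Prediction:", predict(toy_trees, sample))
```

In the trained model above, the same idea applies across all 50 boosting rounds, with each tree limited to a depth of 4.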

## Conclusion

You've now learned how to use Arborium to visualize trees in regression-based XGBoost models. The visualizations help you understand how the model scores each data point before aggregating across all trees in the ensemble.

In the next example, we'll explore how to create simplified tree representations of complex models.