# Analyzing results from hyperparameter tuning
In this example, we'll walk through how to use Ray AIR to run a distributed hyperparameter tuning experiment that searches for good hyperparameters for an XGBoost model.

What we'll cover:
- How to load data from an Sklearn example dataset
- How to initialize an XGBoost trainer
- How to define a search space for regular XGBoost parameters
- How to fetch the best obtained result from the tuning run
- How to fetch a dataframe to do further analysis on the results

We'll use the [Covertype dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_covtype.html#sklearn-datasets-fetch-covtype) provided by sklearn to train a multiclass classifier using XGBoost.

In this dataset, we try to predict the forest cover type (e.g. "lodgepole pine") from cartographic variables, like the distance to the closest road, or the hillshade at different times of the day. The features are binary, discrete and continuous and thus well suited for a decision-tree based classification task.

You can find more information about the dataset [on the dataset homepage](https://archive.ics.uci.edu/ml/datasets/Covertype).

We will train XGBoost models on this dataset. Because model training performance can be influenced by hyperparameter choices, we will generate several different configurations and train them in parallel. Notably, each of these trials will itself start a distributed training job to speed up training. All of this happens automatically within Ray AIR.

First, let's make sure we have all dependencies installed:

In [None]:
%pip install "ray[all]" scikit-learn

Then we can start with some imports.

In [1]:
import pandas as pd
from sklearn.datasets import fetch_covtype

import ray
from ray import tune
from ray.ml import RunConfig
from ray.ml.train.integrations.xgboost import XGBoostTrainer
from ray.tune.tune_config import TuneConfig
from ray.tune.tuner import Tuner

We'll define a utility function to create a Ray Dataset from the Sklearn dataset. We expect the target column to be in the dataframe, so we'll add it to the dataframe manually.

In [3]:
def get_training_data() -> ray.data.Dataset:
    data_raw = fetch_covtype()
    df = pd.DataFrame(data_raw["data"], columns=data_raw["feature_names"])
    df["target"] = data_raw["target"]
    return ray.data.from_pandas(df)


train_dataset = get_training_data()

2022-05-11 12:56:03,661	INFO services.py:1484 -- View the Ray dashboard at http://127.0.0.1:8265


Let's take a look at the schema here:

In [4]:
print(train_dataset)

Dataset(num_blocks=1, num_rows=581012, schema={Elevation: float64, Aspect: float64, Slope: float64, Horizontal_Distance_To_Hydrology: float64, Vertical_Distance_To_Hydrology: float64, Horizontal_Distance_To_Roadways: float64, Hillshade_9am: float64, Hillshade_Noon: float64, Hillshade_3pm: float64, Horizontal_Distance_To_Fire_Points: float64, Wilderness_Area_0: float64, Wilderness_Area_1: float64, Wilderness_Area_2: float64, Wilderness_Area_3: float64, Soil_Type_0: float64, Soil_Type_1: float64, Soil_Type_2: float64, Soil_Type_3: float64, Soil_Type_4: float64, Soil_Type_5: float64, Soil_Type_6: float64, Soil_Type_7: float64, Soil_Type_8: float64, Soil_Type_9: float64, Soil_Type_10: float64, Soil_Type_11: float64, Soil_Type_12: float64, Soil_Type_13: float64, Soil_Type_14: float64, Soil_Type_15: float64, Soil_Type_16: float64, Soil_Type_17: float64, Soil_Type_18: float64, Soil_Type_19: float64, Soil_Type_20: float64, Soil_Type_21: float64, Soil_Type_22: float64, Soil_Type_23: float64, So

Since we'll be training a multiclass prediction model, we have to pass some information to XGBoost. For instance, XGBoost expects us to provide the number of classes and multiclass-enabled evaluation metrics.

For a good overview of commonly used hyperparameters, see [our tutorial in the docs](https://docs.ray.io/en/latest/tune/examples/tune-xgboost.html#xgboost-hyperparameters).

In [5]:
# XGBoost specific params
params = {
    "tree_method": "approx",
    "objective": "multi:softmax",
    "eval_metric": ["mlogloss", "merror"],
    # Covertype labels run from 1 to 7; XGBoost expects labels in [0, num_class)
    "num_class": 8,
    "min_child_weight": 2,
}
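To make the two evaluation metrics concrete, here is a plain Python sketch of what they measure: `mlogloss` is the mean negative log-probability assigned to the true class, and `merror` is the fraction of misclassified samples. The helper functions and the sample predictions are illustrative assumptions, not XGBoost's own implementation.

```python
import math


def mlogloss(probs, labels):
    """Mean negative log-probability assigned to the true class."""
    return -sum(math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def merror(probs, labels):
    """Fraction of samples whose most probable class is not the true class."""
    wrong = 0
    for p, y in zip(probs, labels):
        predicted = max(range(len(p)), key=p.__getitem__)
        if predicted != y:
            wrong += 1
    return wrong / len(labels)


# Hypothetical predicted class probabilities for three samples and three classes.
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.5, 0.3, 0.2],
]
labels = [0, 1, 2]

print("mlogloss:", mlogloss(probs, labels))
print("merror:", merror(probs, labels))
```

Lower is better for both metrics, which is why the tuning run later uses `mode="min"`.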

With these parameters in place, we'll create a Ray AIR `XGBoostTrainer`.

Note a few things here. First, we pass in a `scaling_config` to configure the distributed training behavior of each individual XGBoost training job. Here, we want to distribute training across 2 workers.

The `label_column` specifies which column in the dataset contains the target values. `params` are the XGBoost training params defined above - we can tune these later! The `datasets` dict contains the dataset we would like to train on. Lastly, we pass the number of boosting rounds to XGBoost.

In [6]:
trainer = XGBoostTrainer(
    scaling_config={"num_workers": 2},
    label_column="target",
    params=params,
    datasets={"train": train_dataset},
    num_boost_round=10,
)

We can now create the Tuner with a search space to override some of the default parameters in the XGBoost trainer.

Here, we just want to tune the XGBoost `max_depth` and `min_child_weight` parameters. Note that we explicitly set `min_child_weight=2` in the default XGBoost trainer - this value will be overwritten during tuning.

We configure Tune to minimize the `train-mlogloss` metric. In random search, this doesn't affect the evaluated configurations, but it will affect our default results fetching for analysis later.

In [7]:
tuner = Tuner(
    trainer,
    param_space={
        "params": {
            "max_depth": tune.randint(2, 8), 
            "min_child_weight": tune.randint(1, 10), 
        },
    },
    tune_config=TuneConfig(num_samples=4, metric="train-mlogloss", mode="min"),
)
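The sampling behind this search space can be sketched in plain Python. `tune.randint(lower, upper)` draws integers with an exclusive upper bound, which `random.randrange` mirrors. The dict merge below is a simplified assumption about how sampled values override the trainer defaults; it is not Ray Tune's actual code path, and `sample_config` is a hypothetical helper.

```python
import random

# Defaults passed to the trainer; `min_child_weight` is overridden by the search space.
default_params = {
    "tree_method": "approx",
    "objective": "multi:softmax",
    "eval_metric": ["mlogloss", "merror"],
    "num_class": 8,
    "min_child_weight": 2,
}


def sample_config(rng):
    """Draw one configuration; the upper bounds are exclusive, as in `tune.randint`."""
    return {
        "max_depth": rng.randrange(2, 8),
        "min_child_weight": rng.randrange(1, 10),
    }


rng = random.Random(0)  # fixed seed so the sketch is reproducible
num_samples = 4

# Each trial gets the defaults with the sampled values layered on top.
trial_params = [{**default_params, **sample_config(rng)} for _ in range(num_samples)]

for params in trial_params:
    print(params["max_depth"], params["min_child_weight"])
```

Because the draws are independent of any observed metric, the `metric` and `mode` settings do not change which configurations are tried in random search.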

Let's run the tuning. This will take a few minutes to complete.

In [8]:
results = tuner.fit()

| Trial name | status | loc | params/max_depth | params/min_child_weight | iter | total time (s) | train-mlogloss | train-merror |
|---|---|---|---|---|---|---|---|---|
| XGBoostTrainer_ffee6_00000 | TERMINATED | 127.0.0.1:73693 | 2 | 2 | 10 | 70.3546 | 0.79498 | 0.311035 |
| XGBoostTrainer_ffee6_00001 | TERMINATED | 127.0.0.1:73701 | 2 | 6 | 10 | 66.4745 | 0.794983 | 0.311035 |
| XGBoostTrainer_ffee6_00002 | TERMINATED | 127.0.0.1:73702 | 2 | 3 | 10 | 60.9231 | 0.79498 | 0.311035 |
| XGBoostTrainer_ffee6_00003 | TERMINATED | 127.0.0.1:73703 | 4 | 3 | 10 | 81.6666 | 0.698445 | 0.262468 |


[2m[36m(GBDTTrainable pid=73702)[0m 2022-05-11 12:56:34,914	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=73693)[0m 2022-05-11 12:56:36,564	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=73703)[0m 2022-05-11 12:56:38,559	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=73701)[0m 2022-05-11 12:56:39,235	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=73702)[0m 2022-05-11 12:56:40,561	INFO main.py:1029 -- [RayXGBoost] Starting XGBoost training.
[2m[36m(_RemoteRayXGBoostActor pid=73737)[0m [12:56:40] task [xgboost.ray]:4599733072 got new rank 0
[2m[36m(_RemoteRayXGBoostActor pid=73738)[0m [12:

Trial XGBoostTrainer_ffee6_00002 reported train-mlogloss=1.57 with parameters={'params': {'max_depth': 2, 'min_child_weight': 3}}.


[2m[36m(raylet)[0m Spilled 2170 MiB, 42 objects, write throughput 387 MiB/s. Set RAY_verbose_spill_logs=0 to disable this message.
[2m[36m(GBDTTrainable pid=73703)[0m 2022-05-11 12:56:50,521	INFO main.py:1029 -- [RayXGBoost] Starting XGBoost training.
[2m[36m(_RemoteRayXGBoostActor pid=73776)[0m [12:56:50] task [xgboost.ray]:4557005776 got new rank 0
[2m[36m(_RemoteRayXGBoostActor pid=73777)[0m [12:56:50] task [xgboost.ray]:4561024272 got new rank 1


Trial XGBoostTrainer_ffee6_00002 reported train-mlogloss=1.33 with parameters={'params': {'max_depth': 2, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00001 reported train-mlogloss=1.57 with parameters={'params': {'max_depth': 2, 'min_child_weight': 6}}.
Trial XGBoostTrainer_ffee6_00000 reported train-mlogloss=1.57 with parameters={'params': {'max_depth': 2, 'min_child_weight': 2}}.
Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=1.52 with parameters={'params': {'max_depth': 4, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00002 reported train-mlogloss=1.07 with parameters={'params': {'max_depth': 2, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00000 reported train-mlogloss=1.18 with parameters={'params': {'max_depth': 2, 'min_child_weight': 2}}.
Trial XGBoostTrainer_ffee6_00001 reported train-mlogloss=1.18 with parameters={'params': {'max_depth': 2, 'min_child_weight': 6}}.
Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=1.27 with parameters={'par

[2m[36m(GBDTTrainable pid=73702)[0m 2022-05-11 12:57:12,434	INFO main.py:1113 -- Training in progress (32 seconds since last restart).


Trial XGBoostTrainer_ffee6_00002 reported train-mlogloss=0.99 with parameters={'params': {'max_depth': 2, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00000 reported train-mlogloss=1.07 with parameters={'params': {'max_depth': 2, 'min_child_weight': 2}}.
Trial XGBoostTrainer_ffee6_00001 reported train-mlogloss=1.07 with parameters={'params': {'max_depth': 2, 'min_child_weight': 6}}.
Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=1.10 with parameters={'params': {'max_depth': 4, 'min_child_weight': 3}}.


[2m[36m(GBDTTrainable pid=73693)[0m 2022-05-11 12:57:19,586	INFO main.py:1113 -- Training in progress (31 seconds since last restart).
[2m[36m(GBDTTrainable pid=73701)[0m 2022-05-11 12:57:19,605	INFO main.py:1113 -- Training in progress (31 seconds since last restart).


Trial XGBoostTrainer_ffee6_00002 reported train-mlogloss=0.88 with parameters={'params': {'max_depth': 2, 'min_child_weight': 3}}.


[2m[36m(GBDTTrainable pid=73703)[0m 2022-05-11 12:57:21,552	INFO main.py:1113 -- Training in progress (31 seconds since last restart).


Trial XGBoostTrainer_ffee6_00000 reported train-mlogloss=0.93 with parameters={'params': {'max_depth': 2, 'min_child_weight': 2}}.
Trial XGBoostTrainer_ffee6_00001 reported train-mlogloss=0.93 with parameters={'params': {'max_depth': 2, 'min_child_weight': 6}}.
Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=0.99 with parameters={'params': {'max_depth': 4, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00002 reported train-mlogloss=0.85 with parameters={'params': {'max_depth': 2, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=0.90 with parameters={'params': {'max_depth': 4, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00001 reported train-mlogloss=0.85 with parameters={'params': {'max_depth': 2, 'min_child_weight': 6}}.
Trial XGBoostTrainer_ffee6_00000 reported train-mlogloss=0.85 with parameters={'params': {'max_depth': 2, 'min_child_weight': 2}}.
Trial XGBoostTrainer_ffee6_00002 reported train-mlogloss=0.79 with parameters={'par

[2m[36m(GBDTTrainable pid=73702)[0m 2022-05-11 12:57:34,888	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 60.11 seconds (54.31 pure XGBoost training time).


Trial XGBoostTrainer_ffee6_00002 completed. Last result: train-mlogloss=0.79498,train-merror=0.311035,should_checkpoint=True
Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=0.84 with parameters={'params': {'max_depth': 4, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00000 reported train-mlogloss=0.79 with parameters={'params': {'max_depth': 2, 'min_child_weight': 2}}.
Trial XGBoostTrainer_ffee6_00001 reported train-mlogloss=0.79 with parameters={'params': {'max_depth': 2, 'min_child_weight': 6}}.


[2m[36m(GBDTTrainable pid=73693)[0m 2022-05-11 12:57:40,484	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 64.06 seconds (51.91 pure XGBoost training time).
[2m[36m(GBDTTrainable pid=73701)[0m 2022-05-11 12:57:40,502	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 61.44 seconds (51.93 pure XGBoost training time).


Trial XGBoostTrainer_ffee6_00000 completed. Last result: train-mlogloss=0.79498,train-merror=0.311035,should_checkpoint=True
Trial XGBoostTrainer_ffee6_00001 completed. Last result: train-mlogloss=0.794983,train-merror=0.311035,should_checkpoint=True
Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=0.79 with parameters={'params': {'max_depth': 4, 'min_child_weight': 3}}.
Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=0.72 with parameters={'params': {'max_depth': 4, 'min_child_weight': 3}}.


[2m[36m(GBDTTrainable pid=73703)[0m 2022-05-11 12:57:52,067	INFO main.py:1113 -- Training in progress (62 seconds since last restart).
[2m[36m(GBDTTrainable pid=73703)[0m 2022-05-11 12:57:55,563	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 77.27 seconds (65.02 pure XGBoost training time).


Trial XGBoostTrainer_ffee6_00003 reported train-mlogloss=0.70 with parameters={'params': {'max_depth': 4, 'min_child_weight': 3}}. This trial completed.


2022-05-11 12:57:55,963	INFO tune.py:753 -- Total run time: 91.48 seconds (90.03 seconds for the tuning loop).


Now that we have the results, we can analyze them. For instance, we can fetch the best observed result according to the configured `metric` and `mode` and print it:

In [9]:
# This will fetch the best result according to the `metric` and `mode` specified
# in the `TuneConfig` above:

best_result = results.get_best_result()

print("Best result error rate", best_result.metrics["train-merror"])

Best result error rate 0.262468
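The selection rule is simple enough to sketch in plain Python. The final metrics below are copied from the trial table above; `get_best` is a hypothetical helper that mimics choosing a result by `metric` and `mode`, not the Ray implementation.

```python
# Final metrics per trial, copied from the trial table above.
final_results = {
    "XGBoostTrainer_ffee6_00000": {"train-mlogloss": 0.79498, "train-merror": 0.311035},
    "XGBoostTrainer_ffee6_00001": {"train-mlogloss": 0.794983, "train-merror": 0.311035},
    "XGBoostTrainer_ffee6_00002": {"train-mlogloss": 0.79498, "train-merror": 0.311035},
    "XGBoostTrainer_ffee6_00003": {"train-mlogloss": 0.698445, "train-merror": 0.262468},
}


def get_best(results, metric, mode):
    """Return the (trial_name, metrics) pair with the best value of `metric`."""
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    pick = min if mode == "min" else max
    return pick(results.items(), key=lambda item: item[1][metric])


best_trial, best_metrics = get_best(final_results, "train-mlogloss", "min")
print(best_trial, best_metrics["train-merror"])
```

The trial with the lowest `train-mlogloss` is the one with `max_depth=4`, and its `train-merror` matches the error rate printed above.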


For more sophisticated analysis, we can get a pandas dataframe with all trial results:

In [10]:
df = results.get_dataframe()
print(df.columns)

Index(['train-mlogloss', 'train-merror', 'time_this_iter_s',
       'should_checkpoint', 'done', 'timesteps_total', 'episodes_total',
       'training_iteration', 'trial_id', 'experiment_id', 'date', 'timestamp',
       'time_total_s', 'pid', 'hostname', 'node_ip', 'time_since_restore',
       'timesteps_since_restore', 'iterations_since_restore', 'warmup_time',
       'config/params', 'logdir'],
      dtype='object')


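As an example, let's group the results by hyperparameter configuration and fetch the minimum values obtained per group. The `config/params` column holds dicts, which are unhashable, so we group by their string representation:

In [12]:
# Dicts are unhashable, so use their string form as the group key.
groups = df.groupby(df["config/params"].astype(str))
# Restrict to numeric columns; object columns such as dicts cannot be compared.
mins = groups.min(numeric_only=True)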
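Because each entry of `config/params` is a dict, it can also help to group by a single hyperparameter value, which is hashable. A minimal plain Python sketch, using the parameters and final `train-mlogloss` values from the trial table above (the variable names are illustrative):

```python
from collections import defaultdict

# (params, final train-mlogloss) per trial, copied from the trial table above.
trials = [
    ({"max_depth": 2, "min_child_weight": 2}, 0.79498),
    ({"max_depth": 2, "min_child_weight": 6}, 0.794983),
    ({"max_depth": 2, "min_child_weight": 3}, 0.79498),
    ({"max_depth": 4, "min_child_weight": 3}, 0.698445),
]

# Group by one hashable hyperparameter value rather than by the whole dict.
by_depth = defaultdict(list)
for params, loss in trials:
    by_depth[params["max_depth"]].append(loss)

# Minimum loss observed for each tree depth.
min_loss_by_depth = {depth: min(losses) for depth, losses in by_depth.items()}
print(min_loss_by_depth)
```

In this run, the three trials with `max_depth=2` end at nearly identical losses regardless of `min_child_weight`, while the deeper trees reach a lower loss.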

By default, `results.get_dataframe()` returns the last reported results per trial. If you want to obtain the best _ever_ observed results, you can pass the `filter_metric` and `filter_mode` arguments to `results.get_dataframe()`. In our example, we'll filter the minimum _ever_ observed `train-merror` for each trial:

In [11]:
df_min_error = results.get_dataframe(filter_metric="train-merror", filter_mode="min")
df_min_error["train-merror"]

0    0.197125
1    0.279679
2    0.262468
3    0.197125
4    0.310307
5    0.310307
6    0.310307
7    0.220687
Name: train-merror, dtype: float64
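The difference between the last reported and the best ever observed value can be illustrated with a small plain Python sketch. The per-iteration histories below are made up for illustration and are not taken from the run above.

```python
# Hypothetical per-iteration `train-merror` values; not taken from the run above.
history = {
    "trial_a": [0.40, 0.31, 0.28, 0.29],  # gets slightly worse in the final iteration
    "trial_b": [0.42, 0.35, 0.33, 0.30],  # improves until the end
}

# Last reported value per trial (the default view).
last_reported = {trial: values[-1] for trial, values in history.items()}

# Best value ever observed per trial (what filtering by metric with mode "min" selects).
best_ever = {trial: min(values) for trial, values in history.items()}

for trial in history:
    print(trial, last_reported[trial], best_ever[trial])
```

The two views only differ for trials whose metric got worse after its best iteration, as with the hypothetical `trial_a` here.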

And that's how you analyze your hyperparameter tuning results. If you would like access to more analytics, please feel free to file a feature request, e.g. [as a GitHub issue](https://github.com/ray-project/ray/issues) or on our [Discuss platform](https://discuss.ray.io/)!