# Analyzing results from hyperparameter tuning
In this example, we will go through how you can use Ray AIR to run a distributed hyperparameter experiment to find optimal hyperparameters for an XGBoost model.

What we'll cover:
- How to load data from an Sklearn example dataset
- How to initialize an XGBoost trainer
- How to define a search space for regular XGBoost parameters and for data preprocessors
- How to fetch the best obtained result from the tuning run
- How to fetch a dataframe to do further analysis on the results

We'll use the [Covertype dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_covtype.html#sklearn-datasets-fetch-covtype) provided by sklearn to train a multiclass classification model using XGBoost.

In this dataset, we try to predict the forest cover type (e.g. "lodgepole pine") from cartographic variables, like the distance to the closest road, or the hillshade at different times of the day. The features are binary, discrete and continuous and thus well suited for a decision-tree based classification task.

You can find more information about the dataset [on the dataset homepage](https://archive.ics.uci.edu/ml/datasets/Covertype).

We will train XGBoost models on this dataset. Because model training performance can be influenced by hyperparameter choices, we will generate several different configurations and train them in parallel. Notably, each of these trials will itself start a distributed training job to speed up training. All of this happens automatically within Ray AIR.

First, let's make sure we have all dependencies installed:

In [1]:
%pip install "ray[all]" scikit-learn



Note: you may need to restart the kernel to use updated packages.


Then we can start with some imports.

In [2]:
import pandas as pd
from sklearn.datasets import fetch_covtype

import ray
from ray import tune
from ray.ml import RunConfig
from ray.ml.train.integrations.xgboost import XGBoostTrainer
from ray.tune.tune_config import TuneConfig
from ray.tune.tuner import Tuner

We'll define a utility function to create a Ray Dataset from the Sklearn dataset. We expect the target column to be in the dataframe, so we'll add it to the dataframe manually.

In [3]:
def get_training_data() -> ray.data.Dataset:
    data_raw = fetch_covtype()
    df = pd.DataFrame(data_raw["data"], columns=data_raw["feature_names"])
    df["target"] = data_raw["target"]
    return ray.data.from_pandas(df)


train_dataset = get_training_data()

2022-05-11 13:38:00,419 INFO services.py:1484 -- View the Ray dashboard at http://127.0.0.1:8265


Let's take a look at the schema here:

In [4]:
print(train_dataset)

Dataset(num_blocks=1, num_rows=581012, schema={Elevation: float64, Aspect: float64, Slope: float64, Horizontal_Distance_To_Hydrology: float64, Vertical_Distance_To_Hydrology: float64, Horizontal_Distance_To_Roadways: float64, Hillshade_9am: float64, Hillshade_Noon: float64, Hillshade_3pm: float64, Horizontal_Distance_To_Fire_Points: float64, Wilderness_Area_0: float64, Wilderness_Area_1: float64, Wilderness_Area_2: float64, Wilderness_Area_3: float64, Soil_Type_0: float64, Soil_Type_1: float64, Soil_Type_2: float64, Soil_Type_3: float64, Soil_Type_4: float64, Soil_Type_5: float64, Soil_Type_6: float64, Soil_Type_7: float64, Soil_Type_8: float64, Soil_Type_9: float64, Soil_Type_10: float64, Soil_Type_11: float64, Soil_Type_12: float64, Soil_Type_13: float64, Soil_Type_14: float64, Soil_Type_15: float64, Soil_Type_16: float64, Soil_Type_17: float64, Soil_Type_18: float64, Soil_Type_19: float64, Soil_Type_20: float64, Soil_Type_21: float64, Soil_Type_22: float64, Soil_Type_23: float64, So

Since we'll be training a multiclass prediction model, we have to pass some information to XGBoost. For instance, XGBoost expects us to provide the number of classes and multiclass-enabled evaluation metrics.

For a good overview of commonly used hyperparameters, see [our tutorial in the docs](https://docs.ray.io/en/latest/tune/examples/tune-xgboost.html#xgboost-hyperparameters).

In [5]:
# XGBoost specific params
params = {
    "tree_method": "approx",
    "objective": "multi:softmax",
    "eval_metric": ["mlogloss", "merror"],
    "num_class": 8,
    "min_child_weight": 2
}
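To make the two evaluation metrics concrete, here is a minimal pure-Python sketch of how multiclass log loss (`mlogloss`) and multiclass error rate (`merror`) are conventionally defined. The helper functions and the toy probabilities are made up for illustration; XGBoost computes these metrics internally, and this sketch is not its implementation.

```python
import math


def mlogloss(y_true, probs, eps=1e-15):
    """Mean negative log-probability assigned to the true class."""
    total = 0.0
    for label, row in zip(y_true, probs):
        # Clip to avoid log(0) for overconfident wrong predictions.
        p = min(max(row[label], eps), 1 - eps)
        total += -math.log(p)
    return total / len(y_true)


def merror(y_true, probs):
    """Fraction of samples whose most probable class is not the true class."""
    wrong = 0
    for label, row in zip(y_true, probs):
        predicted = max(range(len(row)), key=row.__getitem__)
        if predicted != label:
            wrong += 1
    return wrong / len(y_true)


# Hypothetical toy data: 4 samples, 3 classes (labels are 0-based indices).
y_true = [0, 2, 1, 2]
probs = [
    [0.7, 0.2, 0.1],
    [0.1, 0.3, 0.6],
    [0.5, 0.4, 0.1],  # misclassified: true class 1, predicted class 0
    [0.2, 0.2, 0.6],
]

print("mlogloss:", round(mlogloss(y_true, probs), 4))
print("merror:", merror(y_true, probs))
```

Both metrics are "lower is better", which is why the tuning run later uses `mode="min"`.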

With these parameters in place, we'll create a Ray AIR `XGBoostTrainer`.

Note a few things here. First, we pass in a `scaling_config` to configure the distributed training behavior of each individual XGBoost training job. Here, we want to distribute training across 2 workers.

The `label_column` specifies which column in the dataset contains the target values. `params` are the XGBoost training params defined above - we can tune these later! The `datasets` dict contains the dataset we would like to train on. Lastly, we pass the number of boosting rounds to XGBoost.

In [6]:
trainer = XGBoostTrainer(
    scaling_config={"num_workers": 2},
    label_column="target",
    params=params,
    datasets={"train": train_dataset},
    num_boost_round=10,
)

We can now create the Tuner with a search space to override some of the default parameters in the XGBoost trainer.

Here, we just want to tune the XGBoost `max_depth` and `min_child_weight` parameters. Note that we explicitly set `min_child_weight=2` in the default XGBoost trainer - this value will be overwritten during tuning.

We configure Tune to minimize the `train-mlogloss` metric. In random search, this doesn't affect the evaluated configurations, but it will affect our default results fetching for analysis later.

In [7]:
tuner = Tuner(
    trainer,
    param_space={
        "params": {
            "max_depth": tune.randint(2, 8), 
            "min_child_weight": tune.randint(1, 10), 
        },
    },
    tune_config=TuneConfig(num_samples=8, metric="train-mlogloss", mode="min"),
)
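To build intuition for what this search space produces, the following is a rough pure-Python sketch of random search over the two tuned parameters. It assumes that `tune.randint(lower, upper)` samples uniformly with an inclusive lower and exclusive upper bound, and it approximates the override behavior with a plain dict merge. The `sample_overrides` helper is hypothetical and is not how Ray Tune is implemented.

```python
import random

# Default XGBoost params from the trainer (copied from the cell above).
base_params = {
    "tree_method": "approx",
    "objective": "multi:softmax",
    "eval_metric": ["mlogloss", "merror"],
    "num_class": 8,
    "min_child_weight": 2,
}


def sample_overrides(rng):
    """Draw one configuration; randrange(a, b) excludes b."""
    return {
        "max_depth": rng.randrange(2, 8),          # values 2..7
        "min_child_weight": rng.randrange(1, 10),  # values 1..9
    }


rng = random.Random(0)  # seeded only to make the sketch reproducible

trial_params = []
for _ in range(8):  # corresponds to num_samples=8
    overrides = sample_overrides(rng)
    # Sampled values win over the defaults; untouched keys are kept.
    trial_params.append({**base_params, **overrides})

for p in trial_params:
    print("max_depth:", p["max_depth"], "min_child_weight:", p["min_child_weight"])
```

Each of the eight resulting dicts corresponds to one trial. The default `min_child_weight` of 2 only survives if the sampler happens to draw it.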

Let's run the tuning. This will take a few minutes to complete.

In [8]:
results = tuner.fit()

| Trial name | status | loc | params/max_depth | params/min_child_... | iter | total time (s) | train-mlogloss | train-merror |
|---|---|---|---|---|---|---|---|---|
| XGBoostTrainer_d21c1_00000 | TERMINATED | 127.0.0.1:80007 | 7 | 2 | 10 | 334.061 | 0.557201 | 0.195385 |
| XGBoostTrainer_d21c1_00001 | TERMINATED | 127.0.0.1:80013 | 7 | 6 | 10 | 341.875 | 0.562744 | 0.199397 |
| XGBoostTrainer_d21c1_00002 | TERMINATED | 127.0.0.1:80014 | 6 | 9 | 10 | 310.46 | 0.608054 | 0.220109 |
| XGBoostTrainer_d21c1_00003 | TERMINATED | 127.0.0.1:80015 | 2 | 1 | 10 | 92.1437 | 0.79498 | 0.311035 |
| XGBoostTrainer_d21c1_00004 | TERMINATED | 127.0.0.1:80016 | 7 | 7 | 10 | 342.625 | 0.559456 | 0.197772 |
| XGBoostTrainer_d21c1_00005 | TERMINATED | 127.0.0.1:80182 | 6 | 8 | 10 | 283.301 | 0.607898 | 0.219964 |
| XGBoostTrainer_d21c1_00006 | TERMINATED | 127.0.0.1:80298 | 6 | 4 | 10 | 88.7628 | 0.607927 | 0.220801 |
| XGBoostTrainer_d21c1_00007 | TERMINATED | 127.0.0.1:80313 | 4 | 1 | 10 | 69.4095 | 0.69843 | 0.262468 |


[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:38:12,105	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=80016)[0m 2022-05-11 13:38:13,526	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=80014)[0m 2022-05-11 13:38:15,073	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=80015)[0m 2022-05-11 13:38:15,149	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=80013)[0m 2022-05-11 13:38:15,322	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:38:18,803	INFO main.py:1029 -- [RayXGBoost] Sta

Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=1.45 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.
Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=1.57 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=1.47 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.
Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=1.33 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.


[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:38:49,547	INFO main.py:1113 -- Training in progress (31 seconds since last restart).


Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=1.45 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.
Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=1.45 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.
Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=1.17 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.
Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=1.18 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.


[2m[36m(GBDTTrainable pid=80014)[0m 2022-05-11 13:38:56,357	INFO main.py:1113 -- Training in progress (31 seconds since last restart).
[2m[36m(GBDTTrainable pid=80016)[0m 2022-05-11 13:38:59,693	INFO main.py:1113 -- Training in progress (32 seconds since last restart).
[2m[36m(GBDTTrainable pid=80015)[0m 2022-05-11 13:38:59,665	INFO main.py:1113 -- Training in progress (32 seconds since last restart).
[2m[36m(GBDTTrainable pid=80013)[0m 2022-05-11 13:38:59,699	INFO main.py:1113 -- Training in progress (32 seconds since last restart).


Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=1.07 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=1.20 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.
Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=1.17 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.
Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=1.17 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.
Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=0.99 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=0.99 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.
Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=0.93 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=1.03 with parameters={'par

[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:39:20,050	INFO main.py:1113 -- Training in progress (61 seconds since last restart).


Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=0.88 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=0.99 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.
Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=0.99 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.
Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=0.85 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.


[2m[36m(GBDTTrainable pid=80016)[0m 2022-05-11 13:39:31,856	INFO main.py:1113 -- Training in progress (65 seconds since last restart).
[2m[36m(GBDTTrainable pid=80013)[0m 2022-05-11 13:39:31,839	INFO main.py:1113 -- Training in progress (65 seconds since last restart).


Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=0.87 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.
Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=0.91 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.


[2m[36m(GBDTTrainable pid=80015)[0m 2022-05-11 13:39:33,715	INFO main.py:1113 -- Training in progress (67 seconds since last restart).
[2m[36m(GBDTTrainable pid=80014)[0m 2022-05-11 13:39:33,715	INFO main.py:1113 -- Training in progress (68 seconds since last restart).


Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=0.82 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=0.87 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.
Trial XGBoostTrainer_d21c1_00003 reported train-mlogloss=0.79 with parameters={'params': {'max_depth': 2, 'min_child_weight': 1}}.


[2m[36m(GBDTTrainable pid=80015)[0m 2022-05-11 13:39:44,366	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 89.39 seconds (77.19 pure XGBoost training time).


Trial XGBoostTrainer_d21c1_00003 completed. Last result: train-mlogloss=0.79498,train-merror=0.311035,should_checkpoint=True
Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=0.87 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.
Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=0.82 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.
Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=0.78 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.


[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:39:50,943	INFO main.py:1113 -- Training in progress (92 seconds since last restart).


Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=0.78 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.
Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=0.76 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.
Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=0.71 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.
Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=0.78 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.


[2m[36m(GBDTTrainable pid=80013)[0m 2022-05-11 13:40:02,353	INFO main.py:1113 -- Training in progress (95 seconds since last restart).
[2m[36m(GBDTTrainable pid=80016)[0m 2022-05-11 13:40:02,407	INFO main.py:1113 -- Training in progress (95 seconds since last restart).
[2m[36m(GBDTTrainable pid=80182)[0m 2022-05-11 13:40:03,754	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(raylet)[0m Spilled 4223 MiB, 92 objects, write throughput 307 MiB/s.
[2m[36m(GBDTTrainable pid=80014)[0m 2022-05-11 13:40:04,436	INFO main.py:1113 -- Training in progress (99 seconds since last restart).
[2m[36m(GBDTTrainable pid=80182)[0m 2022-05-11 13:40:15,336	INFO main.py:1029 -- [RayXGBoost] Starting XGBoost training.
[2m[36m(_RemoteRayXGBoostActor pid=80202)[0m [13:40:15] task [xgboost.ray]:4618403664 got new rank 0
[2m[36m(_RemoteRayXGBoostActor pid=80203)[0m [13:40:15] task [xgboost.ray]:4561953808 got new r

[2m[1m[36m(scheduler +2m19s)[0m Tip: use `ray status` to view detailed cluster status. To disable these messages, set RAY_SCHEDULER_EVENTS=0.
Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=0.70 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.
Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=0.71 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.
Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=0.66 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.
Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=0.71 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.


[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:40:21,910	INFO main.py:1113 -- Training in progress (123 seconds since last restart).


Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=0.67 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.


[2m[36m(GBDTTrainable pid=80016)[0m 2022-05-11 13:40:32,675	INFO main.py:1113 -- Training in progress (125 seconds since last restart).
[2m[36m(GBDTTrainable pid=80013)[0m 2022-05-11 13:40:32,695	INFO main.py:1113 -- Training in progress (125 seconds since last restart).


Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=1.47 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.


[2m[36m(GBDTTrainable pid=80014)[0m 2022-05-11 13:40:34,879	INFO main.py:1113 -- Training in progress (129 seconds since last restart).


Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=0.66 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.
Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=0.66 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.
Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=0.62 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.


[2m[36m(GBDTTrainable pid=80182)[0m 2022-05-11 13:40:45,568	INFO main.py:1113 -- Training in progress (30 seconds since last restart).


Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=0.63 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.
Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=1.20 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.


[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:40:51,987	INFO main.py:1113 -- Training in progress (153 seconds since last restart).


Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=0.62 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.


[2m[36m(GBDTTrainable pid=80014)[0m 2022-05-11 13:41:20,516	INFO main.py:1113 -- Training in progress (175 seconds since last restart).
[2m[36m(GBDTTrainable pid=80182)[0m 2022-05-11 13:41:20,484	INFO main.py:1113 -- Training in progress (65 seconds since last restart).
[2m[36m(GBDTTrainable pid=80013)[0m 2022-05-11 13:41:21,391	INFO main.py:1113 -- Training in progress (174 seconds since last restart).



Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=0.58 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.
Trial XGBoostTrainer_d21c1_00002 reported train-mlogloss=0.61 with parameters={'params': {'max_depth': 6, 'min_child_weight': 9}}.
Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=0.62 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.
Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=1.03 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.


[2m[36m(GBDTTrainable pid=80016)[0m 2022-05-11 13:43:25,058	INFO main.py:1113 -- Training in progress (298 seconds since last restart).
[2m[36m(GBDTTrainable pid=80014)[0m 2022-05-11 13:43:25,002	INFO main.py:1113 -- Training in progress (299 seconds since last restart).
[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:43:25,138	INFO main.py:1113 -- Training in progress (306 seconds since last restart).
[2m[36m(GBDTTrainable pid=80182)[0m 2022-05-11 13:43:25,195	INFO main.py:1113 -- Training in progress (190 seconds since last restart).
[2m[36m(GBDTTrainable pid=80014)[0m 2022-05-11 13:43:25,306	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 310.45 seconds (299.72 pure XGBoost training time).
[2m[36m(_RemoteRayXGBoostActor pid=80085)[0m E0511 13:43:25.420395000 123145388179456 chttp2_transport.cc:1103]     Received a GOAWAY with error code ENHANCE_YOUR_CALM and debug data equal to "too_many_pings"
[2m[36m(_Remot

Trial XGBoostTrainer_d21c1_00002 completed. Last result: train-mlogloss=0.608054,train-merror=0.220109,should_checkpoint=True
Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=0.59 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}.
Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=0.59 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}.
Trial XGBoostTrainer_d21c1_00000 reported train-mlogloss=0.56 with parameters={'params': {'max_depth': 7, 'min_child_weight': 2}}.
Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=0.91 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.
Trial XGBoostTrainer_d21c1_00000 completed. Last result: train-mlogloss=0.557201,train-merror=0.195385,should_checkpoint=True


[2m[36m(GBDTTrainable pid=80007)[0m 2022-05-11 13:43:50,273	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 338.30 seconds (331.46 pure XGBoost training time).
[2m[36m(_RemoteRayXGBoostActor pid=80040)[0m E0511 13:43:50.286508000 123145418387456 chttp2_transport.cc:1103]     Received a GOAWAY with error code ENHANCE_YOUR_CALM and debug data equal to "too_many_pings"
[2m[36m(_RemoteRayXGBoostActor pid=80041)[0m E0511 13:43:50.288599000 123145354575872 chttp2_transport.cc:1103]     Received a GOAWAY with error code ENHANCE_YOUR_CALM and debug data equal to "too_many_pings"
[2m[36m(GBDTTrainable pid=80013)[0m 2022-05-11 13:43:53,241	INFO main.py:1113 -- Training in progress (326 seconds since last restart).


Trial XGBoostTrainer_d21c1_00001 reported train-mlogloss=0.56 with parameters={'params': {'max_depth': 7, 'min_child_weight': 6}}. This trial completed.


[2m[36m(GBDTTrainable pid=80013)[0m 2022-05-11 13:43:54,124	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 338.99 seconds (326.94 pure XGBoost training time).
[2m[36m(_RemoteRayXGBoostActor pid=80108)[0m E0511 13:43:54.143030000 123145546432512 chttp2_transport.cc:1103]     Received a GOAWAY with error code ENHANCE_YOUR_CALM and debug data equal to "too_many_pings"
[2m[36m(_RemoteRayXGBoostActor pid=80109)[0m E0511 13:43:54.140347000 4391695872 chttp2_transport.cc:1103]          Received a GOAWAY with error code ENHANCE_YOUR_CALM and debug data equal to "too_many_pings"
[2m[36m(GBDTTrainable pid=80016)[0m 2022-05-11 13:43:54,867	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 341.49 seconds (327.67 pure XGBoost training time).
[2m[36m(_RemoteRayXGBoostActor pid=80060)[0m E0511 13:43:54.879104000 123145432559616 chttp2_transport.cc:1103]     Received a GOAWAY with er

Trial XGBoostTrainer_d21c1_00004 reported train-mlogloss=0.56 with parameters={'params': {'max_depth': 7, 'min_child_weight': 7}}. This trial completed.


[2m[36m(GBDTTrainable pid=80182)[0m 2022-05-11 13:43:55,319	INFO main.py:1113 -- Training in progress (220 seconds since last restart).
[2m[36m(GBDTTrainable pid=80298)[0m 2022-05-11 13:43:59,149	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.
[2m[36m(GBDTTrainable pid=80313)[0m 2022-05-11 13:44:01,140	INFO main.py:984 -- [RayXGBoost] Created 2 new actors (2 total actors). Waiting until actors are ready for training.


Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=0.76 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.


[2m[36m(GBDTTrainable pid=80298)[0m 2022-05-11 13:44:03,916	INFO main.py:1029 -- [RayXGBoost] Starting XGBoost training.
[2m[36m(_RemoteRayXGBoostActor pid=80333)[0m [13:44:03] task [xgboost.ray]:4709416592 got new rank 0
[2m[36m(_RemoteRayXGBoostActor pid=80335)[0m [13:44:03] task [xgboost.ray]:4465884944 got new rank 1
[2m[36m(GBDTTrainable pid=80313)[0m 2022-05-11 13:44:08,833	INFO main.py:1029 -- [RayXGBoost] Starting XGBoost training.
[2m[36m(_RemoteRayXGBoostActor pid=80355)[0m [13:44:08] task [xgboost.ray]:4659109136 got new rank 1
[2m[36m(_RemoteRayXGBoostActor pid=80354)[0m [13:44:08] task [xgboost.ray]:4488156496 got new rank 0


Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=0.71 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.
Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=1.47 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=1.52 with parameters={'params': {'max_depth': 4, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=0.67 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.
Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=1.20 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=1.27 with parameters={'params': {'max_depth': 4, 'min_child_weight': 1}}.


[2m[36m(GBDTTrainable pid=80182)[0m 2022-05-11 13:44:25,737	INFO main.py:1113 -- Training in progress (250 seconds since last restart).


Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=0.63 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=1.10 with parameters={'params': {'max_depth': 4, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=1.03 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.


[2m[36m(GBDTTrainable pid=80298)[0m 2022-05-11 13:44:34,612	INFO main.py:1113 -- Training in progress (31 seconds since last restart).


Trial XGBoostTrainer_d21c1_00005 reported train-mlogloss=0.61 with parameters={'params': {'max_depth': 6, 'min_child_weight': 8}}.


[2m[36m(GBDTTrainable pid=80182)[0m 2022-05-11 13:44:36,595	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 273.05 seconds (261.23 pure XGBoost training time).
[2m[36m(_RemoteRayXGBoostActor pid=80202)[0m E0511 13:44:36.636644000 123145465180160 chttp2_transport.cc:1103]     Received a GOAWAY with error code ENHANCE_YOUR_CALM and debug data equal to "too_many_pings"
[2m[36m(_RemoteRayXGBoostActor pid=80203)[0m E0511 13:44:36.640197000 123145377869824 chttp2_transport.cc:1103]     Received a GOAWAY with error code ENHANCE_YOUR_CALM and debug data equal to "too_many_pings"


Trial XGBoostTrainer_d21c1_00005 completed. Last result: train-mlogloss=0.607898,train-merror=0.219964,should_checkpoint=True
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=0.99 with parameters={'params': {'max_depth': 4, 'min_child_weight': 1}}.


[2m[36m(GBDTTrainable pid=80313)[0m 2022-05-11 13:44:38,915	INFO main.py:1113 -- Training in progress (30 seconds since last restart).


Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=0.91 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=0.90 with parameters={'params': {'max_depth': 4, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=0.82 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=0.84 with parameters={'params': {'max_depth': 4, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=0.79 with parameters={'params': {'max_depth': 4, 'min_child_weight': 1}}.
Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=0.76 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.
Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=0.71 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=0.72 with parameters={'par

[2m[36m(GBDTTrainable pid=80298)[0m 2022-05-11 13:45:04,963	INFO main.py:1113 -- Training in progress (61 seconds since last restart).
[2m[36m(GBDTTrainable pid=80313)[0m 2022-05-11 13:45:06,589	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 65.59 seconds (57.75 pure XGBoost training time).


Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=0.67 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.
Trial XGBoostTrainer_d21c1_00007 reported train-mlogloss=0.70 with parameters={'params': {'max_depth': 4, 'min_child_weight': 1}}. This trial completed.
Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=0.63 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.
Trial XGBoostTrainer_d21c1_00006 reported train-mlogloss=0.61 with parameters={'params': {'max_depth': 6, 'min_child_weight': 4}}.


[2m[36m(GBDTTrainable pid=80298)[0m 2022-05-11 13:45:18,924	INFO main.py:1526 -- [RayXGBoost] Finished XGBoost training on training data with total N=581,012 in 79.91 seconds (75.00 pure XGBoost training time).


Trial XGBoostTrainer_d21c1_00006 completed. Last result: train-mlogloss=0.607927,train-merror=0.220801,should_checkpoint=True


2022-05-11 13:45:19,353	INFO tune.py:753 -- Total run time: 434.76 seconds (433.62 seconds for the tuning loop).


Now that we have obtained the results, we can analyze them. For instance, we can fetch the best observed result according to the configured `metric` and `mode` and print it:

In [9]:
# This will fetch the best result according to the `metric` and `mode` specified
# in the `TuneConfig` above:

best_result = results.get_best_result()

print("Best result error rate", best_result.metrics["train-merror"])

Best result error rate 0.195385
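As a rough mental model of this selection, the sketch below picks the best trial from the final results reported in the run above, using plain Python. The `pick_best` helper is hypothetical and only mimics the `metric`/`mode` idea; it is not the Ray Tune implementation.

```python
# Final per-trial results, copied from the status table of the run above.
final_results = [
    {"trial": "XGBoostTrainer_d21c1_00000", "train-mlogloss": 0.557201, "train-merror": 0.195385},
    {"trial": "XGBoostTrainer_d21c1_00001", "train-mlogloss": 0.562744, "train-merror": 0.199397},
    {"trial": "XGBoostTrainer_d21c1_00002", "train-mlogloss": 0.608054, "train-merror": 0.220109},
    {"trial": "XGBoostTrainer_d21c1_00003", "train-mlogloss": 0.79498, "train-merror": 0.311035},
    {"trial": "XGBoostTrainer_d21c1_00004", "train-mlogloss": 0.559456, "train-merror": 0.197772},
    {"trial": "XGBoostTrainer_d21c1_00005", "train-mlogloss": 0.607898, "train-merror": 0.219964},
    {"trial": "XGBoostTrainer_d21c1_00006", "train-mlogloss": 0.607927, "train-merror": 0.220801},
    {"trial": "XGBoostTrainer_d21c1_00007", "train-mlogloss": 0.69843, "train-merror": 0.262468},
]


def pick_best(results, metric, mode):
    """Return the entry with the smallest or largest value of `metric`."""
    if mode not in ("min", "max"):
        raise ValueError("mode must be 'min' or 'max'")
    chooser = min if mode == "min" else max
    return chooser(results, key=lambda r: r[metric])


# Same metric and mode as in the TuneConfig above.
best = pick_best(final_results, metric="train-mlogloss", mode="min")
print(best["trial"], "error rate", best["train-merror"])
```

Note that the trial is chosen by `train-mlogloss`, while the printed value is that trial's `train-merror`. In this run both metrics agree on the best trial, but that need not hold in general.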


For more sophisticated analysis, we can get a pandas dataframe with all trial results:

In [10]:
df = results.get_dataframe()
print(df.columns)

Index(['train-mlogloss', 'train-merror', 'time_this_iter_s',
       'should_checkpoint', 'done', 'timesteps_total', 'episodes_total',
       'training_iteration', 'trial_id', 'experiment_id', 'date', 'timestamp',
       'time_total_s', 'pid', 'hostname', 'node_ip', 'time_since_restore',
       'timesteps_since_restore', 'iterations_since_restore', 'warmup_time',
       'config/params/max_depth', 'config/params/min_child_weight', 'logdir'],
      dtype='object')


As an example, let's group the results per `min_child_weight` parameter and fetch the minimal obtained values:

In [11]:
groups = df.groupby("config/params/min_child_weight")
mins = groups.min()

for min_child_weight, row in mins.iterrows():
    print("Min child weight", min_child_weight, "error", row["train-merror"])


Min child weight 1 error 0.262468
Min child weight 2 error 0.195385
Min child weight 4 error 0.220801
Min child weight 6 error 0.199397
Min child weight 7 error 0.197772
Min child weight 8 error 0.219964
Min child weight 9 error 0.220109


As you can see in our example run, a min child weight of `2` achieved the lowest training error rate, `0.195385`.
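To make explicit what the grouped minimum computes, here is the same aggregation written without pandas, using the final `train-merror` values from the run above. One thing to keep in mind with the pandas version is that `groups.min()` takes the minimum of each column independently, so a row in `mins` can combine values from different trials when a group contains more than one trial (as for `min_child_weight=1` here).

```python
from collections import defaultdict

# (min_child_weight, final train-merror) per trial, copied from the run above.
trial_errors = [
    (2, 0.195385),
    (6, 0.199397),
    (9, 0.220109),
    (1, 0.311035),
    (7, 0.197772),
    (8, 0.219964),
    (4, 0.220801),
    (1, 0.262468),
]

# Collect all error values that share the same min_child_weight.
grouped = defaultdict(list)
for weight, error in trial_errors:
    grouped[weight].append(error)

# Keep the smallest error per group.
min_error_per_weight = {weight: min(errors) for weight, errors in grouped.items()}

for weight in sorted(min_error_per_weight):
    print("Min child weight", weight, "error", min_error_per_weight[weight])

# The weight whose best trial has the lowest error overall.
best_weight = min(min_error_per_weight, key=min_error_per_weight.get)
print("Best min child weight:", best_weight)
```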

By default, `results.get_dataframe()` returns the last reported results per trial. If you want to obtain the best _ever_ observed results, you can pass the `filter_metric` and `filter_mode` arguments to `results.get_dataframe()`. In our example, we'll filter the minimum _ever_ observed `train-merror` for each trial:

In [12]:
df_min_error = results.get_dataframe(filter_metric="train-merror", filter_mode="min")
df_min_error["train-merror"]

0    0.195385
1    0.199397
2    0.220109
3    0.310307
4    0.197772
5    0.219964
6    0.220801
7    0.262468
Name: train-merror, dtype: float64

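The best ever observed `train-merror` across all trials is `0.195385`, the same as the minimum error in our grouped results. This is expected, as the training classification error in XGBoost usually goes down over time, meaning our last results are usually the best results. The row at index 3 is the one exception in this run: its best ever observed value (`0.310307`) is slightly lower than its last reported value (`0.311035`).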
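The difference between "last reported" and "best ever observed" can be illustrated with a small pure-Python sketch. The per-iteration histories below are invented for illustration and are not taken from the run above.

```python
# Hypothetical per-iteration train-merror histories for two trials.
histories = {
    "trial_a": [0.40, 0.35, 0.31, 0.30],      # improves at every iteration
    "trial_b": [0.42, 0.33, 0.3103, 0.3110],  # best value is not the last one
}

# Last reported value per trial (what the default dataframe corresponds to).
last_reported = {name: values[-1] for name, values in histories.items()}

# Best ever observed value per trial (the filter_metric / filter_mode="min" idea).
best_ever = {name: min(values) for name, values in histories.items()}

for name in histories:
    print(name, "last:", last_reported[name], "best ever:", best_ever[name])
```

For `trial_a` both views agree, while for `trial_b` the best value was reached one iteration before the end.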

And that's how you analyze your hyperparameter tuning results. If you would like to have access to more analytics, please feel free to file a feature request, e.g. [as a GitHub issue](https://github.com/ray-project/ray/issues) or on our [Discuss platform](https://discuss.ray.io/)!