https://github.com/h2oai/h2o-tutorials/blob/master/h2o-world-2017/automl/Python/automl_regression_powerplant_output.ipynb

# H2O AutoML Regression Demo: freMTPL2freq

This is a Jupyter Notebook. When you execute code within the notebook, the results appear beneath the code. To execute a code chunk, place your cursor on the cell and press Shift+Enter.

## Start H2O

Import the `h2o` Python module and `H2OAutoML` class and initialize a local H2O cluster.

In [1]:
import h2o
from h2o.automl import H2OAutoML

In [2]:
h2o.init()

Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
  Java Version: openjdk version "1.8.0_372"; OpenJDK Runtime Environment (build 1.8.0_372-b07); OpenJDK 64-Bit Server VM (build 25.372-b07, mixed mode)
  Starting server from /home/stever7/.local/lib/python3.9/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /tmp/tmp0znf9y1j
  JVM stdout: /tmp/tmp0znf9y1j/h2o_stever7_started_from_python.out
  JVM stderr: /tmp/tmp0znf9y1j/h2o_stever7_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.


H2O_cluster_uptime:,07 secs
H2O_cluster_timezone:,Etc/GMT
H2O_data_parsing_timezone:,UTC
H2O_cluster_version:,3.36.0.3
H2O_cluster_version_age:,"1 year, 4 months and 18 days !!!"
H2O_cluster_name:,H2O_from_python_stever7_uarlfn
H2O_cluster_total_nodes:,1
H2O_cluster_free_memory:,26.63 Gb
H2O_cluster_total_cores:,64
H2O_cluster_allowed_cores:,64


## Load Data

This notebook is adapted from H2O's AutoML regression demo, which uses the Combined Cycle Power Plant dataset:

http://archive.ics.uci.edu/ml/datasets/Combined+Cycle+Power+Plant

Here we use the freMTPL2freq dataset instead, a set of French motor third-party liability insurance policies, loaded from local training and test CSV files. The goal is to predict the number of claims on each policy (`ClaimNb`) from the policy, vehicle and driver columns. The original power plant loading code is kept below as comments.

In [3]:
# Use local data file or download from GitHub

# import os

# docker_data_path = "/home/h2o/data/automl/powerplant_output.csv"

# if os.path.isfile(docker_data_path):
#   data_path = docker_data_path
# else:
#   data_path = "https://github.com/h2oai/h2o-tutorials/raw/master/h2o-world-2017/automl/data/powerplant_output.csv"

# Load data into H2O
# df = h2o.import_file(data_path)

train = h2o.import_file("freMTPL2freq_dataset_train.csv")
test = h2o.import_file("freMTPL2freq_dataset_test.csv")

Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%
Parse progress: |████████████████████████████████████████████████████████████████| (done) 100%


Let's take a look at the data.

In [4]:
# df.describe()
train.describe()

Rows:474765
Cols:12




Unnamed: 0,IDpol,ClaimNb,Exposure,VehPower,VehAge,DrivAge,BonusMalus,VehBrand,VehGas,Area,Density,Region
type,int,int,real,int,int,int,int,enum,enum,enum,int,enum
mins,3.0,0.0,0.00273224043715847,4.0,0.0,18.0,50.0,,,,1.0,
mean,2621440.384487062,0.03858329910587343,0.5287500879942947,6.455865533474456,7.042741145619412,45.48614788368983,59.78412267121629,,,,1795.9865449222248,
maxs,6114326.0,11.0,2.01,15.0,100.0,100.0,230.0,,,,27000.0,
sigma,1641195.000917722,0.20545811920077814,0.3645199134252984,2.0517258147879502,5.663396818846188,14.135156438087566,15.650619345522982,,,,3963.205126878521,
zeros,0,457477,0,0,40485,0,0,,,,0,
missing,0,0,0,0,0,0,0,0,0,0,0,0
0,3.0,0.0,0.77,5.0,0.0,55.0,50.0,B12,Regular,D,1217.0,Rhone-Alpes
1,5.0,0.0,0.75,6.0,2.0,52.0,50.0,B12,Diesel,B,54.0,Picardie
2,11.0,0.0,0.84,7.0,0.0,46.0,50.0,B12,Diesel,B,76.0,Aquitaine


Next, let's identify the response column and save the column name as `y`. In this dataset, we will use all columns except the response as predictors, so we can skip setting the `x` argument explicitly.

In [5]:
# y = "HourlyEnergyOutputMW"
y = "ClaimNb"
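Because every non-response column is used by default, the policy identifier `IDpol` is also treated as a predictor. The following is a minimal sketch, not part of the original run, of how an explicit predictor list could be built. The column names are taken from the `describe()` output above, and excluding `IDpol` is an assumption about what is desirable here, not something the notebook does.

```python
# Column names as shown in the train.describe() output above.
columns = [
    "IDpol", "ClaimNb", "Exposure", "VehPower", "VehAge", "DrivAge",
    "BonusMalus", "VehBrand", "VehGas", "Area", "Density", "Region",
]

y = "ClaimNb"

# Assumption for illustration: an identifier column should not be a predictor.
excluded = {"IDpol"}

# Everything except the response and the excluded columns.
x = [c for c in columns if c != y and c not in excluded]
print(x)

# With a running cluster, the list would be passed through the x argument:
# aml.train(x=x, y=y, training_frame=train, leaderboard_frame=test)
```

The runs below do not set `x`, so their results include `IDpol` as a predictor.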

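The original demo splits the data into a `train` frame (80%) and a `test` frame (20%) at this point. Here the data was already loaded as separate `train` and `test` frames, so the split is left commented out. The `test` frame will be used to score the leaderboard and to demonstrate how to generate predictions using an AutoML leader model.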

In [6]:
# splits = df.split_frame(ratios=[0.8], seed=1)
# train = splits[0]
# test = splits[1]

## Run AutoML

Run AutoML, stopping after 3600 seconds (one hour). The commented-out lines in the cell below record that a 60-second limit was too short for AutoML to build any model on this dataset. The `max_runtime_secs` argument provides a way to limit the AutoML run by time. When using a time-limited stopping criterion, the number of models trained will vary between runs. If different hardware is used, or if the same machine is used but its available compute resources differ between runs, then AutoML may be able to train more models on one run than on another.

The `test` frame is passed explicitly to the `leaderboard_frame` argument here, which means that instead of using cross-validated metrics, we use test set metrics for generating the leaderboard.

In [7]:
# aml = H2OAutoML(max_runtime_secs=60, seed=1, project_name="powerplant_lb_frame")
# AutoML was not able to build any model within a max runtime constraint of 60 seconds, 
# you may want to increase this value before retrying.
# aml = H2OAutoML(max_runtime_secs=120, seed=1, project_name="powerplant_lb_frame")
aml = H2OAutoML(max_runtime_secs=3600, seed=1, project_name="freMTPL2freq_lb_frame", sort_metric="MAE")

In [8]:
aml.train(y=y, training_frame=train, leaderboard_frame=test)

AutoML progress: |███████████████████████████████████████████████████████████████| (done) 100%
Model Details
H2ODeepLearningEstimator :  Deep Learning
Model Key:  DeepLearning_grid_2_AutoML_1_20230705_141306_model_1


Status of Neuron Layers: predicting ClaimNb, regression, gaussian distribution, Quadratic loss, 15,501 weights/biases, 190.2 KB, 100,200 training samples, mini-batch size 1


Unnamed: 0,Unnamed: 1,layer,units,type,dropout,l1,l2,mean_rate,rate_rms,momentum,mean_weight,weight_rms,mean_bias,bias_rms
0,,1,52,Input,15.0,,,,,,,,,
1,,2,100,RectifierDropout,10.0,0.0,0.0,0.09789,0.256201,0.0,-0.004005,0.11163,0.541787,0.188627
2,,3,100,RectifierDropout,10.0,0.0,0.0,0.041452,0.112558,0.0,-0.018337,0.100492,0.908628,0.113794
3,,4,1,Linear,,0.0,0.0,0.001078,0.001898,0.0,0.018637,0.079657,0.377007,0.0




ModelMetricsRegression: deeplearning
** Reported on train data. **

MSE: 0.04157244754310789
RMSE: 0.20389322583918254
MAE: 0.04663188232938914
RMSLE: 0.1358681949914715
Mean Residual Deviance: 0.04157244754310789

ModelMetricsRegression: deeplearning
** Reported on cross-validation data. **

MSE: 0.04265217739519132
RMSE: 0.20652403587764626
MAE: 0.05542654457409077
RMSLE: 0.1359927715866486
Mean Residual Deviance: 0.04265217739519132

Cross-Validation Metrics Summary: 


Unnamed: 0,Unnamed: 1,mean,sd,cv_1_valid,cv_2_valid,cv_3_valid,cv_4_valid,cv_5_valid
0,mae,0.052682,0.003638,0.053257,0.047405,0.057291,0.054051,0.051406
1,mean_residual_deviance,0.042606,0.000932,0.042706,0.042757,0.041795,0.044032,0.04174
2,mse,0.042606,0.000932,0.042706,0.042757,0.041795,0.044032,0.04174
3,r2,-0.009313,0.003553,-0.011131,-0.01379,-0.004166,-0.008746,-0.008733
4,residual_deviance,0.042606,0.000932,0.042706,0.042757,0.041795,0.044032,0.04174
5,rmse,0.206402,0.002251,0.206654,0.206778,0.204438,0.209838,0.204304
6,rmsle,0.135695,0.001252,0.136938,0.134546,0.136144,0.136661,0.134187



Scoring History: 


Unnamed: 0,Unnamed: 1,timestamp,duration,training_speed,epochs,iterations,samples,training_rmse,training_deviance,training_mae,training_r2
0,,2023-07-05 15:12:38,0.000 sec,,0.0,0,0.0,,,,
1,,2023-07-05 15:12:38,21.213 sec,41292 obs/sec,0.021135,1,10034.0,0.207878,0.043213,0.052023,-0.058662
2,,2023-07-05 15:12:40,23.318 sec,43908 obs/sec,0.211052,10,100200.0,0.203893,0.041572,0.046632,-0.018469



Variable Importances: 


Unnamed: 0,variable,relative_importance,scaled_importance,percentage
0,Region.Midi-Pyrenees,0.997609,1.0,0.024778
1,Region.Centre,0.94529,0.947556,0.023478
2,VehBrand.B3,0.914559,0.916751,0.022715
3,Region.Picardie,0.905328,0.907498,0.022486
4,Region.Aquitaine,0.903534,0.9057,0.022441
5,Region.Rhone-Alpes,0.900883,0.903043,0.022375
6,VehBrand.B6,0.899609,0.901765,0.022344
7,VehAge,0.899157,0.901313,0.022333
8,VehBrand.B10,0.896734,0.898883,0.022272
9,Region.Pays-de-la-Loire,0.894255,0.896399,0.022211



See the whole table with table.as_data_frame()
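The `mean` and `sd` columns of the cross-validation summary above can be cross-checked from the five per-fold values. The sketch below does this for the `mae` row. Using the sample standard deviation (dividing by n - 1) is an assumption made here because it matches the reported numbers, not a documented statement about H2O.

```python
import statistics

# Per-fold MAE values (cv_1_valid .. cv_5_valid) from the summary table above.
fold_mae = [0.053257, 0.047405, 0.057291, 0.054051, 0.051406]

mean_mae = statistics.mean(fold_mae)
sd_mae = statistics.stdev(fold_mae)  # sample standard deviation (n - 1)

print(f"mean = {mean_mae:.6f}, sd = {sd_mae:.6f}")
```

The mean reproduces the reported 0.052682, and the standard deviation comes within rounding of the reported 0.003638, since the fold values in the table are themselves rounded to six decimals.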




*Note: If you see the following error, it means that you need to install the pandas module.*

```
H2OTypeError: Argument `python_obj` should be a None | list | tuple | dict | numpy.ndarray | pandas.DataFrame | scipy.sparse.issparse, got H2OTwoDimTable 
```

For demonstration purposes, we will also execute a second AutoML run, this time without passing a `leaderboard_frame`, so the leaderboard will use cross-validated metrics. In the original demo, this run used the full, unsplit dataset `df`, which is a more efficient use of the data since 100% of it is used for training rather than 80%. Here no combined frame is loaded, so the second run trains on the same `train` frame as the first.

*Note: Using an explicit `leaderboard_frame` for scoring may be useful in some cases, which is why the option is available.*

In [9]:
# aml2 = H2OAutoML(max_runtime_secs=60, seed=1, project_name="powerplant_full_data")
# AutoML was not able to build any model within a max runtime constraint of 60 seconds, 
# you may want to increase this value before retrying.
# aml2 = H2OAutoML(max_runtime_secs=120, seed=1, project_name="powerplant_full_data")
aml2 = H2OAutoML(max_runtime_secs=3600, seed=1, project_name="freMTPL2freq_full_data", sort_metric="MAE")

In [10]:
# aml2.train(y=y, training_frame=df)
aml2.train(y=y, training_frame=train)

AutoML progress: |███████████████████████████████████████████████████████████████| (done) 100%

16:12:57.364: GBM_lr_annealing_selection_AutoML_2_20230705_151321 [GBM lr_annealing] failed: water.exceptions.H2OIllegalArgumentException: Can only convert jobs producing a single Model or ModelContainer.

Model Details
H2ODeepLearningEstimator :  Deep Learning
Model Key:  DeepLearning_grid_1_AutoML_2_20230705_151321_model_1


Status of Neuron Layers: predicting ClaimNb, regression, gaussian distribution, Quadratic loss, 5,401 weights/biases, 70.1 KB, 7,198,887 training samples, mini-batch size 1


Unnamed: 0,Unnamed: 1,layer,units,type,dropout,l1,l2,mean_rate,rate_rms,momentum,mean_weight,weight_rms,mean_bias,bias_rms
0,,1,52,Input,15.0,,,,,,,,,
1,,2,100,RectifierDropout,10.0,0.0,0.0,0.104578,0.263412,0.0,0.008034,0.222857,0.12994,0.581902
2,,3,1,Linear,,0.0,0.0,0.001043,0.00351,0.0,-0.015696,0.047486,0.09335,0.0




ModelMetricsRegression: deeplearning
** Reported on train data. **

MSE: 0.0410992118240603
RMSE: 0.2027294054252128
MAE: 0.05547715072092874
RMSLE: 0.13506227936617082
Mean Residual Deviance: 0.0410992118240603

ModelMetricsRegression: deeplearning
** Reported on cross-validation data. **

MSE: 0.04336288888976339
RMSE: 0.20823757799629583
MAE: 0.04411066005697582
RMSLE: 0.1373022221909369
Mean Residual Deviance: 0.04336288888976339

Cross-Validation Metrics Summary: 


Unnamed: 0,Unnamed: 1,mean,sd,cv_1_valid,cv_2_valid,cv_3_valid,cv_4_valid,cv_5_valid
0,mae,0.044111,0.001654,0.044048,0.041783,0.045138,0.046121,0.043464
1,mean_residual_deviance,0.043363,0.00086,0.043445,0.043268,0.042825,0.044757,0.04252
2,mse,0.043363,0.00086,0.043445,0.043268,0.042825,0.044757,0.04252
3,r2,-0.027274,0.001595,-0.028624,-0.025899,-0.028905,-0.025351,-0.027591
4,residual_deviance,0.043363,0.00086,0.043445,0.043268,0.042825,0.044757,0.04252
5,rmse,0.208229,0.002057,0.208434,0.208009,0.206941,0.211558,0.206204
6,rmsle,0.137297,0.001388,0.138497,0.135669,0.138233,0.138185,0.135899



Scoring History: 


Unnamed: 0,Unnamed: 1,timestamp,duration,training_speed,epochs,iterations,samples,training_rmse,training_deviance,training_mae,training_r2
0,,2023-07-05 16:04:14,0.000 sec,,0.0,0,0.0,,,,
1,,2023-07-05 16:04:15,2 min 47.023 sec,115673 obs/sec,0.210753,1,100058.0,0.202674,0.041077,0.054717,-0.006325
2,,2023-07-05 16:04:21,2 min 52.572 sec,125000 obs/sec,1.687426,8,801131.0,0.203941,0.041592,0.048114,-0.018947
3,,2023-07-05 16:04:26,2 min 58.102 sec,134191 obs/sec,3.373414,16,1601579.0,0.2051,0.042066,0.043905,-0.030557
4,,2023-07-05 16:04:32,3 min 3.696 sec,142765 obs/sec,5.269615,25,2501829.0,0.205949,0.042415,0.044172,-0.039108
5,,2023-07-05 16:04:37,3 min 8.850 sec,150004 obs/sec,7.163951,34,3401193.0,0.204043,0.041633,0.046658,-0.019962
6,,2023-07-05 16:04:42,3 min 14.164 sec,157253 obs/sec,9.268611,44,4400412.0,0.202893,0.041166,0.050845,-0.008501
7,,2023-07-05 16:04:47,3 min 19.399 sec,162617 obs/sec,11.376203,54,5401023.0,0.203663,0.041478,0.048854,-0.016166
8,,2023-07-05 16:04:53,3 min 24.660 sec,166391 obs/sec,13.482255,64,6400903.0,0.202491,0.041002,0.056169,-0.004503
9,,2023-07-05 16:04:57,3 min 29.027 sec,168072 obs/sec,15.163053,72,7198887.0,0.202729,0.041099,0.055477,-0.006875



Variable Importances: 


Unnamed: 0,variable,relative_importance,scaled_importance,percentage
0,Area.F,1.0,1.0,0.037311
1,Density,0.809705,0.809705,0.030211
2,VehGas.Regular,0.763717,0.763717,0.028495
3,Region.Centre,0.737761,0.737761,0.027526
4,VehBrand.B6,0.734093,0.734093,0.027389
5,Region.Ile-de-France,0.698926,0.698926,0.026077
6,Region.Midi-Pyrenees,0.694823,0.694823,0.025924
7,VehBrand.B1,0.694723,0.694723,0.02592
8,VehGas.Diesel,0.67707,0.67707,0.025262
9,VehBrand.B2,0.667195,0.667195,0.024893



See the whole table with table.as_data_frame()




*Note: We specify a `project_name` here for clarity.*

## Leaderboard

Next, we will view the AutoML Leaderboard. Since we specified a `leaderboard_frame` in the `H2OAutoML.train()` method for scoring and ranking the models, the AutoML leaderboard uses the performance on this data to rank the models.

After viewing the `"freMTPL2freq_lb_frame"` AutoML project leaderboard, we compare that to the leaderboard for the `"freMTPL2freq_full_data"` project. The first leaderboard is scored on the `test` frame and the second on cross-validated metrics, so the two sets of numbers are not strictly comparable.

A default performance metric for each machine learning task (binary classification, multiclass classification, regression) is specified internally and the leaderboard will be sorted by that metric. In the case of regression, the default ranking metric is mean residual deviance. Both runs here override the default with `sort_metric="MAE"`, so the leaderboards below are ranked by mean absolute error.

In [11]:
aml.leaderboard.head()

model_id,mae,mean_residual_deviance,rmse,mse,rmsle
DeepLearning_grid_2_AutoML_1_20230705_141306_model_1,0.048126,0.0454145,0.213107,0.0454145,0.138879
DeepLearning_grid_1_AutoML_1_20230705_141306_model_2,0.0507244,0.0456418,0.213639,0.0456418,0.139603
DeepLearning_grid_1_AutoML_1_20230705_141306_model_1,0.0574449,0.0447577,0.21156,0.0447577,0.137656
XRT_1_AutoML_1_20230705_141306,0.0709409,0.0435306,0.20864,0.0435306,0.137558
StackedEnsemble_AllModels_3_AutoML_1_20230705_141306,0.0718528,0.042645,0.206507,0.042645,0.135436
StackedEnsemble_BestOfFamily_2_AutoML_1_20230705_141306,0.0720474,0.0427283,0.206708,0.0427283,0.135513
StackedEnsemble_AllModels_1_AutoML_1_20230705_141306,0.0721708,0.0426814,0.206595,0.0426814,0.135505
StackedEnsemble_AllModels_2_AutoML_1_20230705_141306,0.072202,0.0426731,0.206575,0.0426731,0.135492
StackedEnsemble_BestOfFamily_4_AutoML_1_20230705_141306,0.072338,0.0427189,0.206686,0.0427189,0.135524
StackedEnsemble_BestOfFamily_3_AutoML_1_20230705_141306,0.0724819,0.0427182,0.206684,0.0427182,0.135534
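The choice of ranking metric changes which model counts as the leader. As a small illustration, the sketch below re-ranks a few rows copied from the leaderboard above in plain Python. The `leader` helper is hypothetical and only mimics sorting a table by one column; it is not part of the H2O API.

```python
# (model_id, mae, mean_residual_deviance) copied from the leaderboard above.
leaderboard = [
    ("DeepLearning_grid_2_AutoML_1_20230705_141306_model_1", 0.048126, 0.0454145),
    ("XRT_1_AutoML_1_20230705_141306", 0.0709409, 0.0435306),
    ("StackedEnsemble_AllModels_3_AutoML_1_20230705_141306", 0.0718528, 0.042645),
    ("StackedEnsemble_BestOfFamily_2_AutoML_1_20230705_141306", 0.0720474, 0.0427283),
]


def leader(rows, metric_index):
    """Return the model_id with the lowest value in the given metric column."""
    return min(rows, key=lambda row: row[metric_index])[0]


leader_by_mae = leader(leaderboard, 1)
leader_by_deviance = leader(leaderboard, 2)

print("Leader by MAE:                   ", leader_by_mae)
print("Leader by mean residual deviance:", leader_by_deviance)
```

Among these rows, a Deep Learning model leads on MAE while a Stacked Ensemble leads on mean residual deviance.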




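Now we will view a snapshot of the top models from the second run. Because both leaderboards are sorted by MAE, Deep Learning models take the top spots. Among the rows shown, the Stacked Ensembles rank lower on MAE even though they have the lowest mean residual deviance and RMSE in both tables.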

In [12]:
aml2.leaderboard.head()

model_id,mae,mean_residual_deviance,rmse,mse,rmsle
DeepLearning_grid_1_AutoML_2_20230705_151321_model_1,0.0441107,0.0433629,0.208238,0.0433629,0.137302
DeepLearning_grid_1_AutoML_2_20230705_151321_model_4,0.0537114,0.0429311,0.207198,0.0429311,0.136613
DeepLearning_grid_1_AutoML_2_20230705_151321_model_3,0.0539925,0.0427525,0.206767,0.0427525,0.136184
DeepLearning_grid_2_AutoML_2_20230705_151321_model_1,0.0549071,0.0427968,0.206874,0.0427968,0.136334
DeepLearning_grid_1_AutoML_2_20230705_151321_model_2,0.0578613,0.0425108,0.206182,0.0425108,0.135888
DeepLearning_grid_3_AutoML_2_20230705_151321_model_1,0.0582405,0.04244,0.20601,0.04244,0.135648
DeepLearning_1_AutoML_2_20230705_151321,0.0626316,0.041909,0.204717,0.041909,0.135055
XRT_1_AutoML_2_20230705_151321,0.0696536,0.0414062,0.203485,0.0414062,0.135982
StackedEnsemble_BestOfFamily_6_AutoML_2_20230705_151321,0.0714696,0.0404434,0.201106,0.0404434,0.133631
StackedEnsemble_AllModels_3_AutoML_2_20230705_151321,0.0714789,0.0404909,0.201223,0.0404909,0.133711




The remarks in the original demo about the data source and the published benchmark apply to the Combined Cycle Power Plant dataset, not to the data used here. That dataset comes from the UCI Machine Learning Repository.

http://archive.ics.uci.edu/ml/datasets/Combined+Cycle+Power+Plant

The data was used in a publication in the *International Journal of Electrical Power & Energy Systems* in 2014.

https://www.sciencedirect.com/science/article/pii/S0142061514000908

In the paper, the authors achieved a mean absolute error (MAE) of 2.818 and a root mean squared error (RMSE) of 3.787 on their best model. Those errors are measured in megawatts of energy output and are not comparable to the claim-count errors reported above, so no state-of-the-art comparison is made for freMTPL2freq in this notebook.

## Predict Using Leader Model

If you need to generate predictions on a test set, you can make predictions on the `H2OAutoML` object directly, or on the leader model object.

In [13]:
pred = aml.predict(test)
pred.head()

deeplearning prediction progress: |██████████████████████████████████████████████| (done) 100%


predict
0.00683289
0.0101026
0.00671481
0.0167253
0.000656939
0.00596662
0.00211559
0.00135086
0.00946236
-0.000613941
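The last prediction above is slightly negative, which cannot happen for a claim count. The sketch below shows one simple post-processing option, clipping predictions at zero, using the ten values from the output. This is an illustrative assumption about how the predictions might be handled, not a step from the original notebook.

```python
# Predictions copied from the pred.head() output above.
predictions = [
    0.00683289, 0.0101026, 0.00671481, 0.0167253, 0.000656939,
    0.00596662, 0.00211559, 0.00135086, 0.00946236, -0.000613941,
]

# A claim count cannot be negative, so floor every prediction at zero.
clipped = [max(0.0, p) for p in predictions]

n_negative = sum(p < 0 for p in predictions)
print(f"{n_negative} of {len(predictions)} predictions were negative")
print(clipped)
```

Clipping only changes the negative values, so it cannot increase the absolute error on a non-negative target.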




If needed, the standard `model_performance()` method can be applied to the AutoML leader model and a test set to generate an H2O model performance object.

In [14]:
perf = aml.leader.model_performance(test)
perf


ModelMetricsRegression: deeplearning
** Reported on test data. **

MSE: 0.045414456056266136
RMSE: 0.21310667764353639
MAE: 0.04812604562896159
RMSLE: 0.1388788338311417
Mean Residual Deviance: 0.045414456056266136


