# Overview

H2O (from H2O.ai) is an ML framework that provides an easy-to-use interface to many high-performance algorithms. It also provides a function called AutoML that automatically trains and cross-validates a number of different algorithms: a default Distributed Random Forest (DRF), an Extremely Randomized Forest (XRT), a random grid of Gradient Boosting Machines (GBMs), a random grid of Deep Neural Nets, and a fixed grid of GLMs. AutoML then trains two Stacked Ensemble models: one ensemble contains all the models, and the second contains just the best-performing model from each algorithm class/family.

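The final prediction is made with the leader model, i.e. the model at the top of the AutoML leaderboard. The all-models ensemble is often the leader and usually performs well out of the box, but in the run below a GBM ranks first by AUC, so `aml.predict` uses that GBM.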

In [1]:
import numpy as np
import pandas as pd

In [2]:
%load_ext autoreload
%autoreload 2

In [3]:
# Sklearn imports
from sklearn.impute import SimpleImputer  # replaces preprocessing.Imputer, which was removed in scikit-learn 0.22
from sklearn.preprocessing import StandardScaler
from sklearn.ensemble import RandomForestClassifier
from sklearn.pipeline import Pipeline, make_pipeline
from sklearn.model_selection import cross_val_score, train_test_split, GridSearchCV
from sklearn.metrics import confusion_matrix, f1_score, accuracy_score
from sklearn.utils import resample

In [4]:
import helperFunctions

In [5]:
import h2o
from h2o.automl import H2OAutoML
h2o.init()

Checking whether there is an H2O instance running at http://localhost:54321..... not found.
Attempting to start a local H2O server...
; OpenJDK 64-Bit Server VM (Zulu 8.20.0.5-win64) (build 25.121-b15, mixed mode)
  Starting server from C:\Users\sankalpg\AppData\Local\Continuum\anaconda3\envs\mlpy\lib\site-packages\h2o\backend\bin\h2o.jar
  Ice root: C:\Users\sankalpg\AppData\Local\Temp\tmp4qj02j02
  JVM stdout: C:\Users\sankalpg\AppData\Local\Temp\tmp4qj02j02\h2o_SANKALPG_started_from_python.out
  JVM stderr: C:\Users\sankalpg\AppData\Local\Temp\tmp4qj02j02\h2o_SANKALPG_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321... successful.


H2O cluster uptime:,03 secs
H2O cluster timezone:,Asia/Kolkata
H2O data parsing timezone:,UTC
H2O cluster version:,3.18.0.8
H2O cluster version age:,4 months and 3 days !!!
H2O cluster name:,H2O_from_python_SANKALPG_2sozfu
H2O cluster total nodes:,1
H2O cluster free memory:,3.531 Gb
H2O cluster total cores:,4
H2O cluster allowed cores:,4


In [6]:
# Setting the random state for later use
random_state = 565

## Load datasets

In [7]:
X_train, y_train = helperFunctions.load_clean_encode('training.csv', delimiter=';')

In [8]:
X_valid, y_valid = helperFunctions.load_clean_encode('validation.csv', delimiter=';')

Make sure that the train and validation sets have the same columns:

In [9]:
X_train, X_valid = helperFunctions.equalizeColumns(X_train, X_valid)
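
The source of `helperFunctions.equalizeColumns` is not shown in this notebook. As a rough sketch of what such a step could look like, the snippet below aligns two one-hot encoded frames with pandas. The function name `equalize_columns`, the toy frames, and the choice to fill missing indicator columns with 0 are all assumptions made for illustration, not the helper's actual implementation.

```python
import pandas as pd


def equalize_columns(train: pd.DataFrame, valid: pd.DataFrame):
    """Hypothetical stand-in for helperFunctions.equalizeColumns."""
    # Keep the training column order, then append validation-only columns.
    all_cols = list(train.columns) + [
        c for c in valid.columns if c not in train.columns
    ]
    # reindex adds any missing column and fills it with 0, the natural
    # value for a one-hot indicator that never occurs in that frame.
    return (
        train.reindex(columns=all_cols, fill_value=0),
        valid.reindex(columns=all_cols, fill_value=0),
    )


# Toy data (made up): each frame has a dummy column the other lacks.
train = pd.DataFrame({"v2": [1.0, 2.0], "v12_p": [1, 0], "v12_s": [0, 1]})
valid = pd.DataFrame({"v2": [3.0], "v12_o": [1]})

train_eq, valid_eq = equalize_columns(train, valid)
print(list(train_eq.columns))
print(list(valid_eq.columns))
```

Aligning on the union of columns keeps every category seen in either file; aligning on the intersection would be the alternative if unseen categories should be dropped instead.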

## H2O AutoML

In [10]:
train_h2o = h2o.H2OFrame(pd.concat([X_train, y_train], axis=1))
valid_h2o = h2o.H2OFrame(X_valid)

Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%


In [11]:
train_h2o['classLabel'] = train_h2o['classLabel'].asfactor()

In [12]:
print(train_h2o.columns, valid_h2o.columns)

['v2', 'v3', 'v5', 'v6', 'v7', 'v10', 'v13', 'v14', 'v1_b', 'v4_u', 'v4_y', 'v8_t', 'v9_t', 'v11_t', 'v12_p', 'v12_s', 'v12_o', 'classLabel'] ['v2', 'v3', 'v5', 'v6', 'v7', 'v10', 'v13', 'v14', 'v1_b', 'v4_u', 'v4_y', 'v8_t', 'v9_t', 'v11_t', 'v12_p', 'v12_s', 'v12_o']


In [15]:
# Run AutoML for 300 seconds
aml = H2OAutoML(max_runtime_secs=300, nfolds=10)

In [16]:
predictors = [c for c in train_h2o.columns if c != 'classLabel']  # exclude the response column
aml.train(x=predictors, y='classLabel', training_frame=train_h2o)

AutoML progress: |████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%


__AutoML Statistics__

In [21]:
lb = aml.leaderboard
lb

model_id,auc,logloss
GBM_grid_0_AutoML_20180822_185603_model_19,0.994962,0.0638104
GBM_grid_0_AutoML_20180822_185603_model_15,0.994881,0.0736909
GBM_grid_0_AutoML_20180822_185603_model_20,0.994563,0.0800151
GBM_grid_0_AutoML_20180822_185603_model_13,0.991639,0.124533
StackedEnsemble_AllModels_0_AutoML_20180822_185603,0.991355,0.0349008
GBM_grid_0_AutoML_20180822_185603_model_26,0.989895,0.206664
StackedEnsemble_BestOfFamily_0_AutoML_20180822_185603,0.988795,0.0538617
GBM_grid_0_AutoML_20180822_185603_model_2,0.986881,0.067286
GBM_grid_0_AutoML_20180822_185603_model_11,0.986815,0.084471
GBM_grid_0_AutoML_20180822_185603_model_8,0.986737,0.0889163
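
The leaderboard above is ordered by AUC, which is why a GBM ends up as the leader even though the all-models ensemble has the lowest logloss. The sketch below re-ranks the ten rows shown above with plain Python to make that visible. It only covers the rows printed here, not the full leaderboard, and the variable names are illustrative.

```python
import csv
import io

# The ten leaderboard rows copied from the output above.
LEADERBOARD_CSV = """model_id,auc,logloss
GBM_grid_0_AutoML_20180822_185603_model_19,0.994962,0.0638104
GBM_grid_0_AutoML_20180822_185603_model_15,0.994881,0.0736909
GBM_grid_0_AutoML_20180822_185603_model_20,0.994563,0.0800151
GBM_grid_0_AutoML_20180822_185603_model_13,0.991639,0.124533
StackedEnsemble_AllModels_0_AutoML_20180822_185603,0.991355,0.0349008
GBM_grid_0_AutoML_20180822_185603_model_26,0.989895,0.206664
StackedEnsemble_BestOfFamily_0_AutoML_20180822_185603,0.988795,0.0538617
GBM_grid_0_AutoML_20180822_185603_model_2,0.986881,0.067286
GBM_grid_0_AutoML_20180822_185603_model_11,0.986815,0.084471
GBM_grid_0_AutoML_20180822_185603_model_8,0.986737,0.0889163
"""

rows = list(csv.DictReader(io.StringIO(LEADERBOARD_CSV)))
for row in rows:
    row["auc"] = float(row["auc"])
    row["logloss"] = float(row["logloss"])

# Higher AUC is better, lower logloss is better.
best_by_auc = max(rows, key=lambda r: r["auc"])
best_by_logloss = min(rows, key=lambda r: r["logloss"])

print("best AUC:    ", best_by_auc["model_id"])
print("best logloss:", best_by_logloss["model_id"])
```

Which metric is the right one to rank by depends on whether well-calibrated probabilities (logloss) or ranking quality (AUC) matters more for the task.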




In [22]:
aml.leader

Model Details
H2OGradientBoostingEstimator :  Gradient Boosting Machine
Model Key:  GBM_grid_0_AutoML_20180822_185603_model_19


ModelMetricsBinomial: gbm
** Reported on train data. **

MSE: 6.612286143144014e-06
RMSE: 0.0025714365913131155
LogLoss: 0.0009041862047124636
Mean Per-Class Error: 0.0
AUC: 1.0
Gini: 1.0
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.978078180633734: 


,0.0,1.0,Error,Rate
0,2659.0,0.0,0.0,(0.0/2659.0)
1,0.0,215.0,0.0,(0.0/215.0)
Total,2659.0,215.0,0.0,(0.0/2874.0)


Maximum Metrics: Maximum metrics at their respective thresholds



metric,threshold,value,idx
max f1,0.9780782,1.0,158.0
max f2,0.9780782,1.0,158.0
max f0point5,0.9780782,1.0,158.0
max accuracy,0.9780782,1.0,158.0
max precision,0.9994670,1.0,0.0
max recall,0.9780782,1.0,158.0
max specificity,0.9994670,1.0,0.0
max absolute_mcc,0.9780782,1.0,158.0
max min_per_class_accuracy,0.9780782,1.0,158.0


Gains/Lift Table: Avg response rate:  7.48 %



,group,cumulative_data_fraction,lower_threshold,lift,cumulative_lift,response_rate,cumulative_response_rate,capture_rate,cumulative_capture_rate,gain,cumulative_gain
,1,0.0100905,0.9979493,13.3674419,13.3674419,1.0,1.0,0.1348837,0.1348837,1236.7441860,1236.7441860
,2,0.0201809,0.9969806,13.3674419,13.3674419,1.0,1.0,0.1348837,0.2697674,1236.7441860,1236.7441860
,3,0.0302714,0.9959356,13.3674419,13.3674419,1.0,1.0,0.1348837,0.4046512,1236.7441860,1236.7441860
,4,0.0400139,0.9947126,13.3674419,13.3674419,1.0,1.0,0.1302326,0.5348837,1236.7441860,1236.7441860
,5,0.0501044,0.9929836,13.3674419,13.3674419,1.0,1.0,0.1348837,0.6697674,1236.7441860,1236.7441860
,6,0.1002088,0.0030409,6.5908915,9.9791667,0.4930556,0.7465278,0.3302326,1.0,559.0891473,897.9166667
,7,0.1499652,0.0011753,0.0,6.6682135,0.0,0.4988399,0.0,1.0,-100.0,566.8213457
,8,0.2000696,0.0007038,0.0,4.9982609,0.0,0.3739130,0.0,1.0,-100.0,399.8260870
,9,0.2999304,0.0003531,0.0,3.3341067,0.0,0.2494200,0.0,1.0,-100.0,233.4106729




ModelMetricsBinomial: gbm
** Reported on validation data. **

MSE: 0.014314498572088228
RMSE: 0.11964321364828107
LogLoss: 0.0492739008494348
Mean Per-Class Error: 0.010754027570171076
AUC: 0.9970104633781763
Gini: 0.9940209267563527
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.05486151932175458: 


,0.0,1.0,Error,Rate
0,667.0,2.0,0.003,(2.0/669.0)
1,1.0,53.0,0.0185,(1.0/54.0)
Total,668.0,55.0,0.0041,(3.0/723.0)


Maximum Metrics: Maximum metrics at their respective thresholds



metric,threshold,value,idx
max f1,0.0548615,0.9724771,54.0
max f2,0.0548615,0.9778598,54.0
max f0point5,0.0918785,0.9883721,50.0
max accuracy,0.0918785,0.9958506,50.0
max precision,0.9975287,1.0,0.0
max recall,0.0007781,1.0,153.0
max specificity,0.9975287,1.0,0.0
max absolute_mcc,0.0548615,0.9702812,54.0
max min_per_class_accuracy,0.0548615,0.9814815,54.0


Gains/Lift Table: Avg response rate:  7.47 %



,group,cumulative_data_fraction,lower_threshold,lift,cumulative_lift,response_rate,cumulative_response_rate,capture_rate,cumulative_capture_rate,gain,cumulative_gain
,1,0.0110650,0.9916996,13.3888889,13.3888889,1.0,1.0,0.1481481,0.1481481,1238.8888889,1238.8888889
,2,0.0207469,0.9636591,13.3888889,13.3888889,1.0,1.0,0.1296296,0.2777778,1238.8888889,1238.8888889
,3,0.0304288,0.9169179,13.3888889,13.3888889,1.0,1.0,0.1296296,0.4074074,1238.8888889,1238.8888889
,4,0.0401107,0.8332410,13.3888889,13.3888889,1.0,1.0,0.1296296,0.5370370,1238.8888889,1238.8888889
,5,0.0511757,0.6218337,13.3888889,13.3888889,1.0,1.0,0.1481481,0.6851852,1238.8888889,1238.8888889
,6,0.1009682,0.0097704,5.9506173,9.7207002,0.4444444,0.7260274,0.2962963,0.9814815,495.0617284,872.0700152
,7,0.1507607,0.0021819,0.0,6.5101937,0.0,0.4862385,0.0,0.9814815,-100.0,551.0193680
,8,0.2005533,0.0009544,0.0,4.8938697,0.0,0.3655172,0.0,0.9814815,-100.0,389.3869732
,9,0.3001383,0.0004216,0.1859568,3.3317972,0.0138889,0.2488479,0.0185185,1.0,-81.4043210,233.1797235




ModelMetricsBinomial: gbm
** Reported on cross-validation data. **

MSE: 0.017217559004664148
RMSE: 0.13121569648736445
LogLoss: 0.06381041174794941
Mean Per-Class Error: 0.03082729125305017
AUC: 0.9949622606855173
Gini: 0.9899245213710346
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.10430101146906881: 


,0.0,1.0,Error,Rate
0,2651.0,8.0,0.003,(8.0/2659.0)
1,27.0,188.0,0.1256,(27.0/215.0)
Total,2678.0,196.0,0.0122,(35.0/2874.0)


Maximum Metrics: Maximum metrics at their respective thresholds



metric,threshold,value,idx
max f1,0.1043010,0.9148418,183.0
max f2,0.0271962,0.9240622,219.0
max f0point5,0.1872401,0.9478168,168.0
max accuracy,0.1122910,0.9878219,181.0
max precision,0.9999323,1.0,0.0
max recall,0.0002475,1.0,379.0
max specificity,0.9999323,1.0,0.0
max absolute_mcc,0.1043010,0.9094265,183.0
max min_per_class_accuracy,0.0082481,0.9669049,258.0


Gains/Lift Table: Avg response rate:  7.48 %



,group,cumulative_data_fraction,lower_threshold,lift,cumulative_lift,response_rate,cumulative_response_rate,capture_rate,cumulative_capture_rate,gain,cumulative_gain
,1,0.0100905,0.9904483,13.3674419,13.3674419,1.0,1.0,0.1348837,0.1348837,1236.7441860,1236.7441860
,2,0.0201809,0.9735235,13.3674419,13.3674419,1.0,1.0,0.1348837,0.2697674,1236.7441860,1236.7441860
,3,0.0302714,0.9295918,13.3674419,13.3674419,1.0,1.0,0.1348837,0.4046512,1236.7441860,1236.7441860
,4,0.0400139,0.8458191,13.3674419,13.3674419,1.0,1.0,0.1302326,0.5348837,1236.7441860,1236.7441860
,5,0.0501044,0.6203748,13.3674419,13.3674419,1.0,1.0,0.1348837,0.6697674,1236.7441860,1236.7441860
,6,0.1002088,0.0088183,5.8482558,9.6078488,0.4375,0.71875,0.2930233,0.9627907,484.8255814,860.7848837
,7,0.1499652,0.0030320,0.4673931,6.5751686,0.0349650,0.4918794,0.0232558,0.9860465,-53.2606928,557.5168618
,8,0.2000696,0.0015644,0.1856589,4.9750131,0.0138889,0.3721739,0.0093023,0.9953488,-81.4341085,397.5013145
,9,0.2999304,0.0007254,0.0,3.3185993,0.0,0.2482599,0.0,0.9953488,-100.0,231.8599255



Cross-Validation Metrics Summary: 


,mean,sd,cv_1_valid,cv_2_valid,cv_3_valid,cv_4_valid,cv_5_valid,cv_6_valid,cv_7_valid,cv_8_valid,cv_9_valid,cv_10_valid
accuracy,0.9899052,0.0028023,0.9930556,0.9930556,0.9895833,0.9965278,0.989547,0.989547,0.9860627,0.9930314,0.9825784,0.9860627
auc,0.9933760,0.0068440,0.9993165,0.9992866,0.9656528,0.9996582,0.9955245,0.9955403,0.9898799,0.9989259,0.9942714,0.9957035
err,0.0100949,0.0028023,0.0069444,0.0069444,0.0104167,0.0034722,0.0104530,0.0104530,0.0139373,0.0069686,0.0174216,0.0139373
err_count,2.9,0.8031189,2.0,2.0,3.0,1.0,3.0,3.0,4.0,2.0,5.0,4.0
f0point5,0.9433372,0.0196735,0.9545454,0.9523810,0.9433962,0.990566,0.9405941,0.9693878,0.9574468,0.9292035,0.8910891,0.9047619
f1,0.9304918,0.0201798,0.9545454,0.9523810,0.9302326,0.9767442,0.9268293,0.9268293,0.9,0.9545454,0.8780488,0.9047619
f2,0.9189484,0.0293220,0.9545454,0.9523810,0.9174312,0.9633027,0.9134616,0.8878505,0.8490566,0.9813084,0.8653846,0.9047619
lift_top_group,13.374459,0.2138589,13.090909,13.714286,13.090909,13.090909,13.666667,13.045455,13.045455,13.666667,13.666667,13.666667
logloss,0.0638320,0.0171153,0.0484987,0.0278088,0.0884303,0.0286577,0.0823394,0.0556475,0.0856369,0.0446571,0.0922864,0.0843568


Scoring History: 


,timestamp,duration,number_of_trees,training_rmse,training_logloss,training_auc,training_lift,training_classification_error,validation_rmse,validation_logloss,validation_auc,validation_lift,validation_classification_error
,2018-08-22 19:00:07,3 min 19.135 sec,0.0,0.2630823,0.2659034,0.5,1.0,0.9251914,0.2628886,0.2656020,0.5,1.0,0.9253112
,2018-08-22 19:00:07,3 min 19.151 sec,5.0,0.2105940,0.1629371,0.9937562,13.3674419,0.0177453,0.2297363,0.1910848,0.9603471,13.3888889,0.0497925
,2018-08-22 19:00:07,3 min 19.175 sec,10.0,0.1688130,0.1145925,0.9995662,13.3674419,0.0045233,0.2033729,0.1497541,0.9836406,13.3888889,0.0207469
,2018-08-22 19:00:07,3 min 19.199 sec,15.0,0.1368532,0.0843245,0.9999064,13.3674419,0.0024356,0.1865864,0.1265183,0.9834468,13.3888889,0.0193638
,2018-08-22 19:00:07,3 min 19.227 sec,20.0,0.1132756,0.0644020,0.9999825,13.3674419,0.0010438,0.1780982,0.1131609,0.9867685,13.3888889,0.0235131
---,---,---,---,---,---,---,---,---,---,---,---,---,---
,2018-08-22 19:00:08,3 min 19.583 sec,90.0,0.0056482,0.0020549,1.0,13.3674419,0.0,0.1234485,0.0514367,0.9968582,13.3888889,0.0041494
,2018-08-22 19:00:08,3 min 19.603 sec,95.0,0.0043782,0.0015997,1.0,13.3674419,0.0,0.1234707,0.0519864,0.9967475,13.3888889,0.0055325
,2018-08-22 19:00:08,3 min 19.631 sec,100.0,0.0034492,0.0012325,1.0,13.3674419,0.0,0.1214294,0.0504672,0.9968997,13.3888889,0.0055325



See the whole table with table.as_data_frame()

Variable Importances: 


variable,relative_importance,scaled_importance,percentage
v6,102.0492020,1.0,0.1537388
v3,100.5008316,0.9848272,0.1514062
v8_t,96.8396072,0.9489502,0.1458905
v7,84.9129868,0.8320789,0.1279228
v13,70.0270767,0.6862090,0.1054970
v14,62.8388290,0.6157699,0.0946677
v2,35.1191063,0.3441390,0.0529075
v10,30.7486057,0.3013116,0.0463233
v5,24.4349117,0.2394425,0.0368116




__Validation Performance__

In [23]:
preds = aml.predict(valid_h2o)

Parse progress: |█████████████████████████████████████████████████████████| 100%
gbm prediction progress: |████████████████████████████████████████████████| 100%


In [24]:
print(valid_h2o.shape, preds.shape)

(195, 17) (195, 3)


In [43]:
accuracy_score(y_pred=preds['predict'].as_data_frame().values, y_true=y_valid)

0.8666666666666667
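
Accuracy is simply the fraction of rows whose predicted label matches the true label. The sketch below shows that computation in plain Python on a few made-up labels, then backs out how many of the 195 validation rows the reported score corresponds to. The `accuracy` helper and the toy labels are illustrative assumptions, not part of the notebook.

```python
def accuracy(y_true, y_pred):
    """Fraction of positions where the prediction equals the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    matches = sum(t == p for t, p in zip(y_true, y_pred))
    return matches / len(y_true)


# Toy labels (made up) to show the computation: 4 of 5 match.
toy_true = [0, 0, 1, 1, 0]
toy_pred = [0, 1, 1, 1, 0]
toy_accuracy = accuracy(toy_true, toy_pred)

# Back out the counts behind the reported validation score.
n_valid_rows = 195                  # from valid_h2o.shape above
reported = 0.8666666666666667       # accuracy_score output above
n_correct = round(reported * n_valid_rows)
n_wrong = n_valid_rows - n_correct

print(toy_accuracy, n_correct, n_wrong)
```

So the reported score means 169 rows were classified correctly and 26 were not. Accuracy alone does not say how those errors split between the two classes; the `confusion_matrix` and `f1_score` functions imported earlier would show that.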

# Summary

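The best performance with the AutoML classifier was about 86.67% accuracy on the validation dataset (169 of 195 rows), which is the highest performance seen in this exercise. Note that this is well below the roughly 99% mean cross-validated accuracy reported for the leader model on the training data.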
