# Exercise 7: AutoML

AutoML refers to...

https://www.ml4aad.org/automl/

###### (a) Choose an AutoML package. Justify your choice.

We chose _H2O AutoML_ because it was the easiest to install on different platforms, which matters for the reproducibility of our results.

With _auto-sklearn_, for example, we lacked the permissions to install the required dependencies on SDIL.
Other AutoML libraries are only distributed as a git repository, so they cannot simply be installed via a `requirements.txt`.

In addition, the documentation for _H2O AutoML_ seemed very clear at first glance.

###### (b) Perform the classification task from Exercise 3 with AutoML. Compare the results with those from Exercise 3.

http://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html

In [1]:
# Import Libraries
import time
import multiprocessing

import pandas as pd
import numpy as np

from sklearn import svm
from sklearn import model_selection
from sklearn.model_selection import GridSearchCV

import h2o
from h2o.automl import H2OAutoML
from h2o.frame import H2OFrame

from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from sklearn.metrics import accuracy_score

In [2]:
# initialize h2o
h2o.init()

Checking whether there is an H2O instance running at http://localhost:54321. connected.


H2O cluster uptime:,15 mins 08 secs
H2O cluster timezone:,Europe/Berlin
H2O data parsing timezone:,UTC
H2O cluster version:,3.22.1.2
H2O cluster version age:,4 months !!!
H2O cluster name:,H2O_from_python_carmen_wp3in5
H2O cluster total nodes:,1
H2O cluster free memory:,3.851 Gb
H2O cluster total cores:,8
H2O cluster allowed cores:,8


In [3]:
# Load dataset iris
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
names = ['sepal-length', 'sepal-width', 'petal-length', 'petal-width', 'class']
iris = pd.read_csv(url, names=names)

In [4]:
# Split-out validation dataset
array = iris.values
X = array[:,0:4]
Y = array[:,4]
test_size = 0.20
seed = 7
X_train, X_test, y_train, y_test = model_selection.train_test_split(X, Y, test_size=test_size, random_state=seed)

In [5]:
# prepare data by creating a train and test array and transforming it into an h2o frame
train_array = np.column_stack([X_train, y_train])
test_array = np.column_stack([X_test, y_test])

train = H2OFrame(python_obj=train_array, column_names=names)
test = H2OFrame(python_obj=test_array, column_names=names)

Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%


In [6]:
# Identify predictors and response
y = "class"
# build a new list so that `names` itself is not modified
x = [name for name in names if name != y]

In [7]:
# set number of models to 20 and start training
aml = H2OAutoML(max_models=20, seed=1, nfolds=10)
aml.train(x=x, y=y, training_frame=train)

AutoML progress: |████████████████████████████████████████████████████████| 100%


In [8]:
# View the AutoML Leaderboard
lb = aml.leaderboard
display(lb.head(rows=lb.nrows)) # Print all rows instead of default (10 rows)

model_id,mean_per_class_error,logloss,rmse,mse
DeepLearning_grid_1_AutoML_20190519_140823_model_5,0.00877193,0.106439,0.102452,0.0104964
DeepLearning_grid_1_AutoML_20190519_140823_model_2,0.00877193,0.0562566,0.127838,0.0163425
DRF_1_AutoML_20190519_140823,0.017094,0.0615018,0.135776,0.0184352
XRT_1_AutoML_20190519_140823,0.017094,0.0564404,0.130203,0.0169529
DeepLearning_grid_1_AutoML_20190519_140823_model_1,0.0173189,0.252379,0.129633,0.0168047
StackedEnsemble_BestOfFamily_AutoML_20190519_140823,0.0173189,0.142473,0.154669,0.0239225
StackedEnsemble_AllModels_AutoML_20190519_140823,0.0173189,0.129481,0.146512,0.0214658
DeepLearning_grid_1_AutoML_20190519_140823_model_4,0.0175439,0.0540176,0.130054,0.016914
DeepLearning_grid_1_AutoML_20190519_140823_model_3,0.0175439,0.382843,0.145793,0.0212555
DeepLearning_1_AutoML_20190519_140823,0.0258659,0.151966,0.186451,0.0347641
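
The leaderboard above is ordered by `mean_per_class_error`, which, to our understanding, is the default ranking metric of H2O AutoML for multiclass problems. As a minimal sketch in plain Python (the values are copied from the table above; the tuple layout and variable names are our own and not part of the H2O API), the following shows that ranking by `logloss` instead would put a different model first:

```python
# Leaderboard rows copied from the output above:
# (model_id, mean_per_class_error, logloss, rmse)
leaderboard = [
    ("DeepLearning_grid_1_AutoML_20190519_140823_model_5", 0.00877193, 0.106439, 0.102452),
    ("DeepLearning_grid_1_AutoML_20190519_140823_model_2", 0.00877193, 0.0562566, 0.127838),
    ("DRF_1_AutoML_20190519_140823", 0.017094, 0.0615018, 0.135776),
    ("XRT_1_AutoML_20190519_140823", 0.017094, 0.0564404, 0.130203),
    ("DeepLearning_grid_1_AutoML_20190519_140823_model_1", 0.0173189, 0.252379, 0.129633),
    ("StackedEnsemble_BestOfFamily_AutoML_20190519_140823", 0.0173189, 0.142473, 0.154669),
    ("StackedEnsemble_AllModels_AutoML_20190519_140823", 0.0173189, 0.129481, 0.146512),
    ("DeepLearning_grid_1_AutoML_20190519_140823_model_4", 0.0175439, 0.0540176, 0.130054),
    ("DeepLearning_grid_1_AutoML_20190519_140823_model_3", 0.0175439, 0.382843, 0.145793),
    ("DeepLearning_1_AutoML_20190519_140823", 0.0258659, 0.151966, 0.186451),
]

# sorted() is stable, so rows with equal error keep the order of the table
by_error = sorted(leaderboard, key=lambda row: row[1])
by_logloss = sorted(leaderboard, key=lambda row: row[2])

print("best by mean_per_class_error:", by_error[0][0])
print("best by logloss:", by_logloss[0][0])
```

The choice of ranking metric therefore decides which model ends up as `aml.leader`.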




In [10]:
# The leader model is stored here
display(aml.leader)

Model Details
H2ODeepLearningEstimator :  Deep Learning
Model Key:  DeepLearning_grid_1_AutoML_20190519_140823_model_5


ModelMetricsMultinomial: deeplearning
** Reported on train data. **

MSE: 0.0003593883782032384
RMSE: 0.018957541459884465
LogLoss: 0.0021525641996402193
Mean Per-Class Error: 0.0
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class



Iris-setosa,Iris-versicolor,Iris-virginica,Error,Rate
43.0,0.0,0.0,0.0,0 / 43
0.0,38.0,0.0,0.0,0 / 38
0.0,0.0,39.0,0.0,0 / 39
43.0,38.0,39.0,0.0,0 / 120


Top-3 Hit Ratios: 


k,hit_ratio
1,1.0
2,1.0
3,1.0



ModelMetricsMultinomial: deeplearning
** Reported on cross-validation data. **

MSE: 0.010496439506336861
RMSE: 0.10245213275640903
LogLoss: 0.10643940259325049
Mean Per-Class Error: 0.008771929824561403
Confusion Matrix: Row labels: Actual class; Column labels: Predicted class



Iris-setosa,Iris-versicolor,Iris-virginica,Error,Rate
43.0,0.0,0.0,0.0,0 / 43
0.0,37.0,1.0,0.0263158,1 / 38
0.0,0.0,39.0,0.0,0 / 39
43.0,37.0,40.0,0.0083333,1 / 120


Top-3 Hit Ratios: 


k,hit_ratio
1,0.9916667
2,1.0
3,1.0
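
The cross-validation figures above can be recomputed from the confusion matrix. A small sketch in plain Python (the counts are taken from the matrix above; the list layout and variable names are ours):

```python
# Cross-validation confusion matrix from above (rows: actual, columns: predicted)
cm = [
    [43, 0, 0],   # Iris-setosa
    [0, 37, 1],   # Iris-versicolor
    [0, 0, 39],   # Iris-virginica
]

total = sum(sum(row) for row in cm)
correct = sum(cm[i][i] for i in range(len(cm)))

# Per-class error: share of misclassified samples within each actual class
per_class_error = [1 - cm[i][i] / sum(cm[i]) for i in range(len(cm))]
mean_per_class_error = sum(per_class_error) / len(per_class_error)

# The top-1 hit ratio is the overall share of correct predictions
top1_hit_ratio = correct / total

print(round(mean_per_class_error, 8))
print(round(top1_hit_ratio, 7))
```

The single misclassified versicolor sample accounts for both values: 1/38 averaged over three classes, and 119 of 120 correct predictions.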


Cross-Validation Metrics Summary: 


,mean,sd,cv_1_valid,cv_2_valid,cv_3_valid,cv_4_valid,cv_5_valid,cv_6_valid,cv_7_valid,cv_8_valid,cv_9_valid,cv_10_valid
accuracy,0.9916667,0.0176777,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.9166667,1.0
err,0.0083333,0.0176777,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0833333,0.0
err_count,0.1,0.2121320,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,0.0
logloss,0.1064394,0.2066515,0.0000013,0.0004192,0.0025991,0.0001614,0.0004910,0.0000004,0.0803682,0.0000511,0.9802672,0.0000351
max_per_class_error,0.0166667,0.0353553,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.1666667,0.0
mean_per_class_accuracy,0.9944444,0.0117851,1.0,1.0,1.0,1.0,1.0,1.0,1.0,1.0,0.9444444,1.0
mean_per_class_error,0.0055556,0.0117851,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0555556,0.0
mse,0.0104964,0.0177604,0.0000000,0.0000010,0.0000708,0.0000003,0.0000012,0.0000000,0.0215480,0.0000000,0.0833431,0.0000000
r2,0.9800838,0.0352580,1.0,0.9999983,0.9998501,0.9999995,0.999998,1.0,0.967678,0.9999999,0.8333138,1.0
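
The `mean` column of this summary appears to average the per-fold values, which is not the same as the pooled metric shown under "Reported on cross-validation data". A short sketch under that reading (fold values copied from the table above; variable names are ours):

```python
# Per-fold values copied from the summary table above (folds 1 to 10)
fold_accuracy = [1.0] * 8 + [0.9166667, 1.0]
fold_mean_per_class_error = [0.0] * 8 + [0.0555556, 0.0]

# Averaging over folds gives every fold the same weight
mean_accuracy = sum(fold_accuracy) / len(fold_accuracy)
mean_error = sum(fold_mean_per_class_error) / len(fold_mean_per_class_error)

# Pooled value as reported above under "Reported on cross-validation data"
pooled_error = 0.008771929824561403

print(round(mean_accuracy, 7))
print(round(mean_error, 7))
print(round(pooled_error - mean_error, 7))
```

The leaderboard lists the pooled value (0.00877193), not the fold average (0.0055556), so the two numbers should not be compared directly.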


Scoring History: 


,timestamp,duration,training_speed,epochs,iterations,samples,training_rmse,training_logloss,training_r2,training_classification_error
,2019-05-19 14:29:19,0.000 sec,,0.0,0,0.0,,,,
,2019-05-19 14:29:19,55.065 sec,63157 obs/sec,10.0,1,1200.0,0.2599565,0.2638794,0.9009452,0.0916667
,2019-05-19 14:29:24,1 min 0.071 sec,141878 obs/sec,5940.0,594,712800.0,0.0189575,0.0021526,0.9994732,0.0


Variable Importances: 


variable,relative_importance,scaled_importance,percentage
petal-width,1.0,1.0,0.3965032
petal-length,0.7063910,0.7063910,0.2800863
sepal-width,0.4085257,0.4085257,0.1619818
sepal-length,0.4071308,0.4071308,0.1614287




In [11]:
# predict on test data
predictions = aml.leader.predict(test).as_data_frame()['predict'].tolist()

deeplearning prediction progress: |███████████████████████████████████████| 100%


In [12]:
# get target for test data
y_test = test.as_data_frame()['class'].tolist()

In [13]:
# print results
print(accuracy_score(y_test, predictions))
print(confusion_matrix(y_test, predictions))
print(classification_report(y_test, predictions))

0.9
[[ 7  0  0]
 [ 0 11  1]
 [ 0  2  9]]
                 precision    recall  f1-score   support

    Iris-setosa       1.00      1.00      1.00         7
Iris-versicolor       0.85      0.92      0.88        12
 Iris-virginica       0.90      0.82      0.86        11

       accuracy                           0.90        30
      macro avg       0.92      0.91      0.91        30
   weighted avg       0.90      0.90      0.90        30



AutoML ranks the DeepLearning approach as the most suitable one. It reaches an accuracy of 0.9 on the test data, which is below the accuracy of the best model from Exercise 3 (0.967).
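
The accuracy of 0.9 and the per-class precision and recall in the report follow directly from the confusion matrix printed above. A minimal sketch in plain Python (the counts are copied from the output; variable names are ours):

```python
labels = ["Iris-setosa", "Iris-versicolor", "Iris-virginica"]

# Test-set confusion matrix from above (rows: actual, columns: predicted)
cm = [
    [7, 0, 0],
    [0, 11, 1],
    [0, 2, 9],
]

n = len(labels)
total = sum(sum(row) for row in cm)
accuracy = sum(cm[i][i] for i in range(n)) / total
print(f"accuracy: {accuracy:.2f}")

precision = {}
recall = {}
for i, label in enumerate(labels):
    predicted_as_i = sum(cm[r][i] for r in range(n))  # column sum
    actual_i = sum(cm[i])                             # row sum
    precision[label] = cm[i][i] / predicted_as_i
    recall[label] = cm[i][i] / actual_i
    print(f"{label}: precision={precision[label]:.2f}, recall={recall[label]:.2f}")
```

All three errors are confusions between versicolor and virginica, and with only 30 test samples each error changes the accuracy by about 0.033.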

# Second dataset

In [14]:
# Load dataset
url_heart = "./data/heart.csv"
names_heart =  ['age', 'sex', 'chest_pain_type', 'resting_blood_pressure', 'cholesterol', 
                 'fasting_blood_sugar', 'rest_ecg', 'max_heart_rate_achieved', 'exercise_induced_angina',
                 'st_depression', 'st_slope', 'num_major_vessels', 'thalassemia', 'target']
heart = pd.read_csv(url_heart, names=names_heart)

In [15]:
# Split-out validation dataset
array_heart = heart.values
# the heart dataset has 13 predictors (columns 0-12); the target is column 13
X_heart = array_heart[:,0:13]
Y_heart = array_heart[:,13]
test_size_heart = 0.20
seed_heart = 7
X_heart_train, X_heart_test, y_heart_train, y_heart_test = model_selection.train_test_split(X_heart, Y_heart, test_size=test_size_heart, random_state=seed_heart)
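
Because the heart dataset has 13 predictors with the target in the last column, the column positions used for slicing can be derived from the column names rather than written by hand. A minimal sketch, assuming the column order given in `names_heart` above (`target_idx` and `feature_idx` are illustrative names, not part of the notebook):

```python
names_heart = ['age', 'sex', 'chest_pain_type', 'resting_blood_pressure', 'cholesterol',
               'fasting_blood_sugar', 'rest_ecg', 'max_heart_rate_achieved', 'exercise_induced_angina',
               'st_depression', 'st_slope', 'num_major_vessels', 'thalassemia', 'target']

# Position of the response column and of all remaining (predictor) columns
target_idx = names_heart.index("target")
feature_idx = [i for i in range(len(names_heart)) if i != target_idx]

print(target_idx)
print(len(feature_idx))

# With the array from the cell above, the split would then read:
# X_heart = array_heart[:, feature_idx]
# Y_heart = array_heart[:, target_idx]
```

This keeps the slicing correct even if columns are added or reordered.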

In [16]:
# prepare data by creating a train and test array and transforming it into an h2o frame
train_heart_array = np.column_stack([X_heart_train, y_heart_train])
test_heart_array = np.column_stack([X_heart_test, y_heart_test])

train_heart = H2OFrame(python_obj=train_heart_array, column_names=names_heart)
test_heart = H2OFrame(python_obj=test_heart_array, column_names=names_heart)

Parse progress: |█████████████████████████████████████████████████████████| 100%
Parse progress: |█████████████████████████████████████████████████████████| 100%


In [17]:
# Identify predictors and response
y_heart = "target"
# build a new list so that `names_heart` itself is not modified
x_heart = [name for name in names_heart if name != y_heart]

In [None]:
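# a numeric target would be treated as a regression problem, so convert it to a factor first
train_heart[y_heart] = train_heart[y_heart].asfactor()
# set number of models to 20 and start training
aml_heart = H2OAutoML(max_models=20, seed=1, nfolds=10)
aml_heart.train(x=x_heart, y=y_heart, training_frame=train_heart)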

AutoML progress: |██████████████████████████████████████████████

In [None]:
# View the AutoML Leaderboard
lb_heart = aml_heart.leaderboard
display(lb_heart.head(rows=lb_heart.nrows)) # Print all rows instead of default (10 rows)

In [None]:
# The leader model is stored here
display(aml_heart.leader)

In [None]:
# predict on test data
predictions_heart = aml_heart.leader.predict(test_heart).as_data_frame()['predict'].tolist()

In [None]:
# get target for test data
y_heart_test = test_heart.as_data_frame()['target'].tolist()

In [None]:
# print results
print(accuracy_score(y_heart_test, predictions_heart))
print(confusion_matrix(y_heart_test, predictions_heart))
print(classification_report(y_heart_test, predictions_heart))

# TODO: Compare the results

###### (c) What is your opinion of AutoML?

AutoML takes a lot of work off your hands, which is especially helpful for beginners. However, you also give up some control. On the iris dataset, the results appear to be worse than in Exercise 3 (test accuracy 0.9 versus 0.967). It is hard to understand why, though, because everything happens automatically. It is also unclear why these particular models were tried.

In addition, the output of `aml.leader` is rendered oddly. Unfortunately, we did not find a fix for this.