# Using a Custom Metric Function in Binary Classification

This notebook shows how to compute a custom performance metric for an H2O binary classification model. It goes through the following steps:

1. Train a GBM model in H2O
2. Write a script to calculate a weighted false negative loss
3. Train a GBM model in H2O using this loss function as a [`custom_metric_func`](https://github.com/h2oai/h2o-3/blob/master/h2o-docs/src/dev/custom_functions.md)
4. Train a grid of GBMs and choose a model based on this loss function


## 1. Train a GBM Model in H2O

In [1]:
# Load H2O library
import h2o
h2o.init()

Checking whether there is an H2O instance running at http://localhost:54321..... not found.
Attempting to start a local H2O server...
  Java Version: java version "1.8.0_181"; Java(TM) SE Runtime Environment (build 1.8.0_181-b13); Java HotSpot(TM) 64-Bit Server VM (build 25.181-b13, mixed mode)
  Starting server from /anaconda3/lib/python3.6/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /var/folders/wk/m00ydfj52f9fl7zvx5cjztgc0000gn/T/tmp1vof4pgy
  JVM stdout: /var/folders/wk/m00ydfj52f9fl7zvx5cjztgc0000gn/T/tmp1vof4pgy/h2o_patrickaboyoun_started_from_python.out
  JVM stderr: /var/folders/wk/m00ydfj52f9fl7zvx5cjztgc0000gn/T/tmp1vof4pgy/h2o_patrickaboyoun_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321... successful.


| | |
|---|---|
| H2O cluster uptime: | 01 secs |
| H2O cluster timezone: | America/Los_Angeles |
| H2O data parsing timezone: | UTC |
| H2O cluster version: | 3.20.0.7 |
| H2O cluster version age: | 15 days |
| H2O cluster name: | H2O_from_python_patrickaboyoun_7sdyyd |
| H2O cluster total nodes: | 1 |
| H2O cluster free memory: | 3.556 Gb |
| H2O cluster total cores: | 8 |
| H2O cluster allowed cores: | 8 |


In [2]:
# Import Data
train_path = "https://raw.githubusercontent.com/h2oai/app-consumer-loan/master/data/loan.csv"
train = h2o.import_file(train_path, destination_frame = "loan_train")
train["bad_loan"] = train["bad_loan"].asfactor()

Parse progress: |█████████████████████████████████████████████████████████| 100%


In [3]:
# Set target and predictor variables
y = "bad_loan"
x = train.col_names
x.remove(y)
x.remove("int_rate")

In [4]:
# Train GBM Model
from h2o.estimators import H2OGradientBoostingEstimator

gbm_v1 = H2OGradientBoostingEstimator(model_id = "gbm_v1.hex")

gbm_v1.train(y = y, x = x, training_frame = train)

gbm Model Build progress: |███████████████████████████████████████████████| 100%


In [5]:
print(gbm_v1)

Model Details
H2OGradientBoostingEstimator :  Gradient Boosting Machine
Model Key:  gbm_v1.hex


ModelMetricsBinomial: gbm
** Reported on train data. **

MSE: 0.1363951465191071
RMSE: 0.3693171354257843
LogLoss: 0.43467809200204266
Mean Per-Class Error: 0.3508577460272526
AUC: 0.7079429892082825
Gini: 0.41588597841656494
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.19860658480443358: 


| | 0.0 | 1.0 | Error | Rate |
|---|---|---|---|---|
| 0 | 95113.0 | 38858.0 | 0.29 | (38858.0/133971.0) |
| 1 | 12401.0 | 17615.0 | 0.4131 | (12401.0/30016.0) |
| Total | 107514.0 | 56473.0 | 0.3126 | (51259.0/163987.0) |


Maximum Metrics: Maximum metrics at their respective thresholds



| metric | threshold | value | idx |
|---|---|---|---|
| max f1 | 0.1986066 | 0.4073350 | 228.0 |
| max f2 | 0.1265125 | 0.5633597 | 311.0 |
| max f0point5 | 0.2803267 | 0.3837915 | 152.0 |
| max accuracy | 0.4292597 | 0.8203272 | 62.0 |
| max precision | 0.7770519 | 1.0 | 0.0 |
| max recall | 0.0438524 | 1.0 | 396.0 |
| max specificity | 0.7770519 | 1.0 | 0.0 |
| max absolute_mcc | 0.2251476 | 0.2462816 | 202.0 |
| max min_per_class_accuracy | 0.1823683 | 0.6488206 | 245.0 |


Gains/Lift Table: Avg response rate: 18.30 %, avg score: 18.31 %



| group | cumulative_data_fraction | lower_threshold | lift | cumulative_lift | response_rate | score | cumulative_response_rate | cumulative_score | capture_rate | cumulative_capture_rate | gain | cumulative_gain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0100008 | 0.4806708 | 3.5211761 | 3.5211761 | 0.6445122 | 0.5303616 | 0.6445122 | 0.5303616 | 0.0352146 | 0.0352146 | 252.1176084 | 252.1176084 |
| 2 | 0.0200016 | 0.4375856 | 2.8082795 | 3.1647278 | 0.5140244 | 0.4573837 | 0.5792683 | 0.4938727 | 0.0280850 | 0.0632996 | 180.8279507 | 216.4727796 |
| 3 | 0.0300024 | 0.4110619 | 2.6750278 | 3.0014945 | 0.4896341 | 0.4234982 | 0.5493902 | 0.4704145 | 0.0267524 | 0.0900520 | 167.5027810 | 200.1494467 |
| 4 | 0.0400032 | 0.3902679 | 2.5117945 | 2.8790695 | 0.4597561 | 0.4001474 | 0.5269817 | 0.4528477 | 0.0251199 | 0.1151719 | 151.1794482 | 187.9069471 |
| 5 | 0.0500040 | 0.3731066 | 2.3019231 | 2.7636402 | 0.4213415 | 0.3814565 | 0.5058537 | 0.4385695 | 0.0230211 | 0.1381930 | 130.1923060 | 176.3640189 |
| 6 | 0.1000018 | 0.3147513 | 2.0763146 | 2.4199984 | 0.3800463 | 0.3412588 | 0.4429538 | 0.3899171 | 0.1038113 | 0.2420043 | 107.6314643 | 141.9998372 |
| 7 | 0.1499997 | 0.2770929 | 1.7224882 | 2.1875044 | 0.3152824 | 0.2946252 | 0.4003984 | 0.3581544 | 0.0861207 | 0.328125 | 72.2488239 | 118.7504446 |
| 8 | 0.2000037 | 0.2503682 | 1.5570461 | 2.0298802 | 0.285 | 0.2631637 | 0.3715470 | 0.3344053 | 0.0778585 | 0.4059835 | 55.7046075 | 102.9880242 |
| 9 | 0.2999994 | 0.2116632 | 1.2990293 | 1.7862732 | 0.2377729 | 0.2298703 | 0.3269575 | 0.2995617 | 0.1298974 | 0.5358809 | 29.9029331 | 78.6273176 |



Scoring History: 


| timestamp | duration | number_of_trees | training_rmse | training_logloss | training_auc | training_lift | training_classification_error |
|---|---|---|---|---|---|---|---|
| 2018-09-16 12:47:25 | 0.025 sec | 0.0 | 0.3866984 | 0.4759704 | 0.5 | 1.0 | 0.8169611 |
| 2018-09-16 12:47:26 | 0.755 sec | 1.0 | 0.3847635 | 0.4710759 | 0.6582899 | 2.5923891 | 0.3376304 |
| 2018-09-16 12:47:26 | 1.005 sec | 2.0 | 0.3831611 | 0.4671624 | 0.6641427 | 2.7218454 | 0.3575466 |
| 2018-09-16 12:47:26 | 1.149 sec | 3.0 | 0.3818189 | 0.4639557 | 0.6658226 | 2.8447858 | 0.3506497 |
| 2018-09-16 12:47:26 | 1.263 sec | 4.0 | 0.3806812 | 0.4612690 | 0.6685973 | 2.9280752 | 0.3523572 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 2018-09-16 12:47:29 | 3.777 sec | 30.0 | 0.3714227 | 0.4396077 | 0.6983760 | 3.3546115 | 0.3202998 |
| 2018-09-16 12:47:29 | 3.876 sec | 31.0 | 0.3712694 | 0.4392390 | 0.6990511 | 3.3512802 | 0.3205010 |
| 2018-09-16 12:47:29 | 3.955 sec | 32.0 | 0.3711742 | 0.4390186 | 0.6993193 | 3.3446176 | 0.3253916 |



See the whole table with table.as_data_frame()
Variable Importances: 


| variable | relative_importance | scaled_importance | percentage |
|---|---|---|---|
| term | 2747.9863281 | 1.0 | 0.2493995 |
| annual_inc | 1938.3043213 | 0.7053544 | 0.1759151 |
| addr_state | 1546.3793945 | 0.5627318 | 0.1403451 |
| revol_util | 1427.8056641 | 0.5195825 | 0.1295836 |
| purpose | 934.7630005 | 0.3401629 | 0.0848365 |
| dti | 847.0342407 | 0.3082382 | 0.0768744 |
| loan_amnt | 627.7856445 | 0.2284530 | 0.0569761 |
| emp_length | 249.1807861 | 0.0906776 | 0.0226149 |
| home_ownership | 239.7884827 | 0.0872597 | 0.0217625 |





## 2. Write Script to Calculate Weighted False Negative Loss

### Function to Calculate Normalized Cost Matrix Loss in H2O

In [6]:
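def NormalizedCostMatrixLoss(actual, predicted, cost_tp, cost_tn, cost_fp, cost_fn):
    """Expected cost under a 2x2 cost matrix, normalized by the cost of misclassifying each row.

    actual and predicted are single-column H2OFrames (0/1 label and p1).
    In this notebook cost_fn is a frame column and the other costs are scalars.
    """
    # Coefficients of the expected cost, grouped by actual*predicted, actual, predicted, constant
    c1 = cost_tp + cost_tn - cost_fp - cost_fn
    c2 = cost_fn - cost_tn
    c3 = cost_fp - cost_tn
    c4 = cost_tn

    # Per-row expected cost
    num = (actual * predicted * c1) + (actual * c2) + (predicted * c3) + c4
    # Per-row normalizer: cost_fn for actual positives, cost_fp for actual negatives
    denom = actual.ifelse(cost_fn, cost_fp)
    cost = num.sum() / denom.sum()
    return cost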

In [7]:
train["weight"] = train["loan_amnt"]

In [8]:
loss_v1 = NormalizedCostMatrixLoss(train[y].asnumeric(), gbm_v1.predict(train)["p1"], 5000, 0, 5000, train["weight"])
print("NormalizedCostMatrixLoss: " + str(round(loss_v1, 4)))

gbm prediction progress: |████████████████████████████████████████████████| 100%
NormalizedCostMatrixLoss: 0.4239
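
The coefficients `c1` through `c4` come from expanding the expected cost of one row over the four confusion-matrix cells, with `predicted` treated as the probability of class 1. The sketch below checks that identity in plain Python on a few made-up rows. The names `expected_cost`, `expanded_cost`, and `toy_rows` are illustrative only and are not part of H2O, and the row values are invented.

```python
def expected_cost(y, p, cost_tp, cost_tn, cost_fp, cost_fn):
    """Cost of one row, weighting each confusion-matrix cell by its probability."""
    return (y * p * cost_tp
            + y * (1 - p) * cost_fn
            + (1 - y) * p * cost_fp
            + (1 - y) * (1 - p) * cost_tn)


def expanded_cost(y, p, cost_tp, cost_tn, cost_fp, cost_fn):
    """The same quantity, grouped by y*p, y, p, and a constant term."""
    c1 = cost_tp + cost_tn - cost_fp - cost_fn
    c2 = cost_fn - cost_tn
    c3 = cost_fp - cost_tn
    c4 = cost_tn
    return (y * p * c1) + (y * c2) + (p * c3) + c4


# Made-up rows: (actual class, predicted p1, loan amount used as cost_fn)
toy_rows = [(1, 0.8, 10000.0), (1, 0.1, 4000.0), (0, 0.3, 15000.0), (0, 0.05, 2500.0)]

for y, p, w in toy_rows:
    direct = expected_cost(y, p, cost_tp=5000, cost_tn=0, cost_fp=5000, cost_fn=w)
    grouped = expanded_cost(y, p, cost_tp=5000, cost_tn=0, cost_fp=5000, cost_fn=w)
    # Both forms should agree up to floating-point rounding
    assert abs(direct - grouped) < 1e-9
```

With `cost_tn = 0` and `cost_tp = cost_fp`, both forms reduce to `y * (1 - p) * cost_fn + p * cost_fp`, which is the per-row expression returned by the `map` method of the custom metric class in this notebook.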


### Python Script to calculate Weighted False Negative Loss in custom_metric_func

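The weighted false negative loss metric is defined in a class stored in `utils_model_metrics.py`. The class has three methods: `map`, `reduce`, and `metric`. The `map` method is called per row with five arguments (`predicted`, `actual`, `weight`, `offset`, and `model`) and returns that row's contribution, `reduce` combines two such contributions, and `metric` turns the combined result into the final metric value.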

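```python
class WeightedFalseNegativeLossMetric:
    def map(self, predicted, actual, weight, offset, model):
        # Per-row costs. Only cost_tp is meant to be tuned; the simplified
        # expression returned below assumes the other three stay as they are.
        cost_tp = 5000     # set prior to use
        cost_tn = 0        # do not change
        cost_fp = cost_tp  # do not change
        cost_fn = weight   # do not change; the row weight (here, the loan amount)

        # General cost-matrix form:
        #   c1 = cost_tp + cost_tn - cost_fp - cost_fn
        #   c2 = cost_fn - cost_tn
        #   c3 = cost_fp - cost_tn
        #   c4 = cost_tn
        #   cost = (y * p * c1) + (y * c2) + (p * c3) + c4
        # With the costs above, this reduces to the expression returned below.

        y = actual[0]
        p = predicted[2]  # predicted is [class, p0, p1]
        if y == 1:
            denom = cost_fn
        else:
            denom = cost_fp
        # [row cost, row normalizer]
        return [(y * (1 - p) * cost_fn) + (p * cost_fp), denom]

    def reduce(self, left, right):
        # Combine two partial [cost, normalizer] pairs
        return [left[0] + right[0], left[1] + right[1]]

    def metric(self, last):
        # Final value: total cost divided by total normalizer
        return last[0] / last[1]
```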

This class definition is uploaded to the H2O cluster using [`h2o.upload_custom_metric`](http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/h2o.html?highlight=custom_metric#h2o.upload_custom_metric).

In [9]:
from utils_model_metrics import WeightedFalseNegativeLossMetric

weighted_false_negative_loss_func = h2o.upload_custom_metric(
    WeightedFalseNegativeLossMetric,
    func_name = "WeightedFalseNegativeLoss",
    func_file = "weighted_false_negative_loss.py")

In [10]:
type(weighted_false_negative_loss_func)

str

In [11]:
print(weighted_false_negative_loss_func)

python:WeightedFalseNegativeLoss=weighted_false_negative_loss.WeightedFalseNegativeLossMetricWrapper
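
Before relying on the uploaded metric, it can help to exercise the `map` / `reduce` / `metric` contract locally, without a cluster. The sketch below repeats the class from this section so that it is self-contained and drives it sequentially over a few made-up rows. The helper `score_locally` and the row values are hypothetical, and the sequential fold is only an assumption about the end result: how H2O actually schedules `map` and `reduce` across the cluster is not shown here.

```python
from functools import reduce


class WeightedFalseNegativeLossMetric:
    def map(self, predicted, actual, weight, offset, model):
        cost_tp = 5000
        cost_fp = cost_tp
        cost_fn = weight
        y = actual[0]
        p = predicted[2]  # predicted is [class, p0, p1]
        denom = cost_fn if y == 1 else cost_fp
        return [(y * (1 - p) * cost_fn) + (p * cost_fp), denom]

    def reduce(self, left, right):
        return [left[0] + right[0], left[1] + right[1]]

    def metric(self, last):
        return last[0] / last[1]


def score_locally(metric_obj, rows):
    """Hypothetical helper: apply map to each row, fold with reduce, finish with metric."""
    # offset=0.0 and model=None are placeholders; this metric ignores both
    partials = [metric_obj.map(pred, act, w, 0.0, None) for pred, act, w in rows]
    total = reduce(metric_obj.reduce, partials)
    return metric_obj.metric(total)


# Made-up rows: ([class, p0, p1], [actual], weight)
rows = [
    ([1, 0.2, 0.8], [1], 10000.0),
    ([0, 0.9, 0.1], [1], 4000.0),
    ([0, 0.7, 0.3], [0], 15000.0),
    ([0, 0.95, 0.05], [0], 2500.0),
]

loss = score_locally(WeightedFalseNegativeLossMetric(), rows)
print("Local WeightedFalseNegativeLoss:", round(loss, 4))
```

Because `reduce` only adds the two components, the order in which partial results are combined does not change the final ratio, apart from floating-point rounding.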


## 3. Train a GBM Model using custom_metric_func

The [`H2OGeneralizedLinearEstimator`](http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html?highlight=automl#h2ogeneralizedlinearestimator),
[`H2ORandomForestEstimator`](http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html?highlight=automl#h2orandomforestestimator), and
[`H2OGradientBoostingEstimator`](http://docs.h2o.ai/h2o/latest-stable/h2o-py/docs/modeling.html?highlight=automl#h2ogradientboostingestimator) models accept a `custom_metric_func` argument.

In [12]:
# Train GBM Model with custom_metric_func
gbm_v2 = H2OGradientBoostingEstimator(model_id = "gbm_v2.hex",
                                      custom_metric_func = weighted_false_negative_loss_func)

gbm_v2.train(y = y, x = x, training_frame = train, weights_column = "weight")

gbm Model Build progress: |███████████████████████████████████████████████| 100%


In [13]:
perf = gbm_v2.model_performance()
perf


ModelMetricsBinomial: gbm
** Reported on train data. **

MSE: 0.14194205086753872
RMSE: 0.376751975266937
LogLoss: 0.44785973758467373
Mean Per-Class Error: 0.340920265265635
AUC: 0.7186517880386486
Gini: 0.4373035760772972
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.2208251278711907: 


| | 0.0 | 1.0 | Error | Rate |
|---|---|---|---|---|
| 0 | 1256403850.0 | 465408000.0 | 0.2703 | (465408000.0/1721811850.0) |
| 1 | 175106300.0 | 247075625.0 | 0.4148 | (175106300.0/422181925.0) |
| Total | 1431510150.0 | 712483625.0 | 0.2987 | (640514300.0/2143993775.0) |


Maximum Metrics: Maximum metrics at their respective thresholds



| metric | threshold | value | idx |
|---|---|---|---|
| max f1 | 0.2208251 | 0.4355039 | 225.0 |
| max f2 | 0.1338078 | 0.5862034 | 314.0 |
| max f0point5 | 0.3111775 | 0.4160830 | 149.0 |
| max accuracy | 0.4219306 | 0.8100069 | 81.0 |
| max precision | 0.8482319 | 1.0 | 0.0 |
| max recall | 0.0456742 | 1.0 | 398.0 |
| max specificity | 0.8482319 | 1.0 | 0.0 |
| max absolute_mcc | 0.2527278 | 0.2696590 | 196.0 |
| max min_per_class_accuracy | 0.1981595 | 0.6573922 | 246.0 |


Gains/Lift Table: Avg response rate: 19.69 %, avg score: 19.70 %



| group | cumulative_data_fraction | lower_threshold | lift | cumulative_lift | response_rate | score | cumulative_response_rate | cumulative_score | capture_rate | cumulative_capture_rate | gain | cumulative_gain |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 0.0100040 | 0.4989871 | 3.6195299 | 3.6195299 | 0.7127353 | 0.5558560 | 0.7127353 | 0.5558560 | 0.0362099 | 0.0362099 | 261.9529924 | 261.9529924 |
| 2 | 0.0200064 | 0.4620542 | 3.0621888 | 3.3408834 | 0.6029872 | 0.4785761 | 0.6578660 | 0.5172193 | 0.0306290 | 0.0668389 | 206.2188848 | 234.0883424 |
| 3 | 0.0300004 | 0.4366559 | 2.6700122 | 3.1173954 | 0.5257622 | 0.4486896 | 0.6138581 | 0.4943900 | 0.0266843 | 0.0935232 | 167.0012195 | 211.7395446 |
| 4 | 0.0400071 | 0.4169808 | 2.4709308 | 2.9557003 | 0.4865603 | 0.4265782 | 0.5820181 | 0.4774287 | 0.0247258 | 0.1182490 | 147.0930841 | 195.5700253 |
| 5 | 0.0500068 | 0.4020904 | 2.2519128 | 2.8149662 | 0.4434327 | 0.4091541 | 0.5543056 | 0.4637761 | 0.0225184 | 0.1407674 | 125.1912818 | 181.4966201 |
| 6 | 0.1000023 | 0.3450399 | 2.1048786 | 2.4599624 | 0.4144796 | 0.3710892 | 0.4844005 | 0.4174379 | 0.1052345 | 0.2460019 | 110.4878588 | 145.9962386 |
| 7 | 0.1500058 | 0.3063922 | 1.8004637 | 2.2401226 | 0.3545361 | 0.3246164 | 0.4411110 | 0.3864964 | 0.0900295 | 0.3360314 | 80.0463721 | 124.0122594 |
| 8 | 0.2000091 | 0.2763789 | 1.5714338 | 2.0729470 | 0.3094370 | 0.2909867 | 0.4081918 | 0.3626185 | 0.0785768 | 0.4146082 | 57.1433818 | 107.2947018 |
| 9 | 0.3000010 | 0.2319078 | 1.3303017 | 1.8254195 | 0.2619547 | 0.2531673 | 0.3594503 | 0.3261378 | 0.1330194 | 0.5476276 | 33.0301710 | 82.5419471 |



WeightedFalseNegativeLoss: 0.42381109064735906




In [14]:
perf.custom_metric_name()

'WeightedFalseNegativeLoss'

In [15]:
perf.custom_metric_value()

0.42381109064735906

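The custom weighted false negative loss now appears in the model performance metrics under the label `WeightedFalseNegativeLoss`. Its value is very close to the loss computed directly for the first GBM in step 2. The two need not be identical, because `gbm_v2` was trained with `weights_column = "weight"` while `gbm_v1` was trained without weights.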

In [16]:
print("Loss V1: " + str(round(loss_v1, 4)))
print("Loss V2: " + str(round(gbm_v2.model_performance().custom_metric_value(), 4)))

Loss V1: 0.4239
Loss V2: 0.4238


## 4. Train a Grid of GBMs and Choose a Model Based on the Custom Loss Metric

In [17]:
from h2o.grid.grid_search import H2OGridSearch
gbm_hyper_parameters = {'max_depth': [4, 5, 6]}
gbm_grid = H2OGridSearch(H2OGradientBoostingEstimator(custom_metric_func = weighted_false_negative_loss_func,
                                                      nfolds = 5),
                         gbm_hyper_parameters)
gbm_grid.train(x = x, y = y, training_frame = train, weights_column = "weight", grid_id = "gbm_grid")

gbm Grid Build progress: |████████████████████████████████████████████████| 100%


In [18]:
# Rank grid models by cross-validated custom loss, lowest first
sorted([[h2o.get_model(model_id).model_performance(xval = True).custom_metric_value(), model_id]
        for model_id in gbm_grid.model_ids])

[[0.423955824695979, 'gbm_grid_model_2'],
 [0.42850913512546235, 'gbm_grid_model_1'],
 [0.4308449289981675, 'gbm_grid_model_0']]
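
Since the metric is a loss, the model to keep is the one with the smallest cross-validated value. A minimal sketch of that selection step, using the `[loss, model_id]` pairs copied from the output above; the variable names `grid_losses`, `best_loss`, and `best_model_id` are illustrative:

```python
# [cross-validated loss, model id] pairs copied from the sorted output above
grid_losses = [
    [0.423955824695979, "gbm_grid_model_2"],
    [0.42850913512546235, "gbm_grid_model_1"],
    [0.4308449289981675, "gbm_grid_model_0"],
]

# Pick the entry with the lowest loss; the key makes the intent explicit
best_loss, best_model_id = min(grid_losses, key=lambda pair: pair[0])
print(best_model_id, best_loss)
```

With a running cluster, passing `best_model_id` to `h2o.get_model`, as in the cell above, would retrieve the selected model.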

## Shut Down H2O Cluster

In [19]:
h2o.cluster().shutdown()

H2O session _sid_9b1d closed.
