A customer reports that, for a binary classification problem, the threshold used to binarize the predictions differs between the actual model (which uses whatever threshold maximises F1) and the MOJO (which appears to use a fixed 0.5). I was able to reproduce this on 3.28.0.4:
{code:r}
library(tidyverse)
library(h2o)
h2o.init()
cc <- read_csv('~/Downloads/creditcard_train_cat.csv')
cc_h2o <- as.h2o(cc)
cc_h2o['DEFAULT_PAYMENT_NEXT_MONTH'] = h2o.asfactor(cc_h2o['DEFAULT_PAYMENT_NEXT_MONTH'])
drf <- h2o.randomForest(y = 'DEFAULT_PAYMENT_NEXT_MONTH', training_frame = cc_h2o)
dir <- '/Users/vaclav/Downloads/'
mojo_file <- h2o.download_mojo(drf, dir)
mojo <- h2o.import_mojo(paste0(dir, mojo_file))
pred_h2o <- predict(drf, cc_h2o)
pred_mojo <- predict(mojo, cc_h2o)
all(pred_h2o$p0 == pred_mojo$p0) # TRUE
all(pred_h2o$predict == pred_mojo$predict) # FALSE! <- the thresholds must differ
pred_h2o$predict_mojo <- pred_mojo$predict
pred_df <- as.data.frame(pred_h2o)
max(filter(pred_df, predict == 1)$p0)       # [1] 0.7108447
max(filter(pred_df, predict_mojo == 1)$p0)  # [1] 0.499601
{code}
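To make the suspected mechanism concrete, here is a minimal base-R sketch (no H2O required) of how a max-F1 threshold and a fixed 0.5 threshold can produce different labels from identical probabilities. The helpers `binarize`, `f1_score` and `max_f1_threshold` are hypothetical names, the toy data is made up, and the `>=` comparison is an assumption about how the cut-off is applied; this is not H2O's implementation.

```r
# Hypothetical helper: label a row 1 when p1 meets the threshold
# (the >= comparison is an assumption, not taken from H2O's code).
binarize <- function(p1, threshold) as.integer(p1 >= threshold)

# Hypothetical helper: F1 of predicted 0/1 labels against actual 0/1 labels.
f1_score <- function(actual, predicted) {
  tp <- sum(predicted == 1 & actual == 1)
  fp <- sum(predicted == 1 & actual == 0)
  fn <- sum(predicted == 0 & actual == 1)
  if (tp == 0) return(0)
  2 * tp / (2 * tp + fp + fn)
}

# Hypothetical helper: try each observed probability as a candidate
# threshold and return the one with the highest F1.
max_f1_threshold <- function(actual, p1) {
  candidates <- sort(unique(p1))
  scores <- vapply(
    candidates,
    function(t) f1_score(actual, binarize(p1, t)),
    numeric(1)
  )
  candidates[which.max(scores)]
}

# Toy data, invented for illustration (not the credit card dataset).
actual <- c(0, 0, 0, 1, 0, 1, 1, 1)
p1     <- c(0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.60, 0.90)

thr       <- max_f1_threshold(actual, p1)
pred_f1   <- binarize(p1, thr)   # labels under the max-F1 threshold
pred_half <- binarize(p1, 0.5)   # labels under a fixed 0.5 threshold

# Rows where the two thresholds disagree although p1 is identical.
which(pred_f1 != pred_half)
```

On this toy data the max-F1 threshold falls below 0.5, so rows with `p1` between the two cut-offs are labelled 1 by one rule and 0 by the other. That is the same pattern as in the reproduction above, where `p0` matches exactly but `predict` does not.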