When a model trained with keep_cross_validation_predictions = TRUE is saved in the binary format, the model is saved but its cross-validation predictions are not. As a result, you cannot save models and later use them to train a stacked ensemble. The model stores only the key name of the predictions frame, and once the cluster is shut down, that key is no longer valid.
Example:
{code}
fit <- h2o.gbm(y = 5, training_frame = as.h2o(train), nfolds = 3, keep_cross_validation_predictions = TRUE)
|========================================================================================================| 100%
h2o.saveModel(fit, path = "/Users/me/Downloads/foocv/")
[1] "/Users/me/Downloads/foocv/GBM_model_R_1536894672242_603"
h2o.shutdown()
Are you sure you want to shutdown the H2O instance running at http://localhost:54321/ (Y/N)? y
[1] TRUE
rm(list=ls())
h2o.init()
H2O is not running yet, starting it now...
Note: In case of errors look at the following log files:
/var/folders/gj/cm0k4b_s42j30zs376cq_5hh0000gn/T//Rtmp7Grlq4/h2o_me_started_from_r.out
/var/folders/gj/cm0k4b_s42j30zs376cq_5hh0000gn/T//Rtmp7Grlq4/h2o_me_started_from_r.err
java version "1.8.0_172"
Java(TM) SE Runtime Environment (build 1.8.0_172-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.172-b11, mixed mode)
Starting H2O JVM and connecting: . Connection successful!
R is connected to the H2O cluster:
H2O cluster uptime: 1 seconds 590 milliseconds
H2O cluster timezone: America/Los_Angeles
H2O data parsing timezone: UTC
H2O cluster version: 3.21.0.99999
H2O cluster version age: 52 minutes
H2O cluster name: H2O_started_from_R_me_zdk319
H2O cluster total nodes: 1
H2O cluster total memory: 3.56 GB
H2O cluster total cores: 8
H2O cluster allowed cores: 8
H2O cluster healthy: TRUE
H2O Connection ip: localhost
H2O Connection port: 54321
H2O Connection proxy: NA
H2O Internal Security: FALSE
H2O API Extensions: XGBoost, Algos, AutoML, Core V3, Core V4
R Version: R version 3.5.0 (2018-04-23)
fit <- h2o.loadModel("/Users/me/Downloads/foocv/GBM_model_R_1536894672242_603")
fit@model$cross_validation_predictions
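[[1]]
[[1]]$__meta
[[1]]$__meta$schema_version
[1] 3
[[1]]$__meta$schema_name
[1] "FrameKeyV3"
[[1]]$__meta$schema_type
[1] "Key"
[[1]]$name
[1] "prediction_GBM_model_R_1536894672242_603_cv_1"
[[1]]$type
[1] "Key"
[[1]]$URL
[1] "/3/Frames/prediction_GBM_model_R_1536894672242_603_cv_1"
[[2]]
[[2]]$__meta
[[2]]$__meta$schema_version
[1] 3
[[2]]$__meta$schema_name
[1] "FrameKeyV3"
[[2]]$__meta$schema_type
[1] "Key"
[[2]]$name
[1] "prediction_GBM_model_R_1536894672242_603_cv_2"
[[2]]$type
[1] "Key"
[[2]]$URL
[1] "/3/Frames/prediction_GBM_model_R_1536894672242_603_cv_2"
[[3]]
[[3]]$__meta
[[3]]$__meta$schema_version
[1] 3
[[3]]$__meta$schema_name
[1] "FrameKeyV3"
[[3]]$__meta$schema_type
[1] "Key"
[[3]]$name
[1] "prediction_GBM_model_R_1536894672242_603_cv_3"
[[3]]$type
[1] "Key"
[[3]]$URL
[1] "/3/Frames/prediction_GBM_model_R_1536894672242_603_cv_3"
[1] "FrameKeyV3"
[1] "Key"
$name
[1] "cv_holdout_prediction_GBM_model_R_1536894672242_603"
$type
[1] "Key"
$URL
[1] "/3/Frames/cv_holdout_prediction_GBM_model_R_1536894672242_603"
{code}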
I'm not sure whether we should save the predictions along with the binary model (if they were kept), or write client-side wrapper functions that save the CV predictions frame separately at the same path (e.g. model_id.csv) and load it back when the model is loaded.
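For illustration, the client-side wrapper idea could look roughly like the sketch below. The helper names (`cv_preds_path`, `save_model_with_cv_preds`, `load_model_with_cv_preds`) and the `_cv_holdout.csv` suffix are hypothetical and not part of the h2o package. The sketch only uses existing h2o R functions (`h2o.saveModel`, `h2o.cross_validation_holdout_predictions`, `h2o.exportFile`, `h2o.loadModel`, `h2o.importFile`). It assumes a running cluster, and it has not been verified that stacked ensemble training will accept the re-imported frame; a CSV round trip may also change column types.

```r
# Hypothetical helpers, not part of the h2o package.
# Assumed naming convention: the CV holdout predictions are stored next to
# the binary model as "<model path>_cv_holdout.csv".
cv_preds_path <- function(model_path) {
  paste0(model_path, "_cv_holdout.csv")
}

# Save the binary model and, separately, its combined CV holdout predictions.
# Requires the model to have been trained with
# keep_cross_validation_predictions = TRUE.
save_model_with_cv_preds <- function(model, dir) {
  model_path <- h2o::h2o.saveModel(model, path = dir, force = TRUE)
  preds <- h2o::h2o.cross_validation_holdout_predictions(model)
  h2o::h2o.exportFile(preds, path = cv_preds_path(model_path), force = TRUE)
  model_path
}

# Load the binary model, then re-import the predictions under the key name
# that the saved model still references.
load_model_with_cv_preds <- function(model_path) {
  model <- h2o::h2o.loadModel(model_path)
  key <- model@model$cross_validation_holdout_predictions_frame_id$name
  preds <- h2o::h2o.importFile(cv_preds_path(model_path),
                               destination_frame = key)
  list(model = model, cv_preds = preds)
}
```

With the example above, this would be called as `path <- save_model_with_cv_preds(fit, "/Users/me/Downloads/foocv/")` before shutdown and `load_model_with_cv_preds(path)` after restarting the cluster. Only the combined holdout frame is restored here, not the per-fold prediction frames.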
Erin LeDell commented: Another report of this causing issues for a user: https://stackoverflow.com/questions/64985991/is-it-possible-to-use-loaded-h2o-grids-for-stacked-ensembles
Reported on h2ostream: https://groups.google.com/forum/#!topic/h2ostream/zoW_ewFwJAU