
GAM prediction results in different results with cv on and off. #7015

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 4 comments

@exalate-issue-sync

No description provided.

@exalate-issue-sync
Author

Wendy Wong commented: A user from Gitter reported the following:

Hello, I'm testing out h2o.gam and am coming across something unintuitive. Does it make sense for identical training models (same deviance) to have different predictions (from the training model) when comparing with and without cross-validation?

```r
library(h2o)
h2o.init()

train <- h2o.createFrame(cols = 5, seed = 22, seed_for_column_types = 55,
                         factors = 3, missing_fraction = 0)
train$fold <- h2o.kfold_column(train, nfolds = 3, seed = 11)
train$response <- 50 +
  ifelse(train$C5 == "c4.l0", 10,
         ifelse(train$C5 == "c4.l1", 15,
                ifelse(train$C5 == "c4.12", 20, 25))) +
  0.2 * train$C1 - 0.05 * train$C2 - 0.2 * train$C4 -
  0.005 * train$C4^2 + 0.00005 * train$C4^3 + 5 * h2o.runif(train)

params <- list(x = c("C1", "C2", "C5"), y = "response", training_frame = train,
               lambda = 0, keep_gam_cols = TRUE, gam_columns = c("C4"),
               scale = c(0.05), num_knots = c(5), spline_orders = c(3))

# no cross-validation, bs = 0 (default)
mod <- do.call(what = "h2o.gam", args = params)
h2o.residual_deviance(object = mod, train = TRUE)  # [1] 76080.37
h2o.predict(mod, train)
#    predict
# 1 43.49629
# 2 61.16891
# 3 58.14821
# 4 49.06894
# 5 54.63423
# 6 33.10237
#
# [10000 rows x 1 column]

# cross-validation, bs = 0 (default)
mod2 <- do.call(what = "h2o.gam", args = c(params, fold_column = "fold"))
h2o.residual_deviance(object = mod2, train = TRUE)  # [1] 76080.37
h2o.predict(mod2, train)
#     predict
# 1 115.53379
# 2  71.36690
# 3  44.06435
# 4  90.39080
# 5  66.77768
# 6 104.90591
#
# [10000 rows x 1 column]
```

It might also be worth looking at the models' h2o.residual_analysis_plot output: mod looks pretty normal, but mod2 shows very strange patterns in the residuals. I'm not including the plots here, but different values of bs also produced inconsistencies and strange residuals.

Thanks for any help understanding what is happening!
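For reference, the comparison described above could be reproduced roughly like this (a minimal sketch using h2o's explainability helper h2o.residual_analysis_plot, with mod, mod2, and train taken from the script above):

```r
# Residual analysis for the model trained without cross-validation
# (reported to look roughly normal)
h2o.residual_analysis_plot(mod, train)

# Residual analysis for the model trained with a fold column
# (reported to show strange patterns in the residuals)
h2o.residual_analysis_plot(mod2, train)
```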

@exalate-issue-sync
Author

Wendy Wong commented: I have run Paul’s code and was able to reproduce the error.

I removed the fold column when predicting with model mod2 but still do not get the same results as with mod. Something is wrong here.
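A minimal sketch of that check, reusing mod, mod2, and train from the reproduction above (the expectation that the predictions should match is the point under investigation, not a guarantee):

```r
# Score both models on the training frame with the fold column dropped.
train_no_fold <- train[, setdiff(h2o.colnames(train), "fold")]

p1 <- h2o.predict(mod,  train_no_fold)
p2 <- h2o.predict(mod2, train_no_fold)

# If the fold column alone explained the difference, this would be ~0;
# in this report the predictions still differ.
h2o.max(abs(p1$predict - p2$predict))
```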

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Details

Jira Issue: PUBDEV-8681
Assignee: Wendy Wong
Reporter: Wendy Wong
State: Resolved
Fix Version: 3.36.1.3
Attachments: N/A
Development PRs: Available

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Linked PRs from JIRA

#6185
