
Implement in-training checkpoints for algos #7192

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 2 comments
Edit: The original task has changed to implementing in-training checkpoints.

> Grid search has the ability to checkpoint models, but each model is independent. If a user can extend a model during grid search, they can use learnings from a previous model to grow another.

> For example, you can make a GBM grid with ntrees=[50, 70, 100]. The grid search can first build 50 trees, checkpoint, and then extend that previous model by adding more trees (50 + 20 = 70 trees), and so on. We assume that the other hyperparameters are fixed across the grid.
>
> This will reduce the overhead of repeating the work of the smaller models, and it also allows users to run training in independent sessions while reusing learnings/checkpoints from previous runs.
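The scheduling idea described above can be sketched in plain Python. This is a hypothetical illustration, not H2O-3 code: `incremental_tree_counts` is an invented helper name, and in H2O-3 itself the extension step would presumably hand the previous model to the next build (e.g. via the GBM estimator's `checkpoint` parameter) rather than retraining from scratch.

```python
def incremental_tree_counts(ntrees_grid):
    """Turn a grid of cumulative tree counts into per-step increments.

    For ntrees=[50, 70, 100] the grid search would build 50 trees,
    checkpoint, add 20 more (reaching 70), checkpoint again, then add
    30 more (reaching 100), reusing the previous model each time.
    """
    counts = sorted(ntrees_grid)
    # First step trains from scratch; each later step only adds the delta.
    return [counts[0]] + [b - a for a, b in zip(counts, counts[1:])]


def plan_checkpointed_builds(ntrees_grid):
    """Describe the chain of builds: (trees_to_add, resume_from) pairs.

    resume_from is None for the initial build, otherwise the cumulative
    tree count of the checkpoint being extended.
    """
    increments = incremental_tree_counts(ntrees_grid)
    plan, total = [], 0
    for step in increments:
        plan.append((step, total if total else None))
        total += step
    return plan
```

For the grid in the issue, `incremental_tree_counts([50, 70, 100])` yields `[50, 20, 30]`, so only 100 trees are built in total instead of 50 + 70 + 100 = 220 when each grid model trains independently.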

@h2o-ops-ro
Collaborator

JIRA Issue Details

Jira Issue: PUBDEV-8470
Assignee: Adam Valenta
Reporter: Neema Mashayekhi
State: Resolved
Fix Version: 3.38.0.1
Attachments: N/A
Development PRs: Available
