PUBDEV-8047: AutoML Save/Load implementation #6437

Open: tomasfryda wants to merge 4 commits into master from tomf_PUBDEV-8047_save_load_automl


@tomasfryda tomasfryda commented Nov 24, 2022

https://h2oai.atlassian.net/browse/PUBDEV-8047

Known issues:

  • missing automl_training_frame
    • get_leaderboard("ALL") on a loaded object produces NAs for predict_time_per_row_ms on loaded models
      • Fixed: the extended leaderboard is now computed before saving, so predict_time_per_row_ms is saved and loaded successfully.
    • make_leaderboard("ALL") on a loaded object without a provided leaderboard_frame fails on predict_time_per_row_ms
      • This still happens, but a normal AutoML object behaves the same way, so Save/Load doesn't change anything in this case.

Comment on lines 959 to 984
protected static AutoML readAutoML(AutoBuffer ab, Futures fs) {
  try (PersistenceContext pc = PersistenceContext.begin()) {
    AutoML aml = new AutoML(ab.get(), null, ab.get(), false);
    aml._leaderboard = (Leaderboard) ab.getKey(fs);
    aml._eventLog = (EventLog) ab.getKey(fs);
    // aml._trainingFrame = (Frame) ab.getKey(fs);
    fs.blockForPending();
    for (Key mk : aml.leaderboard().getModelKeys()) {
      Model m = (Model) PersistenceContext.getKey(ab, fs, mk);
      if (aml._buildSpec.build_control.keep_cross_validation_predictions)
        for (Key k : m._output._cross_validation_predictions)
          PersistenceContext.loadKey(ab, fs, k);
      if (aml._buildSpec.build_control.keep_cross_validation_models)
        for (Key k : m._output._cross_validation_models)
          PersistenceContext.loadKey(ab, fs, k);
      if (aml._buildSpec.build_control.keep_cross_validation_fold_assignment)
        PersistenceContext.loadKey(ab, fs, m._output._cross_validation_fold_assignment_frame_id);
      // if (m instanceof StackedEnsembleModel)
      //   PersistenceContext.loadKey(ab, fs, ((StackedEnsembleModel)m)._output._metalearner._parms._train);
    }
    DKV.put(aml);
    return aml;
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
}
Contributor Author


The objects have to be deserialized in exactly the same order in which they were serialized, so we need to keep track of what has already been loaded and skip loading any object that was loaded before.
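To illustrate the ordering constraint described in this comment, here is a minimal, self-contained sketch using only the JDK. It is not the H2O implementation: the class `OrderedLoadSketch`, its `save` and `load` methods, and the string keys are all hypothetical stand-ins for the PR's `PersistenceContext` and `Key` handling. The idea it shows is that the writer emits each object once, in first-seen order, and the reader walks the same key sequence, skipping any key it has already loaded so that it never consumes bytes belonging to the next object.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: none of these names come from H2O.
class OrderedLoadSketch {

    // Writes each (key, value) pair once, in first-seen order.
    // A key that repeats in `keys` is not written a second time.
    static byte[] save(List<String> keys, Map<String, String> store) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            Set<String> written = new HashSet<>();
            for (String k : keys) {
                if (!written.add(k)) continue; // already serialized
                out.writeUTF(k);
                out.writeUTF(store.get(k));
            }
        }
        return bytes.toByteArray();
    }

    // Reads the stream by walking the same key sequence the writer used.
    // Keys that were already loaded are skipped, because the stream holds
    // no second copy of them.
    static Map<String, String> load(List<String> keys, byte[] data) throws IOException {
        Map<String, String> loaded = new LinkedHashMap<>();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (String k : keys) {
                if (loaded.containsKey(k)) continue; // already deserialized
                String storedKey = in.readUTF();
                if (!storedKey.equals(k)) {
                    throw new IOException(
                        "Out-of-order read: expected " + k + " but found " + storedKey);
                }
                loaded.put(k, in.readUTF());
            }
        }
        return loaded;
    }

    public static void main(String[] args) throws IOException {
        // Two models that reference the same (hypothetical) fold-assignment frame.
        List<String> keys = Arrays.asList("model_a", "fold_frame", "model_b", "fold_frame");
        Map<String, String> store = new HashMap<>();
        store.put("model_a", "first model");
        store.put("model_b", "second model");
        store.put("fold_frame", "fold assignments");

        Map<String, String> loaded = load(keys, save(keys, store));
        System.out.println(loaded.keySet());
    }
}
```

In this sketch the shared `fold_frame` entry appears twice in the key sequence but only once in the byte stream, which is why the reader needs the "already loaded" check before touching the stream.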

@tomasfryda tomasfryda force-pushed the tomf_PUBDEV-8047_save_load_automl branch 2 times, most recently from d4fb774 to 5ba924b Compare November 28, 2022 11:41
@tomasfryda tomasfryda force-pushed the tomf_PUBDEV-8047_save_load_automl branch from 5ba924b to 74490f2 Compare November 28, 2022 13:09
@tomasfryda tomasfryda marked this pull request as ready for review November 30, 2022 15:46
Contributor

@ledell ledell left a comment


This looks great, thank you!
