Get AutoML leaderboard from sklearn wrapped functions #8431
Comments
Erin LeDell commented: Sebastien Poirier has been working on some demos of how to use the new sklearn API – Seb, do you have a link to any of those notebooks yet?
Sebastien Poirier commented: Erin LeDell, Stan Biryukov – I'll publish a complete tutorial for those {{sklearn}} wrappers very soon. For now, just be aware that when using a {{sklearn}} wrapper of an H2OEstimator or H2OAutoML, you still have full access to the wrapped object through the {{_estimator}} property:
{code:python}
from h2o.sklearn import H2OAutoMLClassifier

aml = H2OAutoMLClassifier(max_models=5)
aml._estimator  # the wrapped H2OAutoML instance
{code}
I should probably expose a "public property" though: I didn't want to create potential naming conflicts at first, but now that there is internal logic to prevent those, I don't see what prevents me from making it public.
Stan Biryukov commented: Thanks for the quick reply. Good to know about the {{_estimator}} property. How do I save one of these sklearn wrapped models? I'm attempting to save a sklearn pipeline and can't seem to find the best way to save it to disk. Happy to open a separate issue if that's best. For example, my {{mlt}} object is:
{noformat}
Pipeline(memory=None, ...)
{noformat}
Trying a pickle dump of everything:
{noformat}
import pickle
...
{noformat}
results in {{TypeError: can't pickle dict_keys objects}}. Trying an h2o save of just the estimator results in:
{noformat}
H2OTypeError: Argument ...
{noformat}
Sebastien Poirier commented: Stan Biryukov – I could reproduce the issue with pickle, thanks for pointing that out. The fix is trivial; I'm creating a ticket, and it will be in the next minor release. For now, what you can still do is save the params and the wrapper class, and restore them later:
{code:python}
model = mlt.named_steps.model
...
with open('/workspace/testautoml.pkl', 'rb') as fid:
    ...
{code}
However, while the restored model is usable, it's untrained/unfit of course, so it may suit your needs if you want to save the pipeline before training, but don't expect to recover a trained {{H2OAutoML}} or a trained {{H2OEstimator}} from pickle that easily. I'm creating a quick fix for the {{dump}} issue as it will still allow you to dump both an untrained and a trained wrapper, but you'll still only be able to {{load}} an untrained one… If you want to be able to pickle trained models, please create a task; I can't promise any time estimate for this issue though.
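The save-and-restore pattern described above can be sketched in plain Python. Since the exact pickled payload isn't shown in the comment, this uses {{collections.Counter}} as a hypothetical stand-in for the wrapper class; the real code would pickle e.g. {{(type(model), model.get_params())}} instead:

```python
import pickle
from collections import Counter

# Hypothetical stand-in for the sklearn wrapper's constructor params;
# in the real case these would come from model.get_params().
params = {'red': 5, 'blue': 2}

# Persist only the class and its params (classes pickle by reference)...
blob = pickle.dumps((Counter, params))

# ...and later rebuild an *untrained* instance from them.
klass, restored_params = pickle.loads(blob)
model = klass(restored_params)
print(model['red'])  # -> 5
```

As the comment notes, this only round-trips the configuration, not any fitted state.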
Stan Biryukov commented: Thanks, Sebastien. It would be ideal to save and then load a model for production purposes, so I'll open a new task.
JIRA Issue Migration Info
Jira Issue: PUBDEV-7201
Linked PRs from JIRA
Using one of the new sklearn-compatible AutoML models,
{noformat}from h2o.sklearn import H2OAutoMLRegressor, H2OAutoMLClassifier{noformat}
How can we access the leaderboard which has a summary of models?
I tried passing the wrapped model instance to the get_leaderboard function, but that only accepts the H2OAutoML class. Perhaps the leaderboard could be exposed directly on the sklearn classes?
{noformat}H2OTypeError: Argument `aml` should be an H2OAutoML, got H2OAutoMLClassifier
H2OAutoMLClassifier(algo_parameters=None, balance_classes=False,
                    class_sampling_factors=None, data_conversion='auto',
exclude_algos=None, export_checkpoints_dir=None,
include_algos=None,
keep_cross_validation_fold_assignment=False,
keep_cross_validation_models=False,
keep_cross_validation_predictions=False,
max_after_balance_size=5.0, max_models=None,
max_runtime_secs=None, max_runtime_secs_per_model=None,
modeling_plan=None, monotone_constraints=None, nfolds=5,
project_name=None, seed=4336, sort_metric='AUTO',
stopping_metric='AUTO', stopping_rounds=3,
stopping_tolerance=None, verbosity='warn'){noformat}
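Per the comments above, the backing {{H2OAutoML}} object is reachable through the wrapper's {{_estimator}} property, so unwrapping it before calling {{get_leaderboard}} should work in the meantime. The delegation being suggested for the sklearn classes could look roughly like this sketch (all class and attribute names here are illustrative stand-ins, not the actual h2o implementation):

```python
class BackingAutoML:
    """Illustrative stand-in for h2o.automl.H2OAutoML."""
    def __init__(self):
        # stand-in for the real leaderboard H2OFrame
        self.leaderboard = [('GBM_1', 0.97), ('DRF_1', 0.95)]

class AutoMLClassifierWrapper:
    """Illustrative stand-in for h2o.sklearn.H2OAutoMLClassifier."""
    def __init__(self):
        self._estimator = BackingAutoML()

    @property
    def leaderboard(self):
        # delegate straight to the wrapped AutoML object
        return self._estimator.leaderboard

aml = AutoMLClassifierWrapper()
print(aml.leaderboard[0][0])  # -> GBM_1
```

A read-only property like this would let sklearn users inspect the model summary without reaching into the private {{_estimator}} attribute.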