
AutoML H2o #4

Closed
Shafi2016 opened this issue Feb 8, 2021 · 3 comments

Comments


Shafi2016 commented Feb 8, 2021

Thanks for the nice package. Do you have any documentation for it, or a basic example of running AutoML? I was thinking of combining it with modeltime in R.

@stevenpawley (Owner)

Unfortunately I haven't gotten that far yet. However, it's mostly the same as using anything else in tidymodels/parsnip, except that you set the engine to "h2o". A conceptual difficulty is that H2O works best when the data is kept within an H2OFrame, but that doesn't work if you are using other tidymodels features (e.g. recipes, tune, etc.), which require the data to be in the R environment. There is a very rough tune_grid_h2o function in the package, which keeps the data within the H2O cluster.
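
For example, a rough sketch of the workflow (assuming boost_tree() is one of the registered specs; the package name and the registered models are assumptions here):

```r
library(parsnip)
library(h2oparsnip)  # this repository's package (name assumed)

h2o::h2o.init()  # start a local H2O cluster

# An ordinary parsnip specification; only the engine changes
spec <- boost_tree(trees = 100) %>%
  set_engine("h2o") %>%
  set_mode("regression")

# fit() takes a plain data frame; the data is converted to an
# H2OFrame behind the scenes (the conversion cost mentioned above)
fitted_model <- fit(spec, mpg ~ ., data = mtcars)

predict(fitted_model, new_data = mtcars)
```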

@Shafi2016 (Author)

Perhaps you could collaborate with the H2O team to improve it further.

@mdancho84

I agree with @stevenpawley that we need to minimize data conversion (converting to/from a data frame and an H2OFrame is actually very expensive).

The nice thing about H2O AutoML is that it manages the whole tuning process, so there shouldn't be much hyperparameter tweaking. If there is, the user can use set_engine() to specify the needed arguments, which would go straight to the h2o::h2o.automl() function and be used in the tuning process within H2O.
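
For example, a sketch (assuming the package exposes an automl() spec; max_runtime_secs and max_models are genuine h2o::h2o.automl() arguments):

```r
library(parsnip)
library(h2oparsnip)  # assumed package name

# Engine arguments pass straight through to h2o::h2o.automl()
auto_spec <- automl(mode = "regression") %>%
  set_engine("h2o",
             max_runtime_secs = 120,  # cap total training time
             max_models = 20)         # cap the number of models tried

auto_fit <- fit(auto_spec, mpg ~ ., data = mtcars)
```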

With that said, the only challenge I see is that (unlike other H2O algorithms) AutoML returns a leaderboard of ranked models rather than a single fit. This requires a choice on the user's end. Typically my choices are:

  • Best (normally ends up being a Stacked Ensemble)
  • Best explainable (I usually take the top explainable model, e.g. XGBoost, GBM, or Deep Learning; Stacked Ensembles don't have variable importance, so I don't use them for explainability)

An option during the training process would be to store both of these models. Then, when the user serializes (saves) the fit, they get both models: prediction happens with the best model, and explanation happens with the best explainable one.
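
Roughly, using the h2o API directly (assuming `aml` is a fitted H2OAutoML object):

```r
# Leaderboard as a plain data frame, ranked best-first
lb <- as.data.frame(aml@leaderboard)

# "Best": the leader, often a Stacked Ensemble
best <- aml@leader

# "Best explainable": the highest-ranked non-ensemble model,
# which still has variable importance
explainable_id <- lb$model_id[!grepl("StackedEnsemble", lb$model_id)][1]
best_explainable <- h2o::h2o.getModel(explainable_id)
```

Both models could then be stored on the parsnip fit and serialized together.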

These are just my thoughts... I'd be happy to discuss more as part of #5.
