Add h2o.make_leaderboard function to score & compare a set of models #12152
Comments
Erin LeDell commented: Hi, just tagging you here since this ticket interested you.
Ruslan Dautkhanov commented: Thank you Erin!
Sebastien Poirier commented: Client API decision based on discussion:
{code:r}
#' h2o.make_leaderboard <- function(
{code}
The backend will generate warnings in case of inconsistencies depending on the choice of {{scoring_data}}.
If {{valid}} is set and one model was trained without a validation frame, an error should be raised. If {{scoring_data='AUTO'}} is set, the first strategy common to ALL models should be chosen. For example, some models may have been trained with {{xval}} and others without, in which case we fall back to {{valid}}, and if some models don't have any validation frame, we fall back to {{train}}. Finally, at first, we will return all default metrics defined in the AutoML leaderboard.
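A minimal sketch of the {{scoring_data='AUTO'}} fallback described above: choose the first strategy available for ALL models, in the order xval, then valid, then train. The checks on {{@allparameters}} are an assumption about where that information lives on the client-side model objects, not necessarily how the backend would implement it:
{code:r}
# Sketch only: pick the first scoring strategy common to ALL models.
choose_scoring_data <- function(models) {
  # Assumed checks: a model trained with cross-validation has nfolds > 1,
  # and a model trained with a validation frame has a non-NULL
  # validation_frame parameter.
  has_xval  <- function(m) isTRUE(m@allparameters$nfolds > 1)
  has_valid <- function(m) !is.null(m@allparameters$validation_frame)
  if (all(vapply(models, has_xval, logical(1)))) return("xval")
  if (all(vapply(models, has_valid, logical(1)))) return("valid")
  "train"
}
{code}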
Sebastien Poirier commented: Also using the name {{make_leaderboard}} instead of {{create_leaderboard}} for consistency: the R and Python APIs seem to favour the {{make_}} prefix over {{create_}}.
JIRA Issue Migration Info
Jira Issue: PUBDEV-5280
Linked PRs from JIRA
The idea of this function is to score a set of models and compare their performance on a "leaderboard". This doesn't need to be a list of AutoML models; it could be a simple list of models, a grid of models, or an AutoML object (from which we retrieve the models).
You should be able to use this function on a new dataset, or create a leaderboard from stored metrics (train/valid/xval). It's similar to the h2o.performance() function in that sense, but instead of returning the performance objects, it will simply return an H2OFrame of metrics (rows = models, cols = metrics).
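To make that concrete, here is a rough sketch (an assumption about how this could work, not the final API) that scores each model on new data with the existing {{h2o.performance()}} accessor and collects one metric per model; the proposed function would return an H2OFrame with the full set of default metrics rather than a data.frame with only RMSE:
{code:r}
library(h2o)  # assumes a running h2o cluster and `models` being trained H2OModel objects

# Sketch only: score each model on newdata and stack one row of metrics per model.
score_models <- function(models, newdata) {
  rows <- lapply(models, function(m) {
    perf <- h2o.performance(m, newdata = newdata)
    data.frame(model_id = m@model_id, rmse = h2o.rmse(perf))
  })
  lb <- do.call(rbind, rows)
  lb[order(lb$rmse), ]  # best (lowest RMSE) model first
}
{code}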
A few design ideas:
{code}
lb <- h2o.create_leaderboard(models, newdata = NULL, train = FALSE, valid = FALSE,
xval = FALSE, metric = "AUTO")
{code}
{code}
lb <- h2o.create_leaderboard(models, newdata = NULL, train = FALSE, valid = FALSE,
xval = FALSE, sort_metric = "AUTO")
{code}
{code}
lb <- h2o.create_leaderboard(models, newdata = NULL, sort_metric = "AUTO", sort_data = "AUTO")
{code}
The {{models}} argument would support multiple types of input (which could all be translated to a list of model_ids before sending to the backend, if that's easier): a list of models, a list of model ids, a grid (maybe a list of grids), and an AutoML object. Let's do a check to make sure that the models are all of the same type (binomial, multiclass, regression).
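As a rough illustration of that normalization (an assumption, not the final implementation), the accepted inputs could be flattened to model ids on the client side before being sent to the backend. The {{as_model_ids()}} name is made up for this sketch; the slot accesses ({{@model_id}}, {{@model_ids}}, {{@leaderboard}}) are the existing ones on the h2o R classes:
{code:r}
# Sketch only: flatten the supported `models` inputs into a character vector of
# model ids (assumes library(h2o) is loaded).
as_model_ids <- function(models) {
  if (is(models, "H2OAutoML")) {
    # AutoML object: take the model ids from its leaderboard frame
    as.character(as.data.frame(models@leaderboard$model_id)$model_id)
  } else if (is(models, "H2OGrid")) {
    # Grid object: model ids are stored on the grid
    unlist(models@model_ids)
  } else if (is.list(models)) {
    # Plain list of H2OModel objects and/or model id strings
    vapply(models, function(m) if (is.character(m)) m else m@model_id, character(1))
  } else {
    stop("Unsupported `models` input")
  }
}
{code}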