AutoML Explain: Pareto front plots in R and Python #7076

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 4 comments

@exalate-issue-sync

Add Pareto fronts (model performance vs. prediction speed) for all the models in an AutoML run as a new “explanation” in the H2O Explain module. Plot a performance metric (AUC, logloss, etc.) on the y-axis and prediction speed on the x-axis.

{noformat}pf <- h2o.pareto_front(automl_object)
plot(pf)
pf # we need a way to return the data too{noformat}

Since users may also want frontiers for other metrics, or for an arbitrary group of models, we might want to let the user select which metrics go on the x- and y-axes, and keep the accuracy vs. speed frontier as the default for AutoML objects.
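To make the request concrete, here is a minimal sketch of a Pareto front computation with user-selectable axes. It is plain Python rather than H2O code: the function name `pareto_front`, its parameters, the column names, and the model rows are all hypothetical and made up for illustration, not taken from the H2O API or from a real AutoML run.

```python
from typing import Dict, List


def pareto_front(rows: List[Dict], x: str, y: str,
                 x_lower_is_better: bool = True,
                 y_lower_is_better: bool = False) -> List[Dict]:
    """Return the rows that no other row beats on both metrics (hypothetical helper)."""

    # Convert both metrics to "lower is better" so one comparison rule suffices.
    def key(row):
        kx = row[x] if x_lower_is_better else -row[x]
        ky = row[y] if y_lower_is_better else -row[y]
        return kx, ky

    front = []
    best_y = float("inf")
    # Sweep from best to worst x; a row joins the front only if it strictly
    # improves on the best y seen so far.
    for row in sorted(rows, key=key):
        ky = key(row)[1]
        if ky < best_y:
            front.append(row)
            best_y = ky
    return front


# Illustrative numbers only, not real benchmark results.
models = [
    {"model": "GBM_1", "predict_time_ms": 0.9, "auc": 0.80},
    {"model": "GLM_1", "predict_time_ms": 0.2, "auc": 0.74},
    {"model": "DRF_1", "predict_time_ms": 1.5, "auc": 0.78},
    {"model": "SE_1", "predict_time_ms": 3.0, "auc": 0.82},
    {"model": "XGB_1", "predict_time_ms": 0.9, "auc": 0.77},
]

# Default directions: faster prediction is better on x, higher AUC is better on y.
front = pareto_front(models, x="predict_time_ms", y="auc")
print([m["model"] for m in front])
```

For a metric such as logloss, where lower is better, the caller would pass `y_lower_is_better=True`. Keeping the direction flags explicit is one way to support arbitrary metric pairs while leaving accuracy vs. speed as the default.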

@exalate-issue-sync
Author

Erin LeDell commented: [~accountid:5e43370f5a495e0c91a74ebe] I attached a prototype in R. It works on an AutoML object, though the final version should also support an H2OGrid and a list of models (or the raw data.frame that stores the x, y, and label data).

We need to decide whether the function should return an H2OParetoFront object (or a list) containing several objects, including the ggplot2 object and the data object of the Pareto front. The current prototype just returns a ggplot2 object (hence the current _plot suffix in the function name), but we may want to return the data object instead of, or in addition to, the plot.

{noformat}library(h2o)
h2o.init()

prostate_path <- system.file("extdata", "prostate.csv", package = "h2o")
prostate <- h2o.importFile(path = prostate_path, header = TRUE)
y <- "CAPSULE"
prostate[,y] <- as.factor(prostate[,y]) #convert to factor for classification
aml <- h2o.automl(y = y, training_frame = prostate, max_runtime_secs = 30)

# source the attached Pareto code here

pf <- h2o.pareto_front_plot(aml)
pf{noformat}

[^h2o_pareto_front_plot.R]

The details need work but it currently produces this:

!Screen Shot 2022-03-05 at 6.53.05 PM.png|width=816,height=614!
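On the return-type question raised above, one option is a small result object that carries the data and builds the plot on demand. The sketch below is only an illustration in plain Python: `ParetoFrontResult`, `build_result`, and the `on_front` column are hypothetical names, the points are made up, and the fixed directions (lower x is better, higher y is better) are an assumption. The `plot` method assumes matplotlib is installed and is not called in the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ParetoFrontResult:
    """Hypothetical container: one dict per model with x, y, label, on_front."""
    data: List[Dict]

    @property
    def front(self) -> List[Dict]:
        # Only the non-dominated rows, ordered along the x-axis.
        return sorted((r for r in self.data if r["on_front"]),
                      key=lambda r: r["x"])

    def plot(self):
        # Imported lazily so the data is usable without matplotlib.
        import matplotlib.pyplot as plt

        fig, ax = plt.subplots()
        ax.scatter([r["x"] for r in self.data], [r["y"] for r in self.data],
                   color="lightgray")
        ax.plot([r["x"] for r in self.front], [r["y"] for r in self.front],
                marker="o")
        for r in self.front:
            ax.annotate(r["label"], (r["x"], r["y"]))
        ax.set_xlabel("x")
        ax.set_ylabel("y")
        return fig


def build_result(points: List[Dict]) -> ParetoFrontResult:
    """Flag each point as on or off the front (lower x and higher y are better)."""

    def dominates(a, b):
        # a is at least as good as b on both axes and strictly better on one.
        return (a["x"] <= b["x"] and a["y"] >= b["y"]
                and (a["x"] < b["x"] or a["y"] > b["y"]))

    data = [dict(p, on_front=not any(dominates(q, p) for q in points))
            for p in points]
    return ParetoFrontResult(data)


# Illustrative x, y, label rows only, not real results.
points = [
    {"label": "GBM_1", "x": 0.9, "y": 0.80},
    {"label": "GLM_1", "x": 0.2, "y": 0.74},
    {"label": "DRF_1", "x": 1.5, "y": 0.78},
    {"label": "SE_1", "x": 3.0, "y": 0.82},
    {"label": "XGB_1", "x": 0.9, "y": 0.77},
]
result = build_result(points)
print([r["label"] for r in result.front])
```

With this shape, printing or inspecting the object gives access to the data, while the plot stays available through a method, so the caller does not have to choose between the two.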

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Details

Jira Issue: PUBDEV-8589
Assignee: Tomas Fryda
Reporter: Erin LeDell
State: Resolved
Fix Version: 3.38.0.1
Attachments: Available (Count: 3)
Development PRs: Available

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Attachments From Jira

Attachment Name: h2o_pareto_front_plot.R
Attached By: Erin LeDell
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8589/h2o_pareto_front_plot.R

Attachment Name: Screen Shot 2022-02-17 at 2.55.01 PM.png
Attached By: Erin LeDell
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8589/Screen Shot 2022-02-17 at 2.55.01 PM.png

Attachment Name: Screen Shot 2022-03-05 at 6.53.05 PM.png
Attached By: Erin LeDell
File Link: https://h2o-3-jira-github-migration.s3.amazonaws.com/PUBDEV-8589/Screen Shot 2022-03-05 at 6.53.05 PM.png

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Linked PRs from JIRA

#6176
https://github.com/h2oai/h2oai-serving/pull/1079

@h2o-ops h2o-ops closed this as completed May 14, 2023