Improve the Model Explainability interface in R & Python #7836

exalate-issue-sync · 2023-05-11T18:18:01Z

This is just an epic to collect tickets related to improvements in the R and Python interface for H2O Explainability.

A brief list (to be converted into JIRAs):

-For AutoML objects, don’t use the whole model id in the labels/axes/legend of the plots that are output by h2o.explain(); just use shortened model_id names. The date at the end does not help with visually comparing the different models/algos (the long ids are distracting to the eye), so we could remove it from the display by default. Maybe this could be overridden to use the full model_id via plot_overrides? It would be nice to view the model names as just: StackedEnsemble_AllModels, GLM_1, DRF_1, XGBoost_3, GBM_grid__1__model_3, etc.-
Model correlation has interpretable models (GLM) highlighted in red text (in Python), but we don’t explain what the red is for, and we don’t do it in the other visuals like Varimp Heatmap. Need to check if this is also the case in R.
Move explanation descriptions into a JSON or text file so there’s just one source and read that into R and Python (easier to edit a single source). Then make some updates to the descriptions.
I think we would benefit from using a title for the plot name and a subtitle for the model_id in all the R plots, since they are pretty squished when viewed inside RStudio.
[https://www.datanovia.com/en/blog/ggplot-title-subtitle-and-caption/|https://www.datanovia.com/en/blog/ggplot-title-subtitle-and-caption/]
-I wonder if the Leaderboard printed out (specifically when you pass in an AutoML object) should just be top 20 models by default?-
We can default the Leaderboard to 20 models, but we could find a way to allow the user to override this with {{plot_overrides}}. e.g. {{plot_overrides = list(leaderboard=list(nrow=-1))}} to get all (or maybe there’s something better than -1 to use here, like “ALL” if they want to show all rows)? or they can set to a particular number, like 50.
Add more information at the top of the explain print-out for AutoML-specific stats (how many models of each type, and the best score (using default loss) for each algo type).
Visual improvement tweaks/ideas for the printed Leaderboard
** is there a way to control the number of decimal places shown? we could reduce to about 5 decimal places and get the table to be skinnier & fit on the page better
** is it easy to left-align the model names in the first column? then it would be easy to read the type of model better.
Since the user passes the {{newdata}} test frame explicitly to the {{h2o.explain()}} function, I am wondering whether we should just use the test set leaderboard metrics instead of the default CV metrics. But then it’s delivering a different “view” of the leaderboard than the internal AutoML object has… so there will be some inconsistency either way; we just have to choose which type of inconsistency is better/worse.
Add learning curve of leader model (let’s decide if we want to plot train vs CV error or validation error or error for a single fold, etc).
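
To make the shortened-label idea from the first item concrete, here is a rough Python sketch. It is not taken from the H2O code base: {{shorten_model_id}} and {{shorten_model_ids}} are hypothetical helper names, and the regular expression assumes that AutoML model ids embed a run stamp of the form {{AutoML_<YYYYMMDD>_<HHMMSS>}}, optionally with a run counter before the date. If the real id format differs, the pattern would need adjusting.

```python
import re
from collections import Counter

# Assumption: AutoML model ids contain "AutoML_<YYYYMMDD>_<HHMMSS>",
# optionally with a run counter, e.g. "AutoML_1_<YYYYMMDD>_<HHMMSS>".
_AUTOML_STAMP = re.compile(r"AutoML_(?:\d+_)?\d{8}_\d{6}")


def shorten_model_id(model_id):
    """Hypothetical helper: drop the AutoML run stamp for display purposes."""
    short = _AUTOML_STAMP.sub("", model_id).rstrip("_")
    # Fall back to the full id if nothing is left after stripping.
    return short or model_id


def shorten_model_ids(model_ids):
    """Map full ids to display labels, keeping the full id on a collision."""
    short = [shorten_model_id(m) for m in model_ids]
    counts = Counter(short)
    return {
        full: (label if counts[label] == 1 else full)
        for full, label in zip(model_ids, short)
    }


if __name__ == "__main__":
    ids = [
        "StackedEnsemble_AllModels_AutoML_20200727_123456",
        "GLM_1_AutoML_20200727_123456",
        "GBM_grid__1_AutoML_20200727_123456_model_3",
    ]
    for full, label in shorten_model_ids(ids).items():
        print(label, "<-", full)
```

Keeping the full id whenever two models would end up with the same short label means plots that mix models from several AutoML runs stay unambiguous. An override to always show full ids could simply skip this mapping.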

exalate-issue-sync · 2023-05-11T18:18:03Z

Hud Wahab commented: Not sure if this is related, but {{.explain()}} doesn’t seem to be available for 3.30.1.3. See [https://h2oai.atlassian.net/browse/PUBDEV-7850|https://h2oai.atlassian.net/browse/PUBDEV-7850|smart-link]

h2o-ops · 2023-05-14T20:55:23Z

JIRA Issue Migration Info

Jira Issue: PUBDEV-7806
Assignee: Tomas Fryda
Reporter: Erin LeDell
State: Open
Fix Version: Backlog
Attachments: N/A
Development PRs: N/A

h2o-ops assigned tomasfryda May 14, 2023

h2o-ops added the fixVersion/Backlog label May 14, 2023
