R API for Infogram #7430

Closed
exalate-issue-sync bot opened this issue May 11, 2023 · 6 comments

@exalate-issue-sync

{noformat}
ig <- h2o.infogram(x, y, training_frame,
                   protected_columns = NULL,
                   algorithm = c("gbm", "automl", ...),
                   algorithm_params = NULL,
                   thresholds = c(0.1, 0.1),  # (x, y) thresholds
                   top_n_features = 50,
                   ...)
{noformat}

Note: You can pass plot arguments (aes, etc.) through {{...}}.

The function returns an {{H2OInfogram}} object, which contains several slots for data as well as the plot (a ggplot object in R and a matplotlib object in Python). This is very similar to the H2O Explain module and the {{H2OExplanation}} class.

The {{ig}} object contains the following slots (note: we could change {{_columns}} to {{_features}}, though {{top_n_features}} is used to denote an integer in H2O Explain):

  • {{admissible_columns}}: list of column names which are in the top-right of the Infogram.
  • {{protected_columns}}: this might be useful to store just for reference.
  • {{top_n_columns}}: let's record which columns were found as "top N"
  • {{plots}}: ggplot/matplotlib object
  • {{admissible_score}} (or better name): An H2OFrame to store the (x,y) index values on the infogram plot, containing four columns: feature name, and then the values for the (x,y) plot (two additional columns), and boolean "admissible":
    ** x: {{relevance_index}} (aka total information)
    ** y: {{core_index}} (aka conditional/net information) or {{safety_index}}

Add a plot method so that the Infogram can be plotted with {{plot(ig)}} in R.
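
To make the {{admissible_score}} layout concrete, here is a minimal base-R sketch that uses a plain data.frame rather than an H2OFrame. The column names and index values are made up for illustration, and the rule that a column is admissible when both indices are at or above their thresholds is an assumption based on the "top-right" description above, not confirmed behavior.

```r
# Hypothetical index values; these are not output from a real infogram run.
scores <- data.frame(
  column = c("age", "income", "zip_code", "tenure"),
  relevance_index = c(0.90, 0.45, 0.05, 0.30),  # x-axis
  core_index = c(0.80, 0.08, 0.60, 0.25),       # y-axis
  stringsAsFactors = FALSE
)

# Thresholds in (x, y) order, mirroring the proposed `thresholds = c(0.1, 0.1)`.
thresholds <- c(0.1, 0.1)

# Assumption: admissible means both indices meet or exceed their thresholds,
# i.e. the point sits in the top-right region of the plot.
scores$admissible <- scores$relevance_index >= thresholds[1] &
  scores$core_index >= thresholds[2]

# What a slot like `admissible_columns` could hold under this assumption.
admissible_columns <- scores$column[scores$admissible]
print(scores)
print(admissible_columns)
```

Here {{income}} fails on the y threshold and {{zip_code}} fails on the x threshold, so only {{age}} and {{tenure}} are flagged.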

@exalate-issue-sync
Author

Erin LeDell commented: Since the {{h2o.admissibleML()}} function will need {{infogram_algorithm}} and {{infogram_algorithm_params}} to distinguish them from the main model-training arguments ({{automl_params}}, etc.), it might make sense to be overly verbose here and replace the shorter, nicer {{algorithm}} and {{algorithm_params}} with those longer names. The longer version is what’s currently implemented: https://github.com//pull/5572/files

@exalate-issue-sync
Author

Erin LeDell commented: Is {{sensitive_columns}} a better name than {{protected_columns}}?

@exalate-issue-sync
Author

Wendy commented: Erin: I would prefer the threshold parameter to be split into two parameters, one for cmi and one for relevance, so that people don’t have to remember which position in the vector applies to which.
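
As an illustration of this suggestion, the base-R sketch below uses two named threshold arguments instead of one positional vector. The helper {{flag_admissible}} and the argument names {{relevance_threshold}} and {{cmi_threshold}} are hypothetical, chosen only for this example, and are not taken from the h2o API.

```r
# Hypothetical helper; the function and argument names are illustrative only.
flag_admissible <- function(scores,
                            relevance_threshold = 0.1,
                            cmi_threshold = 0.1) {
  # Assumption: a column is admissible when both indices meet their thresholds.
  scores$admissible <- scores$relevance_index >= relevance_threshold &
    scores$core_index >= cmi_threshold
  scores
}

# Made-up index values for two columns.
scores <- data.frame(
  column = c("a", "b"),
  relevance_index = c(0.5, 0.5),
  core_index = c(0.05, 0.5),
  stringsAsFactors = FALSE
)

# With named arguments, the order at the call site no longer matters.
res1 <- flag_admissible(scores, cmi_threshold = 0.2, relevance_threshold = 0.1)
res2 <- flag_admissible(scores, relevance_threshold = 0.1, cmi_threshold = 0.2)
print(res1)
```

With {{thresholds = c(0.1, 0.2)}} the caller has to know that x comes first; with separate names, the call documents itself.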

@exalate-issue-sync
Author

Erin LeDell commented: Note to self: update the algorithm description in admissible.R to:

{noformat}
#' The infogram is an information-theoretic graphical tool which allows the user to quickly spot the "core" decision-making variables
#' that are driving the response, while minimizing redundancy, for supervised learning problems. The user can also define protected
#' features to set the admissibility criterion. All other features will be checked to make sure that their pathway to the response
#' is not through a protected feature. A measure of how safe each feature is appears on the y-axis of the infogram, while the relevance
#' to the response (how much the variable drives the response) is plotted on the x-axis of the infogram.
{noformat}

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

JIRA Issue Details

Jira Issue: PUBDEV-8222
Assignee: Erin LeDell
Reporter: Erin LeDell
State: Resolved
Fix Version: 3.36.0.1
Attachments: N/A
Development PRs: Available

@h2o-ops
Collaborator

h2o-ops commented May 14, 2023

Linked PRs from JIRA

#5933

@h2o-ops h2o-ops closed this as completed May 14, 2023