Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

throw warning and update error message #3576

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open
Changes from 2 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
7 changes: 4 additions & 3 deletions shap/explainers/_tree.py
Original file line number Diff line number Diff line change
Expand Up @@ -182,7 +182,7 @@ def __init__(
self.data = data
if self.data is None:
feature_perturbation = "tree_path_dependent"
#warnings.warn("Setting feature_perturbation = \"tree_path_dependent\" because no background data was given.")
warnings.warn("Setting feature_perturbation = \"tree_path_dependent\" because no background data was given.")
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change may impact a large fraction of users, so I think we should carefully consider what level of notification is appropriate. Would logging.info be more appropriate here?

In the logging HOWTO guide, they recommend:

  • To report events that occur "during normal operation of a program", use logging info()
  • To issue a warning regarding a particular runtime event, if the issue is avoidable and the client application should be modified to eliminate the warning, use warnings.warn

To me, this seems like normal operation rather than an "issue", as this behaviour is described in the docs of tree explainer:

feature_perturbation : "interventional" (default) or "tree_path_dependent" (default when data=None)

My guess is that Scott commented this warning as it was becoming a bit annoying.

elif feature_perturbation == "interventional" and self.data.shape[0] > 1_000:
wmsg = (
f"Passing {self.data.shape[0]} background samples may lead to slow runtimes. Consider "
Expand Down Expand Up @@ -351,8 +351,9 @@ def _validate_inputs(self, X, y, tree_limit, check_additivity):
"so TreeExplainer cannot run with the "
"feature_perturbation=\"tree_path_dependent\" option! "
"Try providing a larger background "
"dataset, no background dataset, or using "
"feature_perturbation=\"interventional\"."
"dataset, no background dataset, using "
"feature_perturbation=\"interventional\" or "
"reducing the number of trees might help."
)
raise ExplainerError(emsg)

Expand Down