BUG: shap.maskers.Impute() throws TypeError #3378

Open · 4 tasks done
stompsjo opened this issue Nov 2, 2023 · 3 comments · May be fixed by #3379
Labels: bug (Indicates an unexpected problem or unintended behaviour)

Comments

@stompsjo
Contributor

stompsjo commented Nov 2, 2023

Issue Description

shap.maskers.Impute() in version 0.42.1 throws the following TypeError when used in an Explainer object, regardless of model or data type (the minimal reproducible example below uses a toy dataset, but that is not where I first encountered the problem). The same problem likely appears in other, much older issues such as #1723. It's entirely possible that I am not using Impute or Explainer properly, in which case I would appreciate any corrections. 😄

From what I can tell from the source code, Impute does not implement a __call__ method and inherits the blank __call__ from its parent class, Masker. There's a comment that it should eventually inherit from Tabular once arbitrary masking is supported. Should this be changed, or can someone expand on this? Thanks for the help!
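
For context, here is a minimal sketch of the failure mode (simplified, hypothetical code, not shap's actual classes): a __call__ with no body returns None, and the downstream masked-model code then calls len() on it, which matches the traceback below.

# simplified illustration only -- BlankMasker is a stand-in, not a shap class
class BlankMasker:
    def __call__(self, mask, x):
        pass  # no implementation, so the call implicitly returns None

masked_inputs = (BlankMasker()(None, None),)  # -> (None,)
len(masked_inputs[0])  # TypeError: object of type 'NoneType' has no len()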

P.S. Given the latest source code and release notes, it is not clear to me that 0.43.0 resolves this issue, or that the issue is resolved on the master branch.

Minimal Reproducible Example

# relevant packages
from sklearn.datasets import make_regression
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split
import shap

# setup toy data
X, y = make_regression(n_samples=100)
X_train, X_test, y_train, y_test = train_test_split(X, y, train_size=0.75)

# train toy model
model = MLPRegressor()
model.fit(X_train, y_train)
model.score(X_test, y_test)

background = shap.maskers.Impute(X_train)
# TypeError here:
explainer = shap.Explainer(model.predict, masker=background)

shap_values = explainer(X_test)
exp = shap.Explanation(shap_values.values, shap_values.base_values, shap_values.data)

Traceback

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
     17 # TypeError here:
     18 explainer = shap.Explainer(model.predict, masker=background)
---> 20 shap_values = explainer(X_test)
     21 exp = shap.Explanation(shap_values.values, shap_values.base_values, shap_values.data)

File ~/miniconda3/lib/python3.9/site-packages/shap/explainers/_permutation.py:76, in Permutation.__call__(self, max_evals, main_effects, error_bounds, batch_size, outputs, silent, *args)
     72 def __call__(self, *args, max_evals=500, main_effects=False, error_bounds=False, batch_size="auto",
     73              outputs=None, silent=False):
     74     """ Explain the output of the model on the given arguments.
     75     """
---> 76     return super().__call__(
     77         *args, max_evals=max_evals, main_effects=main_effects, error_bounds=error_bounds, batch_size=batch_size,
     78         outputs=outputs, silent=silent
     79     )

File ~/miniconda3/lib/python3.9/site-packages/shap/explainers/_explainer.py:264, in Explainer.__call__(self, max_evals, main_effects, error_bounds, batch_size, outputs, silent, *args, **kwargs)
    262     feature_names = [[] for _ in range(len(args))]
    263 for row_args in show_progress(zip(*args), num_rows, self.__class__.__name__+" explainer", silent):
--> 264     row_result = self.explain_row(
    265         *row_args, max_evals=max_evals, main_effects=main_effects, error_bounds=error_bounds,
    266         batch_size=batch_size, outputs=outputs, silent=silent, **kwargs
    267     )
    268     values.append(row_result.get("values", None))
    269     output_indices.append(row_result.get("output_indices", None))

File ~/miniconda3/lib/python3.9/site-packages/shap/explainers/_permutation.py:134, in Permutation.explain_row(self, max_evals, main_effects, error_bounds, batch_size, outputs, silent, *row_args)
    131     i += 1
    133 # evaluate the masked model
--> 134 outputs = fm(masks, zero_index=0, batch_size=batch_size)
    136 if row_values is None:
    137     row_values = np.zeros((len(fm),) + outputs.shape[1:])

File ~/miniconda3/lib/python3.9/site-packages/shap/utils/_masked_model.py:66, in MaskedModel.__call__(self, masks, zero_index, batch_size)
     64         full_masks = np.zeros((int(np.sum(masks >= 0)), self._masker_cols), dtype=bool)
     65         _convert_delta_mask_to_full(masks, full_masks)
---> 66         return self._full_masking_call(full_masks, zero_index=zero_index, batch_size=batch_size)
     68 else:
     69     return self._full_masking_call(masks, batch_size=batch_size)

File ~/miniconda3/lib/python3.9/site-packages/shap/utils/_masked_model.py:106, in MaskedModel._full_masking_call(self, masks, zero_index, batch_size)
    103     masked_inputs = (masked_inputs,)
    105 # masked_inputs = self.masker(mask, *self.args)
--> 106 num_mask_samples[i] = len(masked_inputs[0])
    108 # see which rows have been updated, so we can only evaluate the model on the rows we need to
    109 if i == 0 or self._variants is None:

TypeError: object of type 'NoneType' has no len()

Expected Behavior

No response

Bug report checklist

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest release of shap.
  • I have confirmed this bug exists on the master branch of shap.
  • I'd be interested in making a PR to fix this bug

Installed Versions

0.42.1

stompsjo added the bug label Nov 2, 2023
@CloseChoice
Collaborator

CloseChoice commented Nov 2, 2023

Thanks for the bug report. I can reproduce this on master. As far as I can see, the problem is that the Impute masker does not override the __call__ method of the base class and therefore returns None here.

@stompsjo I see that you are interested in a PR to fix this. Feel free to dive into this a bit and ask for help if you are stuck.

@stompsjo
Contributor Author

stompsjo commented Nov 3, 2023

I am happy to start a PR, but I think getting Impute working will take more than changing the parent class from Masker to Tabular. It looks like the Impute class is unfinished: I'm inferring that method='linear' is supposed to define the interpolation method, but it is not implemented. The following also appears to repeat logic from Tabular.__init__:

if data is dict and "mean" in data:
    self.mean = data.get("mean", None)
    self.cov = data.get("cov", None)
    data = np.expand_dims(data["mean"], 0)

I tried tracking down the history of this class, and I see that there is a more complete implementation in the branch shap:benchmark_utility, most recently PR'd 3 years ago in #1489, but that has not been merged into shap:master. Are @slundberg or @maggiewu19 able to provide context?

@CloseChoice
Collaborator

CloseChoice commented Nov 3, 2023

I agree that the Impute class is unfinished. IMO the scope of a first PR should be to get our public API working.
Therefore we can either remove the masker (and file an enhancement issue to implement it later) or implement it now.

For an implementation, I would just fall back to the sklearn imputers, or check the work of @maggiewu19 first.
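
As a rough illustration only (hypothetical code, not shap's API; the class name and the __call__(mask, x) signature are assumptions modelled on the Tabular-style maskers), a fallback to scikit-learn's SimpleImputer could look roughly like this:

import numpy as np
from sklearn.impute import SimpleImputer

# Hypothetical sketch: mean-imputes masked-out features from background data.
class ImputeSketch:
    def __init__(self, data, strategy="mean"):
        # fit the imputer on the full background data
        self.imputer = SimpleImputer(strategy=strategy).fit(data)

    def __call__(self, mask, x):
        # hide the masked-out features, then fill them from the background statistics
        hidden = np.where(mask, x, np.nan).reshape(1, -1)
        return (self.imputer.transform(hidden),)

With something along these lines the masker would at least return masked rows instead of None; whether that interface matches what the Explainer machinery expects would need to be checked against Tabular.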

Everything else, e.g. researching suitable imputation methods (see here or here), should be done in a separate issue.

@stompsjo stompsjo linked a pull request Nov 3, 2023 that will close this issue