BUG: Strange base value of exact explainer #3174

Closed
mayer79 opened this issue Aug 1, 2023 · 8 comments
Labels: bug (indicates an unexpected problem or unintended behaviour)

Comments

mayer79 commented Aug 1, 2023

Issue Description

I expect the baseline of the exact explainer to equal the average prediction on the background data (masker). However, there are cases where this does not hold; see the example below.

Minimal Reproducible Example

import numpy as np
import pandas as pd
import shap

n = 101
x = np.arange(n)
X = pd.DataFrame(dict(x1=x, x2=np.flip(x)))
print(X.x1.mean())  # 50
X.head()

def true_model(X):
    return X.x1

ex = shap.Explainer(true_model, masker=X, algorithm="exact")

ps = ex(X[0:3])
ps

# Output
#.values =
#array([[-50.06,   0.  ],
#       [-49.06,   0.  ],
#       [-48.06,   0.  ]])
#
#.base_values =
#array([50.06, 50.06, 50.06])
#
#.data =
#array([[  0, 100],
#       [  1,  99],
#       [  2,  98]])

Expected Behavior

The average of 0, ..., 100 is 50, so I'd expect the baseline to equal 50.

Running the above example with n = 100 gives a baseline of 49.5 (as expected).
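
For reference, the expected baseline here is just the mean model output over the full background data, which can be checked directly with the objects defined above:

# Mean prediction over the full background data, i.e. the value that
# base_values is expected to match.
print(true_model(X).mean())  # 50.0 for n = 101 (and 49.5 for n = 100)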

Bug report checklist

  • I have checked that this issue has not already been reported.
  • I have confirmed this bug exists on the latest release of shap.
  • I have confirmed this bug exists on the master branch of shap.
  • I'd be interested in making a PR to fix this bug

Installed Versions

0.42.1

mayer79 added the bug label on Aug 1, 2023
znacer (Contributor) commented Aug 14, 2023

Hi,
This seems to be caused by a hard-coded limit on the masker size.
The class handling Tabular maskers has a limit (max_samples) set to 100. Above this limit, only a subsample of the masker dataset is considered.

A possible fix could be to add a max_samples property to Explainer, making it possible for users to tune this limit.
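
A quick way to see the subsampling in action (a sketch, assuming the Independent masker keeps its default max_samples=100 and exposes the possibly subsampled background as a .data array):

# Wrapping the 101-row background in the default Independent masker
# subsamples it down to max_samples=100 rows, so the background mean
# (and hence the base value) drifts away from the exact 50.
masker = shap.maskers.Independent(X)   # default max_samples=100
print(masker.data.shape)               # expected: (100, 2)
print(masker.data[:, 0].mean())        # typically not exactly 50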

mayer79 (Author) commented Aug 17, 2023

You are right, it affects even linear explainers.
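
For completeness, a sketch of how the same cap could show up with a linear explainer (assuming scikit-learn is available and that a plain DataFrame background goes through the same Tabular masker path):

from sklearn.linear_model import LinearRegression

# Fit a model that reproduces x1 exactly, then explain it against the
# same 101-row background; the expected value would again drift from 50.
lin = LinearRegression().fit(X, X.x1)
ex_lin = shap.LinearExplainer(lin, X)
print(ex_lin.expected_value)  # typically not exactly 50 once n > 100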

CloseChoice (Collaborator) commented
@znacer That sounds like a good approach to me; I implemented your proposal.

mayer79 (Author) commented Sep 29, 2023

Awesome, thanks @znacer and @CloseChoice

connortann (Collaborator) commented
This seems to be caused by a hard-coded limit on the masker size. The class handling Tabular maskers has a limit (max_samples) set to 100. Above this limit, only a subsample of the masker dataset is considered.

A possible fix could be to add a max_samples property to Explainer, making it possible for users to tune this limit.

The explainer object already accepts a masker, so I think it would be preferable to pass in a masker with the desired number of samples rather than expose more masker params in the Explainer class.

What do you think, would this be acceptable?

ex = shap.Explainer(true_model, masker=shap.maskers.Independent(X, max_samples=1000), algorithm="exact")
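
With a masker sized like that, all 101 background rows fit under max_samples, so no subsampling happens and the base value should come out as the exact background mean. A sketch continuing from the line above:

ps = ex(X[0:3])
print(ps.base_values)  # expected: [50. 50. 50.]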

mayer79 (Author) commented Dec 4, 2023

Good idea, that would indeed be quite elegant.

connortann (Collaborator) commented
What do you think, @CloseChoice? Are you happy for me to close this issue for now, with the approach above as the recommended way to set the masker params?

CloseChoice (Collaborator) commented
@connortann, that's fine with me.
