
ENH: Add Winter Values to Partition Explainer #3666

Open

wants to merge 41 commits into base: master
Conversation

@CousinThrockmorton CousinThrockmorton commented May 19, 2024

Overview

Closes #3638

Description of the changes proposed in this pull request:

Extend SHAP to calculate feature contributions for coalitions of features: Owen and Winter values. Having reviewed the PartitionExplainer and PermutationExplainer code, I am not aware of any current package or method that allows users to calculate these values with context-specific coalition structures.

The current PartitionExplainer relies on the scipy.cluster.hierarchy package to reduce the permutations required by the Shapley value, restricting the model calls to those consistent with coalitions derived from feature closeness. Therefore, it computes not the Owen value, as stated, but the Winter value for the hierarchical clustering.
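To make concrete what kind of clustering the PartitionExplainer consumes, here is a minimal sketch (toy data and correlations are made up) of building the scipy linkage; the resulting matrix is strictly binary, which is exactly the restriction this PR wants to lift:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# toy data: four features forming two highly correlated pairs
rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, a + 0.01 * rng.normal(size=100),
                     b, b + 0.01 * rng.normal(size=100)])

# cluster the features (rows of X.T) by correlation distance
Z = linkage(X.T, method="complete", metric="correlation")
print(Z.shape)  # (n_features - 1, 4): one binary merge per row
```

A matrix of this shape is, as far as I can tell, what shap's Partition masker takes as its clustering input; since each row merges exactly two nodes, non-binary groupings cannot be expressed in it directly.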

Checklist

  • Adapt utils.masked_models to accept custom non-binary hierarchies / translate them to binary trees and create masks of them at the specified level. (This turned out not to be possible and is computationally an inferior approach.)

  • Simplify explainer._partition.py code for faster execution and clarify code documentation.

  • All pre-commit checks pass.

  • Unit tests added (if fixing a bug or adding a new feature)

@CloseChoice
Collaborator

CloseChoice commented Jun 3, 2024

I am looking into this PR right now. It's a bit hard to see the differences/similarities to the existing implementation; I will try to change a few things there. Could you please provide a non-paywalled link to the Winter value formula? I am not really familiar with the concept and didn't find a definition in the papers you linked in the issue.

EDIT: is there a test case we can implement? We need this for new functionality.

@CousinThrockmorton
Author

CousinThrockmorton commented Jun 4, 2024

https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=90811fdcda3f3243fa57c0e2d14926a70e5858b1

Here is the link without a paywall; the description of the value can be found on pages 22-23, which describe Winter (1989). Sorry for the slow progress; I have been going over the current implementation and am a bit new to this.

I believe the PartitionExplainer and the ExactExplainer with the clustering option both calculate the Winter value, not the Owen value (which only respects flat clusters) as stated in the documentation, because the permutations used respect the whole clustering tree created by scipy. I believe the logic behind this approximation is that the marginals for correlated features will be similar with respect to the other correlated features, so we can reduce the number of permutations.

The point I am stuck at is that the scipy binary hierarchy is not sufficient for passing user-defined hierarchies. However, I am not sure what the most user-friendly way to define a hierarchy would be, so that it can be validated and used easily. I will base my approach on the ExactExplainer, since with the clustering option it is faster than the PartitionExplainer, and write a new make_masks function (the methods mainly differ in how they use the results of this function to get to the same point); the rest should stay the same. Do you think this option should be added to the ExactExplainer or to the Partition method?

For example, for the features [1,2,3,4], the exact method calculates the Shapley value of feature 1 from the coalition pairs:

[1] - []
[1,2] - [2]
[1,3] - [3]
[1,4] - [4]
[1,2,3] - [2,3]
[1,3,4] - [3,4]
[1,2,4] - [2,4]
[1,2,3,4] - [2,3,4]
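The enumeration above can be sketched directly (a hypothetical helper for illustration, not the shap API):

```python
from itertools import combinations

def exact_coalitions(features, target):
    """Enumerate every pair (S + {target}, S) the exact method evaluates,
    where S ranges over all subsets of the remaining features."""
    others = [f for f in features if f != target]
    pairs = []
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            pairs.append((sorted([target, *S]), sorted(S)))
    return pairs

for with_t, without_t in exact_coalitions([1, 2, 3, 4], 1):
    print(with_t, "-", without_t)
```

This yields all 2^(n-1) = 8 pairs for feature 1, matching the list above.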

With the flat clustering [[1,2],[3,4]], the Owen value would use:

[1] - []
[1,2] - [2]
[1,3,4] - [3,4]
[1,2,3,4] - [2,3,4]

For the Winter value, take the nested clustering [[1,2],[[3,4],[5]]]. I hope this structure makes the hierarchy clear; essentially we are talking about nested partitions of features.

[1] - []
[1,2] - [2]
[1,3,4,5] - [3,4,5]
[1,2,3,4,5] - [2,3,4,5]
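For illustration, one way to enumerate these coalitions from a nested-list hierarchy (hypothetical helpers, not part of shap): each coalition is a union of whole sibling clusters found on the path from the root down to the feature.

```python
from itertools import product

def flatten(node):
    """Leaves of a nested-list hierarchy, left to right."""
    return [node] if not isinstance(node, list) else [f for c in node for f in flatten(c)]

def winter_coalitions(tree, feature):
    """Coalitions paired with `feature`: every union of whole sibling
    clusters on the path from the root down to the feature's leaf."""
    blocks, node = [], tree
    while isinstance(node, list):
        idx = next(i for i, c in enumerate(node) if feature in flatten(c))
        blocks.extend(flatten(c) for i, c in enumerate(node) if i != idx)
        node = node[idx]
    coalitions = []
    for include in product([False, True], repeat=len(blocks)):
        coalitions.append(sorted(f for blk, inc in zip(blocks, include) if inc for f in blk))
    return sorted(coalitions, key=lambda c: (len(c), c))

print(winter_coalitions([[1, 2], [[3, 4], [5]]], 1))
# [[], [2], [3, 4, 5], [2, 3, 4, 5]]
```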

So the feature's marginal contribution is measured with respect to the cluster it is part of and with respect to the clusters it is external to. And this is exactly what the current method does with the binary clustering trees. For the notebook on the PartitionExplainer, this is the clustering used.

[Figure: dendrogram of the feature clustering used in the PartitionExplainer notebook]

For the feature HouseAge, the following masks are used, which is what the Winter value specifies.

X.columns = Index(['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population', 'AveOccup', 'Latitude', 'Longitude'])

```
(marginal to the external tree) [12]
[ True, False, False, False,  True,  True, False,  True]
[ True,  True, False, False,  True,  True, False,  True]
(marginal with all turned on) [12, 10]
[ True, False,  True,  True,  True,  True,  True,  True]
[ True,  True,  True,  True,  True,  True,  True,  True]
(marginal with all off) []
[False, False, False, False, False, False, False, False]
[False,  True, False, False, False, False, False, False]
(marginal to the internal tree) [10]
[False, False,  True,  True, False, False,  True, False]
[False,  True,  True,  True, False, False,  True, False]
```

Or for Latitude:

```
External [12, 1]
[ True  True False False  True  True False  True]
[ True  True False False  True  True  True  True]
All turned on [12, 1, 8]
[ True  True  True  True  True  True False  True]
[ True  True  True  True  True  True  True  True]
External tree [12]
[ True False False False  True  True False  True]
[ True False False False  True  True  True  True]
All but HouseAge [12, 8]
[ True False  True  True  True  True False  True]
[ True False  True  True  True  True  True  True]
To the HouseAge [1]
[False  True False False False False False False]
[False  True False False False False  True False]
Internal [1, 8]
[False  True  True  True False False False False]
[False  True  True  True False False  True False]
All turned off []
[False False False False False False False False]
[False False False False False False  True False]
Local cluster [8]
[False False  True  True False False False False]
[False False  True  True False False  True False]
```

A couple of examples of hierarchies I would like to pass: [[1,2],[3,4,5]] or [[1,2],[[3,4],[5,6]]], which cannot necessarily be translated to a scipy hierarchy. I hope this clarifies what I would like to do, as well as the current state of the ExactExplainer and PartitionExplainer methods, and that you can give me some guidance on how to proceed. I really appreciate your help; apologies for the long reply. Hope you have a good day!
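If nested Python lists end up being the user-facing format, a simple input check could confirm the structure is a proper nested partition. A sketch (the function name is hypothetical):

```python
def validate_hierarchy(tree, features):
    """Check that a nested-list hierarchy is a nested partition of
    `features`: every feature appears exactly once as a leaf."""
    leaves = []
    def walk(node):
        if isinstance(node, list):
            if not node:
                raise ValueError("empty cluster in hierarchy")
            for child in node:
                walk(child)
        else:
            leaves.append(node)
    walk(tree)
    if len(leaves) != len(set(leaves)):
        raise ValueError("a feature appears in more than one cluster")
    if set(leaves) != set(features):
        raise ValueError("hierarchy leaves do not match the feature set")

validate_hierarchy([[1, 2], [[3, 4], [5, 6]]], {1, 2, 3, 4, 5, 6})  # passes silently
```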

@CloseChoice
Collaborator

Thanks, this helps a lot. Will take a deeper look in the next week and we can follow this up here.

@CousinThrockmorton
Author

CousinThrockmorton commented Jun 10, 2024

Thanks for the help. You can find most of my progress in the Non-binary_tree_masking notebook under notebooks. I am trying to figure out how to generate all combinations of masks of top-node siblings and local siblings. In the current partition method this is done by traversing the tree, adding to each node's mask its sibling's mask on the left and right, then traversing the coarser partition on the left and right and going lower where possible (see my comment above). This way we end up with all possible combinations of top siblings and own-path siblings, in line with the Winter value calculation.

Now this is slightly harder with n-ary trees, as there is no simple left and right. At each level we need to calculate all possible combinations of masks, then refine each of these combinations at the lower level one by one?
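One way the n-ary case could be handled is to collect, per feature, the sibling clusters along its root-to-leaf path and switch every subset of them on, emitting paired boolean masks like the True/False rows listed above. A sketch with made-up helper names (0-indexed features):

```python
import numpy as np
from itertools import product

def flatten(node):
    return [node] if not isinstance(node, list) else [f for c in node for f in flatten(c)]

def winter_mask_pairs(tree, feature, n_features):
    """For each coalition allowed by the hierarchy, yield the pair of
    boolean masks (feature off, feature on) whose difference in model
    output is the feature's marginal contribution."""
    # collect the sibling blocks on the path from the root to `feature`
    blocks, node = [], tree
    while isinstance(node, list):
        idx = next(i for i, c in enumerate(node) if feature in flatten(c))
        blocks.extend(flatten(c) for i, c in enumerate(node) if i != idx)
        node = node[idx]
    # switch every subset of those blocks on, with the feature off/on
    for include in product([False, True], repeat=len(blocks)):
        off = np.zeros(n_features, dtype=bool)
        for blk, inc in zip(blocks, include):
            if inc:
                off[blk] = True
        on = off.copy()
        on[feature] = True
        yield off, on
```

For tree = [[0,1],[[2,3],[4]]] and feature 0 this yields four mask pairs, one per coalition, regardless of how many children each node has.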

@CloseChoice
Collaborator

CloseChoice commented Jun 18, 2024

The paper without a paywall doesn't work anymore. I read Winter (1989) but did not find a concrete formula that can be implemented in a straightforward fashion. Could you point me to that?

Also I would think that we are missing some combinations here:

The Winter value [[1,2],[[3,4],[5]]]. I hope this structure makes the hierarchy clear, essentially we are talking of nested partitions of features.
[1] - []
[1,2] -[2]
[1,3,4,5] -[3,4,5]
[1,2,3,4,5] -[2,3,4,5]

Aren't we missing:
[1, 5] - [5]
[1, 3, 4] - [3, 4]
?

@CousinThrockmorton
Author

Hey, thanks for looking into the paper. Here is another publication that interprets the use of the PartitionExplainer: https://towardsdatascience.com/shaps-partition-explainer-for-language-models-ec2e7a6c1b77. The formula is the same as for Shapley values: the average of marginals. In fact there is an interpretation that Winter values are a recursive application of Shapley values with respect to the coalitions.

That is, for the coalition structure [[1,2],[[3,4],[5]]], one way to get the Winter values is to consider [1,2] and [[3,4],[5]] as the players and calculate their Shapley values, getting a value for each cluster. Then use the assigned value of each cluster as the payoff to calculate Shapley values among its constituent parts, for instance between [3,4] and [5]. And so on and so forth.
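This recursive description is equivalent to averaging marginal contributions over all orderings in which every cluster stays contiguous. A brute-force sketch with my own helper names (exponential cost, for illustration only, not an efficient implementation):

```python
from itertools import permutations, product

def flatten(node):
    """Leaves of a nested-list hierarchy, left to right."""
    return [node] if not isinstance(node, list) else [f for c in node for f in flatten(c)]

def consistent_orders(node):
    """All orderings of the leaves in which every cluster of the
    hierarchy appears as one contiguous block."""
    if not isinstance(node, list):
        return [[node]]
    results = []
    for perm in permutations(node):
        for choice in product(*(consistent_orders(c) for c in perm)):
            results.append([f for block in choice for f in block])
    return results

def winter_values(v, tree):
    """Average each feature's marginal contribution v(P + {f}) - v(P)
    over all hierarchy-consistent orderings."""
    phi = {f: 0.0 for f in flatten(tree)}
    orders = consistent_orders(tree)
    for order in orders:
        prefix = []
        for f in order:
            phi[f] += v(prefix + [f]) - v(prefix)
            prefix.append(f)
    return {f: total / len(orders) for f, total in phi.items()}
```

For an additive game such as v(S) = sum(S), every feature should simply recover its own value, which is a useful sanity check for any faster implementation.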

This recursive method could work. However, I believe the direct option may be more straightforward. The idea is to consider external and internal coalitions to get the value: external coalitions, at the top level, assign the feature's value with respect to the other coalitions; the internal coalitions then assign the in-group value to the feature. So the answer to your question is no, as [[3,4],[5]] is external to the cluster [1,2] and forms one cluster.
