
ENH: Add Winter Values to Partition Explainer #3666

Open

wants to merge 41 commits into base: master
Conversation

@CousinThrockmorton CousinThrockmorton commented May 19, 2024

Overview

Closes #3638

Description of the changes proposed in this pull request:

Extend SHAP to calculate feature contributions for coalitions of features: Owen and Winter values. Having reviewed the PartitionExplainer and PermutationExplainer code, I am not aware of any current package or method that allows users to calculate these values with context-specific coalition structures.

The current PartitionExplainer relies on the scipy.cluster.hierarchy package to reduce the permutations required by the Shapley value, restricting the model calls to those consistent with coalitions derived from feature closeness. Therefore, it computes not the Owen value, as stated, but the Winter value for the hierarchical clustering.
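To make concrete what kind of clustering the PartitionExplainer consumes, here is a minimal sketch (toy data and correlations are made up) of building the scipy linkage; the resulting matrix is strictly binary, which is exactly the restriction this PR wants to lift:

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

# toy data: four features forming two highly correlated pairs
rng = np.random.default_rng(0)
a = rng.normal(size=100)
b = rng.normal(size=100)
X = np.column_stack([a, a + 0.01 * rng.normal(size=100),
                     b, b + 0.01 * rng.normal(size=100)])

# cluster the features (rows of X.T) by correlation distance
Z = linkage(X.T, method="complete", metric="correlation")
print(Z.shape)  # (n_features - 1, 4): one binary merge per row
```

A matrix of this shape is, as far as I can tell, what shap's Partition masker takes as its clustering input; since each row merges exactly two nodes, non-binary groupings cannot be expressed in it directly.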

Checklist

  • Adapt utils.masked_models to accept custom non-binary hierarchies / translate them to binary trees and create masks of them at the specified level. (This turned out not to be possible and is computationally an inferior approach.)

  • Simplify explainer._partition.py code for faster execution and clarify code documentation.

  • All pre-commit checks pass.

  • Unit tests added (if fixing a bug or adding a new feature)

@CloseChoice
Collaborator

CloseChoice commented Jun 3, 2024

I am looking into this PR right now. It's a bit hard to see the differences/similarities to the existing implementation; I will try to change a few things there. Could you please provide a non-paywalled link to the Winter value formula? I am not really familiar with the concept and didn't find a definition in the papers you linked in the issue.

EDIT: is there a test case we can implement? We need this for new functionality.

@CousinThrockmorton
Author

CousinThrockmorton commented Jun 4, 2024

https://citeseerx.ist.psu.edu/document?repid=rep1&type=pdf&doi=90811fdcda3f3243fa57c0e2d14926a70e5858b1

Here is the link without a paywall; the description of the value can be found on pages 22-23, which describe Winter (1989). Sorry for the slow progress; I have been going over the current implementation and am a bit new to this.

I believe the PartitionExplainer and the ExactExplainer with the clustering option both calculate the Winter value, not the Owen value (which only respects flat clusters) as stated in the documentation, because the permutations used respect the whole clustering tree created by scipy. I believe the logic behind this approximation is that the marginals for correlated features will be similar with respect to the other correlated features, so we can reduce the number of permutations.

The point I am stuck at is that the scipy binary hierarchy is not sufficient for passing user-defined hierarchies. However, I am not sure what the most user-friendly way to define a hierarchy would be, so that it can be validated and used easily. I will base my approach on the ExactExplainer, since with the clustering option it is faster than the PartitionExplainer, and write a new make_masks function (the methods mainly differ in how they use the results of this function to get to the same point); the rest should stay the same. Do you think this option should be added to the ExactExplainer or to the Partition method?

For example, for the features [1,2,3,4], the exact method calculates the Shapley value of feature 1 from the coalition pairs:

[1] - []
[1,2] - [2]
[1,3] - [3]
[1,4] - [4]
[1,2,3] - [2,3]
[1,3,4] - [3,4]
[1,2,4] - [2,4]
[1,2,3,4] - [2,3,4]
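The enumeration above can be sketched directly (a hypothetical helper for illustration, not the shap API):

```python
from itertools import combinations

def exact_coalitions(features, target):
    """Enumerate every pair (S + {target}, S) the exact method evaluates,
    where S ranges over all subsets of the remaining features."""
    others = [f for f in features if f != target]
    pairs = []
    for r in range(len(others) + 1):
        for S in combinations(others, r):
            pairs.append((sorted([target, *S]), sorted(S)))
    return pairs

for with_t, without_t in exact_coalitions([1, 2, 3, 4], 1):
    print(with_t, "-", without_t)
```

This yields all 2^(n-1) = 8 pairs for feature 1, matching the list above.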

With the flat clustering [[1,2],[3,4]], the Owen value would use:

[1] - []
[1,2] - [2]
[1,3,4] - [3,4]
[1,2,3,4] - [2,3,4]

For the Winter value, take the nested clustering [[1,2],[[3,4],[5]]]. I hope this structure makes the hierarchy clear; essentially we are talking about nested partitions of features.

[1] - []
[1,2] - [2]
[1,3,4,5] - [3,4,5]
[1,2,3,4,5] - [2,3,4,5]
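For illustration, one way to enumerate these coalitions from a nested-list hierarchy (hypothetical helpers, not part of shap): each coalition is a union of whole sibling clusters found on the path from the root down to the feature.

```python
from itertools import product

def flatten(node):
    """Leaves of a nested-list hierarchy, left to right."""
    return [node] if not isinstance(node, list) else [f for c in node for f in flatten(c)]

def winter_coalitions(tree, feature):
    """Coalitions paired with `feature`: every union of whole sibling
    clusters on the path from the root down to the feature's leaf."""
    blocks, node = [], tree
    while isinstance(node, list):
        idx = next(i for i, c in enumerate(node) if feature in flatten(c))
        blocks.extend(flatten(c) for i, c in enumerate(node) if i != idx)
        node = node[idx]
    coalitions = []
    for include in product([False, True], repeat=len(blocks)):
        coalitions.append(sorted(f for blk, inc in zip(blocks, include) if inc for f in blk))
    return sorted(coalitions, key=lambda c: (len(c), c))

print(winter_coalitions([[1, 2], [[3, 4], [5]]], 1))
# [[], [2], [3, 4, 5], [2, 3, 4, 5]]
```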

So the feature's marginal contribution is measured with respect to the cluster it is part of and with respect to the clusters it is external to. And this is exactly what the current method does with the binary clustering trees. For the notebook on the PartitionExplainer, this is the clustering used.

[Figure: dendrogram of the feature clustering used in the PartitionExplainer notebook]

For the feature HouseAge, the following masks are used, which is what the Winter value specifies.

X.columns = Index(['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population', 'AveOccup', 'Latitude', 'Longitude'])

```
(marginal to the external tree) [12]
[ True, False, False, False,  True,  True, False,  True]
[ True,  True, False, False,  True,  True, False,  True]
(marginal with all turned on) [12, 10]
[ True, False,  True,  True,  True,  True,  True,  True]
[ True,  True,  True,  True,  True,  True,  True,  True]
(marginal with all off) []
[False, False, False, False, False, False, False, False]
[False,  True, False, False, False, False, False, False]
(marginal to the internal tree) [10]
[False, False,  True,  True, False, False,  True, False]
[False,  True,  True,  True, False, False,  True, False]
```

Or for Latitude:

```
External [12, 1]
[ True  True False False  True  True False  True]
[ True  True False False  True  True  True  True]
All turned on [12, 1, 8]
[ True  True  True  True  True  True False  True]
[ True  True  True  True  True  True  True  True]
External tree [12]
[ True False False False  True  True False  True]
[ True False False False  True  True  True  True]
All but HouseAge [12, 8]
[ True False  True  True  True  True False  True]
[ True False  True  True  True  True  True  True]
To the HouseAge [1]
[False  True False False False False False False]
[False  True False False False False  True False]
Internal [1, 8]
[False  True  True  True False False False False]
[False  True  True  True False False  True False]
All turned off []
[False False False False False False False False]
[False False False False False False  True False]
Local cluster [8]
[False False  True  True False False False False]
[False False  True  True False False  True False]
```

A couple of examples of hierarchies I would like to pass: [[1,2],[3,4,5]] or [[1,2],[[3,4],[5,6]]], which cannot necessarily be translated to a scipy hierarchy. I hope this clarifies what I would like to do, as well as the current state of the ExactExplainer and PartitionExplainer methods, and that you can give me some guidance on how to proceed. I really appreciate your help; apologies for the long reply. Hope you have a good day!
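If nested Python lists end up being the user-facing format, a simple input check could confirm the structure is a proper nested partition. A sketch (the function name is hypothetical):

```python
def validate_hierarchy(tree, features):
    """Check that a nested-list hierarchy is a nested partition of
    `features`: every feature appears exactly once as a leaf."""
    leaves = []
    def walk(node):
        if isinstance(node, list):
            if not node:
                raise ValueError("empty cluster in hierarchy")
            for child in node:
                walk(child)
        else:
            leaves.append(node)
    walk(tree)
    if len(leaves) != len(set(leaves)):
        raise ValueError("a feature appears in more than one cluster")
    if set(leaves) != set(features):
        raise ValueError("hierarchy leaves do not match the feature set")

validate_hierarchy([[1, 2], [[3, 4], [5, 6]]], {1, 2, 3, 4, 5, 6})  # passes silently
```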

@CloseChoice
Collaborator

Thanks, this helps a lot. Will take a deeper look in the next week and we can follow this up here.

@CousinThrockmorton
Author

CousinThrockmorton commented Jun 10, 2024

Thanks for the help. You can find most of my progress in the Non-binary_tree_masking notebook under notebooks. I am trying to figure out how to generate all combinations of masks of top-node siblings and local siblings. In the current partition method this is done by traversing the tree, adding to each node's mask its sibling's mask on the left and right, then traversing the coarser partition on the left and right and going lower where possible (see my comment above). This way we end up with all possible combinations of top siblings and own-path siblings, in line with the Winter value calculation.

Now this is slightly harder with n-ary trees, as there is no simple left and right. At each level we need to calculate all possible combinations of masks, then refine each of these combinations at the lower level one by one?
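One way the n-ary case could be handled is to collect, per feature, the sibling clusters along its root-to-leaf path and switch every subset of them on, emitting paired boolean masks like the True/False rows listed above. A sketch with made-up helper names (0-indexed features):

```python
import numpy as np
from itertools import product

def flatten(node):
    return [node] if not isinstance(node, list) else [f for c in node for f in flatten(c)]

def winter_mask_pairs(tree, feature, n_features):
    """For each coalition allowed by the hierarchy, yield the pair of
    boolean masks (feature off, feature on) whose difference in model
    output is the feature's marginal contribution."""
    # collect the sibling blocks on the path from the root to `feature`
    blocks, node = [], tree
    while isinstance(node, list):
        idx = next(i for i, c in enumerate(node) if feature in flatten(c))
        blocks.extend(flatten(c) for i, c in enumerate(node) if i != idx)
        node = node[idx]
    # switch every subset of those blocks on, with the feature off/on
    for include in product([False, True], repeat=len(blocks)):
        off = np.zeros(n_features, dtype=bool)
        for blk, inc in zip(blocks, include):
            if inc:
                off[blk] = True
        on = off.copy()
        on[feature] = True
        yield off, on
```

For tree = [[0,1],[[2,3],[4]]] and feature 0 this yields four mask pairs, one per coalition, regardless of how many children each node has.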

@CloseChoice
Collaborator

CloseChoice commented Jun 18, 2024

The paper without a paywall doesn't work anymore. I read Winter (1989) but did not find a concrete formula that can be implemented in a straightforward fashion. Could you point me to that?

Also I would think that we are missing some combinations here:

The Winter value [[1,2],[[3,4],[5]]]. I hope this structure makes the hierarchy clear, essentially we are talking of nested partitions of features.
[1] - []
[1,2] -[2]
[1,3,4,5] -[3,4,5]
[1,2,3,4,5] -[2,3,4,5]

Aren't we missing:
[1, 5] - [5]
[1, 3, 4] - [3, 4]
?

@CousinThrockmorton
Author

Hey, thanks for looking into the paper. Here is another publication that interprets the use of the PartitionExplainer: https://towardsdatascience.com/shaps-partition-explainer-for-language-models-ec2e7a6c1b77. The formula is the same as for Shapley values: the average of marginals. In fact there is an interpretation that Winter values are a recursive application of Shapley values with respect to the coalitions.

That is, for the coalition structure [[1,2],[[3,4],[5]]], one way to get the Winter values is to consider [1,2] and [[3,4],[5]] as the players and calculate their Shapley values, getting a value for each cluster. Then use the assigned value of each cluster as the payoff to calculate Shapley values among its constituent parts, for instance between [3,4] and [5]. And so on and so forth.
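This recursive description is equivalent to averaging marginal contributions over all orderings in which every cluster stays contiguous. A brute-force sketch with my own helper names (exponential cost, for illustration only, not an efficient implementation):

```python
from itertools import permutations, product

def flatten(node):
    """Leaves of a nested-list hierarchy, left to right."""
    return [node] if not isinstance(node, list) else [f for c in node for f in flatten(c)]

def consistent_orders(node):
    """All orderings of the leaves in which every cluster of the
    hierarchy appears as one contiguous block."""
    if not isinstance(node, list):
        return [[node]]
    results = []
    for perm in permutations(node):
        for choice in product(*(consistent_orders(c) for c in perm)):
            results.append([f for block in choice for f in block])
    return results

def winter_values(v, tree):
    """Average each feature's marginal contribution v(P + {f}) - v(P)
    over all hierarchy-consistent orderings."""
    phi = {f: 0.0 for f in flatten(tree)}
    orders = consistent_orders(tree)
    for order in orders:
        prefix = []
        for f in order:
            phi[f] += v(prefix + [f]) - v(prefix)
            prefix.append(f)
    return {f: total / len(orders) for f, total in phi.items()}
```

For an additive game such as v(S) = sum(S), every feature should simply recover its own value, which is a useful sanity check for any faster implementation.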

This recursive method could work. However, I believe the direct option may be more straightforward. The idea is to consider external and internal coalitions to get the value: external coalitions, at the top level, assign the feature's value with respect to the other coalitions; the internal coalitions then assign the in-group value to the feature. So the answer to your question is no, as [[3,4],[5]] is external to the cluster [1,2] and forms one cluster.
