Causal and asymmetric Shapley values implementation #273

Draft · wants to merge 17 commits into master

Conversation

@igbucur commented Aug 25, 2021

This branch contains an implementation of causal and asymmetric Shapley values, based on the supplementary code for the paper [1]. The code is adapted from the CauSHAPley package (https://gitlab.science.ru.nl/gbucur/caushapley/).

Asymmetric Shapley values were proposed in [2] as a way to incorporate real-world causal knowledge by restricting the feature permutations considered when computing the Shapley values to those consistent with a (partial) causal ordering.
Causal Shapley values were proposed in [1] as a way to explain the total effect of features on the prediction, taking into account their causal relationships, by adapting the sampling procedure in shapr.
The two ideas can be combined to obtain asymmetric causal Shapley values. For more details, see [1].
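To make the permutation restriction concrete, here is a small language-agnostic sketch (Python for brevity; the package itself is R, and the function name here is purely illustrative). With three features and the partial ordering ({1}, {2, 3}), meaning feature 1 causally precedes features 2 and 3, only the orderings that place feature 1 first survive:

```python
from itertools import permutations

def consistent_with_ordering(perm, ordering):
    """Check that a feature permutation respects a partial causal
    ordering, given as a list of groups of feature indices: every
    feature in an earlier group must appear before every feature
    in a later group."""
    # Map each feature to the index of its causal group.
    group_of = {f: g for g, group in enumerate(ordering) for f in group}
    ranks = [group_of[f] for f in perm]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))

ordering = [[1], [2, 3]]  # feature 1 precedes features 2 and 3
valid = [p for p in permutations([1, 2, 3])
         if consistent_with_ordering(p, ordering)]
print(valid)  # [(1, 2, 3), (1, 3, 2)]
```

Of the six permutations of three features, only two respect this ordering, so features 2 and 3 never receive credit "on behalf of" their causal ancestor.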

The branch adds the following functions for computing causal Shapley values:

  • sample_causal in sampling.R
  • explain.causal in explanation.R
  • prepare_data.causal in observations.R

The branch adds the following functionality for computing asymmetric Shapley values:

  • additional branches for feature_combinations and feature_exact in features.R
  • respects_order in utils.R
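A hypothetical sketch of the respects_order logic (Python for brevity; the real helper lives in utils.R and its exact interface may differ): a coalition is consistent with a (partial) causal ordering if, whenever it contains a feature from some group, it also contains every feature from all strictly earlier groups.

```python
def respects_order(coalition, ordering):
    """Return True if the coalition (a set of feature indices)
    contains all causal ancestors of each of its members, where
    'ordering' is a list of groups of feature indices, causally
    first to last."""
    ancestors = set()
    for group in ordering:
        # A feature from this group may only appear if all features
        # from earlier groups are present as well.
        if any(f in coalition for f in group):
            if not ancestors <= set(coalition):
                return False
        ancestors |= set(group)
    return True

ordering = [[1], [2, 3]]
print(respects_order({1, 2}, ordering))  # True: ancestor 1 is included
print(respects_order({2}, ordering))    # False: 2 appears without 1
```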

Finally, the function shapr gets two new parameters:

  • asymmetric : Logical flag specifying whether we want to compute asymmetric Shapley values.
  • causal_ordering : List of vectors specifying (partial) causal ordering.

These parameters are saved in the explainer object returned by shapr, which is why the known objects in the test suite have been updated. The branch also adds a number of basic tests for the new functionality.

References:
[1] Heskes, T., Sijben, E., Bucur, I. G., & Claassen, T. (2020). Causal Shapley Values: Exploiting Causal Knowledge to Explain Individual Predictions of Complex Models. Advances in Neural Information Processing Systems, 33.
[2] Frye, C., Rowat, C., & Feige, I. (2020). Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability. Advances in Neural Information Processing Systems, 33.

@igbucur igbucur marked this pull request as draft August 25, 2021 15:47
@martinju (Member)

Thank you, @igbucur, for taking the time to prepare this PR!

I have looked at the code and iterated through the most important parts of it with some example data. I have some minor comments, but it does indeed seem to work well 👍

Before we start discussing details, I have a broader question/comment:

You have added the causal method as a new approach, which applies the practical implementation of Theorem 1 in your paper while assuming a Gaussian distribution for the data. Please correct me if I am wrong, but I don't see any reason the method should be restricted to the Gaussian distribution. We have implemented a series of other approaches for estimating the conditional distributions, and it would be great if the user could combine the causal method with any of these. Allowing that will require some changes in the main package, but from what I understand, it can be carried out by figuring out which conditional distributions need to be estimated, and in what order, and then simply looping over the different chain components, adding new sampled columns iteratively, similarly to how you did it with the Gaussian method.
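The looping idea might be sketched roughly as follows (an illustrative Python sketch under simplifying assumptions; the names are hypothetical, and sample_conditional stands in for any of the package's approach-specific samplers). It conditions each component on everything already observed or sampled in this and earlier components; handling confounding within a component, as discussed in [1], would instead marginalise out some of those variables.

```python
def sample_out_of_coalition(x_given, coalition, ordering, sample_conditional):
    """Iteratively sample the out-of-coalition features, one causal
    chain component at a time.  'x_given' maps feature index -> value
    for the in-coalition features; 'sample_conditional(targets, given)'
    is a stand-in conditional sampler returning a dict of sampled
    values for 'targets' given the dict 'given'."""
    known = dict(x_given)   # grows as components are sampled
    seen = set()
    for group in ordering:  # causally first component first
        seen |= set(group)
        targets = [f for f in group if f not in coalition]
        if targets:
            # Condition only on features from this and earlier
            # components that are already observed or sampled.
            given = {f: v for f, v in known.items() if f in seen}
            known.update(sample_conditional(targets, given))
    return {f: v for f, v in known.items() if f not in coalition}

# Toy sampler: each missing feature is drawn as the mean of what is
# already known (purely illustrative, not a real conditional sampler).
def toy_sampler(targets, given):
    m = sum(given.values()) / len(given)
    return {t: m for t in targets}

out = sample_out_of_coalition({1: 2.0}, {1}, [[1], [2, 3]], toy_sampler)
print(out)  # {2: 2.0, 3: 2.0}
```

With a single chain component containing all features, the loop runs once over the full conditioning set, reducing to the existing symmetric behaviour.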

What do you think @igbucur ? Did I miss any details regarding this possibility? If you think it is doable, I could assist you in making the appropriate modifications in the core package.

@igbucur (Author) commented Aug 26, 2021

Thank you for the feedback, @martinju. Yes, there should be no reason for the approach to be limited to the Gaussian distribution, and we could in principle use any of the approaches for estimating the conditional distributions when computing the causal Shapley values.

I think it would be doable. Perhaps it would then be better to have a causal flag, used in a similar way to asymmetric, instead of treating it as a separate approach in explain. The other approaches for which to implement causal Shapley values would then be "empirical", "copula", "ctree", and "independence"?

@martinju (Member)

Sounds good! Yes, I am thinking that whenever causal_ordering is not NULL, the causal ordering is respected with the method specified under approach (gaussian, copula, ctree, empirical or independence). We have to think a bit about the best way of implementing this. Ongoing work on implementing a "batch mode", allowing just parts of the subsets to be handled simultaneously (see #244), may also affect this a bit.

In any case, I believe the best starting point would be to create a function which "figures out" which conditional distributions need to be computed, based on the S-matrix (or X-matrix) created in the shapr function plus the definition in causal_ordering, and stores that in some list or data.table which could ultimately be used by any of the approaches in explain/prepare_data. I believe that would essentially consist of the relevant parts of the code in prepare_data.causal. Then I could figure out how to best use that object in a universal way within explain/prepare_data in the next stage. What do you think?

@igbucur (Author) commented Aug 27, 2021

Thanks for the tip. Yeah, I think this makes sense, but I'll have to give some more thought to how to implement it.

@martinju martinju added this to In progress in Towards shapr 1.0.0 Oct 4, 2021
@martinju (Member) commented Oct 5, 2021

@igbucur Are you currently working on this? If so, let me know if you want to chat about how to go about it!

@igbucur (Author) commented Oct 5, 2021

@martinju Yes, I think I'm ready to give it a go. I had a look at how to tackle the proposed extension and here are my thoughts:

  1. For the causal part, the solution is as you hinted before. I think the function you suggested would have to replace the lapply call in each of the prepare_data functions. The function would take the causal ordering as input and sample each causal chain component separately, which means it would call the available sampling functions multiple times (these would not have to be changed). In the default case, there is a single chain component containing all variables, so the new function would call the sampling functions just once for all the variables and all possible conditioning sets, as is done at the moment.

  2. For the asymmetric part, it should be even simpler. The features that do not follow the causal ordering have to be removed if asymmetric == TRUE. We could do that by passing an appropriate index_features argument to the prepare_data functions. This could probably be done in the explain function.
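Point 2 amounts to keeping only the rows of the coalition matrix that are consistent with the ordering. An illustrative Python sketch (the actual index_features handling in shapr is R, and this helper name is hypothetical), using shapr's 0/1 S-matrix representation with one row per coalition and one column per feature:

```python
def respects_order_rows(S, ordering):
    """Return indices of the rows of the 0/1 coalition matrix S that
    are consistent with the partial causal ordering: a row may switch
    on a feature only if all features from earlier groups are on."""
    keep = []
    for i, row in enumerate(S):
        earlier, ok = [], True
        for group in ordering:
            in_group = any(row[f - 1] for f in group)
            if in_group and not all(row[f - 1] for f in earlier):
                ok = False
                break
            earlier += group
        if ok:
            keep.append(i)
    return keep

S = [[0, 0], [1, 0], [0, 1], [1, 1]]        # all coalitions of 2 features
print(respects_order_rows(S, [[1], [2]]))   # [0, 1, 3]: drops {2} alone
```

The resulting index vector could then be passed as index_features so that the invalid conditioning sets are never estimated.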

What do you think? Does this approach seem reasonable?

@martinju (Member) commented Oct 6, 2021

Sounds good!

I think that what we really need are the two functions, say A(S, j) and B(S, j), which give p(X_Sbar | X_S) = \prod_j p(X_{A(S,j)} | X_{B(S,j)}) for the specified causal ordering. I believe it would be best to compute these within the shapr function and store them in some object there, which is then used in explain/prepare_data by iteratively updating the data matrix to perform prediction on.
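One way such a function could look (an illustrative Python sketch with a hypothetical name; this is the no-confounding case, where B(S, j) also includes the in-coalition features of component j, whereas the confounded case in [1] would drop those):

```python
def conditioning_sets(S, ordering):
    """For a coalition S and a causal ordering (list of chain
    components, causally first to last), return for each component j
    with out-of-coalition features the pair (A, B) such that
    p(X_Sbar | X_S) factorises as prod_j p(X_A | X_B).
    A(S, j): out-of-coalition features of component j;
    B(S, j): all features of earlier components plus the in-coalition
    features of component j (no-confounding case)."""
    S = set(S)
    sets, earlier = [], set()
    for group in ordering:
        A = set(group) - S
        B = earlier | (set(group) & S)
        if A:
            sets.append((sorted(A), sorted(B)))
        earlier |= set(group)
    return sets

print(conditioning_sets({1}, [[1], [2, 3]]))
# [([2, 3], [1])]: sample X2, X3 jointly given X1
print(conditioning_sets({2}, [[1], [2, 3]]))
# [([1], []), ([3], [1, 2])]: sample X1 marginally, then X3 given X1, X2
```

Stored per row of the S-matrix, such (A, B) pairs would tell any approach exactly which conditional samplers to call and in what order.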

In prepare_data this could be achieved either by replacing the lapply call, as you write, or by modifying the sampling functions to actually do iterative sampling. I am not sure which approach is preferable at the moment.

Note that the empirical (+ independence) methods are not constructed as an lapply around sampling functions, and ctree also requires an initial model-fitting procedure.

My main point is that I think the construction of the "routine" needed for the specific iterative sampling should be created already in the shapr function :-)

@igbucur (Author) commented Oct 6, 2021

Okay, I will think about how it could be done in the shapr function. The difficulty here, I think, is that instead of calling the sampling functions just once for each feature, you have to call them multiple times per feature, that is, once per chain component.

I was thinking about encapsulating the causal ordering functionality either in a new custom lapply or somewhere upstream (perhaps in shapr, like you suggested), in order to avoid having to reimplement this factorization every time a new sampling function is added. This way the sampling functions can stay the same, while the decision on which conditional probabilities need to be estimated and multiplied is made upstream. It might also be an idea to design a function that takes the features and the causal ordering as input and splits the features according to the different components. Perhaps this is something that could be done in shapr.

@martinju (Member) commented Oct 6, 2021

Maybe I was unclear, but the function you talk about, taking features and causal ordering as inputs, is exactly what I was thinking about putting in the shapr function. :-)

@martinju (Member) commented Oct 6, 2021

And let me know if you want me to put together a function like that in shapr. It should be rather straightforward, I think.

@LHBO mentioned this pull request May 23, 2024