Skip to content

Latest commit

 

History

History
76 lines (57 loc) · 2.28 KB

index.rst

File metadata and controls

76 lines (57 loc) · 2.28 KB

Tutorial: Shapley Value Regression

The Shapley value is a cooperative game theoretic tool used to share a resource between players.

In this tutorial we will use it to identify the importance of different variables to a linear regression. This is commonly referred to as Shaply Value Regression.

Installing CoopGT

With a working installation of Python, open a command line tool and type:

$ python -m pip install coopgt

Linear Regression

In cooperative game theory a characteristic function is a mapping from all groups of players to a given value. In this case it will correspond to the R^2 value for a linear model for some data. The y variable is going to be predicted by fitting a linear model to three variables:

y = c_1 x_1 + c_2 x_2 + c_3 x_3

Here are the R^2 values (you are welcome to see :download:`main.py </_static/data-for-shapley-regression-tutorial/main.py>` for the code used to generate them):

Model R^2
y=c_1x_1 0.075
y= c_2x_2 0.086
y= c_3x_3 0.629
y=c_1x_1 + c_2x_2 0.163
y=c_1x_1 + c_3x_3 0.63
y= c_2x_2 + c_3x_3 0.906
y=c_1x_1 + c_2x_2 + c_3x_3 0.907

Defining the characteristic function

We can use that table of R^2 values to create the characteristic function:

>>> characteristic_function = {
...     (): 0,
...     (1,): 0.075,
...     (2,): 0.086,
...     (3,): 0.629,
...     (1, 2): 0.163,
...     (1, 3): 0.63,
...     (2, 3): 0.906,
...     (1, 2, 3): 0.907,
... }

Obtaining the Shapley value

We now compute the Shapley value:

>>> import coopgt.shapley_value
>>> shapley_value = coopgt.shapley_value.calculate(characteristic_function=characteristic_function)
>>> shapley_value.round(4)
array([0.0383, 0.1818, 0.6868])

From this analysis we would conclude that the parameter that contributes the most is in fact x_3.