## Model Interpreter Example Notebook

### Setup

In [1]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

import sys

sys.path.append("../../")

import model_interpreter
from model_interpreter.interpreter import ModelInterpreter

In [2]:
model_interpreter.__version__

'1.0.0'

### Prepare data

In [3]:
X, y = make_classification(
    n_samples=1000,
    n_features=4,
    n_informative=2,
    n_redundant=0,
    random_state=0,
    shuffle=False,
)


X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.33, random_state=42
)

clf = RandomForestClassifier(max_depth=2, random_state=0)
clf.fit(X_train, y_train)

In [4]:
feature_names = ["feature1", "feature2", "feature3", "feature4"]

### Build single model contribution object

A model interpreter object is created by specifying the feature names in the same order as was used to build the model.

In [5]:
single_model_contribution = ModelInterpreter(feature_names)

### Build explainer

The `fit` method must be called with the trained model to build the explainer.

In [6]:
single_model_contribution.fit(clf)

### Define single row of data to get contributions for

In [7]:
single_row = X[0]

### Use the transform method to get feature contributions

The `transform` method is called on the single row to produce the feature contributions, sorted by descending absolute contribution to the prediction. With the default `return_type` of `"name_value_dicts"`, the result is a list of dictionaries in the format:

`[{'Name': feature_name, 'Value': feature_contribution}, ... ]`

With `return_type="dicts"`, each dictionary is instead keyed by the feature name:

`[{feature_name: feature_contribution}, ... ]`

`return_feature_values` can also be set to `True` so that the return is:

`[{feature_name: (feature_value, feature_contribution)}, ... ]`

This is only implemented when `return_type` is not set to the default value of `"name_value_dicts"`.

In [8]:
contribution_list = single_model_contribution.transform(
    single_row, return_feature_values=False
)
print(contribution_list)

[{'Name': 'feature2', 'Value': -0.349129583}, {'Name': 'feature1', 'Value': -0.0039231513}, {'Name': 'feature4', 'Value': 0.0031653932}, {'Name': 'feature3', 'Value': 0.0013787609}]


In [9]:
contribution_list = single_model_contribution.transform(
    single_row, return_feature_values=True, return_type="dicts"
)
print(contribution_list)

[{'feature2': (-1.2990134593088984, -0.34912958304807074)}, {'feature1': (-1.6685316675305422, -0.0039231513013799485)}, {'feature4': (-0.6036204360190907, 0.0031653931724603453)}, {'feature3': (0.27464720361244455, 0.0013787609499948776)}]
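The `(feature_value, feature_contribution)` tuples can be unpacked for display or further processing. The following is a minimal sketch in plain Python that works on a copy of the output printed above; the `rows` list and the column layout are illustrative choices and not part of `model_interpreter`.

```python
# Output of transform(..., return_feature_values=True, return_type="dicts"),
# copied from the cell above.
contribution_list = [
    {"feature2": (-1.2990134593088984, -0.34912958304807074)},
    {"feature1": (-1.6685316675305422, -0.0039231513013799485)},
    {"feature4": (-0.6036204360190907, 0.0031653931724603453)},
    {"feature3": (0.27464720361244455, 0.0013787609499948776)},
]

# Flatten each single-key dictionary into a (name, value, contribution) row.
rows = []
for entry in contribution_list:
    for name, (value, contribution) in entry.items():
        rows.append((name, value, contribution))

# Print a small aligned table, keeping the original ordering.
print(f"{'feature':<10}{'value':>12}{'contribution':>16}")
for name, value, contribution in rows:
    print(f"{name:<10}{value:>12.4f}{contribution:>16.6f}")
```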


You can provide a `feature_mappings` dictionary, which can either map feature names to more interpretable names or group features together.

In [10]:
mapping_dict = {
    "feature1": "feature 1 was mapped",
    "feature2": "feature 2 was mapped",
    "feature3": "feature 3 was mapped",
    "feature4": "feature 4 was mapped",
}

contribution_list_mapped = single_model_contribution.transform(
    single_row, feature_mappings=mapping_dict
)
print(contribution_list_mapped)

[{'Name': 'feature 2 was mapped', 'Value': -0.349129583}, {'Name': 'feature 1 was mapped', 'Value': -0.0039231513}, {'Name': 'feature 4 was mapped', 'Value': 0.0031653932}, {'Name': 'feature 3 was mapped', 'Value': 0.0013787609}]


In the example below, `feature2` and `feature3` are mapped to the same name and are therefore grouped together. The resulting grouped contribution equals the sum of the individual feature contributions.

In [11]:
grouping_dict = {
    "feature1": "feature 1 was mapped",
    "feature2": "feature 2 and 3 was mapped",
    "feature3": "feature 2 and 3 was mapped",
    "feature4": "feature 4 was mapped",
}

contribution_list_grouped = single_model_contribution.transform(
    single_row, feature_mappings=grouping_dict
)
print(contribution_list_grouped)

[{'Name': 'feature 2 and 3 was mapped', 'Value': -0.3477508221}, {'Name': 'feature 1 was mapped', 'Value': -0.0039231513}, {'Name': 'feature 4 was mapped', 'Value': 0.0031653932}]
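The grouped value can be checked by hand: -0.349129583 + 0.0013787609 = -0.3477508221. The sketch below reproduces that sum in plain Python from the contributions printed earlier. The helper `group_contributions` is hypothetical and only illustrates the arithmetic; it is not the library's implementation.

```python
from collections import defaultdict

# Individual contributions copied from the ungrouped output above.
contributions = {
    "feature1": -0.0039231513,
    "feature2": -0.349129583,
    "feature3": 0.0013787609,
    "feature4": 0.0031653932,
}

grouping_dict = {
    "feature1": "feature 1 was mapped",
    "feature2": "feature 2 and 3 was mapped",
    "feature3": "feature 2 and 3 was mapped",
    "feature4": "feature 4 was mapped",
}


def group_contributions(contribs, mapping):
    """Sum contributions of all features that share a mapped name."""
    grouped = defaultdict(float)
    for name, value in contribs.items():
        # Features missing from the mapping keep their original name.
        grouped[mapping.get(name, name)] += value
    return dict(grouped)


grouped = group_contributions(contributions, grouping_dict)
print(grouped)
```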


There are also three `sorting` options available:
- `'abs'`, the default used in the examples above, which sorts by the absolute value of the feature contribution
- `'positive'`, which sorts the contributions in descending order
- `'label'`, which sorts in descending order if `pred_label > 0` and in ascending order if `pred_label = 0`

`n_return` can also be specified to return only the top n features according to the sorting option applied.

Some examples of how these variables are used are provided below.

In [12]:
contribution_list_abs = single_model_contribution.transform(
    single_row, feature_mappings=mapping_dict, sorting="abs", n_return=5
)
print(contribution_list_abs)

[{'Name': 'feature 2 was mapped', 'Value': -0.349129583}, {'Name': 'feature 1 was mapped', 'Value': -0.0039231513}, {'Name': 'feature 4 was mapped', 'Value': 0.0031653932}, {'Name': 'feature 3 was mapped', 'Value': 0.0013787609}]


In [13]:
contribution_list_label_pos = single_model_contribution.transform(
    single_row, feature_mappings=mapping_dict, sorting="label", pred_label=1
)
print(contribution_list_label_pos)

[{'Name': 'feature 4 was mapped', 'Value': 0.0031653932}, {'Name': 'feature 3 was mapped', 'Value': 0.0013787609}, {'Name': 'feature 1 was mapped', 'Value': -0.0039231513}, {'Name': 'feature 2 was mapped', 'Value': -0.349129583}]


In [14]:
contribution_list_label_0 = single_model_contribution.transform(
    single_row, feature_mappings=mapping_dict, sorting="label", pred_label=0
)
print(contribution_list_label_0)

[{'Name': 'feature 2 was mapped', 'Value': -0.349129583}, {'Name': 'feature 1 was mapped', 'Value': -0.0039231513}, {'Name': 'feature 3 was mapped', 'Value': 0.0013787609}, {'Name': 'feature 4 was mapped', 'Value': 0.0031653932}]
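The orderings printed above can be reproduced with Python's built-in `sorted`. This is a sketch of the sorting rules as described in this notebook, using a hypothetical helper `sort_contributions`; how `model_interpreter` implements them internally is not shown here and may differ.

```python
# Contributions copied from the outputs above, in feature order.
contributions = [
    ("feature1", -0.0039231513),
    ("feature2", -0.349129583),
    ("feature3", 0.0013787609),
    ("feature4", 0.0031653932),
]


def sort_contributions(items, sorting="abs", pred_label=None, n_return=None):
    """Order (name, contribution) pairs following the documented options."""
    if sorting == "abs":
        # Largest magnitude first, regardless of sign.
        ordered = sorted(items, key=lambda kv: abs(kv[1]), reverse=True)
    elif sorting == "positive":
        # Most positive contribution first.
        ordered = sorted(items, key=lambda kv: kv[1], reverse=True)
    elif sorting == "label":
        if pred_label is None:
            raise ValueError("pred_label is required when sorting='label'")
        # Descending for a positive label, ascending for label 0.
        ordered = sorted(items, key=lambda kv: kv[1], reverse=pred_label > 0)
    else:
        raise ValueError(f"unknown sorting option: {sorting!r}")
    # Slicing past the end is safe, so n_return=5 with 4 features returns all 4.
    return ordered if n_return is None else ordered[:n_return]


print(sort_contributions(contributions, sorting="abs", n_return=5))
print(sort_contributions(contributions, sorting="label", pred_label=1))
print(sort_contributions(contributions, sorting="label", pred_label=0))
```

Note that `n_return=5` in the cell above has no visible effect because only four features exist.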


We can also choose to return a single dictionary with the format `{feature_name: feature_contribution, ... }` or a list of tuples with the format `[(feature_name, feature_contribution), ... ]` using the `return_type` variable.

In [15]:
contribution_single_dict = single_model_contribution.transform(
    single_row, feature_mappings=mapping_dict, n_return=5, return_type="single_dict"
)
print(contribution_single_dict)

{'feature 2 was mapped': -0.34912958304807074, 'feature 1 was mapped': -0.0039231513013799485, 'feature 4 was mapped': 0.0031653931724603453, 'feature 3 was mapped': 0.0013787609499948776}


In [16]:
contribution_list_tups = single_model_contribution.transform(
    single_row, feature_mappings=mapping_dict, n_return=5, return_type="tuples"
)
print(contribution_list_tups)

[('feature 2 was mapped', -0.34912958304807074), ('feature 1 was mapped', -0.0039231513013799485), ('feature 4 was mapped', 0.0031653931724603453), ('feature 3 was mapped', 0.0013787609499948776)]
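The return formats carry the same information, so one can be converted into another after the fact. The sketch below starts from the default list of `Name`/`Value` dictionaries printed earlier and derives the other shapes in plain Python; the variable names are illustrative. The default output in this notebook is printed with fewer decimal places than the `single_dict` and `tuples` outputs, so converted values will only match those outputs to that precision.

```python
# Default-format output copied from an earlier cell.
name_value_dicts = [
    {"Name": "feature 2 was mapped", "Value": -0.349129583},
    {"Name": "feature 1 was mapped", "Value": -0.0039231513},
    {"Name": "feature 4 was mapped", "Value": 0.0031653932},
    {"Name": "feature 3 was mapped", "Value": 0.0013787609},
]

# [(feature_name, feature_contribution), ...]
tuples = [(d["Name"], d["Value"]) for d in name_value_dicts]

# {feature_name: feature_contribution, ...}
# Dictionaries preserve insertion order, so the sort order is retained.
single_dict = dict(tuples)

# [{feature_name: feature_contribution}, ...]
dicts = [{name: value} for name, value in tuples]

print(tuples)
print(single_dict)
print(dicts)
```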


### Predict for class 1 vs class 0

When generating SHAP values you can select which class to return contributions for using `predict_class`. The default value is 1, i.e. the positive class. Because this is a binary classification problem, the contributions for class 0 should be the negatives of the contributions for class 1.

In [17]:
single_model_contribution = ModelInterpreter(feature_names)
single_model_contribution.fit(clf)

contribution_list = single_model_contribution.transform(
    single_row, return_feature_values=False
)
print(contribution_list)

[{'Name': 'feature2', 'Value': -0.349129583}, {'Name': 'feature1', 'Value': -0.0039231513}, {'Name': 'feature4', 'Value': 0.0031653932}, {'Name': 'feature3', 'Value': 0.0013787609}]


In [18]:
single_model_contribution = ModelInterpreter(feature_names)
single_model_contribution.fit(clf)

contribution_list = single_model_contribution.transform(
    single_row, return_feature_values=False, predict_class=0
)
print(contribution_list)

[{'Name': 'feature2', 'Value': 0.349129583}, {'Name': 'feature1', 'Value': 0.0039231513}, {'Name': 'feature4', 'Value': -0.0031653932}, {'Name': 'feature3', 'Value': -0.0013787609}]
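The sign flip between the two outputs can be verified directly. The following sketch compares the printed values for class 1 and class 0; `are_negated` is a hypothetical helper written for this check and is not part of `model_interpreter`.

```python
import math

# Contributions copied from the two outputs above.
class_1 = {
    "feature2": -0.349129583,
    "feature1": -0.0039231513,
    "feature4": 0.0031653932,
    "feature3": 0.0013787609,
}
class_0 = {
    "feature2": 0.349129583,
    "feature1": 0.0039231513,
    "feature4": -0.0031653932,
    "feature3": -0.0013787609,
}


def are_negated(a, b, tol=1e-9):
    """Return True if both mappings share keys and every value in b equals -a."""
    if a.keys() != b.keys():
        return False
    return all(math.isclose(a[k], -b[k], abs_tol=tol) for k in a)


print(are_negated(class_1, class_0))
```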
