![](https://cdn-images-1.medium.com/max/800/0*yoBvg_ik5WZTQkLq.)
Image source *https://cdn-images-1.medium.com/max/800/0yoBvg_ik5WZTQkLq.*

Machine learning methods have long been criticized as black boxes, or "un-interpretable". The growing popularity and complexity of these models makes the problem worse. There is usually a trade-off between accuracy and interpretability.

![](https://image.slidesharecdn.com/kasiainterpretablemachinelearning-171219021018/95/interpretable-machine-learning-using-lime-framework-kasia-kulma-phd-data-scientist-aviva-16-638.jpg?cb=1513658331)

In some domains, especially finance and healthcare, data scientists often end up using more traditional machine learning models (linear or tree-based), because the business must be able to explain each and every decision taken by the model. However, this often means sacrificing performance. Complex models such as ensembles and neural networks typically give better, more accurate predictions (since true relationships are rarely linear in nature), but we end up unable to properly interpret their decisions. I try to address this gap by using **model-agnostic methods**, which are independent of the model being used.

The question you might find yourself asking at this point is: ***"Why do I need an interpretable model? I don't work in finance!"***

> **Human curiosity and learning**- When you show your magic-like model to someone, the first thing that person will ask is, *How did it do this?* What are you going to say: *I don't know, I just fed the input to the model, tuned some hyperparameters and got this output*? OF COURSE NOT!

> **Building trust**- When you are selling your ML product to a prospective buyer, why should they trust your model? How can they know the model will produce good results under all circumstances? Interpretability is required to increase the social acceptance of ML models in our day-to-day lives.

> **Debugging**- When you are trying to explain an unexpected result or find a bug in your model, interpretability becomes very useful.


If you are not convinced yet, you can read a [story](https://christophm.github.io/interpretable-ml-book/storytime.html) to convince yourself.

Interpretability methods are basically of two types:-

* **Model-Specific**- Model-specific interpretation tools are limited to particular model classes and rely on the internals of each one. Examples are coefficients, p-values and AIC scores for a regression model, rules from a decision tree and so on. The interpretation of regression weights in a linear model is a model-specific interpretation, since, by definition, the interpretation of intrinsically interpretable models is always model-specific.

* **Model-Agnostic**- Model-agnostic tools can be used on any machine learning model and are applied after the model has been trained (post hoc). These agnostic methods usually work by analyzing feature input and output pairs. By definition, these methods cannot have access to model internals such as weights or structural information.

In this kernel, we will focus on model-agnostic methods.

I'm using the Pima Indians Diabetes dataset to classify whether a person has diabetes or not, based on some features.


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import os
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split,GridSearchCV
print(os.listdir("../input"))

In [None]:
sns.set(style="white", palette="colorblind", font_scale=1.2, 
        rc={"figure.figsize":(12,9)})
RANDOM_STATE = 420 # set everything for the plots
N_JOBS=8

In [None]:
df = pd.read_csv("../input/pima-indians-diabetes-database/diabetes.csv")
df.head()

Let's get to know our features:
 * *Pregnancies*- Number of past pregnancies of the patient
 * *Glucose*- Plasma glucose concentration (mg/dL)
 * *Blood Pressure*- Diastolic blood pressure (mm Hg)
 * *Skin Thickness*- Triceps skin fold thickness (mm)
 * *Insulin*- 2-Hour serum insulin (mu U/ml)
 * *BMI*- Body mass index (weight in kg/(height in m)^2)
 * *Diabetes Pedigree Function*- A score that summarizes the history of diabetes in the patient's family; higher values indicate a stronger hereditary risk.
 * *Age*- Age (years)
 * *Outcome*- Whether a person has diabetes or not (0=No, 1=Yes)
 
 Insulin helps a person use the glucose in the blood to release energy. In a diabetic person, there isn't enough insulin to do that, so most of the glucose remains in the blood.
 

In [None]:
features=["Pregnancies","Glucose","BloodPressure","SkinThickness","Insulin","BMI","DiabetesPedigreeFunction","Age"]

In [None]:
classes=['non-diabetic','diabetic']

In [None]:
target=df['Outcome']
df=df.drop(labels=['Outcome'],axis=1)
X_train, X_test, y_train, y_test = train_test_split(df, target, test_size=0.2, random_state=42)

Note that training is not our main motive; our motive is to interpret the results.

In [None]:
rfc=RandomForestClassifier(random_state=1234)
rfc.fit(X_train,y_train)


How does the model do on our test data?

In [None]:
rfc.score(X_test,y_test)

Not bad! 

Let's do some hyperparameter tuning.

In [None]:
parameters={"n_estimators":[10,20,50,100,200],
           "max_depth":[2,3],
           "min_samples_split":[2,3,4],
           # 'auto' meant 'sqrt' for classifiers and was removed in newer scikit-learn releases
           "max_features":('sqrt','log2'),
           "criterion":('gini','entropy')}
clf=GridSearchCV(rfc, parameters, cv=5)
clf.fit(X_train,y_train)

In [None]:
clf.best_params_

In [None]:
estimator=clf.best_estimator_
estimator.score(X_test,y_test)

# Feature Importance

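Feature importance is the most basic and widely used technique for model interpretation, so we will start with it.

One common definition is permutation importance: a feature is “important” if shuffling its values increases the model error, because in this case the model relied on the feature for the prediction. A feature is “unimportant” if shuffling its values leaves the model error unchanged, because in this case the model ignored the feature for the prediction. Note that the `eli5.explain_weights_df` call used in this section reports the random forest's built-in impurity-based importances, which are a related but different measure.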
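To make the shuffling idea concrete, here is a minimal sketch using scikit-learn's `permutation_importance`. The synthetic data and the names `X_demo`, `y_demo`, `model_demo` and `perm_result` are assumptions made for illustration; they are not part of this kernel's diabetes pipeline.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance

# Synthetic data (an assumption for illustration): only feature 0 drives the label,
# features 1 and 2 are pure noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(400, 3))
y_demo = (X_demo[:, 0] > 0).astype(int)

# Train on the first 300 rows, keep the last 100 as held-out data.
model_demo = RandomForestClassifier(n_estimators=50, random_state=0)
model_demo.fit(X_demo[:300], y_demo[:300])

# Shuffle each column several times and record the drop in accuracy.
perm_result = permutation_importance(
    model_demo, X_demo[300:], y_demo[300:], n_repeats=10, random_state=0
)
print(perm_result.importances_mean)  # mean drop in accuracy per feature
```

The same call should work with `estimator`, `X_test` and `y_test` from this kernel, which would give a permutation-based ranking to compare against the impurity-based one.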

In [None]:
import eli5
# create our dataframe of feature importances
feat_imp_df = eli5.explain_weights_df(estimator, feature_names=features)
feat_imp_df

In [None]:
feature=["Pregnancies","Glucose","BP","SkinThickness","Insulin","BMI","DPFunc","Age"]

In [None]:
all_feat_imp_df = pd.DataFrame(data=[tree.feature_importances_ for tree in 
                                     estimator],
                               columns=feature)

(sns.boxplot(data=all_feat_imp_df)
        .set(title='Feature Importance Distributions',
             ylabel='Importance'));

According to the feature importances, the blood glucose level, BMI and age are the most important features for identifying a diabetic patient. The result seems justified to me: a persistently high blood glucose level is the defining sign of diabetes, and obese people are more prone to it.
Older adults are at high risk of developing type 2 diabetes due to the combined effects of increasing insulin resistance and impaired pancreatic islet function with aging.

These results can also be confirmed from the [Diabetes Care Website](http://care.diabetesjournals.org/content/35/12/2650).

So far, our model seems to be doing a good job of classifying. It relies on features that make clinical sense, which gives us some reason to trust it.

Takeaways:-
* It provides a highly compressed, global insight into the model’s behavior.
* The permutation importance measure automatically takes into account all interactions with other features. By permuting the feature you also destroy the interaction effects with other features.
* Permutation feature importance is linked to the error of the model. In some cases, you might prefer to know how much the model’s output varies for a feature without considering what it means for performance.
* Adding a correlated feature can decrease the importance of the associated feature by splitting the importance between both features.

In [None]:
!pip install pydotplus

It's intuitive and fun to take a look at the tree.

In [None]:
from IPython.display import Image  
from sklearn.tree import export_graphviz
import graphviz
import pydotplus
from io import StringIO  

# Get all trees of depth 3 in the random forest
depths3 = [tree for tree in estimator.estimators_ if tree.tree_.max_depth==3]
# grab the first one
tree = depths3[0]
# plot the tree
dot_data = StringIO()
export_graphviz(tree, out_file=dot_data, feature_names=features, 
                filled=True, rounded=True, special_characters=True)
graph = pydotplus.graph_from_dot_data(dot_data.getvalue())  
Image(graph.create_png())

Every non-leaf node is split on the feature written at the top of the node. Towards the left part of the tree samples are classified as non-diabetic, and towards the right part as diabetic. The entropy at the leftmost leaf node is 0 because the data there is homogeneous (all the samples belong to a single class). The first entry of the value array tells how many samples in the node are non-diabetic and the second entry tells how many are diabetic.

For the leftmost leaf node, the entropy is 0 because all 40 samples are non-diabetic.
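As a quick check of why a pure node has zero entropy, the sketch below computes the entropy of a node from its class counts. The helper name `node_entropy` is hypothetical, and the base-2 logarithm is an assumption about how the plotted value is computed.

```python
import numpy as np

def node_entropy(class_counts):
    """Entropy (in bits) of a tree node, given the number of samples per class."""
    counts = np.asarray(class_counts, dtype=float)
    # Classes with zero samples contribute nothing to the entropy.
    p = counts[counts > 0] / counts.sum()
    return float(np.sum(p * np.log2(1.0 / p)))

pure_leaf = node_entropy([40, 0])    # all 40 samples non-diabetic
mixed_node = node_entropy([20, 20])  # evenly mixed node
print(pure_leaf, mixed_node)
```

A pure node gives 0 and an evenly mixed two-class node gives 1, which is the largest value entropy can take for two classes.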

# Feature Interactions

When features interact with each other in a prediction model, the prediction cannot be expressed as the sum of the feature effects, because the effect of one feature depends on the value of the other feature. The interaction between two features is the change in the prediction that occurs by varying the features after considering the individual feature effects.

The **H-statistic** proposed by Friedman and Popescu is used to measure the interaction between features. There are a lot of packages in R that implement it. Unfortunately for Python users, to the best of my knowledge there is only one package, sklearn-gbmi, and it calculates the H-statistic for gradient-boosting models only.
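For intuition, here is a brute-force sketch of the pairwise statistic, written from its definition: the share of the variance of the joint partial dependence that is not explained by the two one-feature partial dependences. The helpers `partial_dependence_at` and `h_squared` are hypothetical names, the toy prediction functions are assumptions, and this is not how sklearn-gbmi is implemented.

```python
import numpy as np

def partial_dependence_at(predict, X, cols, values):
    """Average prediction when the columns `cols` are fixed to `values` for every row."""
    X_mod = X.copy()
    X_mod[:, cols] = values
    return predict(X_mod).mean()

def h_squared(predict, X, j, k):
    """Squared pairwise H-statistic for features j and k (brute force, O(n^2) predictions)."""
    n = X.shape[0]
    pd_j = np.array([partial_dependence_at(predict, X, [j], X[i, [j]]) for i in range(n)])
    pd_k = np.array([partial_dependence_at(predict, X, [k], X[i, [k]]) for i in range(n)])
    pd_jk = np.array([partial_dependence_at(predict, X, [j, k], X[i, [j, k]]) for i in range(n)])
    # Center each partial dependence function, as the definition requires.
    pd_j, pd_k, pd_jk = pd_j - pd_j.mean(), pd_k - pd_k.mean(), pd_jk - pd_jk.mean()
    return float(np.sum((pd_jk - pd_j - pd_k) ** 2) / np.sum(pd_jk ** 2))

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(100, 2))

# Toy "models": one purely additive, one a pure interaction.
h_additive = h_squared(lambda X: X[:, 0] + X[:, 1], X_demo, 0, 1)
h_product = h_squared(lambda X: X[:, 0] * X[:, 1], X_demo, 0, 1)
print(h_additive, h_product)
```

The additive function gives a value of essentially zero, while the product gives a value close to one, which matches the reading of the statistic as interaction strength.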

In [None]:
!pip install sklearn-gbmi

So, let's implement  a Gradient-Boosting Classifier on our data first.

In [None]:
from sklearn.ensemble import GradientBoostingClassifier
gbr_1 = GradientBoostingClassifier(random_state = 2589)
gbr_1.fit(X_train,y_train)

In [None]:
from sklearn_gbmi import h_all_pairs
# d is a dictionary of feature pairs and their respective interaction strength
d = h_all_pairs(gbr_1, X_train)
# converted to a list sorted by interaction value (ascending)
l = sorted(d.items(), key=lambda x: x[1])

In [None]:
l=l[-10:] # let's just take the top 10
data=pd.DataFrame(l)
data.columns=['Feature',"Interaction"]
data.index=data['Feature']
data=data.drop(labels=['Feature'],axis=1)

In [None]:
data.plot(kind='barh', color='teal', title="Feature Interaction Strength")

There is a strong interaction between the number of pregnancies and age, and also between blood pressure and insulin. All of these interactions are 2-way.

Takeaways:-
* The statistic detects all kinds of interactions, regardless of their particular form.
* Since the statistic is dimensionless and always between 0 and 1, it is comparable across features and even across models (in Python this is limited in practice, since sklearn-gbmi only supports gradient-boosting models).
* The H-statistic tells us the strength of interactions, but it does not tell us what the interactions look like. That is what ***partial dependence plots*** are for.


# Partial Dependence Plots (PDP)


The partial dependence plot (PDP or PD plot for short) shows the marginal effect one or two features have on the predicted outcome of a machine learning model. It can show whether the relationship between the target and a feature is linear, monotonic or more complex.

With the individual lines drawn, the plot gives both a global and a local view: the yellow line averages over all instances and describes the global relationship of a feature with the predicted outcome, while each blue line (an individual conditional expectation, or ICE, curve) shows that relationship for a single instance.
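The computation behind such a plot is simple enough to sketch by hand: fix the feature of interest to each grid value for every row, predict, and average. The function `ice_and_pdp` and the toy linear `predict` function below are assumptions for illustration; pdpbox does the equivalent work (plus plotting) in the cells of this section.

```python
import numpy as np

def ice_and_pdp(predict, X, feature_idx, grid):
    """Return ICE curves (one row per instance) and their average, the PDP."""
    X = np.asarray(X, dtype=float)
    ice = np.empty((X.shape[0], len(grid)))
    for g, value in enumerate(grid):
        X_mod = X.copy()
        X_mod[:, feature_idx] = value   # force the feature to the grid value for every row
        ice[:, g] = predict(X_mod)      # one prediction per instance
    return ice, ice.mean(axis=0)

# Toy model (an assumption): prediction = 2 * x0 + x1
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 2))
grid = np.array([0.0, 1.0, 2.0])
ice_curves, pdp_curve = ice_and_pdp(lambda X: 2 * X[:, 0] + X[:, 1], X_demo, 0, grid)
print(pdp_curve)
```

For a classifier, passing something like `lambda X: model.predict_proba(X)[:, 1]` as `predict` would give curves on the probability scale.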

In [None]:
from pdpbox import pdp, info_plots
pdp_ = pdp.pdp_isolate(
    model=estimator, dataset=X_train, model_features=features, feature='Glucose'
)
fig, axes = pdp.pdp_plot(
    pdp_isolate_out=pdp_, feature_name='Glucose', center=True, 
     plot_lines=True, frac_to_plot=100
)

The y-axis shows the change in the prediction relative to the prediction at the baseline (leftmost) value. The blue lines are the individual instances and the yellow line is the average marginal effect over them. The heterogeneous effects can be seen in the blue lines.

Higher blood sugar increases the chances of having diabetes. 100 mg/dL is the mean glucose level for non-diabetic people, which is also consistent with the graph.

In [None]:
pdp_ = pdp.pdp_isolate(
    model=estimator, dataset=X_train, model_features=features, feature='BMI'
)
fig, axes = pdp.pdp_plot(
    pdp_isolate_out=pdp_, feature_name='BMI', center=True, x_quantile=True, 
     plot_lines=True, frac_to_plot=100
)

Obesity is related to diabetes: for a BMI greater than 26, the chance of having diabetes keeps increasing.

In [None]:
pdp_ = pdp.pdp_isolate(
    model=estimator, dataset=X_train, model_features=features, feature='Age'
)
fig, axes = pdp.pdp_plot(
    pdp_isolate_out=pdp_, feature_name='Age', center=True, x_quantile=True, 
     plot_lines=True, frac_to_plot=100
)

After 23 years of age, people are more susceptible to diabetes.

## 2D PDP plot

From the feature interaction plot, we found that age and pregnancies have a strong interaction. Let's see how they interact.

In [None]:

features_to_plot = ['Age', 'Pregnancies']
inter1  =  pdp.pdp_interact(model=estimator, dataset=X_train, model_features=features, features=features_to_plot)

pdp.pdp_interact_plot(pdp_interact_out=inter1, feature_names=features_to_plot, plot_type='contour')
plt.show()

Women above 40 years of age with more than 7 pregnancies are most susceptible to the disease.

Takeaways:-
* Easy to implement and intuitive.
* The assumption of independence is the biggest issue with PD plots. It is assumed that the feature(s) for which the partial dependence is computed are not correlated with other features, which is rarely the case for real data.

# Local Interpretable Model-agnostic Explanations (LIME)

LIME is a concrete implementation of local surrogate models. Surrogate models are trained to approximate the predictions of the underlying black-box model. Local surrogate models are interpretable models that are used to explain individual predictions of black-box machine learning models.

Let's take a look at the steps:-
1. Perturb the data around the observation of interest to create new samples.
2. Calculate the distance between the perturbed samples and the original observation.
3. Make predictions on the new data using the black-box model.
4. Pick the m features that best describe the black-box model's output on the perturbed data.
5. Fit a simple (surrogate) model to the perturbed data using those m features, with similarity scores as weights.
6. The feature weights of the surrogate model explain the black-box model's local behaviour.

![](https://cdn-images-1.medium.com/max/800/1*vE3PUuhG6RRgK1J9oxg0nA.png)

The black-box model’s complex decision function f (unknown to LIME) is represented by the blue/pink background, which cannot be approximated well by a linear model. The bold red cross is the instance being explained. LIME samples instances, gets predictions using f, and weighs them by the proximity to the instance being explained (represented here by size). The dashed line is the learned explanation that is locally (but not globally) faithful.
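The procedure can be sketched in a few lines. This is a simplified illustration, not the lime package's implementation: the function `local_surrogate`, the Gaussian perturbations, the exponential kernel and its width are all assumptions, and the feature-selection step is skipped (every feature is kept).

```python
import numpy as np
from sklearn.linear_model import Ridge

def local_surrogate(predict, x, scale, n_samples=2000, seed=0):
    """Fit a weighted linear model around instance x and return its coefficients."""
    rng = np.random.default_rng(seed)
    # Perturb: sample points around x (Gaussian noise scaled per feature).
    Z = x + rng.normal(size=(n_samples, x.size)) * scale
    # Distance of each perturbed sample to x, in standardized units.
    dist = np.linalg.norm((Z - x) / scale, axis=1)
    # Ask the black box for predictions on the perturbed samples.
    y = predict(Z)
    # Similarity weights: closer samples count more (kernel choice is an assumption).
    kernel_width = 0.75 * np.sqrt(x.size)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)
    # Fit the interpretable surrogate with similarity scores as sample weights.
    surrogate = Ridge(alpha=1.0).fit((Z - x) / scale, y, sample_weight=weights)
    return surrogate.coef_

# Toy black box (an assumption): a sigmoid that ignores the third feature.
black_box = lambda Z: 1.0 / (1.0 + np.exp(-(3 * Z[:, 0] - 2 * Z[:, 1])))
coefs = local_surrogate(black_box, x=np.zeros(3), scale=np.ones(3))
print(coefs)
```

The sign and size of each coefficient play the role of the feature weights in the final step: here the first feature pushes the prediction up, the second pushes it down, and the third has almost no effect.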

In [None]:
import lime
import lime.lime_tabular
explainer = lime.lime_tabular.LimeTabularExplainer(X_train.astype(int).values,  
mode='classification',training_labels=y_train,feature_names=features,class_names=classes)
# Let's take a look at the row with index label 100 (.loc selects by label, not by position)
i = 100
exp = explainer.explain_instance(X_train.loc[i,features].astype(int).values, estimator.predict_proba, num_features=5)

In [None]:
exp.show_in_notebook(show_table=True)

Orange features support the diabetic class, and blue features support the non-diabetic class.

There are three parts to the explanation:-
1. The top-left part gives the prediction probabilities for class 0 and class 1.
2. The right part gives the 5 most important features. Orange features support the diabetic class and blue features support the non-diabetic class.
3. The bottom part follows the same colour coding as 1 and 2. It contains the actual values of the top 5 variables.

This can be read as: *the woman is diabetic with a probability of 0.67. Her Glucose level, BMI, Age and DiabetesPedigreeFunction all push the prediction towards diabetic, and we have seen in the PDP plots how they do so. However, she has had only one pregnancy, which pushes the prediction towards non-diabetic, but this carries less weight than the other, more crucial features in determining diabetes.*

To dig a little deeper into the implementation of LIME, you can read the [lime documentation](https://lime-ml.readthedocs.io/en/latest/lime.html#module-lime.lime_image).

## Submodular Pick (SP-LIME) for explaining models

LIME aims to attribute a model’s prediction to human understandable features. In order to do this we need to run the explanation model on a diverse but representative set of instances to return a non redundant explanation set that is a global representation of the model. 

In [None]:
from lime import submodular_pick
# SP-LIME returns explanations on a sample set to provide a non-redundant global view of the original model's decision boundary
sp_obj = submodular_pick.SubmodularPick(explainer, X_train.values, estimator.predict_proba, num_features=5,num_exps_desired=5)


In [None]:
for exp in sp_obj.sp_explanations:
    exp.show_in_notebook()

Takeaways:-
* Human-friendly explanations that are very useful when explaining to a lay person.
* The explanations created with local surrogate models can use other features than the original model. This can be a big advantage over other methods, especially if the original features cannot be interpreted.
* A really big problem is the instability of the explanations. If you repeat the sampling process, the explanations that come out can be different.


# SHAP

SHAP (SHapley Additive exPlanations) can be used for both global and local explanations. It leverages game theory to measure the impact of the features on the predictions. A prediction can be explained by assuming that each feature value of the instance is a *“player”* in a game where the prediction is the *payout*. The Shapley value tells us how to fairly distribute the “payout” among the features.

Players? Game? Payout? What is the connection to machine learning predictions and interpretability? The “game” is the prediction task for a single instance of the dataset. The “gain” is the actual prediction for this instance minus the average prediction for all instances. The “players” are the feature values of the instance that collaborate to receive the gain (= predict a certain value).

Let's understand this with an example:

![](https://christophm.github.io/interpretable-ml-book/images/shapley-instance.png)

You have trained a machine learning model to predict apartment prices. For a certain apartment it predicts €300,000 and you need to explain this prediction. The apartment has a size of 50 m2, is located on the 2nd floor, has a park nearby and cats are banned.

The average prediction for all apartments is €310,000. How much has each feature value contributed to the prediction compared to the average prediction?

The answer could be: The park-nearby contributed €30,000; size-50 contributed €10,000; floor-2nd contributed €0; cat-banned contributed -€50,000. The contributions add up to -€10,000, the final prediction minus the average predicted apartment price.

Now you must be thinking: how do we calculate the Shapley value for one feature?

The Shapley value is the average marginal contribution of a feature value across all possible coalitions. A coalition is a subset of the feature values that take part in the prediction; the remaining features are replaced with values drawn from other instances. For example, we take a coalition, compare the prediction with "cat-banned" to the prediction where it is replaced by a random value such as "cat-allowed", and note how the prediction changes. For more info, you can read [this](https://christophm.github.io/interpretable-ml-book/shapley.html).
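For a handful of features, the Shapley value can be computed exactly by enumerating every coalition. The sketch below does this for the apartment example. The payoff function `price_gain` and its numbers are made up for illustration (chosen so that the result matches the contributions quoted above), and `shapley_values` is a hypothetical helper, not part of the shap package.

```python
from itertools import combinations
from math import factorial

def shapley_values(value, players):
    """Exact Shapley values: weighted average marginal contribution over all coalitions."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for r in range(n):
            for subset in combinations(others, r):
                coalition = set(subset)
                # Probability that exactly this coalition precedes p in a random ordering.
                weight = factorial(r) * factorial(n - r - 1) / factorial(n)
                total += weight * (value(coalition | {p}) - value(coalition))
        phi[p] = total
    return phi

# Hypothetical payoff (EUR, relative to the average prediction) for a coalition of feature values.
ADDITIVE = {"park": 27000, "size": 7000, "floor": 0, "cat": -50000}

def price_gain(coalition):
    gain = sum(ADDITIVE[p] for p in coalition)
    if {"park", "size"} <= coalition:  # made-up interaction between park and size
        gain += 6000
    return gain

phi = shapley_values(price_gain, list(ADDITIVE))
print(phi)
```

The interaction bonus is split equally between the two features involved, and the four values sum to the total gain of the full coalition, which is the "fair distribution" property described above. The shap package avoids this exponential enumeration with model-specific algorithms such as the tree explainer.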

This is taken from DanB's tutorial kernel. Let's do this for the instance at position 7 of our training set.

In [None]:
import shap

# create our SHAP explainer
shap_explainer = shap.TreeExplainer(estimator)
# calculate the shapley values for our data
shap_values = shap_explainer.shap_values(X_train.iloc[7])

In [None]:
# load JS in order to use some of the plotting functions from the shap
# package in the notebook
shap.initjs()
shap.force_plot(shap_explainer.expected_value[1], shap_values[1], X_train.iloc[7])

Features that increase the prediction are in pink and features that decrease it are in blue, with the length of each bar showing the magnitude of the effect. The base value is 0.3498 and we predict 0.7. This person is classified as diabetic; the features that pushed the result towards diabetic were Glucose level=161, Age=47, Insulin=132 and 10 pregnancies. The low BMI pushes the prediction down, but the combined effect of the pink features far outweighs it.

If you subtract the length of the blue bars from the length of the pink bars, it equals the distance from the base value to the output.

Let's also plot the **Summary plot** to get a global overview.

In [None]:
shap_values = shap_explainer.shap_values(X_train)
shap.summary_plot(shap_values[1], X_train,auto_size_plot=False)

Okay, how do we interpret this?
This plot is made of many dots. Each dot has three characteristics:

* Vertical location shows what feature it is depicting
* Color shows whether that feature was high or low for that row of the dataset
* Horizontal location shows whether the effect of that value caused a higher or lower prediction.

The rightmost dots in the Glucose row are pink, which means a high Glucose level increases the chance of diabetes, as we have already seen.

***The feature insulin is not very clear to me, whether it's the natural insulin level in the patient or the amount of insulin artificially given to the patient.*** It shows very unexpected behavior: its high values can both increase and decrease the chance of having diabetes, which feels like a pretty useless insight to me. Any input from your side regarding this will be helpful!

# LIME for Text

Here I'll be playing with LIME as discussed above, but for text data. LIME for text differs from LIME for tabular data. Variations of the data are generated differently: starting from the original text, new texts are created by randomly removing words from the original text. The dataset is represented with binary features for each word. A feature is 1 if the corresponding word is included and 0 if it has been removed.
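The word-removal step can be sketched as follows. The helper `perturb_text` and the example sentence are assumptions for illustration; the lime package handles tokenization and sampling in its own way.

```python
import numpy as np

def perturb_text(text, n_samples, seed=0):
    """Create variations of `text` by randomly dropping words.

    Returns the binary word-presence matrix and the corresponding texts.
    """
    rng = np.random.default_rng(seed)
    words = text.split()
    # 1 = word kept, 0 = word removed
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0] = 1  # keep the original text as the first sample
    texts = [" ".join(w for w, keep in zip(words, row) if keep) for row in masks]
    return masks, texts

sentence = "you are a genius"
masks, texts = perturb_text(sentence, n_samples=5)
for row, variant in zip(masks, texts):
    print(row, repr(variant))
```

Each row of `masks` is the binary feature vector the surrogate model is trained on, while the matching string in `texts` is what gets sent to the black-box classifier.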

I'm using the comment classification dataset from the Jigsaw toxic comment classification challenge. The model I train is very basic, so the interpretations you see may be erroneous, but that will be the fault of our naive model.


In [None]:
train = pd.read_csv('../input/jigsaw-toxic-comment-classification-challenge/train.csv').fillna(' ')
test = pd.read_csv('../input/jigsaw-toxic-comment-classification-challenge/test.csv').fillna(' ')

train_text = train['comment_text']
test_text = test['comment_text']
all_text = pd.concat([train_text, test_text])

In [None]:
word_vectorizer = TfidfVectorizer(
    sublinear_tf=True,
    strip_accents='unicode',
    analyzer='word',
    token_pattern=r'\w{1,}',
    stop_words='english',
    ngram_range=(1, 1),
    max_features=10000)
word_vectorizer.fit(all_text)
train_word_features = word_vectorizer.transform(train_text)
test_word_features = word_vectorizer.transform(test_text)

First, let's train our model to classify toxic comments, and then we'll see with the help of the LIME text explainer how it does so.

In [None]:
from sklearn.linear_model import LogisticRegression
train_target_toxic = train['toxic']
classifier_toxic = LogisticRegression(C=0.1, solver='sag')
classifier_toxic.fit(train_word_features, train_target_toxic)

In [None]:
names=['non-toxic','toxic']

In [None]:
from sklearn.pipeline import make_pipeline
from lime.lime_text import LimeTextExplainer
c_tf = make_pipeline( word_vectorizer,classifier_toxic)
explainer_tf = LimeTextExplainer(class_names=names)

In [None]:
exp = explainer_tf.explain_instance(train_text.iloc[802], c_tf.predict_proba, num_features=4, top_labels=1)
exp.show_in_notebook(text=train_text.iloc[802])

In [None]:
exp = explainer_tf.explain_instance(train_text.iloc[55], c_tf.predict_proba, num_features=4, top_labels=1)
exp.show_in_notebook(text=train_text.iloc[55])

Now let's see the same for threatening comments.

In [None]:
train_target_threat = train['threat']
classifier_threat = LogisticRegression(C=0.1, solver='sag')
classifier_threat.fit(train_word_features, train_target_threat)

In [None]:
# class_names must follow the label order: 0 = non-threatening, 1 = threatening
names=['non-threatening','threatening']
c_tf = make_pipeline( word_vectorizer,classifier_threat)
explainer_tf = LimeTextExplainer(class_names=names)


In [None]:
exp = explainer_tf.explain_instance(train_text.iloc[79], c_tf.predict_proba, num_features=4, top_labels=1)
exp.show_in_notebook(text=train_text.iloc[79])

In [None]:
exp = explainer_tf.explain_instance(train_text.iloc[1085], c_tf.predict_proba, num_features=4, top_labels=1)
exp.show_in_notebook(text=train_text.iloc[1085])

# LIME for Images

Now comes probably the most interesting part, **interpreting images classified by a neural network**. LIME for images works differently than LIME for tabular data and text. Intuitively, it would not make much sense to perturb individual pixels, since many more than one pixel contribute to one class. Randomly changing individual pixels would probably not change the predictions by much. 

Therefore, variations of the images are created by segmenting the image into “superpixels” and turning superpixels off or on. Superpixels are interconnected pixels with similar colors and can be turned off by replacing each pixel with a user-defined color such as gray.
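Here is a minimal sketch of turning superpixels off. To stay self-contained it uses a plain square grid as a stand-in for a real segmentation algorithm, so the "superpixels" here do not follow colors; `grid_segments` and `perturb_image` are hypothetical helpers, not functions of the lime package.

```python
import numpy as np

def grid_segments(height, width, cell):
    """Label each pixel with the id of the square grid cell it falls into."""
    rows = np.arange(height) // cell
    cols = np.arange(width) // cell
    n_cols = -(-width // cell)  # ceiling division
    return rows[:, None] * n_cols + cols[None, :]

def perturb_image(image, segments, keep, hide_color=0.5):
    """Replace every pixel of the switched-off segments with `hide_color` (gray)."""
    out = image.copy()
    hidden = ~keep[segments]   # True where the pixel's segment is turned off
    out[hidden] = hide_color
    return out

image_demo = np.ones((8, 8, 3))             # a white 8x8 RGB image
segments_demo = grid_segments(8, 8, cell=4)  # 4 segments with ids 0..3
keep_demo = np.array([True, False, True, True])  # switch off segment 1 (top right)
perturbed_demo = perturb_image(image_demo, segments_demo, keep_demo)
print(perturbed_demo[:, :, 0])
```

The boolean vector `keep_demo` plays the same role as the binary word vector in the text case: it is the interpretable representation on which the surrogate model is fitted.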

I'm using a pretrained Inception V3 model.

In [None]:
import keras
from keras.applications import inception_v3 as inc_net
from keras.preprocessing import image
from keras.applications.imagenet_utils import decode_predictions
from skimage.io import imread

In [None]:
inet_model = inc_net.InceptionV3()

In [None]:
def transform_img_fn(path_list):
    out = []
    for img_path in path_list:
        img = image.load_img(img_path, target_size=(299, 299))
        x = image.img_to_array(img)
        x = np.expand_dims(x, axis=0)
        x = inc_net.preprocess_input(x)
        out.append(x)
    return np.vstack(out)

In [None]:
main_dir="../input/inception-classification-sample-images/inception classification samples/inception classification samples"

In [None]:
images = transform_img_fn([os.path.join(main_dir,'cat-and-mouse.jpg')])
# I'm dividing by 2 and adding 0.5 because of how this Inception represents images
plt.figure(figsize=(3,3))
plt.imshow(images[0] / 2 + 0.5)
preds = inet_model.predict(images)
for x in decode_predictions(preds)[0]:
    print(x)

The model outputs a probability for every class; `decode_predictions` shows the top 5. Time to get an explanation of the model.

In [None]:
import lime
from lime import lime_image
explainer = lime_image.LimeImageExplainer()
explanation = explainer.explain_instance(images[0], inet_model.predict, top_labels=5, hide_color=0, num_samples=1000)

Let's see an explanation for the class tiger cat. The first parameter of the get_image_and_mask function is the label of the class. The class for us is tiger cat, and you can find the corresponding label here: [https://savan77.github.io/blog/files/labels.json](https://savan77.github.io/blog/files/labels.json). It is 282. We can see the top 8 superpixels that contribute most positively towards the class, with the rest of the image hidden.

In [None]:
from skimage.segmentation import mark_boundaries
temp, mask = explanation.get_image_and_mask(282, positive_only=True, num_features=8, hide_rest=True)
plt.figure(figsize=(3,3))
plt.imshow(mark_boundaries(temp / 2 + 0.5, mask))

Let's view it with the full image now.

In [None]:
from skimage.segmentation import mark_boundaries
temp, mask = explanation.get_image_and_mask(282, positive_only=True, num_features=8, hide_rest=False)
plt.figure(figsize=(3,3))
plt.imshow(mark_boundaries(temp / 2 + 0.5, mask))

When the parameter **positive_only** is True, only superpixels that contribute positively to the prediction of the label are taken. Otherwise, the top num_features superpixels are used, which can be positive or negative towards the label.

In [None]:
from skimage.segmentation import mark_boundaries
temp, mask = explanation.get_image_and_mask(282, positive_only=False, num_features=100, hide_rest=False)
plt.figure(figsize=(3,3))
plt.imshow(mark_boundaries(temp / 2 + 0.5, mask))

The superpixels in green are positive towards the label, i.e. tiger cat, and the superpixels in red are negative.

Let's now see the interpretations for the class "mouse". The label for mouse is 673.

In [None]:
from skimage.segmentation import mark_boundaries
temp, mask = explanation.get_image_and_mask(673, positive_only=True, num_features=5, hide_rest=False)
plt.figure(figsize=(3,3))
plt.imshow(mark_boundaries(temp / 2 + 0.5, mask))

It's very clear from this image that the model is not doing a good job of predicting mouse, since the superpixels it uses mostly belong to the cat.

Thanks for reading guys, I'll be happy if you can upvote this kernel. I'm eager to hear any suggestions and feedback from your side.

Loads of thanks to this amazing book [Interpretable Machine Learning](https://christophm.github.io/interpretable-ml-book/) by Christoph Molnar. Some other references used by me are:-

* [https://www.kdnuggets.com/2018/06/human-interpretable-machine-learning-need-importance-model-interpretation.html](https://www.kdnuggets.com/2018/06/human-interpretable-machine-learning-need-importance-model-interpretation.html)
* [http://savvastjortjoglou.com/intrepretable-machine-learning-nfl-combine.html](http://savvastjortjoglou.com/intrepretable-machine-learning-nfl-combine.html)
* [https://lime-ml.readthedocs.io/](https://lime-ml.readthedocs.io/)
* [https://github.com/marcotcr/lime](https://github.com/marcotcr/lime)