## Model explainability
Explainability is the extent to which a model's behavior can be described in human terms. Interpretability is the extent to which you can anticipate how a model's output will change when its input parameters change. We often work with black-box models, which do not show why they reached a particular conclusion. Interpretability and explainability help you understand the "why" behind your model's predictions.
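
As a rough illustration of the interpretability idea, the sketch below nudges one input feature at a time and watches the predicted probability move. It uses synthetic data rather than the MRSA dataset, and the names `X_demo`, `y_demo`, `demo_model`, and `deltas` are made up for this example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic data (hypothetical, not the MRSA dataset):
# feature 0 determines the label, feature 1 is pure noise.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(500, 2))
y_demo = (X_demo[:, 0] > 0).astype(int)

demo_model = LogisticRegression().fit(X_demo, y_demo)

# Probe the model: start from a baseline instance, increase one feature
# at a time, and record how much P(class 1) changes.
baseline = np.zeros((1, 2))
p_base = demo_model.predict_proba(baseline)[0, 1]

deltas = []
for j in range(2):
    probe = baseline.copy()
    probe[0, j] += 1.0
    p_new = demo_model.predict_proba(probe)[0, 1]
    deltas.append(p_new - p_base)
    print(f"feature {j}: P(class 1) {p_base:.3f} -> {p_new:.3f}")
```

A large shift for feature 0 and a negligible one for feature 1 would match how the synthetic labels were generated.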

We'll start this notebook by loading the data and fitting a logistic regression model.

In [1]:
import pandas as pd
import sklearn

pt_info_clean = pd.read_csv("../data/interim/pt_info_clean.csv")

In [2]:
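import numpy as np
from sklearn import model_selection

# change to numpy arrays in order to interact with CEM
pt_info_array = np.asarray(pt_info_clean)

# column 2 holds the MRSA-positive label; drop it from the features so the
# model cannot read the target directly from its inputs
y = pt_info_array[:, 2]
features = np.delete(pt_info_array, 2, axis=1)

x_train, x_test, y_train, y_test = model_selection.train_test_split(
    features, y, test_size=0.2, random_state=430)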

In [3]:
from sklearn.linear_model import LogisticRegression

In [5]:
# every other argument is left at its scikit-learn default
lr = LogisticRegression(multi_class='multinomial', random_state=0)
lr.fit(x_train, y_train)

LogisticRegression(multi_class='multinomial', random_state=0)

We'll use a library called [Alibi](https://github.com/SeldonIO/alibi) to apply some of these explainability methods.

In [6]:
class_names = ['MRSA -', 'MRSA +']
idx = 2
X = x_test[idx].reshape((1,) + x_test[idx].shape)

# lr.predict returns the class label itself, so use it directly as the index
# into class_names; per-class probabilities come from lr.predict_proba(X)
print('Prediction on instance to be explained: {}'.format(class_names[int(lr.predict(X)[0])]))
print('Predicted class label for the instance: {}'.format(lr.predict(X)))

Prediction on instance to be explained: MRSA -
Predicted class label for the instance: [0.]


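[Contrastive explanation methods](https://arxiv.org/abs/1802.07623), or CEMs, explain a single prediction in terms of its pertinent positives and pertinent negatives. A pertinent positive is a minimal set of features that must be present for the model to keep its prediction; a pertinent negative is a minimal set of features that must be absent, because adding them would change the prediction.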
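
To make the two terms concrete, here is a toy brute-force sketch over binary features. It is not Alibi's algorithm (which solves a regularized optimization problem); the classifier, its `weights` and `bias`, and the helper `smallest_subset` are all hypothetical and exist only for illustration.

```python
from itertools import combinations

import numpy as np

# Hypothetical toy classifier over 4 binary features:
# class 1 when the weighted sum is above zero, otherwise class 0.
weights = np.array([2.0, 1.0, -3.0, -1.5])
bias = -0.5


def predict_class(x):
    return int(weights @ x + bias > 0)


def smallest_subset(candidates, build, accept):
    """Return the smallest subset of `candidates` whose built instance passes `accept`."""
    for size in range(len(candidates) + 1):
        for subset in combinations(candidates, size):
            if accept(build(subset)):
                return subset
    return None


x = np.array([1.0, 1.0, 0.0, 0.0])  # features 0 and 1 present, 2 and 3 absent
original = predict_class(x)

present = [i for i in range(len(x)) if x[i] == 1]
absent = [i for i in range(len(x)) if x[i] == 0]


def keep_only(subset):
    # Instance that contains only the chosen present features.
    z = np.zeros_like(x)
    for i in subset:
        z[i] = 1.0
    return z


def add_to_x(subset):
    # Original instance with the chosen absent features switched on.
    z = x.copy()
    for i in subset:
        z[i] = 1.0
    return z


# Pertinent positive: fewest present features that still give the same class.
pp = smallest_subset(present, keep_only, lambda z: predict_class(z) == original)
# Pertinent negative: fewest absent features whose addition flips the class.
pn = smallest_subset(absent, add_to_x, lambda z: predict_class(z) != original)

print("original class:", original)
print("pertinent positive features:", pp)
print("pertinent negative features:", pn)
```

Brute force only scales to a handful of features, which is one reason a library such as Alibi is used for real data.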

In [10]:
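import alibi
from alibi.explainers import CEM

# Note: Alibi's CEM runs on TensorFlow; depending on the installed versions you may
# first need to call tf.compat.v1.disable_v2_behavior()

mode = 'PN'  # either 'PN' (pertinent negative) or 'PP' (pertinent positive)
shape = (1,) + x_train.shape[1:]  # shape of a single instance, with a leading batch dimension

# CEM expects a callable that returns class probabilities, not an array of predictions
predict_fn = lambda x: lr.predict_proba(x)

cem = CEM(predict_fn,
          mode=mode,
          shape=shape)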

In [11]:
# We need to define which feature values contain the least information with respect
# to the predictions. Here we naively assume that the feature-wise median contains
# no information; domain knowledge helps!
cem.fit(x_train, no_info_type='median')
explanation = cem.explain(X, verbose=False)
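
As a sketch of what the "feature-wise median" baseline means, the snippet below computes the median of each column with NumPy and compares one instance against it. The arrays `x_train_demo` and `instance_demo` are hypothetical stand-ins, and this only illustrates the idea; it does not reproduce Alibi's internals.

```python
import numpy as np

# Hypothetical stand-in for x_train: 5 instances, 3 features.
x_train_demo = np.array([
    [0, 10, 1],
    [1, 20, 0],
    [0, 30, 1],
    [1, 40, 1],
    [0, 50, 0],
], dtype=float)

# Feature-wise median: one "no information" value per column.
no_info_val = np.median(x_train_demo, axis=0)

# Features equal to the baseline are treated as carrying no information;
# non-zero deviations mark where the instance differs from a "typical" row.
instance_demo = np.array([1.0, 30.0, 0.0])
deviation = instance_demo - no_info_val

print("no-info values:", no_info_val)
print("deviation from baseline:", deviation)
```

For binary or skewed clinical features the median may not be a neutral value, which is where domain knowledge comes in.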

In [None]:
print('Original instance: {}'.format(explanation.X))
print('Predicted class: {}'.format(class_names[explanation.X_pred]))

In [None]:
print('Pertinent negative: {}'.format(explanation.PN))
print('Predicted class: {}'.format(class_names[explanation.PN_pred]))