# Introduction

This is the third notebook in the example showing how to explain models using Certifai. If you have not already done so, run the [first notebook](patient-readmission-train) to train the models to be explained and the [second notebook](patient-readmission-scan) to scan them.

In this notebook, we will:
1. Load the previously saved explanations report
2. Convert the counterfactuals into a dataframe and display them


In [1]:
import numpy as np
import pandas as pd
from pprint import pprint

from certifai.scanner.report_reader import ScanReportReader
from certifai.scanner.explanation_utils import explanations
from certifai.scanner.builder import ExplanationType
from IPython.display import display, Markdown

# Loading the Explanations Report

To load the report, we need to know the use case ID ('readmission') and the scan ID.

List the available use cases, and the scans within the 'readmission' use case.

In [2]:
reader = ScanReportReader("reports")
reader.list_usecases()
scans = reader.list_scans('readmission')
data = [[s['date'], ', '.join(s['reportTypes']), s['id']] for s in scans]
df = pd.DataFrame(data, columns=['date', 'evals', 'scan id']).sort_values(by=['date'], ascending=False)
print(df)

               date                                      evals       scan id
0   20201003T200737                                explanation  9e5292a83fae
1   20201003T193811  robustness, explainability, fairness, atx  051426d6e1ea
2   20201003T193745                                explanation  ebec7e2a42a1
3   20201003T190112  fairness, atx, explainability, robustness  3bb1ee055f1a
4   20201003T180318                                explanation  e7767cd510d8
5   20201003T180042  explainability, robustness, atx, fairness  fea0a19d76ab
6   20201003T180018                                explanation  174d803744f9
7   20201003T175937  fairness, explainability, robustness, atx  163767bbd1c4
8   20201003T175913                                explanation  db1f9734349b
9   20201003T175112                                explanation  496db6a032a1
10  20201003T174746                                explanation  720a57ba857c
11  20201003T173328                                   fairness  7e02a6ba44ba

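Locate the latest explanation scan and load it. Because the dataframe is sorted by date in descending order, the first row whose evals value is exactly 'explanation' is the most recent explanation scan.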

In [3]:
latest_explanation = df[df.evals == 'explanation'].iloc[0]
result = reader.load_scan('readmission', latest_explanation['scan id'])
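
The selection above depends on the dataframe having been sorted first. As a minimal sketch of an order-independent alternative, the snippet below picks the newest matching scan directly from a list of scan summaries. The helper name `latest_scan_with` is hypothetical (it is not part of Certifai), the sample entries are copied from the listing printed earlier, and the date format `%Y%m%dT%H%M%S` is an assumption inferred from those printed values.

```python
from datetime import datetime

# Scan summaries shaped like the dicts used in the listing cell above
# ('date', 'reportTypes', 'id'); the entries are copied from the printed listing.
scans = [
    {"date": "20201003T193811",
     "reportTypes": ["robustness", "explainability", "fairness", "atx"],
     "id": "051426d6e1ea"},
    {"date": "20201003T200737", "reportTypes": ["explanation"], "id": "9e5292a83fae"},
    {"date": "20201003T193745", "reportTypes": ["explanation"], "id": "ebec7e2a42a1"},
]


def latest_scan_with(scans, report_type):
    """Hypothetical helper: newest scan whose reportTypes include report_type, or None."""
    candidates = [s for s in scans if report_type in s["reportTypes"]]
    if not candidates:
        return None
    # Assumed date format, inferred from the printed values (e.g. 20201003T200737)
    return max(candidates, key=lambda s: datetime.strptime(s["date"], "%Y%m%dT%H%M%S"))


latest = latest_scan_with(scans, "explanation")
print(latest["id"])  # 9e5292a83fae
```

Note that this sketch matches any scan that includes the report type, whereas the cell above requires the evals column to be exactly 'explanation'. For the scans listed here the two give the same result.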

# Extract the explanations

In this section we'll build, for each explained prediction, a dataframe containing the original instance and its counterfactuals, listing only the features that changed. We'll then display the first two explanations for the logit and mlp models.

TODO change the following to move from per explanation to processing all, including model id and prediction row.

The utility functions below do the conversion.

In [4]:
# These methods are candidates to move into Certifai APIs
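# Returns a single-row dataframe containing the original instance for an explained prediction
def original_instance(explained_prediction):
    # Use the function argument rather than relying on a global variable named `exp`
    return pd.DataFrame(np.expand_dims(explained_prediction.instance, axis=0),
                        columns=explained_prediction.field_names)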

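# Returns a dataframe containing the counterfactual instances for an explained prediction,
# or None if the explanation is not a counterfactual explanation.
# max_counterfactuals optionally limits how many are returned (None returns all of them)
def df_counterfactuals(explained_prediction, max_counterfactuals=None):
    if explained_prediction.explanation_type != ExplanationType.Counterfactual:
        return None
    # Assumes best_individuals is a sliceable sequence; slicing with None keeps everything
    cf_list = explained_prediction.explanation.best_individuals[:max_counterfactuals]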
    metadata = pd.DataFrame([[f'Counterfactual {i}', cf.prediction,
                              cf.counterfactual_type] for i, cf in enumerate(cf_list)],
                            columns=['instance', 'prediction', 'cf type'])
    cf_data = pd.DataFrame([cf.data for cf in cf_list], columns=explained_prediction.field_names)
    return pd.concat([metadata, cf_data], axis=1)  # TODO use join?

# Returns a dataframe containing the original instance plus the counterfactuals for the prediction, 
# listing just the changed features
# May return multiple counterfactuals, in which case features that are changed in only some
# counterfactuals will be n/a where unchanged
def df_counterfactual_changes(exp):
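    df_cfs = df_counterfactuals(exp)
    orig = original_instance(exp)
    # Repeat the original row once per counterfactual, keeping the column names,
    # so that isin() below compares cells with matching row and column labels
    df_orig = pd.concat([orig] * len(df_cfs), ignore_index=True)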
    df_diff = df_cfs.isin(df_orig)
    cf_changes = df_cfs[~df_diff].dropna(axis=1, how='all')
    orig_metadata = pd.DataFrame([['Original instance', exp.prediction, 'original prediction']],
                                 columns=['instance', 'prediction', 'cf type'])
    orig = orig.join(orig_metadata)
    original_changed = orig[cf_changes.columns]
    return pd.concat([original_changed, cf_changes]).reset_index(drop=True)
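
To make the masking step in `df_counterfactual_changes` easier to follow, here is a small self-contained sketch of the same idea on toy data. The values are invented for illustration and are not taken from the scan. It relies on the documented pandas behaviour that `DataFrame.isin`, when given another DataFrame, compares cells only where both the index and column labels match.

```python
import pandas as pd

# Toy stand-ins for one original instance and two counterfactuals (hypothetical values)
orig = pd.DataFrame([{"age": "[70-80)", "number_inpatient": 3, "gender": "Female"}])
cfs = pd.DataFrame([
    {"age": "[0-10)", "number_inpatient": 3, "gender": "Female"},
    {"age": "[70-80)", "number_inpatient": 0, "gender": "Female"},
])

# Repeat the original row so it lines up row by row with the counterfactuals
orig_repeated = pd.concat([orig] * len(cfs), ignore_index=True)

# True where a counterfactual cell equals the original cell with the same labels
unchanged = cfs.isin(orig_repeated)

# Blank out unchanged cells (they become NaN), then drop columns where nothing changed
changes = cfs[~unchanged].dropna(axis=1, how="all")
print(changes)
```

Here `gender` is dropped because no counterfactual changed it, while `age` and `number_inpatient` are kept with NaN in the rows where they match the original. This is why features changed in only some counterfactuals show as n/a elsewhere.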

Print out the first 2 explanations for each model.

In [6]:
all_explanations = explanations(result)


max_displayed = 2  # Just display the first few here for illustration
pd.set_option('display.max_columns', None) # print all cols
# pd.set_option('display.expand_frame_repr', False) # don't wrap
pd.set_option('display.width', 200)

for model, explained_predictions in all_explanations.items():
    for i, exp in enumerate(explained_predictions[:max_displayed]):
        df_original_instance = original_instance(exp)
        display(Markdown(f'### Explanation of model {model} instance {i+1}\n'))
        display(Markdown('**Original Instance**'))
        display(df_original_instance)
        display(Markdown(f'**Original Prediction**: {"Readmitted" if exp.prediction == 1 else "Not Readmitted"}'))
        cf_changes = df_counterfactual_changes(exp)

        display(Markdown('### Counterfactual Changes'))
        display(cf_changes)


### Explanation of model logit instance 1


**Original Instance**

Unnamed: 0,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_outpatient,number_emergency,number_inpatient,number_diagnoses,race,gender,age,diag_1,diag_2,diag_3,max_glu_serum,A1Cresult,metformin,repaglinide,nateglinide,chlorpropamide,glimepiride,acetohexamide,glipizide,glyburide,tolbutamide,pioglitazone,rosiglitazone,acarbose,miglitol,troglitazone,tolazamide,insulin,glyburide-metformin,glipizide-metformin,glimepiride-pioglitazone,metformin-rosiglitazone,metformin-pioglitazone,change,diabetesMed
0,1,39,0,13,1,0,3,5,Caucasian,Female,[70-80),Digestive,Circulatory,Other,,,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No


**Original Prediction**: Readmitted

### Counterfactual Changes

|   | instance | prediction | cf type | age |
|---|---|---|---|---|
| 0 | Original instance | 1 | original prediction | [70-80) |
| 1 | Counterfactual 0 | 0 | prediction changed | [0-10) |


### Explanation of model logit instance 2


**Original Instance**

Unnamed: 0,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_outpatient,number_emergency,number_inpatient,number_diagnoses,race,gender,age,diag_1,diag_2,diag_3,max_glu_serum,A1Cresult,metformin,repaglinide,nateglinide,chlorpropamide,glimepiride,acetohexamide,glipizide,glyburide,tolbutamide,pioglitazone,rosiglitazone,acarbose,miglitol,troglitazone,tolazamide,insulin,glyburide-metformin,glipizide-metformin,glimepiride-pioglitazone,metformin-rosiglitazone,metformin-pioglitazone,change,diabetesMed
0,5,56,0,13,0,0,1,9,AfricanAmerican,Male,[30-40),Digestive,Other,Circulatory,,,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Down,No,No,No,No,No,Ch,Yes


**Original Prediction**: Readmitted

### Counterfactual Changes

|   | instance | prediction | cf type | number_inpatient |
|---|---|---|---|---|
| 0 | Original instance | 1 | original prediction | 1 |
| 1 | Counterfactual 0 | 0 | prediction changed | 0 |


### Explanation of model mlp instance 1


**Original Instance**

Unnamed: 0,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_outpatient,number_emergency,number_inpatient,number_diagnoses,race,gender,age,diag_1,diag_2,diag_3,max_glu_serum,A1Cresult,metformin,repaglinide,nateglinide,chlorpropamide,glimepiride,acetohexamide,glipizide,glyburide,tolbutamide,pioglitazone,rosiglitazone,acarbose,miglitol,troglitazone,tolazamide,insulin,glyburide-metformin,glipizide-metformin,glimepiride-pioglitazone,metformin-rosiglitazone,metformin-pioglitazone,change,diabetesMed
0,1,39,0,13,1,0,3,5,Caucasian,Female,[70-80),Digestive,Circulatory,Other,,,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No


**Original Prediction**: Readmitted

### Counterfactual Changes

|   | instance | prediction | cf type | glyburide-metformin |
|---|---|---|---|---|
| 0 | Original instance | 1 | original prediction | No |
| 1 | Counterfactual 0 | 0 | prediction changed | Up |


### Explanation of model mlp instance 2


**Original Instance**

Unnamed: 0,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_outpatient,number_emergency,number_inpatient,number_diagnoses,race,gender,age,diag_1,diag_2,diag_3,max_glu_serum,A1Cresult,metformin,repaglinide,nateglinide,chlorpropamide,glimepiride,acetohexamide,glipizide,glyburide,tolbutamide,pioglitazone,rosiglitazone,acarbose,miglitol,troglitazone,tolazamide,insulin,glyburide-metformin,glipizide-metformin,glimepiride-pioglitazone,metformin-rosiglitazone,metformin-pioglitazone,change,diabetesMed
0,5,56,0,13,0,0,1,9,AfricanAmerican,Male,[30-40),Digestive,Other,Circulatory,,,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Down,No,No,No,No,No,Ch,Yes


**Original Prediction**: Readmitted

### Counterfactual Changes

|   | instance | prediction | cf type | num_lab_procedures | number_inpatient |
|---|---|---|---|---|---|
| 0 | Original instance | 1 | original prediction | 56 | 1 |
| 1 | Counterfactual 0 | 0 | prediction changed | 55 | 0 |
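
The tables above are produced one explanation at a time. As a rough sketch of how they might be gathered into a single dataframe tagged with the model and explanation number, the snippet below uses hand-written frames shaped like the Counterfactual Changes output above. The `frames` dict and the `model` and `explanation` column names are illustrative assumptions, not part of the Certifai API.

```python
import pandas as pd

# Hand-written frames mirroring two of the Counterfactual Changes tables shown above,
# keyed by (model, explanation number); this structure is a hypothetical convenience.
frames = {
    ("logit", 1): pd.DataFrame({
        "instance": ["Original instance", "Counterfactual 0"],
        "prediction": [1, 0],
        "age": ["[70-80)", "[0-10)"],
    }),
    ("mlp", 1): pd.DataFrame({
        "instance": ["Original instance", "Counterfactual 0"],
        "prediction": [1, 0],
        "glyburide-metformin": ["No", "Up"],
    }),
}

# Tag each frame with its model and explanation number, then stack them.
# Features that were not changed in a given explanation are left as NaN.
tagged = [frame.assign(model=model, explanation=number)
          for (model, number), frame in frames.items()]
combined = pd.concat(tagged, ignore_index=True, sort=False)
print(combined)
```

Stacking this way keeps one row per original instance or counterfactual, so the result can be filtered by model or explanation with ordinary pandas indexing.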
