Copyright (c) Microsoft Corporation. All rights reserved.

Licensed under the MIT License.


# Explain binary classification model predictions
_**This notebook showcases how to use the Azure Machine Learning Interpretability SDK to explain and visualize the predictions of a binary classification model.**_


## Table of Contents

1. [Introduction](#Introduction)
1. [Setup](#Setup)
1. [Run model explainer locally at training time](#Explain)
    1. Train a binary classification model
    1. Explain the model
        1. Generate global explanations
        1. Generate local explanations
1. [Visualize results](#Visualize)
1. [Next steps](#Next)

## Introduction

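This notebook illustrates how to explain the predictions of a binary classification model locally at training time. The explainers themselves run in the local kernel; Azure services are used only to set up workspace resources and stage the dataset.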
It demonstrates the API calls that you need to make to get the global and local explanations and a visualization dashboard that provides an interactive way of discovering patterns in data and explanations.

We will showcase three tabular data explainers: TabularExplainer (SHAP), MimicExplainer (global surrogate), and PFIExplainer.

| ![Interpretability Toolkit Architecture](./img/interpretability-architecture.png) |
|:--:|
| *Interpretability Toolkit Architecture* |

Problem: Employee turnover classification with scikit-learn (run model explainer locally)

1. Train an SVM classification model using scikit-learn.
2. Run the explainer globally and locally in local mode, which doesn't contact any Azure services.
3. Visualize the global and local explanations with the visualization dashboard.

---

## Setup

If you are using Jupyter notebooks, the extensions should be installed automatically with the package.
If you are using JupyterLab, run the following command:
```
(myenv) $ jupyter labextension install @jupyter-widgets/jupyterlab-manager
```


## Explain

### Run model explainer locally at training time

In [1]:
#from sklearn.datasets import load_breast_cancer
from sklearn import svm

# Explainers:
# 1. SHAP Tabular Explainer
from interpret.ext.blackbox import TabularExplainer

# OR

# 2. Mimic Explainer
from interpret.ext.blackbox import MimicExplainer
# You can use one of the following four interpretable models as a global surrogate to the black box model
from interpret.ext.glassbox import LGBMExplainableModel
from interpret.ext.glassbox import LinearExplainableModel
from interpret.ext.glassbox import SGDExplainableModel
from interpret.ext.glassbox import DecisionTreeExplainableModel

# OR

# 3. PFI Explainer
from interpret.ext.blackbox import PFIExplainer 
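
The third explainer imported above, PFIExplainer, is based on permutation feature importance: shuffle one feature column at a time and measure how much the model's score drops. The following is a minimal standard-library sketch of that idea, not the library's implementation. The toy data, `toy_model`, and the helper names are hypothetical and chosen only for illustration.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) equals the true label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)

def permutation_importance(predict, rows, labels, n_features, seed=0):
    """Accuracy drop observed after shuffling each feature column in turn."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)
        # Rebuild the rows with only column j permuted.
        shuffled = [row[:j] + [value] + row[j + 1:] for row, value in zip(rows, column)]
        importances.append(baseline - accuracy(predict, shuffled, labels))
    return importances

# Hypothetical toy data: feature 0 determines the label, feature 1 is noise.
data_rng = random.Random(42)
rows = [[data_rng.randint(0, 1), data_rng.randint(0, 9)] for _ in range(200)]
labels = [row[0] for row in rows]

def toy_model(row):
    # Stand-in for a trained classifier: it predicts from feature 0 only.
    return row[0]

importances = permutation_importance(toy_model, rows, labels, n_features=2)
print(importances)
```

Because the toy model never reads feature 1, shuffling that column cannot change its accuracy, so its importance is exactly zero. This also shows why permutation importance needs true labels: the score being compared is computed against them.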

In [2]:
# Check core SDK version number
import azureml.core

print('SDK version:', azureml.core.VERSION)

SDK version: 1.8.0


In [4]:
from azureml.core import Workspace

ws = Workspace.from_config()
#print(ws.name, ws.resource_group, ws.location, ws.subscription_id, sep='\n')

In [5]:
experiment_name = 'insider-challengeA-explainer'

from azureml.core import Experiment
exp = Experiment(workspace=ws, name=experiment_name)

In [6]:
from azureml.core.compute import AmlCompute
from azureml.core.compute import ComputeTarget
import os

# choose a name for your cluster
compute_name = os.environ.get('AML_COMPUTE_CLUSTER_NAME', 'standard-cluster')
# Environment variables are strings, so convert the node counts to int
compute_min_nodes = int(os.environ.get('AML_COMPUTE_CLUSTER_MIN_NODES', 0))
compute_max_nodes = int(os.environ.get('AML_COMPUTE_CLUSTER_MAX_NODES', 4))

# This example uses CPU VM. For using GPU VM, set SKU to STANDARD_NC6
vm_size = os.environ.get('AML_COMPUTE_CLUSTER_SKU', 'STANDARD_D2_V2')


if compute_name in ws.compute_targets:
    compute_target = ws.compute_targets[compute_name]
    if compute_target and type(compute_target) is AmlCompute:
        print('found compute target. just use it. ' + compute_name)
else:
    print('creating a new compute target...')
    provisioning_config = AmlCompute.provisioning_configuration(vm_size=vm_size,
                                                                min_nodes=compute_min_nodes, 
                                                                max_nodes=compute_max_nodes)

    # create the cluster
    compute_target = ComputeTarget.create(ws, compute_name, provisioning_config)
    
    # can poll for a minimum number of nodes and for a specific timeout. 
    # if no min node count is provided it will use the scale settings for the cluster
    compute_target.wait_for_completion(show_output=True, min_node_count=None, timeout_in_minutes=20)
    
    # For a more detailed view of current AmlCompute status, use get_status()
    print(compute_target.get_status().serialize())

found compute target. just use it. standard-cluster
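
The cell above reads the cluster name, node counts, and VM size from environment variables with fallback defaults. Values that come from the environment are always strings, so numeric settings need an explicit conversion. A small sketch of a helper for that, where `env_int` and the `_UNSET_EXAMPLE` variable name are hypothetical and used only for illustration:

```python
import os

def env_int(name, default):
    """Read an integer setting from the environment, falling back to default."""
    raw = os.environ.get(name)
    return default if raw is None else int(raw)

# Simulate a value exported in the shell; it arrives as the string '4'.
os.environ['AML_COMPUTE_CLUSTER_MAX_NODES'] = '4'

max_nodes = env_int('AML_COMPUTE_CLUSTER_MAX_NODES', 4)
# This name is not set anywhere, so the default is returned unchanged.
min_nodes = env_int('AML_COMPUTE_CLUSTER_MIN_NODES_UNSET_EXAMPLE', 0)
print(type(max_nodes).__name__, max_nodes, min_nodes)
```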


In [7]:
datastore = ws.get_default_datastore()
datastore.upload_files(files = ['./EmployeeTurnover.csv'],
                       target_path = 'train-dataset/tabular/',
                       overwrite = True,
                       show_progress = True)

Uploading an estimated of 1 files
Uploading ./EmployeeTurnover.csv
Uploaded ./EmployeeTurnover.csv, 1 files out of an estimated total of 1
Uploaded 1 files


$AZUREML_DATAREFERENCE_1e823df19a894eb9a06dfe43a5e32cd1

In [9]:
from azureml.core import Dataset
dataset = Dataset.Tabular.from_delimited_files(path = [(datastore, 'train-dataset/tabular/EmployeeTurnover.csv')])

# preview the first 6 rows of the dataset
dataset.take(6).to_pandas_dataframe()

Unnamed: 0,City,EmailDomain,EmployeeLeft,HiredthroughSMTP,ManagerRatingOfLikelihoodToLeave,MarkedForPHTProgram,MostRecentPerformanceEvaluation,SocialMediaActivity,Survey_AttitudeTowardWorkType,Survey_AttitudeTowardWorkload,Survey_RelativePeerAverageAttitudeTowardManager
0,Sandaohezi,exblog.jp,0,0,1,1,59,0,2,2,1
1,Bandung,youtu.be,1,0,21,0,49,2,1,2,3
2,Kuala Terengganu,cbslocal.com,0,1,1,1,51,1,3,1,2
3,Beaverlodge,youtu.be,1,1,1,0,48,0,3,1,1
4,Ell,springer.com,1,0,1,1,57,0,2,2,2
5,Nuoxizhi,cbslocal.com,0,0,1,1,49,2,2,1,2


In [11]:
df = dataset.to_pandas_dataframe()
classes = df['EmployeeLeft'].unique()
classes

array([0, 1])

In [12]:
df.columns

Index(['City', 'EmailDomain', 'EmployeeLeft', 'HiredthroughSMTP',
       'ManagerRatingOfLikelihoodToLeave', 'MarkedForPHTProgram',
       'MostRecentPerformanceEvaluation', 'SocialMediaActivity',
       'Survey_AttitudeTowardWorkType', 'Survey_AttitudeTowardWorkload',
       'Survey_RelativePeerAverageAttitudeTowardManager'],
      dtype='object')

In [14]:
import os

from azureml.core import Dataset, Run
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

run = Run.get_context()

x_col = ['City', 'EmailDomain', 'HiredthroughSMTP', 'ManagerRatingOfLikelihoodToLeave', 
         'MarkedForPHTProgram', 'MostRecentPerformanceEvaluation', 'SocialMediaActivity',
         'Survey_AttitudeTowardWorkType', 'Survey_AttitudeTowardWorkload', 'Survey_RelativePeerAverageAttitudeTowardManager']
y_col = ['EmployeeLeft']
categories = ['City', 'EmailDomain', 'HiredthroughSMTP', 'ManagerRatingOfLikelihoodToLeave', 'MarkedForPHTProgram', 'SocialMediaActivity',
         'Survey_AttitudeTowardWorkType', 'Survey_AttitudeTowardWorkload', 'Survey_RelativePeerAverageAttitudeTowardManager']
        
df[categories] = df[categories].astype('category')
df.dtypes


City                                               category
EmailDomain                                        category
EmployeeLeft                                          int64
HiredthroughSMTP                                   category
ManagerRatingOfLikelihoodToLeave                   category
MarkedForPHTProgram                                category
MostRecentPerformanceEvaluation                       int64
SocialMediaActivity                                category
Survey_AttitudeTowardWorkType                      category
Survey_AttitudeTowardWorkload                      category
Survey_RelativePeerAverageAttitudeTowardManager    category
dtype: object

In [23]:
df["City"] = df["City"].cat.codes
df["EmailDomain"] = df["EmailDomain"].cat.codes
df.head()

Unnamed: 0,City,EmailDomain,EmployeeLeft,HiredthroughSMTP,ManagerRatingOfLikelihoodToLeave,MarkedForPHTProgram,MostRecentPerformanceEvaluation,SocialMediaActivity,Survey_AttitudeTowardWorkType,Survey_AttitudeTowardWorkload,Survey_RelativePeerAverageAttitudeTowardManager
0,6,1,0,0,1,1,59,0,2,2,1
1,0,7,1,0,21,0,49,2,1,2,3
2,3,0,0,1,1,1,51,1,3,1,2
3,1,7,1,1,1,0,48,0,3,1,1
4,2,4,1,0,1,1,57,0,2,2,2
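
The `cat.codes` step above replaces each string label with an integer. For a column converted with `astype('category')`, the code is the label's position in the sorted list of unique values. A plain-Python sketch of that mapping, using the six city names from the preview shown earlier (`category_codes` is a hypothetical helper):

```python
def category_codes(values):
    """Map each value to the index of its label in the sorted unique labels."""
    labels = sorted(set(values))
    lookup = {label: code for code, label in enumerate(labels)}
    return [lookup[value] for value in values], labels

# City values from the six-row preview shown earlier in this notebook.
cities = ['Sandaohezi', 'Bandung', 'Kuala Terengganu', 'Beaverlodge', 'Ell', 'Nuoxizhi']
codes, labels = category_codes(cities)
print(codes)  # [5, 0, 3, 1, 2, 4]
```

The codes here differ from the table above (for example, Sandaohezi is 5 rather than 6) because the full dataset contains more cities than the six-row preview. Note that the integers impose an arbitrary alphabetical order on the cities, which tree-based and kernel-based models may treat as meaningful.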


In [24]:
x_df = df.loc[:, x_col]
y_df = df.loc[:, y_col]

#dividing X,y into train and test data
x_train, x_test, y_train, y_test = train_test_split(x_df, y_df, test_size=0.25, random_state=100)

data = {'train': {'X': x_train, 'y': y_train},

        'test': {'X': x_test, 'y': y_test}}

clf = DecisionTreeClassifier().fit(data['train']['X'], data['train']['y'])
model_file_name = 'decision_tree.pkl'

print('Accuracy of Decision Tree classifier on training set: {:.2f}'.format(clf.score(x_train, y_train)))
print('Accuracy of Decision Tree classifier on test set: {:.2f}'.format(clf.score(x_test, y_test)))


Accuracy of Decision Tree classifier on training set: 0.99
Accuracy of Decision Tree classifier on test set: 0.79
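
The split in the cell above holds out a quarter of the rows and fixes `random_state` so the same rows land in the test set on every run. The sketch below shows the idea with the standard library only. It does not reproduce scikit-learn's exact shuffling, and the row count of 14,000 is an assumption inferred from the 3,500 test rows reported later in this notebook.

```python
import math
import random

def split_indices(n_rows, test_size, seed):
    """Shuffle row indices with a fixed seed and cut off a test fraction."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = math.ceil(n_rows * test_size)
    return indices[n_test:], indices[:n_test]

# Assumption: 14,000 rows in total, so that 25% gives 3,500 test rows.
train_idx, test_idx = split_indices(14000, test_size=0.25, seed=100)
print(len(train_idx), len(test_idx))  # 10500 3500
```

Calling `split_indices` again with the same seed returns the same partition, which is what makes accuracy figures comparable between runs.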


In [26]:
from sklearn.linear_model import LogisticRegression
logmodel = LogisticRegression()
logmodel.fit(x_train,y_train)
predictions = logmodel.predict(x_test)

from sklearn.metrics import classification_report
print(classification_report(y_test,predictions))

              precision    recall  f1-score   support

           0       0.76      0.76      0.76      1527
           1       0.81      0.82      0.81      1973

   micro avg       0.79      0.79      0.79      3500
   macro avg       0.79      0.79      0.79      3500
weighted avg       0.79      0.79      0.79      3500



A column-vector y was passed when a 1d array was expected. Please change the shape of y to (n_samples, ), for example using ravel().
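
The report above lists precision, recall, and F1 per class. As a reminder of how those figures are derived, here is a small sketch that computes them from true and predicted labels. The label lists are hypothetical toy data, not values from this dataset.

```python
def binary_report(y_true, y_pred, positive=1):
    """Return (precision, recall, f1) for the class given by `positive`."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

# Hypothetical labels: 3 true positives, 1 false positive, 1 false negative.
y_true_toy = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred_toy = [1, 1, 1, 0, 1, 0, 0, 0]
precision, recall, f1 = binary_report(y_true_toy, y_pred_toy)
print(precision, recall, f1)
```

Passing `positive=0` gives the figures for the other class, which is how the two rows of the report relate to each other.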


In [28]:
%%writefile ./train_employee_turnover.py

import os

from azureml.core import Dataset, Run
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
import joblib  # sklearn.externals.joblib was removed in scikit-learn 0.23

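run = Run.get_context()

# Assumption: the dataset is passed to the run as a named input called
# 'employee_turnover' (hypothetical name); adjust it to match your submission.
df = run.input_datasets['employee_turnover'].to_pandas_dataframe()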

x_col = ['City', 'EmailDomain', 'HiredthroughSMTP', 'ManagerRatingOfLikelihoodToLeave', 
         'MarkedForPHTProgram', 'MostRecentPerformanceEvaluation', 'SocialMediaActivity',
         'Survey_AttitudeTowardWorkType', 'Survey_AttitudeTowardWorkload', 'Survey_RelativePeerAverageAttitudeTowardManager']
y_col = ['EmployeeLeft']
categories = ['City', 'EmailDomain', 'HiredthroughSMTP', 'ManagerRatingOfLikelihoodToLeave', 'MarkedForPHTProgram', 'SocialMediaActivity',
         'Survey_AttitudeTowardWorkType', 'Survey_AttitudeTowardWorkload', 'Survey_RelativePeerAverageAttitudeTowardManager']
        
df[categories] = df[categories].astype('category')

df["City"] = df["City"].cat.codes
df["EmailDomain"] = df["EmailDomain"].cat.codes

x_df = df.loc[:, x_col]
y_df = df.loc[:, y_col]

#dividing X,y into train and test data
x_train, x_test, y_train, y_test = train_test_split(x_df, y_df, test_size=0.2, random_state=100)

data = {'train': {'X': x_train, 'y': y_train},

        'test': {'X': x_test, 'y': y_test}}

clf = DecisionTreeClassifier().fit(data['train']['X'], data['train']['y'])
model_file_name = 'decision_tree.pkl'

print('Accuracy of Decision Tree classifier on training set: {:.2f}'.format(clf.score(x_train, y_train)))
print('Accuracy of Decision Tree classifier on test set: {:.2f}'.format(clf.score(x_test, y_test)))

# Files written to ./outputs are uploaded to the run history automatically
os.makedirs('./outputs', exist_ok=True)
joblib.dump(value=clf, filename=os.path.join('outputs', model_file_name))

Writing ./train_employee_turnover.py
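
The training script above ends by serializing the fitted model into an `outputs` folder. The sketch below shows the same save-and-restore round trip using the standard library's `pickle` in place of `joblib`, with a plain dictionary as a hypothetical stand-in for the fitted classifier and a temporary directory so nothing is left on disk.

```python
import os
import pickle
import tempfile

model_file_name = 'decision_tree.pkl'
# Hypothetical stand-in for a fitted classifier.
model_stub = {'kind': 'decision_tree', 'max_depth': 3}

with tempfile.TemporaryDirectory() as workdir:
    outputs_dir = os.path.join(workdir, 'outputs')
    os.makedirs(outputs_dir, exist_ok=True)
    model_path = os.path.join(outputs_dir, model_file_name)
    # Write the object, then read it back from the same path.
    with open(model_path, 'wb') as handle:
        pickle.dump(model_stub, handle)
    with open(model_path, 'rb') as handle:
        restored = pickle.load(handle)

print(restored == model_stub)
```

Only unpickle files you trust, since loading a pickle can execute arbitrary code.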


### Train an SVM classification model to explain

In [41]:
import numpy as np
y_train=np.array(y_train).reshape(-1)

In [42]:
clf = svm.SVC(gamma=0.001, C=100., probability=True)
model = clf.fit(x_train, y_train)

In [50]:
x_train1=x_train.astype('int64')
x_test1=x_test.astype('int64')

###  Explain the entire model behavior (global explanation)

In [None]:
explainer = TabularExplainer(model,
                             initialization_examples=x_train1,
                             features=x_df.columns,
                             classes=["Not leaving", "leaving"],
                             # transformations=transformations
                             )

In [52]:
# Passing in test dataset for evaluation examples - note it must be a representative sample of the original data
# x_train can be passed as well, but with more examples explanations will take longer although they may be more accurate
global_explanation = explainer.explain_global(x_test1)

# Note: if you used the PFIExplainer in the previous step, use the next line of code instead
# global_explanation = explainer.explain_global(x_test, true_labels=y_test)

l1_reg="auto" is deprecated and in the next version (v0.29) the behavior will change from a conditional use of AIC to simply "num_features(10)"!

In [62]:
import pprint 
pp = pprint.PrettyPrinter(indent=4)

In [78]:
# sorted feature importance values and feature names
sorted_global_importance_values = global_explanation.get_ranked_global_values()
sorted_global_importance_names = global_explanation.get_ranked_global_names()
dict(zip(sorted_global_importance_names, sorted_global_importance_values))

# alternatively, you can print out a dictionary that holds the top K feature names and values
global_explanation.get_feature_importance_dict()

{'Survey_AttitudeTowardWorkType': 0.17248306640850275,
 'Survey_RelativePeerAverageAttitudeTowardManager': 0.11282257496139261,
 'City': 0.07932950418232453,
 'Survey_AttitudeTowardWorkload': 0.0730996493758034,
 'ManagerRatingOfLikelihoodToLeave': 0.06437520270928138,
 'MarkedForPHTProgram': 0.053734212156661555,
 'EmailDomain': 0.04531957835698447,
 'MostRecentPerformanceEvaluation': 0.03726255970239689,
 'HiredthroughSMTP': 0.007653841300622355,
 'SocialMediaActivity': 0.003108235336832752}
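
To read the dictionary above more easily, it can help to express each importance as a share of the total and track the cumulative share. The sketch below does this with the values printed above; the normalization is an illustrative convenience, not something the explainer returns.

```python
# Global importance values copied from the output above.
importance = {
    'Survey_AttitudeTowardWorkType': 0.17248306640850275,
    'Survey_RelativePeerAverageAttitudeTowardManager': 0.11282257496139261,
    'City': 0.07932950418232453,
    'Survey_AttitudeTowardWorkload': 0.0730996493758034,
    'ManagerRatingOfLikelihoodToLeave': 0.06437520270928138,
    'MarkedForPHTProgram': 0.053734212156661555,
    'EmailDomain': 0.04531957835698447,
    'MostRecentPerformanceEvaluation': 0.03726255970239689,
    'HiredthroughSMTP': 0.007653841300622355,
    'SocialMediaActivity': 0.003108235336832752,
}

# Rank features from most to least important.
ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
total = sum(importance.values())

cumulative = 0.0
shares = []
for name, value in ranked:
    cumulative += value
    shares.append((name, value / total, cumulative / total))
    print(f'{name:<50} {value / total:6.1%} {cumulative / total:6.1%}')
```

With these values, the three highest-ranked features account for roughly 56% of the summed importance, while the two lowest-ranked features together contribute under 2%.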

### Explain an individual prediction (local explanation)

In [80]:
# Note: PFIExplainer does not support local explanations
# You can pass a specific data point or a group of data points to the explain_local function

# get explanations for the first five data points in the test set
local_explanation = explainer.explain_local(x_test1[0:5])

# sorted feature importance values and feature names
sorted_local_importance_names = local_explanation.get_ranked_local_names()
sorted_local_importance_values = local_explanation.get_ranked_local_values()

l1_reg="auto" is deprecated and in the next version (v0.29) the behavior will change from a conditional use of AIC to simply "num_features(10)"!
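
A SHAP-based local explanation attributes a single prediction to the individual features. The quantity being estimated is the Shapley value, which can be computed exactly for a small model by enumerating every feature subset. The sketch below does that for a hypothetical three-feature scoring function. It is a simplified illustration, not the explainer's algorithm: features left out of a coalition are replaced by a single baseline row, whereas SHAP implementations typically average over a background dataset and approximate the sum.

```python
from itertools import combinations
from math import factorial

def shapley_values(predict, instance, baseline):
    """Exact Shapley values; absent features take their baseline value."""
    n = len(instance)

    def value(coalition):
        mixed = [instance[i] if i in coalition else baseline[i] for i in range(n)]
        return predict(mixed)

    contributions = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                total += weight * (value(coalition | {i}) - value(coalition))
        contributions.append(total)
    return contributions

def toy_score(x):
    # Hypothetical model with an interaction between features 0 and 1.
    return 2.0 * x[0] + x[0] * x[1] - 0.5 * x[2]

instance = [1.0, 3.0, 2.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, instance, baseline)
print(phi, sum(phi))
```

The attributions add up to the difference between the score for the instance and the score for the baseline, and the interaction term is split evenly between the two features involved. The cost grows exponentially with the number of features, which is why practical explainers rely on sampling or model-specific shortcuts.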


## Visualize
Load the visualization dashboard

In [84]:
from interpret_community.widget import ExplanationDashboard

# Pass the same int64-cast test set that the explanation was computed on
ExplanationDashboard(global_explanation, model, datasetX=x_test1)

<interpret_community.widget.explanation_dashboard.ExplanationDashboard at 0x7ff642685c88>