**<center><h1>Introduction</h1></center>**

As machine learning becomes increasingly integral to decisions that affect health, safety, economic wellbeing, and other aspects of people's lives, it's important to understand how models make predictions and to be able to explain the rationale for machine learning-based decisions.

Explaining models is difficult because of the range of machine learning algorithm types and the nature of how machine learning works, but model interpretability has become a key element of helping to make model predictions explainable.

**<h2>Learning objectives</h2>**

In this module, you will learn how to:

- Interpret global and local feature importance.
- Use an explainer to interpret a model.
- Create model explanations in a training experiment.
- Visualize model explanations.

<hr>

**<center><h1>Feature importance</h1></center>**

Model explainers use statistical techniques to calculate feature importance. This enables you to quantify the relative influence each feature in the training dataset has on label prediction. Explainers work by evaluating a test dataset of feature cases and the labels the model predicts for them.

**<h2>Global feature importance</h2>**

Global feature importance quantifies the relative importance of each feature in the test dataset as a whole. It provides a general comparison of the extent to which each feature in the dataset influences prediction.

For example, a binary classification model to predict loan default risk might be trained from features such as loan amount, income, marital status, and age to predict a label of 1 for loans that are likely to be repaid, and 0 for loans that have a significant risk of default (and therefore shouldn't be approved). An explainer might then use a sufficiently representative test dataset to produce the following global feature importance values:

- income: 0.98
- loan amount: 0.67
- age: 0.54
- marital status: 0.32

<img src="images/09-01-global-importance.png" />


It's clear from these values that, with respect to the overall predictions generated by the model for the test dataset, **income** is the most important feature for predicting whether or not a borrower will default on a loan, followed by the **loan amount**, then **age**, and finally **marital status**.
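
As a minimal sketch of how such values can be read programmatically, the following snippet ranks the global importance values listed above. The dictionary simply restates the example figures, and `rank_features` is a hypothetical helper, not part of any SDK.

```python
# Global feature importance values from the loan example above.
global_importance = {
    "loan amount": 0.67,
    "marital status": 0.32,
    "income": 0.98,
    "age": 0.54,
}


def rank_features(importance):
    """Return (feature, value) pairs sorted from most to least important."""
    return sorted(importance.items(), key=lambda item: item[1], reverse=True)


ranked = rank_features(global_importance)
for name, value in ranked:
    print(f"{name}: {value:.2f}")
```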

**<h2>Local feature importance</h2>**

Local feature importance measures the influence of each feature value for a specific individual prediction.

For example, suppose Sam applies for a loan, which the machine learning model approves (by predicting that Sam won't default on the loan repayment). You could use an explainer to calculate the local feature importance for Sam's application to determine which factors influenced the prediction. You might get a result like this:

<img src="images/image1.png"/>

Because this is a classification model, each feature gets a local importance value for each possible class, indicating the amount of support for that class based on the feature value. Since this is a binary classification model, there are only two possible classes (0 and 1). Each feature's support for one class results in a correspondingly negative level of support for the other.

<img src="images/09-02-local-importance.png"/>

In Sam's case, the overall support for class 0 is -1.4, and the support for class 1 is correspondingly 1.4; so support for class 1 is higher than for class 0, and the loan is approved. The most important feature for a prediction of class 1 is loan amount, followed by income. This is the opposite of their order in the global feature importance values (which indicate that income is the most important factor for the data sample as a whole). There could be multiple reasons why local importance for an individual prediction varies from global importance for the overall dataset; for example, Sam might have a lower income than average, but the loan amount in this case might be unusually small.

For a multi-class classification model, a local importance value for each possible class is calculated for every feature, with the total across all classes always being 0. For example, a model might predict the species of a penguin based on features like its bill length, bill width, flipper length, and weight. Suppose there are three species of penguin, so the model predicts one of three class labels (0, 1, or 2). For an individual prediction, the flipper length feature might have local importance values of 0.5 for class 0, 0.3 for class 1, and -0.8 for class 2 - indicating that the flipper length moderately supports a prediction of class 0, slightly supports a prediction of class 1, and strongly supports a prediction that this particular penguin is **not** class 2.
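
The zero-sum property in the penguin example can be checked with a few lines of Python. This is only an illustration using the flipper length values quoted above; the variable names are chosen for this sketch.

```python
import math

# Local importance of the flipper length feature for each class,
# taken from the penguin example above.
flipper_length_importance = {0: 0.5, 1: 0.3, 2: -0.8}

# The values across all classes should total 0 (allowing for float rounding).
total = sum(flipper_length_importance.values())
sums_to_zero = math.isclose(total, 0.0, abs_tol=1e-9)

# The class this feature supports most, and the class it argues against most.
most_supported = max(flipper_length_importance, key=flipper_length_importance.get)
least_supported = min(flipper_length_importance, key=flipper_length_importance.get)

print(sums_to_zero, most_supported, least_supported)
```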

For a regression model, there are no classes, so the local importance values simply indicate the level of influence each feature has on the predicted scalar label.

<hr>

**<center><h1>Using explainers</h1></center>**

You can use the Azure Machine Learning SDK to create explainers for models, even if they were not trained using an Azure Machine Learning experiment.

**<h2>Creating an explainer</h2>**

To interpret a local model, you must install the azureml-interpret package and use it to create an explainer. There are multiple types of explainer, including:

- **MimicExplainer** - An explainer that creates a global surrogate model that approximates your trained model and can be used to generate explanations. This explainable model must have the same kind of architecture as your trained model (for example, linear or tree-based).
- **TabularExplainer** - An explainer that acts as a wrapper around various SHAP explainer algorithms, automatically choosing the one that is most appropriate for your model architecture.
- **PFIExplainer** - A Permutation Feature Importance explainer that analyzes feature importance by shuffling feature values and measuring the impact on prediction performance.
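
To make the permutation idea concrete, here is a rough, library-free sketch of the technique: shuffle one feature column at a time and record how much accuracy drops. It is not the **PFIExplainer** implementation; the toy model, the randomly generated data, and the helper names are all assumptions made for illustration.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the true label."""
    correct = sum(1 for row, label in zip(rows, labels) if model(row) == label)
    return correct / len(rows)


def permutation_importance(model, rows, labels, n_features, seed=0):
    """Accuracy drop observed when each feature column is shuffled in turn."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    importances = []
    for j in range(n_features):
        # Shuffle column j only, leaving every other feature untouched.
        column = [row[j] for row in rows]
        rng.shuffle(column)
        shuffled_rows = [
            row[:j] + (value,) + row[j + 1:] for row, value in zip(rows, column)
        ]
        importances.append(baseline - accuracy(model, shuffled_rows, labels))
    return importances


# Hypothetical toy model: approve (1) when income is at least twice the
# loan amount. It deliberately ignores age.
def toy_model(row):
    loan_amount, income, age = row
    return 1 if income >= 2 * loan_amount else 0


# Made-up data; the labels come from the toy model, so baseline accuracy is 1.0.
data_rng = random.Random(42)
rows = [
    (data_rng.uniform(1, 50), data_rng.uniform(10, 100), data_rng.uniform(18, 80))
    for _ in range(200)
]
labels = [toy_model(row) for row in rows]

importances = permutation_importance(toy_model, rows, labels, n_features=3)
for name, value in zip(["loan_amount", "income", "age"], importances):
    print(f"{name}: {value:.3f}")
```

Because the toy model never looks at age, shuffling that column cannot change any prediction, so its importance is zero, while shuffling loan amount or income reduces accuracy.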

The following code example shows how to create an instance of each of these explainer types for a hypothetical model named **loan_model**:
```
# MimicExplainer
from interpret.ext.blackbox import MimicExplainer
from interpret.ext.glassbox import DecisionTreeExplainableModel

mim_explainer = MimicExplainer(model=loan_model,
                             initialization_examples=X_test,
                             explainable_model = DecisionTreeExplainableModel,
                             features=['loan_amount','income','age','marital_status'], 
                             classes=['reject', 'approve'])
                             

# TabularExplainer
from interpret.ext.blackbox import TabularExplainer

tab_explainer = TabularExplainer(model=loan_model,
                             initialization_examples=X_test,
                             features=['loan_amount','income','age','marital_status'],
                             classes=['reject', 'approve'])


# PFIExplainer
from interpret.ext.blackbox import PFIExplainer

pfi_explainer = PFIExplainer(model = loan_model,
                             features=['loan_amount','income','age','marital_status'],
                             classes=['reject', 'approve'])
```


**<h2>Explaining global feature importance</h2>**

To retrieve global importance values for the features in your model, you call the **explain_global()** method of your explainer to get a global explanation, and then use the **get_feature_importance_dict()** method to get a dictionary of the feature importance values. The following code example shows how to retrieve global feature importance:
```
# MimicExplainer
global_mim_explanation = mim_explainer.explain_global(X_train)
global_mim_feature_importance = global_mim_explanation.get_feature_importance_dict()


# TabularExplainer
global_tab_explanation = tab_explainer.explain_global(X_train)
global_tab_feature_importance = global_tab_explanation.get_feature_importance_dict()


# PFIExplainer
global_pfi_explanation = pfi_explainer.explain_global(X_train, y_train)
global_pfi_feature_importance = global_pfi_explanation.get_feature_importance_dict()
```

<mark>**Note:** The code is the same for **MimicExplainer** and **TabularExplainer**. The **PFIExplainer** requires the actual labels that correspond to the features passed to **explain_global()**.</mark>


**<h2>Explaining local feature importance</h2>**

To retrieve local feature importance from a **MimicExplainer** or a **TabularExplainer**, you must call the **explain_local()** method of your explainer, specifying the subset of cases you want to explain. Then you can use the **get_ranked_local_names()** and **get_ranked_local_values()** methods to retrieve lists of the feature names and importance values, ranked by importance. The following code example shows how to retrieve local feature importance:

```
# MimicExplainer
local_mim_explanation = mim_explainer.explain_local(X_test[0:5])
local_mim_features = local_mim_explanation.get_ranked_local_names()
local_mim_importance = local_mim_explanation.get_ranked_local_values()


# TabularExplainer
local_tab_explanation = tab_explainer.explain_local(X_test[0:5])
local_tab_features = local_tab_explanation.get_ranked_local_names()
local_tab_importance = local_tab_explanation.get_ranked_local_values()
```

<mark>**Note:** The code is the same for **MimicExplainer** and **TabularExplainer**. The **PFIExplainer** doesn't support local feature importance explanations.</mark>
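
The ranked names and values are easier to read when paired up. The sketch below assumes that, for a classification model, both results are nested as class, then explained row, then feature; check the shape you actually get before relying on it. The sample lists are made-up stand-ins for the explainer output, and `pair_local_importance` is a hypothetical helper.

```python
def pair_local_importance(ranked_names, ranked_values):
    """Pair names with values, assuming [class][row][feature] nesting."""
    return [
        [
            list(zip(row_names, row_values))
            for row_names, row_values in zip(class_names, class_values)
        ]
        for class_names, class_values in zip(ranked_names, ranked_values)
    ]


# Hypothetical stand-ins for the results of get_ranked_local_names() and
# get_ranked_local_values() for a binary classifier and one explained row.
local_names = [
    [["income", "age", "loan_amount"]],  # class 0
    [["loan_amount", "age", "income"]],  # class 1
]
local_values = [
    [[0.3, -0.1, -0.6]],  # class 0
    [[0.6, 0.1, -0.3]],   # class 1
]

paired = pair_local_importance(local_names, local_values)
for class_index, class_rows in enumerate(paired):
    for row_index, features in enumerate(class_rows):
        print(f"class {class_index}, row {row_index}: {features}")
```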

<hr>

**<center><h1>Creating explanations</h1></center>**

When you use an estimator or a script to train a model in an Azure Machine Learning experiment, you can create an explainer and upload the explanation it generates to the run for later analysis.

**<h2>Creating an explanation in the experiment script</h2>**

To create an explanation in the experiment script, you'll need to ensure that the **azureml-interpret** and **azureml-contrib-interpret** packages are installed in the run environment. Then you can use these to create an explanation from your trained model and upload it to the run outputs. The following code example shows how code to generate and upload a model explanation can be incorporated into an experiment script.
```
# Import Azure ML run library
from azureml.core.run import Run
from azureml.contrib.interpret.explanation.explanation_client import ExplanationClient
from interpret.ext.blackbox import TabularExplainer
# other imports as required

# Get the experiment run context
run = Run.get_context()

# code to train model goes here

# Get explanation
explainer = TabularExplainer(model, X_train, features=features, classes=labels)
explanation = explainer.explain_global(X_test)

# Get an Explanation Client and upload the explanation
explain_client = ExplanationClient.from_run(run)
explain_client.upload_model_explanation(explanation, comment='Tabular Explanation')

# Complete the run
run.complete()
```

**<h2>Viewing the explanation</h2>**

You can view the explanation you created for your model in the **Explanations** tab for the run in Azure Machine Learning studio.

You can also use the **ExplanationClient** object to download the explanation in Python.
```
from azureml.contrib.interpret.explanation.explanation_client import ExplanationClient

client = ExplanationClient.from_run_id(workspace=ws,
                                       experiment_name=experiment.experiment_name, 
                                       run_id=run.id)
explanation = client.download_model_explanation()
feature_importances = explanation.get_feature_importance_dict()
```

<hr>

**<center><h1>Visualizing explanations</h1></center>**

Model explanations in Azure Machine Learning studio include multiple visualizations that you can use to explore feature importance.

<mark>**Note:** Visualizations are only available for experiment runs that were configured to generate and upload explanations. When using automated machine learning, only the run producing the best model has explanations generated by default.</mark>

**<h2>Visualizing global feature importance</h2>**

The first visualization on the Explanations tab for a run shows global feature importance.

<img src="images/09-00-vis-global.png"/>

You can use the slider to show only the top N features.



**<h2>Visualizing summary importance</h2>**

Switching to the Summary Importance visualization shows the distribution of individual importance values for each feature across the test dataset.

<img src="images/09-00-vis-summary.png"/>

You can view the features as a swarm plot (shown above), a box plot, or a violin plot.
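
To give a sense of what the box plot view summarizes, the following sketch computes a five-number summary of local importance values for each feature. The values are invented for illustration, `box_plot_summary` is a hypothetical helper, and the studio may compute its quartiles differently.

```python
import statistics


def box_plot_summary(values):
    """Return the five-number summary that a box plot typically draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


# Hypothetical local importance values, one per test case, for two features.
local_importance = {
    "income": [0.9, 0.4, -0.2, 0.7, 0.1],
    "age": [0.1, -0.1, 0.0, 0.2, -0.2],
}

summaries = {
    feature: box_plot_summary(values)
    for feature, values in local_importance.items()
}
for feature, summary in summaries.items():
    print(feature, summary)
```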



**<h2>Visualizing local feature importance</h2>**

Selecting an individual data point shows the local feature importance for the case to which the data point belongs.

<img src="images/09-00-vis-local.png" />

<hr>

**<center><h1>Interpret models</h1></center>**

Now it's your chance to interpret models.

In this exercise, you will:

- Generate feature importance for a model.
- Generate explanations as part of a model training experiment.





**<h2>Instructions</h2>**

Follow these instructions to complete the exercise.

1. If you do not already have an Azure subscription, sign up for a free trial at https://azure.microsoft.com.
2. View the exercise repo at https://aka.ms/mslearn-dp100.
3. If you have not already done so, complete the Create an Azure Machine Learning workspace exercise to provision an Azure Machine Learning workspace, create a compute instance, and clone the required files.
4. Complete the Interpret models exercise.

<hr>

**<center><h1>Knowledge check</h1></center>**

1. You have trained a classification model, and you want to quantify the influence of each feature on a specific individual prediction. What should you examine?

  - Global feature importance

  - Local feature importance

  - Recall and Precision

2. Which explainer uses an architecture-appropriate SHAP algorithm to interpret a model?

  - PFIExplainer

  - MimicExplainer

  - TabularExplainer

<hr>

**<center><h1>Summary</h1></center>**

In this module, you learned how to:

- Interpret global and local feature importance.
- Use an explainer to interpret a model.
- Create model explanations in a training experiment.
- Visualize model explanations.

To learn more about interpreting models, see [Model interpretability in Azure Machine Learning](https://docs.microsoft.com/en-us/azure/machine-learning/how-to-machine-learning-interpretability) in the Azure Machine Learning documentation.



<hr>