# Modeling Unbalanced Classes

Some classification models are better suited than others to outliers, low occurrence of a class, or rare events. The most common methods to add robustness to a classifier are related to stratified sampling to re-balance the training data. This module will walk you through both stratified sampling methods and more novel approaches to model data sets with unbalanced classes. 

## Learning Objectives

Identify class weights and sampling as methods to deal with unbalanced classes in a data set.

Recognize the syntax for sampling, blagging, and nearest neighbor methods for modeling unbalanced classes.


## Model Interpretability

In general, we need explanation methods that make the behaviors and predictions of machine learning models understandable to humans. We use those methods to understand the model structure, which important features should be included in the model, and how the model maps features to prediction outcomes.


In addition, knowing exactly how a model works can sometimes give us more insight than its predictions alone. For example, understanding how an AI system diagnoses cancer may help human health experts identify evidence-based risk factors. Interpretability is especially important for decision makers in sensitive or high-risk domains such as finance or health, who need to be confident that the model is working correctly. Black-box machine learning systems cannot be trusted unless they can be monitored and interpreted. As such, building trustworthy models is sometimes even more important than building high-performing models.

Understanding machine learning models supports:
- Model explanation
- Model trust
- Model debugging

Recap
- We can only trust and effectively debug machine learning models if they are understandable
- Self-interpretable models have simple and intuitive structures
- Non-self-interpretable models have complex structures and can be described as black-box models

### Examples of Self-Interpretable and Non-Self-Interpretable Models

#### Self-Interpretable

Linear models are probably the most widely used predictive models due to their simplicity and effectiveness, especially in the financial industry. Their structure is simple: the prediction is just a linear combination of the feature values. As such, linear model prediction outcomes often require minimal effort to understand.
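To make the "linear combination" concrete, here is a minimal sketch of how a linear model's prediction breaks down into per-feature contributions. The feature names, coefficients, and applicant values are hypothetical and chosen only for illustration; they do not come from a fitted model.

```python
# Hypothetical coefficients for a loan-risk score (illustrative values only).
intercept = 0.5
coefficients = {"income": -0.25, "debt_ratio": 2.0, "late_payments": 0.75}


def predict(row):
    """Linear model: intercept plus a weighted sum of the feature values."""
    return intercept + sum(coefficients[name] * row[name] for name in coefficients)


def contributions(row):
    """Per-feature contribution to the prediction (coefficient * feature value)."""
    return {name: coefficients[name] * row[name] for name in coefficients}


applicant = {"income": 4.0, "debt_ratio": 0.5, "late_payments": 2.0}
score = predict(applicant)
parts = contributions(applicant)

print(score)  # 2.0
print(parts)  # {'income': -1.0, 'debt_ratio': 1.0, 'late_payments': 1.5}
```

Each contribution can be read directly: in this made-up example, the two late payments add 1.5 to the score while income subtracts 1.0.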

Tree models, such as decision trees, are another popular type of self-interpretable model. Their main characteristic is that they mimic the human reasoning process by creating a set of IF-THEN-ELSE rules.
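As a rough illustration, a small decision tree can be written out directly as IF-THEN-ELSE rules. The sketch below is hand-written with hypothetical features and thresholds, not a tree learned from data; it returns the rule that fired alongside the label so each prediction carries its own explanation.

```python
def classify(applicant):
    """A depth-2 tree as explicit rules; returns (label, rule that produced it)."""
    if applicant["late_payments"] > 2:
        return "high risk", "late_payments > 2"
    elif applicant["debt_ratio"] > 0.4:
        return "medium risk", "late_payments <= 2 AND debt_ratio > 0.4"
    else:
        return "low risk", "late_payments <= 2 AND debt_ratio <= 0.4"


label, rule = classify({"late_payments": 1, "debt_ratio": 0.55})
print(label, "because", rule)
# medium risk because late_payments <= 2 AND debt_ratio > 0.4
```

For a tree fitted with scikit-learn, `sklearn.tree.export_text` prints the learned rules in a similar textual form.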

The K-nearest neighbor model, or KNN, can also be considered a self-interpretable model if the feature space is comprehensible and kept small.

#### Non-Self-Interpretable Models

Ensemble Models

![](./images/70_ModelInterpretationMethods.png)



### Model-Agnostic Explanations

![](./images/71_ModelAgnosticExplainations.png)

Feature importance

Measuring the importance of features helps you:

1. Simplify your model by only including important features
2. Interpret how predictions were made

Permutation feature importance
- The basic idea of permutation feature importance is very simple. For each feature, we shuffle its feature values and use the model to make predictions based on the shuffled values. In most cases, the prediction error will increase. Permuting important or impactful features will tend to generate large prediction errors and less important features will tend to generate small error increases. As such, feature importance can be measured by calculating the difference between the prediction errors before and after permutation. 

![](./images/72_PermutationFeatureImportanceExample.png)
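The shuffle-and-compare procedure for permutation feature importance can be sketched in plain Python. This is a minimal illustration, not a reference implementation: `model` is a hypothetical stand-in for an already fitted model, the data is synthetic, and mean squared error is assumed as the error metric.

```python
import random


def model(row):
    """Hypothetical fitted model: strong use of x0, weak use of x1, ignores x2."""
    return 3.0 * row[0] + 0.5 * row[1]


def mean_squared_error(X, y):
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)


def permutation_importance(X, y, n_repeats=10, seed=0):
    """Average increase in error after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mean_squared_error(X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mean_squared_error(X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data: targets are generated by the model itself, so baseline error is 0.
data_rng = random.Random(42)
X = [[data_rng.random() for _ in range(3)] for _ in range(200)]
y = [model(row) for row in X]

importances = permutation_importance(X, y)
for j, value in enumerate(importances):
    print(f"x{j}: {value:.4f}")
```

Shuffling `x0` should raise the error the most, `x1` only slightly, and `x2` not at all, because the model never reads it. scikit-learn offers the same idea for fitted estimators as `sklearn.inspection.permutation_importance`.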

Partial dependence plot (PDP)
- A partial dependence plot is an effective way to illustrate the relationship between a feature and the model outcome. It visualizes the marginal effect of a feature, that is, it shows how the average model outcome changes as that feature takes different values across its distribution. Note that the rest of the features keep their observed values while the feature of interest is varied.
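The computation behind such a plot can be sketched as follows: for each grid value, overwrite the feature of interest in every row, leave the other features as observed, and average the predictions. The `model` function and the tiny dataset are hypothetical and chosen so the arithmetic is easy to check by hand.

```python
def model(row):
    """Hypothetical fitted model: nonlinear in x0, linear in x1."""
    return row[0] ** 2 + 2.0 * row[1]


def partial_dependence(X, feature, grid):
    """Average prediction when `feature` is forced to each value in `grid`."""
    averages = []
    for value in grid:
        # Replace only the feature of interest; other features keep observed values.
        preds = [model(row[:feature] + [value] + row[feature + 1:]) for row in X]
        averages.append(sum(preds) / len(preds))
    return averages


X = [[0.0, 1.0], [1.0, 2.0], [2.0, 3.0], [3.0, 2.0]]
grid = [0.0, 1.0, 2.0]

pd_x0 = partial_dependence(X, feature=0, grid=grid)
print(pd_x0)  # [4.0, 5.0, 8.0]
```

Plotting `grid` against `pd_x0` gives the partial dependence curve. Here it follows `x0 ** 2` shifted up by the average contribution of `x1`, which is 4.0 in this toy data.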

Impurity-based feature importance

Shapley Additive exPlanations (SHAP) values

### Surrogate Models

![](./images/73_SurogateModels.png)

![](./images/74_GlobalSurrogateModels.png)

Local surrogate
- Global surrogate models may not always work
  - Large inconsistency between surrogate models and black-box models
  - Multiple data instance groups or clusters in the dataset
- Explain specific data instances of interest locally
- A local surrogate model is built on one or a few instances

Local Interpretable Model-Agnostic Explanations (LIME)
![](./images/75_LocalInterpretableModel-AgnosticExplanations.png)
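The local surrogate idea can be sketched with a single feature: probe the black-box model near the instance of interest, weight each probe by its proximity to that instance, and fit a weighted linear model. This is a simplified illustration in the spirit of LIME, not the LIME library itself. The `black_box` function, the evenly spaced perturbations (LIME samples them randomly), and the `radius` and `kernel_width` values are all assumptions made for this example.

```python
import math


def black_box(x):
    """Hypothetical black-box model with a nonlinear response."""
    return x ** 2


def local_surrogate(x0, radius=0.5, n_points=21, kernel_width=0.25):
    """Fit a proximity-weighted line to black-box predictions around x0."""
    # Evenly spaced perturbations in [x0 - radius, x0 + radius].
    offsets = [radius * (2 * i / (n_points - 1) - 1) for i in range(n_points)]
    xs = [x0 + d for d in offsets]
    ys = [black_box(x) for x in xs]
    # Exponential kernel: points closer to x0 get larger weights.
    weights = [math.exp(-(d ** 2) / kernel_width ** 2) for d in offsets]

    # Closed-form weighted least squares for one feature.
    total = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, xs)) / total
    y_mean = sum(w * y for w, y in zip(weights, ys)) / total
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(weights, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return slope, intercept


slope_a, _ = local_surrogate(3.0)
slope_b, _ = local_surrogate(-1.0)
print(round(slope_a, 6), round(slope_b, 6))
```

The two local slopes come out close to 6 and -2. The same black-box model therefore gets a different linear explanation at each instance, which a single global linear surrogate could not capture.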


### Practice Lab: Model Interpretability

### Practice: Model interpretability

## Introduction to Unbalanced Classes

### Upsampling and Downsampling

### Modeling Approaches: Weighting and Stratified Sampling

### Modeling Approaches: Random and Synthetic Oversampling

### Modeling Approaches: Nearest Neighbor Methods

### Modeling Approaches: Blagging