# Interpret your ML model

[1. What's in a black box?](#black_box)

[2. Different types of interpretation](#types)

[3. Trade-off between Accuracy and Interpretability](#tradeoff)

[4. Feature Importance](#feature_importance)

[5. Dependency plots](#dependency_plots)

[6. Local interpretation](#local)

[7. SHAP](#shap)

[8. Practice! Explaining your ML model 🔍](#practice)

[9. Wrapping up](#wrapping_up)

[10. Great resources to learn more about interpretable ML 📚](#resources)


# <a id='black_box'></a> 1. What's in a black box?

As more companies adopt machine learning and big data, they care more and more about the interpretability of their models. This is understandable: asking questions and looking for explanations is human.

We want to know not only "What's the prediction?", but "Why so?" as well. Thus, interpretation of ML models is important and helps us to:

   - Explain individual predictions
   - Understand models' behaviour
   - Detect errors & biases
   - Generate insights about data & create new features
   
![ml_model_lifecycle.png](attachment:ml_model_lifecycle.png)

# <a id='types'></a> 2. Different types of interpretation

A model's predictions can be explained in different ways. The choice of medium depends on what is most appropriate for a given problem.

## <font color='purple'> Visualization
    
For example, visualized interpretations are perfect for explaining the image classifier predictions.    
![image.png](attachment:image.png)

Source: [LIME Tutorial](https://marcotcr.github.io/lime/tutorials/Tutorial%20-%20images.html)

## <font color='purple'> Textual description

A brief text explanation is also an option.
    
![image.png](attachment:image.png)
    
Source: [Generating Visual Explanations](https://arxiv.org/pdf/1603.08507.pdf)

## <font color='purple'> Formulae
And sometimes a good old formula is worth a thousand words:

House price = $\$2800 * room + \$10000 * {swimming pool} + \$5000 * garage$
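
As a minimal sketch, the formula above can be evaluated in code so that each term shows how much a feature contributes to the price. The function name `explain_price` and the sample house are assumptions made purely for illustration.

```python
# Coefficients taken from the formula above (dollars per unit of each feature).
COEFFICIENTS = {"room": 2800, "swimming_pool": 10000, "garage": 5000}

def explain_price(house):
    """Return the predicted price and the contribution of each feature."""
    contributions = {name: coef * house.get(name, 0)
                     for name, coef in COEFFICIENTS.items()}
    return sum(contributions.values()), contributions

# Hypothetical house: 3 rooms, a swimming pool, no garage.
price, contributions = explain_price({"room": 3, "swimming_pool": 1, "garage": 0})
print(price)
print(contributions)
```

The per-feature contributions are the explanation: they add up to the prediction, which is exactly what makes a linear formula easy to read.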

# <a id='tradeoff'></a> 3. Trade-off between Accuracy and Interpretability

The thing is that not all kinds of machine learning models are equally interpretable. As a rule, more accurate and advanced algorithms, e.g. neural networks, are hard to explain. Imagine making sense of all these layers' weights!

Thus, it is the job of a data scientist to:
1. Find a trade-off between accuracy and interpretability.

    One may use a linear regression, whose predictions are easy to explain. But the price of high interpretability may be a lower metric compared to a more complicated model such as gradient boosting.

2. Explain the choice of a particular algorithm to a client.

![tradeoff.png](attachment:tradeoff.png)


# <a id='feature_importance'></a> 4. Feature importance
Feature importance helps to answer the question "**What features** affect the model's prediction?"

One of the methods used to estimate the importance of features is Permutation importance.

*Idea*: if we permute the values of an important feature, the data won't reflect the real world anymore and the accuracy of the model will go down.

The method works as follows:

1. Train the model.
2. Shuffle all values of the feature `X` and make predictions on the updated data.
3. Compute $Importance(X) = Accuracy_{actual} - Accuracy_{permuted}$.
4. Restore the original order of the feature's values. Repeat steps 2-3 with the next feature.
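
The procedure can be sketched in plain Python. This is an illustration under stated assumptions, not a library implementation: `toy_model` is a hypothetical stand-in for a trained model, and the data is randomly generated.

```python
import random

def accuracy(model, X, y):
    """Share of rows where the model's prediction matches the label."""
    return sum(model(row) == label for row, label in zip(X, y)) / len(y)

def permutation_importance(model, X, y, feature_idx, seed=0):
    """Accuracy drop after shuffling one feature column (the model is not retrained)."""
    rng = random.Random(seed)
    shuffled = [row[feature_idx] for row in X]
    rng.shuffle(shuffled)
    # Copy every row with only the chosen feature replaced, so the original X stays intact.
    X_perm = [row[:feature_idx] + [value] + row[feature_idx + 1:]
              for row, value in zip(X, shuffled)]
    return accuracy(model, X, y) - accuracy(model, X_perm, y)

# Hypothetical "trained model": predicts 1 when feature 0 is positive and ignores feature 1.
def toy_model(row):
    return int(row[0] > 0)

data_rng = random.Random(42)
X = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(1000)]
y = [toy_model(row) for row in X]

imp_used = permutation_importance(toy_model, X, y, feature_idx=0)
imp_ignored = permutation_importance(toy_model, X, y, feature_idx=1)
print("feature 0:", imp_used)
print("feature 1:", imp_ignored)
```

Shuffling the feature the model relies on costs a large share of the accuracy, while shuffling the ignored feature changes nothing.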


**Advantages:**
- Concise global explanation of the model's behaviour.
- Easy to interpret.
- No need to re-train a model again and again.

**Disadvantages:**
- Needs the ground truth values for the target.
- Tied to the model's error. That's not always bad, but it is not what we need in some cases.

    Sometimes we want to know how much the prediction changes with the feature's value, regardless of how much the metric changes.

# <a id='dependency_plots'></a> 5. Dependency plots

Partial Dependency Plots (PDP) help you answer the question "**How** does the feature affect the predictions?"
A PDP provides a quick look at the global relationship between a feature and the prediction. The plot below shows that the more money is in one's bank account, the higher the probability of signing a term deposit during [a bank campaign](https://archive.ics.uci.edu/ml/datasets/Bank+Marketing).

![PDP.svg](attachment:PDP.svg)

Let's look at how this plot is created:
1. Take one sample: a single student with no loans and a balance of around \$1000.
2. Change the value of the feature of interest, `balance`, to another value, e.g. 5000, keeping all other features fixed.
3. Make a prediction on the updated sample.
4. Repeat for other values: what is the model output if `balance==10`? And so on.
5. Moving along the x axis from smaller to larger values, plot the resulting predictions on the y axis.

So far we have considered only one sample. To create a PDP, we repeat this procedure for all the samples, i.e. all the rows in our dataset, and then plot the average prediction for each value of the feature.
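
The whole procedure fits in a few lines of plain Python. In this sketch `toy_model`, its coefficients, and the four sample rows are all hypothetical; a real model would be plugged in instead.

```python
def partial_dependence(model, X, feature_idx, grid):
    """For each grid value, overwrite the feature in every row and average the predictions."""
    curve = []
    for value in grid:
        preds = []
        for row in X:
            modified = list(row)           # copy, so the original row is untouched
            modified[feature_idx] = value  # set the feature of interest to the grid value
            preds.append(model(modified))
        curve.append(sum(preds) / len(preds))
    return curve

# Hypothetical scoring function over [balance, has_loan], clipped to the range [0, 1].
def toy_model(row):
    balance, has_loan = row
    score = 0.1 + 0.0001 * balance - 0.05 * has_loan
    return min(max(score, 0.0), 1.0)

X = [[1000, 0], [200, 1], [3500, 0], [50, 1]]
grid = [0, 1000, 2000, 3000, 4000, 5000]

pdp_curve = partial_dependence(toy_model, X, feature_idx=0, grid=grid)
for value, avg_pred in zip(grid, pdp_curve):
    print(value, round(avg_pred, 3))
```

Plotting `grid` on the x axis against `pdp_curve` on the y axis gives the partial dependency plot for `balance`.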

**Advantages:**
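- Easy to interpret.
- Has a causal interpretation with respect to the model: we intervene on a feature and observe how the predictions change. By itself, this says nothing about causality in the real world.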

**Disadvantages:**
- One plot can give you the analysis of only one or two features. Plots with more features would be difficult for humans to comprehend.
- Assumes that the features are independent. However, this assumption is often violated in real life.

    Why is this a problem? Imagine that we want to draw a PDP for data with correlated features. While we change the values of one feature, the values of the related feature stay the same. As a result, we can get unrealistic data points. For instance, we are interested in the feature `Weight`, but the dataset also contains the feature `Height`. As we change the value of `Weight`, the value of `Height` stays fixed, so we can end up with a sample with `Weight==200 kg` and `Height==150 cm`.
- Opposite effects can cancel out the feature's impact.
    
    Imagine that for half of the samples the feature is positively associated with the prediction: the higher the value, the higher the model's outcome. For the other half the association is negative: the lower the feature's value, the higher the prediction. In this case, the PDP may be a horizontal line, since the positive effects are cancelled out by the negative ones.
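
The cancellation effect is easy to reproduce. In the sketch below, the hypothetical `toy_model` responds to `x` in opposite directions depending on a second feature `group`. Looking at the per-row curves (often called individual conditional expectation, or ICE, curves) next to their average shows what the PDP hides.

```python
def ice_curves(model, X, feature_idx, grid):
    """One prediction curve per row: vary the feature over the grid, keep the rest fixed."""
    curves = []
    for row in X:
        curve = []
        for value in grid:
            modified = list(row)
            modified[feature_idx] = value
            curve.append(model(modified))
        curves.append(curve)
    return curves

# Hypothetical model: x raises the output for group 1 and lowers it for group 0.
def toy_model(row):
    x, group = row
    return x if group == 1 else -x

X = [[x / 10, group] for x in range(-10, 11) for group in (0, 1)]
grid = [-1.0, -0.5, 0.0, 0.5, 1.0]

curves = ice_curves(toy_model, X, feature_idx=0, grid=grid)
# The PDP is the point-wise average of the individual curves.
pdp_curve = [sum(curve[k] for curve in curves) / len(curves) for k in range(len(grid))]

print("PDP:", pdp_curve)
print("first two individual curves:", curves[0], curves[1])
```

The averaged curve is flat even though every individual curve has a clear slope, so a horizontal PDP alone does not prove that a feature is irrelevant.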


# <a id='local'></a> 6. Local interpretation

So far, we have considered two methods of global interpretation: feature importance and dependency plots. These approaches help us explain our model's behaviour at, well, a global level, which is surely nice. However, we often need to explain a particular prediction for an individual sample. To achieve this goal, we may turn to local interpretation. One technique that can be used here is [LIME, Local Interpretable Model-agnostic Explanations](https://github.com/marcotcr/lime).

The idea is as follows: instead of interpreting predictions of the black box we have at hand, we create a local surrogate model which is interpretable by its nature (e.g. a linear regression or a decision tree), use it to predict on an interesting data point and finally explain the prediction.

![lime.png](attachment:lime.png)

In the picture above, the prediction to explain is the bold red cross. The blue and pink areas represent the complex decision function of the black box model. Clearly, this function cannot be approximated well by a linear model globally. However, as we can see, the dashed line that represents the learned explanation is locally faithful.

Source: [Why Should I Trust You?](https://arxiv.org/pdf/1602.04938.pdf)
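
The idea of a local surrogate can be sketched in one dimension. This is a simplified illustration, not the LIME library's algorithm: `black_box` is a hypothetical stand-in model, and the Gaussian sampling and exponential kernel are assumptions chosen for the example.

```python
import math
import random

def black_box(x):
    """Stand-in for a complex, non-linear model (an assumption for this sketch)."""
    return x ** 2

def local_linear_surrogate(f, x0, n_samples=2000, kernel_width=1.0, seed=0):
    """Fit a weighted straight line to f around x0; nearby samples get larger weights."""
    rng = random.Random(seed)
    xs = [x0 + rng.gauss(0, 1) for _ in range(n_samples)]   # perturb the point of interest
    ys = [f(x) for x in xs]                                  # query the black box
    ws = [math.exp(-((x - x0) ** 2) / kernel_width ** 2) for x in xs]  # proximity weights

    # Closed-form weighted least squares for a single feature.
    w_sum = sum(ws)
    x_mean = sum(w * x for w, x in zip(ws, xs)) / w_sum
    y_mean = sum(w * y for w, y in zip(ws, ys)) / w_sum
    cov = sum(w * (x - x_mean) * (y - y_mean) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - x_mean) ** 2 for w, x in zip(ws, xs))
    slope = cov / var
    intercept = y_mean - slope * x_mean
    return slope, intercept

slope, intercept = local_linear_surrogate(black_box, x0=3.0)
print("local slope near x0=3:", round(slope, 2))
```

The fitted slope is close to the local rate of change of the black box around `x0`, even though a single straight line could never describe the whole function. That is what "locally faithful" means.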

**Advantages:**
- Concise and clear explanations.
- Compatible with most data types: texts, images, tabular data.
- Fast to compute, as we focus on one sample at a time.

**Disadvantages:**
- By default, linear models are used to approximate the model's local behaviour, which may be too simple where the decision function is strongly non-linear.
- No global explanations.

# <a id='shap'></a> 7. SHAP

[SHapley Additive exPlanations (SHAP)](https://github.com/slundberg/shap) is a method based on the concept of Shapley values from game theory.

Idea: a feature is a "player", a prediction is a "gain". Then the Shapley value is the contribution of a feature averaged over all possible combinations of a "team":


$$\phi_i(v) = \sum_{ S \subseteq N \setminus \lbrace i \rbrace } {{|S|! \, ( |N| - |S| - 1 )!} \over {|N|!}} ( v( S \cup \lbrace i \rbrace) - v( S ))$$


$N$ - the set of all players.

$S$ - a "team": a subset of the players in $N$ that does not include player $i$.

$v(S)$ - the gain of $S$.

$v( S \cup \lbrace i \rbrace) - v(S)$ - the contribution of player $i$ when joining $S$.
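
For a handful of players the formula can be evaluated directly by enumerating every team. The sketch below does exactly that. The gain function and the feature names are hypothetical, and this brute-force approach is only feasible for small $|N|$ (the SHAP library relies on faster, model-specific algorithms).

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values: weighted average of each player's marginal contribution."""
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for size in range(n):  # team sizes 0 .. n-1
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                team = frozenset(subset)
                total += weight * (v(team | {i}) - v(team))
        phi[i] = total
    return phi

# Hypothetical gain: additive effects plus a bonus when two features appear together.
def gain(team):
    base = {"age": 10.0, "education": 20.0, "capital_gain": 30.0}
    total = sum(base[p] for p in team)
    if "education" in team and "capital_gain" in team:
        total += 12.0
    return total

players = ["age", "education", "capital_gain"]
phi = shapley_values(players, gain)
print(phi)
```

The interaction bonus is split equally between the two features that create it, and the values add up to the gain of the full team. This additivity is what lets SHAP decompose a single prediction into per-feature contributions.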

**Advantages:**
- Global and local interpretation.
- Intuitively clear local explanations: the prediction is represented as a game outcome where the features are the team players.

**Disadvantages:**
- SHAP returns only one value per feature, not an interpretable model as LIME does.
- Slow when creating a global interpretation.


# <a id='practice'></a> 8. Practice! Explaining your ML model 🔍

## Loading data

Our __task__ is to predict whether a person earns more than \$50,000 a year.

In [1]:
import warnings
warnings.filterwarnings('ignore')

import pandas as pd
import numpy as np
#import pdpbox, lime, shap, eli5
from matplotlib import pyplot as plt

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split
from imblearn.combine import SMOTETomek

%matplotlib inline

In [2]:
from aif360.datasets import StandardDataset
from aif360.metrics import BinaryLabelDatasetMetric, ClassificationMetric
from aif360.algorithms.preprocessing import Reweighing
import matplotlib.patches as patches
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import Markdown, display

In [3]:
data = pd.read_csv('../../data/adult.csv')
data.shape

(32561, 15)

In [4]:
data.columns

Index(['age', 'workclass', 'fnlwgt', 'education', 'education.num',
       'marital.status', 'occupation', 'relationship', 'race', 'sex',
       'capital.gain', 'capital.loss', 'hours.per.week', 'native.country',
       'income'],
      dtype='object')

In [5]:
data.head()

Unnamed: 0,age,workclass,fnlwgt,education,education.num,marital.status,occupation,relationship,race,sex,capital.gain,capital.loss,hours.per.week,native.country,income
0,90,?,77053,HS-grad,9,Widowed,?,Not-in-family,White,Female,0,4356,40,United-States,<=50K
1,82,Private,132870,HS-grad,9,Widowed,Exec-managerial,Not-in-family,White,Female,0,4356,18,United-States,<=50K
2,66,?,186061,Some-college,10,Widowed,?,Unmarried,Black,Female,0,4356,40,United-States,<=50K
3,54,Private,140359,7th-8th,4,Divorced,Machine-op-inspct,Unmarried,White,Female,0,3900,40,United-States,<=50K
4,41,Private,264663,Some-college,10,Separated,Prof-specialty,Own-child,White,Female,0,3900,40,United-States,<=50K


## Data Preprocessing and Modeling

Since we are going to tackle this case as a classification problem, let's encode the variable `income` into a binary target.

In [6]:
data['target']=data['income'].map({'<=50K':0,'>50K':1})
data.drop("income",axis=1,inplace=True)
data['target'].value_counts()

0    24720
1     7841
Name: target, dtype: int64

In [7]:
# Let's drop "education.num" feature. We will use one-hot encoding instead.
data.drop("education.num",axis=1,inplace=True)

In [8]:
# Since this example is for educational purposes, we'll also drop the 'native.country' feature to reduce the dimensionality of our data.
data.drop('native.country',axis=1,inplace=True)

In [9]:
# Now we will encode categorical features using one-hot encoding, i.e. each category will now be represented by a separate column
# containing only 0 and 1, depending on whether this category is relevant in a sample (row in our data) 
data=pd.get_dummies(data, drop_first = True)

In [10]:
data.head()

Unnamed: 0,age,fnlwgt,capital.gain,capital.loss,hours.per.week,target,workclass_Federal-gov,workclass_Local-gov,workclass_Never-worked,workclass_Private,...,relationship_Not-in-family,relationship_Other-relative,relationship_Own-child,relationship_Unmarried,relationship_Wife,race_Asian-Pac-Islander,race_Black,race_Other,race_White,sex_Male
0,90,77053,0,4356,40,0,0,0,0,0,...,1,0,0,0,0,0,0,0,1,0
1,82,132870,0,4356,18,0,0,0,0,1,...,1,0,0,0,0,0,0,0,1,0
2,66,186061,0,4356,40,0,0,0,0,0,...,0,0,0,1,0,0,1,0,0,0
3,54,140359,0,3900,40,0,0,0,0,1,...,0,0,0,1,0,0,0,0,1,0
4,41,264663,0,3900,40,0,0,0,0,1,...,0,0,1,0,0,0,0,0,1,0


Let's split our data into train and test sets in a 70/30 proportion. We will also fix `random_state` for reproducibility and use `stratify` to preserve the same class distribution in both sets.

In [11]:
y = data['target'].values
features = [col for col in data.columns if col not in ['target']]
X = data[features]

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1, test_size=0.3, stratify=y)
print(X_train.shape, y_train.shape)
print(X_test.shape, y_test.shape)

(22792, 58) (22792,)
(9769, 58) (9769,)


In [12]:
model = RandomForestClassifier(random_state=1).fit(X_train, y_train)
y_pred = model.predict(X_test)

print("Accuracy: %.2f" %accuracy_score(y_test, y_pred))
print("Recall: %.2f" %recall_score(y_test, y_pred))

Accuracy: 0.86
Recall: 0.62


In [13]:
print(data['target'])

0        0
1        0
2        0
3        0
4        0
        ..
32556    0
32557    0
32558    1
32559    0
32560    0
Name: target, Length: 32561, dtype: int64


### Explore what features are important for the model. 

Here we are going to use permutation feature importance.

In [14]:
import eli5
from eli5.sklearn import PermutationImportance

imp = PermutationImportance(model, random_state=1).fit(X_test, y_test)
eli5.show_weights(imp, feature_names = X_test.columns.tolist())

ModuleNotFoundError: No module named 'eli5'
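
If `eli5` is not installed, as in the run above, one alternative is scikit-learn's own `permutation_importance` from `sklearn.inspection`. The sketch below is self-contained and uses synthetic data as a stand-in (an assumption); in this notebook you would pass `model`, `X_test` and `y_test` instead of the `demo_*` objects.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in data, so that the example runs on its own.
X_demo, y_demo = make_classification(n_samples=500, n_features=6,
                                     n_informative=3, random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.3,
                                          random_state=1, stratify=y_demo)
demo_model = RandomForestClassifier(n_estimators=50, random_state=1).fit(X_tr, y_tr)

# Shuffle each feature n_repeats times on held-out data and record the accuracy drop.
result = permutation_importance(demo_model, X_te, y_te, scoring="accuracy",
                                n_repeats=5, random_state=1)

# Print the features from most to least important.
for idx in result.importances_mean.argsort()[::-1]:
    print(f"feature_{idx}: {result.importances_mean[idx]:.3f} "
          f"+/- {result.importances_std[idx]:.3f}")
```

The output is a plain ranking rather than the coloured table that `eli5.show_weights` renders, but the underlying quantity is the same kind of permutation-based accuracy drop.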

Ok, looks like the most important feature in our case is `capital.gain`. Let's see **how** exactly it influences the target.

In [15]:
from pdpbox import pdp, get_dataset, info_plots

feat_name = 'capital.gain'
capital_gain_pdp = pdp.pdp_isolate(model=model, dataset=X_test, 
                                   model_features=X_test.columns, feature=feat_name)

pdp.pdp_plot(capital_gain_pdp, feat_name)
plt.show()

ModuleNotFoundError: No module named 'pdpbox'

Unsurprisingly, our model shows that there's a positive correlation: people who have some capital gains are more likely to earn more than \$50,000.

The more hours a person works, the more money they make. Kind of a logical thought, right? But let's check if that's true.

In [16]:
feat_name = 'hours.per.week'

hours_per_week_pdp = pdp.pdp_isolate(model=model, dataset=X_test, 
                                   model_features=X_test.columns, feature=feat_name)

pdp.pdp_plot(hours_per_week_pdp, feat_name)
plt.show()

NameError: name 'pdp' is not defined

Well, actually this logic doesn't fully hold. The plot shows us the following:
- If a person works <20 hours a week, the chance of earning more than \$50K is close to zero. That's plausible because it's probably a part-time job.
- The probability of earning more than \$50K increases linearly when working from 20 up to 40 hours a week.
- However, working more hours won't make you richer in general. This could be explained by the fact that those extra hours (over the standard 40) probably represent a side hustle, which may not be stable. Another scenario might be that a person has several low-paying part-time jobs.

Now let's see how we can explain individual predictions of our model. In order to do that we'll find a person earning more than \$50K from the test set and draw some plots with SHAP and LIME.

In [17]:
# check the target. 1? perfect!
y_test[69]

1

In [18]:
# taking a quick look at the sample
pd.DataFrame(X_test.iloc[69]).T

Unnamed: 0,age,fnlwgt,capital.gain,capital.loss,hours.per.week,workclass_Federal-gov,workclass_Local-gov,workclass_Never-worked,workclass_Private,workclass_Self-emp-inc,...,relationship_Not-in-family,relationship_Other-relative,relationship_Own-child,relationship_Unmarried,relationship_Wife,race_Asian-Pac-Islander,race_Black,race_Other,race_White,sex_Male
2389,46,243190,7688,0,40,0,0,0,1,0,...,0,0,0,0,1,0,0,0,1,0


In [19]:
# First, create a prediction on this sample
row = X_test.iloc[69]
to_predict = row.values.reshape(1, -1)

model.predict_proba(to_predict)

array([[0.07, 0.93]])

Our model predicts that this person earns more than \$50K a year with a probability of over 90%. Let's find out what affected this prediction.

In [None]:
import shap 
# create object that can calculate shap values
explainer = shap.TreeExplainer(model)

# calculate Shap values
shap_values = explainer.shap_values(row)

In [None]:
# draw a plot
shap.initjs()
shap.force_plot(explainer.expected_value[1], shap_values[1], row)

Let's read the plot above:
- The base value is the mean of the model output over the training set. This means that, according to our model, the probability of earning more than \$50K is on average 24%.
- The red arrows show us **which** features "pushed" the probability for this person above the average, and **by how much**. Here the capital gain amounts to more than \$7K, so it is quite probable that this person earns at least 50 grand a year.
- The opposite goes for the blue arrows. Ouch, it seems like our model picked up one of the trends of the job market: women tend to earn less than men. Since the model was trained on this kind of data, it treats one's gender as one of the features affecting one's income.


Let's also create a local explanation for that prediction using LIME:

In [None]:
import lime.lime_tabular

explainer = lime.lime_tabular.LimeTabularExplainer(X_train.values, feature_names=X_test.columns,
                                                    discretize_continuous=True)

exp = explainer.explain_instance(row, model.predict_proba, num_features=8)
exp.show_in_notebook(show_table=True)

Looks about right: capital-gain is identified as the most important feature for predicting the target as `1`.


# <a id='wrapping_up'></a> 9. Wrapping up

Hope this notebook was useful and that you now know how to turn your black box into an explainable and trustworthy model. Cheers!

**Bonus:** a quick (but totally not comprehensive!) overview of some tools for interpretable ML.

![interpret.png](attachment:interpret.png)

# <a id='resources'></a> 10. Great resources to learn more about interpretable ML 📚

- [Ch. Molnar, Interpretable ML book](https://christophm.github.io/interpretable-ml-book/)
- [Kaggle course by Dan Becker: Machine Learning Explainability](https://www.kaggle.com/learn/machine-learning-explainability)
- [LIME repo](https://github.com/marcotcr/lime)
- [Shap repo](https://github.com/slundberg/shap)
- [InterpretML: Open-source project by Microsoft](https://github.com/interpretml/interpret)


In [24]:
data

Unnamed: 0,age,fnlwgt,capital.gain,capital.loss,hours.per.week,target,workclass_Federal-gov,workclass_Local-gov,workclass_Never-worked,workclass_Private,...,relationship_Not-in-family,relationship_Other-relative,relationship_Own-child,relationship_Unmarried,relationship_Wife,race_Asian-Pac-Islander,race_Black,race_Other,race_White,sex_Male
0,90,77053,0,4356,40,0,0,0,0,0,...,1,0,0,0,0,0,0,0,1,0
1,82,132870,0,4356,18,0,0,0,0,1,...,1,0,0,0,0,0,0,0,1,0
2,66,186061,0,4356,40,0,0,0,0,0,...,0,0,0,1,0,0,1,0,0,0
3,54,140359,0,3900,40,0,0,0,0,1,...,0,0,0,1,0,0,0,0,1,0
4,41,264663,0,3900,40,0,0,0,0,1,...,0,0,1,0,0,0,0,0,1,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
32556,22,310152,0,0,40,0,0,0,0,1,...,1,0,0,0,0,0,0,0,1,1
32557,27,257302,0,0,38,0,0,0,0,1,...,0,0,0,0,1,0,0,0,1,0
32558,40,154374,0,0,40,1,0,0,0,1,...,0,0,0,0,0,0,0,0,1,1
32559,58,151910,0,0,40,0,0,0,0,1,...,0,0,0,1,0,0,0,0,1,0


## Fairness

In [20]:
# This DataFrame stores the different models and the fairness metrics produced in this notebook
algo_metrics = pd.DataFrame(columns=['model', 'fair_metrics', 'prediction', 'probs'])

def add_to_df_algo_metrics(algo_metrics, model, fair_metrics, preds, probs, name):
    # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
    new_row = pd.DataFrame(data=[[model, fair_metrics, preds, probs]], columns=['model', 'fair_metrics', 'prediction', 'probs'], index=[name])
    return pd.concat([algo_metrics, new_row])

In [22]:
def fair_metrics(dataset, pred, pred_is_dataset=False):
    if pred_is_dataset:
        dataset_pred = pred
    else:
        dataset_pred = dataset.copy()
        dataset_pred.labels = pred
    
    cols = ['statistical_parity_difference', 'equal_opportunity_difference', 'average_abs_odds_difference',  'disparate_impact', 'theil_index']
    obj_fairness = [[0,0,0,1,0]]
    
    fair_metrics = pd.DataFrame(data=obj_fairness, index=['objective'], columns=cols)
    
    for attr in dataset_pred.protected_attribute_names:
        idx = dataset_pred.protected_attribute_names.index(attr)
        privileged_groups =  [{attr:dataset_pred.privileged_protected_attributes[idx][0]}] 
        unprivileged_groups = [{attr:dataset_pred.unprivileged_protected_attributes[idx][0]}] 
        
        classified_metric = ClassificationMetric(dataset, 
                                                     dataset_pred,
                                                     unprivileged_groups=unprivileged_groups,
                                                     privileged_groups=privileged_groups)

        metric_pred = BinaryLabelDatasetMetric(dataset_pred,
                                                     unprivileged_groups=unprivileged_groups,
                                                     privileged_groups=privileged_groups)

        acc = classified_metric.accuracy()

        row = pd.DataFrame([[metric_pred.mean_difference(),
                                classified_metric.equal_opportunity_difference(),
                                classified_metric.average_abs_odds_difference(),
                                metric_pred.disparate_impact(),
                                classified_metric.theil_index()]],
                           columns  = cols,
                           index = [attr]
                          )
        # DataFrame.append was removed in pandas 2.0, so use pd.concat instead
        fair_metrics = pd.concat([fair_metrics, row])
    
    fair_metrics = fair_metrics.replace([-np.inf, np.inf], 2)
        
    return fair_metrics

def plot_fair_metrics(fair_metrics):
    fig, ax = plt.subplots(figsize=(20,4), ncols=5, nrows=1)

    plt.subplots_adjust(
        left    =  0.125, 
        bottom  =  0.1, 
        right   =  0.9, 
        top     =  0.9, 
        wspace  =  .5, 
        hspace  =  1.1
    )

    y_title_margin = 1.2

    plt.suptitle("Fairness metrics", y = 1.09, fontsize=20)
    sns.set(style="dark")

    cols = fair_metrics.columns.values
    obj = fair_metrics.loc['objective']
    size_rect = [0.2,0.2,0.2,0.4,0.25]
    rect = [-0.1,-0.1,-0.1,0.8,0]
    bottom = [-1,-1,-1,0,0]
    top = [1,1,1,2,1]
    bound = [[-0.1,0.1],[-0.1,0.1],[-0.1,0.1],[0.8,1.2],[0,0.25]]

    display(Markdown("### Check bias metrics :"))
    display(Markdown("A model can be considered biased if just one of these five metrics shows that it is biased."))
    for attr in fair_metrics.index[1:len(fair_metrics)].values:
        display(Markdown("#### For the %s attribute :"%attr))
        # use .iloc for positional access; integer keys on a label-indexed Series are deprecated
        check = [bound[i][0] < fair_metrics.loc[attr].iloc[i] < bound[i][1] for i in range(0,5)]
        display(Markdown("With default thresholds, bias against unprivileged group detected in **%d** out of 5 metrics"%(5 - sum(check))))

    for i in range(0,5):
        plt.subplot(1, 5, i+1)
        ax = sns.barplot(x=fair_metrics.index[1:len(fair_metrics)], y=fair_metrics.iloc[1:len(fair_metrics)][cols[i]])
        
        for j in range(0,len(fair_metrics)-1):
            a, val = ax.patches[j], fair_metrics.iloc[j+1][cols[i]]
            marg = -0.2 if val < 0 else 0.1
            ax.text(a.get_x()+a.get_width()/5, a.get_y()+a.get_height()+marg, round(val, 3), fontsize=15,color='black')

        plt.ylim(bottom[i], top[i])
        plt.setp(ax.patches, linewidth=0)
        ax.add_patch(patches.Rectangle((-5,rect[i]), 10, size_rect[i], alpha=0.3, facecolor="green", linewidth=1, linestyle='solid'))
        plt.axhline(obj.iloc[i], color='black', alpha=0.3)
        plt.title(cols[i])
        ax.set_ylabel('')    
        ax.set_xlabel('')

In [30]:
def get_fair_metrics_and_plot(data, model, plot=False, model_aif=False):
    pred = model.predict(data).labels if model_aif else model.predict(data.features)
    # fair_metrics function available in the metrics.py file
    fair = fair_metrics(data, pred)

    if plot:
        # plot_fair_metrics function available in the visualisations.py file
        # The visualisation of this function is inspired by the dashboard on the demo of IBM aif360 
        plot_fair_metrics(fair)
        display(fair)
    
    return fair

In [26]:
privileged_groups = [{'sex_Male': 1}]
unprivileged_groups = [{'sex_Male': 0}]
dataset_orig = StandardDataset(data,
                                  label_name='target',
                                  protected_attribute_names=['sex_Male'],
                                  favorable_classes=[1],
                                  privileged_classes=[[1]])


In [27]:
metric_orig_train = BinaryLabelDatasetMetric(dataset_orig, 
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
display(Markdown("#### Original training dataset"))
print("Difference in mean outcomes between unprivileged and privileged groups = %f" % metric_orig_train.mean_difference())

#### Original training dataset

Difference in mean outcomes between unprivileged and privileged groups = -0.196276
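
The value printed above is the difference between the shares of favourable labels in the two groups. A negative number means that the unprivileged group (here `sex_Male == 0`) receives the favourable label less often. As a minimal sketch, this metric and the closely related disparate impact ratio can be computed by hand. The label lists below are hypothetical toy data, not the Adult dataset.

```python
def positive_rate(labels):
    """Share of favourable outcomes (label 1) in a group."""
    return sum(labels) / len(labels)

def statistical_parity_difference(unprivileged, privileged):
    """P(y=1 | unprivileged) - P(y=1 | privileged); 0 means parity."""
    return positive_rate(unprivileged) - positive_rate(privileged)

def disparate_impact(unprivileged, privileged):
    """P(y=1 | unprivileged) / P(y=1 | privileged); 1 means parity."""
    return positive_rate(unprivileged) / positive_rate(privileged)

# Hypothetical labels (1 = favourable outcome).
unprivileged = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]   # 10% favourable
privileged   = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]   # 30% favourable

spd = statistical_parity_difference(unprivileged, privileged)
di = disparate_impact(unprivileged, privileged)
print(round(spd, 3), round(di, 3))
```

These two quantities correspond to the `statistical_parity_difference` and `disparate_impact` columns in the `fair_metrics` table, whose ideal values are 0 and 1 respectively.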


In [31]:
import ipynbname
nb_fname = ipynbname.name()
nb_path = ipynbname.path()

from xgboost import XGBClassifier
import pickle

data_orig_train, data_orig_test = dataset_orig.split([0.7], shuffle=True)
X_train = data_orig_train.features
y_train = data_orig_train.labels.ravel()

X_test = data_orig_test.features
y_test = data_orig_test.labels.ravel()
num_estimators = 100

model = RandomForestClassifier(n_estimators=100)


mdl = model.fit(X_train, y_train)
with open('../../Results/RF/' + nb_fname + '.pkl', 'wb') as f:
    pickle.dump(mdl, f)

with open('../../Results/RF/' + nb_fname + '_Train' + '.pkl', 'wb') as f:
    pickle.dump(data_orig_train, f) 
    
with open('../../Results/RF/' + nb_fname + '_Test' + '.pkl', 'wb') as f:
    pickle.dump(data_orig_test, f) 

In [32]:
from csv import writer
from sklearn.metrics import accuracy_score, f1_score

final_metrics = []
accuracy = []
f1= []

for i in range(1,num_estimators+1):
    
    model = RandomForestClassifier(n_estimators=i)

    
    mdl = model.fit(X_train, y_train)
    yy = mdl.predict(X_test)
    accuracy.append(accuracy_score(y_test, yy))
    f1.append(f1_score(y_test, yy))
    fair = get_fair_metrics_and_plot(data_orig_test, mdl)                           
    fair_list = fair.iloc[1].tolist()
    fair_list.insert(0, i)
    final_metrics.append(fair_list)


In [33]:
import numpy as np
final_result = pd.DataFrame(final_metrics)
print(final_result)
final_result[4] = np.log(final_result[4])
final_result = final_result.transpose()
final_result.loc[0] = f1  # add f1 and acc to df
acc = pd.DataFrame(accuracy).transpose()
acc = acc.rename(index={0: 'accuracy'})
final_result = pd.concat([acc,final_result])
final_result = final_result.rename(index={0: 'f1', 1: 'statistical_parity_difference', 2: 'equal_opportunity_difference', 3: 'average_abs_odds_difference', 4: 'disparate_impact', 5: 'theil_index'})
final_result.columns = ['T' + str(col) for col in final_result.columns]
final_result.insert(0, "classifier", final_result['T' + str(num_estimators - 1)])   # add the final model's metrics at the beginning of the df
final_result.to_csv('../../Results/RF/' + nb_fname + '.csv')
final_result

      0         1         2         3         4         5
0     1 -0.182750 -0.080837  0.090248  0.388229  0.136529
1     2 -0.134901 -0.087296  0.068853  0.296718  0.151777
2     3 -0.196482 -0.090019  0.094819  0.326302  0.121292
3     4 -0.162644 -0.124258  0.094853  0.291798  0.137144
4     5 -0.195207 -0.099651  0.096669  0.308047  0.118813
..  ...       ...       ...       ...       ...       ...
95   96 -0.188846 -0.105558  0.092610  0.289121  0.113673
96   97 -0.183950 -0.080361  0.078211  0.307551  0.111698
97   98 -0.186855 -0.087651  0.084099  0.295393  0.114033
98   99 -0.185945 -0.076476  0.077948  0.302868  0.111976
99  100 -0.185472 -0.077673  0.078519  0.297756  0.112806

[100 rows x 6 columns]


Unnamed: 0,classifier,T0,T1,T2,T3,T4,T5,T6,T7,T8,...,T90,T91,T92,T93,T94,T95,T96,T97,T98,T99
accuracy,0.858123,0.803358,0.835295,0.832429,0.839492,0.841335,0.840311,0.843689,0.849422,0.845634,...,0.855768,0.855052,0.857304,0.85669,0.85802,0.85669,0.858532,0.856075,0.857918,0.858123
f1,0.677674,0.586615,0.571961,0.638871,0.611689,0.651215,0.62518,0.650971,0.648002,0.661276,...,0.675719,0.670698,0.67806,0.675023,0.679158,0.674721,0.680093,0.673479,0.679001,0.677674
statistical_parity_difference,-0.185472,-0.18275,-0.134901,-0.196482,-0.162644,-0.195207,-0.170682,-0.185973,-0.174676,-0.196898,...,-0.189635,-0.186391,-0.191926,-0.185783,-0.18947,-0.188846,-0.18395,-0.186855,-0.185945,-0.185472
equal_opportunity_difference,-0.077673,-0.080837,-0.087296,-0.090019,-0.124258,-0.099651,-0.0968,-0.091402,-0.099941,-0.092015,...,-0.10013,-0.097123,-0.103509,-0.099811,-0.103509,-0.105558,-0.080361,-0.087651,-0.076476,-0.077673
average_abs_odds_difference,0.078519,0.090248,0.068853,0.094819,0.094853,0.096669,0.085845,0.088023,0.086254,0.092945,...,0.090406,0.088211,0.092973,0.088267,0.091354,0.09261,0.078211,0.084099,0.077948,0.078519
disparate_impact,-1.21148,-0.94616,-1.214971,-1.119931,-1.231693,-1.177503,-1.192535,-1.151472,-1.217434,-1.185925,...,-1.210862,-1.220457,-1.246775,-1.207216,-1.22926,-1.24091,-1.179115,-1.219447,-1.194459,-1.21148
theil_index,0.112806,0.136529,0.151777,0.121292,0.137144,0.118813,0.13139,0.120045,0.124045,0.115433,...,0.112658,0.115014,0.112172,0.11351,0.111943,0.113673,0.111698,0.114033,0.111976,0.112806
