## Task 1

We have two populations Blue (privileged) and Red (unprivileged), with the Blue population being 9 times larger than the Red population.

Individuals from both populations are requesting to attend XAI training to improve their competency in this important area. The number of places is limited. The administrators of the training have decided to give priority to individuals who may need this training in the future, although unfortunately it is difficult to predict who will benefit.

The decision rule adopted:
1. In the Red group, half of the people will find the skills useful in the future and half will not. Administrators randomly allocate 50% of people to training.
2. In the Blue group, 80% of people will find the training useful in the future and 20% will not, although of course it is not known who will find it useful. The administrators have built a predictive model, based on user behaviour, to predict for whom the training will be useful and for whom it will not. The model has the following performance:


| Blue                     	| Will use XAI 	| Will not use XAI 	| Total 	|
|--------------------------	|--------------	|------------------	|-------	|
| Enrolled in training     	| 60           	| 5               	| 65    	|
| not enrolled in training 	| 20            	| 15               	| 35    	|
| Total                    	| 80           	| 20               	| 100   	|


Task: Calculate the demographic parity, equal opportunity and predictive rate parity coefficients for this decision rule.

Starred task: How can this decision rule be changed to improve its fairness?


| Red                     	| Will use XAI 	| Will not use XAI 	| Total 	|
|--------------------------	|--------------	|------------------	|-------	|
| Enrolled in training     	| 25           	| 25               	| 50    	|
| not enrolled in training 	| 25           	| 25               	| 50    	|
| Total                    	| 50           	| 50               	| 100   	|


### Demographic parity
$$P(\hat{Y}= 1 | A = blue) = 0.65$$
$$P(\hat{Y}= 1 | A = red) = 0.5$$
$$DP = 0.5/0.65 = 0.769$$

### Equal opportunity
$$P(\hat{Y}= 1 | Y = \text{use XAI}, A = blue) = 0.75$$
$$P(\hat{Y}= 1 | Y = \text{use XAI}, A = red) = 0.5$$
$$EO = 0.5/0.75 = 0.667$$

### Predictive rate parity
$$P(Y = \text{use XAI}| \hat{Y}= 1 , A = blue) = 60/65$$
$$P(Y = \text{use XAI}| \hat{Y}= 1 , A = red) = 0.5$$
$$P(Y = \text{use XAI}| \hat{Y}= 0 , A = blue) = 20/35$$
$$P(Y = \text{use XAI}| \hat{Y}= 0 , A = red) = 0.5$$
$$\text{Positive PRP} = 0.5/(60/65) = 0.542$$
$$\text{Negative PRP} = 0.5/(20/35) = 0.875$$
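
The ratios above can be checked with a few lines of arithmetic. The following is a minimal sketch that recomputes them from the counts in the Blue and Red tables; the dictionary layout and the helper name `rates` are illustrative choices, not part of the task.

```python
# Counts copied from the Blue and Red tables above.
# Keys: (enrolled?, will use XAI?) -> number of people.
blue = {(1, 1): 60, (1, 0): 5, (0, 1): 20, (0, 0): 15}
red = {(1, 1): 25, (1, 0): 25, (0, 1): 25, (0, 0): 25}


def rates(t):
    """Return the per-group rates used by the three fairness coefficients."""
    total = sum(t.values())
    enrolled = t[(1, 1)] + t[(1, 0)]
    not_enrolled = t[(0, 1)] + t[(0, 0)]
    will_use = t[(1, 1)] + t[(0, 1)]
    return {
        "selection_rate": enrolled / total,        # P(Yhat=1 | A)
        "tpr": t[(1, 1)] / will_use,               # P(Yhat=1 | Y=use, A)
        "use_given_enrolled": t[(1, 1)] / enrolled,          # P(Y=use | Yhat=1, A)
        "use_given_not_enrolled": t[(0, 1)] / not_enrolled,  # P(Y=use | Yhat=0, A)
    }


blue_rates, red_rates = rates(blue), rates(red)

# Each coefficient is the unprivileged (Red) rate divided by the privileged (Blue) rate.
coefficients = {name: red_rates[name] / blue_rates[name] for name in blue_rates}

for name, value in coefficients.items():
    print(f"{name}: {value:.3f}")
```

A ratio of 1 would mean both groups are treated identically with respect to that rate; here every ratio is below 1, so the Red group is at a disadvantage on each criterion.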

## Task 2

For this homework, train a few models on a selected dataset from https://github.com/ahxt/fair_fairness_benchmark/.

Prepare a knitr/Jupyter notebook covering the following points.
Submit your results on GitHub to the directory `Homeworks/HW1`.

1. Train a model for the selected dataset. 
2. For the selected protected attribute (age, gender, race) calculate the following fairness coefficients: Statistical parity, Equal opportunity, Predictive parity.
3. Train another model (different hyperparameters, feature transformations etc., different family of models) and see how the coefficients Statistical parity, Equal opportunity, Predictive parity behave for it. Are they different/similar?
4. Apply the selected bias mitigation technique (like data balancing) on the first model. Check how Statistical parity, Equal opportunity, Predictive parity coefficients behave after this mitigation.
5. Compare the quality (performance) of the three models with their fairness coefficients. Is there any correlation/trade off? 
6. ! COMMENT on the results obtained in (2)-(5)

In [23]:
import numpy as np
import pandas as pd
import dalex as dx
import plotly
from sklearn.base import BaseEstimator, ClassifierMixin
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.tree import DecisionTreeClassifier

In [24]:
# Load the data
file_path = 'german.data'  # Replace with your actual file path if different

# Define column names based on the German Credit dataset documentation
column_names = [
    "Status_of_existing_checking_account", "Duration_in_month", "Credit_history", 
    "Purpose", "Credit_amount", "Savings_account_bonds", "Present_employment_since", 
    "Installment_rate_in_percentage_of_disposable_income", "Personal_status_and_sex", 
    "Other_debtors_guarantors", "Present_residence_since", "Property", 
    "Age_in_years", "Other_installment_plans", "Housing", 
    "Number_of_existing_credits_at_this_bank", "Job", "Number_of_people_being_liable_to_provide_maintenance_for", 
    "Telephone", "Foreign_worker", "Credit_risk"  # 'Credit_risk' is the target variable
]

# The file is whitespace-delimited and has no header row
data = pd.read_csv(file_path, sep=r'\s+', header=None, names=column_names)

In [25]:
# Decode the housing category codes into readable labels
data_map = {"A151": "rent",
            "A152": "own",
            "A153": "forfree"}

data['Housing'] = data['Housing'].map(data_map)

# Convert the 'Credit_risk' to binary labels (assuming 1: good, 2: bad)
data['Credit_risk'] = data['Credit_risk'].map({1: 1, 2: 0})  # 1 for 'good', 0 for 'bad'

# Convert categorical features to numerical using one-hot encoding
#data_encoded = pd.get_dummies(data, drop_first=True)

# Separate features (X) and target (y)
X = data.drop('Credit_risk', axis=1)
y = data['Credit_risk']

#X, X_test, y, y_test = train_test_split(X, y, test_size=0.33, random_state=42)

In [26]:
numerical_features = [ "Duration_in_month",  
    "Credit_amount",   
    "Installment_rate_in_percentage_of_disposable_income", 
    "Present_residence_since", 
    "Age_in_years", 
    "Number_of_existing_credits_at_this_bank", "Number_of_people_being_liable_to_provide_maintenance_for", 
    ]
numerical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler())
    ]
)

categorical_features = ["Status_of_existing_checking_account", "Credit_history",
                        "Purpose", "Savings_account_bonds", "Present_employment_since",
                        "Personal_status_and_sex", "Other_debtors_guarantors",
                        "Property", "Other_installment_plans", 
                        "Telephone", "Housing", "Job", "Foreign_worker"]
categorical_transformer = Pipeline( 
    steps=[
        ('imputer', SimpleImputer(strategy='constant', fill_value='missing')) ,
        ('onehot', OneHotEncoder(handle_unknown='ignore'))]
)

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)
    ]
)

classifier = MLPClassifier(hidden_layer_sizes=(50,20), max_iter=50, random_state=0)
#classifier = DecisionTreeClassifier(max_depth=10, random_state=0)

clf = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', classifier)])

In [27]:
clf.fit(X, y)


Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.



In [28]:
exp = dx.Explainer(clf, X, y)

Preparation of a new explainer is initiated

  -> data              : 1000 rows 20 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 1000 values
  -> model_class       : sklearn.neural_network._multilayer_perceptron.MLPClassifier (default)
  -> label             : Not specified, model's class short name will be used. (default)
  -> predict function  : <function yhat_proba_default at 0x000001ED223113A0> will be used (default)
  -> predict function  : Accepts only pandas.DataFrame, numpy.ndarray causes problems.
  -> predicted values  : min = 0.00406, mean = 0.714, max = 0.999
  -> model type        : classification will be used (default)
  -> residual function : difference between y and yhat (default)
  -> residuals         : min = -0.955, mean = -0.0139, max = 0.689
  -> model_info        : package sklearn

A new explainer has been created!


In [29]:
#protected = X_test.Housing  #+ '_' + np.where(data.Age_in_years < 25, 'young', 'old')
protected = np.where(X.Age_in_years < 30, 'young', 'old')
privileged = 'old' 
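
The protected attribute is derived from age with a strict cut at 30. The sketch below shows what that rule does at the boundary; the ages are made up for illustration and are not taken from `german.data`, and the list comprehension is a plain-Python stand-in for the `np.where` call.

```python
from collections import Counter

# Hypothetical ages, for illustration only (not taken from german.data).
ages = [19, 24, 29, 30, 31, 45, 67]

# Same rule as the cell above: strictly younger than 30 -> 'young', otherwise 'old'.
protected_demo = ["young" if age < 30 else "old" for age in ages]

# Group sizes matter because ratios for a small group are noisier.
group_sizes = Counter(protected_demo)
print(group_sizes)
```

Note that a person aged exactly 30 falls into the privileged `'old'` group under this rule.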

In [30]:
fobject = exp.model_fairness(protected = protected, privileged=privileged)

In [31]:
fobject.fairness_check(epsilon = 0.8) # default epsilon

No bias was detected!

Conclusion: your model is fair in terms of checked fairness criteria.

Ratios of metrics, based on 'old'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
            TPR       ACC       PPV       FPR       STP
young  0.962963  0.933993  0.924259  1.107407  0.887342


In [32]:
# or unscaled ones via
fobject.metric_scores

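|       | TPR   | TNR   | PPV   | NPV   | FNR   | FPR   | FDR   | FOR   | ACC   | STP   |
|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|-------|
| old   | 0.972 | 0.73  | 0.911 | 0.902 | 0.028 | 0.27  | 0.089 | 0.098 | 0.909 | 0.79  |
| young | 0.936 | 0.701 | 0.842 | 0.865 | 0.064 | 0.299 | 0.158 | 0.135 | 0.849 | 0.701 |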
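
The ratios printed by `fairness_check` can be reproduced from these per-group scores by dividing the `young` value by the `old` (privileged) value. The sketch below does this by hand and applies the epsilon band; treating the band as the open interval (epsilon, 1/epsilon) is an assumption based on the wording of the printed message, not on the library source.

```python
# Per-group scores copied from the metric_scores output above.
scores = {
    "old": {"TPR": 0.972, "ACC": 0.909, "PPV": 0.911, "FPR": 0.27, "STP": 0.79},
    "young": {"TPR": 0.936, "ACC": 0.849, "PPV": 0.842, "FPR": 0.299, "STP": 0.701},
}
privileged_group = "old"
epsilon = 0.8

# Ratio of the unprivileged score to the privileged score, per metric.
ratios = {
    metric: scores["young"][metric] / scores[privileged_group][metric]
    for metric in scores["young"]
}

# ASSUMPTION: a metric is flagged when its ratio leaves (epsilon, 1 / epsilon).
flagged = [m for m, r in ratios.items() if not (epsilon < r < 1 / epsilon)]

for metric, ratio in ratios.items():
    print(f"{metric}: {ratio:.6f}")
print("flagged:", flagged)
```

For the neural network every ratio stays inside (0.8, 1.25), which is consistent with the "No bias was detected!" message.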


In [33]:
mp = exp.model_performance(model_type = 'classification')
#dt.result

In [34]:
numerical_features = [ "Duration_in_month",  
    "Credit_amount",   
    "Installment_rate_in_percentage_of_disposable_income", 
    "Present_residence_since", 
    "Age_in_years", 
    "Number_of_existing_credits_at_this_bank", "Number_of_people_being_liable_to_provide_maintenance_for", 
    ]
numerical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler())
    ]
)

categorical_features = ["Status_of_existing_checking_account", "Credit_history",
                        "Purpose", "Savings_account_bonds", "Present_employment_since",
                        "Personal_status_and_sex", "Other_debtors_guarantors",
                        "Property", "Other_installment_plans", 
                        "Telephone", "Housing", "Job", "Foreign_worker"]
categorical_transformer = Pipeline( 
    steps=[
        ('imputer', SimpleImputer(strategy='constant', fill_value='missing')),
        ('onehot', OneHotEncoder(handle_unknown='ignore'))
    ]
)

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)
    ]
)

classifier_2 = DecisionTreeClassifier(max_depth=10, random_state=0)
clf_2 = Pipeline(steps=[('preprocessor', preprocessor),
                      ('classifier', classifier_2)])
clf_2.fit(X, y)

In [35]:
exp_2 = dx.Explainer(clf_2, X, y)

Preparation of a new explainer is initiated

  -> data              : 1000 rows 20 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 1000 values
  -> model_class       : sklearn.tree._classes.DecisionTreeClassifier (default)
  -> label             : Not specified, model's class short name will be used. (default)
  -> predict function  : <function yhat_proba_default at 0x000001ED223113A0> will be used (default)
  -> predict function  : Accepts only pandas.DataFrame, numpy.ndarray causes problems.
  -> predicted values  : min = 0.0, mean = 0.7, max = 1.0
  -> model type        : classification will be used (default)
  -> residual function : difference between y and yhat (default)
  -> residuals         : min = -0.936, mean = 0.0, max = 0.818
  -> model_info        : package sklearn

A new explainer has been created!


In [36]:
dt = exp_2.model_performance(model_type = 'classification')
fobject_2 = exp_2.model_fairness(protected = protected, privileged=privileged)
fobject_2.fairness_check(epsilon = 0.8) # default epsilon

Bias detected in 1 metric: FPR

Conclusion: your model cannot be called fair because 1 criterion exceeded acceptable limits set by epsilon.
It does not mean that your model is unfair but it cannot be automatically approved based on these metrics.

Ratios of metrics, based on 'old'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
            TPR       ACC       PPV       FPR      STP
young  0.986775  0.940928  0.920886  1.575163  0.91276


In [37]:
fobject.fairness_check(epsilon = 0.8) # default epsilon
fobject_2.fairness_check(epsilon = 0.8) # default epsilon

No bias was detected!

Conclusion: your model is fair in terms of checked fairness criteria.

Ratios of metrics, based on 'old'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
            TPR       ACC       PPV       FPR       STP
young  0.962963  0.933993  0.924259  1.107407  0.887342
Bias detected in 1 metric: FPR

Conclusion: your model cannot be called fair because 1 criterion exceeded acceptable limits set by epsilon.
It does not mean that your model is unfair but it cannot be automatically approved based on these metrics.

Ratios of metrics, based on 'old'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
            TPR       ACC       PPV       FPR      STP
young  0.986775  0.940928  0.920886  1.575163  0.91276


In [38]:
results = pd.concat([mp.result,dt.result], ignore_index=True)
# first row is the neural network (MLP), second row is the decision tree
results

|   | recall   | precision | f1       | accuracy | auc      |
|---|----------|-----------|----------|----------|----------|
| 0 | 0.96     | 0.887715  | 0.922443 | 0.887    | 0.951157 |
| 1 | 0.978571 | 0.921938  | 0.949411 | 0.927    | 0.98174  |
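
To make the relationship between these columns concrete, the sketch below recomputes recall, precision, F1 and accuracy from confusion-matrix counts. The counts are not printed by the notebook: they are inferred from the first row under the assumption that the 1000 rows contain 700 positive ("good") and 300 negative ("bad") cases, so treat them as illustrative.

```python
# ASSUMPTION: counts inferred from the first row of the table above, assuming
# 700 positive and 300 negative cases; the notebook does not print them.
tp, fn, fp, tn = 672, 28, 85, 215

recall = tp / (tp + fn)                             # share of positives that are found
precision = tp / (tp + fp)                          # share of predicted positives that are right
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
accuracy = (tp + tn) / (tp + fn + fp + tn)          # share of all cases classified correctly

print(f"recall={recall:.6f} precision={precision:.6f} f1={f1:.6f} accuracy={accuracy:.3f}")
```

AUC is not included because it depends on the full ranking of predicted probabilities, not on a single confusion matrix.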


## 4) Apply bias mitigation

In [39]:
numerical_features = [ "Duration_in_month",  
    "Credit_amount",   
    "Installment_rate_in_percentage_of_disposable_income", 
    "Present_residence_since", 
    "Age_in_years", 
    "Number_of_existing_credits_at_this_bank", "Number_of_people_being_liable_to_provide_maintenance_for", 
    ]
numerical_transformer = Pipeline(
    steps=[
        ('imputer', SimpleImputer(strategy='median')),
        ('scaler', StandardScaler())
    ]
)

categorical_features = ["Status_of_existing_checking_account", "Credit_history",
                        "Purpose", "Savings_account_bonds", "Present_employment_since",
                        "Personal_status_and_sex", "Other_debtors_guarantors",
                        "Property", "Other_installment_plans", 
                        "Telephone", "Housing", "Job", "Foreign_worker"]
categorical_transformer = Pipeline( 
    steps=[
        ('imputer', SimpleImputer(strategy='constant', fill_value='missing')) ,
        ('onehot', OneHotEncoder(handle_unknown='ignore'))]
)

preprocessor = ColumnTransformer(
    transformers=[
        ('num', numerical_transformer, numerical_features),
        ('cat', categorical_transformer, categorical_features)
    ]
)

#classifier = MLPClassifier(hidden_layer_sizes=(150,100,50), max_iter=500, random_state=0)
classifier_3 = MLPClassifier(hidden_layer_sizes=(50,20), max_iter=50, random_state=0)


In [40]:
class PostProcessedClassifier(BaseEstimator, ClassifierMixin):
    """Wrap a classifier and add a constant to its predicted probabilities."""

    def __init__(self, base_classifier, perturbation=0.01):
        self.base_classifier = base_classifier
        self.perturbation = perturbation

    def fit(self, X, y):
        # Train the underlying classifier
        self.base_classifier.fit(X, y)
        # Expose the class labels, as scikit-learn classifiers conventionally do
        self.classes_ = self.base_classifier.classes_
        return self

    def predict_proba(self, X):
        # Get the predicted probabilities from the base classifier
        probas = self.base_classifier.predict_proba(X)
        # Add the same constant to every class column and clip to [0, 1].
        # The rows no longer sum to 1, so these are shifted scores rather
        # than calibrated probabilities.
        return np.clip(probas + self.perturbation, 0, 1)

    def predict(self, X):
        # Both columns are shifted by the same amount, so the argmax (and
        # therefore the predicted label) matches the base classifier unless
        # clipping creates a tie. The shift only affects code that
        # thresholds the output of predict_proba directly.
        return np.argmax(self.predict_proba(X), axis=1)

# Wrap your classifier with the post-processing step
perturbed_classifier = PostProcessedClassifier(classifier_3, perturbation=0.1)

# Redefine the pipeline
clf_with_mitigation = Pipeline(steps=[('preprocessor', preprocessor),
                                      ('classifier', perturbed_classifier)])

# Train and evaluate as normal
clf_with_mitigation.fit(X, y)
#y_pred = clf_with_mitigation.predict(X_test)



Stochastic Optimizer: Maximum iterations (50) reached and the optimization hasn't converged yet.
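
It may help to see why this wrapper changes anything at all. The sketch below uses made-up probability pairs and assumes that the downstream fairness and performance calculations threshold the positive-class probability at 0.5 instead of calling `predict`; under that assumption the constant shift acts like lowering the decision threshold from 0.5 to 0.4, while the argmax over both columns stays the same.

```python
perturbation = 0.1  # same value as passed to PostProcessedClassifier above

# Hypothetical [P(bad), P(good)] pairs, for illustration only.
probas = [[0.55, 0.45], [0.30, 0.70], [0.97, 0.03]]


def shift(row):
    """Add the perturbation to both class probabilities and clip to [0, 1]."""
    return [min(max(p + perturbation, 0.0), 1.0) for p in row]


shifted = [shift(row) for row in probas]

# The argmax over both columns is unchanged by a common shift.
argmax_before = [row.index(max(row)) for row in probas]
argmax_after = [row.index(max(row)) for row in shifted]

# ASSUMPTION: downstream code labels a case positive when P(good) >= 0.5.
label_before = [int(row[1] >= 0.5) for row in probas]
label_after = [int(row[1] >= 0.5) for row in shifted]

print("argmax:", argmax_before, "->", argmax_after)
print("thresholded:", label_before, "->", label_after)
```

The first case flips from negative to positive under thresholding, which would explain higher recall and lower precision after the shift.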



In [41]:
exp_3 = dx.Explainer(clf_with_mitigation, X, y)

Preparation of a new explainer is initiated

  -> data              : 1000 rows 20 cols
  -> target variable   : Parameter 'y' was a pandas.Series. Converted to a numpy.ndarray.
  -> target variable   : 1000 values
  -> model_class       : __main__.PostProcessedClassifier (default)
  -> label             : Not specified, model's class short name will be used. (default)
  -> predict function  : <function yhat_proba_default at 0x000001ED223113A0> will be used (default)
  -> predict function  : Accepts only pandas.DataFrame, numpy.ndarray causes problems.
  -> predicted values  : min = 0.104, mean = 0.789, max = 1.0
  -> model type        : classification will be used (default)
  -> residual function : difference between y and yhat (default)
  -> residuals         : min = -1.0, mean = -0.0889, max = 0.589
  -> model_info        : package sklearn

A new explainer has been created!


In [42]:
mp_m = exp_3.model_performance(model_type = 'classification')
fobject_3 = exp_3.model_fairness(protected = protected, privileged=privileged)

final_result = pd.concat([results,mp_m.result], ignore_index=True)
final_result

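|   | recall   | precision | f1       | accuracy | auc      |
|---|----------|-----------|----------|----------|----------|
| 0 | 0.96     | 0.887715  | 0.922443 | 0.887    | 0.951157 |
| 1 | 0.978571 | 0.921938  | 0.949411 | 0.927    | 0.98174  |
| 2 | 0.981429 | 0.852357  | 0.912351 | 0.868    | 0.949438 |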


In [43]:
fobject.fairness_check(epsilon = 0.8) # default epsilon
fobject_2.fairness_check(epsilon = 0.8) # default epsilon
fobject_3.fairness_check(epsilon = 0.8) # default epsilon

No bias was detected!

Conclusion: your model is fair in terms of checked fairness criteria.

Ratios of metrics, based on 'old'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
            TPR       ACC       PPV       FPR       STP
young  0.962963  0.933993  0.924259  1.107407  0.887342
Bias detected in 1 metric: FPR

Conclusion: your model cannot be called fair because 1 criterion exceeded acceptable limits set by epsilon.
It does not mean that your model is unfair but it cannot be automatically approved based on these metrics.

Ratios of metrics, based on 'old'. Parameter 'epsilon' was set to 0.8 and therefore metrics should be within (0.8, 1.25)
            TPR       ACC       PPV       FPR      STP
young  0.986775  0.940928  0.920886  1.575163  0.91276
No bias was detected!

Conclusion: your model is fair in terms of checked fairness criteria.

Ratios of metrics, based on 'old'. Parameter 'epsilon' was set to 0.8 and therefore metrics should b

In [44]:
fobject.plot()
fobject_2.plot()
fobject_3.plot()

## Comment on results
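Statistical parity corresponds to STP and equal opportunity to TPR. Predictive parity corresponds to the PPV ratio (0.924259 for NN, 0.920886 for DT); the tables below report the ACC (accuracy equality) ratio instead.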

| NN | TPR      | ACC      |STP      |
|-------|----------|---------|----------|
| Young | 0.962963 | 0.933993| 0.887342 |

| DT | TPR      | ACC      | STP      |
|-------|----------|----------|----------|
| Young | 0.986775 | 0.940928 | 0.912760 |


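3. Switching from the neural network to the decision tree changed the TPR, ACC and STP ratios only slightly (each moved a little closer to 1). The FPR ratio, however, rose from 1.107 to 1.575, which is outside the (0.8, 1.25) band, so bias was detected for the decision tree.

4. After applying the post-processing mitigation (adding a constant increment to the predicted probabilities), the model returned the following values: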

| NN_2 |TPR       |ACC     |STP       |
|-------|----------|--------|----------|
|Young  |0.982776  |0.956916|  0.892729|

All three ratios increased, that is, they moved closer to 1. Let's now compare the performance of the models with their fairness coefficients.

| Model name | TPR      | ACC      | STP      | recall   | precision | f1       | accuracy | auc      |
|------------|----------|----------|----------|----------|-----------|----------|----------|----------|
| NN         | 0.962963 | 0.933993 | 0.887342 | 0.960000 | 0.887715  | 0.922443 | 0.887    | 0.951157 |
| DT         | 0.986775 | 0.940928 | 0.912760 | 0.978571 | 0.921938  | 0.949411 | 0.927    | 0.981740 |
| NN_2       | 0.982776 | 0.956916 | 0.892729 | 0.981429 | 0.852357  | 0.912351 | 0.868    | 0.949438 |

Despite the improvement in the TPR, ACC and STP ratios from NN to NN_2, AUC, accuracy, F1 and precision decreased; only recall increased (0.960 to 0.981).
DT has the best TPR and STP ratios as well as the best AUC, accuracy, precision and F1, but NN_2 has the best ACC ratio and DT is the only model for which bias was detected (FPR ratio 1.575). In this experiment no model is best on every fairness ratio and every performance metric at the same time, which points to a trade-off between them.
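
The comparison can also be made mechanically. The sketch below takes the values from the summary table and reports, for each fairness ratio, which model is closest to parity (a ratio of 1), and which model has the highest accuracy and AUC. Measuring closeness as the absolute distance from 1 is a simplifying assumption.

```python
# Values copied from the summary table above.
models = {
    "NN": {"TPR": 0.962963, "ACC": 0.933993, "STP": 0.887342,
           "accuracy": 0.887, "auc": 0.951157},
    "DT": {"TPR": 0.986775, "ACC": 0.940928, "STP": 0.912760,
           "accuracy": 0.927, "auc": 0.981740},
    "NN_2": {"TPR": 0.982776, "ACC": 0.956916, "STP": 0.892729,
             "accuracy": 0.868, "auc": 0.949438},
}

# ASSUMPTION: "closest to parity" means smallest absolute distance from 1.
closest_to_parity = {
    metric: min(models, key=lambda name: abs(1 - models[name][metric]))
    for metric in ("TPR", "ACC", "STP")
}

best_accuracy = max(models, key=lambda name: models[name]["accuracy"])
best_auc = max(models, key=lambda name: models[name]["auc"])

print(closest_to_parity)
print("best accuracy:", best_accuracy, "| best auc:", best_auc)
```

No single model wins on all three ratios, which matches the observation that fairness and performance cannot all be maximised by one model here.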