In [None]:
!pip install 'aif360[all]'

# AIF360 Metrics in Machine Learning

This notebook demonstrates how to use the AI Fairness 360 (AIF360) toolkit to assess and mitigate bias in machine learning models. We'll use the Adult dataset and focus on gender bias.

## Download the Adult dataset

In [2]:
!mkdir -p /usr/local/lib/python3.10/dist-packages/aif360/data/raw/adult

In [None]:
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.data -O /usr/local/lib/python3.10/dist-packages/aif360/data/raw/adult/adult.data
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.test -O /usr/local/lib/python3.10/dist-packages/aif360/data/raw/adult/adult.test
!wget https://archive.ics.uci.edu/ml/machine-learning-databases/adult/adult.names -O /usr/local/lib/python3.10/dist-packages/aif360/data/raw/adult/adult.names
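The paths above hard-code a `python3.10` install location, which will not match every environment. A minimal sketch of looking the directory up instead, assuming (as the commands above suggest) that the raw files belong under `data/raw/adult` inside the installed `aif360` package. `package_data_dir` is a hypothetical helper written for this notebook, not part of AIF360.

```python
import importlib.util
import os


def package_data_dir(package, *subdirs):
    """Return <package install dir>/<subdirs...>, or None if the package is not found.

    Hypothetical helper for illustration; not part of AIF360.
    """
    spec = importlib.util.find_spec(package)
    if spec is None or spec.origin is None:
        return None
    return os.path.join(os.path.dirname(spec.origin), *subdirs)


# Assumption: the raw Adult files live under data/raw/adult in the aif360 package.
adult_dir = package_data_dir("aif360", "data", "raw", "adult")
if adult_dir is None:
    print("aif360 is not installed in this environment")
else:
    print("Place adult.data, adult.test and adult.names in:", adult_dir)
```

The printed directory can then be used as the target of the `mkdir` and `wget` commands in place of the fixed path.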

## Import packages

In [16]:
from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric
from aif360.algorithms.preprocessing import Reweighing
from aif360.algorithms.inprocessing import PrejudiceRemover
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score
import numpy as np

In [5]:
# Load the dataset
dataset = AdultDataset()




In [6]:
type(dataset)

In [7]:
df = dataset.convert_to_dataframe()[0]  # [0] because it returns a tuple (df, _)


In [8]:
print(df.head())


    age  education-num  race  sex  capital-gain  capital-loss  hours-per-week  \
0  25.0            7.0   0.0  1.0           0.0           0.0            40.0   
1  38.0            9.0   1.0  1.0           0.0           0.0            50.0   
2  28.0           12.0   1.0  1.0           0.0           0.0            40.0   
3  44.0           10.0   0.0  1.0        7688.0           0.0            40.0   
5  34.0            6.0   1.0  1.0           0.0           0.0            30.0   

   workclass=Federal-gov  workclass=Local-gov  workclass=Private  ...  \
0                    0.0                  0.0                1.0  ...   
1                    0.0                  0.0                1.0  ...   
2                    0.0                  1.0                0.0  ...   
3                    0.0                  0.0                1.0  ...   
5                    0.0                  0.0                1.0  ...   

   native-country=Puerto-Rico  native-country=Scotland  native-country=Sou

In [9]:
print(df.shape)

(45222, 99)


In [10]:
df.sex.value_counts()

sex    count
1.0    30527
0.0    14695


In [11]:
print(df[dataset.label_names[0]].value_counts(normalize=True))

income-per-year
0.0    0.752156
1.0    0.247844
Name: proportion, dtype: float64


In [12]:
print(df.describe())

                age  education-num          race           sex  capital-gain  \
count  45222.000000   45222.000000  45222.000000  45222.000000  45222.000000   
mean      38.547941      10.118460      0.860267      0.675048   1101.430344   
std       13.217870       2.552881      0.346714      0.468362   7506.430084   
min       17.000000       1.000000      0.000000      0.000000      0.000000   
25%       28.000000       9.000000      1.000000      0.000000      0.000000   
50%       37.000000      10.000000      1.000000      1.000000      0.000000   
75%       47.000000      13.000000      1.000000      1.000000      0.000000   
max       90.000000      16.000000      1.000000      1.000000  99999.000000   

       capital-loss  hours-per-week  workclass=Federal-gov  \
count  45222.000000    45222.000000           45222.000000   
mean      88.595418       40.938017               0.031091   
std      404.956092       12.007508               0.173566   
min        0.000000        1.00

In [13]:
# Split the dataset
train, test = dataset.split([0.7], shuffle=True, seed=123)

# 'sex' is encoded as {0.0: 'Female', 1.0: 'Male'}
# Define the privileged (male) and unprivileged (female) groups
privileged_groups = [{'sex': 1}]
unprivileged_groups = [{'sex': 0}]


## Assessing Original Dataset Bias

We'll use two fairness metrics to assess bias in our original dataset:

1. **Disparate Impact**: The ratio of the probability of a positive outcome for the unprivileged group to that for the privileged group, P(Y=1 | unprivileged) / P(Y=1 | privileged). A value of 1.0 indicates parity. Values below 1.0 mean the unprivileged group receives the positive outcome less often than the privileged group, and values above 1.0 mean it receives it more often.

2. **Statistical Parity Difference**: The probability of a positive outcome for the unprivileged group minus that for the privileged group, P(Y=1 | unprivileged) - P(Y=1 | privileged). A value of 0.0 indicates parity. Negative values indicate bias against the unprivileged group (in favor of the privileged group), and positive values indicate bias in favor of the unprivileged group.
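To make the two definitions concrete, here is a small sketch that computes both metrics from group counts. The counts are made up for illustration and are not taken from the Adult dataset, and the function names are our own rather than AIF360's.

```python
def positive_rate(n_positive, n_total):
    """Share of a group that received the favorable label."""
    return n_positive / n_total


def disparate_impact(rate_unpriv, rate_priv):
    """Ratio of favorable-outcome rates: unprivileged over privileged."""
    return rate_unpriv / rate_priv


def statistical_parity_difference(rate_unpriv, rate_priv):
    """Difference of favorable-outcome rates: unprivileged minus privileged."""
    return rate_unpriv - rate_priv


# Made-up counts for illustration (not taken from the Adult dataset).
rate_unpriv = positive_rate(n_positive=20, n_total=100)  # 20 of 100 unprivileged
rate_priv = positive_rate(n_positive=50, n_total=100)    # 50 of 100 privileged

di = disparate_impact(rate_unpriv, rate_priv)
spd = statistical_parity_difference(rate_unpriv, rate_priv)
print(f"Disparate impact: {di:.2f}")
print(f"Statistical parity difference: {spd:.2f}")
```

With these counts the ratio is below 1.0 and the difference is negative, and both point to the same conclusion: the unprivileged group receives the favorable outcome less often.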

In [14]:
# Metric for the original dataset
metric_orig_train = BinaryLabelDatasetMetric(train,
                                             unprivileged_groups=unprivileged_groups,
                                             privileged_groups=privileged_groups)
print("Original training dataset bias metrics")
print("Disparate impact: {}".format(metric_orig_train.disparate_impact()))
print("Statistical parity difference: {}".format(metric_orig_train.statistical_parity_difference()))


Original training dataset bias metrics
Disparate impact: 0.3678710816756443
Statistical parity difference: -0.1975550563436117


### Analysis of Original Metrics

- The disparate impact of 0.368 indicates significant bias: the unprivileged group (females) is only 36.8% as likely as the privileged group (males) to have the positive outcome (income above 50K).
- The statistical parity difference of -0.198 points the same way: the rate of positive outcomes among females is 19.8 percentage points lower than among males.

These metrics show a substantial bias against females in the original dataset.
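Because one metric is a ratio and the other a difference of the same two rates, the underlying group rates can be recovered from the printed values. A short sketch of that arithmetic, using only the two numbers shown in the output above:

```python
# Values printed by the notebook for the original training split.
di = 0.3678710816756443
spd = -0.1975550563436117

# di  = p_unpriv / p_priv
# spd = p_unpriv - p_priv = p_priv * (di - 1)
# so p_priv = spd / (di - 1)
p_priv = spd / (di - 1.0)
p_unpriv = di * p_priv

print(f"P(positive outcome | male)   = {p_priv:.3f}")
print(f"P(positive outcome | female) = {p_unpriv:.3f}")
```

This gives roughly 31% for males and 11.5% for females in the training split, which is the gap both metrics summarize.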

## Applying Reweighing to Mitigate Bias

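Reweighing is a preprocessing technique that leaves the features and labels untouched and instead assigns a weight to every training sample. All samples that share the same combination of group (privileged or unprivileged) and label (positive or negative) receive the same weight: the number of samples that combination would contain if group and label were statistically independent, divided by the number it actually contains. Under-represented combinations, such as females with a positive outcome, are weighted up and over-represented ones are weighted down, so that in the weighted data the positive-outcome rate is the same for both groups.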
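The weighting idea can be sketched in a few lines of plain Python. This is an illustration on made-up toy data, assuming the standard scheme in which each (group, label) combination is weighted by its expected count under independence divided by its observed count. It is not AIF360's own implementation, and the function names are ours.

```python
from collections import Counter


def reweighing_weights(groups, labels):
    """Weight per (group, label) cell: expected count under independence / observed count."""
    n = len(labels)
    n_group = Counter(groups)
    n_label = Counter(labels)
    n_cell = Counter(zip(groups, labels))
    return {
        (g, y): (n_group[g] * n_label[y] / n) / n_cell[(g, y)]
        for (g, y) in n_cell
    }


def weighted_positive_rate(groups, labels, weights, group):
    """Weighted share of favorable labels (1) within one group."""
    pairs = [(g, y) for g, y in zip(groups, labels) if g == group]
    total = sum(weights[pair] for pair in pairs)
    positive = sum(weights[pair] for pair in pairs if pair[1] == 1)
    return positive / total


# Toy data (made up): group 1 = privileged, group 0 = unprivileged.
groups = [1] * 6 + [0] * 4
labels = [1, 1, 1, 0, 0, 0] + [1, 0, 0, 0]

weights = reweighing_weights(groups, labels)
for cell in sorted(weights):
    print(cell, round(weights[cell], 3))

rate_priv = weighted_positive_rate(groups, labels, weights, group=1)
rate_unpriv = weighted_positive_rate(groups, labels, weights, group=0)
print(f"Weighted positive rates: privileged={rate_priv:.3f}, unprivileged={rate_unpriv:.3f}")
```

In the toy data the unprivileged group has a positive label in 1 of 4 cases and the privileged group in 3 of 6. After weighting, both weighted rates equal the overall positive rate, which is what drives the disparate impact to 1.0 and the statistical parity difference to 0.0.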

In [17]:
# Reweighing preprocessing
np.random.seed(123)
RW = Reweighing(unprivileged_groups=unprivileged_groups,
                privileged_groups=privileged_groups)
dataset_transf_train = RW.fit_transform(train)


In [18]:
# Metric for the transformed dataset
metric_transf_train = BinaryLabelDatasetMetric(dataset_transf_train,
                                               unprivileged_groups=unprivileged_groups,
                                               privileged_groups=privileged_groups)
print("\nTransformed training dataset bias metrics")
print("Disparate impact: {}".format(metric_transf_train.disparate_impact()))
print("Statistical parity difference: {}".format(metric_transf_train.statistical_parity_difference()))



Transformed training dataset bias metrics
Disparate impact: 1.0000000000000007
Statistical parity difference: 1.3877787807814457e-16


### Interpretation of Transformed Metrics

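- After applying Reweighing, both metrics sit at their ideal values: the disparate impact is 1.0 and the statistical parity difference is 0.0, up to floating-point rounding error.
- `BinaryLabelDatasetMetric` takes the instance weights into account, so this shows that Reweighing has balanced the weighted training data with respect to the protected attribute (sex). The features and labels themselves are unchanged, so a model only benefits if the weights are passed to it during training.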
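A short derivation shows why the weighted metrics land exactly on parity rather than merely close to it. It assumes the standard Reweighing weights; the notation is introduced here for illustration: $n_g$ is the number of samples in group $g$, $n_y$ the number with label $y$, $n_{g,y}$ the number with both, and $N$ the total.

```latex
w_{g,y} = \frac{n_g \, n_y}{N \, n_{g,y}}
\qquad\Longrightarrow\qquad
P_w(Y = 1 \mid G = g)
  = \frac{w_{g,1}\, n_{g,1}}{w_{g,1}\, n_{g,1} + w_{g,0}\, n_{g,0}}
  = \frac{n_g n_1 / N}{n_g n_1 / N + n_g n_0 / N}
  = \frac{n_1}{n_1 + n_0}
  = \frac{n_1}{N}
```

The result does not depend on $g$, so both groups share the overall positive rate. Their ratio is therefore 1 and their difference is 0, apart from rounding.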

## Training and Evaluating the Model

Now that the training data carries fairness weights, we can train a logistic regression model on it, passing the weights through `sample_weight`, and evaluate its accuracy and fairness on the test set.

In [19]:
# Train the model
scale = StandardScaler()
X_train = scale.fit_transform(dataset_transf_train.features)
y_train = dataset_transf_train.labels.ravel()

lmod = LogisticRegression(solver='lbfgs', max_iter=1000)
lmod.fit(X_train, y_train, sample_weight=dataset_transf_train.instance_weights)

In [20]:
# Predict on test data
X_test = scale.transform(test.features)
y_test = test.labels.ravel()
y_pred = lmod.predict(X_test)

In [21]:
# Calculate accuracy
print("\nClassifier performance")
print("Accuracy: {}".format(accuracy_score(y_test, y_pred)))


Classifier performance
Accuracy: 0.8445492739736125


In [22]:
# Calculate bias metrics on predictions
dataset_pred = test.copy()
# AIF360 stores labels as a column vector of shape (n_samples, 1)
dataset_pred.labels = y_pred.reshape(-1, 1)

metric_pred = BinaryLabelDatasetMetric(dataset_pred,
                                       unprivileged_groups=unprivileged_groups,
                                       privileged_groups=privileged_groups)
print("\nTest dataset predictions bias metrics")
print("Disparate impact: {}".format(metric_pred.disparate_impact()))
print("Statistical parity difference: {}".format(metric_pred.statistical_parity_difference()))


Test dataset predictions bias metrics
Disparate impact: 0.6127559583752937
Statistical parity difference: -0.08426280997772179


### Analysis of Model Predictions

- The model's predictions show better fairness metrics than the labels in the original training data, but some bias remains. The comparison is indicative only, since the baseline was measured on training labels and these metrics on test-set predictions.
- The disparate impact of 0.613 indicates that the unprivileged group (females) is now 61.3% as likely to receive a positive prediction as the privileged group (males). This is better than the original 36.8%, but still short of parity.
- The statistical parity difference of -0.084 means that the positive-prediction rate for females is 8.4 percentage points lower than for males. This is an improvement on the original gap of 19.8 percentage points, but bias remains.
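One way to summarize the change is to ask how much of the distance to parity has been closed on each metric. The sketch below uses only the values printed earlier in the notebook. It is rough arithmetic, not a formal evaluation, because the baseline comes from training labels and the second set from test-set predictions.

```python
# Values printed earlier in the notebook.
baseline = {"di": 0.3678710816756443, "spd": -0.1975550563436117}    # training labels, unweighted
predicted = {"di": 0.6127559583752937, "spd": -0.08426280997772179}  # test-set predictions

# Distance from parity: disparate impact is ideal at 1.0, statistical parity difference at 0.0.
gap_before = {"di": abs(1.0 - baseline["di"]), "spd": abs(baseline["spd"])}
gap_after = {"di": abs(1.0 - predicted["di"]), "spd": abs(predicted["spd"])}

closed = {}
for name in ("di", "spd"):
    closed[name] = 1.0 - gap_after[name] / gap_before[name]
    print(f"{name}: gap {gap_before[name]:.3f} -> {gap_after[name]:.3f} "
          f"({closed[name]:.0%} of the gap closed)")
```

The two metrics give different percentages because one measures a ratio and the other a difference, so neither figure should be read as "the" amount of bias removed.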

## Final analysis

While Reweighing balanced the weighted training data, some bias persists in the model's predictions on the test set. A likely reason is that the features, including `sex` itself and attributes correlated with it, still let the model distinguish the two groups. This highlights the challenge of achieving fairness in machine learning models. Reducing the remaining bias may require additional techniques, such as in-processing methods like the `PrejudiceRemover` imported above, and usually involves a trade-off with accuracy. In practice, perfect fairness on every metric is rarely achievable.