# SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness

SenSeI is an in-processing method for individual fairness. In this method, individual fairness is formulated as invariance on certain sensitive sets. SenSeI minimizes a transport-based regularizer that enforces this version of individual fairness.
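
As a rough sketch of what this means, the training objective can be written as follows. The notation is an assumption based on the SenSeI paper, not something defined in this notebook.

$$
\min_{h} \; \mathbb{E}_{P}\big[\ell(h(X), Y)\big] + \rho \, R(h),
\qquad
R(h) = \sup_{\Pi \,:\, \Pi(\cdot, \mathcal{X}) = P_X,\;\; \mathbb{E}_{\Pi}[d_x(X, X')] \le \epsilon} \; \mathbb{E}_{\Pi}\big[d_y(h(X), h(X'))\big]
$$

Here $\Pi$ is a coupling between training inputs $X$ and comparable inputs $X'$, $d_x$ is the fair metric on the input space, and $d_y$ is the metric on the output space. The weight $\rho$ and the budget $\epsilon$ presumably correspond to the `rho` and `eps` arguments used later in this tutorial.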

In [1]:
import pandas as pd
from sklearn.metrics import accuracy_score, balanced_accuracy_score
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.model_selection import train_test_split
from skorch import NeuralNetClassifier
import torch
import torch.nn as nn
import torch.nn.functional as F

from inFairness import distances
from inFairness.auditor import SenSeIAuditor

import aif360
from aif360.sklearn.datasets import fetch_adult
from aif360.sklearn.inprocessing import SenSeI

We will be using the Adult income dataset for this tutorial. For pre-processing, we apply the usual one-hot encoding for categorical features and standard scaling for continuous features. We divide the data into train and test splits with an 80/20 ratio. Finally, note that we convert the dtype to 32-bit floats, as this is the default precision for torch models.

In [2]:
X, y, _ = fetch_adult(dropcols=['native-country', 'education'])
(X_train, X_test,
 y_train, y_test) = train_test_split(X, y, train_size=0.8, random_state=123)

pre = make_column_transformer(
        # NOTE: scikit-learn >= 1.2 renames `sparse` to `sparse_output`
        (OneHotEncoder(sparse=False, drop='if_binary'), X_train.dtypes == 'category'),
        (StandardScaler(), X_train.dtypes != 'category'),
        verbose_feature_names_out=False)
# NOTE: the torch models will only handle 32-bit floats
X_train = pd.DataFrame(pre.fit_transform(X_train), index=X_train.index,
                       columns=pre.get_feature_names_out(), dtype='float32')
X_test = pd.DataFrame(pre.transform(X_test), index=X_test.index,
                      columns=pre.get_feature_names_out(), dtype='float32')
X_train

,race,sex,workclass_Federal-gov,workclass_Local-gov,workclass_Private,workclass_Self-emp-inc,workclass_Self-emp-not-inc,workclass_State-gov,workclass_Without-pay,marital-status_Divorced,marital-status_Married-AF-spouse,marital-status_Married-civ-spouse,...,relationship_Own-child,relationship_Unmarried,relationship_Wife,race_White,sex_Male,age,education-num,capital-gain,capital-loss,hours-per-week
31209,White,Male,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,1.0,1.0,-0.500934,1.114976,-0.146659,-0.219919,-0.080047
35748,White,Male,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,1.0,1.0,0.484367,1.114976,-0.146659,-0.219919,0.835685
26000,White,Male,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,...,0.0,0.0,0.0,1.0,1.0,-0.197765,-0.444540,-0.146659,-0.219919,0.752437
5072,Non-white,Male,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,1.0,-1.107273,-2.004057,-0.146659,-0.219919,1.418425
46474,White,Female,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,1.0,0.0,0.0,1.0,0.0,-1.410443,-0.054661,-0.146659,-0.219919,-0.080047
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
8271,White,Male,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,1.0,-1.562028,-1.224298,-0.146659,-0.219919,-0.912532
16345,White,Male,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,0.0,0.0,1.0,1.0,-0.728312,-0.054661,-0.146659,-0.219919,1.418425
18865,White,Male,0.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,...,0.0,1.0,0.0,1.0,1.0,1.393875,-3.173694,-0.146659,-0.219919,-0.080047
29771,White,Female,0.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,1.0,0.0,-0.728312,-0.444540,-0.146659,-0.219919,-0.080047


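We also need to convert our target from a string to 0/1. The `factorize` version in the next cell is left commented out; the encoding actually used for the standard model (`y_train_enc`, `y_test_enc`) is created just before that model is fitted.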

In [3]:
# y_train = pd.Series(y_train.factorize(sort=True)[0], index=y_train.index)
# y_test = pd.Series(y_test.factorize(sort=True)[0], index=y_test.index)
# y_train

At this point, we can create a copy of the test data with the spouse variable flipped. This will be used in a counterfactual assessment of the model later.

In [4]:
X_test_spouse_flipped = X_test.copy()
X_test_spouse_flipped.relationship_Wife = 1 - X_test_spouse_flipped.relationship_Wife

Another thing we need to keep track of for later is the protected attribute indices.

In [5]:
protected_vars = ['race_White', 'sex_Male']
protected_idxs = [X_train.columns.get_loc(var) for var in protected_vars]
protected_idxs

[34, 35]

This is the neural network we will use for the following experiment. It is a simple fully-connected network with ReLU activations.

In [6]:
class Model(nn.Module):
    def __init__(self, input_size, output_size=2):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 100)
        self.fc2 = nn.Linear(100, 100)
        self.fcout = nn.Linear(100, output_size)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fcout(x)
        return x

## Standard training

Now let's train our model with no individual fairness loss. We can use the skorch library to convert the PyTorch model to a sklearn-friendly estimator.

In [7]:
EPOCHS = 10
input_size = X_train.shape[1]
output_size = 1
optimizer = torch.optim.Adam
criterion = nn.BCEWithLogitsLoss
lr = 1e-3
device = torch.device('cpu')

network_standard = NeuralNetClassifier(
    Model,
    module__input_size=input_size,
    module__output_size=output_size,
    max_epochs=EPOCHS,
    criterion=criterion,
    optimizer=optimizer,
    lr=lr,
    # this is not strictly necessary; it just handles the conversion from DataFrame -> ndarray
    dataset=aif360.sklearn.inprocessing.infairness.Dataset,
    iterator_train__shuffle=True, # Shuffle training data on each epoch
    device=device,
)

In [8]:
y_train_enc = y_train.cat.codes.astype('float32')
y_test_enc = y_test.cat.codes.astype('float32')

In [9]:
network_standard.fit(X_train, y_train_enc.to_frame())

  epoch    train_loss    valid_acc    valid_loss     dur
-------  ------------  -----------  ------------  ------
      1        0.3670       0.8460        0.3253  0.5687
      2        0.3184       0.8541        0.3170  0.5384
      3        0.3151       0.8545        0.3160  0.5484
      4        0.3136       0.8535        0.3160  0.5419
      5        0.3121       0.8530        0.3147  0.5273
      6        0.3101       0.8530        0.3139  0.4738
      7        0.3085       0.8520        0.3143  0.5381
      8        0.3079       0.8528        0.3157  0.5546
      9        0.3071       0.8526        0.3160  0.4662
     10        0.3045       0.8549        0.3154  0.5047


<class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=Model(
    (fc1): Linear(in_features=41, out_features=100, bias=True)
    (fc2): Linear(in_features=100, out_features=100, bias=True)
    (fcout): Linear(in_features=100, out_features=1, bias=True)
  ),
)

As a baseline, let's print the accuracy, the balanced accuracy, and the consistency of the predictions when the spouse column is flipped. This feature should have no causal impact on the prediction, so for an individually fair model the consistency should be close to 100%.

In [10]:
y_pred_standard = network_standard.predict(X_test)
accuracy = accuracy_score(y_test_enc, y_pred_standard)
balanced_acc = balanced_accuracy_score(y_test_enc, y_pred_standard)

y_pred_flipped = network_standard.predict(X_test_spouse_flipped)
spouse_consistency = accuracy_score(y_pred_standard, y_pred_flipped)

print(f'Accuracy: {accuracy:.2%}')
print(f'Balanced accuracy: {balanced_acc:.2%}')
print(f'Spouse consistency: {spouse_consistency:.2%}')

Accuracy: 85.14%
Balanced accuracy: 77.17%
Spouse consistency: 92.54%
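
To make the spouse consistency metric concrete: calling `accuracy_score` on two prediction arrays gives the fraction of rows whose predicted label is the same in both. The sketch below reproduces that with NumPy on toy predictions. The helper name and the toy arrays are hypothetical and are not part of the notebook's data.

```python
import numpy as np

def prediction_consistency(pred_original, pred_flipped):
    """Fraction of samples whose predicted label is unchanged after the flip."""
    pred_original = np.asarray(pred_original)
    pred_flipped = np.asarray(pred_flipped)
    return float(np.mean(pred_original == pred_flipped))

# Hypothetical toy predictions: one of five labels changes after the flip.
toy_pred = np.array([1, 0, 0, 1, 1])
toy_pred_flipped = np.array([1, 0, 1, 1, 1])

toy_consistency = prediction_consistency(toy_pred, toy_pred_flipped)
print(f'Toy consistency: {toy_consistency:.2%}')
```

A value of 100% would mean that flipping the column never changes a prediction.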


## Individually fair training

Now let's train an individually fair model using SenSeI. First, we must define the distance functions we will be using in both the input and output spaces. For the input (X) space, we will use the Logistic Regression Sensitive Subspace distance metric and for the output (y) space, we will use a simple Squared Euclidean distance.

In [11]:
distance_x = distances.LogisticRegSensitiveSubspace()
distance_y = distances.SquaredEuclideanDistance()

X_train_tensor = torch.as_tensor(X_train.to_numpy())
distance_x.fit(X_train_tensor, protected_idxs=protected_idxs)
distance_y.fit(num_dims=output_size)

distance_x.to(device)
distance_y.to(device)
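
For intuition about the input-space metric, here is a minimal NumPy sketch of a sensitive subspace distance: differences along an assumed set of sensitive directions are projected out before the squared Euclidean distance is computed. This is not the inFairness implementation. In particular, the sensitive directions are supplied by hand here, whereas `LogisticRegSensitiveSubspace` learns them from the data. The function name and toy vectors are hypothetical.

```python
import numpy as np

def sensitive_subspace_distance(x1, x2, basis):
    """Squared distance between x1 and x2 after removing span(basis).

    basis: (d, k) array whose columns span the assumed sensitive subspace
           (must have full column rank).
    """
    # Orthonormalize the sensitive directions.
    q, _ = np.linalg.qr(basis)
    # Projector onto the orthogonal complement of the sensitive subspace.
    proj_complement = np.eye(basis.shape[0]) - q @ q.T
    diff = x1 - x2
    return float(diff @ proj_complement @ diff)

# Hypothetical 3-feature example in which the last feature is sensitive.
basis = np.array([[0.0], [0.0], [1.0]])
a = np.array([1.0, 2.0, 0.0])
b = np.array([1.0, 2.0, 1.0])  # differs from a only in the sensitive feature
c = np.array([2.0, 2.0, 0.0])  # differs from a in a non-sensitive feature

d_ab = sensitive_subspace_distance(a, b, basis)
d_ac = sensitive_subspace_distance(a, c, basis)
print(d_ab, d_ac)
```

Under such a metric, two individuals who differ only along sensitive directions are treated as (nearly) identical, so a fair model is expected to give them (nearly) identical outputs.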

The `SenSeI` class inherits from skorch so it looks very similar to the standard training setup.

In [12]:
rho = 5.0/2
eps = 0.1
auditor_nsteps = 100
auditor_lr = 1e-3

network_fair = SenSeI(
    Model,
    module__input_size=input_size,
    module__output_size=output_size,
    distance_x=distance_x,
    distance_y=distance_y,
    rho=rho,
    eps=eps,
    auditor_nsteps=auditor_nsteps,
    auditor_lr=auditor_lr,
    max_epochs=EPOCHS,
    criterion=criterion,
    optimizer=optimizer,
    lr=lr,
    device=device,
    iterator_train__shuffle=True, # Shuffle training data on each epoch
)
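
To illustrate the role of `rho`, the sketch below combines a per-sample task loss with a penalty on how far the prediction moves between an input and a "worst-case" comparable input. It is a simplified illustration under stated assumptions, not the library's training loop: the worst-case predictions are given by hand instead of being found by the auditor, and the function names are hypothetical.

```python
import numpy as np

def squared_euclidean(u, v):
    """Per-sample squared Euclidean distance between rows of u and v."""
    return np.sum((u - v) ** 2, axis=-1)

def sensei_style_loss(task_loss, pred, pred_worst, rho):
    """Mean of task loss plus rho times the output-space distance penalty."""
    penalty = squared_euclidean(pred, pred_worst)
    return float(np.mean(task_loss + rho * penalty))

# Hypothetical per-sample losses and logits for two samples.
task_loss = np.array([0.2, 0.4])
pred = np.array([[0.5], [-1.0]])
pred_worst = np.array([[0.7], [-1.0]])  # only the first prediction moves

total = sensei_style_loss(task_loss, pred, pred_worst, rho=2.5)
print(f'{total:.4f}')
```

A larger `rho` puts more weight on keeping predictions stable across comparable inputs, typically at some cost in task loss.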

In [13]:
network_fair.fit(X_train, y_train)

  epoch    train_loss      dur
-------  ------------  -------
      1        0.5068  23.8525
      2        0.4354  26.8293
      3        0.4053  23.9973
      4        0.3944  23.4837
      5        0.3900  24.2724
      6        0.3877  24.4329
      7        0.3849  29.8392
      8        0.3838  24.8109
      9        0.3822  24.4255
     10        0.3810  24.0646


<class 'aif360.sklearn.inprocessing.infairness.SenSeI'>[initialized](
  module_=SenSeI(
    (distance_x): LogisticRegSensitiveSubspace()
    (distance_y): SquaredEuclideanDistance()
    (network): Model(
      (fc1): Linear(in_features=41, out_features=100, bias=True)
      (fc2): Linear(in_features=100, out_features=100, bias=True)
      (fcout): Linear(in_features=100, out_features=1, bias=True)
    )
    (loss_fn): BCEWithLogitsLoss()
  ),
)

This time when we run the metrics, the spouse consistency is almost exactly 100% (99.95%, up from 92.54%), while accuracy drops by about 1.4 percentage points and balanced accuracy by about 3.9 percentage points.

In [14]:
y_pred_fair = network_fair.predict(X_test)
accuracy = accuracy_score(y_test, y_pred_fair)
balanced_acc = balanced_accuracy_score(y_test, y_pred_fair)

y_pred_fair_flipped = network_fair.predict(X_test_spouse_flipped)
spouse_consistency = accuracy_score(y_pred_fair, y_pred_fair_flipped)

print(f'Accuracy: {accuracy:.2%}')
print(f'Balanced accuracy: {balanced_acc:.2%}')
print(f'Spouse consistency: {spouse_consistency:.2%}')

Accuracy: 83.70%
Balanced accuracy: 73.29%
Spouse consistency: 99.95%


## Individual fairness auditing

Let's now audit the two models and check for their individual fairness compliance.

In [15]:
# Auditing using the SenSeI Auditor

audit_nsteps = 500
audit_lr = 0.001
loss_fn = F.binary_cross_entropy_with_logits

auditor = SenSeIAuditor(distance_x=distance_x, distance_y=distance_y, num_steps=audit_nsteps, lr=audit_lr, max_noise=0.5, min_noise=-0.5)

X_test_tensor = torch.as_tensor(X_test.to_numpy())
y_test_tensor = torch.as_tensor(y_test_enc.to_numpy().reshape(-1, 1))
audit_result_stdmodel = auditor.audit(network_standard.module_, X_test_tensor, y_test_tensor, loss_fn, audit_threshold=1.15, lambda_param=50.0)
audit_result_fairmodel = auditor.audit(network_fair.module_.network, X_test_tensor, y_test_tensor, loss_fn, audit_threshold=1.15, lambda_param=50.0)

print(f"Loss ratio (standard model) : {audit_result_stdmodel.lower_bound:.3f}. Is model fair: {audit_result_stdmodel.is_model_fair}")
print(f"Loss ratio (fair model) : {audit_result_fairmodel.lower_bound:.3f}. Is model fair: {audit_result_fairmodel.is_model_fair}")

invalid value encountered in true_divide


Loss ratio (standard model) : -7817093355.950. Is model fair: True
Loss ratio (fair model) : 1.000. Is model fair: True
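
The `invalid value encountered in true_divide` warning above is the kind of message NumPy emits for a division such as 0/0. The sketch below assumes the audit statistic is built from per-sample ratios of worst-case loss to original loss and shows how non-finite ratios can arise and be filtered out. It is an illustration under that assumption, not the inFairness code, and the function name and toy losses are hypothetical.

```python
import numpy as np

def mean_loss_ratio(loss_original, loss_worst_case):
    """Mean of finite per-sample loss ratios, plus the count of dropped ratios."""
    loss_original = np.asarray(loss_original, dtype=float)
    loss_worst_case = np.asarray(loss_worst_case, dtype=float)
    # Suppress the divide/invalid warnings; 0/0 yields nan and x/0 yields inf.
    with np.errstate(divide="ignore", invalid="ignore"):
        ratios = loss_worst_case / loss_original
    finite = ratios[np.isfinite(ratios)]
    return float(finite.mean()), int(ratios.size - finite.size)

# Hypothetical losses: the second sample has zero loss in both cases (0/0).
mean_ratio, n_dropped = mean_loss_ratio([0.5, 0.0, 0.2], [0.6, 0.0, 0.2])
print(f'Mean ratio: {mean_ratio:.3f}, dropped: {n_dropped}')
```

Checking how many ratios are non-finite is a reasonable first step when an audit reports an implausible value.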


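The value reported for the fair model, 1.000, is below the audit threshold of 1.15, and the auditor flags it as fair. The result for the standard model is harder to interpret: a loss ratio of -7817093355.950, together with the `invalid value encountered in true_divide` warning, suggests a numerical problem during the audit, so its `True` verdict should not be read as evidence of fairness. The spouse consistency scores above (99.95% versus 92.54%) are the clearer indication that the fair model is fairer than the standard model.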