# SenSeI: Sensitive Set Invariance for Enforcing Individual Fairness

SenSeI is an in-processing method for individual fairness. It formulates individual fairness as invariance of the model's predictions on certain sensitive sets, and it enforces this notion by minimizing a transport-based regularizer during training.
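As a rough guide, the training objective can be written as below. This is recalled from the SenSeI paper rather than taken from this notebook, so treat the exact form as an assumption. Here $\ell$ is the task loss, $d_x$ and $d_y$ are the input and output distances, $P_X$ is the data distribution, and $\rho$ and $\epsilon$ correspond to the `rho` and `eps` hyperparameters set later in this notebook.

```latex
\min_{h} \; \mathbb{E}\big[\ell(h(X), Y)\big] + \rho \, R(h),
\qquad
R(h) = \sup_{\Pi \,:\, \Pi(\cdot,\, \mathcal{X}) = P_X}
       \mathbb{E}_{(X, X') \sim \Pi}\big[d_y\big(h(X), h(X')\big)\big]
\quad \text{s.t.} \quad
\mathbb{E}_{(X, X') \sim \Pi}\big[d_x(X, X')\big] \le \epsilon
```

Read informally, the regularizer $R(h)$ asks how far apart the outputs can be pushed for pairs of inputs that are, on average, within $\epsilon$ of each other under $d_x$.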

In [1]:
import pandas as pd
from sklearn.metrics import accuracy_score, balanced_accuracy_score
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder, StandardScaler, minmax_scale
from sklearn.model_selection import train_test_split
from skorch import NeuralNetClassifier
import torch
import torch.nn as nn
import torch.nn.functional as F

from inFairness import distances
from inFairness.auditor import SenSeIAuditor

import aif360
from aif360.sklearn.datasets import fetch_adult
from aif360.sklearn.metrics import consistency_score
from aif360.sklearn.inprocessing import SenSeI

We will be using the Adult income dataset for this tutorial. For pre-processing, we apply the usual one-hot encoding for categorical features and standard scaling for continuous features. We divide the data into train and test splits with an 80/20 ratio. Finally, note that we convert the dtype to 32-bit floats, as this is the default precision for torch models.

In [2]:
X, y, _ = fetch_adult(dropcols=['native-country', 'education'])
(X_train, X_test,
 y_train, y_test) = train_test_split(X, y, train_size=0.8, random_state=123)

pre = make_column_transformer(
        (OneHotEncoder(sparse=False, drop='if_binary'), X_train.dtypes == 'category'),
        (StandardScaler(), X_train.dtypes != 'category'),
        verbose_feature_names_out=False)
# NOTE: the torch models will only handle 32-bit floats
X_train = pd.DataFrame(pre.fit_transform(X_train), index=X_train.index,
                       columns=pre.get_feature_names_out(), dtype='float32')
X_test = pd.DataFrame(pre.transform(X_test), index=X_test.index,
                      columns=pre.get_feature_names_out(), dtype='float32')
X_train

| race | sex | workclass_Federal-gov | workclass_Local-gov | workclass_Private | workclass_Self-emp-inc | workclass_Self-emp-not-inc | workclass_State-gov | workclass_Without-pay | marital-status_Divorced | marital-status_Married-AF-spouse | marital-status_Married-civ-spouse | ... | race_Asian-Pac-Islander | race_Black | race_Other | race_White | sex_Male | age | education-num | capital-gain | capital-loss | hours-per-week |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| White | Male | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | -0.500934 | 1.114976 | -0.146659 | -0.219919 | -0.080047 |
| White | Male | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 0.484367 | 1.114976 | -0.146659 | -0.219919 | 0.835685 |
| White | Male | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | -0.197765 | -0.444540 | -0.146659 | -0.219919 | 0.752437 |
| Non-white | Male | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | -1.107273 | -2.004057 | -0.146659 | -0.219919 | 1.418425 |
| White | Female | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | -1.410443 | -0.054661 | -0.146659 | -0.219919 | -0.080047 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| White | Male | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | -1.562028 | -1.224298 | -0.146659 | -0.219919 | -0.912532 |
| White | Male | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | -0.728312 | -0.054661 | -0.146659 | -0.219919 | 1.418425 |
| White | Male | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 1.0 | 1.393875 | -3.173694 | -0.146659 | -0.219919 | -0.080047 |
| White | Female | 0.0 | 0.0 | 1.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | ... | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | -0.728312 | -0.444540 | -0.146659 | -0.219919 | -0.080047 |
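To make the two pre-processing steps described above concrete, here is a minimal pure-Python sketch of standard scaling and one-hot encoding. The helper names and the toy values are hypothetical and are not part of scikit-learn or the Adult data; the sketch also ignores the `drop='if_binary'` option used in the notebook.

```python
from statistics import fmean, pstdev


def standard_scale(values):
    """Center to zero mean and scale to unit (population) standard deviation."""
    mu = fmean(values)
    sigma = pstdev(values)
    return [(v - mu) / sigma for v in values]


def one_hot(values, categories):
    """Map each categorical value to a 0/1 indicator vector."""
    return [[1.0 if v == c else 0.0 for c in categories] for v in values]


# Made-up values for illustration, not rows from the Adult dataset.
ages = [25.0, 35.0, 45.0]
scaled_ages = standard_scale(ages)

categories = ["Private", "State-gov", "Local-gov"]
workclass = one_hot(["Private", "State-gov"], categories)

print(scaled_ages)
print(workclass)
```

After scaling, the continuous column has mean 0 and unit variance, which is why values such as `age` in the table above are small signed numbers rather than raw ages.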


At this point, we can create a copy of the test data with the spouse variable (the `relationship_Wife` indicator) flipped. We will use this copy later in a counterfactual assessment of the model.

In [3]:
X_test_spouse_flipped = X_test.copy()
X_test_spouse_flipped.relationship_Wife = 1 - X_test_spouse_flipped.relationship_Wife

We also need to keep track of the column indices of the protected attributes for later.

In [4]:
protected_vars = ['race_White', 'sex_Male']
protected_idxs = [X_train.columns.get_loc(var) for var in protected_vars]
protected_idxs

[38, 39]

This is the neural network we will use for the following experiment. It is a simple fully-connected network with ReLU activations.

In [5]:
class Model(nn.Module):
    def __init__(self, input_size, output_size=1):
        super().__init__()
        self.fc1 = nn.Linear(input_size, 100)
        self.fc2 = nn.Linear(100, 100)
        self.fcout = nn.Linear(100, output_size)

    def forward(self, x):
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fcout(x)
        return x
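As a quick sanity check on the size of this network, the number of trainable parameters can be worked out by hand. The sketch below assumes 45 input features, which is the value shown in the printed module later in this notebook; the helper name is hypothetical.

```python
def linear_param_count(in_features, out_features):
    """Weights plus biases of one fully-connected layer."""
    return in_features * out_features + out_features


# (in_features, out_features) for fc1, fc2 and fcout.
# 45 inputs is assumed from the module printout later in the notebook.
layer_sizes = [(45, 100), (100, 100), (100, 1)]

total_params = sum(linear_param_count(i, o) for i, o in layer_sizes)
print(total_params)  # 14801
```

The count breaks down as 4,600 for `fc1`, 10,100 for `fc2`, and 101 for `fcout`.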

## Standard training

Now let's train our model with no individual fairness loss. We can use the skorch library to convert the PyTorch model to a sklearn-friendly estimator.

Note: we could alternatively set `output_size = 2` and `criterion = nn.CrossEntropyLoss`. We use a single output with BCE instead because, for a binary `y`, SenSeI assumes the loss is BCE and encodes the nominal labels automatically. This lets us skip the manual encoding step when we train the fair model later.

In [6]:
EPOCHS = 10
input_size = X_train.shape[1]
output_size = 1
optimizer = torch.optim.Adam
criterion = nn.BCEWithLogitsLoss
lr = 1e-3
device = torch.device('cpu')

network_standard = NeuralNetClassifier(
    Model,
    module__input_size=input_size,
    module__output_size=output_size,
    max_epochs=EPOCHS,
    criterion=criterion,
    optimizer=optimizer,
    lr=lr,
    train_split=None,
    # this is not strictly necessary; it just handles the conversion from DataFrame -> ndarray
    dataset=aif360.sklearn.inprocessing.infairness.Dataset,
    iterator_train__shuffle=True, # Shuffle training data on each epoch
    device=device,
)
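The note above mentions two interchangeable setups: one output with a BCE loss, or two outputs with a cross-entropy loss. The following pure-Python sketch (hypothetical helper names, not the torch implementations) illustrates why they agree: BCE on a single logit `z` matches softmax cross-entropy on the pair of logits `[0, z]`.

```python
import math


def bce_with_logits(logit, target):
    """Binary cross-entropy on a single raw logit; target is 0.0 or 1.0."""
    prob = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def cross_entropy(logits, target_index):
    """Softmax cross-entropy over a list of class logits."""
    log_norm = math.log(sum(math.exp(z) for z in logits))
    return log_norm - logits[target_index]


z = 0.7  # arbitrary logit chosen for illustration

# One (BCE, cross-entropy) pair per label value.
pairs = [
    (bce_with_logits(z, float(label)), cross_entropy([0.0, z], label))
    for label in (0, 1)
]

for single, paired in pairs:
    print(f"{single:.6f} {paired:.6f}")
```

Both columns print the same value for each label, so the choice between the two setups is mostly a matter of convenience for label encoding.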

skorch does not automatically encode the targets, so we need to convert them to 0/1 ourselves.

In [7]:
y_train_enc = y_train.cat.codes.astype('float32')
y_test_enc = y_test.cat.codes.astype('float32')

In [8]:
# the shape of y also needs to match the output of the network so we convert it to 2D first
network_standard.fit(X_train, y_train_enc.to_frame())

  epoch    train_loss     dur
-------  ------------  ------
      1        0.3598  0.5932
      2        0.3162  0.5971
      3        0.3139  0.6378
      4        0.3112  0.4824
      5        0.3091  0.5530
      6        0.3078  0.5659
      7        0.3067  0.5196
      8        0.3052  0.4765
      9        0.3036  0.5145
     10        0.3025  0.5702


<class 'skorch.classifier.NeuralNetClassifier'>[initialized](
  module_=Model(
    (fc1): Linear(in_features=45, out_features=100, bias=True)
    (fc2): Linear(in_features=100, out_features=100, bias=True)
    (fcout): Linear(in_features=100, out_features=1, bias=True)
  ),
)

As a baseline, let's print the accuracy, balanced accuracy, consistency with nearest neighbors, and the consistency of the predictions when the spouse column is flipped. The spouse feature should have no causal impact on the prediction, so for an individually fair model this last metric should be close to 100%.

In [9]:
y_pred_standard = network_standard.predict(X_test)
accuracy = accuracy_score(y_test_enc, y_pred_standard)
balanced_acc = balanced_accuracy_score(y_test_enc, y_pred_standard)
consistency = consistency_score(minmax_scale(X_test), y_pred_standard.ravel())

y_pred_flipped = network_standard.predict(X_test_spouse_flipped)
spouse_consistency = accuracy_score(y_pred_standard, y_pred_flipped)

print(f'Accuracy: {accuracy:.2%}')
print(f'Balanced accuracy: {balanced_acc:.2%}')
print(f'Consistency: {consistency:.2%}')
print(f'Spouse consistency: {spouse_consistency:.2%}')

Accuracy: 85.13%
Balanced accuracy: 77.59%
Consistency: 93.79%
Spouse consistency: 92.77%
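Two of the metrics above are easy to reproduce by hand. The sketch below uses hypothetical helpers and made-up labels to show balanced accuracy as the mean of per-class recall, and the flipped-feature consistency as the fraction of rows whose prediction is unchanged. It is meant to illustrate the definitions, not to replace the scikit-learn functions used in the notebook.

```python
def balanced_accuracy(y_true, y_pred):
    """Average of per-class recall, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        preds_for_cls = [p for t, p in zip(y_true, y_pred) if t == cls]
        recalls.append(sum(p == cls for p in preds_for_cls) / len(preds_for_cls))
    return sum(recalls) / len(recalls)


def agreement(pred_a, pred_b):
    """Fraction of rows where two prediction vectors match."""
    return sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Made-up labels and predictions for illustration only.
y_true = [0, 0, 0, 0, 1, 1]
y_pred = [0, 0, 0, 1, 1, 0]
y_pred_flipped = [0, 0, 1, 1, 1, 0]  # predictions after flipping one feature

print(balanced_accuracy(y_true, y_pred))  # 0.625
print(agreement(y_pred, y_pred_flipped))  # 5 of 6 rows unchanged
```

Because the Adult labels are imbalanced, balanced accuracy is noticeably lower than plain accuracy, as the numbers above show.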


## Individually fair training

Now let's train an individually fair model using SenSeI. First, we must define the distance functions we will be using in both the input and output spaces. For the input (X) space, we will use the Logistic Regression Sensitive Subspace distance metric and for the output (y) space, we will use a simple Squared Euclidean distance.

In [10]:
distance_x = distances.LogisticRegSensitiveSubspace()
distance_y = distances.SquaredEuclideanDistance()

X_train_tensor = torch.as_tensor(X_train.to_numpy())
distance_x.fit(X_train_tensor, protected_idxs=protected_idxs)
distance_y.fit(num_dims=output_size)

distance_x.to(device)
distance_y.to(device)
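To build intuition for the input-space metric, here is a simplified pure-Python sketch. It assumes the sensitive subspace is spanned only by the protected coordinates themselves, so the distance simply ignores them. The `LogisticRegSensitiveSubspace` metric in the library is fitted on the training data and is more involved; the function names below are hypothetical.

```python
def squared_euclidean(u, v):
    """Sum of squared coordinate differences."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def distance_ignoring(u, v, protected_idxs):
    """Squared Euclidean distance that skips the protected coordinates.

    Simplifying assumption: the sensitive subspace is axis-aligned with
    the protected attributes, with no learned directions.
    """
    protected = set(protected_idxs)
    return sum(
        (a - b) ** 2
        for i, (a, b) in enumerate(zip(u, v))
        if i not in protected
    )


# Toy rows: the last coordinate plays the role of a protected attribute.
x1 = [0.5, 1.0, 0.0]
x2 = [0.5, 1.0, 1.0]  # same individual with the protected attribute flipped

print(squared_euclidean(x1, x2))       # 1.0
print(distance_ignoring(x1, x2, [2]))  # 0.0
```

Under such a metric, two individuals who differ only in a protected attribute are at distance zero, so an individually fair model is expected to give them (nearly) the same output.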

The `SenSeI` class inherits from skorch so it looks very similar to the standard training setup.

In [11]:
rho = 2.5
eps = 0.1
auditor_nsteps = 100
auditor_lr = 1e-3

network_fair = SenSeI(
    Model,
    module__input_size=input_size,
    module__output_size=output_size,
    distance_x=distance_x,
    distance_y=distance_y,
    rho=rho,
    eps=eps,
    auditor_nsteps=auditor_nsteps,
    auditor_lr=auditor_lr,
    max_epochs=EPOCHS,
    criterion=criterion,
    optimizer=optimizer,
    lr=lr,
    device=device,
    iterator_train__shuffle=True, # Shuffle training data on each epoch
)

In [12]:
network_fair.fit(X_train, y_train)

  epoch    train_loss      dur
-------  ------------  -------
      1        0.5119  23.9251
      2        0.4335  23.2762
      3        0.4028  23.8665
      4        0.3949  24.0242
      5        0.3914  23.4092
      6        0.3886  23.2872
      7        0.3864  23.1221
      8        0.3853  23.8168
      9        0.3836  24.1062
     10        0.3822  24.3297


<class 'aif360.sklearn.inprocessing.infairness.SenSeI'>[initialized](
  module_=SenSeI(
    (distance_x): LogisticRegSensitiveSubspace()
    (distance_y): SquaredEuclideanDistance()
    (network): Model(
      (fc1): Linear(in_features=45, out_features=100, bias=True)
      (fc2): Linear(in_features=100, out_features=100, bias=True)
      (fcout): Linear(in_features=100, out_features=1, bias=True)
    )
    (loss_fn): BCEWithLogitsLoss()
  ),
)

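This time when we run the metrics, the spouse consistency is almost exactly 100% (99.97%, up from 92.77%). Accuracy is only slightly lower (83.70% vs. 85.13%), balanced accuracy drops by about four points (73.62% vs. 77.59%), and nearest neighbor consistency is about two points higher (95.79% vs. 93.79%).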

In [13]:
y_pred_fair = network_fair.predict(X_test)
accuracy = accuracy_score(y_test, y_pred_fair)
balanced_acc = balanced_accuracy_score(y_test, y_pred_fair)
consistency = consistency_score(minmax_scale(X_test), y_pred_fair.ravel() == '>50K')

y_pred_fair_flipped = network_fair.predict(X_test_spouse_flipped)
spouse_consistency = accuracy_score(y_pred_fair, y_pred_fair_flipped)

print(f'Accuracy: {accuracy:.2%}')
print(f'Balanced accuracy: {balanced_acc:.2%}')
print(f'Consistency: {consistency:.2%}')
print(f'Spouse consistency: {spouse_consistency:.2%}')

Accuracy: 83.70%
Balanced accuracy: 73.62%
Consistency: 95.79%
Spouse consistency: 99.97%
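The nearest-neighbor consistency reported above can be understood through a small sketch. This assumes the metric is one minus the mean absolute gap between each prediction and the average prediction of its nearest neighbors, with each row counted among its own neighbors. The helper and toy data are hypothetical, and the library function's defaults and neighbor search may differ.

```python
def knn_consistency(X, y, n_neighbors=2):
    """1 minus the mean gap between each label and its neighbors' mean label."""

    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))

    gaps = []
    for i, xi in enumerate(X):
        # Rank rows by squared Euclidean distance. The row itself has
        # distance 0 and is kept as a neighbor (an assumption of this sketch).
        order = sorted(range(len(X)), key=lambda j: (sq_dist(xi, X[j]), j != i))
        neighbors = order[:n_neighbors]
        neighbor_mean = sum(y[j] for j in neighbors) / n_neighbors
        gaps.append(abs(y[i] - neighbor_mean))
    return 1 - sum(gaps) / len(gaps)


# Two tight clusters on a line, for illustration only.
X = [[0.0], [0.1], [1.0], [1.1]]
same_within_clusters = [0, 0, 1, 1]   # neighbors always agree
mixed_within_clusters = [0, 1, 0, 1]  # neighbors always disagree

print(knn_consistency(X, same_within_clusters))   # 1.0
print(knn_consistency(X, mixed_within_clusters))  # 0.5
```

A score near 1 means that similar rows receive similar predictions, which is the behaviour individual fairness asks for.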


## Individual fairness auditing

Let's now audit the two models and check whether each one complies with individual fairness.

In [14]:
audit_nsteps = 500
audit_lr = 0.001
loss_fn = F.binary_cross_entropy_with_logits

auditor = SenSeIAuditor(distance_x=distance_x, distance_y=distance_y,
    num_steps=audit_nsteps, lr=audit_lr, max_noise=0.5, min_noise=-0.5)

X_test_tensor = torch.as_tensor(X_test.to_numpy())
y_test_tensor = torch.as_tensor(y_test_enc.to_numpy().reshape(-1, 1))
audit_result_stdmodel = auditor.audit(network_standard.module_, X_test_tensor,
                                      y_test_tensor, loss_fn,
                                      audit_threshold=1.15, lambda_param=50.0)
audit_result_fairmodel = auditor.audit(network_fair.module_.network,
                                       X_test_tensor, y_test_tensor, loss_fn,
                                       audit_threshold=1.15, lambda_param=50.0)

print(f"Loss ratio (standard model) : {audit_result_stdmodel.lower_bound:.3f}. "
      f"Is model fair: {audit_result_stdmodel.is_model_fair}")
print(f"Loss ratio (fair model) : {audit_result_fairmodel.lower_bound:.3f}. "
      f"Is model fair: {audit_result_fairmodel.is_model_fair}")

invalid value encountered in true_divide


Loss ratio (standard model) : 221.515. Is model fair: False
Loss ratio (fair model) : 1.000. Is model fair: True


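These numbers confirm that the fair model is fairer than the standard model: the standard model's loss ratio (221.515) is far above the audit threshold of 1.15, so it fails the audit, while the fair model's loss ratio (1.000) is below the threshold, so it passes.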
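One plausible way to read the audit output is as a one-sided confidence bound on the ratio between the loss at a worst-case similar input and the loss at the original input. The sketch below is an assumption about the general idea, not the library's actual code: the function name, the 95% confidence level, the handling of non-finite ratios, and the toy ratios are all hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev


def audit_summary(loss_ratios, threshold, confidence=0.95):
    """Return (lower_bound, passes) for a list of per-sample loss ratios."""
    # Drop nan/inf entries, which can appear when a ratio is 0/0 or x/0.
    finite = [r for r in loss_ratios if math.isfinite(r)]
    mean = fmean(finite)
    std = pstdev(finite)
    z = NormalDist().inv_cdf(confidence)
    lower_bound = mean - z * std / math.sqrt(len(finite))
    # Assumed decision rule: pass when the lower bound does not exceed
    # the threshold.
    return lower_bound, lower_bound <= threshold


# Made-up ratios of worst-case loss to original loss, for illustration only.
robust_ratios = [1.00, 1.02, 0.99, 1.01, float("nan")]
fragile_ratios = [3.0, 8.0, 5.0, 6.0, 4.0]

robust_result = audit_summary(robust_ratios, threshold=1.15)
fragile_result = audit_summary(fragile_ratios, threshold=1.15)

print(robust_result)
print(fragile_result)
```

A ratio near 1 means that moving to a similar individual barely changes the loss, while a large ratio means the model's behaviour can change substantially between individuals the metric treats as close.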