<h1><center>ERM with DNN under penalty of Equalized Odds</center></h1>

Here we implement standard Empirical Risk Minimization (ERM) for a Deep Neural Network (DNN), penalized to enforce an Equalized Odds constraint. More formally, given a dataset of size $n$ consisting of context features $x$, a target $y$ and a sensitive attribute $z$ to protect, we want to solve
$$
\text{argmin}_{h\in\mathcal{H}}\frac{1}{n}\sum_{i=1}^n \ell(y_i, h(x_i)) + \lambda \chi^2|_1
$$
where $\ell$ is for instance the MSE and the penalty is
$$
\chi^2|_1 = \left\lVert\chi^2\left(\hat{\pi}(h(x)|y, z|y), \hat{\pi}(h(x)|y)\otimes\hat{\pi}(z|y)\right)\right\rVert_1
$$
where $\hat{\pi}$ denotes the empirical density estimated through a Gaussian KDE.
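
To make the Gaussian KDE concrete, here is a minimal NumPy sketch for one-dimensional samples. The function name `gaussian_kde_1d` and the fixed `bandwidth` are assumptions made for illustration; the `kde` helper imported from `facl` later in this notebook operates on tensors and may choose its bandwidth differently.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate of `samples` at the points in `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Standardized pairwise distances, shape (len(grid), len(samples)).
    u = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)
    # Average the kernels over the samples and rescale by the bandwidth.
    return kernel.mean(axis=1) / bandwidth

rng = np.random.default_rng(0)
samples = rng.normal(size=500)          # synthetic data, not the crimes dataset
grid = np.linspace(-6.0, 6.0, 1201)
density = gaussian_kde_1d(samples, grid, bandwidth=0.3)

# Riemann sum of the estimated density over the grid; should be close to 1.
total_mass = density.sum() * (grid[1] - grid[0])
```

A smaller bandwidth follows the samples more closely but gives a noisier estimate, which in turn makes any divergence computed from it noisier.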

### The dataset

We use the _communities and crimes_ dataset from the UCI Machine Learning Repository (http://archive.ics.uci.edu/ml/datasets/communities+and+crime). Non-predictive information, such as city name and state, has been removed, and the file is in ARFF format for ease of loading.

In [11]:
import sys, os
sys.path.append(os.path.abspath(os.path.join('../..')))
cur_path = os.path.dirname(os.path.dirname(os.path.dirname(os.getcwd())))
sys.path.append(cur_path)
base_path = cur_path + '/data/'
print(base_path)

import get_dataset

d:\Microsoft VS Code\PyCodes\RA_Fairness\fair_dummies/data/


In [2]:
from examples.data_loading import read_dataset
import importlib
# importlib.reload(read_dataset)
x_train, y_train, z_train, x_test, y_test, z_test = read_dataset(name='crimes', fold=1)
n, d = x_train.shape

dataset = "crimes"
seed = 123
X, A, Y, X_cal, A_cal, Y_cal, X_test, A_test, Y_test = get_dataset.get_train_test_data(base_path, dataset, seed, dim = 2, specified="TotalPctDiv")

y_train.shape

(1794,)

### The Deep Neural Network

We define a very simple DNN for regression: two hidden layers of 50 units with SELU activations, followed by a linear output layer.

In [3]:
from torch import nn
import torch.nn.functional as F

class NetRegression(nn.Module):
    def __init__(self, input_size, num_classes):
        super(NetRegression, self).__init__()
        size = 50
        self.first = nn.Linear(input_size, size)
        self.fc = nn.Linear(size, size)
        self.last = nn.Linear(size, num_classes)

    def forward(self, x):
        out = F.selu(self.first(x))
        out = F.selu(self.fc(out))
        out = self.last(out)
        return out

### The fairness-inducing regularizer
We now implement the regularizer. The empirical densities $\hat{\pi}$ are estimated using a Gaussian KDE. The L1 functional norm is taken over the values of the conditioning variable, which is $z$ in the generic notation below and the target $y$ in our application.
$$
\chi^2|_1 = \left\lVert\chi^2\left(\hat{\pi}(x|z, y|z), \hat{\pi}(x|z)\otimes\hat{\pi}(y|z)\right)\right\rVert_1
$$
This is used to enforce the conditional independence $X \perp Y \,|\, Z$.
In practice, we want to enforce $\text{prediction} \perp \text{sensitive} \,|\, \text{target}$, so the function is called with $X$ the prediction, $Y$ the sensitive attribute and $Z$ the target.

In [4]:
import torch
from facl.independence.density_estimation.pytorch_kde import kde
from facl.independence.hgr import chi_2_cond

def chi_squared_l1_kde(X, Y, Z):
    return torch.mean(chi_2_cond(X, Y, Z, kde))
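
As a rough illustration of what such a penalty measures, the sketch below computes the Pearson $\chi^2$ divergence between a joint distribution and the product of its marginals on small discrete tables, then averages it over the conditioning variable. This is a discrete analogue written for this note, not the `facl` implementation: the function names are hypothetical, the tables are made up, and every marginal is assumed to be strictly positive.

```python
import numpy as np

def chi2_independence(joint):
    """Pearson chi-squared divergence between a joint table and the product of its marginals."""
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()               # normalize to a probability table
    px = joint.sum(axis=1, keepdims=True)     # marginal of the row variable
    py = joint.sum(axis=0, keepdims=True)     # marginal of the column variable
    product = px * py                         # assumed strictly positive
    return float(np.sum((joint - product) ** 2 / product))

def chi2_conditional_mean(joint_xyz):
    """Unweighted mean over z of the divergence of (x, y) given z; axes are (x, y, z)."""
    joint_xyz = np.asarray(joint_xyz, dtype=float)
    slices = [joint_xyz[:, :, k] for k in range(joint_xyz.shape[2])]
    return float(np.mean([chi2_independence(s) for s in slices]))

# x and y independent: the joint is exactly the outer product of the marginals.
independent_slice = np.outer([0.3, 0.7], [0.4, 0.6])
# x and y dependent: mass concentrated on the diagonal.
dependent_slice = np.array([[0.4, 0.1],
                            [0.1, 0.4]])

chi2_ind = chi2_independence(independent_slice)
chi2_dep = chi2_independence(dependent_slice)
chi2_cond = chi2_conditional_mean(np.stack([independent_slice, dependent_slice], axis=2))
```

The divergence vanishes exactly when the joint factorizes, so driving the averaged value toward zero pushes $X$ and $Y$ toward independence within each slice of $Z$.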

### The fairness-penalized ERM

We now implement the full learning loop. The regression loss is the quadratic loss, with L2 regularization (applied through the optimizer's `weight_decay`) and the fairness-inducing penalty.

In [5]:
import torch
import numpy as np
import torch.utils.data as data_utils

def regularized_learning(x_train, y_train, z_train, model, fairness_penalty, lr=1e-5, num_epochs=10, penalty=1.0):
    # wrap dataset in torch tensors
    Y = torch.tensor(y_train.astype(np.float32))
    X = torch.tensor(x_train.astype(np.float32))
    Z = torch.tensor(z_train.astype(np.float32))
    dataset = data_utils.TensorDataset(X, Y, Z)
    dataset_loader = data_utils.DataLoader(dataset=dataset, batch_size=200, shuffle=True)

    # mse regression objective
    data_fitting_loss = nn.MSELoss()

    # stochastic optimizer
    optimizer = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=0.01)

    for j in range(num_epochs):
        for i, (x, y, z) in enumerate(dataset_loader):
            def closure():
                optimizer.zero_grad()
                outputs = model(x).flatten()
                loss = data_fitting_loss(outputs, y)
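                # argument order is (prediction, sensitive, target):
                # penalize dependence of the prediction on z given y
                loss += penalty*fairness_penalty(outputs, z, y)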
                loss.backward()
                return loss

            optimizer.step(closure)
    return model
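
The effect of the penalty weight can be seen on a toy problem. The sketch below is not the notebook's setup: it uses synthetic data, a linear model, full-batch updates, and a simple squared-covariance penalty (`covariance_penalty`, a hypothetical stand-in that ignores the target) in place of the $\chi^2$ penalty, so it only illustrates the pattern of adding a weighted penalty to the MSE.

```python
import torch
from torch import nn

def covariance_penalty(pred, z, y):
    """Toy stand-in penalty: squared covariance between predictions and z (y is ignored)."""
    return ((pred - pred.mean()) * (z - z.mean())).mean() ** 2

# Synthetic data: the first feature is a noisy copy of the sensitive attribute z.
torch.manual_seed(0)
n = 256
z = torch.randn(n)
x = torch.stack([z + 0.1 * torch.randn(n), torch.randn(n)], dim=1)
y = x[:, 0] + x[:, 1]

def fit(penalty_weight, steps=300):
    """Train a linear model on MSE + penalty_weight * penalty; return (mse, penalty)."""
    torch.manual_seed(1)                      # same initialization for every run
    model = nn.Linear(2, 1)
    optimizer = torch.optim.Adam(model.parameters(), lr=0.05)
    for _ in range(steps):
        optimizer.zero_grad()
        pred = model(x).flatten()
        loss = nn.functional.mse_loss(pred, y)
        loss = loss + penalty_weight * covariance_penalty(pred, z, y)
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        pred = model(x).flatten()
        return (nn.functional.mse_loss(pred, y).item(),
                covariance_penalty(pred, z, y).item())

mse_plain, cov_plain = fit(penalty_weight=0.0)
mse_fair, cov_fair = fit(penalty_weight=10.0)
```

With the penalty switched on, the fitted model relies less on the feature correlated with `z`, which lowers the penalty value at the cost of a higher MSE. The same trade-off is controlled by the `penalty` argument of `regularized_learning`.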

### Evaluation

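For the evaluation on the test set, two metrics are intended: the MSE (accuracy) and HGR$|_\infty$ (fairness). In the cell below, the HGR computation is commented out, so `evaluate` returns a placeholder `0` for the fairness metric.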

In [6]:
from facl.independence.hgr import hgr_cond

def evaluate(model, x, y, z):
    Y = torch.tensor(y.astype(np.float32))
    Z = torch.tensor(z.astype(np.float32))
    X = torch.tensor(x.astype(np.float32))

    prediction = model(X).detach().flatten()
    loss = nn.MSELoss()(prediction, Y)
    # HGR evaluation is disabled; a placeholder 0 is returned in its place
    # hgr_infty = np.max(hgr_cond(prediction, Z, Y, kde))
    return loss.item(), 0
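
The run at the end of this notebook prints a `Null-MSE` next to the model MSE. The sketch below, on synthetic targets with a hypothetical helper `mse`, shows what that baseline is: the error of a constant predictor that always outputs the mean of the test targets, which equals their variance.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between two arrays of the same shape."""
    return float(np.mean((np.asarray(y_pred) - np.asarray(y_true)) ** 2))

rng = np.random.default_rng(0)
y_test = rng.normal(size=200)                       # synthetic targets

# Constant predictor: always output the mean of the test targets.
null_mse = mse(y_test, np.full_like(y_test, y_test.mean()))

# A predictor that tracks the targets up to small noise, for comparison.
noisy_pred = y_test + rng.normal(scale=0.1, size=200)
model_mse = mse(y_test, noisy_pred)
```

A model whose test MSE is above the null MSE predicts worse than this constant, which can happen when a large penalty coefficient dominates the data-fitting term.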

### Running everything together


In [8]:
import torch
import random
import get_dataset
import numpy as np
import pandas as pd

from fair_dummies import fair_dummies_learning
from fair_dummies import utility_functions
importlib.reload(fair_dummies_learning)
importlib.reload(utility_functions)
importlib.reload(get_dataset)
from sklearn.linear_model import LinearRegression
import warnings
from scipy import stats
warnings.filterwarnings('ignore')
import tensorflow as tf
seed = 123
for i in range(20):
    X, A, Y, X_cal, A_cal, Y_cal, X_test, A_test, Y_test = get_dataset.get_train_test_data(base_path, dataset, seed, dim = 2, specified="PctUnemployed")

    specified_density_te = utility_functions.MAF_density_estimation(np.concatenate((Y_cal, Y_test)), np.concatenate((A_cal, A_test)), Y_test, A_test)
    ##############
    y_perm_index = np.squeeze(utility_functions.generate_X_CPT(50,1000,specified_density_te))
    A_perm_index = np.argsort(y_perm_index)
    specified_At = A_test[A_perm_index]
    ############## 
    print("###################################################")
    model = NetRegression(X.shape[1], 1)

    num_epochs = 20
    lr = 0.001

    # $\chi^2|_1$
    penalty_coefficient = 100
    penalty = chi_squared_l1_kde

    model = regularized_learning(X, Y, A, model=model, fairness_penalty=penalty, lr=lr,
                                 num_epochs=num_epochs, penalty=penalty_coefficient)

    # mse, hgr_infty = evaluate(model, X_test, Y_test, A_test[:,[0]])
    Yhat_out_test = model(torch.tensor(X_test.astype(np.float32))).detach().flatten().numpy()
    Yhat_out_cal = model(torch.tensor(X_cal.astype(np.float32))).detach().flatten().numpy()
    # print("MSE:{} HGR_infty:{}".format(mse, hgr_infty))
    print("MSE:{}".format(np.mean((Yhat_out_test - Y_test)**2)))
    print("Null-MSE:{}".format(np.mean((np.mean(Y_test) - Y_test)**2)))

    p_val = utility_functions.fair_dummies_test_regression(Yhat_out_cal,
                                                A_cal,
                                                Y_cal,
                                                Yhat_out_test,
                                                A_test,
                                                Y_test,
                                                num_reps = 1,
                                                num_p_val_rep = 1000,
                                                reg_func_name = "RF",
                                                test_type = "MAF_CPT_onA",
                                                transform = False,
                                                specified_density = specified_density_te,
                                                return_vec = True,
                                                specified_At = specified_At)


 Epoch 1/1000 
	 loss: 1.4303, val_loss: 1.3014

 Epoch 26/1000 
	 loss: 1.0061, val_loss: 0.8239

 Epoch 51/1000 
	 loss: 0.9977, val_loss: 0.7856

 Epoch 76/1000 
	 loss: 0.9897, val_loss: 0.7829

 Epoch 101/1000 
	 loss: 0.9951, val_loss: 0.7826

 Epoch 126/1000 
	 loss: 0.9834, val_loss: 0.8131
Epoch 129: early stopping
###################################################
MSE:1.18290958191897
Null-MSE:0.960140145461198
Fair dummies test (regression score), p-value: 0.000999000999000999

 Epoch 1/1000 
	 loss: 1.3519, val_loss: 1.3349

 Epoch 26/1000 
	 loss: 0.8082, val_loss: 0.8848

 Epoch 51/1000 
	 loss: 0.7968, val_loss: 0.8789

 Epoch 76/1000 
	 loss: 0.7940, val_loss: 0.8645

 Epoch 101/1000 
	 loss: 0.7801, val_loss: 0.8548

 Epoch 126/1000 
	 loss: 0.7739, val_loss: 0.8447

 Epoch 151/1000 
	 loss: 0.7697, val_loss: 0.8509

 Epoch 176/1000 
	 loss: 0.7589, val_loss: 0.8552

 Epoch 201/1000 
	 loss: 0.7955, val_loss: 0.8610
Epoch 218: early stopping
#########################

KeyboardInterrupt: 