# CS 378 Homework 4: Mitigation and Model Cards (100 pts)

## Deadline: 11:59 pm, November 7, 2022

This assignment has four parts. First, you will train a model for predicting the creditworthiness of a given individual using the so-called German credit dataset. Second, you will evaluate the model's fairness. Third, you will implement two strategies for the algorithmic mitigation of fairness issues with your model. Finally, you will write a [model card](https://arxiv.org/abs/1810.03993) for a model you trained in the first part.


## Part 1: Training a PyTorch model on the German Credit Dataset

We want you to train your model using PyTorch, as this gives you the flexibility to experiment with different kinds of loss functions. Here is the code for loading and cleaning up the dataset.

In [1]:
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

Data = pd.read_csv("german_credit_data.csv")
print (Data.columns)
Data.head(10)

Index(['Unnamed: 0', 'Age', 'Sex', 'Job', 'Housing', 'Saving accounts',
       'Checking account', 'Credit amount', 'Duration', 'Purpose', 'Risk'],
      dtype='object')


Unnamed: 0.1,Unnamed: 0,Age,Sex,Job,Housing,Saving accounts,Checking account,Credit amount,Duration,Purpose,Risk
0,0,67,male,2,own,,little,1169,6,radio/TV,good
1,1,22,female,2,own,little,moderate,5951,48,radio/TV,bad
2,2,49,male,1,own,little,,2096,12,education,good
3,3,45,male,2,free,little,little,7882,42,furniture/equipment,good
4,4,53,male,2,free,little,little,4870,24,car,bad
5,5,35,male,1,free,,,9055,36,education,good
6,6,53,male,2,own,quite rich,,2835,24,furniture/equipment,good
7,7,35,male,3,rent,little,moderate,6948,36,car,good
8,8,61,male,1,own,rich,,3059,12,radio/TV,good
9,9,28,male,3,own,little,moderate,5234,30,car,bad


In [2]:
Data['Saving accounts'] = Data['Saving accounts'].map({"little":0,"moderate":1,"quite rich":2 ,"rich":3 });
Data['Saving accounts'] = Data['Saving accounts'].fillna(Data['Saving accounts'].dropna().mean())

Data['Checking account'] = Data['Checking account'].map({"little":0,"moderate":1,"rich":2 });
Data['Checking account'] = Data['Checking account'].fillna(Data['Checking account'].dropna().mean())

Data['Sex'] = Data['Sex'].map({"male":0,"female":1}).astype(float)

Data['Housing'] = Data['Housing'].map({"own":0,"free":1,"rent":2}).astype(float)

Data['Purpose'] = Data['Purpose'].map({'radio/TV':0, 'education':1, 'furniture/equipment':2, 'car':3, 'business':4,
       'domestic appliances':5, 'repairs':6, 'vacation/others':7}).astype(float)

Data['Risk'] = Data['Risk'].map({"good":0,"bad":1}).astype(float)

Data.head(10)

Unnamed: 0.1,Unnamed: 0,Age,Sex,Job,Housing,Saving accounts,Checking account,Credit amount,Duration,Purpose,Risk
0,0,67,0.0,2,0.0,0.456548,0.0,1169,6,0.0,0.0
1,1,22,1.0,2,0.0,0.0,1.0,5951,48,0.0,1.0
2,2,49,0.0,1,0.0,0.0,0.651815,2096,12,1.0,0.0
3,3,45,0.0,2,1.0,0.0,0.0,7882,42,2.0,0.0
4,4,53,0.0,2,1.0,0.0,0.0,4870,24,3.0,1.0
5,5,35,0.0,1,1.0,0.456548,0.651815,9055,36,1.0,0.0
6,6,53,0.0,2,0.0,2.0,0.651815,2835,24,2.0,0.0
7,7,35,0.0,3,2.0,0.0,1.0,6948,36,3.0,0.0
8,8,61,0.0,1,0.0,3.0,0.651815,3059,12,0.0,0.0
9,9,28,0.0,3,0.0,0.0,1.0,5234,30,3.0,1.0


And here is the code for performing a train-test split.

In [3]:
import numpy as np
import sklearn

X = Data.drop(columns=['Risk'])
Y = Data['Risk']

from sklearn.model_selection import train_test_split

train_x, test_x, train_y, test_y = train_test_split(X, Y, test_size=0.2, random_state=137)

Now you will use PyTorch to implement a couple of models. Below, we give you some code to translate your training and test data into PyTorch tensors.

In [4]:
import torch
from torch.autograd import Variable

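from sklearn.preprocessing import StandardScaler

# Fit the scaler on the training split only, then reuse its statistics on the test split
scaler = StandardScaler()
train_x = scaler.fit_transform(train_x)
test_x = scaler.transform(test_x)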

train_x = torch.from_numpy(train_x.astype(np.float32))
test_x = torch.from_numpy(test_x.astype(np.float32))

# Convert train_y from a pandas Series to a Python list
train_y = list(train_y)

# Convert the labels to a float32 torch tensor
train_y = torch.as_tensor(train_y, dtype = torch.float32)
test_y = torch.as_tensor(list(test_y), dtype=torch.float32)

train_y = train_y.view(train_y.shape[0],1)
test_y = test_y.view(test_y.shape[0],1)

n_samples,n_features=train_x.shape
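
A standardizer normally learns its mean and standard deviation from the training split only and reuses them on the test split, so that both splits live on the same scale. The following is a minimal pure-Python sketch of that idea. The helper names `fit_standardizer` and `apply_standardizer` are hypothetical, and the numbers are toy values rather than rows from the dataset.

```python
from statistics import fmean, pstdev

def fit_standardizer(column):
    """Return (mean, std) learned from a training column."""
    mean = fmean(column)
    std = pstdev(column) or 1.0  # guard against constant columns
    return mean, std

def apply_standardizer(column, mean, std):
    """Standardize any column with previously learned statistics."""
    return [(v - mean) / std for v in column]

train_ages = [20.0, 30.0, 40.0, 50.0]  # toy values (assumption)
test_ages = [30.0, 60.0]               # toy values (assumption)

mean, std = fit_standardizer(train_ages)
train_scaled = apply_standardizer(train_ages, mean, std)
test_scaled = apply_standardizer(test_ages, mean, std)  # reuses the train statistics
print(test_scaled)
```

The test values are not re-centered on their own mean, which is the point: a test value of 60 stays far from the training mean after scaling.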

#### Q1: Consider the unfinished logistic regression model (recall that a logistic regressor is essentially a 1-layer neural network). Finish the model definition by writing the "forward" method.  (2 points)

In [5]:
class Logistic_Reg_model(torch.nn.Module):
 def __init__(self,no_input_features):
    super(Logistic_Reg_model,self).__init__()
    self.layer1=torch.nn.Linear(no_input_features,20)
    self.layer2=torch.nn.Linear(20,1)
    self.sigmoid = torch.nn.Sigmoid()
 def forward(self,x):
    # YOUR ANSWER HERE
    return self.sigmoid(self.layer2(self.layer1(x)))
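
Although the class above stacks two `Linear` layers, there is no nonlinearity between them, so the composition is still a single affine map followed by a sigmoid, i.e. a logistic regressor. The sketch below checks this on tiny hand-written weights in pure Python. The weights and the helper `affine` are illustrative assumptions, not values from the trained model.

```python
import math

def affine(W, b, x):
    """Compute W @ x + b for a list-of-rows matrix W."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Toy weights: layer1 maps 3 -> 2, layer2 maps 2 -> 1.
W1, b1 = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.5]], [0.1, -0.2]
W2, b2 = [[2.0, -1.0]], [0.3]

# Collapse the two layers into one: W = W2 @ W1, b = W2 @ b1 + b2.
W = [[sum(W2[0][k] * W1[k][j] for k in range(2)) for j in range(3)]]
b = [sum(W2[0][k] * b1[k] for k in range(2)) + b2[0]]

x = [1.0, 2.0, 3.0]
two_layer = sigmoid(affine(W2, b2, affine(W1, b1, x))[0])
one_layer = sigmoid(affine(W, b, x)[0])
print(two_layer, one_layer)  # the two values agree up to rounding
```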

#### Q2:  Now write the training code for the model. (3 points)

In [112]:
class Dataset(torch.utils.data.Dataset):
    
    # Wrap feature and label tensors in a torch Dataset.
    def __init__(self, X, Y):
        self.X = X
        self.Y = Y
        
    def __len__(self):
        return len(self.Y)

    def __getitem__(self, index):
        X = self.X[index]
        Y = self.Y[index]

        return {'X': X, 'Y': Y}
    
# YOUR ANSWER HERE

model = Logistic_Reg_model(10)
model.train()

epochs = 10
lr = 1e-3

# Binary cross-entropy loss, since the model outputs a probability
criterion = torch.nn.BCELoss()
# Adam optimizer over the model's parameters, using the learning rate lr defined above
optimizer = torch.optim.Adam(model.parameters(), lr=lr)
batch_size = 1

trainSignData = Dataset(train_x, train_y)
trainDataLoader = torch.utils.data.DataLoader(trainSignData, shuffle=True, batch_size=batch_size)
testSignData = Dataset(test_x, test_y)
testDataLoader = torch.utils.data.DataLoader(testSignData, shuffle=True, batch_size=batch_size)
print ("hey")
def train(epoch, net, trainDataLoader, optimizer, criterion, validDataLoader, intraininglambda=0, train_x=None, train_y=None):
    net.train()
    train_loss = 0
    for sample in trainDataLoader:

        inputs, targets = sample['X'], sample['Y']
        #print (inputs, targets)
        optimizer.zero_grad()
        outputs = net.forward(inputs)
        #print ("outputs/targets", outputs, targets)
        loss = criterion(outputs, targets)
        if intraininglambda > 0:
            loss += intraininglambda*separationLoss(net, train_x, train_y)
        #print (loss)
        loss.backward()
        optimizer.step()
        #print (loss.item())
        train_loss += loss.item() * batch_size

    net.eval()
    valid_loss = 0
    # No gradients are needed for validation
    with torch.no_grad():
        for sample in validDataLoader:
            inputs, targets = sample['X'], sample['Y']
            outputs = net(inputs)
            loss = criterion(outputs, targets)
            valid_loss += loss.item() * batch_size

    # calculate average losses
    train_loss = train_loss/len(trainDataLoader.sampler)
    valid_loss = valid_loss/len(validDataLoader.sampler)

    if epoch % 1 == 0:
        print('Epoch: {} \tTraining Loss: {:.6f} \tValidation Loss: {:.6f}'.format(epoch, train_loss, valid_loss))
    return train_loss, valid_loss

for epoch in range(epochs):
    #print ("hey")
    print ("epoch", epoch)
    train(epoch, model, trainDataLoader, optimizer, criterion, testDataLoader)

# Save the model
torch.save(model, "model.pt")


hey
epoch 0
Epoch: 0 	Training Loss: 0.612062 	Validation Loss: 0.560175
epoch 1
Epoch: 1 	Training Loss: 0.570743 	Validation Loss: 0.552832
epoch 2
Epoch: 2 	Training Loss: 0.569530 	Validation Loss: 0.554008
epoch 3
Epoch: 3 	Training Loss: 0.568433 	Validation Loss: 0.560910
epoch 4
Epoch: 4 	Training Loss: 0.567952 	Validation Loss: 0.552050
epoch 5
Epoch: 5 	Training Loss: 0.568924 	Validation Loss: 0.554955
epoch 6
Epoch: 6 	Training Loss: 0.569138 	Validation Loss: 0.557713
epoch 7
Epoch: 7 	Training Loss: 0.568228 	Validation Loss: 0.556199
epoch 8
Epoch: 8 	Training Loss: 0.568158 	Validation Loss: 0.561901
epoch 9
Epoch: 9 	Training Loss: 0.568025 	Validation Loss: 0.553983


Here is some code for testing your model's accuracy.

In [113]:
def accuracy(model, test_x):
    model.eval()
    with torch.no_grad():
     y_pred=model(test_x)
     y_pred_class=y_pred.round()
     accuracy=(y_pred_class.eq(test_y).sum())/float(test_y.shape[0])
     print(accuracy.item())
accuracy(model, test_x)

0.7300000190734863


#### Q3: Now it's time to try out some more complex neural networks! Your goal is to design one that does better, in terms of overall accuracy, than the logistic regression model. You are free to select the architecture of this model as you please. It is also fine if you cannot find a model that performs better than the logistic regressor. In that case, please describe the space of models that you tried out in at least 50 words. (10 points)

In [114]:
# YOUR ANSWER HERE
class DNN_model(torch.nn.Module):
 def __init__(self,no_input_features):
    super(DNN_model,self).__init__()
    self.layer1=torch.nn.Linear(no_input_features,20)
    self.layer2=torch.nn.Linear(20,20)
    self.layer3 = torch.nn.Linear(20,20)
    self.layer4=torch.nn.Linear(20,1)
    
    self.relu = torch.nn.ReLU()
    self.sigmoid = torch.nn.Sigmoid()
    
    self.dropout = torch.nn.Dropout(p=0.2)
    
    self.seq = torch.nn.Sequential(
        self.layer1,
        self.dropout,
        self.relu,
        self.layer2,
        self.dropout,
        self.relu,
        self.layer3,
        self.dropout,
        self.relu,
        self.layer4
    )

 def forward(self,x):
    # YOUR ANSWER HERE
    return self.sigmoid(self.seq(x))

dnnmodel = DNN_model(10)
dnnepochs = 15
dnnlr = 5e-4
dnnoptimizer = torch.optim.Adam(dnnmodel.parameters(), lr=dnnlr, weight_decay=1e-5)
for epoch in range(dnnepochs):
    #print ("hey")
    print ("epoch", epoch)
    train(epoch, dnnmodel, trainDataLoader, dnnoptimizer, criterion, testDataLoader)

torch.save(dnnmodel, "dnnmodel.pt")

print ("DNN Model accuracy:", end = " ")
accuracy(dnnmodel, test_x)

epoch 0
Epoch: 0 	Training Loss: 0.642169 	Validation Loss: 0.612226
epoch 1
Epoch: 1 	Training Loss: 0.589945 	Validation Loss: 0.592367
epoch 2
Epoch: 2 	Training Loss: 0.587719 	Validation Loss: 0.586339
epoch 3
Epoch: 3 	Training Loss: 0.582025 	Validation Loss: 0.576523
epoch 4
Epoch: 4 	Training Loss: 0.581849 	Validation Loss: 0.568267
epoch 5
Epoch: 5 	Training Loss: 0.574022 	Validation Loss: 0.559901
epoch 6
Epoch: 6 	Training Loss: 0.573381 	Validation Loss: 0.552487
epoch 7
Epoch: 7 	Training Loss: 0.566396 	Validation Loss: 0.546216
epoch 8
Epoch: 8 	Training Loss: 0.558020 	Validation Loss: 0.543778
epoch 9
Epoch: 9 	Training Loss: 0.556758 	Validation Loss: 0.544059
epoch 10
Epoch: 10 	Training Loss: 0.557846 	Validation Loss: 0.543512
epoch 11
Epoch: 11 	Training Loss: 0.551485 	Validation Loss: 0.542241
epoch 12
Epoch: 12 	Training Loss: 0.554462 	Validation Loss: 0.542659
epoch 13
Epoch: 13 	Training Loss: 0.544122 	Validation Loss: 0.537893
epoch 14
Epoch: 14 	Traini

My deep model performed a little better (about 3%) most of the time. I tried using the leaky ReLU function instead, so as not to completely zero out negative values, but that seemed to make it worse. I also tried adding a residual connection that adds layer1's output to layer3's output, but that didn't help, so I took it out. Weight decay helped during training, but it made no difference once I added dropout between the layers.

## Part 2: Evaluation of Fairness

In the following questions, "your model" refers to the best model you discovered in your exploration in Part 1.  We assume "Sex" to be the sensitive characteristic $A$ of the input. As in class, $R$ represents the classifier output (is a person predicted to be risky?) and $Y$ represents the ground-truth output (is the person actually risky?).

#### Q4: Determine if the model exhibits the property of independence ($R \perp A$), at least approximately.  Here, "approximately" means that $|P(R | A = 0) - P(R | A = 1)| < \epsilon$. We set $\epsilon = 0.05$. (5 points)

In [96]:
# YOUR ANSWER HERE

def independence(model, data, labels):
    
    data = pd.DataFrame(data.numpy())
    
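    sexIndex = 2
    # 'Sex' was encoded as male=0, female=1 before standardization, so a negative
    # standardized value is the A = 0 (male) group. The female/male names below are
    # therefore swapped, which does not change the absolute gap that is returned.
    femaleData = data[data[sexIndex] < 0]
    
    maleData = data[data[sexIndex] > 0]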
    
    #print (len(maleData))
    #print (len(data))
    
    femaleData = torch.tensor(femaleData.values)
    maleData = torch.tensor(maleData.values)
    
    
    femalePreds = model(femaleData)
    malePreds = model(maleData)
    
    femalePosProb = (torch.sum(femalePreds)/len(femaleData))
    malePosProb = (torch.sum(malePreds)/len(maleData))
    
    print ("female pos prob aka P(R | A = 0):", femalePosProb)
    print ("male pos prob aka P(R | A = 1):", malePosProb)
    independenceLoss = abs(femalePosProb - malePosProb)
    

    eps = 0.05
    return (independenceLoss.item() < eps)
    

print ("Does the model satisfy independence: ", independence(dnnmodel, test_x, test_y))

    

    

female pos prob aka P(R | A = 0): tensor(0.2944, grad_fn=<DivBackward0>)
male pos prob aka P(R | A = 1): tensor(0.3720, grad_fn=<DivBackward0>)
Does the model satisfy independence:  False
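
The check above averages the model's raw scores within each group, which approximates $P(R = 1 | A = a)$ by the mean predicted probability. An alternative is to threshold the scores first and compare the fraction of positive decisions. Below is a minimal sketch of that variant on toy data. The function names and the 0.5 threshold are assumptions for illustration.

```python
def positive_rate(scores, groups, group, threshold=0.5):
    """Fraction of members of `group` whose score is above the threshold."""
    picked = [s for s, g in zip(scores, groups) if g == group]
    return sum(s > threshold for s in picked) / len(picked)

def independence_gap(scores, groups, threshold=0.5):
    """|P(R = 1 | A = 0) - P(R = 1 | A = 1)| using hard predictions."""
    return abs(positive_rate(scores, groups, 0, threshold)
               - positive_rate(scores, groups, 1, threshold))

scores = [0.9, 0.2, 0.6, 0.4, 0.7, 0.1, 0.3, 0.2]  # toy scores (assumption)
groups = [0, 0, 0, 0, 1, 1, 1, 1]                  # toy sensitive attribute

gap = independence_gap(scores, groups)
print(gap, gap < 0.05)
```

The two variants can disagree: mean scores can be close across groups even when the thresholded decisions are not, so it is worth stating which one a reported number refers to.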


#### Q5: Now determine if the model (approximately) satisfies the separation criterion ($R \perp A | Y$) . Once again, we define approximate separation as $| P(R = r | A = 1, Y = y) - P(R = r | A = 0, Y = y)| < \epsilon$ where $\epsilon = 0.05$.  (5 points)

In [121]:
# YOUR ANSWER HERE

def falseaLossTrueLossSeparationHelper(model, data, labels, verbose=True):
    data = pd.DataFrame(data.numpy())
    labels = pd.DataFrame(labels.numpy())
    sexIndex = 2  # standardized 'Sex' column: negative values are A = 0, positive values are A = 1

    femaleTrueData = data[(data[sexIndex] < 0) & (labels == 1).squeeze()]
    femaleFalseData = data[(data[sexIndex] < 0) & (labels == 0).squeeze()]

    maleTrueData = data[(data[sexIndex] > 0) & (labels == 1).squeeze()]
    maleFalseData = data[(data[sexIndex] > 0) & (labels == 0).squeeze()]
    
    femaleTrueData = torch.tensor(femaleTrueData.values)
    femaleFalseData = torch.tensor(femaleFalseData.values)
    
    maleTrueData = torch.tensor(maleTrueData.values)
    maleFalseData = torch.tensor(maleFalseData.values)
    
    femaleTruePreds = model(femaleTrueData)
    femaleFalsePreds = model(femaleFalseData)
    
    maleTruePreds = model(maleTrueData)
    maleFalsePreds = model(maleFalseData)
    
    femaleTruePosProb = (torch.sum(femaleTruePreds)/len(femaleTrueData))
    femaleFalsePosProb = (torch.sum(femaleFalsePreds)/len(femaleFalseData))
    
    maleTruePosProb = (torch.sum(maleTruePreds)/len(maleTrueData))
    maleFalsePosProb = (torch.sum(maleFalsePreds)/len(maleFalseData))
    
    #print (femaleTruePosProb, maleTruePosProb, femaleFalsePosProb, maleFalsePosProb)

    falseLoss = abs(femaleFalsePosProb - maleFalsePosProb)
    trueLoss = abs(femaleTruePosProb - maleTruePosProb)
    if verbose:
        print ("|P(R = 1 | A = 1, Y = 0) - P(R = 1 | A = 0, Y = 0)| = ", falseLoss.item())
        print ("|P(R = 1 | A = 1, Y = 1) - P(R = 1 | A = 0, Y = 1)| = ", trueLoss.item())
    return falseLoss, trueLoss
    
def maxSeparation(model, data, labels):
    res = falseaLossTrueLossSeparationHelper(model, data, labels)
    return max(res[0].item(), res[1].item())
    
def separation(model, data, labels):
    maxSep = maxSeparation(model, data, labels)
    eps = 0.05
    return (maxSep < eps)

def separationLoss(model, data, labels):
    falseLoss, trueLoss = falseaLossTrueLossSeparationHelper(model, data, labels, verbose=False)
    return falseLoss + trueLoss

    
    


print ("Does the model satisfy separation: ", separation(dnnmodel, test_x, test_y))
    
    

|P(R = 1 | A = 1, Y = 0) - P(R = 1 | A = 0, Y = 0)| =  0.010080650448799133
|P(R = 1 | A = 1, Y = 1) - P(R = 1 | A = 0, Y = 1)| =  0.10458672046661377
Does the model satisfy separation:  False
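
Separation compares the groups within each true label, which for hard predictions amounts to comparing false positive rates ($Y = 0$) and true positive rates ($Y = 1$). The following sketch computes both gaps from 0/1 predictions on toy data. The helper names are hypothetical and the data are made up for illustration.

```python
def conditional_rate(preds, groups, labels, group, label):
    """P(R = 1 | A = group, Y = label) estimated from hard predictions."""
    picked = [p for p, g, y in zip(preds, groups, labels)
              if g == group and y == label]
    return sum(picked) / len(picked)

def separation_gaps(preds, groups, labels):
    """Return (gap for Y = 0, gap for Y = 1) between the two groups."""
    return tuple(
        abs(conditional_rate(preds, groups, labels, 0, y)
            - conditional_rate(preds, groups, labels, 1, y))
        for y in (0, 1)
    )

preds  = [1, 0, 1, 0, 1, 1, 0, 0]  # toy hard predictions (assumption)
groups = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 1, 0, 0, 1, 1, 0, 0]

fpr_gap, tpr_gap = separation_gaps(preds, groups, labels)
print(fpr_gap, tpr_gap, max(fpr_gap, tpr_gap) < 0.05)
```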


#### Q6: Finally, determine if the model (approximately) satisfies the sufficiency criterion $(A \perp Y | R)$. Approximate sufficiency is defined in a similar way as in Q4 and Q5. (5 points)

In [116]:
# YOUR ANSWER HERE
def sufficiency(model, data, labels):
    
    preds = pd.DataFrame(model(data).round().detach().numpy()).squeeze()
    data = pd.DataFrame(data.numpy())
    labels = pd.DataFrame(labels.numpy()).squeeze()
    
    predFalseData = data[preds == 0]
    predFalseLabels = labels[preds == 0]
    predTrueData = data[preds == 1]
    predTrueLabels = labels[preds == 1]
    
    sexIndex = 2
    
    predFalseY0Data = predFalseData[predFalseLabels == 0]
    predFalseY1Data = predFalseData[predFalseLabels == 1]    
    xd = predFalseY0Data[sexIndex].value_counts()
    femaleCode = xd.keys()[0]
    maleCode = xd.keys()[1]
    probFemalePredFalseY0 = xd[femaleCode]/(xd[femaleCode] + xd[maleCode])
    
    xd = predFalseY1Data[sexIndex].value_counts()
    probFemalePredFalseY1 = xd[femaleCode]/(xd[femaleCode] + xd[maleCode])
    
    predFalseSufficiencyLoss = abs(probFemalePredFalseY0 - probFemalePredFalseY1) # making sure A approx indep of Y
    
    
    predTrueY0Data = predTrueData[predTrueLabels == 0]
    predTrueY1Data = predTrueData[predTrueLabels == 1]
    xd = predTrueY0Data[sexIndex].value_counts()
    femaleCode = xd.keys()[0]
    maleCode = xd.keys()[1]
    probFemalePredTrueY0 = xd[femaleCode]/(xd[femaleCode] + xd[maleCode])
    
    xd = predTrueY1Data[sexIndex].value_counts()
    probFemalePredTrueY1 = xd[femaleCode]/(xd[femaleCode] + xd[maleCode])
    
    
    predTrueSufficiencyLoss = abs(probFemalePredTrueY0 - probFemalePredTrueY1)
    
    #print (probFemalePredFalseY0, probFemalePredFalseY1, probFemalePredTrueY0, probFemalePredTrueY1)
    eps = 0.05
    return (predFalseSufficiencyLoss < eps and predTrueSufficiencyLoss < eps).item()
    
print ("Does the model satisfy sufficiency: ", sufficiency(dnnmodel, test_x, test_y))
    
    
    
    
    
    
    
    
    

Does the model satisfy sufficiency:  False
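
The check above compares $P(A | R, Y)$ across values of $Y$. Conditional independence $A \perp Y | R$ can equally be tested by comparing $P(Y = 1 | R = r, A = a)$ across the two groups, which is the calibration-style reading of sufficiency. A minimal sketch of that formulation on toy data follows. The helper names are hypothetical, and the two formulations need not give the same verdict at a fixed $\epsilon$.

```python
def label_rate(preds, groups, labels, group, pred):
    """P(Y = 1 | R = pred, A = group) estimated from counts."""
    picked = [y for p, g, y in zip(preds, groups, labels)
              if g == group and p == pred]
    return sum(picked) / len(picked)

def sufficiency_gaps(preds, groups, labels):
    """Return (gap for R = 0, gap for R = 1) between the two groups."""
    return tuple(
        abs(label_rate(preds, groups, labels, 0, r)
            - label_rate(preds, groups, labels, 1, r))
        for r in (0, 1)
    )

preds  = [1, 1, 0, 0, 1, 1, 0, 0]  # toy hard predictions (assumption)
groups = [0, 0, 0, 0, 1, 1, 1, 1]
labels = [1, 0, 0, 0, 1, 1, 1, 0]

gap_r0, gap_r1 = sufficiency_gaps(preds, groups, labels)
print(gap_r0, gap_r1, max(gap_r0, gap_r1) < 0.05)
```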


## Part 3: Fairness Mitigation

#### Q7: Your goal now is to implement the preprocessing (dataset repair) strategy for mitigating fairness issues that we discussed in class. Specifically, note that your model assigns a numerical score to each input. Rewrite these scores so that when you threshold on the new scores, the resulting classifiers will satisfy the separation criterion (approximately). (15 points)
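
One possible reading of the repair step is a percentile mapping: find where a value sits within its own group, look up the other group's value at the same position, and move the value toward it. The sketch below does this for a single number using `bisect`. The function names, the averaging rule, and the toy values are assumptions for illustration and may differ from the method presented in class.

```python
from bisect import bisect_right

def percentile_rank(sorted_vals, x):
    """Position of the last value <= x, as a fraction of the list length."""
    idx = max(bisect_right(sorted_vals, x) - 1, 0)
    return idx / len(sorted_vals)

def value_at_rank(sorted_vals, rank):
    """Value sitting at the given rank (between 0 and 1) in sorted_vals."""
    idx = min(int(rank * len(sorted_vals)), len(sorted_vals) - 1)
    return sorted_vals[idx]

def repair(x, own_sorted, other_sorted):
    """Average x with the other group's value at the same percentile."""
    rank = percentile_rank(own_sorted, x)
    return (x + value_at_rank(other_sorted, rank)) / 2

group_a = sorted([0.1, 0.2, 0.3, 0.4])  # toy values for one group (assumption)
group_b = sorted([0.3, 0.5, 0.7, 0.9])  # toy values for the other group

print(repair(0.2, group_a, group_b))  # 0.2 is second in group_a; group_b's second value is 0.5
```

Using `bisect_right` keeps each lookup logarithmic in the group size, and the sorted lists should be built from the training split so the same mapping can be applied to test data.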

In [117]:
# YOUR ANSWER HERE
from collections import defaultdict

def calculatePercentile(arr, x):
    low = 0
    high = len(arr)
    while low + 1 < high:
        mid = low + (high - low)//2
        if arr[mid] <= x:
            low = mid
        else:
            high = mid
    
    return low/len(arr) # technically percentile/100
        
def convertDataset(data, valsToSortedList):
    sexIndex = 2
    #print (data)
    #print (valsToSortedList)
    for dataCounter in range(len(data)):
        #row = data.iloc[dataCounter]
        if data.at[dataCounter, sexIndex] < 0: # female
            for i in range(10):
                if i == sexIndex:
                    continue
                percentile = calculatePercentile(valsToSortedList[0][i], data.at[dataCounter, i])
                femaleValue = data.at[dataCounter, i]
                maleValue = valsToSortedList[1][i][int(percentile * len(valsToSortedList[1][i]))]
                median = (femaleValue + maleValue)/2 # also happens to be average
                #print (femaleValue, median)
                #row[i] = median
                data.at[dataCounter, i] = median
                #print (row[i], data.iloc[dataCounter][i])
                #if femaleValue != median:
                #    print (femaleValue, median, data.iloc[dataCounter][i])
        else: # male
            for i in range(10):
                if i == sexIndex:
                    continue
                percentile = calculatePercentile(valsToSortedList[1][i], data.at[dataCounter, i])
                maleValue = data.at[dataCounter, i]
                femaleValue = valsToSortedList[0][i][int(percentile * len(valsToSortedList[0][i]))]
                median = (femaleValue + maleValue)/2 # also happens to be average
                #row[i] = median
                data.at[dataCounter, i] = median
    return data
                
def datasetrepair(data):
    
    # data preprocessing, slide 98 of the fairness slides
    
    #print (data.shape)
    #data = pd.DataFrame(data.numpy())
    #print (data)

    sexIndex = 2
    femaleData = data[data[sexIndex] < 0]
    #print (len(femaleData))

    maleData = data[data[sexIndex] > 0]
    
    valsToSortedList = defaultdict(lambda: dict()) # maps sensitive attribute value to a map of columns to sorted lists
    #print (femaleData[2].to_numpy().tolist())
    for i in range(10):
        if i == 2:
            continue
        col = data[i]
        
        femaleCol = femaleData[i].to_numpy().tolist()
        femaleCol.sort()
        maleCol = maleData[i].to_numpy().tolist()
        maleCol.sort()
        valsToSortedList[0][i] = femaleCol
        valsToSortedList[1][i] = maleCol
        
    newdata = convertDataset(data.copy(), valsToSortedList)
    return newdata, valsToSortedList

from copy import deepcopy
#print ("before:\n", test_x)
newtrain_x, valsToSortedList = datasetrepair(pd.DataFrame(deepcopy(train_x).numpy())) # the medians/percentiles will be based off the train set
newtest_x = convertDataset(pd.DataFrame(deepcopy(test_x).numpy()), valsToSortedList) # apply the training set medians/percentiles to the test set
newtrain_x = torch.tensor(newtrain_x.values).float()
newtest_x = torch.tensor(newtest_x.values).float()


In [118]:
fairTrainSignData = Dataset(newtrain_x, train_y)
fairTrainDataLoader = torch.utils.data.DataLoader(fairTrainSignData, shuffle=True, batch_size=batch_size)
fairTestSignData = Dataset(newtest_x, test_y)
fairTestDataLoader = torch.utils.data.DataLoader(fairTestSignData, shuffle=True, batch_size=batch_size)

fairdnnmodel = DNN_model(10)
fairdnnoptimizer = torch.optim.Adam(fairdnnmodel.parameters(), lr=5e-4, weight_decay=1e-5)
for epoch in range(15):
    #print ("hey")
    print ("epoch", epoch)
    train(epoch, fairdnnmodel, fairTrainDataLoader, fairdnnoptimizer, criterion, fairTestDataLoader)


epoch 0
Epoch: 0 	Training Loss: 0.670192 	Validation Loss: 0.608502
epoch 1
Epoch: 1 	Training Loss: 0.604473 	Validation Loss: 0.595125
epoch 2
Epoch: 2 	Training Loss: 0.595169 	Validation Loss: 0.589142
epoch 3
Epoch: 3 	Training Loss: 0.594211 	Validation Loss: 0.583797
epoch 4
Epoch: 4 	Training Loss: 0.586173 	Validation Loss: 0.579704
epoch 5
Epoch: 5 	Training Loss: 0.585194 	Validation Loss: 0.578421
epoch 6
Epoch: 6 	Training Loss: 0.570033 	Validation Loss: 0.571879
epoch 7
Epoch: 7 	Training Loss: 0.576002 	Validation Loss: 0.571544
epoch 8
Epoch: 8 	Training Loss: 0.571248 	Validation Loss: 0.569176
epoch 9
Epoch: 9 	Training Loss: 0.581584 	Validation Loss: 0.566379
epoch 10
Epoch: 10 	Training Loss: 0.564305 	Validation Loss: 0.565252
epoch 11
Epoch: 11 	Training Loss: 0.563829 	Validation Loss: 0.558052
epoch 12
Epoch: 12 	Training Loss: 0.565980 	Validation Loss: 0.552156
epoch 13
Epoch: 13 	Training Loss: 0.544339 	Validation Loss: 0.545850
epoch 14
Epoch: 14 	Traini

#### Q8: Compute the overall accuracy of the fairness-mitigated model obtained in your answer to Q7. Repeat your analysis of independence, separation, and sufficiency, previously performed in your answers to Q4-Q6, on this model.    (5 points) 

In [119]:
# YOUR ANSWER HERE
# Save the model
torch.save(fairdnnmodel, "fairdnnmodel.pt")
print ("Fair DNN Model accuracy: ", end = " ")
accuracy(fairdnnmodel, newtest_x)
print ("Does the model satisfy independence: ", independence(fairdnnmodel, newtest_x, test_y))
print ("Does the model satisfy separation: ", separation(fairdnnmodel, newtest_x, test_y))
print ("Does the model satisfy sufficiency: ", sufficiency(fairdnnmodel, newtest_x, test_y))

Fair DNN Model accuracy:  0.75
female pos prob aka P(R | A = 0): tensor(0.2893, grad_fn=<DivBackward0>)
male pos prob aka P(R | A = 1): tensor(0.3718, grad_fn=<DivBackward0>)
Does the model satisfy independence:  False
|P(R = 1 | A = 1, Y = 0) - P(R = 1 | A = 0, Y = 0)| =  0.028803735971450806
|P(R = 1 | A = 1, Y = 1) - P(R = 1 | A = 0, Y = 1)| =  0.11915519833564758
Does the model satisfy separation:  False
Does the model satisfy sufficiency:  False


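My fair DNN model trained on the preprocessed data satisfies separation only on some runs; the outcome varies from run to run, probably because the random seed is not fixed. On the runs where it helped, the group probabilities were much closer than those of the regular DNN model trained without preprocessing, so the repair moves the model closer to satisfying separation. In the run recorded above, however, the gaps (about 0.029 and 0.119) are slightly larger than the regular model's (about 0.010 and 0.105).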

#### Q9: Now we would like you to implement the in-training strategy for fairness mitigation that we discussed in class. Your goal is to ensure the separation criterion. Suitably modify the loss function of your model to capture this goal (approximately), and retrain your model.    (20 points)

In [131]:
# YOUR ANSWER HERE
fairintrainingdnnmodel = DNN_model(10)
dnnepochs = 15
dnnlr = 5e-4
fairintrainingdnnoptimizer = torch.optim.Adam(fairintrainingdnnmodel.parameters(), lr=dnnlr, weight_decay=1e-5)
fairintrainingcriterion = torch.nn.BCELoss()
for epoch in range(dnnepochs):
    #print ("hey")
    print ("epoch", epoch)
    # uses the optional parameter intraininglambda to indicate that the separationLoss needs to be added to BCELoss
    train(epoch, fairintrainingdnnmodel, fairTrainDataLoader, fairintrainingdnnoptimizer, fairintrainingcriterion, fairTestDataLoader, intraininglambda=5, train_x=train_x, train_y=train_y)

torch.save(fairintrainingdnnmodel, "fairintrainingdnnmodel.pt")


epoch 0
Epoch: 0 	Training Loss: 0.643618 	Validation Loss: 0.610069
epoch 1
Epoch: 1 	Training Loss: 0.605589 	Validation Loss: 0.593865
epoch 2
Epoch: 2 	Training Loss: 0.597293 	Validation Loss: 0.585022
epoch 3
Epoch: 3 	Training Loss: 0.592486 	Validation Loss: 0.575134
epoch 4
Epoch: 4 	Training Loss: 0.577412 	Validation Loss: 0.566276
epoch 5
Epoch: 5 	Training Loss: 0.584024 	Validation Loss: 0.564551
epoch 6
Epoch: 6 	Training Loss: 0.581707 	Validation Loss: 0.566384
epoch 7
Epoch: 7 	Training Loss: 0.581495 	Validation Loss: 0.566173
epoch 8
Epoch: 8 	Training Loss: 0.572393 	Validation Loss: 0.566446
epoch 9
Epoch: 9 	Training Loss: 0.574116 	Validation Loss: 0.566293
epoch 10
Epoch: 10 	Training Loss: 0.567896 	Validation Loss: 0.567272
epoch 11
Epoch: 11 	Training Loss: 0.567201 	Validation Loss: 0.565825
epoch 12
Epoch: 12 	Training Loss: 0.566239 	Validation Loss: 0.571501
epoch 13
Epoch: 13 	Training Loss: 0.565620 	Validation Loss: 0.571690
epoch 14
Epoch: 14 	Traini
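
The in-training objective used above adds a separation penalty, weighted by a factor lambda, to the binary cross-entropy loss. The sketch below spells out that combination in pure Python on toy data, with the penalty defined as the sum over $Y$ of the gap in mean score between the two groups. The function names and the numbers are illustrative assumptions, and a real implementation would compute the same quantities on tensors so that gradients can flow.

```python
import math

def bce(scores, labels):
    """Mean binary cross-entropy of probabilities against 0/1 labels."""
    return -sum(y * math.log(s) + (1 - y) * math.log(1 - s)
                for s, y in zip(scores, labels)) / len(scores)

def mean_score(scores, groups, labels, group, label):
    """Mean score among examples with A = group and Y = label."""
    picked = [s for s, g, y in zip(scores, groups, labels)
              if g == group and y == label]
    return sum(picked) / len(picked)

def soft_separation_penalty(scores, groups, labels):
    """Sum over Y of the gap in mean score between the two groups."""
    return sum(abs(mean_score(scores, groups, labels, 0, y)
                   - mean_score(scores, groups, labels, 1, y))
               for y in (0, 1))

def penalized_loss(scores, groups, labels, lam):
    """Accuracy term plus lam times the fairness term."""
    return bce(scores, labels) + lam * soft_separation_penalty(scores, groups, labels)

scores = [0.8, 0.3, 0.6, 0.2]  # toy probabilities (assumption)
groups = [0, 0, 1, 1]
labels = [1, 0, 1, 0]

print(penalized_loss(scores, groups, labels, lam=5))
```

A larger lambda trades accuracy for a smaller gap; with lambda set to 0 the objective reduces to plain cross-entropy.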

#### Q10: Compute the overall accuracy of the model obtained in your answer to Q9. Repeat your analysis of independence, separation, and sufficiency, previously performed in your answers to Q4-Q6, on this model.   (5 points)

In [132]:
# YOUR ANSWER HERE
print ("Fair in-training DNN Model accuracy:", end = " ")
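# Note: accuracy is measured on the unrepaired test_x here, while the fairness checks below use newtest_x
accuracy(fairintrainingdnnmodel, test_x)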
print ("Does the model satisfy independence: ", independence(fairintrainingdnnmodel, newtest_x, test_y))
print ("Does the model satisfy separation: ", separation(fairintrainingdnnmodel, newtest_x, test_y))
print ("Does the model satisfy sufficiency: ", sufficiency(fairintrainingdnnmodel, newtest_x, test_y))


Fair in-training DNN Model accuracy: 0.7300000190734863
female pos prob aka P(R | A = 0): tensor(0.3123, grad_fn=<DivBackward0>)
male pos prob aka P(R | A = 1): tensor(0.3595, grad_fn=<DivBackward0>)
Does the model satisfy independence:  True
|P(R = 1 | A = 1, Y = 0) - P(R = 1 | A = 0, Y = 0)| =  0.011290639638900757
|P(R = 1 | A = 1, Y = 1) - P(R = 1 | A = 0, Y = 1)| =  0.06934034824371338
Does the model satisfy separation:  False
Does the model satisfy sufficiency:  False


#### Q11: Comment on the tradeoffs between overall accuracy and fairness that you see in your experimental exploration above. (5 points)

# YOUR ANSWER HERE

I actually expected to see worse accuracy once fairness was added (as the model now needs to optimize something besides accuracy), but this was not the case. I guess the model was able to find solutions that are fair and just as accurate as before, but it needs a push in the right direction to find them. For the in-training separation loss model in particular, the validation loss was surprisingly low and accuracy did not drop. I did notice some variation from run to run on all the models (around 1-2%). Still, I ran them numerous times and found similarly good results from all the models.

I needed to make the lambda for the in-training model higher than I expected: I chose 5 because in the end I was getting a validation loss of about 0.55 and separation losses of about 0.1, and I wanted the two terms to be about equal. The in-training model was very close to passing our 0.05 margin for separation. Interestingly, independence passed with the addition of this loss function, and accuracy was not compromised. Finally, I made the in-training model use the preprocessed data (which may have been the intention of the lab, I am not sure), which reduced the maximum separation distance to 0.069, just 0.019 over our limit. Accuracy was still not compromised.
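
Because the results vary from run to run, one way to make the comparison more robust is to repeat each experiment over several seeds and report the mean and spread. The sketch below shows only the bookkeeping: `run_once` is a hypothetical stand-in for "train a model with this seed and return its test accuracy", and the numbers it produces are placeholders, not results from this homework.

```python
import random
from statistics import fmean, stdev

def run_once(seed):
    """Hypothetical stand-in for training one model and returning its accuracy."""
    rng = random.Random(seed)
    return 0.7 + 0.05 * rng.random()  # placeholder value, not a real result

def summarize(metric_fn, seeds):
    """Mean and sample standard deviation of a metric across seeds."""
    values = [metric_fn(s) for s in seeds]
    return fmean(values), stdev(values)

mean_acc, std_acc = summarize(run_once, seeds=range(5))
print(f"accuracy: {mean_acc:.3f} +/- {std_acc:.3f}")
```

In a PyTorch run, the seed would also need to be passed to the framework's own random number generators for the repetition to be reproducible.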

## Part 4: Writing a Model Card

#### Q12: Write a model card, following the format shown in the original [Mitchell et al.](https://arxiv.org/pdf/1810.03993.pdf) paper, for your model. Templates are available in Figures 2 and 3 of the paper.   (20 points)

# YOUR ANSWER HERE

Model Details:
Deep neural network with 10 input features and 3 hidden layers of 20 nodes each, every hidden layer followed by dropout and ReLU, then an output layer of one node with a sigmoid. The intended use is predicting credit risk for individuals while being fair with respect to the sensitive attribute (sex).

Trained with binary cross-entropy loss, weight decay 1e-5, and learning rate 5e-4. A separation penalty was added to the training loss, which slowed down training but brought the model close to passing separation (depending on the run).

Metrics: Independence, Separation, and Sufficiency of the model.

Data: 800 training samples of people with good/bad credit risk. The data is imbalanced toward good examples (roughly 7:3 in the full 1,000-row German credit dataset). The data was preprocessed by replacing each non-sensitive feature with the average of its original value and the other sensitive group's value at the same within-group percentile.







#### Q13 (OPTIONAL): How difficult was this homework and how many hours did you spend on it? (0 points) 

YOUR ANSWER HERE