# Fairness Post-Processing on the Adult Income Dataset (`Race` as Protected Attribute)

In this notebook we compare several post-processing techniques on the well-known "Adult" dataset to  
reduce demographic disparities in a binary income prediction task. 

We’ll walk through:  

1. Imports & configuration

2. Utility functions (data loading, feature selection)

3. Training / post‑processing loop

4. Result summary & feature importance

5. Conclusions

## 1  Imports & basic setup

In [1]:
import os
import numpy as np
import pandas as pd

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from humancompatible.repair.methods.data_analysis import rdata_analysis
from humancompatible.repair.postprocess.roc_postprocess import ROCpostprocess
from humancompatible.repair.postprocess.proj_postprocess import Projpostprocess

pip install 'aif360[AdversarialDebiasing]'
pip install 'aif360[Reductions]'
pip install 'aif360[inFairness]'


In [2]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

We’ve turned off FutureWarning noise so we can focus on the core outputs. Next up, we’ll write a couple of utility functions to:

1. **Load & clean** the Adult dataset (apply binning, encode labels).

2. **Compute TV distances** to pick out the most imbalanced features.

## 2  Utility helpers

Below are three helper functions:

- **`load_data`**: merges the train/test files, bins continuous features, and encodes the sensitive attribute (`S`) and the target (`Y`).

- **`categorise`**: assigns numeric bins for age, hours-per-week, capital gain/loss.

- **`choose_x`**: measures total-variation distance for each feature to detect imbalance and returns a shortlist for repair.
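
The binning rule described for `categorise` (left-closed intervals, coded 0 to `len(bins)`) can be sketched with `np.digitize`. This is an illustrative alternative rather than the notebook's implementation; `bin_codes` is a hypothetical name, and the age edges are the ones used in `load_data`.

```python
import numpy as np

def bin_codes(values, edges):
    """Return the bin index of each value:
    0 for value < edges[0], i for edges[i-1] <= value < edges[i],
    and len(edges) for value >= edges[-1]."""
    return np.digitize(np.asarray(values), edges, right=False)

ages = [17, 25, 26, 35, 36, 55, 56, 90]
codes = bin_codes(ages, [26, 36, 46, 56])
print(codes.tolist())
```

Unlike an in-place loop, this computes every code from the original values in one pass, so an already-assigned code can never be re-binned by a later interval.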

In [3]:
def load_data(data_path,var_list,pa):
    column_names = ['age', 'workclass', 'fnlwgt', 'education',
                'education-num', 'marital-status', 'occupation', 'relationship',
                'race', 'sex', 'capital-gain', 'capital-loss', 'hours-per-week',
                'native-country', 'Y']
    na_values=['?']
    pa_dict={'Male':1,'Female':0,'White':1,'Black':0}
    label_dict={'>50K.':1,'>50K':1,'<=50K.':0,'<=50K':0}
    train_path = os.path.join(data_path, 'adult.data')
    test_path = os.path.join(data_path, 'adult.test')
    train = pd.read_csv(train_path, header=None,names=column_names,
                    skipinitialspace=True, na_values=na_values)
    test = pd.read_csv(test_path, header=0,names=column_names,
                    skipinitialspace=True, na_values=na_values)
    messydata = pd.concat([test, train], ignore_index=True)[var_list+[pa,'Y']]
    messydata=messydata.rename(columns={pa:'S'})
    messydata['S']=messydata['S'].replace(pa_dict)
    messydata['Y']=messydata['Y'].replace(label_dict)
    # keep only the two encoded groups; copy so later column assignments do not hit a view
    messydata=messydata[(messydata['S']==0)|(messydata['S']==1)].copy()
    for col in var_list+['S','Y']:
        messydata[col]=messydata[col].astype('int64')
    messydata['W']=1
    bins_capitalgain=[100,3500,7500,10000]
    bins_capitalloss=[100,1600,1900,2200]
    bins_age=[26,36,46,56]
    bins_hours=[21,36,46,61]

    messydata=categorise(messydata,'age',bins_age)
    messydata=categorise(messydata,'hours-per-week',bins_hours)
    messydata=categorise(messydata,'capital-gain',bins_capitalgain)
    messydata=categorise(messydata,'capital-loss',bins_capitalloss)
    
    return messydata

# Replace the values of `col` in place by bin codes 0..len(bins) (left-closed intervals)
def categorise(df,col,bins):
    for i in range(len(bins)+1):
        if i == 0:
            df.loc[df[col] < bins[i], col] = i
        elif i == len(bins):
            df.loc[df[col] >= bins[i-1], col] = i
        else:
            df.loc[(df[col] >= bins[i-1])& (df[col] < bins[i]), col] = i        
    return df

def choose_x(var_list,messydata):
    tv_dist=dict()
    for x_name in var_list:
        x_range_single=list(pd.pivot_table(messydata,index=x_name,values=['W'])[('W')].index) 
        dist=rdata_analysis(messydata,x_range_single,x_name)
        tv_dist[x_name]=sum(abs(dist['x_0']-dist['x_1']))/2
    x_list=[]
    for key,val in tv_dist.items():
        if val>0.1:
            x_list+=[key]  
    return x_list,tv_dist

In [4]:
data_path='../data/adult'
var_list=['hours-per-week','age','capital-gain','capital-loss','education-num']
pa='race'
favorable_label = 1
var_dim=len(var_list)

K=200
e=0.01

if pa == 'sex':
    thresh=0.05
elif pa == 'race':
    thresh=0.05

messydata = load_data(data_path,var_list,pa)
x_list,tv_dist = choose_x(var_list,messydata)

X=messydata[var_list+['S','W']].to_numpy() # [X,S,W]
y=messydata['Y'].to_numpy() #[Y]

In [5]:
tv_dist

{'hours-per-week': np.float64(0.12216173195089294),
 'age': np.float64(0.04149230147335384),
 'capital-gain': np.float64(0.026764553949230142),
 'capital-loss': np.float64(0.014217783478743178),
 'education-num': np.float64(0.11867963282506956)}

The TV distances show how much each feature's distribution differs between the two sensitive groups:  

- **hours-per-week** (about 0.122) and **education-num** (about 0.119) exceed the 0.1 cutoff in `choose_x` → we repair these  
- The other features stay below 0.05 → no repair needed  
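
As a rough illustration of the quantity behind `tv_dist`, the sketch below computes the total-variation distance between the two group-conditional distributions of a binned feature, then applies the 0.1 cutoff from `choose_x` to the (rounded) values printed above. `tv_distance` is a hypothetical helper, not `rdata_analysis`, and it assumes unit weights (`W = 1`), as set in `load_data`.

```python
import numpy as np

def tv_distance(x, s):
    """Total-variation distance between the distribution of a discrete
    feature x within group s == 0 and within group s == 1."""
    x, s = np.asarray(x), np.asarray(s)
    levels = np.unique(x)
    p0 = np.array([np.mean(x[s == 0] == v) for v in levels])
    p1 = np.array([np.mean(x[s == 1] == v) for v in levels])
    return 0.5 * np.abs(p0 - p1).sum()

# Toy feature with three bins and four samples per group.
x = [0, 0, 1, 2, 0, 1, 1, 2]
s = [0, 0, 0, 0, 1, 1, 1, 1]
tv = tv_distance(x, s)
print(tv)

# Selection step, using the rounded distances printed by the notebook.
tv_dist = {'hours-per-week': 0.1222, 'age': 0.0415, 'capital-gain': 0.0268,
           'capital-loss': 0.0142, 'education-num': 0.1187}
selected = [name for name, dist in tv_dist.items() if dist > 0.1]
print(selected)
```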

With `X, y` ready, we now move on to the **main experiment loop**.

## 3 Training & Post-Processing Experiment

We define **groups** based on the sensitive attribute `S = 0/1`.

Steps:

1. Split data into train / validation / test  
2. Fit a Random Forest baseline  
3. Apply five post-processing strategies:  

   - **origin** - no fairness correction; use the training data as is  
   - **unconstrained** - project all data (both groups combined) to a target distribution via vanilla optimal transport (Sinkhorn algorithm)  
   - **barycentre** - project each group separately into a barycentric distribution (weights proportional to group size); requires group membership to apply different maps  
   - **partial** - our method; project all data into a target distribution using a group-blind coupling from constrained optimal transport  
   - **ROC** (Reject Option Classification) - another baseline post-processing method; favorable outcomes are assigned to the unprivileged group (and unfavorable to the privileged) within a confidence band around the decision boundary (see [Kamiran et al., 2012](https://aif360.readthedocs.io/en/stable/modules/generated/aif360.algorithms.postprocessing.RejectOptionClassification.html))  

Each experiment is repeated 10x to smooth out randomness.
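
For intuition about the "vanilla optimal transport (Sinkhorn algorithm)" step, here is a minimal entropy-regularised OT sketch in NumPy. It is not the `Projpostprocess` implementation: the function name `sinkhorn`, the toy histograms, the squared-distance cost, and the regularisation value are all illustrative assumptions.

```python
import numpy as np

def sinkhorn(a, b, cost, reg=1.0, n_iter=500):
    """Entropy-regularised optimal transport between histograms a and b.
    Returns a coupling whose row sums approach a and column sums equal b."""
    kernel = np.exp(-cost / reg)
    u = np.ones_like(a)
    v = np.ones_like(b)
    for _ in range(n_iter):
        u = a / (kernel @ v)      # rescale rows towards the source marginal
        v = b / (kernel.T @ u)    # rescale columns towards the target marginal
    return u[:, None] * kernel * v[None, :]

# Source and target histograms over the same four bins (toy numbers).
a = np.array([0.4, 0.3, 0.2, 0.1])
b = np.array([0.25, 0.25, 0.25, 0.25])
bins = np.arange(4, dtype=float)
cost = (bins[:, None] - bins[None, :]) ** 2

plan = sinkhorn(a, b, cost)
print(plan.round(3))
```

Each row of `plan` says how the mass of one source bin is spread over the target bins; a repair step would use such a coupling to move feature values towards the target distribution.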

In [6]:
methods=['origin','unconstrained','barycentre','partial','ROC'] # keep ROC last: it is handled separately below
report=pd.DataFrame(columns=['DI','f1 macro','f1 micro','f1 weighted','TV distance','method'])
for _ in range(10):
    # train/val/test is about 42:18:40 (40% test, then 30% of the remainder for validation)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.3)

    clf=RandomForestClassifier(max_depth=5).fit(X_train[:,0:var_dim],y_train)
    projpost = Projpostprocess(X_test,y_test,x_list,var_list,clf,K,e,thresh,favorable_label,linspace_range=(0.01,0.1),theta=1e-2)
    for method in methods[:-1]:
        # report = pd.concat([report,projpost.postprocess(method,para=1e-2)], ignore_index=True)
        report = pd.concat([report,projpost.postprocess(method,para=1e-3)], ignore_index=True)

    ROCpost = ROCpostprocess(X_val,y_val,var_list,clf,favorable_label) # use validation set to train a ROC model
    report = pd.concat([report,ROCpost.postprocess(X_test,y_test,tv_origin=projpost.tv_origin)], ignore_index=True)

report.to_csv('../data/report_postprocess_adult_'+str(pa)+'.csv',index=None)

Optimal classification threshold (with fairness constraints) = 0.1900
Optimal ROC margin = 0.0211
Optimal classification threshold (with fairness constraints) = 0.2300
Optimal ROC margin = 0.0256
Optimal classification threshold (with fairness constraints) = 0.2100
Optimal ROC margin = 0.0233
Optimal classification threshold (with fairness constraints) = 0.2500
Optimal ROC margin = 0.0278
Optimal classification threshold (with fairness constraints) = 0.2700
Optimal ROC margin = 0.0300
Optimal classification threshold (with fairness constraints) = 0.2300
Optimal ROC margin = 0.0256
Optimal classification threshold (with fairness constraints) = 0.1900
Optimal ROC margin = 0.0211
Optimal classification threshold (with fairness constraints) = 0.1900
Optimal ROC margin = 0.0211
Optimal classification threshold (with fairness constraints) = 0.1900
Optimal ROC margin = 0.0211
Optimal classification threshold (with fairness constraints) = 0.2300
Optimal ROC margin = 0.0256


The logs above show, for each of the 10 repetitions, the decision threshold and margin selected by Reject Option Classification (ROC). 

Lower margins indicate less aggressive adjustments.
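
To make the roles of the threshold and the margin concrete, here is a minimal sketch of the reject-option rule. It is not the `ROCpostprocess` or AIF360 code; `reject_option_predict` is a hypothetical function, and it assumes `s == 0` marks the unprivileged group and `s == 1` the privileged one. The threshold and margin are taken from the first log line above.

```python
import numpy as np

def reject_option_predict(scores, s, threshold, margin, favorable=1):
    """Threshold the scores, then override predictions inside the band
    [threshold - margin, threshold + margin]: unprivileged samples (s == 0)
    get the favorable label, privileged samples (s == 1) the unfavorable one."""
    scores = np.asarray(scores, dtype=float)
    s = np.asarray(s)
    unfavorable = 1 - favorable
    pred = np.where(scores > threshold, favorable, unfavorable)
    in_band = np.abs(scores - threshold) <= margin
    pred[in_band & (s == 0)] = favorable
    pred[in_band & (s == 1)] = unfavorable
    return pred

scores = [0.10, 0.18, 0.20, 0.30, 0.17, 0.205, 0.50]
s      = [0,    0,    1,    1,    1,    0,     0]
pred = reject_option_predict(scores, s, threshold=0.19, margin=0.0211)
print(pred.tolist())
```

Only scores within the band are touched, which is why a smaller margin changes fewer predictions.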

Let’s look at the collected results, one row per method and repetition. In the table below:

- **DI** (Disparate Impact): ratio of favourable-outcome rates between the two groups; 1 means parity  

- **F1** (macro/micro/weighted): classification quality  

- **TV distance**: remaining distribution gap on the repaired features  

- **method**: which post-processor was used
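
As a small worked example of the DI metric, the sketch below divides the favourable-outcome rate of one group by that of the other. `disparate_impact` is a hypothetical helper; putting group `s == 0` in the numerator follows the common unprivileged-over-privileged convention and is an assumption about how the library computes it.

```python
import numpy as np

def disparate_impact(y_pred, s, favorable=1):
    """Favourable-outcome rate of group s == 0 divided by that of group s == 1."""
    y_pred, s = np.asarray(y_pred), np.asarray(s)
    rate_0 = np.mean(y_pred[s == 0] == favorable)
    rate_1 = np.mean(y_pred[s == 1] == favorable)
    return rate_0 / rate_1

# Group 0 receives the favourable label 1 time in 4, group 1 receives it 3 times in 4.
y_pred = [1, 0, 0, 0, 1, 1, 0, 1]
s      = [0, 0, 0, 0, 1, 1, 1, 1]
di = disparate_impact(y_pred, s)
print(round(di, 3))
```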

In [7]:
report

| | DI | f1 macro | f1 micro | f1 weighted | TV distance | method |
|---|---|---|---|---|---|---|
| 0 | 0.444722 | 0.68251 | 0.819635 | 0.79139 | 0.193409 | origin |
| 1 | 0.444722 | 0.68251 | 0.819635 | 0.79139 | 0.193278 | unconstrained |
| 2 | 0.473157 | 0.680658 | 0.813015 | 0.78794 | 2.3e-05 | barycentre |
| 3 | 0.938935 | 0.648905 | 0.72076 | 0.731788 | 0.024862 | partial_0.001 |
| 4 | 1.07626 | 0.674192 | 0.710264 | 0.730763 | 0.193409 | ROC |
| 5 | 0.403671 | 0.673873 | 0.813876 | 0.783536 | 0.185509 | origin |
| 6 | 0.403671 | 0.673873 | 0.813876 | 0.783536 | 0.185292 | unconstrained |
| 7 | 0.568627 | 0.67014 | 0.808117 | 0.779628 | 5.9e-05 | barycentre |
| 8 | 1.057937 | 0.609504 | 0.68572 | 0.698042 | 0.024776 | partial_0.001 |
| 9 | 0.97078 | 0.71439 | 0.769525 | 0.778792 | 0.185509 | ROC |
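
Per-method averages are easier to compare than raw rows. The sketch below groups by `method` and reports the mean and standard deviation, using only the ten rows displayed above (two columns, copied by hand); in the notebook the same `groupby` could be applied to the full `report` frame. The name `report_head` is introduced here for illustration.

```python
import pandas as pd

# (DI, f1 micro, method) for the ten rows shown in the table.
rows = [
    (0.444722, 0.819635, 'origin'),
    (0.444722, 0.819635, 'unconstrained'),
    (0.473157, 0.813015, 'barycentre'),
    (0.938935, 0.720760, 'partial_0.001'),
    (1.076260, 0.710264, 'ROC'),
    (0.403671, 0.813876, 'origin'),
    (0.403671, 0.813876, 'unconstrained'),
    (0.568627, 0.808117, 'barycentre'),
    (1.057937, 0.685720, 'partial_0.001'),
    (0.970780, 0.769525, 'ROC'),
]
report_head = pd.DataFrame(rows, columns=['DI', 'f1 micro', 'method'])

# One row per method, with mean and standard deviation of each metric.
summary = report_head.groupby('method')[['DI', 'f1 micro']].agg(['mean', 'std'])
print(summary.round(3))
```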


## 4  Compute Average Feature Importance

Next, we revisit our Random Forest baseline to identify which features most strongly influence income prediction.  

Understanding feature importance helps us see which attributes the model relies on—and which ones may need repair to ensure fairness.

In [8]:
importance=[]
for _ in range(10):
    # train/val/test is about 42:18:40 (40% test, then 30% of the remainder for validation)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.3)

    clf=RandomForestClassifier(max_depth=5).fit(X_train[:,0:var_dim],y_train)
    importance.append(list(clf.feature_importances_))

In [9]:
importance=np.array(importance)
print("features", var_list)
print("mean importances", importance.mean(axis=0))

features ['hours-per-week', 'age', 'capital-gain', 'capital-loss', 'education-num']
mean importances [0.09327544 0.20548356 0.34358706 0.0482595  0.30939444]
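
Sorting the printed means makes the ranking explicit. This small sketch only reuses the two lists printed above; the variable names are introduced here for illustration.

```python
import numpy as np

features = ['hours-per-week', 'age', 'capital-gain', 'capital-loss', 'education-num']
mean_importance = np.array([0.09327544, 0.20548356, 0.34358706, 0.0482595, 0.30939444])

# Indices sorted from most to least important.
order = np.argsort(mean_importance)[::-1]
ranking = [(features[i], float(mean_importance[i])) for i in order]
for name, value in ranking:
    print(f"{name:15s} {value:.3f}")
```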


## 5 Conclusions

Our **baseline** (origin/unconstrained) shows the best accuracy but also the biggest gap between protected groups. 

The **barycentre** method virtually erases the TV distance on the repaired features (on the order of 1e-5 in the rows shown), yet DI only rises to about 0.47-0.57 there, and the method needs group membership to apply its maps. 

With **partial repair**, you get a handy dial: small tweaks nudge toward parity with minimal impact, while larger tweaks tighten fairness at a greater cost. 

And **ROC post-processing** strikes a nice compromise: in the rows shown it brings DI close to 1 (1.08 and 0.97) while macro F1 stays near the baseline, although micro F1 drops by roughly 0.04-0.11.

Looking at feature importance reminds us what the model "cares about" most: **capital gain** and **education level** top the list, with **age** not far behind. If you’re worried about proxying sensitive traits, these are the variables to think hard about - either by guarding them or by designing even earlier interventions.