# Fairness Post-Processing: Adult Income (`Sex` as Protected Attribute)

In this notebook, we switch from race to **sex** as the protected attribute.  
We will:

1. **Load & preprocess data** (sensitive attribute = **sex**)  
   - Define **groups** based on `S = 0/1`  

2. **Build a Random Forest baseline**  

3. **Apply post-processing methods**:  
   - **Origin** - no repair  
   - **Barycentre** - OT-based full repair  
   - **Partial** - our tunable, softer repair  
   - **ROC** - baseline post-processing; favorable outcomes are assigned to the unprivileged group within a confidence band around the decision boundary ([Kamiran et al., 2012](https://aif360.readthedocs.io/en/stable/modules/generated/aif360.algorithms.postprocessing.RejectOptionClassification.html))  

4. **Compare metrics**: Disparate Impact, F1 variants, and TV distance  

5. **Compute feature importance** and draw conclusions

## 1  Imports & basic setup

In [1]:
import os
import numpy as np
import pandas as pd

from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

from aif360.datasets import AdultDataset

from humancompatible.repair.methods.data_analysis import rdata_analysis
from humancompatible.repair.postprocess.roc_postprocess import ROCpostprocess
from humancompatible.repair.postprocess.proj_postprocess import Projpostprocess

# Optional aif360 extras, installed from a shell:
# pip install 'aif360[AdversarialDebiasing]'
# pip install 'aif360[Reductions]'
# pip install 'aif360[inFairness]'


We silence FutureWarnings so you’ll only see the key outputs below.

In [2]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

## 2  Utility helpers

Here are three functions:

- **`load_data`** - merges train/test, encodes `S` & `Y`, bins skewed numerics  

- **`categerise`** - simple numeric binning for one column  

- **`choose_x`** - computes per-feature Total-Variation to shortlist the most imbalanced axes  

In [3]:
def load_data(data_path,var_list,pa):
    column_names = ['age', 'workclass', 'fnlwgt', 'education',
                'education-num', 'marital-status', 'occupation', 'relationship',
                'race', 'sex', 'capital-gain', 'capital-loss', 'hours-per-week',
                'native-country', 'Y']
    na_values=['?']
    pa_dict={'Male':1,'Female':0,'White':1,'Black':0}
    label_dict={'>50K.':1,'>50K':1,'<=50K.':0,'<=50K':0}
    train_path = os.path.join(data_path, 'adult.data')
    test_path = os.path.join(data_path, 'adult.test')
    train = pd.read_csv(train_path, header=None,names=column_names,
                    skipinitialspace=True, na_values=na_values)
    test = pd.read_csv(test_path, header=0,names=column_names,
                    skipinitialspace=True, na_values=na_values)
    messydata = pd.concat([test, train], ignore_index=True)[var_list+[pa,'Y']]
    messydata=messydata.rename(columns={pa:'S'})
    messydata['S']=messydata['S'].replace(pa_dict)
    messydata['Y']=messydata['Y'].replace(label_dict)
    messydata=messydata[(messydata['S']==0)|(messydata['S']==1)]
    for col in var_list+['S','Y']:
        messydata[col]=messydata[col].astype('int64')
    messydata['W']=1
    bins_capitalgain=[100,3500,7500,10000]
    bins_capitalloss=[100,1600,1900,2200]
    bins_age=[26,36,46,56]
    bins_hours=[21,36,46,61]

    messydata=categerise(messydata,'age',bins_age)
    # messydata=categerise(messydata,'hours-per-week',bins_hours)
    messydata=categerise(messydata,'capital-gain',bins_capitalgain)
    messydata=categerise(messydata,'capital-loss',bins_capitalloss)
    
    return messydata

def categerise(df,col,bins):
    for i in range(len(bins)+1):
        if i == 0:
            df.loc[df[col] < bins[i], col] = i
        elif i == len(bins):
            df.loc[df[col] >= bins[i-1], col] = i
        else:
            df.loc[(df[col] >= bins[i-1])& (df[col] < bins[i]), col] = i        
    return df

def choose_x(var_list,messydata):
    tv_dist=dict()
    for x_name in var_list:
        x_range_single=list(pd.pivot_table(messydata,index=x_name,values=['W'])[('W')].index) 
        dist=rdata_analysis(messydata,x_range_single,x_name)
        tv_dist[x_name]=sum(abs(dist['x_0']-dist['x_1']))/2
    x_list=[]
    for key,val in tv_dist.items():
        if val>0.1:
            x_list+=[key]  
    return x_list,tv_dist
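
The binning done by `categerise` can be cross-checked against NumPy. The sketch below assumes the intended behaviour is "index of the half-open interval `[bins[i-1], bins[i])` containing the value", which is what `np.digitize` returns for increasing edges. The helper name `bin_column` and the toy ages are illustrative and not part of the notebook.

```python
import numpy as np
import pandas as pd

def bin_column(values, bins):
    """Return, for each value, the index of the half-open bin [bins[i-1], bins[i]) it falls in."""
    # With right=False, values below bins[0] map to 0 and values >= bins[-1] map to len(bins).
    return np.digitize(values, bins, right=False)

# Toy ages chosen to sit on and around the bin edges used in load_data.
ages = pd.Series([17, 25, 26, 35, 36, 45, 46, 55, 56, 90])
age_bins = [26, 36, 46, 56]

binned_ages = bin_column(ages.to_numpy(), age_bins)
print(binned_ages.tolist())
```

Unlike the in-place loop, this version never compares already-relabelled values against the bin edges, so it stays correct even if a bin index happens to be larger than the first edge.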

### Protected-attribute setup

- **Protected attribute**: `pa = "sex"`

- **Privileged** = Male (`S = 1`), **Unprivileged** = Female (`S = 0`)  

- We set `thresh = 0.05` for sex (versus `0.1` for race), on the premise that sex-based gaps tend to be subtler than race-based ones.

In [4]:
data_path='..//data//adult'
var_list=['age','capital-gain','capital-loss','education-num'] #'hours-per-week',
pa='sex'
favorable_label = 1
var_dim=len(var_list)

K=200
e=0.01

if pa == 'sex':
    thresh=0.05
elif pa == 'race':
    thresh=0.1

messydata = load_data(data_path,var_list,pa)
x_list,tv_dist = choose_x(var_list,messydata)

X=messydata[var_list+['S','W']].to_numpy() # [X,S,W]
y=messydata['Y'].to_numpy() #[Y]

In [5]:
tv_dist

{'age': np.float64(0.1010227688866829),
 'capital-gain': np.float64(0.036924675713792855),
 'capital-loss': np.float64(0.020068855964263464),
 'education-num': np.float64(0.07095473385227195)}
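
For intuition, the per-feature Total-Variation distance above is half the L1 distance between the two groups' weighted marginal distributions of that feature. The following is a minimal sketch on toy data. It assumes `rdata_analysis` returns exactly those group-conditional marginals (`x_0`, `x_1`), which is inferred from how `choose_x` uses it rather than from the library's documentation. The function name `tv_distance` is hypothetical.

```python
import pandas as pd

def tv_distance(df, feature, group_col="S", weight_col="W"):
    """Half the L1 distance between the weighted marginals of `feature` in groups S=0 and S=1."""
    # Total weight per (feature value, group); missing combinations count as zero mass.
    mass = df.pivot_table(index=feature, columns=group_col, values=weight_col,
                          aggfunc="sum", fill_value=0.0)
    # Normalise each group's column so it sums to one.
    probs = mass / mass.sum(axis=0)
    return 0.5 * float((probs[0] - probs[1]).abs().sum())

# Toy data: group 0 is split 50/50 over two age bins, group 1 is split 25/75.
toy = pd.DataFrame({
    "age": [0, 0, 1, 1, 0, 1, 1, 1],
    "S":   [0, 0, 0, 0, 1, 1, 1, 1],
    "W":   [1, 1, 1, 1, 1, 1, 1, 1],
})
toy_tv = tv_distance(toy, "age")
print(toy_tv)
```

A value of 0 means the two groups have identical marginals on that feature, and a value of 1 means their supports do not overlap.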

## 3  Training & post-processing experiment

For each of 10 random splits we:

1. Fit a depth-5 Random-Forest

2. Run **Origin**, **Barycentre**, and **Partial** repairs

3. (Skip ROC here for brevity - see `02_adult.ipynb` for ROC details)

4. Record DI, F1 (macro/micro/weighted), and TV distance on our two repaired axes

In [6]:
thresh=0.05
x_list = ['age','education-num']
methods=['origin','barycentre','partial']
report=pd.DataFrame(columns=['DI','f1 macro','f1 micro','f1 weighted','TV distance','method'])
for ignore in range(10):
    # train:val:test is roughly 4:2:4 (exactly 0.42 : 0.18 : 0.40)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.3)
    
    clf=RandomForestClassifier(max_depth=5).fit(X_train[:,0:var_dim],y_train)
    projpost = Projpostprocess(X_test,y_test,x_list,var_list,clf,K,e,thresh,favorable_label,linspace_range=(0.1,0.9),theta=1e-3)
    for method in methods[:-1]:
        report = pd.concat([report,projpost.postprocess(method)], ignore_index=True)
    
    for p in [1e-2,1e-3,1e-4]:
        report = pd.concat([report,projpost.postprocess('partial',para=p)], ignore_index=True)

report.to_csv('../data/E3_postprocess_adult_'+str(pa)+'.csv',index=None)

Let's inspect the results, with one row per method per split. In the table below:

- **DI**: Disparate Impact

- **F1 (macro/micro/weighted)**: classification quality

- **TV distance**: remaining gap on the repaired features

- **method**: repair strategy
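
As a reference for the first two columns, here is a minimal, unweighted sketch of Disparate Impact and the three F1 averages on toy predictions. It assumes DI is the ratio of favorable-outcome rates, unprivileged over privileged. `Projpostprocess` may additionally apply the instance weights `W`, which this sketch ignores. The helper `disparate_impact` and the toy arrays are illustrative only.

```python
import numpy as np
from sklearn.metrics import f1_score

def disparate_impact(y_pred, s, favorable_label=1):
    """Ratio of favorable-prediction rates: unprivileged (S=0) over privileged (S=1)."""
    y_pred = np.asarray(y_pred)
    s = np.asarray(s)
    rate_unpriv = np.mean(y_pred[s == 0] == favorable_label)
    rate_priv = np.mean(y_pred[s == 1] == favorable_label)
    return rate_unpriv / rate_priv

y_true = np.array([1, 0, 1, 0, 1, 1, 0, 0])
y_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])
s      = np.array([0, 0, 0, 0, 1, 1, 1, 1])

di = disparate_impact(y_pred, s)
f1_scores = {avg: f1_score(y_true, y_pred, average=avg)
             for avg in ("macro", "micro", "weighted")}
print(di, f1_scores)
```

A DI of 1 means both groups receive the favorable prediction at the same rate. Values below 1 mean the unprivileged group receives it less often.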

In [7]:
report.dropna()

| | DI | f1 macro | f1 micro | f1 weighted | TV distance | method |
|---|---|---|---|---|---|---|
| 0 | 0.453587 | 0.649873 | 0.810309 | 0.77292 | 0.128614 | origin |
| 1 | 0.466078 | 0.646376 | 0.805293 | 0.769449 | 0.000627 | barycentre |
| 2 | 1.104568 | 0.581083 | 0.722219 | 0.707322 | 0.086938 | partial_0.01 |
| 3 | 0.827115 | 0.634161 | 0.763577 | 0.747127 | 0.027393 | partial_0.001 |
| 4 | 0.822652 | 0.635626 | 0.764856 | 0.748285 | 0.003617 | partial_0.0001 |
| 5 | 0.553847 | 0.675905 | 0.814608 | 0.786284 | 0.134076 | origin |
| 6 | 0.535066 | 0.664566 | 0.804218 | 0.777242 | 0.000473 | barycentre |
| 7 | 1.094631 | 0.592422 | 0.724011 | 0.712987 | 0.092245 | partial_0.01 |
| 8 | 1.149719 | 0.57215 | 0.690792 | 0.689443 | 0.02805 | partial_0.001 |
| 9 | 1.086683 | 0.591588 | 0.683933 | 0.69269 | 0.00374 | partial_0.0001 |


In [8]:
thresh=0.3
x_list = ['age','education-num']
methods=['origin','barycentre','partial'] # Place ROC in the end
report=pd.DataFrame(columns=['DI','f1 macro','f1 micro','f1 weighted','TV distance','method'])
for ignore in range(5):
    # train:test = 6:4 here (the validation split below is commented out)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
    # X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.3)

    clf=RandomForestClassifier(max_depth=5).fit(X_train[:,0:var_dim],y_train)
    projpost = Projpostprocess(X_test,y_test,x_list,var_list,clf,K,e,thresh,favorable_label,linspace_range=(0.1,0.9),theta=1e-3)
    
    print(projpost.postprocess('origin'))
    print(projpost.postprocess('barycentre'))
    for t in range(2,4):
        print(projpost.postprocess('partial',para=10**(-t)))

# report.to_csv('../data/E3_postprocess_adult_'+str(pa)+'.csv',index=None)

         DI  f1 macro  f1 micro f1 weighted TV distance  method
0  0.544755  0.677207  0.814403    0.786268     0.12403  origin
        DI  f1 macro  f1 micro f1 weighted TV distance      method
0  0.56826  0.674245  0.813789    0.784739    0.000754  barycentre
         DI  f1 macro  f1 micro f1 weighted TV distance        method
0  0.594879  0.660944  0.809131     0.77711    0.083422  partial_0.01
         DI f1 macro  f1 micro f1 weighted TV distance         method
0  0.592138  0.66598  0.810411    0.779809    0.025305  partial_0.001
         DI  f1 macro  f1 micro f1 weighted TV distance  method
0  0.562569  0.681667  0.817116    0.789727    0.130114  origin
         DI  f1 macro  f1 micro f1 weighted TV distance      method
0  0.592823  0.674918  0.815069    0.785996    0.000073  barycentre
         DI  f1 macro  f1 micro f1 weighted TV distance        method
0  0.605047  0.662321  0.811281    0.779035    0.090797  partial_0.01
         DI  f1 macro  f1 micro f1 weighted TV distanc

In [9]:
valpost = Projpostprocess(X_val,y_val,x_list,var_list,clf,K,e,'auto',linspace_range=(0.1,0.9),theta=1e-3)
valpost.thresh

Optional threshold =  [0.1        0.18888889 0.27777778 0.36666667 0.45555556 0.54444444
 0.63333333 0.72222222 0.81111111 0.9       ]
Disparate Impact =  [0.70823508 0.65742979 0.6418957  0.63642532 0.63418972 0.62317166
 0.63852072 0.63852072 0.6484239  0.69129834]
f1 scores =  [0.69063882 0.66558246 0.66209947 0.66202684 0.6622522  0.6616947
 0.65925716 0.65925716 0.65472653 0.64197352]


np.float64(0.1)
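
The output above lists a grid of ten candidate thresholds with the Disparate Impact and F1 score obtained at each. A sweep of that shape can be sketched as follows on synthetic scores. The selection rule used here (pick the threshold with the highest macro F1) is a hypothetical stand-in, since the notebook does not state how `'auto'` chooses. The function `sweep_thresholds` and the synthetic data are illustrative.

```python
import numpy as np
from sklearn.metrics import f1_score

def sweep_thresholds(scores, y_true, s, grid):
    """Return (threshold, DI, macro F1) for every candidate threshold in `grid`."""
    rows = []
    for t in grid:
        y_hat = (scores >= t).astype(int)
        rate_unpriv = y_hat[s == 0].mean()
        rate_priv = y_hat[s == 1].mean()
        # DI is undefined when the privileged group gets no favorable predictions.
        di = rate_unpriv / rate_priv if rate_priv > 0 else np.nan
        f1 = f1_score(y_true, y_hat, average="macro", zero_division=0)
        rows.append((float(t), float(di), float(f1)))
    return rows

# Synthetic scores that depend on the label and, slightly, on the group.
rng = np.random.default_rng(0)
n = 400
s = rng.integers(0, 2, size=n)
y_true = rng.integers(0, 2, size=n)
scores = np.clip(0.3 * y_true + 0.1 * s + rng.uniform(0, 0.6, size=n), 0, 1)

grid = np.linspace(0.1, 0.9, 10)  # same grid shape as linspace_range=(0.1, 0.9)
results = sweep_thresholds(scores, y_true, s, grid)
best_threshold = max(results, key=lambda row: row[2])[0]  # hypothetical rule: best macro F1
print(best_threshold)
```

In practice the rule would more likely trade DI against F1, for example by maximising F1 subject to a minimum DI.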

### Analyzing results

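- **Origin** achieves the highest F1 but falls far short on fairness (DI ≈ 0.5, well below the commonly used 0.8 "four-fifths" threshold).

- **Barycentre** nearly closes the distributional gap on the repaired features (TV ≈ 0) at only a small F1 penalty, although DI itself moves only slightly in the runs shown.

- **Partial** gives you a "fairness knob" through `para = 10**(-t)`: small `t` -> light repair (TV ≈ 0.09 at `t = 2`), large `t` -> close to full repair (TV ≈ 0.004 at `t = 4`), at some F1 cost relative to Origin.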

In [10]:
methods=['origin','unconstrained','barycentre','partial','ROC'] # Place ROC in the end
report=pd.DataFrame(columns=['DI','f1 macro','f1 micro','f1 weighted','TV distance','method'])
for ignore in range(10):
    # train:val:test is roughly 4:2:4 (exactly 0.42 : 0.18 : 0.40)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4)
    X_train, X_val, y_train, y_val = train_test_split(X_train, y_train, test_size=0.3)

    clf=RandomForestClassifier(max_depth=5).fit(X_train[:,0:var_dim],y_train)
    projpost = Projpostprocess(X_test,y_test,x_list,var_list,clf,K,e,thresh,favorable_label,linspace_range=(0.1,0.9),theta=1e-3)
    for method in methods[:-1]:
        # report = pd.concat([report,projpost.postprocess(method,para=1e-2)], ignore_index=True)
        report = pd.concat([report,projpost.postprocess(method,para=1e-3)], ignore_index=True)

    ROCpost = ROCpostprocess(X_val,y_val,var_list,clf,favorable_label) # use validation set to train a ROC model
    report = pd.concat([report,ROCpost.postprocess(X_test,y_test,tv_origin=projpost.tv_origin)], ignore_index=True)

report.to_csv('../data/report_postprocess_adult_'+str(pa)+'.csv',index=None)

Optimal classification threshold (with fairness constraints) = 0.1700
Optimal ROC margin = 0.0189
Optimal classification threshold (with fairness constraints) = 0.2700
Optimal ROC margin = 0.0300
Optimal classification threshold (with fairness constraints) = 0.1700
Optimal ROC margin = 0.0189
Optimal classification threshold (with fairness constraints) = 0.3100
Optimal ROC margin = 0.0344
Optimal classification threshold (with fairness constraints) = 0.2900
Optimal ROC margin = 0.0322
Optimal classification threshold (with fairness constraints) = 0.2900
Optimal ROC margin = 0.0322
Optimal classification threshold (with fairness constraints) = 0.1500
Optimal ROC margin = 0.0167
Optimal classification threshold (with fairness constraints) = 0.1900
Optimal ROC margin = 0.0211
Optimal classification threshold (with fairness constraints) = 0.2900
Optimal ROC margin = 0.0322
Optimal classification threshold (with fairness constraints) = 0.2900
Optimal ROC margin = 0.0322
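
Each pair of lines above reports a classification threshold and a margin for the ROC baseline. The decision rule they parameterise can be sketched as below. This is a simplified illustration of reject-option classification, not the code inside `ROCpostprocess` or AIF360. It assumes binary labels with `S = 0` as the unprivileged group, and the function name `reject_option_predict` is hypothetical.

```python
import numpy as np

def reject_option_predict(scores, s, threshold, margin, favorable_label=1):
    """Threshold the scores, then override predictions inside the uncertainty band."""
    scores = np.asarray(scores, dtype=float)
    s = np.asarray(s)
    unfavorable_label = 1 - favorable_label  # assumes labels are 0/1
    y_hat = np.where(scores > threshold, favorable_label, unfavorable_label)
    # Inside the band around the threshold, decide by group instead of by score.
    in_band = np.abs(scores - threshold) <= margin
    y_hat[in_band & (s == 0)] = favorable_label    # unprivileged -> favorable
    y_hat[in_band & (s == 1)] = unfavorable_label  # privileged -> unfavorable
    return y_hat

scores = np.array([0.27, 0.31, 0.27, 0.31, 0.90, 0.10])
s      = np.array([0,    0,    1,    1,    1,    0])
roc_pred = reject_option_predict(scores, s, threshold=0.29, margin=0.03)
print(roc_pred.tolist())
```

Predictions outside the band are left untouched, so a narrower margin changes fewer decisions.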


## 4  Compute average feature importance

Finally, we revisit our Random-Forest baseline to see which inputs drive income prediction most:

* **capital-gain** and **education-num** lead the pack  

* **age** follows closely  

These are the features you’d want to guard most carefully against proxying sensitive traits.
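
A ranking like the one above can be obtained by averaging a forest's impurity-based `feature_importances_` over several random splits. The sketch below runs on synthetic data, so its numbers say nothing about the Adult dataset. The feature names are reused only for readability, and the label rule is an assumption made for the demo.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
feature_names = ["age", "capital-gain", "capital-loss", "education-num"]

# Synthetic binned features; the label depends mainly on columns 1 and 3.
X_demo = rng.integers(0, 5, size=(500, len(feature_names)))
y_demo = (X_demo[:, 1] + X_demo[:, 3] + rng.integers(0, 3, size=500) > 5).astype(int)

importances = []
for seed in range(5):
    X_tr, _, y_tr, _ = train_test_split(X_demo, y_demo, test_size=0.4, random_state=seed)
    forest = RandomForestClassifier(max_depth=5, random_state=seed).fit(X_tr, y_tr)
    importances.append(forest.feature_importances_)

# Average over splits; each forest's importances sum to one, and so does the mean.
mean_importance = dict(zip(feature_names, np.mean(importances, axis=0)))
for name, value in sorted(mean_importance.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {value:.3f}")
```

Impurity-based importances can favour features with many distinct values, so permutation importance is a common cross-check.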

In [11]:
pa = 'race'
label_map = {1.0: '>50K', 0.0: '<=50K'}
privileged_groups = [{pa: 1}]
unprivileged_groups = [{pa: 0}]
if pa == 'sex':
    thresh=0.05
    protected_attribute_maps = [{1.0: 'Male', 0.0: 'Female'}]
    cd = AdultDataset(protected_attribute_names=[pa],privileged_classes=[['Male'],[1.0]], 
        metadata={'label_map': label_map,'protected_attribute_maps': protected_attribute_maps},
        # categorical_features=['workclass', 'marital-status', 'occupation', 'relationship', 'native-country']
        features_to_drop=['race','fnlwgt','education','relationship',
                          'native-country','workclass','marital-status','occupation'])
elif pa == 'race':
    thresh=0.1
    protected_attribute_maps = [{1.0: 'White', 0.0:'Non-white'}]
    cd = AdultDataset(protected_attribute_names=[pa],privileged_classes=[['White'],[1.0]],
        metadata={'label_map': label_map,'protected_attribute_maps': protected_attribute_maps}, #
        features_to_drop=['sex','fnlwgt','education','relationship',
                          'native-country','workclass','marital-status','occupation'])
    #,'workclass','marital-status','occupation','relationship',

# train,test = cd.split([0.6], shuffle=True) #len(test.instance_names) = 2057
var_list = cd.feature_names.copy()
var_list.remove(pa)
var_dim=len(var_list)

K=200
e=0.01
bins_capitalgain=[100,3500,7500,10000]
bins_capitalloss=[100,1600,1900,2200]

messydata=cd.convert_to_dataframe()[0]
messydata=messydata.rename(columns={pa:'S',cd.label_names[0]:'Y'})
messydata=messydata[(messydata['S']==1)|(messydata['S']==0)]
for col in var_list+['S','Y']:
    messydata[col]=messydata[col].astype('int64')
messydata['W']=cd.instance_weights
# project 0-100 to {0,1,...,5}
messydata['age']=np.floor((messydata['age'].to_numpy()-17)/15)
messydata['hours-per-week']=np.floor(messydata['hours-per-week'].to_numpy()/20)
messydata=categerise(messydata,'capital-gain',bins_capitalgain)
messydata=categerise(messydata,'capital-loss',bins_capitalloss)

X=messydata[var_list+['S','W']].to_numpy() # [X,S,W]
y=messydata['Y'].to_numpy() #[Y]
tv_dist=dict()
for x_name in var_list:
    x_range_single=list(pd.pivot_table(messydata,index=x_name,values=['W'])[('W')].index) 
    dist=rdata_analysis(messydata,x_range_single,x_name)
    tv_dist[x_name]=sum(abs(dist['x_0']-dist['x_1']))/2
x_list=[]
for key,val in tv_dist.items():
    if val>0.045:
        x_list+=[key]        
tv_dist

{'age': np.float64(0.04744857663969924),
 'education-num': np.float64(0.056762405582129784),
 'capital-gain': np.float64(0.021127950774052707),
 'capital-loss': np.float64(0.011363681253224836),
 'hours-per-week': np.float64(0.04445283428803036)}

## 5  Conclusions

In this notebook we compared several simple, model-agnostic post-processing strategies on a Random-Forest income predictor. 
Here’s what we found:

- **Post-processing is practical**: you can retrofit fairness without re-training.

- **Barycentre is powerful**: it drives the TV gap on the repaired features to nearly zero with minimal F1 loss.

- **Partial repair offers flexibility**: tune `para` (via `t`, with `para = 10**(-t)`) to hit your compliance bar.
