# **Bias mitigation with debiasing exposure, disparate impact remover and Fair Top-K**

This demo shows how to apply three post-processing methods, "debiasing exposure", "disparate impact remover" and "Fair Top-K", to improve the fairness of a recommender system's output.

- [Debiasing Exposure](#method-debiasing-exposure)
- [Disparate Impact Remover](#method-disparate-impact-remover)
- [Fair Top K](#method-fair-top-k)

First, install the `holisticai` package if you haven't already:
```bash
!pip install holisticai[all]
```
Then, import the necessary libraries.

In [2]:
# Base Imports
import pandas as pd
import numpy as np
from holisticai.datasets.synthetic.recruitment import generate_rankings
from holisticai.bias.mitigation.postprocessing.debiasing_exposure.algorithm_utils import exposure_metric
from holisticai.bias.mitigation.postprocessing import DebiasingExposure

np.random.seed(0)
import warnings
warnings.filterwarnings("ignore")

We use a synthetic ranking dataset generated with the procedure described by Yang and Stoyanovich in their [research](https://arxiv.org/abs/1610.08559). The procedure produces ranked lists that interleave protected and unprotected candidates according to a given probability (`p` in the code below).
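
To make the generative idea concrete, here is a minimal sketch: walk down the output positions and, at each one, place a protected candidate with probability `p` and an unprotected one otherwise. The function name `mix_rankings` and its arguments are hypothetical, and this is not the code behind `generate_rankings`; it only illustrates the mechanism under those assumptions.

```python
import random


def mix_rankings(n_protected, n_unprotected, p, seed=0):
    """Return a list of booleans (True = protected), one per rank position.

    Illustrative only: at each position a protected candidate is placed with
    probability p, otherwise an unprotected one. When one group runs out,
    the remaining positions are filled from the other group.
    """
    rng = random.Random(seed)
    remaining_protected = n_protected
    remaining_unprotected = n_unprotected
    ranking = []
    while remaining_protected or remaining_unprotected:
        pick_protected = rng.random() < p
        # Fall back to the other group if the chosen one is exhausted
        if pick_protected and remaining_protected == 0:
            pick_protected = False
        elif not pick_protected and remaining_unprotected == 0:
            pick_protected = True
        if pick_protected:
            remaining_protected -= 1
        else:
            remaining_unprotected -= 1
        ranking.append(pick_protected)
    return ranking


# With p = 0.25, roughly one in four positions goes to a protected candidate
# until one of the two groups is exhausted
example = mix_rankings(n_protected=5, n_unprotected=15, p=0.25)
print(example)
```

Setting `p` close to 0 pushes protected candidates towards the bottom of the list, while `p` close to 1 places them at the top.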

In [3]:
# Synthetic data
M = 1000
top_n = 20
p = 0.25
rankings = generate_rankings(M, top_n, p, return_p_attr=False)

baseline = exposure_metric(rankings, group_col='protected', query_col='X', score_col='score')
baseline

| Metric | Value |
|---|---|
| exposure_ratio | 37027.714945 |
| exposure difference | 0.047272 |
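
To build intuition for these two numbers, the sketch below computes a group exposure ratio and difference for a single query. It assumes a softmax-based notion of exposure, as used in learning-to-rank approaches such as DELTR, where an item's exposure is its top-one probability under a softmax of the scores. Whether `exposure_metric` uses exactly this definition, and in which direction it takes the ratio, is an assumption here; the helper name `group_exposure_gap` is hypothetical.

```python
import math


def group_exposure_gap(scores, protected):
    """Return (ratio, difference) of mean exposure: non-protected vs protected.

    Assumption: exposure of an item = its softmax (top-one) probability.
    """
    # Softmax over the scores, shifted by the max for numerical stability
    top = max(scores)
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    exposure = [w / total for w in weights]

    prot = [e for e, is_prot in zip(exposure, protected) if is_prot]
    non_prot = [e for e, is_prot in zip(exposure, protected) if not is_prot]
    mean_prot = sum(prot) / len(prot)
    mean_non_prot = sum(non_prot) / len(non_prot)
    return mean_non_prot / mean_prot, mean_non_prot - mean_prot


# Protected items hold the two lowest scores, so they receive far less exposure
ratio, difference = group_exposure_gap([3, 2, 1, 0], [False, False, True, True])
print(ratio, difference)
```

Under this definition, exposure grows exponentially with the score, so moderate score gaps between groups can produce very large ratios, which may explain the size of the baseline ratio above.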


## **Bias mitigation**

### **Method: Debiasing exposure**

Apply the debiasing exposure mitigator algorithm.

In [4]:
# instantiate the DebiasingExposure mitigator
dtr = DebiasingExposure(group_col="protected",
                        query_col = 'X',
                        doc_col = 'Y',
                        feature_cols = ['score', 'protected'],
                        score_col = 'score',
                        gamma=2, 
                        number_of_iterations=100, 
                        standardize=True,
                        verbose=1)

# train the model
dtr.fit(rankings)

<holisticai.bias.mitigation.postprocessing.debiasing_exposure.transformer.DebiasingExposure at 0x7f559cea5c10>

In [5]:
re_rankings = dtr.transform(rankings)

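Compare the fairness metrics before and after applying the algorithm. An exposure difference close to 0 and an exposure ratio close to 1 indicate that both groups receive similar exposure.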

In [6]:
df_deb_exp = exposure_metric(re_rankings, group_col='protected', query_col='X', score_col='score')
df_deb_exp

| Metric | Value |
|---|---|
| exposure_ratio | 0.755373 |
| exposure difference | 0.002431 |


In [7]:
result = pd.concat([baseline, df_deb_exp], axis=1).iloc[:, [0,1]]
result.columns = ['Baseline','Mitigator']
result

| Metric | Baseline | Mitigator |
|---|---|---|
| exposure_ratio | 37027.714945 | 0.755373 |
| exposure difference | 0.047272 | 0.002431 |


### **Method: Disparate impact remover**

In [8]:
from holisticai.bias.mitigation import DisparateImpactRemoverRS

# Named di_remover to avoid shadowing the built-in dir()
di_remover = DisparateImpactRemoverRS(query_col='X', group_col='protected', score_col='score', repair_level=1)
re_rankings = di_remover.transform(rankings)

df_dis_imp = exposure_metric(re_rankings, group_col='protected', query_col='X', score_col='score')
df_dis_imp

| Metric | Value |
|---|---|
| exposure_ratio | 1.003761 |
| exposure difference | 0.001925 |


In [9]:
result = pd.concat([baseline, df_dis_imp], axis=1).iloc[:, [0,1]]
result.columns = ['Baseline','Mitigator']
result

| Metric | Baseline | Mitigator |
|---|---|---|
| exposure_ratio | 37027.714945 | 1.003761 |
| exposure difference | 0.047272 | 0.001925 |


### **Method: Fair Top-K**

Next, we apply the Fair Top-K algorithm, which works differently from the previous methods: given a ranked list of items, it reorders the list so that the top-K positions are fairer to the protected group.
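
One way to make "fairer top-K" precise is the FA*IR formulation (Zehlike et al.), on which this sketch is based: for every prefix of length k, the ranking must contain at least as many protected items as the α-quantile of a Binomial(k, p) distribution. The snippet assumes that `FairTopK` follows this idea; the helper names are hypothetical, and the multiple-testing correction of α used in FA*IR is left out for brevity.

```python
from math import comb


def binomial_cdf(m, k, p):
    """P(X <= m) for X ~ Binomial(k, p)."""
    return sum(comb(k, i) * p**i * (1 - p) ** (k - i) for i in range(m + 1))


def min_protected_table(top_n, p, alpha):
    """Minimum number of protected items required in each prefix 1..top_n.

    For each prefix length k, this is the smallest m with P(X <= m) >= alpha.
    """
    table = []
    for k in range(1, top_n + 1):
        m = 0
        while binomial_cdf(m, k, p) < alpha:
            m += 1
        table.append(m)
    return table


# Target proportion p = 0.5 and significance alpha = 0.1 for the first 12 positions
print(min_protected_table(12, 0.5, 0.1))
# -> [0, 0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 4]
```

A re-ranking step would then walk down the list and promote the best remaining protected item whenever a prefix falls short of its required count. A larger `p` or `alpha` raises these minimums, so more protected items are pulled towards the top.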

Let's create an unfair list to apply the Fair Top-K algorithm to.

In [10]:
def create_unfair_example(ranking, n):
    """
    Build an unfair ranking in which only the last n results belong to the protected group.
    """
    ranking = ranking.copy()
    ranking['protected'] = False
    # Index the frame directly to avoid chained assignment, which may not write through
    ranking.iloc[-n:, ranking.columns.get_loc('protected')] = True
    return ranking

M = 1
k = 20
p = 0.1
ranking = generate_rankings(M, k, p)

unfair_ranking = create_unfair_example(ranking, 6)

In [11]:
from holisticai.bias.mitigation.postprocessing.fair_topk.transformer import FairTopK

In [12]:
# Bias Mitigation Post-processing
top_n = 20
p = 0.9
alpha = 0.15
fs = FairTopK(top_n=top_n, 
              p=p, 
              alpha=alpha, 
              query_col='X', 
              doc_col='Y', 
              score_col='score', 
              group_col='protected')

re_ranking = fs.transform(unfair_ranking)

Let's compare the original unfair ranking with the re-ranked output:

In [13]:
def compare_results(old , new):
    old = old.copy()
    new = new.copy()
    old.columns = pd.MultiIndex.from_tuples([['Old Rank',col] for col in old.columns])
    new.columns = pd.MultiIndex.from_tuples([['New Rank',col] for col in new.columns])
    return pd.concat([old.reset_index(drop=True),new.reset_index(drop=True)], axis=1)

compare_results(unfair_ranking , re_ranking)

Unnamed: 0_level_0,Old Rank,Old Rank,Old Rank,Old Rank,New Rank,New Rank,New Rank,New Rank
Unnamed: 0_level_1,X,Y,score,protected,X,Y,score,protected
0,0,20,20,False,0,20,20,False
1,0,19,19,False,0,6,6,True
2,0,18,18,False,0,5,5,True
3,0,17,17,False,0,4,4,True
4,0,16,16,False,0,3,3,True
5,0,15,15,False,0,19,19,False
6,0,14,14,False,0,2,2,True
7,0,13,13,False,0,1,1,True
8,0,12,12,False,0,18,18,False
9,0,11,11,False,0,17,17,False
