In [1]:
import sys
sys.path.append('../../../')

## Template - Bias Mitigation Benchmark ([Holistic AI](https://research.holisticai.com))

**Task:** Binary Classification

**Type:** Preprocessing


This notebook is a template for the Bias Mitigation Benchmark. Use it to implement a custom bias mitigator and compare it against the mitigators already in the benchmark. It builds on the [Holistic AI open source library](https://github.com/holistic-ai/holisticai) and follows the bias mitigation benchmark outlined at [Holistic AI](https://research.holisticai.com).

### Template Structure

The template has the following steps:

1. Setup definition: 
    - select a task: `binary_classification`, `multiclass_classification`, `regression`, `clustering`, `recommender`
    - select a type: `inprocessing`, `preprocessing`, `postprocessing`
2. Mitigator class
    - create a class for your custom mitigator
3. Evaluation
    - evaluate your mitigator and compare it with other mitigators
4. Submission
    - do you have good results? Then submit your mitigator to the Bias Mitigation Benchmark

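The mitigator class from step 2 needs a recognisable interface. The sketch below is an assumption inferred from the example class in Step 2 of this notebook (a `fit` and a `transform` method, each taking `X`, `group_a`, `group_b`); the benchmark's actual requirements may differ. `PreprocessingMitigatorLike`, `IdentityMitigator`, and `Incomplete` are hypothetical names used only for illustration.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class PreprocessingMitigatorLike(Protocol):
    """Hypothetical interface, inferred from the Step 2 example class."""

    def fit(self, X, group_a, group_b): ...

    def transform(self, X, group_a, group_b): ...


class IdentityMitigator:
    """Trivial mitigator that returns the features unchanged."""

    def fit(self, X, group_a, group_b):
        return self

    def transform(self, X, group_a, group_b):
        return X


class Incomplete:
    """Missing transform(), so it does not satisfy the interface."""

    def fit(self, X, group_a, group_b):
        return self


# isinstance() on a runtime-checkable Protocol only checks that the
# methods exist; it does not validate their signatures.
has_interface = isinstance(IdentityMitigator(), PreprocessingMitigatorLike)
missing_transform = isinstance(Incomplete(), PreprocessingMitigatorLike)
print(has_interface, missing_transform)
```

A check like this can catch a missing method before a long benchmark run starts.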

### Step 1: Setup Definition

In [2]:
from holisticai.benchmark.tasks import task_name, get_task

print(task_name)

['binary_classification', 'multiclass_classification', 'regression', 'clustering', 'recommender']


In [3]:
# load a task
task = get_task("binary_classification")

In [4]:
# show the current benchmark results for this task, filtered by mitigator type
task.benchmark(type='preprocessing')

Mitigator,Average AFS,adult,credit_card
Reweighing,0.883628,0.865184,0.902072
DisparateImpactRemover,0.877068,0.854816,0.899319
CorrelationRemover,0.874019,0.844925,0.903113
LearningFairRepresentation,0.870486,0.853018,0.887953

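The numbers in the table above are consistent with `Average AFS` being the plain mean of the per-dataset scores. That is an observation from the printed values, not a documented definition. The short check below recomputes the average from the `adult` and `credit_card` columns; the dictionaries are copied by hand from the table.

```python
# Per-dataset scores copied from the benchmark table above.
scores = {
    "Reweighing": {"adult": 0.865184, "credit_card": 0.902072},
    "DisparateImpactRemover": {"adult": 0.854816, "credit_card": 0.899319},
    "CorrelationRemover": {"adult": 0.844925, "credit_card": 0.903113},
    "LearningFairRepresentation": {"adult": 0.853018, "credit_card": 0.887953},
}

# "Average AFS" column as printed in the table.
reported_average = {
    "Reweighing": 0.883628,
    "DisparateImpactRemover": 0.877068,
    "CorrelationRemover": 0.874019,
    "LearningFairRepresentation": 0.870486,
}


def average_score(per_dataset):
    """Arithmetic mean of the per-dataset scores."""
    return sum(per_dataset.values()) / len(per_dataset)


recomputed = {name: average_score(vals) for name, vals in scores.items()}

for name, value in recomputed.items():
    # Table values are rounded to six decimals, so compare with a tolerance.
    matches = abs(value - reported_average[name]) < 1e-6
    print(f"{name}: recomputed {value:.6f}, reported {reported_average[name]:.6f}, match={matches}")
```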

### Step 2: Mitigator Class

In [5]:
import numpy as np

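class MyPreprocessingMitigator:
    """
    Example preprocessing mitigator.

    Fits a linear map from the centered sensitive features to X, then
    subtracts a fraction (alpha) of that linear component from X.
    """

    def fit(self, X, group_a, group_b):
        # One indicator column per group, shape (n_samples, 2)
        sensitive_features = np.stack([group_a, group_b], axis=1).astype(np.int32)
        # Scalar mean over all entries of the indicator matrix
        self.sensitive_mean_ = sensitive_features.mean()

        sensitive_features_center = sensitive_features - self.sensitive_mean_
        # Least-squares coefficients predicting X from the centered indicators
        self.beta_, _, _, _ = np.linalg.lstsq(sensitive_features_center, X, rcond=None)
        self.X_shape_ = X.shape

        return self

    def transform(self, X, group_a, group_b):
        # Mixing weight: 1.0 removes the full linear component, 0.0 returns X unchanged
        alpha = 0.8
        sensitive_features = np.stack([group_a, group_b], axis=1).astype(np.int32)
        # Note: the mean is recomputed from the data passed to transform
        self.sensitive_mean_ = sensitive_features.mean()
        sensitive_features_center = sensitive_features - self.sensitive_mean_

        # Remove the part of X that is linearly explained by group membership
        X_filtered = X - sensitive_features_center.dot(self.beta_)
        X = np.atleast_2d(X)
        X_filtered = np.atleast_2d(X_filtered)

        # Blend the filtered and original features
        return alpha * X_filtered + (1 - alpha) * X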

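Before running the full benchmark, a quick sanity check on synthetic data can show what this kind of mitigator does. The sketch below repeats the Step 2 logic in a compact class (named `LinearFilterMitigator` here, a hypothetical name, with `alpha` moved to the constructor) so that the block runs on its own. The data, the group split, and the `group_gap` helper are all made up for illustration. With two equally sized groups, the difference in group means of a leaking feature should shrink to about `1 - alpha` of its original value.

```python
import numpy as np


class LinearFilterMitigator:
    """Compact copy of the Step 2 logic, repeated so this block runs alone."""

    def __init__(self, alpha=0.8):
        self.alpha = alpha

    def _center(self, group_a, group_b):
        # Indicator matrix (n_samples, 2), centered by its scalar mean
        s = np.stack([group_a, group_b], axis=1).astype(np.int32)
        return s - s.mean()

    def fit(self, X, group_a, group_b):
        centered = self._center(group_a, group_b)
        self.beta_, _, _, _ = np.linalg.lstsq(centered, X, rcond=None)
        return self

    def transform(self, X, group_a, group_b):
        X = np.atleast_2d(X)
        X_filtered = X - self._center(group_a, group_b).dot(self.beta_)
        return self.alpha * X_filtered + (1 - self.alpha) * X


# Synthetic data: two equally sized groups, three features.
rng = np.random.default_rng(0)
n = 1000
group_a = np.arange(n) % 2 == 0
group_b = ~group_a
X = rng.normal(size=(n, 3))
X[:, 0] += 2.0 * group_a  # feature 0 leaks group membership

mitigator = LinearFilterMitigator(alpha=0.8).fit(X, group_a, group_b)
X_new = mitigator.transform(X, group_a, group_b)


def group_gap(M):
    """Absolute difference between the group means of feature 0."""
    return abs(M[group_a, 0].mean() - M[group_b, 0].mean())


gap_before = group_gap(X)
gap_after = group_gap(X_new)
print(f"gap before: {gap_before:.4f}, gap after: {gap_after:.4f}")
```

The remaining gap comes from the `(1 - alpha) * X` term, which deliberately keeps part of the original signal; this check says nothing about the benchmark score itself.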
### Step 3: Evaluation

In [6]:
my_mitigator = MyPreprocessingMitigator()

task.run_benchmark(custom_mitigator=my_mitigator, type='preprocessing')

Binary Classification Benchmark initialized for MyPreprocessingMitigator


100%|██████████| 2/2 [00:31<00:00, 15.65s/it]


In [7]:
task.evaluate_table()

Mitigator,Average AFS,Average Accuracy,adult,credit_card
Reweighing,0.883628,,0.865184,0.902072
DisparateImpactRemover,0.877068,,0.854816,0.899319
CorrelationRemover,0.874019,,0.844925,0.903113
MyPreprocessingMitigator,0.871352,0.83098,0.841409,0.901295
LearningFairRepresentation,0.870486,,0.853018,0.887953

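To decide whether the result is worth submitting, it can help to compute where the custom mitigator ranks. The sketch below uses the `Average AFS` values copied by hand from the table above and assumes that a higher score is better, which matches the table's sort order but is not stated explicitly.

```python
# "Average AFS" values copied from the evaluation table above.
average_afs = {
    "Reweighing": 0.883628,
    "DisparateImpactRemover": 0.877068,
    "CorrelationRemover": 0.874019,
    "MyPreprocessingMitigator": 0.871352,
    "LearningFairRepresentation": 0.870486,
}

custom = "MyPreprocessingMitigator"

# Sort mitigator names from highest to lowest score (assumes higher is better).
ranking = sorted(average_afs, key=average_afs.get, reverse=True)
position = ranking.index(custom) + 1
gap_to_best = average_afs[ranking[0]] - average_afs[custom]

print(f"{custom} ranks {position} of {len(ranking)}")
print(f"gap to {ranking[0]}: {gap_to_best:.6f}")
```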

### Step 4: Submission

In [8]:
task.submit()

Opening the link in your browser: https://forms.office.com/r/Vd6FT4eNL2
