In [1]:
import sys
sys.path.append('../../')

## Template - Bias Mitigation Benchmark ([Holistic AI](https://research.holisticai.com))

**Task:** Regression

**Type:** Preprocessing


This notebook is a template for the Bias Mitigation Benchmark. Use it to implement a custom bias mitigator and compare it against the mitigators already in the benchmark. The notebook is based on the [Holistic AI open source library](https://github.com/holistic-ai/holisticai) and follows the bias mitigation benchmark outlined in [Holistic AI](https://research.holisticai.com).

### Template Structure

The template has the following steps:

1. Setup definition: 
    - select a task: `binary_classification`, `multiclass_classification`, `regression`, `clustering`, `recommender`
    - select a type: `inprocessing`, `preprocessing`, `postprocessing`
2. Mitigator class
    - create a class for your custom mitigator
3. Evaluation
    - evaluate your mitigator and compare it with other mitigators
4. Submission
    - do you have good results? Then submit your mitigator to the Bias Mitigation Benchmark


### Step 1: Setup Definition

In [2]:
from holisticai.benchmark.tasks import task_name, get_task

print(task_name)

['binary_classification', 'multiclass_classification', 'regression', 'clustering', 'recommender']


In [3]:
# load a task
task = get_task("regression")

In [4]:
# show the current benchmark results for this task, filtered by mitigator type
data = task.benchmark(type='preprocessing')
data

| Mitigator | Average RFS | crime |
|---|---|---|
| CorrelationRemover | 0.978462 | 0.978462 |
| DisparateImpactRemover | 0.885045 | 0.885045 |


### Step 2: Mitigator Class

In [5]:
import numpy as np

from holisticai.utils.transformers.bias import BMPreprocessing as BMPre

class MyPreprocessingMitigator(BMPre):
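    """
    Example preprocessing mitigator, based on the CorrelationRemover implemented in the holisticai library.

    Parameters
    ----------
    alpha : float
        Blending weight: 1 returns the fully filtered data, 0 returns the original data.
    """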

    def __init__(self, alpha=1):
        self.alpha = alpha

    def fit(self, X: np.ndarray, group_a: np.ndarray, group_b: np.ndarray):
        """
        Fit the mitigator by regressing X on the centred group indicators.

        Parameters
        ----------
        X : matrix-like
            Input data
        group_a : array-like
            Group membership vector (binary)
        group_b : array-like
            Group membership vector (binary)

        Returns
        -------
        self
        """
        params = self._load_data(X=X, group_a=group_a, group_b=group_b)
        X = params["X"]
        group_a = params["group_a"]
        group_b = params["group_b"]

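        # Stack the two membership vectors into a two-column matrix.
        sensitive_features = np.stack([group_a, group_b], axis=1).astype(np.int32)
        # Centre with a single scalar mean taken over both columns.
        self.sensitive_mean_ = sensitive_features.mean()
        sensitive_features_center = sensitive_features - self.sensitive_mean_
        # Least-squares coefficients that predict X from the centred group indicators.
        self.beta_, _, _, _ = np.linalg.lstsq(sensitive_features_center, X, rcond=None)
        self.X_shape_ = X.shape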

        return self

    def transform(self, X: np.ndarray, group_a: np.ndarray, group_b: np.ndarray):

        """
        Transform X by applying the correlation remover.

        Parameters
        ----------
        X : matrix-like
            Input matrix
        group_a : array-like
            Group membership vector (binary)
        group_b : array-like
            Group membership vector (binary)

        Returns
        -------
        np.ndarray
            Transformed data
        """

        params = self._load_data(X=X, group_a=group_a, group_b=group_b)
        X = params["X"]
        group_a = params["group_a"]
        group_b = params["group_b"]

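        sensitive_features = np.stack([group_a, group_b], axis=1).astype(np.int32)
        # Note: the mean is recomputed from the data passed to transform,
        # replacing the value stored by fit.
        self.sensitive_mean_ = sensitive_features.mean()
        sensitive_features_center = sensitive_features - self.sensitive_mean_
        # Subtract the part of X that the group indicators predict linearly.
        X_filtered = X - sensitive_features_center.dot(self.beta_)
        X = np.atleast_2d(X)
        X_filtered = np.atleast_2d(X_filtered)
        # alpha=1 returns the fully filtered data, alpha=0 the original X.
        return self.alpha * X_filtered + (1 - self.alpha) * X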

    def fit_transform(
        self,
        X: np.ndarray,
        group_a: np.ndarray,
        group_b: np.ndarray,
    ):
        """
        Fit the mitigator on X and return the transformed X.

        Parameters
        ----------
        X : matrix-like
            Input data
        group_a : array-like
            Group membership vector (binary)
        group_b : array-like
            Group membership vector (binary)

        Returns
        -------
        np.ndarray
            Transformed data
        """
        return self.fit(X, group_a, group_b).transform(X, group_a, group_b)

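To see what the class above computes, here is a minimal standalone sketch in plain NumPy. It does not use `BMPreprocessing` or `_load_data`; the helper names `fit_remover` and `apply_remover` and the synthetic data are illustrative assumptions, not part of the holisticai API. It mirrors the arithmetic of `fit` and `transform` and checks that the filtered features are orthogonal to the centred group indicators.

```python
import numpy as np


def fit_remover(X, group_a, group_b):
    """Hypothetical helper: least-squares fit of X on the centred group indicators."""
    S = np.stack([group_a, group_b], axis=1).astype(np.int32)
    mean = S.mean()  # scalar mean over both columns, as in the class above
    beta, _, _, _ = np.linalg.lstsq(S - mean, X, rcond=None)
    return mean, beta


def apply_remover(X, group_a, group_b, mean, beta, alpha=1.0):
    """Hypothetical helper: subtract the group-predicted part of X, blended by alpha."""
    S = np.stack([group_a, group_b], axis=1).astype(np.int32)
    X_filtered = X - (S - mean).dot(beta)
    return alpha * X_filtered + (1 - alpha) * X


# Synthetic data (assumption): feature 0 depends on the group, feature 1 is pure noise.
rng = np.random.default_rng(0)
n = 500
group_a = rng.integers(0, 2, size=n)
group_b = 1 - group_a
X = np.column_stack([2.0 * group_a + rng.normal(size=n), rng.normal(size=n)])

mean, beta = fit_remover(X, group_a, group_b)
X_full = apply_remover(X, group_a, group_b, mean, beta, alpha=1.0)
X_none = apply_remover(X, group_a, group_b, mean, beta, alpha=0.0)

# The least-squares residual is orthogonal to the centred indicators.
S_centred = np.stack([group_a, group_b], axis=1) - mean
print("max |S_centred.T @ X_full|:", np.abs(S_centred.T @ X_full).max())


def group_gap(data):
    """Difference between the group means of feature 0."""
    return data[group_a == 1, 0].mean() - data[group_b == 1, 0].mean()


print("group gap before:", group_gap(X))
print("group gap after: ", group_gap(X_full))
```

With `alpha=0` the data is returned unchanged, and with `alpha=1` the linear dependence on the centred indicators is removed. Because the regression has no intercept, the guaranteed property is orthogonality; the gap between group means need not be exactly zero when the groups differ in size.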

### Step 3: Evaluation

In [6]:
my_mitigator = MyPreprocessingMitigator()

task.run_benchmark(mitigator=my_mitigator, type='preprocessing')

Regression Benchmark initialized for MyPreprocessingMitigator


100%|██████████| 1/1 [00:04<00:00,  4.39s/it]


In [7]:
task.evaluate_table()

| Mitigator | Average RFS | crime |
|---|---|---|
| CorrelationRemover | 0.978462 | 0.978462 |
| MyPreprocessingMitigator | 0.978462 | 0.978462 |
| DisparateImpactRemover | 0.885045 | 0.885045 |


### Step 4: Submission

In [8]:
task.submit()

Opening the link in your browser:
