# Batch Fairness

In this notebook we compare the fairness of the different batch algorithms and experiment with ways to improve it.

We define fairness as the fraction of the queried samples that belong to a given class. For simplicity's sake, we use only two classes and define fairness as:
$$
\text{Fairness} = \frac{\text{number of samples from the class}}{\text{total number of samples}}
$$
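
As a small worked example of the formula, the sketch below computes the fairness of a single made-up batch for both classes. The function name `fairness` and the batch contents are illustrative assumptions, not part of the experiment itself.

```python
def fairness(batch_labels, label):
    """Fraction of the labels in `batch_labels` that equal `label`."""
    matches = sum(1 for batch_label in batch_labels if batch_label == label)
    return matches / len(batch_labels)


# Hypothetical batch of five queried labels, two of which belong to class 1.
batch = [0, 1, 1, 0, 0]

fairness_class_1 = fairness(batch, 1)  # 2 / 5
fairness_class_0 = fairness(batch, 0)  # 3 / 5
print(fairness_class_1, fairness_class_0)
```

With only two classes the two values always sum to 1, so a fairness of 0.5 means the batch is perfectly balanced.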

In [1]:
# imports
import numpy as np

# scikit-activeml
from skactiveml.classifier import SklearnClassifier
from skactiveml.pool import UncertaintySampling, BatchBALD
from skactiveml.utils import MISSING_LABEL

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# warnings
import warnings
warnings.filterwarnings("ignore")

In [2]:
def check_fairness(samples, label):
    """Return the fraction of `samples` that equal `label`."""
    amount = 0
    for sample in samples:
        if sample == label:
            amount += 1
    return amount / len(samples)

In [3]:
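def al_batch_fairness(iterations=100, batch_size=1, weights=[0.8, 0.2], data_size=400, query_count=10):
    """Mean class-1 fraction per query step over all iterations, using entropy-based uncertainty sampling."""
    data = []
    qs = UncertaintySampling(method='entropy')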
    for rand in range(iterations):
        # Create the data
        Xf, yf = make_classification(n_samples=data_size, n_features=2, n_redundant=0, weights=weights, random_state=rand)
        y = np.full(shape=yf.shape, fill_value=MISSING_LABEL)
        
        clf = SklearnClassifier(LogisticRegression(), classes=np.unique(yf))
        clf.fit(Xf, y)
        out = []
        for _ in range(query_count):
            query_idx = qs.query(Xf, y, clf=clf, batch_size=batch_size)
            y[query_idx] = yf[query_idx]
            clf.fit(Xf, y)
            
            out.append(check_fairness(yf[query_idx], 1))
        data.append(out)
    return np.mean(np.array(data), axis=0)

def al_bald_fairness(iterations=100, batch_size=1, weights=[0.8, 0.2], data_size=400, query_count=10):
    """Mean class-1 fraction per query step over all iterations, using BatchBALD."""
    data = []
    qs = BatchBALD()
    for rand in range(iterations):
        # Create the data
        Xf, yf = make_classification(n_samples=data_size, n_features=2, n_redundant=0, weights=weights, random_state=rand)
        y = np.full(shape=yf.shape, fill_value=MISSING_LABEL)
        # BatchBALD queries with an ensemble; here it holds a single logistic regression
        ensemble = []
        ensemble.append(SklearnClassifier(LogisticRegression(), classes=np.unique(yf)))
        for clf in ensemble:
            clf.fit(Xf, y)
        out = []
        for _ in range(query_count):
            query_idx = qs.query(Xf, y, ensemble=ensemble, batch_size=batch_size)
            y[query_idx] = yf[query_idx]

            for clf in ensemble:
                clf.fit(Xf, y)
                
            out.append(check_fairness(yf[query_idx], 1))
        data.append(out)
    return np.mean(np.array(data), axis=0)

In [4]:
batch_fairnesses = al_batch_fairness()

In [5]:
bald_fairnesses = al_bald_fairness()

In [6]:
print(f'Normal batch fairness: {np.mean(batch_fairnesses)}')
print(f'Batch bald fairness: {np.mean(bald_fairnesses)}')

Normal batch fairness: 0.363
Batch bald fairness: 0.11099999999999999


These scores seem to align with the performance of the models: a query strategy that is fairer also appears to perform better. We want the scores to be close to 50%, and uncertainty sampling with normal batches comes closest (0.363, compared with 0.111 for BatchBALD).
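
For context on the two scores printed above, it can help to know what purely random querying would give. With `weights = [0.8, 0.2]`, roughly 20% of the pool belongs to class 1, so a random batch should score about 0.2. The sketch below estimates that baseline with the standard library only. It is a simplified assumption-laden stand-in: the function name `random_baseline_fairness` is hypothetical, and the labels are drawn directly as Bernoulli samples instead of coming from `make_classification`, so any label noise that function adds is ignored.

```python
import random


def random_baseline_fairness(minority_weight=0.2, data_size=400, batch_size=1,
                             query_count=10, iterations=100, seed=0):
    """Estimate the mean class-1 fraction per batch under uniform random querying."""
    rng = random.Random(seed)
    scores = []
    for _ in range(iterations):
        # Synthetic pool: each sample is class 1 with probability `minority_weight`.
        labels = [1 if rng.random() < minority_weight else 0 for _ in range(data_size)]
        unlabeled = list(range(data_size))
        for _ in range(query_count):
            # Query a random batch without replacement from the unlabeled pool.
            batch = rng.sample(unlabeled, batch_size)
            for idx in batch:
                unlabeled.remove(idx)
            scores.append(sum(labels[idx] == 1 for idx in batch) / batch_size)
    return sum(scores) / len(scores)


baseline = random_baseline_fairness()
print(f'Random baseline fairness: {baseline}')
```

Read against this baseline, a score above 0.2 means the strategy queries the minority class more often than chance, and a score below 0.2 means it queries it less often.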