# **Measuring Efficacy in Multiclass Classification**


This tutorial explains how to measure the efficacy of a model on a multiclass classification task using the holisticai library. We introduce some of the functions that help evaluate how well a classifier performs, overall and per class.

The sections are organised as follows:
1. Load the data: we load the student dataset and split it into train and test sets.
2. Train a Model: we train a scikit-learn random forest classifier.
3. Measure Efficacy: we compute a confusion matrix and a few efficacy metrics.

## **1. Load the data**

The student dataset can be easily loaded with the `load_dataset` function.

In [1]:
from holisticai.datasets import load_dataset
from sklearn.ensemble import RandomForestClassifier

# load data
dataset = load_dataset('student_multiclass')
dataset = dataset.train_test_split(test_size=0.2, random_state=42)
train = dataset['train']
test = dataset['test']
y_test = test['y']

## **2. Train a Model**

In [2]:
# Train a simple Random Forest Classifier
model = RandomForestClassifier(random_state=111)
model.fit(train['X'], train['y'])

# Predict values
y_pred = model.predict(test['X'])
y_proba = model.predict_proba(test['X'])

## **3. Measure Efficacy**

In [3]:
from holisticai.efficacy.metrics import confusion_matrix

# Predictions come first, then the ground truth; normalize='true'
# normalizes the counts over the true labels
confusion_matrix(y_pred, y_test, normalize='true')

|   | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 0.518519 | 0.25 | 0.166667 |
| 1 | 0.185185 | 0.0625 | 0.083333 |
| 2 | 0.296296 | 0.6875 | 0.75 |
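
In the output above each column sums to 1 and the diagonal matches the per-class recall reported in the next cell, which suggests that rows are predicted classes and columns are true classes. The following is a minimal pure-Python sketch of that normalization, not the library's implementation. The function name `normalized_confusion_matrix` and the toy labels are hypothetical.

```python
from collections import Counter


def normalized_confusion_matrix(y_pred, y_true, classes):
    """Rows are predicted classes, columns are true classes.

    Each column is divided by the number of samples of that true class,
    so every column sums to 1 (assuming each class occurs at least once).
    """
    pair_counts = Counter(zip(y_pred, y_true))
    true_counts = Counter(y_true)
    return [
        [pair_counts[(p, t)] / true_counts[t] if true_counts[t] else 0.0
         for t in classes]
        for p in classes
    ]


# Hypothetical toy labels, not taken from the student dataset
y_true_toy = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
y_pred_toy = [0, 0, 1, 2, 1, 2, 2, 2, 2, 0]

matrix = normalized_confusion_matrix(y_pred_toy, y_true_toy, classes=[0, 1, 2])
for row in matrix:
    print(row)
```

With this orientation, the diagonal entry for a class is the fraction of its true samples that were predicted correctly, i.e. its recall.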


In [4]:
from holisticai.efficacy.metrics import multiclassification_efficacy_metrics

multiclassification_efficacy_metrics(y_pred, y_test, by_class=True)

| Metric | Value | 0 | 1 | 2 | Reference |
|---|---|---|---|---|---|
| Accuracy | 0.531646 | | | | 1 |
| Balanced Accuracy | 0.443673 | | | | 1 |
| Precision | 0.531646 | 0.583333 | 0.111111 | 0.586957 | 1 |
| Recall | 0.531646 | 0.518519 | 0.0625 | 0.75 | 1 |
| F1-Score | 0.531646 | 0.54902 | 0.08 | 0.658537 | 1 |
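
To see where these numbers come from, we can recompute them by hand. The sketch below is a worked check, not the library's code. It assumes raw counts inferred from the normalized confusion matrix above, with 27, 16 and 36 true samples for classes 0, 1 and 2. These supports are an inference that happens to reproduce every reported value; they are not printed anywhere in the notebook.

```python
# Counts inferred from the normalized confusion matrix (an assumption, see text).
# Rows are predicted classes, columns are true classes.
counts = [
    [14, 4, 6],
    [5, 1, 3],
    [8, 11, 27],
]
classes = range(3)

total = sum(sum(row) for row in counts)
correct = sum(counts[k][k] for k in classes)

accuracy = correct / total
# Precision: correct predictions divided by everything predicted as class k (row sum)
precision = [counts[k][k] / sum(counts[k]) for k in classes]
# Recall: correct predictions divided by all true samples of class k (column sum)
recall = [counts[k][k] / sum(counts[r][k] for r in classes) for k in classes]
f1 = [2 * p * r / (p + r) for p, r in zip(precision, recall)]
# Balanced accuracy: unweighted mean of the per-class recalls
balanced_accuracy = sum(recall) / len(recall)

print("accuracy", round(accuracy, 6))
print("balanced accuracy", round(balanced_accuracy, 6))
print("precision", [round(v, 6) for v in precision])
print("recall", [round(v, 6) for v in recall])
print("f1", [round(v, 6) for v in f1])
```

The aggregate `Value` reported for Precision, Recall and F1-Score equals the accuracy (0.531646), which is what micro-averaging yields in a single-label multiclass problem. Class 1 is clearly the weak spot: only 1 of its 16 inferred samples is recovered, which pulls balanced accuracy well below plain accuracy.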


In [5]:
multiclassification_efficacy_metrics(y_pred, y_test, y_proba, by_class=True)

| Metric | Value | 0 | 1 | 2 | Reference |
|---|---|---|---|---|---|
| Accuracy | 0.531646 | | | | 1 |
| Balanced Accuracy | 0.443673 | | | | 1 |
| Precision | 0.531646 | 0.583333 | 0.111111 | 0.586957 | 1 |
| Recall | 0.531646 | 0.518519 | 0.0625 | 0.75 | 1 |
| F1-Score | 0.531646 | 0.54902 | 0.08 | 0.658537 | 1 |
| AUC | 0.653253 | 0.737179 | 0.522321 | 0.700258 | 1 |
| Log Loss | 0.993429 | | | | 0 |
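
Passing `y_proba` adds two probability-based rows, AUC and Log Loss. The aggregate AUC (0.653253) equals the unweighted mean of the three per-class values, and Log Loss has a reference of 0 because lower is better. The sketch below illustrates both quantities in pure Python on hypothetical toy probabilities. The helper names `log_loss` and `one_vs_rest_auc` are our own, and the library may compute these metrics differently.

```python
import math


def log_loss(y_true, y_proba, classes):
    """Mean negative log-probability assigned to the true class."""
    index = {c: i for i, c in enumerate(classes)}
    eps = 1e-15  # clip probabilities to avoid log(0)
    losses = [
        -math.log(min(max(row[index[t]], eps), 1.0))
        for t, row in zip(y_true, y_proba)
    ]
    return sum(losses) / len(losses)


def one_vs_rest_auc(y_true, y_proba, classes, positive):
    """Chance that a random sample of the positive class gets a higher
    score for that class than a random sample of any other class
    (ties count as one half)."""
    col = list(classes).index(positive)
    pos = [row[col] for t, row in zip(y_true, y_proba) if t == positive]
    neg = [row[col] for t, row in zip(y_true, y_proba) if t != positive]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the student dataset
classes_toy = [0, 1, 2]
y_true_toy = [0, 1, 2, 2]
y_proba_toy = [
    [0.7, 0.2, 0.1],
    [0.3, 0.3, 0.4],
    [0.2, 0.2, 0.6],
    [0.5, 0.1, 0.4],
]

loss = log_loss(y_true_toy, y_proba_toy, classes_toy)
per_class_auc = [
    one_vs_rest_auc(y_true_toy, y_proba_toy, classes_toy, c) for c in classes_toy
]
macro_auc = sum(per_class_auc) / len(per_class_auc)
print(loss, per_class_auc, macro_auc)
```

The pairwise AUC used here is quadratic in the number of samples, which is fine for illustration but slower than rank-based implementations on real data.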
