# **Measuring Efficacy in Classification**


This notebook is a tutorial on auditing efficacy in a binary classification task, using the **efficacy metrics** module of the holisticai library.
The sections are organised as follows:

1. Load the data: we load the Law School dataset with holisticai's `load_dataset`.
2. Train a model: we train a simple logistic regression model (sklearn).
3. Measure efficacy: we compute a few efficacy metrics.

## **1. Load the data**

The holisticai library ships a few example datasets for quick loading and experimentation. Here we load the Law School dataset, where the task is to predict the binary attribute 'bar' (whether a student passes the bar exam). The protected attributes are race and gender. We pay special attention to race in this case, because preliminary exploration hints at strong inequality on that sensitive attribute.

In [1]:
# Get data
from holisticai.datasets import load_dataset
dataset = load_dataset("law_school")
dataset

## **2. Train a model**

Here we split the data 80/20 into train and test sets, standardise the features, and train a logistic regression classifier.

In [2]:
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

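# Load the dataset and split it into train and test sets (80/20)
dataset = load_dataset("law_school")
dataset = dataset.train_test_split(test_size=0.2, random_state=42)
train = dataset['train']
test = dataset['test']

y_train = train['y']
y_test = test['y']

# Standardise the features: fit the scaler on the training split only
scaler = StandardScaler()
X_train_t = scaler.fit_transform(train['X'])

# Train the logistic regression classifier
LR = LogisticRegression(random_state=42, max_iter=500)
LR.fit(X_train_t, y_train)

# Apply the training-set scaling to the test split, then predict
X_test_t = scaler.transform(test['X'])
y_pred = LR.predict(X_test_t)              # hard class labels
y_proba = LR.predict_proba(X_test_t)       # class probabilities, one column per class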
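
The scale-then-fit steps above can also be bundled into a single scikit-learn pipeline, which makes it harder to accidentally fit the scaler on test data. The following is a minimal sketch on synthetic data, not part of the original notebook: the data from `make_classification` and the names `X_demo`, `demo_model`, `demo_pred` and `demo_proba` are assumptions introduced for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (NOT the Law School dataset)
X_demo, y_demo = make_classification(n_samples=500, n_features=6, random_state=42)
X_demo_train, X_demo_test, y_demo_train, y_demo_test = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# The pipeline fits the scaler on the training data only and
# reuses the same scaling whenever it predicts
demo_model = make_pipeline(
    StandardScaler(),
    LogisticRegression(random_state=42, max_iter=500),
)
demo_model.fit(X_demo_train, y_demo_train)

demo_pred = demo_model.predict(X_demo_test)         # hard class labels
demo_proba = demo_model.predict_proba(X_demo_test)  # one probability column per class
```

With a pipeline, `demo_model.predict` applies the stored scaling automatically, so there is no separate `scaler.transform` call to forget.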

## **3. Measure Efficacy**

In [3]:
from holisticai.efficacy.metrics import classification_efficacy_metrics

classification_efficacy_metrics(y_pred, y_test, y_proba)

| Metric | Value | Reference |
|---|---|---|
| Accuracy | 0.903125 | 1 |
| Balanced Accuracy | 0.791643 | 1 |
| Precision | 0.985683 | 1 |
| Recall | 0.912478 | 1 |
| F1-Score | 0.947669 | 1 |
| AUC | 0.791643 | 1 |
| Log Loss | 3.491729 | 0 |
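
To make these metrics concrete, the sketch below recomputes each of them with scikit-learn on synthetic data. It is an illustration rather than holisticai's implementation: the data from `make_classification` and the names `clf`, `pred`, `proba`, `label_metrics` and `score_metrics` are assumptions introduced here. It also illustrates something worth checking in the output above, where AUC and Balanced Accuracy are both 0.791643: for binary labels, ROC AUC computed from hard 0/1 predictions equals balanced accuracy, so identical values may indicate that AUC was derived from `y_pred` rather than from probability scores.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import (
    accuracy_score,
    balanced_accuracy_score,
    f1_score,
    log_loss,
    precision_score,
    recall_score,
    roc_auc_score,
)
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced stand-in data (NOT the Law School dataset)
X, y = make_classification(
    n_samples=2000, n_features=8, weights=[0.2, 0.8], random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

clf = LogisticRegression(max_iter=500).fit(X_tr, y_tr)
pred = clf.predict(X_te)         # hard labels
proba = clf.predict_proba(X_te)  # probabilities, one column per class

# Metrics that only need hard labels (scikit-learn order: y_true, y_pred)
label_metrics = {
    "Accuracy": accuracy_score(y_te, pred),
    "Balanced Accuracy": balanced_accuracy_score(y_te, pred),
    "Precision": precision_score(y_te, pred),
    "Recall": recall_score(y_te, pred),
    "F1-Score": f1_score(y_te, pred),
}

# Metrics that are meant to be computed from probability scores
score_metrics = {
    "AUC": roc_auc_score(y_te, proba[:, 1]),
    "Log Loss": log_loss(y_te, proba),
}

# ROC AUC from hard labels collapses to balanced accuracy
auc_from_labels = roc_auc_score(y_te, pred)

for name, value in {**label_metrics, **score_metrics}.items():
    print(f"{name}: {value:.6f}")
print(f"AUC from hard labels: {auc_from_labels:.6f}")
```

On imbalanced data such as this, accuracy can look strong while balanced accuracy is noticeably lower, which is why both are reported side by side.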
