# Day 37: Fairness Auditing Tool

In this lab, we will use the `FairnessAuditor` to evaluate the fairness of a model.
We will simulate a dataset where a sensitive attribute (e.g., gender or race) is correlated with the target, leading to a biased model.

In [None]:
import sys
import os
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

# Add root directory to sys.path
sys.path.append(os.path.abspath('../../'))

from src.fairness.audit import FairnessAuditor

## 1. Simulate Biased Data

We create a dataset where Group 0 (Unprivileged) has a lower probability of a positive outcome (1) than Group 1 (Privileged).

In [None]:
np.random.seed(42)
n = 1000

# Sensitive Attribute (0 or 1)
sensitive = np.random.randint(0, 2, n)

# Feature correlated with sensitive attr
# If sensitive=1, feature is higher on average
feature = np.random.normal(0, 1, n) + sensitive * 1.0

# Target depends only on the feature, regardless of the sensitive attr (a seemingly fair process)
# But since the feature depends on sensitive, the target is still correlated with sensitive (indirect bias)
diff = feature + np.random.normal(0, 0.5, n)
target = (diff > 0.5).astype(int)

df = pd.DataFrame({'Sensitive': sensitive, 'Feature': feature, 'Target': target})
df.head()
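
Before training, it can help to confirm that the simulation produces the intended gap. The sketch below regenerates the data with the same seed and steps as the cell above, then compares the groups. The names `sim` and `group_summary` are illustrative and not part of the lab code. Given the noise levels used, the positive rate should in expectation be roughly 0.33 for Group 0 and 0.67 for Group 1.

```python
import numpy as np
import pandas as pd

# Regenerate the simulated data with the same seed and steps as the lab cell
np.random.seed(42)
n = 1000
sensitive = np.random.randint(0, 2, n)
feature = np.random.normal(0, 1, n) + sensitive * 1.0
target = ((feature + np.random.normal(0, 0.5, n)) > 0.5).astype(int)
sim = pd.DataFrame({'Sensitive': sensitive, 'Feature': feature, 'Target': target})

# Per-group mean of the feature and share of positive targets
group_summary = sim.groupby('Sensitive')[['Feature', 'Target']].mean()
print(group_summary)
```

If the `Target` column of `group_summary` is clearly lower for Group 0, the labels themselves already carry the disparity, even though the sensitive attribute never enters the target rule directly.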

## 2. Train Model

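We train a Logistic Regression model on this data. The model sees only `Feature`, never the sensitive attribute, and we predict on the same rows we trained on.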

In [None]:
X = df[['Feature']]
y = df['Target']

model = LogisticRegression()
model.fit(X, y)

y_pred = model.predict(X)
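
The cell above predicts on the training rows, and `train_test_split` is imported but not used. A possible variant, shown here only as a sketch, holds out part of the data and carries the sensitive attribute through the split so that a later audit can run on unseen rows. The 30% test size, `stratify=y`, and all variable names below are assumptions for illustration, not choices made by the lab.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same simulated data as in the lab (same seed and steps)
np.random.seed(42)
n = 1000
sensitive = np.random.randint(0, 2, n)
feature = np.random.normal(0, 1, n) + sensitive * 1.0
target = ((feature + np.random.normal(0, 0.5, n)) > 0.5).astype(int)
data = pd.DataFrame({'Sensitive': sensitive, 'Feature': feature, 'Target': target})

# Split features, labels, and the sensitive attribute together so rows stay aligned
X_train, X_test, y_train, y_test, s_train, s_test = train_test_split(
    data[['Feature']], data['Target'], data['Sensitive'],
    test_size=0.3, random_state=42, stratify=data['Target'],
)

# Fit on the training rows only, then predict on the held-out rows
holdout_model = LogisticRegression()
holdout_model.fit(X_train, y_train)
y_pred_test = holdout_model.predict(X_test)

print("Held-out accuracy:", holdout_model.score(X_test, y_test))
```

With this variant, the audit would receive `y_test`, `y_pred_test`, and `s_test` in place of the full-data arrays.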

## 3. Run Audit

We pass the true labels, the predictions, and the sensitive attribute to the auditor to check whether the model treats both groups fairly.

In [None]:
auditor = FairnessAuditor()
metrics = auditor.audit(y, y_pred, df['Sensitive'])

report = auditor.generate_report(metrics)
print(report)

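**Analysis**:
- **Disparate Impact**: Conventionally, the ratio of the selection rate of the Unprivileged group (0) to that of the Privileged group (1). A value below 0.8 (the "four-fifths rule") indicates that the Unprivileged group is selected at a significantly lower rate.
- **Demographic Parity Diff**: The absolute difference in selection rates between the two groups. A value of 0 means both groups are selected at the same rate.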
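
To make the two metrics concrete, here is a minimal sketch that computes them by hand with NumPy. It assumes the conventional definitions (ratio and absolute difference of per-group selection rates). The `FairnessAuditor` implementation is not shown in this lab and may differ in details such as group ordering or edge-case handling. The helper names and the toy arrays are hypothetical.

```python
import numpy as np

def selection_rate(y_pred, sensitive, group):
    """Share of positive predictions within one group."""
    y_pred = np.asarray(y_pred)
    sensitive = np.asarray(sensitive)
    return float(y_pred[sensitive == group].mean())

def disparate_impact(y_pred, sensitive, unprivileged=0, privileged=1):
    """Selection rate of the unprivileged group divided by that of the privileged group."""
    rate_priv = selection_rate(y_pred, sensitive, privileged)
    if rate_priv == 0:
        return float('nan')  # ratio is undefined if the privileged group is never selected
    return selection_rate(y_pred, sensitive, unprivileged) / rate_priv

def demographic_parity_diff(y_pred, sensitive, unprivileged=0, privileged=1):
    """Absolute difference in selection rates between the two groups."""
    return abs(
        selection_rate(y_pred, sensitive, privileged)
        - selection_rate(y_pred, sensitive, unprivileged)
    )

# Toy example: group 0 is selected 1 time out of 4, group 1 is selected 3 times out of 4
toy_pred = np.array([1, 0, 0, 0, 1, 1, 1, 0])
toy_sensitive = np.array([0, 0, 0, 0, 1, 1, 1, 1])

di = disparate_impact(toy_pred, toy_sensitive)
dpd = demographic_parity_diff(toy_pred, toy_sensitive)
print(f"Disparate Impact: {di:.3f}")         # 0.25 / 0.75
print(f"Demographic Parity Diff: {dpd:.3f}")  # |0.75 - 0.25|
```

In the toy example the disparate impact is 0.25 / 0.75 ≈ 0.333, well below the 0.8 threshold, and the demographic parity difference is 0.5. Applying the same helpers to `y_pred` and `df['Sensitive']` from the lab gives a quick cross-check of the auditor's report.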