# Lab: Detecting AI Bias in a Hiring Scenario

In this lab, you'll train two logistic regression models to predict hiring decisions, one that uses `Gender` as a feature and one that doesn't, and compare their predictions to investigate whether gender introduces bias.

**Dataset:** `hiring_dataset.csv`  
**Features:** `Experience`, `Gender`  
**Target:** `Shortlisted` (Selected / Not Selected)

# Load and Explore Data

In [None]:
import pandas as pd
import numpy as np

# Load dataset
df = pd.read_csv("hiring_dataset.csv")

# Preview
print(df.head())
print(df.isnull().sum())

# Encode Gender

In [None]:
# Encode Gender: Male = 0, Female = 1
df['Gender_encoded'] = df['Gender'].map({'Male': 0, 'Female': 1})
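
`Series.map` with a dictionary turns any value that isn't a key into `NaN`, so inconsistent spellings in the `Gender` column would silently become missing values. The sketch below uses a few made-up rows (not taken from `hiring_dataset.csv`) to show one way to check for this; the cleanup step is an assumption about how the data might be messy, and your file may not need it.

```python
import pandas as pd

# Hypothetical rows standing in for hiring_dataset.csv; the real values may differ.
df = pd.DataFrame({"Gender": ["Male", "Female", "female", "Male "]})
mapping = {"Male": 0, "Female": 1}

# Mapping the raw column leaves NaN wherever the spelling is not an exact key.
raw_unmapped = int(df["Gender"].map(mapping).isna().sum())

# Assumed cleanup: trim whitespace and normalize capitalization before mapping.
cleaned = df["Gender"].str.strip().str.title()
df["Gender_encoded"] = cleaned.map(mapping)
n_unmapped = int(df["Gender_encoded"].isna().sum())

print("Unmapped before cleanup:", raw_unmapped)
print("Unmapped after cleanup:", n_unmapped)
```

If the count after encoding is not zero on the real data, inspect `df['Gender'].unique()` before training, since `LogisticRegression` does not accept `NaN` inputs.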

# Define features and target

In [None]:
# Define features and target
X_with_gender = df[['Gender_encoded', 'Experience']]
X_without_gender = df[['Experience']]
y = df['Shortlisted']
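
The lab header lists the target values as Selected / Not Selected. If the column stores those as text rather than 0/1, it can help to encode them explicitly so that the positive class is unambiguous in the later metrics. This is a minimal sketch on made-up rows; whether your copy of the dataset needs it is an assumption to verify with `df['Shortlisted'].unique()`.

```python
import pandas as pd

# Hypothetical target values; check the real column before relying on this mapping.
df = pd.DataFrame({"Shortlisted": ["Selected", "Not Selected", "Selected"]})

# Treat "Selected" as the positive class (1).
y = df["Shortlisted"].map({"Not Selected": 0, "Selected": 1})

# Class balance is worth a look before interpreting accuracy.
print(y.value_counts())
```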

# Train-Test Split

In [None]:
from sklearn.model_selection import train_test_split

# Split data
X_train_wg, X_test_wg, y_train, y_test = train_test_split(X_with_gender, y, test_size=0.3, random_state=42)
X_train_wo, X_test_wo, _, _ = train_test_split(X_without_gender, y, test_size=0.3, random_state=42)
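
The two calls above select the same rows because they share `random_state` and the inputs have the same length. An alternative that makes this alignment explicit is to split the whole DataFrame once and pick feature columns afterward. The sketch below uses synthetic data (generated here, not the lab dataset), and `stratify` is an optional extra that keeps the class ratio similar in both splits.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the lab data; column names follow the lab.
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "Gender_encoded": rng.integers(0, 2, size=n),
    "Experience": rng.integers(0, 15, size=n),
})
df["Shortlisted"] = (df["Experience"] + rng.normal(0, 3, size=n) > 7).astype(int)

# Split rows once so both feature sets are guaranteed to share train/test rows.
train_df, test_df = train_test_split(
    df, test_size=0.3, random_state=42, stratify=df["Shortlisted"]
)

X_train_wg = train_df[["Gender_encoded", "Experience"]]
X_test_wg = test_df[["Gender_encoded", "Experience"]]
X_train_wo = train_df[["Experience"]]
X_test_wo = test_df[["Experience"]]
y_train, y_test = train_df["Shortlisted"], test_df["Shortlisted"]

print(len(train_df), len(test_df))
```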

# Train the Model

In [None]:
from sklearn.linear_model import LogisticRegression

# Train models
## YOUR CODE HERE

# Predict Probabilities

In [None]:
## YOUR CODE HERE
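
As a reference for the training and prediction steps, here is one possible sketch on synthetic data, not a solution for `hiring_dataset.csv`. The labels are generated with a deliberate penalty for `Gender_encoded == 1` so there is something to detect. The names `model_wg`, `model_wo`, and `test_df` are illustrative; the `Pred_with_gender` and `Pred_without_gender` column names match the ones used in the plotting cell.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic data with a built-in bias against Gender_encoded == 1 (an assumption for illustration).
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "Gender_encoded": rng.integers(0, 2, size=n),
    "Experience": rng.integers(0, 15, size=n),
})
score = df["Experience"] - 3 * df["Gender_encoded"] + rng.normal(0, 2, size=n)
df["Shortlisted"] = (score > 5).astype(int)

train_df, test_df = train_test_split(df, test_size=0.3, random_state=42)
test_df = test_df.copy()  # avoid modifying a slice of df

# Fit one model with gender and one without.
model_wg = LogisticRegression().fit(
    train_df[["Gender_encoded", "Experience"]], train_df["Shortlisted"]
)
model_wo = LogisticRegression().fit(train_df[["Experience"]], train_df["Shortlisted"])

# predict_proba returns one column per class, ordered as in model.classes_.
pos_wg = list(model_wg.classes_).index(1)
pos_wo = list(model_wo.classes_).index(1)
test_df["Pred_with_gender"] = model_wg.predict_proba(
    test_df[["Gender_encoded", "Experience"]]
)[:, pos_wg]
test_df["Pred_without_gender"] = model_wo.predict_proba(test_df[["Experience"]])[:, pos_wo]

print("Gender coefficient:", model_wg.coef_[0][0])
```

Looking up the positive class in `classes_` instead of hard-coding column 1 keeps the code correct if the labels are strings or ordered differently.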

# Visualize Predictions

In [None]:
## YOUR CODE HERE

import seaborn as sns
import matplotlib.pyplot as plt

# Model WITH Gender
plt.figure(figsize=(10, 6))
sns.boxplot(x='Gender', y='Pred_with_gender', data=...)  # YOUR CODE HERE: replace ... with the DataFrame holding these columns
plt.title('Predicted Probability of Shortlisting (Model WITH Gender)')
plt.show()

# Model WITHOUT Gender
plt.figure(figsize=(10, 6))
sns.boxplot(x='Gender', y='Pred_without_gender', data=...)  # YOUR CODE HERE: replace ... with the DataFrame holding these columns
plt.title('Predicted Probability of Shortlisting (Model WITHOUT Gender)')
plt.show()
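
A boxplot shows the distributions, but a numeric summary makes the comparison easier to report. The sketch below computes the mean predicted probability per gender for each model on a few made-up rows; the numbers are invented for illustration and say nothing about what the real dataset will show. `results` is a hypothetical name for whatever DataFrame holds your predictions.

```python
import pandas as pd

# Invented predictions, only to show the calculation.
results = pd.DataFrame({
    "Gender": ["Male", "Male", "Female", "Female"],
    "Pred_with_gender": [0.80, 0.60, 0.40, 0.30],
    "Pred_without_gender": [0.70, 0.55, 0.65, 0.50],
})

# Mean predicted probability of shortlisting for each group and each model.
summary = results.groupby("Gender")[["Pred_with_gender", "Pred_without_gender"]].mean()

# Gap between groups: a large gap for one model and a small one for the other is worth discussing.
gap = summary.loc["Male"] - summary.loc["Female"]

print(summary)
print(gap)
```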


# Evaluate the Model

In [None]:
## YOUR CODE HERE

from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

# Model WITH Gender
print("Accuracy (With Gender):", accuracy_score())
print("Confusion Matrix (With Gender):\n", confusion_matrix(y_test, ))
print("AUC (With Gender):", roc_auc_score())

# Model WITHOUT Gender
print("Accuracy (Without Gender):", accuracy_score())
print("Confusion Matrix (Without Gender):\n", confusion_matrix())
print("AUC (Without Gender):", roc_auc_score())
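
When filling in the calls above, keep in mind that `accuracy_score` and `confusion_matrix` compare true labels against predicted class labels, while `roc_auc_score` expects scores or probabilities for the positive class. The sketch below shows the difference on a small invented example; the 0.5 threshold is an assumption (it matches the default behavior of `predict` for binary logistic regression).

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix, roc_auc_score

# Invented labels and predicted probabilities, only to show which input each metric takes.
y_true = np.array([0, 0, 1, 1, 1, 0])
proba = np.array([0.2, 0.6, 0.7, 0.9, 0.4, 0.1])

# Hard labels from thresholding the probabilities at 0.5.
y_pred = (proba >= 0.5).astype(int)

acc = accuracy_score(y_true, y_pred)    # needs class labels
cm = confusion_matrix(y_true, y_pred)   # rows: true class, columns: predicted class
auc = roc_auc_score(y_true, proba)      # needs probabilities, not thresholded labels

print("Accuracy:", acc)
print("Confusion matrix:\n", cm)
print("AUC:", auc)
```

Passing thresholded labels to `roc_auc_score` still runs, but it discards the ranking information that AUC is meant to measure.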

## Analysis Questions

1. Did you observe any bias in the model's predictions?
2. How did removing gender affect the predictions?
3. What ethical concerns arise when using sensitive attributes like gender?
4. What would you recommend to mitigate bias in hiring models?
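
To support your answers with numbers, one option is to compare selection rates between groups. The sketch below computes the share of each gender predicted as shortlisted, the difference between those rates, and their ratio. The data are invented and the 0.5 threshold is an assumption. A ratio below 0.8 is a commonly cited rule of thumb (the "four-fifths rule") for flagging possible adverse impact, though it is a screening heuristic rather than proof of bias.

```python
import pandas as pd

# Invented predictions, only to show the calculation.
results = pd.DataFrame({
    "Gender": ["Male"] * 5 + ["Female"] * 5,
    "Pred_with_gender": [0.9, 0.8, 0.7, 0.6, 0.4, 0.7, 0.45, 0.4, 0.3, 0.2],
})

# Convert probabilities to shortlist decisions at an assumed 0.5 threshold.
results["Predicted_shortlisted"] = results["Pred_with_gender"] >= 0.5

# Selection rate: fraction of each group predicted as shortlisted.
rates = results.groupby("Gender")["Predicted_shortlisted"].mean()

parity_gap = rates["Male"] - rates["Female"]    # difference in selection rates
impact_ratio = rates["Female"] / rates["Male"]  # ratio of selection rates

print(rates)
print("Selection rate gap:", parity_gap)
print("Selection rate ratio:", impact_ratio)
```

Running the same calculation on `Pred_without_gender` lets you compare the two models directly.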