Artificial Intelligence (AI) is everywhere:


*   Social media decides what you see.
*   Banks decide who gets a loan.
*   Companies decide who gets hired.
*   Governments decide who gets extra screening at the airport.

AI is not neutral.

It learns from data about the past, and the past often contains unfairness and bias. If we’re not careful, AI can copy and even amplify these problems.

In this workshop, we will:

**See AI make predictions**

You’ll test a model that predicts whether someone earns more than $50K a year, based on their personal data (from the U.S. Census).

**Look inside the “black box”**

Explanation tools such as SHAP show why a model makes certain decisions.
SHAP is slow to run here, so this notebook uses a faster alternative (permutation importance) to show which features matter most.

**Check for bias**

You’ll compare predictions for different groups (men vs women) and ask: is this fair?

**Reflect on ethics**

What should we do when AI shows bias? Who is responsible?



In [None]:
!pip install shap



Just run this cell; do not modify it!

In [None]:
import shap
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import fetch_openml
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.preprocessing import LabelEncoder

# Load dataset
data = fetch_openml(name="adult", version=2, as_frame=True)
df = data.frame.copy()

# Use a smaller dataset for speed
df_small = df.sample(5000, random_state=42)

y = (df_small["class"] == ">50K").astype(int)
X = df_small.drop("class", axis=1)

for col in X.select_dtypes(include=["category", "object"]).columns:
    X[col] = LabelEncoder().fit_transform(X[col])

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train simple model
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)

print("✅ Model ready! Accuracy on test set:", model.score(X_test, y_test))


✅ Model ready! Accuracy on test set: 0.861
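
An accuracy figure is easier to judge next to a baseline. The sketch below uses a small made-up label list (not the workshop data) and a hypothetical helper, `majority_baseline_accuracy`, to show the score a "model" would get by always guessing the most common class.

```python
from collections import Counter

def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the most common label."""
    counts = Counter(labels)
    most_common_count = counts.most_common(1)[0][1]
    return most_common_count / len(labels)

# Made-up labels for illustration: 1 = ">50K", 0 = "<=50K"
toy_labels = [0, 0, 0, 1, 0, 1, 0, 0, 1, 0]

baseline_acc = majority_baseline_accuracy(toy_labels)
print(f"Majority-class baseline accuracy: {baseline_acc:.2f}")
```

In the notebook itself, the equivalent check would be `y_test.value_counts(normalize=True).max()`. A model is only doing useful work if its accuracy is clearly above that number.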


In [None]:
# ===============================================
# STEP 0b — Find person closest to 50-50
# ===============================================
import numpy as np

# Predict probabilities for all training samples
probs_train = model.predict_proba(X_train)[:, 1]  # probability of >50K

# Find index of person closest to 0.5
closest_idx = np.argmin(np.abs(probs_train - 0.5))

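# Extract that person
baseline = X_train.iloc[closest_idx].copy()
baseline_prob = probs_train[closest_idx]
# Wrap the row in a DataFrame so the model receives the feature names it was trained with
baseline_pred = model.predict(pd.DataFrame([baseline]))[0]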

print("=== Baseline person closest to 50-50 ===")
print(f"index of person closest to 50-50 {closest_idx}")
print(baseline[["age", "sex", "education-num", "hours-per-week"]])
print(f"Prediction: {'>50K income' if baseline_pred==1 else '<=50K income'}")
print(f"Probability: {baseline_prob*100:.1f}% chance of >50K, {(1-baseline_prob)*100:.1f}% chance of <=50K")


=== Baseline person closest to 50-50 ===
index of person closest to 50-50 2296
age               47
sex                1
education-num      9
hours-per-week    40
Name: 4034, dtype: int64
Prediction: >50K income
Probability: 56.0% chance of >50K, 44.0% chance of <=50K




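We’ll use real-world data (like age, education, gender, hours worked per week) to predict:
👉 “Does this person earn more than 50K?” (Adult Income dataset)

The same kind of exercise can be done with other questions, such as “Does this person get good credit?” (German Credit dataset), but the cells in this notebook use only the Adult Income data.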

What to do

Find the code block below with the `baseline["..."] = ...` lines.

Change some values, for example:

*   age → try 20 vs. 60
*   sex → switch between 0 (female) and 1 (male)
*   hours-per-week → try 20 vs. 60

Run the cell and see how the prediction changes.

In [None]:
# ===============================================
# STEP 1 — PLAY WITH PREDICTIONS
# ===============================================
# 🧑‍🎓 INSTRUCTION:
# We start from a real person in the dataset (baseline).
# Look at their characteristics below.
# Then change some values (age, sex, hours-per-week, education-num).
# Run the cell again and compare predictions!

# Pick the baseline person from the training set
# (row 2296 is the person closest to 50-50 found in Step 0b)
baseline = X_train.iloc[2296].copy()
#baseline = X_train.iloc[0].copy()

# Save original for comparison
original_person = baseline.copy()

def predict_one(person_series):
    sample = pd.DataFrame([person_series])
    pred = model.predict(sample)[0]
    prob = model.predict_proba(sample)[0]
    return pred, prob

# 👩‍🎓 STUDENT: Change only these features
baseline["age"] = 25            # Try 20 vs 60
baseline["sex"] = 1            # 0 = Female, 1 = Male
baseline["education-num"] = 10  # 9 = HS-grad, 13 = Bachelors, 16 = PhD
baseline["hours-per-week"] = 40 # Try 20 vs 60

# Predictions for both
orig_pred, orig_prob = predict_one(original_person)
new_pred, new_prob = predict_one(baseline)

# Show before/after
print("=== Original person ===")
print(original_person[["age", "sex", "education-num", "hours-per-week"]])
print(f"Prediction: {'>50K income' if orig_pred==1 else '<=50K income'}")
print(f"Probability: {orig_prob[1]*100:.1f}% chance of >50K, {orig_prob[0]*100:.1f}% chance of <=50K")

print("\n=== Changed person ===")
print(baseline[["age", "sex", "education-num", "hours-per-week"]])
print(f"Prediction: {'>50K income' if new_pred==1 else '<=50K income'}")
print(f"Probability: {new_prob[1]*100:.1f}% chance of >50K, {new_prob[0]*100:.1f}% chance of <=50K")


=== Original person ===
age               47
sex                1
education-num      9
hours-per-week    40
Name: 4034, dtype: int64
Prediction: >50K income
Probability: 56.0% chance of >50K, 44.0% chance of <=50K

=== Changed person ===
age               25
sex                1
education-num     10
hours-per-week    40
Name: 4034, dtype: int64
Prediction: <=50K income
Probability: 2.0% chance of >50K, 98.0% chance of <=50K
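
Changing several features at once makes it hard to tell which one moved the prediction. A common trick is to change one feature at a time and compare. The sketch below assumes a hypothetical `toy_predict` scoring function as a stand-in for the trained model. Its weights are invented for illustration and do not come from the workshop's random forest.

```python
import math

def toy_predict(person):
    """Hypothetical stand-in for the trained model: returns P(>50K).
    The weights are invented for illustration only."""
    score = (
        0.05 * (person["age"] - 40)
        + 0.30 * (person["education-num"] - 10)
        + 0.03 * (person["hours-per-week"] - 40)
        + 0.80 * person["sex"]
        - 0.50
    )
    return 1 / (1 + math.exp(-score))

def flip_one_feature(predict, person, feature, new_value):
    """Return (before, after) probabilities when only `feature` changes."""
    changed = {**person, feature: new_value}  # copy, so the original stays intact
    return predict(person), predict(changed)

person = {"age": 47, "sex": 1, "education-num": 9, "hours-per-week": 40}

# Change one feature at a time and report the effect of each change separately
for feature, new_value in [("age", 25), ("sex", 0), ("education-num", 13)]:
    before, after = flip_one_feature(toy_predict, person, feature, new_value)
    print(f"{feature:>14} -> {new_value:<3} P(>50K): {before:.2f} -> {after:.2f}")
```

With the notebook's own objects, the same idea is to call `predict_one` on a copy of `original_person` in which a single value has been changed.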


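**Look Inside the AI’s “Brain”**

The AI isn’t magic. Explanation tools show which features mattered most for a prediction.

What to do

Run the explanation cell (don’t worry about the code).

You’ll see two things:

*   A ranked list of the features that matter most to the model overall. A higher score means that shuffling that feature hurts the model's accuracy more.
*   One person’s prediction, together with their values for the top five features.

A SHAP chart would go one step further and colour each feature: features in red push the model toward a “yes” (e.g., >50K income), and features in blue push it toward “no.” This notebook skips that chart to stay fast.

Try this

Compare explanations for two people:

*   A young woman, working 40h with a Bachelor’s degree.
*   A young man, same profile.

What’s different?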

In [None]:
# ===============================================
# STEP 2 — EXPLAIN THE DECISION (simple & fast)
# ===============================================
# 🧑‍🎓 INSTRUCTION:
# This shows which features the model finds most important overall.
# Then we look at how ONE person's features influence their prediction.

from sklearn.inspection import permutation_importance

# Global feature importance (which features matter most)
result = permutation_importance(model, X_test, y_test, n_repeats=5, random_state=42)
importances = pd.Series(result.importances_mean, index=X.columns).sort_values(ascending=False)

print("=== Top 10 most important features (overall) ===")
print(importances.head(10))

# 👩‍🎓 STUDENT: Change this index to inspect different people
i = 0
person = X_test.iloc[i]
person_df = pd.DataFrame([person])  # keep feature names to match the training data
pred = model.predict(person_df)[0]
prob = model.predict_proba(person_df)[0][1]

print("\n=== Person's key features ===")
print(person[["age", "sex", "education-num", "hours-per-week"]])
print(f"Model prediction: {'>50K income' if pred==1 else '<=50K income'}")
print(f"Probability of >50K: {prob*100:.1f}%")

# Local "explanation" (just show this person’s values in context of top features)
print("\n=== Why this prediction? ===")
for feat in importances.head(5).index:
    print(f"{feat}: {person[feat]}")


=== Top 10 most important features (overall) ===
capital-gain      0.0598
age               0.0212
relationship      0.0204
education-num     0.0188
marital-status    0.0106
occupation        0.0100
hours-per-week    0.0050
capital-loss      0.0050
workclass         0.0044
sex               0.0004
dtype: float64

=== Person's key features ===
age               24
sex                0
education-num     13
hours-per-week    40
Name: 10544, dtype: int64
Model prediction: <=50K income
Probability of >50K: 1.0%

=== Why this prediction? ===
capital-gain: 0
age: 24
relationship: 1
education-num: 13
marital-status: 4




**Group Bias Check**

AI might not treat groups equally.
In this step, you’ll run a cell that shows the average prediction for men vs. women.

👉 Look carefully: are the averages very different?

In [None]:
# ===============================================
# CHECK FOR GROUP BIAS
# ===============================================
# 🧑‍🎓 INSTRUCTION:
# Here we compare the *average predicted income probability*
# for men vs women. Do you see a difference?

male_idx = X_test["sex"] == 1
female_idx = X_test["sex"] == 0

male_pred = model.predict_proba(X_test[male_idx])[:,1].mean()
female_pred = model.predict_proba(X_test[female_idx])[:,1].mean()

print("Average >50K prediction for men   :", round(male_pred, 3))
print("Average >50K prediction for women :", round(female_pred, 3))


Average >50K prediction for men   : 0.309
Average >50K prediction for women : 0.115
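
One way to put a number on this difference is a demographic parity check: compare the two group averages by their gap and by their ratio. The sketch below plugs in the two averages printed above. The helper names are hypothetical. The 0.8 threshold borrows the "four-fifths" rule of thumb from U.S. hiring guidelines, which is defined for selection rates, so applying it to average predicted probabilities is an assumption made for illustration.

```python
def parity_gap(rate_a, rate_b):
    """Absolute difference between two group rates (0 means equal)."""
    return abs(rate_a - rate_b)

def parity_ratio(rate_a, rate_b):
    """Lower rate divided by the higher rate (1 means equal)."""
    low, high = sorted([rate_a, rate_b])
    return low / high

# Averages printed by the group bias cell
men_avg = 0.309
women_avg = 0.115

gap = parity_gap(men_avg, women_avg)
ratio = parity_ratio(men_avg, women_avg)

print(f"Gap  : {gap:.3f}")
print(f"Ratio: {ratio:.3f}")
# 0.8 is the illustrative "four-fifths" threshold (an assumption, see above)
print("Below the 0.8 rule of thumb" if ratio < 0.8 else "At or above the 0.8 rule of thumb")
```

A large gap does not, on its own, show that the model invented the difference. The census data it learned from may already record different incomes for men and women, in which case the model is reproducing a pattern from the past. Whether that is acceptable for a given use is exactly the fairness question.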


**Reflection Questions**


Patterns: What inputs had the biggest effect on predictions?

Bias: Did you see differences between men and women? Between young and old?

Fairness: If this AI were used in hiring or banking, what could go wrong?

Responsibility: Who should make sure AI systems are fair — the programmers, the companies, or society/laws?

