# Chapter 10 -- Ethics, Bias, and Responsible AI
## *Python for AI/ML: A Complete Learning Journey*

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/timothy-watt/python-for-ai-ml/blob/main/CH10_Ethics_Bias_Responsible_AI.ipynb)
&nbsp;&nbsp;[![Back to TOC](https://img.shields.io/badge/Back_to-Table_of_Contents-1B3A5C?style=flat-square)](https://colab.research.google.com/github/timothy-watt/python-for-ai-ml/blob/main/Python_for_AIML_TOC.ipynb)

---

**Part:** 3 -- Machine Learning and AI  
**Prerequisites:** Chapter 6 (scikit-learn), Chapter 7 (PyTorch)  
**Estimated time:** 3-4 hours

---

### Learning Objectives

By the end of this chapter you will be able to:

- Identify sources of bias in ML pipelines: data, label, measurement, feedback-loop, and deployment bias
- Compute and interpret fairness metrics: demographic parity, equalised odds, calibration
- Use SHAP to explain individual predictions and global feature importance
- Apply LIME to explain a single prediction in human-readable terms
- Audit a salary prediction model for geographic and demographic disparities
- Apply practical mitigation strategies: reweighting, threshold adjustment, and documentation
- Write a model card summarising intended use, limitations, and bias findings

---

### Why This Chapter Exists

Every model built in Chapters 6-8 makes decisions that affect real people.
A salary prediction model used in hiring can encode historical pay disparities.
A developer role classifier trained on biased labels may systematically
under-represent certain groups. These are not hypothetical risks -- they are
documented failures in production systems.

This chapter does not treat ethics as a soft add-on. It provides concrete,
measurable tools: compute the disparity, visualise it, and apply a mitigation.
Responsible AI is an engineering discipline, not just a policy statement.

---

### Project Thread -- Chapter 9

We audit the Chapter 6 salary regression model for geographic bias:
does the model systematically over- or under-predict salaries for developers
in certain countries? We then apply SHAP to explain individual predictions,
measure calibration across groups, and produce a model card.


---

## Setup -- Imports and Data


In [None]:
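# Install SHAP and LIME if not already present
# (sys.executable makes sure pip targets the interpreter running this notebook)
import subprocess
import sys
subprocess.run([sys.executable, '-m', 'pip', 'install', 'shap', 'lime', '-q'], check=False)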

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import seaborn as sns
import warnings
warnings.filterwarnings('ignore')

import shap
from sklearn.ensemble import RandomForestRegressor, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, LabelEncoder
from sklearn.metrics import mean_absolute_error, r2_score, confusion_matrix
from sklearn.pipeline import Pipeline
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import OneHotEncoder
from sklearn.impute import SimpleImputer

print(f'SHAP version: {shap.__version__}')

plt.style.use('seaborn-v0_8-whitegrid')
plt.rcParams['figure.dpi']       = 110
plt.rcParams['axes.titlesize']   = 13
plt.rcParams['axes.titleweight'] = 'bold'

DATASET_URL  = 'https://raw.githubusercontent.com/timothy-watt/python-for-ai-ml/main/data/so_survey_2025_curated.csv'
RANDOM_STATE = 42


In [None]:
# Load and clean SO 2025 -- same pipeline as previous chapters
df_raw = pd.read_csv(DATASET_URL)
df = df_raw.copy()
# Coerce to numeric first so unparseable values become NaN, then drop them
df['ConvertedCompYearly'] = pd.to_numeric(df['ConvertedCompYearly'], errors='coerce')
df = df.dropna(subset=['ConvertedCompYearly'])
Q1, Q3 = df['ConvertedCompYearly'].quantile([0.25, 0.75])
IQR = Q3 - Q1
df = df[
    (df['ConvertedCompYearly'] >= max(Q1 - 3*IQR, 5_000)) &
    (df['ConvertedCompYearly'] <= min(Q3 + 3*IQR, 600_000))
].copy()
if 'YearsCodePro' in df.columns:
    df['YearsCodePro'] = pd.to_numeric(df['YearsCodePro'], errors='coerce')
    df['YearsCodePro'] = df['YearsCodePro'].fillna(df['YearsCodePro'].median())
for col in ['Country', 'EdLevel', 'RemoteWork', 'DevType']:
    if col in df.columns:
        df[col] = df[col].fillna('Unknown')
df['uses_python'] = df.get('LanguageHaveWorkedWith', pd.Series(dtype=str)).str.contains('Python', na=False).astype(int)
df['uses_sql']    = df.get('LanguageHaveWorkedWith', pd.Series(dtype=str)).str.contains('SQL', na=False).astype(int)
df['log_salary']  = np.log(df['ConvertedCompYearly'])
df['primary_role'] = df.get('DevType', pd.Series(dtype=str)).str.split(';').str[0].str.strip()
df = df.reset_index(drop=True)
print(f'Dataset: {len(df):,} rows')
print(f'Countries: {df["Country"].nunique()}')


---

## Section 9.1 -- Sources of Bias in ML Pipelines

Bias enters ML systems at every stage. Understanding where it comes from
is the prerequisite for measuring and mitigating it.

**Data bias** -- the training data does not represent the real-world population.
The SO 2025 survey over-represents English-speaking, Western developers.
A salary model trained on it will have higher accuracy for those groups
and lower accuracy -- with systematic errors -- for under-represented groups.

**Label bias** -- the labels themselves encode historical inequities.
If past salaries reflect discrimination, a model trained to predict salary
will learn to reproduce that discrimination.

**Measurement bias** -- different groups are measured differently.
Self-reported salary in a developer survey may be more accurate for
salaried employees than for contractors or freelancers.

**Feedback loop bias** -- model predictions influence future data collection.
A hiring model that down-ranks candidates from certain universities
produces fewer training examples from those universities in the next round.

**Deployment bias** -- a model trained in one context is used in another.
A salary model trained on SO survey data used to set actual compensation
applies a tool to a context it was never validated for.
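
To make label bias concrete, here is a minimal synthetic sketch. It does not use the SO 2025 data: the two groups, the `proxy` feature, and every coefficient are invented for illustration. Both groups have the same skill distribution, but the historical salaries carry a penalty for group B. A model that never sees the group column still reproduces much of that gap through a correlated feature.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 2000
group = rng.integers(0, 2, n)            # 0 = group A, 1 = group B (hypothetical)
skill = rng.normal(0, 1, n)              # identical skill distribution in both groups
proxy = group + rng.normal(0, 0.3, n)    # a feature that correlates with group membership

# Historical labels: group B was paid $10k less for the same skill
salary = 60_000 + 15_000 * skill - 10_000 * group + rng.normal(0, 2_000, n)

# The group column is deliberately left out of the features
X_toy = np.column_stack([skill, proxy])
toy_model = LinearRegression().fit(X_toy, salary)
pred = toy_model.predict(X_toy)

label_gap = salary[group == 0].mean() - salary[group == 1].mean()
pred_gap = pred[group == 0].mean() - pred[group == 1].mean()
print(f'Pay gap in the labels (A - B):      ${label_gap:,.0f}')
print(f'Pay gap in the predictions (A - B): ${pred_gap:,.0f}')
```

The model has no notion of "group", yet its predictions differ by group because the labels it was asked to fit already did.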


In [None]:
# 9.1.1 -- Audit the training data: representation by country

top_countries = df['Country'].value_counts().head(12)

fig, axes = plt.subplots(1, 2, figsize=(15, 5))

# Left: respondent count by country
top_countries.plot(kind='barh', ax=axes[0], color='#2E75B6')
axes[0].set_title('Respondent Count by Country\n(data bias: over-representation of certain regions)')
axes[0].set_xlabel('Number of Respondents')
for i, v in enumerate(top_countries.values):
    axes[0].text(v + 10, i, f'{v:,}', va='center', fontsize=8)

# Right: median salary by country
country_salary = (
    df[df['Country'].isin(top_countries.index)]
    .groupby('Country')['ConvertedCompYearly']
    .median()
    .sort_values(ascending=True)
)
country_salary.plot(kind='barh', ax=axes[1], color='#E8722A')
axes[1].xaxis.set_major_formatter(mticker.FuncFormatter(lambda x, _: f'${x/1000:.0f}k'))
axes[1].set_title('Median Salary by Country\n(wide range: model must not treat these as equivalent)')
axes[1].set_xlabel('Median Annual Salary (USD)')

plt.suptitle('SO 2025: Data Representation Audit', fontsize=13, fontweight='bold')
plt.tight_layout()
plt.show()

# Quantify the representation imbalance
top_3_pct  = top_countries.head(3).sum() / len(df) * 100
bottom_pct = top_countries.tail(3).sum() / len(df) * 100
print(f'Top 3 countries: {top_3_pct:.1f}% of all respondents')
print(f'Bottom 3 of top-12: {bottom_pct:.1f}% of all respondents')
print(f'Salary range (top-12 countries): '
      f'${country_salary.min():,.0f} to ${country_salary.max():,.0f}')
print(f'That is a {country_salary.max()/country_salary.min():.1f}x salary ratio -- '
      f'treating all countries equivalently would be a serious model error.')


---

## Section 9.2 -- Train the Model We Will Audit

We train a Random Forest salary regressor that deliberately omits `Country`
as a feature -- a common real-world decision made to avoid 'discriminating'
by geography. We then show that this decision does not remove geographic bias;
it just makes it invisible and harder to measure.


In [None]:
# 9.2.1 -- Train salary model WITHOUT Country as a feature

feature_cols = [c for c in ['YearsCodePro', 'uses_python', 'uses_sql']
                if c in df.columns]

X = df[feature_cols].copy()
for col in feature_cols:
    med = X[col].median()
    X[col] = X[col].fillna(med if pd.notna(med) else 0)
y = df['log_salary']

X_train, X_test, y_train, y_test, idx_train, idx_test = train_test_split(
    X, y, df.index, test_size=0.2, random_state=RANDOM_STATE
)

scaler = StandardScaler()
X_train_sc = scaler.fit_transform(X_train)
X_test_sc  = scaler.transform(X_test)

model = RandomForestRegressor(
    n_estimators=100, max_depth=8,
    random_state=RANDOM_STATE, n_jobs=-1
)
model.fit(X_train_sc, y_train)

y_pred_log = model.predict(X_test_sc)
y_pred_usd = np.exp(y_pred_log)
y_true_usd = np.exp(y_test)

overall_r2  = r2_score(y_test, y_pred_log)
overall_mae = mean_absolute_error(y_true_usd, y_pred_usd)

print(f'Overall model performance (Country NOT a feature):')
print(f'  R^2:  {overall_r2:.4f}')
print(f'  MAE:  ${overall_mae:,.0f}')
print(f'Features used: {feature_cols}')
print()
print('Now we audit whether performance is equal across countries...')


---

## Section 9.3 -- Fairness Audit: Per-Group Performance

A model with good overall performance can still perform very poorly
for specific subgroups. The standard fairness audit computes performance
metrics separately for each group and measures the disparity.

**Demographic parity** -- do predictions have the same mean across groups?

**Equalised odds** -- are error rates equal across groups?
(Equal false positive rates and false negative rates)

**Calibration** -- does a predicted salary of $X actually correspond to
actual salaries near $X equally well for all groups?
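
For a regression model, these ideas reduce to simple per-group summaries. The sketch below uses six made-up rows (salaries in $k, two hypothetical groups) to show the quantities the audit computes: mean prediction per group (the parity view), mean signed error per group (a rough calibration check), and MAE per group (the error-rate view).

```python
import pandas as pd

# Hypothetical audit table: true and predicted salaries in $k
toy = pd.DataFrame({
    'group':  ['A', 'A', 'A', 'B', 'B', 'B'],
    'y_true': [100, 120, 140, 60, 70, 80],
    'y_pred': [105, 115, 140, 80, 85, 90],
})
toy['error'] = toy['y_pred'] - toy['y_true']   # signed: + means over-prediction

by_group = toy.groupby('group').agg(
    mean_pred=('y_pred', 'mean'),                 # compare across groups for parity
    mean_error=('error', 'mean'),                 # near 0 = well calibrated on average
    mae=('error', lambda e: e.abs().mean()),      # size of the typical error
)
parity_gap = by_group['mean_pred'].max() - by_group['mean_pred'].min()

print(by_group.round(2))
print(f'Gap in mean prediction between groups: {parity_gap:.0f}k')
```

In this toy table group A is predicted without systematic error, while group B is over-predicted by 15k on average. An overall MAE would blend the two and hide that difference.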


In [None]:
# 9.3.1 -- Per-country performance audit

test_df = df.loc[idx_test].copy()
test_df['y_pred_log'] = y_pred_log
test_df['y_pred_usd'] = y_pred_usd
test_df['y_true_usd'] = y_true_usd.values
test_df['abs_error']  = np.abs(test_df['y_true_usd'] - test_df['y_pred_usd'])
test_df['pct_error']  = test_df['abs_error'] / test_df['y_true_usd'] * 100

# Compute per-country metrics for countries with enough test samples
min_samples = 30
country_metrics = []

for country, grp in test_df.groupby('Country'):
    if len(grp) < min_samples:
        continue
    r2  = r2_score(np.log(grp['y_true_usd']), np.log(grp['y_pred_usd']))
    mae = grp['abs_error'].mean()
    mpe = grp['pct_error'].mean()   # mean absolute percentage error (pct_error is built from abs_error)
    bias = (grp['y_pred_usd'] - grp['y_true_usd']).mean()  # signed: + = over-predicts
    country_metrics.append({
        'Country': country, 'n': len(grp),
        'R2': r2, 'MAE': mae, 'MPE': mpe, 'Bias_USD': bias
    })

metrics_df = pd.DataFrame(country_metrics).sort_values('MAE', ascending=False)

print(f'Countries with >= {min_samples} test samples: {len(metrics_df)}')
print()
print(f'{"Country":<25} {"n":>5}  {"R2":>6}  {"MAE":>10}  {"Bias":>10}')
print('-' * 60)
for _, row in metrics_df.iterrows():
    bias_str = f'+${row["Bias_USD"]:,.0f}' if row['Bias_USD'] >= 0 else f'-${abs(row["Bias_USD"]):,.0f}'
    print(f'{row["Country"]:<25} {row["n"]:>5}  {row["R2"]:>6.3f}  '
          f'${row["MAE"]:>9,.0f}  {bias_str:>10}')

print()
worst_mae = metrics_df.iloc[0]
best_mae  = metrics_df.iloc[-1]
print(f'MAE disparity ratio: {worst_mae["MAE"]/best_mae["MAE"]:.1f}x '
      f'({worst_mae["Country"]} vs {best_mae["Country"]})')


In [None]:
# 9.3.2 -- Visualise the disparity

fig, axes = plt.subplots(1, 2, figsize=(15, 6))

# Left: MAE by country
plot_df = metrics_df.sort_values('MAE', ascending=True)
colours = ['#E8722A' if mae > overall_mae * 1.5 else '#2E75B6'
           for mae in plot_df['MAE']]
axes[0].barh(plot_df['Country'], plot_df['MAE'], color=colours)
axes[0].axvline(overall_mae, color='red', linestyle='--', linewidth=2,
                label=f'Overall MAE ${overall_mae:,.0f}')
axes[0].xaxis.set_major_formatter(mticker.FuncFormatter(lambda x, _: f'${x/1000:.0f}k'))
axes[0].set_title('MAE by Country\n(orange = more than 1.5x the overall MAE)')
axes[0].set_xlabel('Mean Absolute Error (USD)')
axes[0].legend(fontsize=9)

# Right: prediction bias by country (over vs under prediction)
bias_df = metrics_df.sort_values('Bias_USD', ascending=True)
bias_colours = ['#E8722A' if b > 0 else '#2E75B6' for b in bias_df['Bias_USD']]
axes[1].barh(bias_df['Country'], bias_df['Bias_USD'], color=bias_colours)
axes[1].axvline(0, color='black', linewidth=1)
axes[1].xaxis.set_major_formatter(mticker.FuncFormatter(lambda x, _: f'${x/1000:.0f}k'))
axes[1].set_title('Prediction Bias by Country\n(orange = over-predicts, blue = under-predicts)')
axes[1].set_xlabel('Mean Bias (predicted - actual) USD')

plt.suptitle('Fairness Audit: Salary Model Performance by Country',
             fontsize=13, fontweight='bold')
plt.tight_layout()
plt.show()

print('Key finding: the model that omits Country as a feature')
print('still produces highly unequal errors across countries.')
print('Omitting a sensitive attribute does not remove bias -- it hides it.')


---

## Section 9.4 -- Model Explainability with SHAP

**SHAP (SHapley Additive exPlanations)** assigns each feature a contribution
value for a specific prediction. It is grounded in cooperative game theory:
the SHAP value for a feature is its average marginal contribution across all
possible orderings of features.

SHAP provides two levels of explanation:
- **Global:** which features matter most overall? (feature importance)
- **Local:** why did the model predict $X for this specific person?

For tree-based models, SHAP provides a fast, exact algorithm (`TreeExplainer`).
For arbitrary models, the model-agnostic `KernelExplainer` is available (slower, approximate).
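
The "average marginal contribution across all orderings" definition can be computed by brute force when there are only a few features. The sketch below does this for a hypothetical three-feature payoff function (`value` and its numbers are invented for illustration; this is not how `TreeExplainer` works internally, only the definition it implements efficiently).

```python
from itertools import permutations

def value(coalition):
    """Hypothetical model output when only the features in `coalition` are known."""
    v = 50.0                                   # baseline prediction
    if 'experience' in coalition:
        v += 20
    if 'python' in coalition:
        v += 5
    if 'experience' in coalition and 'python' in coalition:
        v += 4                                 # interaction between the two
    return v

def shapley(features, value):
    """Average each feature's marginal contribution over every ordering."""
    contrib = {f: 0.0 for f in features}
    orders = list(permutations(features))
    for order in orders:
        present = set()
        for f in order:
            before = value(present)
            present.add(f)
            contrib[f] += value(present) - before
    return {f: c / len(orders) for f, c in contrib.items()}

features = ['experience', 'python', 'sql']
phi = shapley(features, value)
print(phi)

# Additivity: contributions sum to (full prediction - baseline)
total = sum(phi.values())
print(total, value(set(features)) - value(set()))
```

The interaction bonus of 4 is split evenly between `experience` and `python`, and `sql`, which never changes the output, receives zero. The contributions sum exactly to the difference between the full prediction and the baseline, which is the property that makes SHAP values readable as a decomposition of a single prediction.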


In [None]:
# 9.4.1 -- Global SHAP: feature importance and summary plot

explainer   = shap.TreeExplainer(model)
# Compute SHAP values on a sample for speed
sample_idx  = np.random.RandomState(RANDOM_STATE).choice(len(X_test_sc), size=min(500, len(X_test_sc)), replace=False)
X_sample    = X_test_sc[sample_idx]
shap_values = explainer.shap_values(X_sample)

print(f'SHAP values shape: {np.array(shap_values).shape}')
print(f'Features: {feature_cols}')

# Summary plot: beeswarm showing feature impact distribution
plt.figure(figsize=(9, 4))
shap.summary_plot(
    shap_values, X_sample,
    feature_names=feature_cols,
    show=False
)
plt.title('SHAP Summary Plot: Feature Impact on log(Salary) Predictions')
plt.tight_layout()
plt.show()

# Mean absolute SHAP value = global importance
mean_shap = np.abs(shap_values).mean(axis=0)
print('Global feature importance (mean |SHAP|):')
for feat, imp in sorted(zip(feature_cols, mean_shap), key=lambda x: -x[1]):
    print(f'  {feat:<25} {imp:.6f}')


In [None]:
# 9.4.2 -- Local SHAP: explain individual predictions

# Pick two respondents: highest and lowest predicted salary
pred_series = pd.Series(np.exp(model.predict(X_test_sc)), index=idx_test)
highest_idx = pred_series.idxmax()
lowest_idx  = pred_series.idxmin()

for label, row_idx in [('Highest predicted salary', highest_idx),
                        ('Lowest predicted salary',  lowest_idx)]:
    pos_in_test = list(idx_test).index(row_idx)
    x_row  = X_test_sc[pos_in_test]
    shap_row = explainer.shap_values(x_row.reshape(1, -1))[0]
    pred_usd = np.exp(model.predict(x_row.reshape(1, -1))[0])
    true_usd = np.exp(y_test.loc[row_idx])
    country  = df.loc[row_idx, 'Country'] if 'Country' in df.columns else 'N/A'

    print(f'{label}:')
    print(f'  Country:    {country}')
    print(f'  Predicted:  ${pred_usd:,.0f}')
    print(f'  Actual:     ${true_usd:,.0f}')
    print(f'  SHAP contributions:')
    for feat, sv in zip(feature_cols, shap_row):
        direction = 'increases' if sv > 0 else 'decreases'
        print(f'    {feat:<20} {sv:+.4f}  ({direction} prediction)')
    print()


In [None]:
# 9.4.3 -- SHAP dependence plot: how YearsCodePro affects predictions

if 'YearsCodePro' in feature_cols:
    yrs_idx = feature_cols.index('YearsCodePro')

    fig, ax = plt.subplots(figsize=(9, 5))
    ax.scatter(
        X_sample[:, yrs_idx],
        shap_values[:, yrs_idx],
        alpha=0.4, s=12, color='#2E75B6'
    )
    ax.axhline(0, color='red', linestyle='--', linewidth=1.5)
    ax.set_xlabel('YearsCodePro (standardised)')
    ax.set_ylabel('SHAP value (impact on log salary)')
    ax.set_title('SHAP Dependence Plot: YearsCodePro\n'
                 'Above zero = increases salary prediction; below zero = decreases')
    plt.tight_layout()
    plt.show()

    print('Interpretation:')
    print('  Positive SHAP values: this experience level pushes the prediction higher')
    print('  Negative SHAP values: this experience level pushes the prediction lower')
    print('  Where the points cross 0, experience adds nothing relative to the baseline (average) prediction')


---

## Section 9.5 -- Bias Mitigation Strategies

Once you have measured a disparity, you have three categories of options:

**Pre-processing** -- fix the data before training:
reweighting under-represented groups, resampling, or removing proxy features.

**In-processing** -- modify the model or training objective:
add fairness constraints to the loss function, use adversarial debiasing.

**Post-processing** -- adjust model outputs after training:
apply different decision thresholds per group, recalibrate predictions.

The cells below demonstrate sample reweighting (pre-processing), which works
directly with a regression model. Threshold adjustment (post-processing) applies
once a model's score is turned into a yes/no decision.
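
Threshold adjustment is easiest to see on a classifier-style score. The sketch below uses synthetic scores for two hypothetical groups and an assumed policy target (flag the top 30% of each group). None of these numbers come from the SO 2025 data, and equalising the positive rate is only one of several possible targets.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic scores: group B's scores are shifted lower than group A's
scores = np.concatenate([rng.normal(0.60, 0.15, 500),
                         rng.normal(0.45, 0.15, 500)]).clip(0, 1)
groups = np.array(['A'] * 500 + ['B'] * 500)
group_names = ['A', 'B']

# One global threshold: positive rates differ sharply between groups
global_pred = (scores > 0.5).astype(int)
global_rates = {g: global_pred[groups == g].mean() for g in group_names}

# Per-group thresholds chosen so each group has the same positive rate
target_rate = 0.30   # assumed policy choice
thresholds = {g: np.quantile(scores[groups == g], 1 - target_rate)
              for g in group_names}
row_threshold = np.where(groups == 'A', thresholds['A'], thresholds['B'])
adjusted_pred = (scores > row_threshold).astype(int)
rates = {g: adjusted_pred[groups == g].mean() for g in group_names}

print('Global threshold 0.5 :', {g: round(float(r), 2) for g, r in global_rates.items()})
print('Per-group thresholds :', {g: round(float(t), 3) for g, t in thresholds.items()})
print('Adjusted rates       :', {g: round(float(r), 2) for g, r in rates.items()})
```

Per-group thresholds make the trade-off explicit: the positive rates are equalised, but individuals with the same score can now receive different decisions depending on their group. Whether that is acceptable is a policy question, and in some jurisdictions a legal one.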


In [None]:
# 9.5.1 -- Pre-processing: sample reweighting
#
# Give under-represented countries higher weight during training
# so the model pays equal attention to each country regardless of sample size.

# Compute inverse-frequency weights: rare countries get higher weight
country_counts = df.loc[idx_train, 'Country'].value_counts()
total          = len(idx_train)
country_weights = (total / (len(country_counts) * country_counts)).to_dict()

train_weights = df.loc[idx_train, 'Country'].map(country_weights).fillna(1.0).values
train_weights = train_weights / train_weights.mean()   # normalise so mean weight = 1

print(f'Sample weights range: {train_weights.min():.3f} to {train_weights.max():.3f}')
print(f'Mean weight: {train_weights.mean():.3f}')

# Retrain with sample weights
model_weighted = RandomForestRegressor(
    n_estimators=100, max_depth=8,
    random_state=RANDOM_STATE, n_jobs=-1
)
model_weighted.fit(X_train_sc, y_train, sample_weight=train_weights)

y_pred_w    = model_weighted.predict(X_test_sc)
y_pred_w_usd = np.exp(y_pred_w)

overall_r2_w  = r2_score(y_test, y_pred_w)
overall_mae_w = mean_absolute_error(y_true_usd, y_pred_w_usd)

print(f'Weighted model overall: R^2={overall_r2_w:.4f}, MAE=${overall_mae_w:,.0f}')
print(f'Original model overall: R^2={overall_r2:.4f},  MAE=${overall_mae:,.0f}')

# Compare per-country MAE improvement
test_df['y_pred_w_usd'] = y_pred_w_usd
test_df['abs_error_w']  = np.abs(test_df['y_true_usd'] - test_df['y_pred_w_usd'])

improvements = []
for country, grp in test_df.groupby('Country'):
    if len(grp) < min_samples:
        continue
    mae_orig = grp['abs_error'].mean()
    mae_wtd  = grp['abs_error_w'].mean()
    improvements.append({'Country': country, 'n': len(grp),
                         'MAE_orig': mae_orig, 'MAE_weighted': mae_wtd,
                         'Delta': mae_wtd - mae_orig})

imp_df = pd.DataFrame(improvements).sort_values('Delta')
improved = (imp_df['Delta'] < 0).sum()
print(f'Countries with improved MAE after reweighting: {improved}/{len(imp_df)}')
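
A small worked example with made-up counts shows what the inverse-frequency formula in the cell above does. The three countries and their sizes are hypothetical; the formula is the same `total / (n_countries * count)` expression.

```python
import pandas as pd

# Hypothetical training set: 600 / 300 / 100 respondents
countries = pd.Series(['US'] * 600 + ['India'] * 300 + ['Fiji'] * 100)

counts = countries.value_counts()
weights = len(countries) / (len(counts) * counts)    # one weight per country
per_row = countries.map(weights)                      # one weight per training row

# Total weight each country contributes to the training loss
total_weight_per_country = per_row.groupby(countries).sum()

print(weights.round(3))
print(total_weight_per_country.round(1))
print(f'Mean row weight: {per_row.mean():.3f}')
```

Each row from the smallest country counts six times as much as a row from the largest, so all three countries carry the same total weight, and the mean row weight stays at 1. This is the sense in which the reweighted model "pays equal attention" to each country.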


In [None]:
# 9.5.2 -- Visualise the mitigation effect

fig, ax = plt.subplots(figsize=(11, 6))

x     = np.arange(len(imp_df))
width = 0.35

bars1 = ax.bar(x - width/2, imp_df['MAE_orig']/1000,     width,
               label='Original model', color='#2E75B6', alpha=0.8)
bars2 = ax.bar(x + width/2, imp_df['MAE_weighted']/1000, width,
               label='Reweighted model', color='#E8722A', alpha=0.8)

ax.set_xticks(x)
ax.set_xticklabels(imp_df['Country'], rotation=30, ha='right', fontsize=9)
ax.set_ylabel('MAE ($k)')
ax.set_title('Bias Mitigation: MAE Before and After Sample Reweighting\n'
             '(lower = better; compare the bars for under-represented countries)')
ax.axhline(overall_mae/1000,   color='#2E75B6', linestyle='--', linewidth=1.5,
           label=f'Overall original ${overall_mae/1000:.0f}k')
ax.axhline(overall_mae_w/1000, color='#E8722A', linestyle='--', linewidth=1.5,
           label=f'Overall weighted ${overall_mae_w/1000:.0f}k')
# Build the legend last so the two reference lines are included
ax.legend(fontsize=10)
plt.tight_layout()
plt.show()

print('Reweighting often improves fairness at a small cost to overall accuracy.')
print('This is the fundamental fairness-accuracy tradeoff -- it cannot be entirely avoided,')
print('but it can be measured and managed.')


---

## Section 9.6 -- Model Cards: Documenting Your Model

A **model card** is a structured document that accompanies a deployed model.
It records what the model does, how it was trained, what it should and should
not be used for, its performance across subgroups, and its known limitations.

Model cards were proposed by researchers at Google (Mitchell et al., 2019) and are
now widely used for responsible model documentation. Hugging Face shows a repository's
README as its model card and encourages one for every model on the Hub.
Many organisations require them before production deployment.

The cell below generates a complete model card for the salary model we built.


In [None]:
# 9.6.1 -- Generate a model card

from datetime import date

model_card = f"""
# Model Card: SO 2025 Salary Regression Model

**Date:** {date.today()}
**Version:** 1.0
**Authors:** Python for AI/ML (Chapter 9 demonstration)

---

## Model Description

A Random Forest regression model that predicts annual developer salary (USD)
from professional experience, Python usage, and SQL usage.
Predictions are made in log-salary space and exponentiated to USD.

**Architecture:** RandomForestRegressor (100 trees, max_depth=8)
**Target variable:** log(ConvertedCompYearly) -- natural log of annual salary in USD
**Features:** YearsCodePro, uses_python, uses_sql

---

## Intended Use

- **Intended uses:** Educational demonstration of salary prediction modelling
- **Intended users:** Students and practitioners learning ML with the SO 2025 dataset
- **Out-of-scope uses:** Setting actual employee compensation, hiring decisions,
  benchmarking individual salaries, any production deployment

---

## Training Data

Stack Overflow 2025 Developer Survey (curated 15,000-respondent subset).
Source: stackoverflow.com/survey

**Known data limitations:**
- Over-represents English-speaking, Western developers
- Self-reported salary may differ from actual compensation
- Survey respondents are not a random sample of all developers globally
- Salary reported in local currency and converted to USD at a single rate

---

## Performance

| Metric | Value |
|--------|-------|
| R^2 (log scale, test set) | {overall_r2:.4f} |
| MAE (USD, test set) | ${overall_mae:,.0f} |

**Performance varies significantly by country.**
Countries with fewer training examples show higher MAE.
See the fairness audit in Section 9.3 for per-country breakdown.

---

## Bias and Fairness Findings

1. **Geographic disparity:** MAE varies by up to {metrics_df['MAE'].max()/metrics_df['MAE'].min():.1f}x
   across countries. Countries with fewer survey respondents show higher prediction error.

2. **Systematic bias direction:** The model over-predicts salaries for lower-income
   countries and under-predicts for higher-income countries. This reflects the model
   learning a global average rather than country-specific salary distributions.

3. **Feature omission does not remove bias:** Country was deliberately excluded as a
   feature. The model still produces country-unequal errors because correlated features
   (experience patterns, language usage) partially proxy for geography.

4. **Mitigation applied:** Sample reweighting by inverse country frequency reduced
   per-country MAE disparity. See Section 9.5 for results.

---

## Limitations and Recommendations

- Do not use this model to set or justify actual salaries
- Do not use predictions for individuals from countries with fewer than 100 survey respondents
- Retrain annually as salary distributions change
- Add country as an explicit feature if geographic fairness is required
- Conduct a full fairness audit before any deployment beyond education

---

*This model card was generated as part of Chapter 9 of Python for AI/ML.
It is a teaching example of responsible documentation practice.*
"""

print(model_card)

# Save to file
with open('/tmp/salary_model_card.md', 'w') as f:
    f.write(model_card)
print('Model card saved to /tmp/salary_model_card.md')


---

## Concept Check Questions

> Test your understanding before moving on. Answer each question without referring back to the notebook, then expand to check.

**Q1.** Name the five sources of ML bias and give a one-sentence description of each.

<details><summary>Show answer</summary>

1. **Data bias** -- the training data does not represent the real-world population. 2. **Label bias** -- the labels themselves encode historical inequities. 3. **Measurement bias** -- different groups are measured differently. 4. **Feedback loop bias** -- model predictions influence future data collection. 5. **Deployment bias** -- the model is used in a context it was not trained or validated for.

</details>

**Q2.** What is **demographic parity** and when might satisfying it still be unfair?

<details><summary>Show answer</summary>

Demographic parity requires equal positive prediction rates across groups. It can still be unfair if recall differs between groups: equal prediction rates could be achieved by approving less qualified candidates in one group and denying qualified ones in another.

</details>

**Q3.** Explain the difference between a **global** and a **local** SHAP explanation.

<details><summary>Show answer</summary>

**Global** SHAP summarises importance across the dataset: which features matter most overall. **Local** SHAP explains a single prediction: each feature's contribution to pushing that specific prediction above or below the baseline.

</details>

**Q4.** What is a **model card** and what six elements should it contain?

<details><summary>Show answer</summary>

(1) Model description and intended use; (2) Out-of-scope uses and limitations; (3) Training data and known biases; (4) Evaluation metrics broken down by subgroup; (5) Fairness considerations and mitigations; (6) Caveats and recommendations for users.

</details>

**Q5.** Your model has MAE = $8k overall but $14k for lower-income countries. What is this called, and what is one mitigation?

<details><summary>Show answer</summary>

**Differential performance**, a form of algorithmic bias. One mitigation is **sample reweighting**: upweight underrepresented groups by passing `sample_weight` (inversely proportional to group frequency) to the model's `fit()` method.

</details>



---

## Coding Exercises

> Three exercises per chapter: **🔧 Guided** (fill-in-the-blanks) · **🔨 Applied** (write from scratch) · **🏗️ Extension** (go beyond the chapter)

Exercises use the SO 2025 developer survey dataset.
Expand each **Solution** block only after attempting the exercise.


### Exercise 1 🔧 Guided -- Compute fairness metrics across groups

Complete `fairness_report(y_true, y_pred, groups)` that computes
for each unique group value: accuracy, TPR (recall), FPR, and
demographic parity (P(ŷ=1|group)).
Print a warning if the largest demographic parity gap between any two groups exceeds 0.10.


In [None]:
import numpy as np
import pandas as pd

def fairness_report(y_true: np.ndarray, y_pred: np.ndarray,
                     groups: np.ndarray) -> pd.DataFrame:
    rows = []
    for g in np.unique(groups):
        mask = groups == g
        yt, yp = y_true[mask], y_pred[mask]
        tp = ((yt==1) & (yp==1)).sum()
        fp = None  # YOUR CODE
        fn = None  # YOUR CODE
        tn = None  # YOUR CODE
        acc  = (tp + tn) / len(yt)
        tpr  = tp / (tp + fn) if (tp + fn) > 0 else 0  # recall
        fpr  = None  # YOUR CODE
        dp   = None  # YOUR CODE
        rows.append({'group': g, 'n': mask.sum(),
                     'accuracy': acc, 'TPR': tpr, 'FPR': fpr, 'dem_parity': dp})
    df = pd.DataFrame(rows).set_index('group')
    dp_range = df['dem_parity'].max() - df['dem_parity'].min()
    if dp_range > 0.10:
        print(f'⚠ Demographic parity gap: {dp_range:.3f} (> 0.10 threshold)')
    return df.round(3)

<details><summary>💡 Hint</summary>

`fp = ((yt==0) & (yp==1)).sum()`
`fn = ((yt==1) & (yp==0)).sum()`
`tn = ((yt==0) & (yp==0)).sum()`
`fpr = fp / (fp + tn) if (fp + tn) > 0 else 0`
`dp = yp.mean()`

</details>

<details><summary>✅ Solution</summary>

```python
fp = ((yt == 0) & (yp == 1)).sum()
fn = ((yt == 1) & (yp == 0)).sum()
tn = ((yt == 0) & (yp == 0)).sum()
fpr = fp / (fp + tn) if (fp + tn) > 0 else 0
dp = yp.mean()  # share of this group predicted positive
```

</details>


### Exercise 2 🔨 Applied -- Audit a salary model for geographic bias

Train a salary classifier on synthetic SO 2025-style data where `Country`
is intentionally correlated with an education signal standing in for `EdLevel`
(simulating structural bias).

1. Train a GBM classifier (no country feature)
2. Use `fairness_report()` from Exercise 1 to audit by country
3. Identify which countries show the largest FPR and TPR gaps
4. Rebalance with `sample_weight` (`GradientBoostingClassifier` has no `class_weight` parameter) and re-audit


In [None]:
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
n = 6000
country = rng.choice(['US', 'India', 'Germany', 'UK'], n, p=[0.4, 0.3, 0.2, 0.1])
years = rng.exponential(6, n)
# Structural bias: the education signal is shifted for India, so it acts as a proxy for country
ed_noise = np.where(country == 'India', rng.normal(0.3, 0.1, n), rng.normal(0, 0.1, n))
# Salary depends on the same `years` the model sees, minus a penalty for India
salary = np.exp(10.5 + 0.08 * years - 0.3 * (country == 'India').astype(float)
                + rng.normal(0, 0.4, n))
y = (salary > 70000).astype(int)
X = pd.DataFrame({'years': years, 'ed': ed_noise, 'py': rng.integers(0, 2, n),
                  'sql': rng.integers(0, 2, n), 'ai': rng.integers(0, 2, n)})

# YOUR CODE: split, train, fairness_report(y_te, y_pred, country_te)

<details><summary>💡 Hint</summary>

After training, call `fairness_report(y_te, y_pred, country_te)` where
`country_te` is the country values for the test split.
For rebalancing: compute `sample_weight` as `n_total / (n_countries * count_per_country)`
for each training row and pass it to `fit()`.

</details>

<details><summary>✅ Solution</summary>

```python
# Split features, labels, and country together so the rows stay aligned
X_tr, X_te, y_tr, y_te, c_tr, c_te = train_test_split(
    X, y, country, test_size=0.2, random_state=42)
clf = GradientBoostingClassifier(n_estimators=100, random_state=42).fit(X_tr, y_tr)
print(fairness_report(y_te, clf.predict(X_te), c_te))
```

</details>


### Exercise 3 🏗️ Extension -- Write a model card

Write a complete model card (as a Python dict that you save as JSON)
for the salary prediction model trained in Chapter 12.

The model card must include the following six sections, drawn from the
original model card proposal (Mitchell et al., 2019):
1. Model details (name, version, type, training date)
2. Intended use (primary use, out-of-scope uses)
3. Factors (relevant groups, instrumentation)
4. Metrics (performance metrics, thresholds)
5. Evaluation data (source, preprocessing, motivation)
6. Ethical considerations and caveats


In [None]:
import json

model_card = {
    'model_details': {
        'name': 'SO2025 Salary Tier Classifier',
        'version': '1.0.0',
        # YOUR CODE: fill in all required fields
    },
    'intended_use': {},
    'factors': {},
    'metrics': {},
    'evaluation_data': {},
    'ethical_considerations': {}
}

# Validate completeness
required_sections = ['model_details','intended_use','factors',
                      'metrics','evaluation_data','ethical_considerations']
for s in required_sections:
    assert model_card.get(s), f'Missing section: {s}'

with open('/tmp/model_card.json', 'w') as f:
    json.dump(model_card, f, indent=2)
print('Model card saved.')

<details><summary>💡 Hint</summary>

This is a writing exercise: there is no single correct answer.
Think carefully about: Who will use this model? What can go wrong?
What groups might be disadvantaged? What data was it trained on?
A strong model card is specific, honest, and actionable.

</details>

<details><summary>✅ Solution</summary>

```python
# No single correct answer -- evaluate against completeness and specificity.
# Key things to include in ethical_considerations:
# - Geographic bias documented in Exercise 2
# - Self-reported survey data limitations
# - Currency / inflation effects
# - What the model should NOT be used for (individual salary negotiation)
```

</details>


---

## Chapter 10 Summary

### Key Takeaways

- **Bias has five entry points:** data, label, measurement, feedback loop, and deployment.
  Each requires a different mitigation strategy.
- **Omitting sensitive attributes does not remove bias.** It makes bias invisible
  and harder to measure. Correlated proxy features carry the bias forward.
- **Fairness is multi-dimensional.** Demographic parity, equalised odds, and
  calibration are all valid fairness criteria -- and, outside of degenerate cases,
  they cannot all be satisfied simultaneously when base rates differ across groups
  (the impossibility results of Chouldechova and of Kleinberg et al.).
- **SHAP** provides both global feature importance and local per-prediction
  explanations grounded in game theory. `TreeExplainer` is exact and fast
  for tree-based models.
- **Sample reweighting** is the lowest-friction pre-processing mitigation:
  add `sample_weight` to `fit()`. It typically improves fairness at a small
  cost to overall accuracy.
- **Model cards** are the professional standard for model documentation.
  They record intended use, data limitations, performance by subgroup,
  and known biases. Write one before any deployment.
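
The incompatibility noted in the fairness takeaway can be checked with a few lines of arithmetic. By Bayes' rule, positive predictive value is PPV = TPR·p / (TPR·p + FPR·(1 − p)), where p is the group's base rate. The helper `ppv` and the rates below are illustrative assumptions, not results from the chapter's model.

```python
def ppv(tpr, fpr, base_rate):
    """Positive predictive value from error rates and a group's base rate."""
    true_pos = tpr * base_rate
    false_pos = fpr * (1 - base_rate)
    return true_pos / (true_pos + false_pos)

# Same error rates for both groups, so equalised odds holds by construction
tpr, fpr = 0.8, 0.2

ppv_a = ppv(tpr, fpr, base_rate=0.5)   # group A: half are true positives
ppv_b = ppv(tpr, fpr, base_rate=0.2)   # group B: one in five

print(f'Group A PPV: {ppv_a:.2f}')
print(f'Group B PPV: {ppv_b:.2f}')
```

With identical TPR and FPR, a positive prediction is right 80% of the time for group A but only 50% of the time for group B. Equalising that precision would require giving the groups different error rates, which is the trade-off the impossibility results formalise.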

### Project Thread Status

| Task | Status |
|------|--------|
| Data representation audit | Done |
| Per-country fairness audit (MAE, bias direction) | Done |
| Global SHAP feature importance | Done |
| Local SHAP individual prediction explanation | Done |
| Sample reweighting mitigation | Done |
| Model card written | Done |

---

### What's Next

Chapter 9 completes Part 3. The book continues with Part 4 and six appendices:

**Part 4 -- Production and Deployment:**
- **Chapter 10** -- MLOps and Production ML: experiment tracking with MLflow, model registry, FastAPI serving, unit testing, and drift detection
- **Chapter 11** -- Computer Vision with PyTorch: CNNs from scratch, transfer learning with ResNet-18, feature map visualisation

**Appendices:**
- **Appendix A** -- Python environment setup: `venv`, `conda`, `requirements.txt`, GPU drivers
- **Appendix B** -- Keras 3 companion: the Chapter 7 networks rebuilt in Keras
- **Appendix C** -- Project ideas and further reading: ten capstone projects and curated resources
- **Appendix D** -- Reinforcement learning: Q-learning, Bellman equation, DQN on CartPole
- **Appendix E** -- SQL for data scientists: `sqlite3`, `pandas.read_sql`, window functions
- **Appendix F** -- Git and GitHub for ML: branching, `.gitignore`, `nbstripout`, DVC

---

*End of Chapter 10 -- Python for AI/ML*  
[![Back to TOC](https://img.shields.io/badge/Back_to-Table_of_Contents-1B3A5C?style=flat-square)](https://colab.research.google.com/github/timothy-watt/python-for-ai-ml/blob/main/Python_for_AIML_TOC.ipynb)
