# Class 8 Notebook – Bias and Ethics in AI/ML Basics

This notebook introduces **bias and ethics** in AI and machine learning.

As ML systems are deployed in hiring, lending, healthcare, and justice, understanding bias and fairness becomes essential:
- **Data bias** – Training data reflects historical inequities
- **Algorithmic bias** – Models can amplify or introduce new bias
- **Fairness** – Different definitions (e.g., demographic parity, equalized odds) and trade-offs

**Objective**: Set up the environment and explore core concepts for thinking critically about bias and responsible AI.

**Key ideas**:
- Bias can enter at data collection, labeling, feature selection, and model design
- Fairness metrics can conflict; there is no single "correct" definition
- Responsible AI includes transparency, interpretability, and ongoing monitoring

Run the first code cell to confirm your environment works.

## Run in the browser (no local setup)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/adzuci/ai-fundamentals/blob/class-8-bias-and-ethics/class-8-bias-and-ethics/01_class_8_bias_and_ethics_basics.ipynb)

> Tip: This notebook assumes you're comfortable with basic Python, pandas, and scikit-learn from earlier classes.

## STEP 1: Environment check and imports

Verify that NumPy, pandas, and scikit-learn are available for building simple classifiers and analyzing predictions.

In [1]:
# Concept: Environment sanity check for bias/ethics notebook
import platform
import numpy as np
import pandas as pd
import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

print(f"Python: {platform.python_version()}")
print(f"OS: {platform.system()} {platform.release()}")
print(f"NumPy: {np.__version__}")
print(f"pandas: {pd.__version__}")
print(f"scikit-learn: {sklearn.__version__}")
print("All libraries imported successfully!")

Python: 3.10.14
OS: Darwin 25.2.0
NumPy: 1.26.4
pandas: 2.3.3
scikit-learn: 1.7.2
All libraries imported successfully!


## What is bias in ML?

**Bias** in machine learning refers to systematic errors or unfairness in model behavior—often toward or against certain groups.

- **Data bias**: Training data underrepresents groups, reflects historical discrimination, or has labeling errors that correlate with protected attributes.
- **Algorithmic bias**: The model itself (e.g., regularization, threshold choices) produces different error rates or outcomes across groups.
- **Feedback loops**: Deployed models influence future data (e.g., predictive policing), reinforcing existing bias.

In later cells, we'll add examples and simple fairness checks.
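
To make the two fairness definitions named in the introduction concrete, here is a minimal sketch on hand-made arrays. The helper names (`demographic_parity_diff`, `equalized_odds_gaps`) and the toy data are illustrative assumptions, not part of the original notebook or of scikit-learn. It is meant to show why the metrics can disagree, not to serve as a reference implementation.

```python
import numpy as np

def demographic_parity_diff(y_pred, group):
    """Largest gap in positive-prediction rate between any two groups."""
    y_pred, group = np.asarray(y_pred), np.asarray(group)
    rates = [y_pred[group == g].mean() for g in np.unique(group)]
    return float(max(rates) - min(rates))

def equalized_odds_gaps(y_true, y_pred, group):
    """Largest between-group gaps in true-positive and false-positive rate.

    Assumes every group contains at least one positive and one negative label.
    """
    y_true, y_pred, group = (np.asarray(a) for a in (y_true, y_pred, group))
    tprs, fprs = [], []
    for g in np.unique(group):
        in_group = group == g
        tprs.append(y_pred[in_group & (y_true == 1)].mean())
        fprs.append(y_pred[in_group & (y_true == 0)].mean())
    return float(max(tprs) - min(tprs)), float(max(fprs) - min(fprs))

# Toy case (hypothetical): predictions are perfect, but group A has a 50%
# positive base rate and group B has a 25% positive base rate.
y_true = np.array([1, 1, 0, 0, 1, 0, 0, 0])
y_pred = y_true.copy()
group = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

print("Demographic parity difference:", demographic_parity_diff(y_pred, group))  # 0.25
print("Equalized odds gaps (TPR, FPR):", equalized_odds_gaps(y_true, y_pred, group))  # (0.0, 0.0)
```

A perfectly accurate classifier satisfies equalized odds here but still fails demographic parity, because the two groups have different base rates. This is one simple way the definitions can conflict.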

## STEP 2: Create a toy dataset for bias exploration

We create a simple dataset with **experience**, **test_score**, and **gender**—typical features in hiring or performance contexts. We'll use it to explore how models behave across groups and discuss fairness.

In [2]:
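# Concept: Create dataset for bias exploration
# Imports are repeated from STEP 1 so this cell can run on its own
import numpy as np
import pandas as pd

np.random.seed(42)  # fixed seed so the same rows are generated on every run
# Build 100 synthetic candidates; randint's upper bound is exclusive
mydata = pd.DataFrame({
    "exp": np.random.randint(0, 10, 100),           # experience, 0–9
    "test_score": np.random.randint(50, 100, 100),  # test score, 50–99
    "gender": np.random.choice(["Male", "Female"], 100)
})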

mydata.head(10)

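|   | exp | test_score | gender |
|---|---|---|---|
| 0 | 6 | 61 | Male |
| 1 | 3 | 83 | Female |
| 2 | 7 | 82 | Male |
| 3 | 4 | 97 | Male |
| 4 | 6 | 72 | Male |
| 5 | 9 | 73 | Female |
| 6 | 2 | 86 | Female |
| 7 | 6 | 84 | Male |
| 8 | 7 | 93 | Female |
| 9 | 4 | 89 | Female |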


## STEP 3: Introduce bias into the dataset

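We create a **hired** column in two steps. First, every candidate gets a merit-based label (hired if exp > 5 and test_score > 70). Then we **manually add bias**: the label of every male candidate is overwritten with a random draw that marks him hired about 70% of the time, regardless of qualifications, while female candidates keep the merit-based label. This simulates the kind of historical bias that can appear in real hiring data.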

In [3]:
# Concept: Introduce bias (males have higher chance of being hired)
# Merit-based label for everyone: experience above 5 and test score above 70
mydata["hired"] = ((mydata["exp"] > 5) & (mydata["test_score"] > 70)).astype(int)

# Add bias manually: overwrite every male label with a 70% chance of 1
is_male = mydata["gender"] == "Male"
mydata.loc[is_male, "hired"] = np.where(np.random.rand(is_male.sum()) > 0.3, 1, 0)

In [4]:
mydata.head(10)

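|   | exp | test_score | gender | hired |
|---|---|---|---|---|
| 0 | 6 | 61 | Male | 1 |
| 1 | 3 | 83 | Female | 0 |
| 2 | 7 | 82 | Male | 1 |
| 3 | 4 | 97 | Male | 0 |
| 4 | 6 | 72 | Male | 0 |
| 5 | 9 | 73 | Female | 1 |
| 6 | 2 | 86 | Female | 0 |
| 7 | 6 | 84 | Male | 1 |
| 8 | 7 | 93 | Female | 1 |
| 9 | 4 | 89 | Female | 0 |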
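
Ten rows are not enough to see the bias, so a quick way to check that it landed as intended is to compare the hire rate per gender across all 100 rows. The sketch below rebuilds the data with the same steps as STEP 2 and STEP 3 so it can run on its own; `rate_by_group` is a hypothetical helper name introduced here for illustration.

```python
import numpy as np
import pandas as pd

def rate_by_group(df, group_col, label_col):
    """Mean of a 0/1 label within each group, i.e. the positive rate per group."""
    return df.groupby(group_col)[label_col].mean()

# Rebuild the biased hiring data using the same steps as STEP 2 and STEP 3
np.random.seed(42)
mydata = pd.DataFrame({
    "exp": np.random.randint(0, 10, 100),
    "test_score": np.random.randint(50, 100, 100),
    "gender": np.random.choice(["Male", "Female"], 100),
})
mydata["hired"] = ((mydata["exp"] > 5) & (mydata["test_score"] > 70)).astype(int)
is_male = mydata["gender"] == "Male"
mydata.loc[is_male, "hired"] = np.where(np.random.rand(is_male.sum()) > 0.3, 1, 0)

rates = rate_by_group(mydata, "gender", "hired")
print(rates)
print("Gap in hire rate (max - min):", rates.max() - rates.min())
```

If the bias step worked, the male rate should sit near 0.7, while the female rate reflects only the merit rule.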


## STEP 4: Encode categorical features and prepare X, y

Encode the **gender** column so the model can use it (scikit-learn expects numeric features). Then split into **X** (features) and **y** (target) for training.

In [5]:
# Concept: Encode categorical features and prepare X, y for modeling

# LabelEncoder turns categorical values into integers so the model can use them.
# Classes are sorted alphabetically, so "Female" becomes 0 and "Male" becomes 1.
# Caution: using gender as a feature lets the model predict from it, which can
# perpetuate bias if the data is biased.
le = LabelEncoder()
mydata["gender"] = le.fit_transform(mydata["gender"])

# Features (X): exp, test_score, and encoded gender. The model will learn from these.
X = mydata[["exp", "test_score", "gender"]]

# Target (y): hired (0 or 1). This is what we're predicting.
y = mydata["hired"]
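
When reading coefficients or per-group results later, it helps to know which integer stands for which gender. A minimal sketch of how to recover that mapping from a fitted `LabelEncoder` follows; the four-row `genders` series is made-up sample data.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Made-up sample values, standing in for the notebook's gender column
genders = pd.Series(["Male", "Female", "Female", "Male"])

le = LabelEncoder()
codes = le.fit_transform(genders)

# classes_ holds the original labels, ordered by their integer code
mapping = {label: code for code, label in enumerate(le.classes_)}
print(mapping)  # {'Female': 0, 'Male': 1}

# inverse_transform converts codes back to the original labels
print(le.inverse_transform([0, 1]))
```

Note that scikit-learn documents `LabelEncoder` as a tool for target labels; it works for a two-valued feature like this one, but `OrdinalEncoder` or `OneHotEncoder` are the encoders intended for input features.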

## STEP 5: Split data and train the model

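Split X and y into train/test sets (80% train, 20% test), then train a Logistic Regression classifier. The model learns from the (biased) training data, including the gender feature. Because `train_test_split` is called without a `random_state`, the split changes on every run, so the metrics you see may differ from the outputs shown here.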

In [6]:
# Concept: Train-test split and model training

# Split into 80% train, 20% test. We hold out test data to evaluate on unseen examples.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Logistic Regression: a simple binary classifier for hired (0/1)
model = LogisticRegression()
model.fit(X_train, y_train)

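`LogisticRegression` parameters:

| Parameter | Value |
|---|---|
| penalty | 'l2' |
| dual | False |
| tol | 0.0001 |
| C | 1.0 |
| fit_intercept | True |
| intercept_scaling | 1 |
| class_weight | None |
| random_state | None |
| solver | 'lbfgs' |
| max_iter | 100 |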
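
One simple way to see whether the fitted model leans on gender is to look at its coefficients. The sketch below is self-contained and makes a few assumptions of its own that the notebook cell does not: a fixed `random_state=0` for the split, `max_iter=1000` to give the solver room to converge, and a direct 0/1 encoding of gender that matches what `LabelEncoder` would produce.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Rebuild the biased hiring data using the same steps as STEP 2 and STEP 3
np.random.seed(42)
mydata = pd.DataFrame({
    "exp": np.random.randint(0, 10, 100),
    "test_score": np.random.randint(50, 100, 100),
    "gender": np.random.choice(["Male", "Female"], 100),
})
mydata["hired"] = ((mydata["exp"] > 5) & (mydata["test_score"] > 70)).astype(int)
is_male = mydata["gender"] == "Male"
mydata.loc[is_male, "hired"] = np.where(np.random.rand(is_male.sum()) > 0.3, 1, 0)

# Female = 0, Male = 1, the same codes LabelEncoder assigns
mydata["gender"] = is_male.astype(int)

features = ["exp", "test_score", "gender"]
X_train, X_test, y_train, y_test = train_test_split(
    mydata[features], mydata["hired"], test_size=0.2, random_state=0  # assumed seed
)
model = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# coef_ has shape (1, n_features) for a binary problem
coefs = dict(zip(features, model.coef_[0]))
for name, weight in coefs.items():
    print(f"{name:>10}: {weight:+.3f}")
```

The features are on different scales, so the magnitudes are not directly comparable; treat the sign and rough size of the gender weight as a hint, not a measurement of bias.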


## STEP 6: Evaluate the model

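Predict on the test set and print a **classification report**: precision, recall, F1-score, and support per class. With only 20 test rows, each misclassification shifts a score by several points, so read the numbers as rough. High overall metrics can also mask unfair performance across groups; a natural next step is to break the results down by gender to check for bias.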

In [7]:
# Concept: Classification report (precision, recall, F1, support)
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.80      0.80      0.80        10
           1       0.80      0.80      0.80        10

    accuracy                           0.80        20
   macro avg       0.80      0.80      0.80        20
weighted avg       0.80      0.80      0.80        20
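
The report above aggregates over everyone in the test set. A per-group breakdown could look like the sketch below; `group_metrics` is a hypothetical helper written for this illustration, and the arrays are made-up values, not the notebook's predictions.

```python
import numpy as np
import pandas as pd

def group_metrics(y_true, y_pred, group):
    """Per-group sample count, selection rate, accuracy, and recall."""
    df = pd.DataFrame({
        "y_true": np.asarray(y_true),
        "y_pred": np.asarray(y_pred),
        "group": np.asarray(group),
    })
    rows = {}
    for g, part in df.groupby("group"):
        positives = part[part["y_true"] == 1]
        rows[g] = {
            "n": len(part),
            "selection_rate": part["y_pred"].mean(),
            "accuracy": (part["y_true"] == part["y_pred"]).mean(),
            # recall is undefined when a group has no true positives
            "recall": positives["y_pred"].mean() if len(positives) else float("nan"),
        }
    return pd.DataFrame.from_dict(rows, orient="index")

# Made-up example: gender encoded as 0 = Female, 1 = Male
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0]
gender = [1, 1, 1, 1, 0, 0, 0, 0]

report = group_metrics(y_true, y_pred, gender)
print(report)
```

In this toy case both groups have the same accuracy (0.75), yet group 1 is selected three times as often and has twice the recall. In the notebook, the equivalent call would be `group_metrics(y_test, y_pred, X_test["gender"])`.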



## STEP 7: Compare with a model trained without gender

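Retrain using only **exp** and **test_score** (no gender) and compare the classification reports: does removing the protected attribute change performance or reduce the risk of discriminatory predictions? Two caveats apply when reading the comparison. Each call to `train_test_split` draws a new random 20-row test set, so part of the difference between the reports is sampling noise. And a classification report measures agreement with the (biased) labels, not fairness; a drop in accuracy after removing gender is expected here, because the biased labels depend directly on gender.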

In [8]:
# Concept: Remove gender and retrain to compare fairness
print("---------Remove gender---------")
X = mydata[["exp", "test_score"]]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LogisticRegression()
model.fit(X_train, y_train)

pred = model.predict(X_test)
print(classification_report(y_test, pred))

---------Remove gender---------
              precision    recall  f1-score   support

           0       0.62      0.56      0.59         9
           1       0.67      0.73      0.70        11

    accuracy                           0.65        20
   macro avg       0.65      0.64      0.64        20
weighted avg       0.65      0.65      0.65        20
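
To separate the effect of dropping gender from the effect of a different random split, both models can be trained and scored on the same rows. The sketch below is one way to do that under stated assumptions: `fit_and_audit` is a hypothetical helper, the split uses an assumed `random_state=0` and is stratified by gender so both groups appear in the test set, and `max_iter=1000` is an added setting.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Rebuild the biased hiring data using the same steps as STEP 2 and STEP 3
np.random.seed(42)
mydata = pd.DataFrame({
    "exp": np.random.randint(0, 10, 100),
    "test_score": np.random.randint(50, 100, 100),
    "gender": np.random.choice(["Male", "Female"], 100),
})
mydata["hired"] = ((mydata["exp"] > 5) & (mydata["test_score"] > 70)).astype(int)
is_male = mydata["gender"] == "Male"
mydata.loc[is_male, "hired"] = np.where(np.random.rand(is_male.sum()) > 0.3, 1, 0)
mydata["gender"] = is_male.astype(int)  # Female = 0, Male = 1

# One shared split for both models
train, test = train_test_split(
    mydata, test_size=0.2, random_state=0, stratify=mydata["gender"]
)

def fit_and_audit(features):
    """Return (accuracy, gap in predicted hire rate between genders)."""
    model = LogisticRegression(max_iter=1000).fit(train[features], train["hired"])
    pred = model.predict(test[features])
    accuracy = float((pred == test["hired"]).mean())
    rates = pd.Series(pred, index=test.index).groupby(test["gender"]).mean()
    return accuracy, float(rates.max() - rates.min())

results = {
    "with gender": fit_and_audit(["exp", "test_score", "gender"]),
    "without gender": fit_and_audit(["exp", "test_score"]),
}
for name, (accuracy, gap) in results.items():
    print(f"{name:>15}: accuracy={accuracy:.2f}, hire-rate gap={gap:.2f}")
```

The hire-rate gap is the quantity that speaks to fairness; accuracy only says how well each model reproduces the biased labels.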



---
## Class Exercise: Bias in Loan Approval

**Scenario:** A bank uses an AI system to approve customer loans, and there is a suspicion that it favors one specific group. As an ethical AI engineer, your job is to investigate that suspicion, starting by building an appropriate dataset.

**Features:** income / credit_score / gender  
**Label:** approved

Use random seeded data, similar to the hiring example above. Generate a dataset with these features so you can later train a model and investigate whether it shows bias.

### Exercise: Create the loan dataset

Mimic the hiring data generation: use `np.random.seed`, build a DataFrame with **income**, **credit_score**, and **gender**. Add an **approved** column (0/1) as the label. You can start with a merit-based rule (e.g., income and credit_score above thresholds), then optionally introduce bias for one group.

In [9]:
# Concept: Loan approval dataset (mimics hiring data generation)
np.random.seed(42)

loandata = pd.DataFrame({
    "income": np.random.randint(30, 150, 100) * 1000,      # 30k–149k (upper bound is exclusive)
    "credit_score": np.random.randint(300, 850, 100),      # 300–849
    "gender": np.random.choice(["Male", "Female"], 100)
})

# Merit-based approval: income > 50k and credit_score > 600
loandata["approved"] = ((loandata["income"] > 50000) & (loandata["credit_score"] > 600)).astype(int)

# Introduce bias by income: applicants with income > 80k have their label overwritten
# with a random draw that approves about 70% of them, regardless of credit score
loandata.loc[loandata["income"] > 80000, "approved"] = np.where(
    np.random.rand(len(loandata[loandata["income"] > 80000])) > 0.3, 1, 0
)

loandata.head(10)

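|   | income | credit_score | gender | approved |
|---|---|---|---|---|
| 0 | 132000 | 569 | Male | 1 |
| 1 | 81000 | 570 | Female | 1 |
| 2 | 122000 | 755 | Male | 1 |
| 3 | 44000 | 761 | Male | 0 |
| 4 | 136000 | 551 | Female | 0 |
| 5 | 101000 | 595 | Male | 1 |
| 6 | 90000 | 637 | Male | 1 |
| 7 | 50000 | 352 | Male | 0 |
| 8 | 132000 | 516 | Female | 1 |
| 9 | 112000 | 487 | Female | 1 |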
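
The cell above biases approvals by income band. If you would rather have the bias fall on gender, mirroring the hiring example, one possible variant is sketched below. `overwrite_with_rate` is a hypothetical helper introduced for this illustration, and the 70% rate is simply carried over from the hiring example.

```python
import numpy as np
import pandas as pd

def overwrite_with_rate(df, mask, label_col, rate):
    """Return a copy of df where rows selected by mask get label 1 with probability rate."""
    out = df.copy()
    draws = np.random.rand(int(mask.sum()))  # uniform values in [0, 1)
    out.loc[mask, label_col] = (draws < rate).astype(int)
    return out

np.random.seed(42)
loandata = pd.DataFrame({
    "income": np.random.randint(30, 150, 100) * 1000,
    "credit_score": np.random.randint(300, 850, 100),
    "gender": np.random.choice(["Male", "Female"], 100),
})

# Merit-based approval, as in the exercise
loandata["approved"] = (
    (loandata["income"] > 50000) & (loandata["credit_score"] > 600)
).astype(int)

# Hypothetical gender bias: male applicants are approved about 70% of the time
biased = overwrite_with_rate(loandata, loandata["gender"] == "Male", "approved", 0.7)

print(biased.groupby("gender")["approved"].mean())
```

Returning a copy keeps the merit-only labels in `loandata`, so the two versions can be compared side by side.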


In [12]:
le = LabelEncoder()
loandata["gender"] = le.fit_transform(loandata["gender"])

# Features (X): income, credit_score, and encoded gender. The model will learn from these.
X = loandata[["income", "credit_score", "gender"]]

# Target (y): approved (0 or 1). This is what we're predicting.
y = loandata["approved"]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Logistic Regression: a simple binary classifier for approved (0/1)
model = LogisticRegression()
model.fit(X_train, y_train)

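`LogisticRegression` parameters:

| Parameter | Value |
|---|---|
| penalty | 'l2' |
| dual | False |
| tol | 0.0001 |
| C | 1.0 |
| fit_intercept | True |
| intercept_scaling | 1 |
| class_weight | None |
| random_state | None |
| solver | 'lbfgs' |
| max_iter | 100 |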


In [13]:
y_pred = model.predict(X_test)
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           0       0.80      0.67      0.73        12
           1       0.60      0.75      0.67         8

    accuracy                           0.70        20
   macro avg       0.70      0.71      0.70        20
weighted avg       0.72      0.70      0.70        20
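
The model behind this report was fit on raw dollar incomes (tens of thousands) next to a 0/1 gender code, and features on such different scales can make the default `lbfgs` solver slow to converge. A possible variant that standardizes the features first is sketched below. The pipeline and the fixed `random_state=0` are additions for illustration, not part of the original exercise, and the scores it prints will not match the report above.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Rebuild the loan data using the same steps as the exercise cell
np.random.seed(42)
loandata = pd.DataFrame({
    "income": np.random.randint(30, 150, 100) * 1000,
    "credit_score": np.random.randint(300, 850, 100),
    "gender": np.random.choice(["Male", "Female"], 100),
})
loandata["approved"] = (
    (loandata["income"] > 50000) & (loandata["credit_score"] > 600)
).astype(int)
high_income = loandata["income"] > 80000
loandata.loc[high_income, "approved"] = np.where(
    np.random.rand(high_income.sum()) > 0.3, 1, 0
)
loandata["gender"] = (loandata["gender"] == "Male").astype(int)  # Female = 0, Male = 1

X = loandata[["income", "credit_score", "gender"]]
y = loandata["approved"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0  # assumed seed for repeatability
)

# Standardize each feature to zero mean and unit variance before fitting
scaled_model = make_pipeline(StandardScaler(), LogisticRegression())
scaled_model.fit(X_train, y_train)

y_pred = scaled_model.predict(X_test)
print(classification_report(y_test, y_pred, zero_division=0))
```

Putting the scaler inside the pipeline means it is fit on the training rows only, so no information from the test set leaks into the scaling.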



In [14]:
# Concept: Remove income and retrain to compare fairness
print("---------Remove income---------")
X = loandata[["credit_score", "gender"]]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
model = LogisticRegression()
model.fit(X_train, y_train)

pred = model.predict(X_test)
print(classification_report(y_test, pred))

---------Remove income---------
              precision    recall  f1-score   support

           0       0.62      0.45      0.53        11
           1       0.50      0.67      0.57         9

    accuracy                           0.55        20
   macro avg       0.56      0.56      0.55        20
weighted avg       0.57      0.55      0.55        20
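
Classification reports alone do not answer the exercise's question of whether one group is favored. A direct check is to compare approval rates across the groups of interest in the labels themselves. The sketch below does this for the income band used in the bias step and for gender; the band labels are illustrative strings chosen here.

```python
import numpy as np
import pandas as pd

# Rebuild the loan data using the same steps as the exercise cell
np.random.seed(42)
loandata = pd.DataFrame({
    "income": np.random.randint(30, 150, 100) * 1000,
    "credit_score": np.random.randint(300, 850, 100),
    "gender": np.random.choice(["Male", "Female"], 100),
})
loandata["approved"] = (
    (loandata["income"] > 50000) & (loandata["credit_score"] > 600)
).astype(int)
high_income = loandata["income"] > 80000
loandata.loc[high_income, "approved"] = np.where(
    np.random.rand(high_income.sum()) > 0.3, 1, 0
)

# Approval rate by income band (the attribute the bias step targets)
band = np.where(high_income, "income > 80k", "income <= 80k")
by_band = loandata.groupby(band)["approved"].mean()
print(by_band)

# Approval rate by gender (not targeted by the bias step, shown for comparison)
by_gender = loandata.groupby("gender")["approved"].mean()
print(by_gender)
```

Because the bias step keys on income, any gap between genders in this dataset comes from chance differences in who landed in which income band, not from a rule that uses gender.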

