# Part 4: Bias in Machine Learning

### What is Bias?

Bias refers to systematic **distortions** that lead to **unfair outcomes**. In machine learning, bias often arises **unintentionally** and is closely tied to **discrimination**. Bias can occur at any stage of the ML pipeline and is often introduced through **training data, modeling decisions, or deployment context**.

Bias is not always harmful. However, when it influences decisions about people’s lives, we must understand and address it. **Recognizing and reducing harmful bias** is key to building fair and trustworthy systems.<sup>1, 2</sup>

---

### Types of Bias in Machine Learning

There are many forms of bias that can affect data-driven systems. This notebook focuses on the seven bias categories introduced by **Suresh & Guttag (2021)**, which are especially relevant in the **machine learning lifecycle**. Each category highlights a specific entry point for bias and includes relevant subtypes.

### 1. Historical Bias<sup>3</sup>
Bias that already exists in the world, **before** any data collection, model training, or algorithmic decision-making takes place. It reflects **structural inequalities and social patterns** that are encoded in the data.

**Note:** According to Suresh & Guttag (2021), historical bias is the most fundamental form of bias. It includes **many other biases caused by users or society**. The subtypes below are just a small selection of common examples.

- **Subtypes:**<sup>1</sup>  
  - *Temporal bias*: outdated data that does not reflect current realities  
  - *Content production bias*: some groups produce more or different data (e.g. online content)  
  - *Behavioral bias*: different behavior across platforms or contexts  
  - *Social bias*: others’ behavior influences personal input (e.g. ratings)  
  - *Self-selection bias*: participation in data generation is non-random  
  - *User interaction bias*: feedback loops reinforce earlier behaviors

- **Example:** Word embeddings link "doctor" to male and "nurse" to female, reflecting societal stereotypes.

---

### 2. Representation Bias<sup>3</sup>
Occurs when the data underrepresents parts of the target population, leading to models that generalize poorly for these groups.

- **Subtypes:**  
  - *Population bias*: target population is misdefined (e.g. based on outdated census data)  
  - *Sampling bias*: data collection fails to reflect diversity (e.g. only hospital patients)  
  - *Coverage bias*: not all subgroups are equally included  
  - *Subset bias*: small groups (e.g. pregnant women) are statistically drowned out


- **Example:** ImageNet is skewed toward Western sources (about 45% of its images come from the U.S. versus about 1% from China). Models trained on such a skewed dataset perform worse on images from underrepresented regions.
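
A rough first check for this kind of skew is to compare each group's share in the dataset with its share in the intended target population. The sketch below is only an illustration: the function name `representation_gap`, the region labels, and all numbers are hypothetical toy values, not figures from ImageNet.

```python
from collections import Counter


def representation_gap(sample_groups, reference_shares):
    """Return (dataset share - reference share) for each group."""
    counts = Counter(sample_groups)
    total = len(sample_groups)
    return {
        group: counts.get(group, 0) / total - ref_share
        for group, ref_share in reference_shares.items()
    }


# Hypothetical toy data: region of origin for 10 images.
sample = ["US"] * 6 + ["EU"] * 3 + ["CN"] * 1
# Hypothetical shares of each region in the intended deployment population.
reference = {"US": 0.25, "EU": 0.25, "CN": 0.50}

gaps = representation_gap(sample, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A strongly negative gap (here for `CN`) flags a group that is underrepresented relative to the assumed target population.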

---

### 3. Measurement Bias<sup>3</sup>
Bias in how features or labels are defined, collected, or measured. It often stems from inaccurate proxies or unequal label quality across groups.

- **Subtypes:**<sup>1, 4</sup>
  - *Label bias*: labels don't reflect ground truth equally (e.g. arrest = crime?)  
  - *Omitted variable bias*: important explanatory variables are missing  
  - *Instrument bias*: the measurement tool itself performs differently across groups

- **Example:** Using arrest records as a proxy for crime inflates risk scores for overpoliced communities. This is linked to the higher false positive rates for Black defendants in COMPAS.
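
Unequal error rates like these become visible once the false positive rate is computed per group instead of over the whole dataset. The following is a minimal sketch under stated assumptions: the helper names, the group labels `A` and `B`, and the labels and predictions are hypothetical toy values, not COMPAS data.

```python
def false_positive_rate(y_true, y_pred):
    """Share of actual negatives (label 0) that were predicted positive (1)."""
    negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not negatives:
        return float("nan")  # undefined when there are no actual negatives
    return sum(negatives) / len(negatives)


def fpr_by_group(y_true, y_pred, groups):
    """Compute the false positive rate separately for each group."""
    rates = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        rates[group] = false_positive_rate(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return rates


# Hypothetical toy data: 1 = flagged as high risk / positive outcome, 0 = otherwise.
groups = ["A"] * 6 + ["B"] * 6
y_true = [0, 0, 0, 0, 1, 1] + [0, 0, 0, 0, 1, 1]
y_pred = [1, 1, 0, 0, 1, 1] + [1, 0, 0, 0, 1, 0]

fpr = fpr_by_group(y_true, y_pred, groups)
print(fpr)
```

In this toy data, group `A` has twice the false positive rate of group `B` (0.50 versus 0.25), even though both groups have the same share of true positives.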

---

### 4. Aggregation Bias<sup>3</sup>
Occurs when a single model is applied to distinct subpopulations under the assumption that the relationship between features and labels is the same for all of them.

- **Subtype:**<sup>1</sup> 
  - *Simpson’s Paradox*: trends reverse or disappear when data is aggregated

- **Example:** A model trained on pooled data misclassifies medical conditions in women because the patterns it learned are dominated by male patients.
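
Simpson's Paradox is easiest to see with numbers. In the sketch below, option `A` has the higher success rate inside each subgroup, yet option `B` looks better once the subgroups are pooled. All counts, subgroup names, and option names are hypothetical toy values chosen to produce the reversal.

```python
# Hypothetical toy counts: (successes, total) per subgroup and option.
data = {
    "subgroup_1": {"A": (9, 10), "B": (80, 100)},
    "subgroup_2": {"A": (30, 100), "B": (2, 10)},
}


def rate(successes, total):
    """Success rate as a fraction."""
    return successes / total


def pooled_counts(data, option):
    """Sum successes and totals for one option across all subgroups."""
    successes = sum(group[option][0] for group in data.values())
    total = sum(group[option][1] for group in data.values())
    return successes, total


# Per-subgroup rates: A is ahead in both subgroups.
per_group = {
    name: {opt: rate(*counts) for opt, counts in group.items()}
    for name, group in data.items()
}
# Pooled rates: B is ahead after aggregation.
overall = {opt: rate(*pooled_counts(data, opt)) for opt in ("A", "B")}

for name, rates in per_group.items():
    print(name, {opt: round(r, 2) for opt, r in rates.items()})
print("overall", {opt: round(r, 2) for opt, r in overall.items()})
```

The reversal arises because the two options are unevenly distributed across subgroups with very different base rates, which is why a single pooled model or statistic can mislead.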

---

### 5. Learning Bias<sup>3</sup>
Bias introduced through modeling decisions, such as optimizing only for overall accuracy.

- **Example:** A model may ignore underrepresented group patterns because they're harder to learn or contribute little to global accuracy.

- **Note:** This often interacts with representation or measurement bias.

---

### 6. Evaluation Bias<sup>3</sup>
Bias introduced when model evaluation uses benchmarks or metrics that do not reflect the real-world deployment population.

- **Example:** Gender classification systems perform worst for darker-skinned women, who are underrepresented in common benchmark datasets.<sup>5</sup>

- **Impact:** Poor subgroup performance may remain unnoticed if metrics like overall accuracy are used.
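
The sketch below illustrates how an aggregate metric can hide a weak subgroup. It reports overall accuracy next to per-group accuracy; the helper names, group labels, and all labels and predictions are hypothetical toy values.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def accuracy_by_group(y_true, y_pred, groups):
    """Compute accuracy separately for each group."""
    result = {}
    for group in sorted(set(groups)):
        idx = [i for i, g in enumerate(groups) if g == group]
        result[group] = accuracy(
            [y_true[i] for i in idx], [y_pred[i] for i in idx]
        )
    return result


# Hypothetical toy benchmark: 16 majority samples, 4 minority samples.
groups = ["majority"] * 16 + ["minority"] * 4
y_true = [1] * 20
y_pred = [1] * 16 + [1, 0, 0, 0]

overall_accuracy = accuracy(y_true, y_pred)
group_accuracy = accuracy_by_group(y_true, y_pred, groups)

print(f"overall: {overall_accuracy:.2f}")
for group, acc in group_accuracy.items():
    print(f"{group}: {acc:.2f}")
```

Overall accuracy looks acceptable here only because the minority group contributes few samples; reporting metrics per group makes the gap visible.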

---

### 7. Deployment Bias<sup>3</sup>
Occurs when a model is applied in a context that differs from the one it was trained and evaluated for, often without appropriate human oversight.

- **Example:** A recidivism prediction model is used to determine prison sentences, but it was designed only to assess risk.

- **Risk:** Even well-performing models can cause harm if deployed carelessly.

---

**Note:**
These bias types are **not isolated**. They often interact and reinforce each other in **feedback loops**. These dynamics will be discussed on page 6 of this notebook.

---


### Exercise: Locate Forms of Bias in the ML Lifecycle

Below is a simplified version of a figure from Suresh & Guttag (2021). The bias names have been replaced with numbers.

> **Task:** Match each number to the correct bias type. When you are done, you can check your answers in the user guide.

![](Images/Suresh_Reverse.png)

---

### Bias Amplification<sup>6</sup>

Bias amplification happens when a model not only reflects existing bias in the data but **intensifies** it, producing predictions that are more skewed than the data it was trained on. Research by Hall et al. (2022) identifies several conditions under which amplification is most likely to occur:

#### Strong Group Signals

A **group signal** refers to information in the data that reveals **group membership** (e.g. gender or ethnicity) even if this feature is not explicitly included.

- If the **group signal is strong** and easier to learn than the actual label,  
  the model may rely on **group-related shortcuts** instead of task-relevant patterns.
- This leads to **overgeneralization** and **amplified disparities** in predictions.
- **Example:** If female students in the data graduate more often, and gender can be inferred from features like *field of study* or *high school GPA*, the model may overuse this information. As a result, it predicts graduation more often simply because a student appears to be female.

> **Training dynamics:** Bias amplification is often highest at the beginning of training, when models rely on simple group cues. It may decrease mid-training as class-specific patterns emerge, and rise slightly again in late phases.
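
One simple way to make amplification concrete is to compare the gap in positive rates between two groups in the training labels with the same gap in the model's predictions. This is a simplified illustration and not necessarily the metric used by Hall et al. (2022); the function names, the group codes `f` and `m`, and all values are hypothetical.

```python
def positive_rate(values, groups, group):
    """Share of positive (1) values within one group."""
    selected = [v for v, g in zip(values, groups) if g == group]
    return sum(selected) / len(selected)


def rate_gap(values, groups, group_a, group_b):
    """Difference in positive rates between two groups."""
    return positive_rate(values, groups, group_a) - positive_rate(
        values, groups, group_b
    )


# Hypothetical toy data: 1 = graduated (labels) or predicted to graduate (preds).
groups = ["f"] * 10 + ["m"] * 10
labels = [1] * 7 + [0] * 3 + [1] * 5 + [0] * 5  # data: f = 0.7, m = 0.5
preds = [1] * 9 + [0] * 1 + [1] * 4 + [0] * 6   # model: f = 0.9, m = 0.4

data_gap = rate_gap(labels, groups, "f", "m")
prediction_gap = rate_gap(preds, groups, "f", "m")
# Positive values mean the model widens the gap that already exists in the data.
amplification = prediction_gap - data_gap

print(f"gap in data:        {data_gap:+.2f}")
print(f"gap in predictions: {prediction_gap:+.2f}")
print(f"amplification:      {amplification:+.2f}")
```

A value near zero would mean the model merely reproduces the disparity in the data; a clearly positive value means it exaggerates that disparity.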

#### Model Capacity (V-Shaped Effect)

**Model capacity** describes a model’s ability to learn complex patterns. It is influenced by architecture, depth, regularization, and other hyperparameters.

- **Low-capacity models** underfit and rely on simple features — often group-related ones.
- **High-capacity models** overfit, learning spurious or biased correlations.
- Amplification is highest at both extremes → the **relationship follows a V-shape**.
- Careful tuning and regularization (e.g. weight decay) can help reduce this effect — but often require trading off accuracy.

#### Data Size and Bias

The structure and size of the training data matter:

- **Small datasets** can lead to overfitting or reliance on noisy group cues.
- **Highly biased datasets** amplify existing imbalances.
- In contrast, **larger and more balanced datasets** reduce amplification risk.

#### Confidence and Calibration

**Poor calibration** means the model’s prediction confidence does **not match** its actual accuracy.

- Overconfident models — especially on **underrepresented groups** — are more likely to amplify bias.
- A model that makes incorrect predictions with **high confidence** can mask its errors and further reinforce group disparities.
- This effect is particularly common in **high-capacity models**.
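
A coarse check for group-specific overconfidence is to compare mean prediction confidence with observed accuracy in each group. The sketch below does only that and is not a full calibration analysis; the function name `overconfidence_gap`, the group names, and all values are hypothetical.

```python
def overconfidence_gap(confidences, correct):
    """Mean confidence minus accuracy; positive values indicate overconfidence."""
    mean_confidence = sum(confidences) / len(confidences)
    accuracy = sum(correct) / len(correct)
    return mean_confidence - accuracy


# Hypothetical toy data: confidence in the predicted class, and whether
# the prediction was correct (1) or not (0).
by_group = {
    "majority": {"conf": [0.9, 0.8, 0.9, 0.8], "correct": [1, 1, 1, 0]},
    "minority": {"conf": [0.9, 0.9, 0.95, 0.85], "correct": [1, 0, 0, 1]},
}

calibration_gaps = {
    group: overconfidence_gap(d["conf"], d["correct"])
    for group, d in by_group.items()
}

for group, gap in calibration_gaps.items():
    print(f"{group}: {gap:+.2f}")
```

In this toy data the model is similarly confident for both groups but far less accurate for the minority group, so the gap there is much larger.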

**Important:**  
The findings above are based on **binary classification** and **image recognition tasks**.  
> **More research is needed** to determine how generalizable these patterns are across domains and applications.

The next section puts the theoretical bias types from this page into practice. We analyze gender classification models to investigate the different forms of bias in a real-world scenario.

---

### Quiz

**1. True or False:**
Deployment bias occurs when the training data is not representative of the target population.
1. [ ] True
2. [ ] False

**2. Which of the following is an example of measurement bias?**
*(Select one option)*

1. [ ] A model trained on mostly Western images underperforms in Asian countries
2. [ ] A recidivism model uses "arrest record" as a label for "criminal behavior"
3. [ ] A model misclassifies women due to male-dominated training data
4. [ ] A model is evaluated only on benchmark datasets that exclude minorities

**3. Which statement about bias amplification is correct?**
*(Select one option)*

1. [ ] It only occurs when the training data is fully biased
2. [ ] It can occur even in small datasets due to overfitting
3. [ ] It always increases with model capacity
4. [ ] It can be avoided by removing all group-related features

---

#### Sources:
1. Mehrabi et al., 2021
2. Howard & Borenstein, 2018
3. Suresh & Guttag, 2021
4. Corbett-Davies et al., 2023
5. Buolamwini & Gebru, 2018
6. Hall et al., 2022