# Part 2: Measuring Fairness

## Fairness and Discrimination
Fairness and discrimination are closely related, but not the same. 

>**Fairness** describes the goal of treating individuals and groups equitably in decision-making processes. 
>**Discrimination** refers to a violation of this goal, often resulting in unjust or unequal outcomes<sup>1</sup>. 

In the context of machine learning, fairness is used to assess whether algorithmic decisions disadvantage certain groups — especially based on sensitive attributes like race, gender, or age.
Several factors influence fairness, such as:

- **Transparency**
- **Accountability**
- **Explainability**
- **Bias**

> **Among these, bias plays the most significant role in contributing to discrimination.**


There is **no universal definition of fairness** in machine learning (nor in other disciplines). Competing perspectives exist, often shaped by legal, cultural, or domain-specific goals.<sup>2</sup> Two common perspectives are:

- **Individual fairness**: Similar individuals should be treated similarly.
- **Group fairness**: Different demographic groups should receive similar outcomes.<sup>3</sup>

Each approach has **trade-offs**. 
What is fair in credit scoring may not be fair in university admissions or criminal justice.<sup>4</sup> 

Discrimination in machine learning is different from traditional human discrimination. Algorithms do not have intent or moral awareness. However, they can still produce discriminatory outcomes when trained on biased or unbalanced data. We can distinguish two main types of algorithmic discrimination:<sup>3</sup>

- **Direct discrimination**: Using protected attributes (e.g. gender or race) explicitly in decision-making.
- **Indirect discrimination**: Using seemingly neutral features (e.g. zip code) that correlate with protected attributes and act as proxies.<sup>2</sup>

Discrimination occurs at multiple levels:<sup>1</sup>

- **Structural**: Systemic inequality embedded in laws or history
- **Organizational**: Biased rules or decision processes in institutions
- **Interpersonal**: Individual-level stereotypes or assumptions

Machine learning systems can unintentionally replicate or amplify discrimination from any of these levels.
Understanding how fairness and discrimination interact is essential for designing ethical models. The next section introduces common fairness metrics that allow us to detect and evaluate discrimination in practice.

---

## Fairness Metrics

As noted above, there is no universal definition of fairness in machine learning. In recent years, many fairness metrics have been proposed to evaluate algorithmic decision-making. Their sheer number can be overwhelming, not least because there is no clear consensus on when to use which metric. Simply satisfying as many notions as possible is not an option, because some definitions are **mathematically incompatible**.<sup>5,6</sup>
> **Fairness metrics should be seen as diagnostic tools, not automatic solutions. They can help to identify potential sources of unfairness and discrimination but do not directly fix them.** 

Choosing the right metric depends on:
- **The application domain**
- **Ethical priorities**
- **Context-specific trade-offs**

The following section focuses on **observational fairness metrics**. Because many of these definitions are derived from the confusion matrix, we first briefly review the **confusion matrix** and some **core statistical measures**.

The fairness metrics selected for this part of the notebook capture the main ideas behind measuring fairness; many other metrics follow a similar approach. For a more extensive list, see Verma & Rubin (2018). Note also that most research on fairness metrics, like this notebook, focuses on classification algorithms.

---

### Confusion Matrix

The confusion matrix compares the predicted and true class labels. It forms the basis for many fairness and performance metrics:

|                | Predicted Positive | Predicted Negative |
|----------------|--------------------|--------------------|
| **Actual Positive** | True Positive (TP)      | False Negative (FN)     |
| **Actual Negative** | False Positive (FP)     | True Negative (TN)      |

---

### Accuracy

Accuracy is the most basic evaluation metric in classification:

$$
\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}
$$

> **Limitation**: Accuracy can be misleading in imbalanced datasets and tells us nothing about fairness across groups.
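
The limitation above can be made concrete with a small sketch. The data are invented for illustration (95 negatives, 5 positives), and `confusion_counts` is a hypothetical helper, not part of any library: a "model" that always predicts the negative class reaches high accuracy while finding no positives at all.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FN, FP, TN) for binary labels coded as 0/1."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fn, fp, tn


# Invented, imbalanced toy data: the "model" always predicts negative.
y_true = [0] * 95 + [1] * 5
y_pred = [0] * 100

tp, fn, fp, tn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)  # TPR: share of actual positives that were found

print(tp, fn, fp, tn)    # 0 5 0 95
print(accuracy, recall)  # 0.95 0.0
```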

---

### Core Statistical Measures

These 8 measures are derived from the confusion matrix and form the basis for many fairness definitions:<sup>7</sup>

| **Measure**                          | **Formula**               | **Description**                                      |
|-------------------------------------|---------------------------|------------------------------------------------------|
| Positive Predictive Value/Precision (PPV)  | TP / (TP + FP)     | How many predicted positives are correct? |
| False Discovery Rate (FDR)          | FP / (TP + FP)            | How many predicted positives are wrong?   |
| Negative Predictive Value (NPV)     | TN / (TN + FN)            | How many predicted negatives are correct? |
| False Omission Rate (FOR)           | FN / (TN + FN)            | How many predicted negatives are wrong?   |
| True Positive Rate/Recall/Sensitivity (TPR)| TP / (TP + FN)     | How many actual positives are caught?     |
| False Negative Rate (FNR)           | FN / (TP + FN)            | How many positives were missed?           |
| False Positive Rate (FPR)           | FP / (FP + TN)            | How many negatives were wrongly predicted as positive?|
| True Negative Rate/Specificity (TNR)| TN / (TN + FP)            | How many actual negatives are caught?     |

> These values are computed **per group** (e.g. male/female) to assess fairness.
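
As a minimal sketch of the per-group computation, the snippet below derives the eight measures separately for two groups. The group names `"a"` and `"b"`, the counts, and the helpers `make_records` and `measures` are all assumptions made for illustration.

```python
from collections import Counter


def make_records(group, tp, fn, fp, tn):
    """Build (group, y_true, y_pred) tuples from confusion-matrix counts."""
    return ([(group, 1, 1)] * tp + [(group, 1, 0)] * fn
            + [(group, 0, 1)] * fp + [(group, 0, 0)] * tn)


def measures(records):
    """Compute the eight confusion-matrix measures for one group's records."""
    c = Counter((y, y_hat) for _, y, y_hat in records)
    tp, fn, fp, tn = c[(1, 1)], c[(1, 0)], c[(0, 1)], c[(0, 0)]
    # A zero denominator (e.g. no predicted positives) would raise an error here.
    return {
        "PPV": tp / (tp + fp), "FDR": fp / (tp + fp),
        "NPV": tn / (tn + fn), "FOR": fn / (tn + fn),
        "TPR": tp / (tp + fn), "FNR": fn / (tp + fn),
        "FPR": fp / (fp + tn), "TNR": tn / (tn + fp),
    }


# Invented toy data for two hypothetical groups.
records = make_records("a", 40, 10, 10, 40) + make_records("b", 20, 20, 5, 55)

per_group = {g: measures([r for r in records if r[0] == g]) for g in ("a", "b")}
for g, m in per_group.items():
    print(g, {name: round(value, 3) for name, value in m.items()})
```

In this toy data both groups have the same PPV (0.8), yet the TPR is 0.8 for group `a` and 0.5 for group `b`, which is why a single measure rarely tells the whole story.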

---

## Observational Fairness

**Observational fairness** refers to fairness definitions that rely only on observed data — specifically on statistical relationships between:

- **Ŷ**: the model prediction  
- **Y**: the ground truth  
- **A**: the sensitive attribute (e.g. race, gender)

These definitions **do not require access to causal knowledge** or model internals. They are easy to compute and widely used in fairness audits.

Most observational fairness metrics are based on combinations of the 8 statistical measures from the confusion matrix.

We group them into **three main categories**:<sup>1</sup>

- Independence
- Separation
- Sufficiency

---

### Independence

Requires:  
$$
\hat{Y} \perp A
$$

The predicted outcome should be statistically independent of the sensitive attribute.<sup>1</sup>

This means that all groups should receive positive predictions at equal rates — **regardless of their actual outcome (Y)**.<sup>7</sup>

#### Common Metric
- **Statistical Parity / Demographic Parity**<sup>7</sup>
  $$
  P(\hat{Y} = 1 \mid A = 0) = P(\hat{Y} = 1 \mid A = 1)
  $$
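
One way to check this condition on data is to estimate both probabilities and compare them. The sketch below assumes a binary sensitive attribute coded 0/1 and uses invented predictions; `positive_rate` is a hypothetical helper.

```python
def positive_rate(y_pred, groups, value):
    """Estimate P(Y_hat = 1 | A = value) from binary predictions."""
    selected = [p for p, g in zip(y_pred, groups) if g == value]
    return sum(selected) / len(selected)


# Invented toy data: 10 individuals per group.
groups = [0] * 10 + [1] * 10
y_pred = [1] * 6 + [0] * 4 + [1] * 3 + [0] * 7

rate_0 = positive_rate(y_pred, groups, 0)  # 6 of 10 predicted positive
rate_1 = positive_rate(y_pred, groups, 1)  # 3 of 10 predicted positive
parity_difference = rate_1 - rate_0        # 0 would mean statistical parity

print(rate_0, rate_1, round(parity_difference, 3))  # 0.6 0.3 -0.3
```

Note that the true labels Y never enter the computation, which is exactly the property listed under the cons of this criterion.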

#### Pros
- Simple and intuitive
- Easy to implement and visualize

#### Cons
- Ignores true outcome (Y)
- Can result in unfair treatment if base rates differ
- Overlooks explainable or justified outcome differences
- Blind to structural and historical context

> **Example**:
> Consider a machine learning system that predicts the likelihood of reoffending (such as COMPAS).
> If men statistically reoffend more often than women, enforcing equal prediction rates across groups could lead to harsher treatment of women without justification.<sup>4</sup>

---

### Separation

Requires:  
$$
\hat{Y} \perp A \mid Y
$$

Given the true label Y, predictions should be independent of A.

This ensures **equal error rates** across groups.<sup>1</sup>

#### Common Metrics
- **Equalized Odds**:<sup>7</sup>  
  Equal FPR and FNR across groups  
- **Predictive Equality**:<sup>7</sup>
  Equal FPR only
- **Equal Opportunity**:<sup>7</sup>
  Equal FNR only
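
The three metrics above differ only in which error rates they compare, as the following sketch shows. The data are invented, and `TOLERANCE` is an arbitrary illustrative threshold for "equal", not a value taken from the literature.

```python
def error_rates(pairs):
    """Return FPR and FNR for a list of (y_true, y_pred) pairs."""
    tp = sum(1 for y, y_hat in pairs if y == 1 and y_hat == 1)
    fn = sum(1 for y, y_hat in pairs if y == 1 and y_hat == 0)
    fp = sum(1 for y, y_hat in pairs if y == 0 and y_hat == 1)
    tn = sum(1 for y, y_hat in pairs if y == 0 and y_hat == 0)
    return {"FPR": fp / (fp + tn), "FNR": fn / (fn + tp)}


# Invented toy data for two hypothetical groups.
data = {
    "a": [(1, 1)] * 40 + [(1, 0)] * 10 + [(0, 1)] * 10 + [(0, 0)] * 40,
    "b": [(1, 1)] * 20 + [(1, 0)] * 20 + [(0, 1)] * 12 + [(0, 0)] * 48,
}
rates = {g: error_rates(pairs) for g, pairs in data.items()}

fpr_gap = abs(rates["a"]["FPR"] - rates["b"]["FPR"])
fnr_gap = abs(rates["a"]["FNR"] - rates["b"]["FNR"])

TOLERANCE = 0.05  # assumption: gaps below this count as "equal"
predictive_equality = fpr_gap <= TOLERANCE
equal_opportunity = fnr_gap <= TOLERANCE
equalized_odds = predictive_equality and equal_opportunity

print(predictive_equality, equal_opportunity, equalized_odds)  # True False False
```

Here both groups have an FPR of 0.2, so predictive equality holds, but the FNR is 0.2 versus 0.5, so equal opportunity and therefore equalized odds are violated.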

#### Pros
- Captures error disparities between groups
- Especially relevant when different types of errors (false positives vs. false negatives) have unequal real-world consequences (e.g. false arrests vs. wrongful releases)

#### Cons
- May reduce overall accuracy
- Relies on a valid and unbiased ground truth (Y)
- Generally cannot be satisfied together with independence or sufficiency when base rates differ between groups and predictions are imperfect

> Trade-offs between fairness and performance can be visualized using **ROC curves**.
> Disparities in error rates often reflect and reinforce historical inequalities.<sup>1</sup>

---

### Sufficiency

Requires:  
$$
Y \perp A \mid \hat{Y}
$$

Given the prediction, the true outcome should be independent of A.

This means that **predictions are equally reliable across groups**.<sup>1</sup>

#### Common Metrics
- **Calibration Concepts**:<sup>7</sup>
  - **Calibration**:
    On average, across the whole population, predicted probabilities match actual outcomes.
  - **Group Calibration**:
    Predicted probabilities match actual outcomes **within each demographic group** (e.g. gender, race).
  - **Well-Calibration**:
    The strongest form: predicted probabilities perfectly match actual outcomes at a fine-grained level **within each group** and for **each score value**.

  In fairness assessments, **group calibration** is the variant most commonly used because it checks whether predictions are **equally interpretable across groups**.
 
> **Example:**
> If a model predicts a 70% risk of default, about 70% of individuals assigned a score of 0.7 should actually default — regardless of their group membership.  
  
- **Predictive Parity**:<sup>7</sup>  
  Equal PPV across groups  
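
A simple way to inspect sufficiency is to compare, per group and per score value, the predicted score with the observed share of positives. The sketch below uses invented data with only two score values; the helpers `cases` and `observed_rates` and the decision threshold of 0.5 are assumptions for illustration.

```python
from collections import defaultdict


def cases(group, score, n_total, n_positive):
    """Build (group, score, y_true) tuples for one group/score cell."""
    return ([(group, score, 1)] * n_positive
            + [(group, score, 0)] * (n_total - n_positive))


def observed_rates(records):
    """Map (group, score) to the fraction of actual positives in that cell."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, score, y in records:
        totals[(group, score)] += 1
        positives[(group, score)] += y
    return {key: positives[key] / totals[key] for key in totals}


# Invented toy data: two hypothetical groups, two score values.
records = (cases("a", 0.7, 10, 7) + cases("a", 0.2, 10, 2)
           + cases("b", 0.7, 20, 14) + cases("b", 0.2, 10, 2))

rates = observed_rates(records)
for (group, score), rate in sorted(rates.items()):
    print(group, score, round(rate, 3))

# With an assumed threshold of 0.5, only score 0.7 counts as a positive
# prediction, so each group's PPV is its observed rate at that score.
ppv = {g: rates[(g, 0.7)] for g in ("a", "b")}
```

In this toy data the observed rate equals the score in every cell, so the scores are calibrated within each group, and the PPV is 0.7 for both groups, so predictive parity holds as well.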
 
#### Pros
- Predictions are equally reliable across groups
- Ensures consistent interpretation of predicted scores
- Often achievable without explicit fairness constraints

#### Cons
- Relies on a valid and unbiased ground truth (Y)
- Can reproduce harmful disparities if the ground truth itself reflects historical bias
- Can conflict with separation-based metrics

> Sufficiency ensures that prediction scores are consistent across groups, but, like all statistical fairness definitions, it does not address deeper structural inequalities.<sup>1</sup>

---

### Incompatibility of Metrics

One challenge in fair machine learning is that **not all fairness metrics can be satisfied at the same time**. In most real-world settings, the sensitive attribute A and the true outcome Y are statistically dependent, and under that condition the fairness definitions impose **conflicting requirements**.
Conflicts arise because:

- **Independence** requires predictions to ignore group membership
$$
\hat{Y} \perp A
$$
- **Separation** requires equal error rates, which depend on group-specific outcome distributions
$$
\hat{Y} \perp A \mid Y
$$
- **Sufficiency** focuses on reliability of predictions across groups
$$
Y \perp A \mid \hat{Y}
$$


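These conditions generally cannot hold simultaneously.<sup>5,6</sup> Separation and sufficiency can both hold essentially only if:

- Predictions are **perfect**, or
- The sensitive attribute **has no statistical relationship** with the target variable (which is rare)

Independence additionally conflicts with sufficiency whenever A and Y are statistically dependent, so all three criteria can hold together only in the second case.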

Example: **COMPAS**<sup>8</sup>

- The model was **calibrated** → it satisfied sufficiency (PPV equal across groups)
- But it had **unequal error rates** (FPR/FNR differed by race) → it violated separation
- This illustrates how calibration alone cannot guarantee fairness.<sup>1</sup>
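
The conflict between equal PPV and equal error rates follows directly from the confusion matrix: for a group with base rate p, FPR = p / (1 − p) · (1 − PPV) / PPV · (1 − FNR). The sketch below evaluates this identity for two groups; the numbers are invented and are not COMPAS figures, and `implied_fpr` is a hypothetical helper.

```python
def implied_fpr(base_rate, ppv, fnr):
    """FPR forced by the base rate, PPV and FNR of a binary classifier."""
    return base_rate / (1 - base_rate) * (1 - ppv) / ppv * (1 - fnr)


# Same PPV and FNR in both groups, but different (invented) base rates.
fpr_a = implied_fpr(base_rate=0.5, ppv=0.7, fnr=0.3)
fpr_b = implied_fpr(base_rate=0.3, ppv=0.7, fnr=0.3)

print(round(fpr_a, 3), round(fpr_b, 3))  # 0.3 0.129
```

With equal PPV and equal FNR, the group with the higher base rate necessarily ends up with the higher false positive rate, so separation fails even though predictive parity holds.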

**Takeaway:** Fairness is not one-size-fits-all. Trade-offs between fairness goals are inevitable. Which metric to use depends on context, goals, and ethical priorities.

> **Note:** Observational fairness metrics are useful diagnostic tools to detect statistical disparities between groups.  
> However, they only rely on observed relationships between predictions, outcomes, and sensitive attributes — and ignore other relevant features that may contribute to unfairness.  
> As a result, they often miss the actual mechanisms behind discrimination. These metrics are blind to structural inequalities, biased decision processes, and causal factors outside the model.  
> They can confirm unequal treatment, but not explain *why* it happens.  
> **Similarity-based** and **causal fairness** approaches aim to address these limitations by evaluating the decision-making process itself and identifying justified and unjustified sources of disparity.

---

## Similarity-Based Fairness

While observational fairness focuses on statistical group-level patterns in model outcomes, **similarity-based fairness** evaluates the fairness of the **decision process itself** by assessing whether similar individuals receive similar outcomes — regardless of their group membership.

> This reflects the intuitive idea that fairness means treating comparable cases consistently, based on individual characteristics.<sup>2</sup>

**Note:** Defining what counts as *similar* can itself be subjective and context-dependent.

---

### Key Methods

#### Causal Discrimination (pairwise test)<sup>7</sup>
- Two individuals who differ **only** in a sensitive feature (e.g. gender) should receive **the same outcome**.
- Captures the idea that sensitive attributes **should not causally influence** decisions.
- In practice, it is often unrealistic to find two individuals who differ in only one dimension, because sensitive attributes are usually correlated with other features.
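
When matching real pairs is not feasible, one workaround is to create the pair synthetically by changing only the sensitive attribute of each record. The sketch below assumes two invented decision rules (`biased_model`, `blind_model`) and a hypothetical helper `flip_test`; none of them comes from a library.

```python
def biased_model(applicant):
    """Hypothetical rule that uses gender directly."""
    bonus = 5 if applicant["gender"] == "m" else 0
    return int(applicant["income"] / 1000 + bonus >= 50)


def blind_model(applicant):
    """Hypothetical rule that ignores gender."""
    return int(applicant["income"] / 1000 >= 50)


def flip_test(model, applicants, attribute, values):
    """Return applicants whose decision changes when only `attribute` varies."""
    changed = []
    for person in applicants:
        outcomes = {model({**person, attribute: v}) for v in values}
        if len(outcomes) > 1:
            changed.append(person)
    return changed


# Invented applicants.
applicants = [
    {"income": 47000, "gender": "f"},
    {"income": 60000, "gender": "m"},
    {"income": 30000, "gender": "f"},
]

print(len(flip_test(biased_model, applicants, "gender", ["f", "m"])))  # 1
print(len(flip_test(blind_model, applicants, "gender", ["f", "m"])))   # 0
```

Such a test only detects direct use of the attribute. A model that relies on a correlated proxy would pass it, because the other features are left unchanged.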

#### Fairness through Unawareness<sup>9</sup>
- Ensures fairness by **removing sensitive attributes** from the data (blinding).
- Based on the logic: “if the model doesn’t see it, it can’t discriminate.”
- Limitation: **proxy variables** can still leak bias (e.g. in the U.S., zip code can act as a proxy for race).
- Can lead to **miscalibration** or **reduced accuracy** for some groups if relevant factors are ignored.
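
The proxy problem can be illustrated by asking how well the removed attribute can be guessed from a remaining feature. The sketch below uses invented zip codes and group labels and a deliberately simple majority-vote guess.

```python
from collections import Counter, defaultdict

# Invented rows of (zip_code, sensitive_group); the group column is the one
# that "blinding" would remove from the training data.
rows = ([("10001", "x")] * 45 + [("10001", "y")] * 5
        + [("20002", "x")] * 10 + [("20002", "y")] * 40)

# Count group membership per zip code.
by_zip = defaultdict(Counter)
for zip_code, group in rows:
    by_zip[zip_code][group] += 1

# Guess the majority group of each zip code and count correct guesses.
correct = sum(counts.most_common(1)[0][1] for counts in by_zip.values())
recoverability = correct / len(rows)

# Baseline: always guess the overall majority group.
overall = Counter(group for _, group in rows)
baseline = overall.most_common(1)[0][1] / len(rows)

print(recoverability, baseline)  # 0.85 0.55
```

In this toy data the zip code alone recovers the removed attribute for 85% of individuals, compared with 55% for the baseline guess, so a model trained without the attribute can still learn it indirectly.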

#### Fairness through Awareness<sup>7</sup>
- Uses a **similarity function** (e.g. distance metric) to compare individuals.
- Individuals who are “close” in feature space should be treated similarly.
- Requires including **all relevant attributes**, including sensitive ones.
- More flexible than blinding, but depends on how similarity is defined.
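
This idea is often formalized as a Lipschitz-style condition: the difference between the outcomes of two individuals should not exceed their distance under the similarity metric. The sketch below checks that condition for all pairs; the individuals, their predicted probabilities `p`, and the `distance` function are invented for illustration.

```python
from itertools import combinations


def distance(a, b):
    """Hypothetical task-specific metric: scaled gap in test score (0 to 1)."""
    return abs(a["test_score"] - b["test_score"]) / 100


# Invented individuals with a model's predicted probability `p`.
people = [
    {"name": "ana", "test_score": 80, "p": 0.80},
    {"name": "ben", "test_score": 82, "p": 0.50},
    {"name": "cem", "test_score": 40, "p": 0.45},
]

# A pair violates the condition if outcomes differ more than the individuals do.
violations = [
    (a["name"], b["name"])
    for a, b in combinations(people, 2)
    if abs(a["p"] - b["p"]) > distance(a, b)
]

print(violations)  # [('ana', 'ben')]
```

The result depends entirely on the chosen `distance`; a different metric could flag different pairs, which is the subjectivity mentioned above.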

---

#### Pros
- Evaluates fairness at the **individual level**, not just between groups
- Focuses on the **decision process**, not only on outcomes
- Allows for **context-sensitive** definitions of fairness
- Bridges the gap between observational and causal fairness approaches

#### Cons
- Defining **similarity functions** is subjective and challenging
- Sensitive attributes often correlate with other features, complicating comparisons
- Requires **complex feature engineering** and **domain knowledge**

> Similarity-based fairness emphasizes how decisions are made rather than just focusing on the final outcomes. It requires careful thinking about which individuals should be treated similarly.<sup>7</sup>

---

## Causal Fairness

Causal fairness uses **causal models** to understand how variables (e.g. gender, income, education) influence decisions and to **distinguish fair from unfair causal effects**.

> Instead of just asking *who gets what*, causal models ask *why* certain outcomes occur — and whether sensitive attributes **legitimately** influence decisions.

Causal models are represented as **directed graphs** and allow for:<sup>2</sup>
- Reasoning about **hypothetical scenarios**
- Identifying and blocking **unfair influence paths**
- Designing **interventions** to improve fairness

However, modeling causality for **social attributes** (e.g. race, gender) is challenging:<sup>1</sup>
- These categories are **socially constructed**, not fixed
- Meanings vary across time, cultures, and contexts
- Looping effects exist: being labeled can influence future behavior

---

### Key Methods

#### Counterfactual Fairness<sup>9</sup>
- Asks: *Would the decision have been different if the person had belonged to another group, with everything else held equal?*
- If the answer is *no*, the decision is considered counterfactually fair
- Enables **individual-level fairness auditing**
- Requires detailed causal models and well-defined counterfactuals
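
The counterfactual question can be sketched on a toy structural causal model. Everything here is an assumption made for illustration: a binary sensitive attribute `a`, a latent trait `u`, and a single observed feature generated as `x = u + 2 * a`. Real applications require a causal model justified by domain knowledge.

```python
def abduct_u(a, x):
    """Recover the latent trait u from the observation, given x = u + 2 * a."""
    return x - 2 * a


def counterfactual_x(a, x):
    """Feature value the same individual would have had with the other a."""
    return abduct_u(a, x) + 2 * (1 - a)


def predictor_on_x(a, x):
    """Hypothetical predictor that thresholds the observed feature."""
    return int(x >= 3)


def predictor_on_u(a, x):
    """Hypothetical predictor that thresholds the inferred latent trait."""
    return int(abduct_u(a, x) >= 1)


def is_counterfactually_fair(predictor, individuals):
    """True if no decision changes when a is flipped in the causal model."""
    return all(
        predictor(a, x) == predictor(1 - a, counterfactual_x(a, x))
        for a, x in individuals
    )


# Invented individuals as (a, x) pairs.
individuals = [(0, 1.5), (1, 3.5), (0, 4.0)]

print(is_counterfactually_fair(predictor_on_x, individuals))  # False
print(is_counterfactually_fair(predictor_on_u, individuals))  # True
```

The predictor on `x` changes its decision for the first individual because `x` is causally affected by `a`, while the predictor on `u` depends only on a quantity that `a` does not influence in this toy model.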

#### Path-Specific Fairness<sup>10</sup>
- Recognizes that **some paths** from sensitive attributes to outcomes may be acceptable and others not
- Allows defining which **causal paths are fair**
- Offers a **flexible and context-aware** balance between utility and fairness

---

#### Pros
- Enables **targeted diagnosis** of unfair influence  
- Works even with indirect or subtle bias  
- Supports **intervention design**

#### Cons
- Requires **domain knowledge** and modeling effort  
- Difficult to apply when variables are **entangled** or **not clearly defined**


> Causal fairness provides the **most powerful and flexible tools** for fairness analysis, but it is also the most demanding in terms of assumptions and modeling effort.

The next section provides a practical example showing how different fairness metrics can stand in **conflict** with each other and why **no single metric** is sufficient for evaluating fairness in complex systems. In practice, a combination of observational, similarity-based, and causal fairness assessments can offer a more comprehensive understanding of bias and discrimination in ML systems.

---

### Quiz

**1. True or False:**
If a model satisfies statistical parity, it also guarantees equal error rates across groups.
1. [ ] True
2. [ ] False

**2. Which of the fairness metrics requires predictions to be independent of the sensitive attribute, regardless of the true outcome?**
*(Select one option)*

1. [ ] Equalized Odds
2. [ ] Statistical Parity
3. [ ] Calibration
4. [ ] Predictive Parity

**3. Which statement about the incompatibility of fairness metrics is correct?**
*(Select one option)*

1. [ ] All fairness metrics can usually be satisfied simultaneously in real-world settings
2. [ ] If predictions are imperfect and sensitive attributes influence the outcome, different fairness goals can be in conflict
3. [ ] Independence, Separation, and Sufficiency are not in conflict when sensitive attributes are statistically related to the outcome
4. [ ] Sufficiency and Separation can always be satisfied together if enough data is available

---

#### Sources:
1. Barocas et al., 2023
2. Mehrabi et al., 2021
3. Calegari et al., 2023
4. Binns, 2018
5. Chouldechova, 2017
6. Kleinberg et al., 2016
7. Verma & Rubin, 2018
8. Angwin et al., 2016
9. Kusner et al., 2017
10. Corbett-Davies et al., 2023