# <font color="#418FDE" size="6.5" uppercase>**Bias And Fairness**</font>

>Last update: 20260201.
    
By the end of this lecture, you will be able to:
- Identify potential sources of bias in datasets used for machine learning. 
- Explain how biased data can lead to unfair model predictions for certain groups. 
- Propose simple steps a beginner can take to check for and reduce unfairness. 


## **1. Sources of Data Bias**

### **1.1. Uneven Data Sampling**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_01_01.jpg?v=1769974188" width="250">



>* Some groups appear far more often in the data
>* Models then favor majority groups, disadvantaging minorities

>* Data sources can quietly exclude many groups
>* Models work best for those most frequently recorded

>* Large, advanced datasets can still be skewed
>* Ask who is missing and how their absence causes harm



In [None]:
#@title Python Code - Uneven Data Sampling

# This script illustrates uneven data sampling clearly.
# We use a tiny synthetic dataset example.
# Focus is on group counts and simple accuracy.

# Import required libraries for data handling.
import numpy as np
import pandas as pd

# Set deterministic random seed for reproducibility.
np.random.seed(42)

# Create a small balanced dataset with two groups.
balanced_size_per_group = 20
ages_group_a = np.random.normal(loc=30, scale=5, size=balanced_size_per_group)

# Create slightly different ages for second group.
ages_group_b = np.random.normal(loc=50, scale=5, size=balanced_size_per_group)

# Create simple outcome higher for older ages.
outcome_a = (ages_group_a > 35).astype(int)
outcome_b = (ages_group_b > 35).astype(int)

# Build balanced dataframe with equal group representation.
data_balanced = pd.DataFrame(
    {
        "age": np.concatenate([ages_group_a, ages_group_b]),
        "group": ["A"] * balanced_size_per_group
        + ["B"] * balanced_size_per_group,
        "outcome": np.concatenate([outcome_a, outcome_b]),
    }
)

# Show group counts for the balanced dataset.
print("Balanced dataset group counts:")
print(data_balanced["group"].value_counts())

# Create an uneven sampled dataset favoring group A heavily.
uneven_size_a = 35
uneven_size_b = 5

# Sample with replacement, since group A needs 35 draws from only 20 values.
indices_a = np.random.choice(balanced_size_per_group, size=uneven_size_a, replace=True)
indices_b = np.random.choice(balanced_size_per_group, size=uneven_size_b, replace=True)

# Build uneven dataframe using sampled indices.
data_uneven = pd.DataFrame(
    {
        "age": np.concatenate([
            ages_group_a[indices_a],
            ages_group_b[indices_b],
        ]),
        "group": ["A"] * uneven_size_a + ["B"] * uneven_size_b,
        "outcome": np.concatenate([
            outcome_a[indices_a],
            outcome_b[indices_b],
        ]),
    }
)

# Show group counts for the uneven dataset.
print("\nUneven dataset group counts:")
print(data_uneven["group"].value_counts())

# Define a naive rule based only on majority group pattern.
threshold_age = data_uneven[data_uneven["group"] == "A"]["age"].mean()

# Predict outcome using the single learned threshold.
predictions = (data_balanced["age"] > threshold_age).astype(int)

# Check shapes before computing accuracy values.
if predictions.shape[0] == data_balanced.shape[0]:

    # Compute accuracy separately for each group.
    mask_a = data_balanced["group"] == "A"
    mask_b = data_balanced["group"] == "B"

    # Calculate accuracy for group A.
    acc_a = (predictions[mask_a].values
             == data_balanced.loc[mask_a, "outcome"].values).mean()

    # Calculate accuracy for group B.
    acc_b = (predictions[mask_b].values
             == data_balanced.loc[mask_b, "outcome"].values).mean()

    # Print simple comparison of group accuracies.
    print("\nNaive rule accuracy using uneven sampling threshold:")
    print("Group A accuracy:", round(float(acc_a), 3))
    print("Group B accuracy:", round(float(acc_b), 3))

# Final print prompting a comparison of the two group accuracies.
print("\nCompare the two accuracies: a threshold learned mostly from one group may not fit both groups equally.")




### **1.2. Legacy Bias in Data**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_01_02.jpg?v=1769974234" width="250">



>* Historical inequalities get baked into training data
>* Models then repeat and strengthen those past injustices

>* Old hiring and promotion data can encode favoritism
>* Healthcare records may hide illness in underserved groups

>* Historical data can hide deep, persistent bias
>* Question data origins and context to spot unfairness
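
As a minimal sketch of how past decisions carry over into a model, the synthetic example below builds historical hiring records in which equally qualified candidates from group "B" were hired less often. It then fits the simplest possible "model": the historical hire rate of each group. The names `history`, `hire_rate_by_group`, and `recommend`, along with all counts, are illustrative assumptions and not real data.

```python
import pandas as pd

# Hypothetical historical records: every candidate is equally qualified,
# but past decision makers hired group A more often than group B.
history = pd.DataFrame({
    "group": ["A"] * 10 + ["B"] * 10,
    "qualified": [1] * 20,
    "hired": [1] * 8 + [0] * 2 + [1] * 4 + [0] * 6,
})

# A naive "model" that memorizes the historical hire rate of each group.
hire_rate_by_group = history.groupby("group")["hired"].mean()


def recommend(group):
    # Recommend only if the group's past hire rate exceeds 0.5 (assumed cutoff).
    return int(hire_rate_by_group[group] > 0.5)


print(hire_rate_by_group)
print("Recommend qualified A candidate:", recommend("A"))
print("Recommend qualified B candidate:", recommend("B"))
```

Both candidates are equally qualified, yet the rule repeats the historical pattern because the old labels are all it has learned from.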



### **1.3. Unequal Data Measurement**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_01_03.jpg?v=1769974246" width="250">



>* Measurement methods differ across groups, creating bias
>* Models learn and reinforce these distorted measurements

>* Subjective ratings and records can encode hidden bias
>* Models trained on them may unfairly punish groups

>* Technology, language, and environment can distort data capture
>* Uneven measurement creates hidden bias in models
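
One way to see the effect of uneven measurement is a small synthetic sketch in which both groups have identical true scores, but the instrument is assumed to under-read one group by a fixed amount. The offset of 1.5, the cutoff of 5.0, and the column names are hypothetical choices made only for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical true severity scores, identical for both groups.
true_score = np.array([2.0, 4.0, 6.0, 8.0])

# Assumed measurement bias: the instrument under-reads group B by 1.5 units.
measured_a = true_score + 0.0
measured_b = true_score - 1.5

data = pd.DataFrame({
    "group": ["A"] * 4 + ["B"] * 4,
    "true_score": np.concatenate([true_score, true_score]),
    "measured": np.concatenate([measured_a, measured_b]),
})

# The same cutoff defines true need and the decision on the measured value.
cutoff = 5.0
data["needs_care"] = (data["true_score"] >= cutoff).astype(int)
data["flagged"] = (data["measured"] >= cutoff).astype(int)

# Missed cases: people who need care but were not flagged.
data["missed"] = ((data["needs_care"] == 1) & (data["flagged"] == 0)).astype(int)
missed_by_group = data.groupby("group")["missed"].sum()
print(missed_by_group)
```

The decision rule is identical for everyone; the gap in missed cases comes entirely from how the input was measured.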



## **2. Unequal Model Outcomes**

### **2.1. Error Rate Gaps**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_02_01.jpg?v=1769974262" width="250">



>* Overall accuracy can hide group performance differences
>* Biased training data creates unequal error rates

>* Different tasks show unequal false positives or negatives
>* Historical data imbalances cause clustered mistakes by group

>* Error gaps cause unequal treatment and harm
>* They mirror and worsen existing social inequalities



In [None]:
#@title Python Code - Error Rate Gaps

# This script shows simple error rate gaps.
# We compare model mistakes across two groups.
# Focus is on unequal outcomes not algorithms.

# Import numerical, data-handling, and plotting libraries.
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# set deterministic random seed for reproducibility.
np.random.seed(42)

# create tiny dataset with group labels and outcomes.
data = {
    "group": ["A"] * 20 + ["B"] * 20,
    "true_label": [1] * 10 + [0] * 10 + [1] * 10 + [0] * 10,
}

# create biased predictions with higher errors for group B.
predictions = []
for i, g in enumerate(data["group"]):
    true = data["true_label"][i]
    if g == "A":
        flip_prob = 0.1

    else:
        flip_prob = 0.4

    if np.random.rand() < flip_prob:
        pred = 1 - true

    else:
        pred = true

    predictions.append(pred)

# build dataframe from constructed columns and predictions.
df = pd.DataFrame(data)
df["prediction"] = predictions

# verify dataframe shape is as expected.
assert df.shape == (40, 3)


# function to compute error rate for each group.
def compute_error_rate(group_name):
    subset = df[df["group"] == group_name]
    errors = subset["prediction"] != subset["true_label"]
    return errors.mean()


# calculate error rates for both groups separately.
error_A = compute_error_rate("A")
error_B = compute_error_rate("B")

# calculate overall error rate across all individuals.
overall_error = (df["prediction"] != df["true_label"]).mean()

# print concise summary highlighting error rate gaps.
print("Overall error rate:", round(overall_error, 3))
print("Group A error rate:", round(error_A, 3))
print("Group B error rate:", round(error_B, 3))
print("Error rate gap B minus A:", round(error_B - error_A, 3))

# prepare values for simple bar chart comparison.
labels = ["Overall", "Group A", "Group B"]
values = [overall_error, error_A, error_B]

# create bar plot to visualize unequal error rates.
fig, ax = plt.subplots(figsize=(5, 4))
ax.bar(labels, values, color=["gray", "skyblue", "salmon"])

# label axes and title clearly for beginners.
ax.set_ylabel("Error rate (proportion wrong)")
ax.set_title("Error rate gaps between groups A and B")

# add horizontal line showing overall error reference.
ax.axhline(overall_error, color="black", linestyle="--", linewidth=1)

# adjust layout and display the single plot.
plt.tight_layout()
plt.show()
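
A single error rate does not say which kind of mistake a group experiences. The sketch below, using a small hand-made table of labels and predictions (all values are hypothetical), splits errors into false negatives and false positives per group. The helper `error_breakdown` is an illustrative name, not part of any library.

```python
import pandas as pd

# Hypothetical labels and predictions for two groups (illustrative only).
records = pd.DataFrame({
    "group": ["A"] * 8 + ["B"] * 8,
    "true_label": [1, 1, 1, 1, 0, 0, 0, 0] * 2,
    "prediction": [
        1, 1, 1, 0, 0, 0, 0, 0,  # group A: one false negative
        1, 1, 0, 0, 1, 0, 0, 0,  # group B: two false negatives, one false positive
    ],
})


def error_breakdown(group_df):
    # Split the group into truly positive and truly negative cases.
    positives = group_df[group_df["true_label"] == 1]
    negatives = group_df[group_df["true_label"] == 0]
    return pd.Series({
        "false_negative_rate": (positives["prediction"] == 0).mean(),
        "false_positive_rate": (negatives["prediction"] == 1).mean(),
    })


# Build one row of rates per group.
rows = {name: error_breakdown(g) for name, g in records.groupby("group")}
breakdown = pd.DataFrame(rows).T
print(breakdown)
```

Which of the two rates matters more depends on the task: a missed diagnosis and a wrongful flag harm people in different ways.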




### **2.2. Real World Consequences**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_02_02.jpg?v=1769974304" width="250">



>* Biased hiring models quietly limit people’s opportunities
>* They reinforce inequality and reduce workplace diversity

>* Biased models misjudge health and crime risks
>* Unequal errors worsen care, punishment, and resources

>* Biased systems harm confidence, opportunities, and participation
>* Communities may distrust technology, worsening social division



### **2.3. Real World Bias Examples**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_02_03.jpg?v=1769974315" width="250">



>* Hiring algorithms learn past gender-skewed patterns
>* Qualified women and non-binary people get unfairly rejected

>* Biased policing data makes some communities seem riskier
>* Models reinforce bias, driving harsher future treatment

>* Healthcare and credit models can misjudge groups
>* Biased training data amplifies existing social inequalities



## **3. Practical Fairness Checks**

### **3.1. Group Fairness Checks**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_03_01.jpg?v=1769974327" width="250">



>* Check model performance separately for different groups
>* Compare error patterns to spot possible unfairness

>* Compare key outcome rates across different groups
>* Unequal rates reveal hidden patterns of unfairness

>* Adjust data and thresholds to reduce unfairness
>* Monitor group gaps over time and document limitations



In [None]:
#@title Python Code - Group Fairness Checks

# This script shows simple group fairness checks.
# We use tiny synthetic hiring recommendation data.
# Focus on comparing model errors across demographic groups.

# The required libraries (numpy, pandas) are available in Colab by default.

# Import numpy and pandas for simple data handling.
import numpy as np
import pandas as pd

# Set deterministic random seed for reproducible results.
rng = np.random.default_rng(seed=42)

# Create a tiny synthetic dataset for hiring.
num_rows = 40
qualified = rng.integers(low=0, high=2, size=num_rows)

# Create a simple gender group column for fairness checks.
gender = rng.choice(["female", "male"], size=num_rows)

# Simulate a base model score from qualification plus random noise.
base_scores = qualified + rng.normal(loc=0.0, scale=0.4, size=num_rows)

# Add a small unfair bonus to one gender group.
bonus = np.where(gender == "male", 0.3, 0.0)
model_score = base_scores + bonus

# Build a pandas DataFrame to hold our dataset.
data = pd.DataFrame({
    "qualified": qualified,
    "gender": gender,
    "score": model_score,
})

# Validate dataset shape before continuing with analysis.
assert data.shape[0] == num_rows

# Set one global decision threshold for interview recommendation.
threshold = 0.8

# Create model decision column based on the threshold.
data["recommended"] = (data["score"] >= threshold).astype(int)

# Define a helper to compute group fairness summary metrics.
def group_fairness_summary(df, group_col):
    # Prepare a list for storing summary rows.
    rows = []

    # Loop through each group value in the column.
    for group_value, group_df in df.groupby(group_col):
        # Skip groups that are too tiny for stable rates.
        if len(group_df) < 3:
            continue

        # Compute basic counts for this group.
        total = len(group_df)
        positives = group_df["qualified"].sum()

        # Avoid division by zero when no positives exist.
        if positives == 0:
            true_positive_rate = np.nan
        else:
            # Share of qualified correctly recommended.
            true_positive_rate = (
                group_df[(group_df["qualified"] == 1) &
                         (group_df["recommended"] == 1)].shape[0]
                / positives
            )

        # Compute false positive rate for unqualified candidates.
        negatives = total - positives
        if negatives == 0:
            false_positive_rate = np.nan
        else:
            false_positive_rate = (
                group_df[(group_df["qualified"] == 0) &
                         (group_df["recommended"] == 1)].shape[0]
                / negatives
            )

        # Compute overall accuracy for this group.
        accuracy = (
            (group_df["qualified"] == group_df["recommended"]).sum()
            / total
        )

        # Append summary row for this group.
        rows.append({
            group_col: group_value,
            "group_size": total,
            "true_positive_rate": round(float(true_positive_rate), 3)
            if not np.isnan(true_positive_rate) else np.nan,
            "false_positive_rate": round(float(false_positive_rate), 3)
            if not np.isnan(false_positive_rate) else np.nan,
            "accuracy": round(float(accuracy), 3),
        })

    # Return a DataFrame with one row per group.
    return pd.DataFrame(rows)

# Compute fairness summary by gender group.
summary = group_fairness_summary(data, "gender")

# Fix the column order for compact printing.
summary_to_print = summary[["gender", "group_size", "true_positive_rate",
                            "false_positive_rate", "accuracy"]]

# Print a short explanation header for learners.
print("Group fairness check for a simple hiring model:")

# Print the summary table with basic group metrics.
print(summary_to_print.to_string(index=False))

# Print a brief interpretation hint for beginners.
print("Compare rates across groups to spot possible unfair gaps.")
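
Besides error rates, a simple outcome-rate check is to compare how often each group is selected at a given threshold. The sketch below uses a small hand-made score table (hypothetical values) and two illustrative helpers, `selection_rates` and `rate_ratio`. The ratio of the lowest to the highest selection rate is one possible summary, not the only one.

```python
import pandas as pd

# Hypothetical model scores for two groups (illustrative values only).
scores = pd.DataFrame({
    "group": ["female"] * 5 + ["male"] * 5,
    "score": [0.2, 0.5, 0.7, 0.9, 1.1, 0.5, 0.8, 1.0, 1.2, 1.4],
})


def selection_rates(df, threshold):
    # Share of each group whose score meets the threshold.
    selected = df["score"] >= threshold
    return selected.groupby(df["group"]).mean()


def rate_ratio(rates):
    # Lowest selection rate divided by the highest; 1.0 means equal rates.
    return rates.min() / rates.max()


rates = selection_rates(scores, threshold=0.8)
print(rates)
print("Selection rate ratio:", round(float(rate_ratio(rates)), 3))
```

Re-running `selection_rates` with a different `threshold` shows how a single global cutoff can widen or narrow the gap between groups.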




### **3.2. Broadening Data Representation**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_03_02.jpg?v=1769974376" width="250">



>* Check if training data reflects real users
>* Look for missing groups and potential unfair gaps

>* Add ethically collected data from underrepresented groups
>* Reduce overfitting to majority groups and unfair errors across groups

>* Match training data to real usage contexts
>* Fill missing scenarios or narrow model scope
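
A representation check can be sketched by comparing each group's share of the training data with its expected share among real users. Everything below is an assumption for illustration: the group labels, the `expected_share` values, and the 0.8 cutoff used to flag a group as underrepresented.

```python
import pandas as pd

# Hypothetical group labels in the training data.
train_groups = pd.Series(["A"] * 70 + ["B"] * 25 + ["C"] * 5, name="group")

# Assumed share of each group among the people the model will serve.
expected_share = pd.Series({"A": 0.5, "B": 0.3, "C": 0.2})

# Observed share of each group in the training data.
observed_share = train_groups.value_counts(normalize=True)

# Put both side by side; a ratio well below 1 marks underrepresentation.
report = pd.DataFrame({
    "observed": observed_share,
    "expected": expected_share,
})
report["observed_over_expected"] = report["observed"] / report["expected"]
print(report.round(2))

# Flag groups whose ratio falls below an assumed cutoff of 0.8.
underrepresented = report.index[report["observed_over_expected"] < 0.8].tolist()
print("Underrepresented groups:", underrepresented)
```

A flagged group is a prompt to collect more data for it or to narrow the stated scope of the model.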



### **3.3. Communicating Model Limitations**

<img src="https://cdn.jsdelivr.net/gh/mhrafiei/contents@main/LFF/Machine Learning for Beginners/Module_10/Lecture_A/image_03_03.jpg?v=1769974387" width="250">



>* State where the model works and fails
>* Help users question predictions, not trust blindly

>* Describe model purpose, data, and weak spots
>* Warn about unfairness for underrepresented applicant groups

>* Explain safe use, limits, and override situations
>* Encourage human review, feedback, and ongoing improvement
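
One lightweight way to record these points is a short limitations note stored next to the model. The sketch below keeps the note as a plain dictionary and renders it as text. The field names and wording are hypothetical and would need to reflect the actual model.

```python
# Hypothetical limitations note for a hiring-recommendation model.
model_note = {
    "purpose": "Suggest applicants for human review; not a final decision.",
    "training_data": "Synthetic example data; groups are not equally represented.",
    "known_weak_spots": [
        "Higher error rates for groups with few training examples.",
        "Scores near the decision threshold are unreliable.",
    ],
    "safe_use": "A person reviews every recommendation and may override it.",
}


def render_note(note):
    # Turn the dictionary into plain text that can ship with the model.
    lines = []
    for key, value in note.items():
        title = key.replace("_", " ").capitalize()
        if isinstance(value, list):
            lines.append(f"{title}:")
            lines.extend(f"  - {item}" for item in value)
        else:
            lines.append(f"{title}: {value}")
    return "\n".join(lines)


print(render_note(model_note))
```

Keeping the note in a structured form makes it easy to update whenever new group-level checks reveal another weak spot.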



# <font color="#418FDE" size="6.5" uppercase>**Bias And Fairness**</font>


In this lecture, you learned to:
- Identify potential sources of bias in datasets used for machine learning. 
- Explain how biased data can lead to unfair model predictions for certain groups. 
- Propose simple steps a beginner can take to check for and reduce unfairness. 

<font color='yellow'>Congratulations on completing this course!</font>