# 📌 What is Computing Bias?
> **Bias:** An inclination or prejudice in favor of or against a person or a group of people, typically in a way that is unfair.

Computing **bias** occurs when computer programs, algorithms, or systems produce results that unfairly favor or disadvantage certain groups. This bias can result from **biased data, flawed design, or unintended consequences** of programming.

<iframe width="560" height="315" src="https://www.youtube.com/embed/aeBLboArW8c?si=No1ZvRCvxdiEmbZB" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

---

## 🎥 Example: Netflix Recommendation Bias
Netflix provides content recommendations to users through algorithms. However, these algorithms can introduce bias in several ways:  

### 🔍 **How Bias Can Occur:**
- **Majority Preference Bias:**  
  - Recommending mostly popular content, making it hard for less popular or niche content to be discovered.  
- **Filtering Bias:**  
  - Filtering out content that doesn’t fit a user’s perceived interests based on limited viewing history.  
  - For example, if a user primarily watches romantic comedies, Netflix may avoid suggesting documentaries or foreign films, even if the user would enjoy them.  
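
To make majority preference bias concrete, here is a minimal sketch of a recommender that ranks purely by view count. It is not Netflix's actual algorithm; the titles, the view counts, and the `recommend_by_popularity` function are all hypothetical.

```python
# Hypothetical catalog: title -> total views (made-up numbers).
catalog = {
    "Blockbuster A": 9_000_000,
    "Blockbuster B": 7_500_000,
    "Hit Series C": 6_000_000,
    "Indie Drama D": 40_000,
    "Foreign Film E": 25_000,
}

def recommend_by_popularity(views, k):
    """Return the k most-viewed titles, ignoring everything else."""
    ranked = sorted(views, key=views.get, reverse=True)
    return ranked[:k]

top_picks = recommend_by_popularity(catalog, k=3)
print(top_picks)

# Titles that are never recommended gain no new views, so they stay niche.
never_shown = [title for title in catalog if title not in top_picks]
print(never_shown)
```

Because the ranking looks only at past views, the two niche titles never reach the recommendation list, which is the feedback loop described above.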

---

# 🧐 How Does Computing Bias Happen?
Computing bias can occur for various reasons, including:  

### 📂 **1. Unrepresentative or Incomplete Data:**  
- Algorithms trained on data that **doesn't represent real-world diversity** will produce biased results.  

### 📉 **2. Flawed or Biased Data:**  
- Historical or existing prejudices reflected in the training data can lead to biased outputs.  

### 📝 **3. Data Collection & Labeling:**  
- Human annotators may introduce bias during data labeling because of their own cultural or personal perspectives.  
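
One way to spot unrepresentative data is to compare each group's share of the dataset with its share of the population the system is meant to serve. The sketch below assumes made-up group labels and population shares; `representation_gaps` is a hypothetical helper, not part of any library.

```python
from collections import Counter

# Hypothetical training set: the group each sample belongs to.
training_groups = ["A"] * 80 + ["B"] * 15 + ["C"] * 5

# Hypothetical share of each group in the real population.
population_share = {"A": 0.50, "B": 0.30, "C": 0.20}

def representation_gaps(groups, expected):
    """Return (dataset share - expected share) for each group."""
    counts = Counter(groups)
    total = len(groups)
    return {g: counts[g] / total - share for g, share in expected.items()}

gaps = representation_gaps(training_groups, population_share)
for group, gap in gaps.items():
    if gap < 0:
        status = "under-represented"
    elif gap > 0:
        status = "over-represented"
    else:
        status = "matches the population"
    print(f"Group {group}: {gap:+.0%} ({status})")
```

Groups with a negative gap appear less often in the data than in the population, so a model trained on this data sees fewer examples of them.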

---


## 📊 **Explicit Data vs. Implicit Data**

### 📝 **Explicit Data**
**Definition:** Data that the user or programmer **directly provides**.

- **Example:** On Netflix, users input personal information such as **name**, **age**, and **preferences**. They can also **rate shows** or **movies**.

### 🔍 **Implicit Data**
**Definition:** Data that is **inferred** from the user's actions or behavior, not directly provided.

- **Example:** Netflix tracks your **viewing history**, **watch time**, and **interactions** with content. This data is then used to **recommend shows and movies** that Netflix thinks you might like.

---

### ⚖️ **Implications**
- **Implicit Data** can reinforce **bias** by suggesting content based only on **past behavior**, potentially **limiting diversity** and preventing users from discovering new genres.
- **Explicit Data** is generally more **accurate** but can still be biased if **user input is limited** or influenced by the **design of the platform**.
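
The difference between the two kinds of data can be sketched in a few lines. Everything here is assumed for illustration: the profile fields, the watch log, and the `inferred_top_genre` helper.

```python
from collections import Counter

# Explicit data: the user typed this in directly (hypothetical values).
explicit_profile = {"name": "Sam", "age": 16, "favorite_genres": ["documentary"]}

# Implicit data: logged from behavior (hypothetical genre and minutes watched).
watch_log = [
    ("romantic comedy", 95),
    ("romantic comedy", 110),
    ("documentary", 20),
    ("romantic comedy", 88),
]

def inferred_top_genre(log):
    """Infer a 'favorite' genre from total minutes watched."""
    minutes = Counter()
    for genre, watched in log:
        minutes[genre] += watched
    return minutes.most_common(1)[0][0]

inferred = inferred_top_genre(watch_log)
print("User says:", explicit_profile["favorite_genres"])
print("System infers:", inferred)
```

The user states an interest in documentaries, but the watch log points to romantic comedies. A system that relies only on the implicit signal would keep recommending more of what was already watched.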

---
## 🤔 Popcorn Hack #1

**What is an example of Explicit Data?**

A) Netflix recommends shows based on your viewing history.  
B) You provide your name, age, and preferences when creating a Netflix account.  
C) Netflix tracks the time you spend watching certain genres.

<div class="flip-container">
    <div class="flipper">
        <div class="front">
            <button class="button" onclick="flipCard()">Show Answer</button>
        </div>
        <div class="back">
            The answer is: B) You provide your name, age, and preferences when creating a Netflix account. This is an example of <strong>explicit data</strong>, as it is directly provided by the user.
        </div>
    </div>
</div>

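<script>
    // Flip the answer card that contains the clicked button.
    // Relies on the global `event` object set by the inline onclick handler.
    function flipCard() {
        const flipContainer = event.target.closest('.flip-container');
        flipContainer.classList.toggle('flipped');
    }
</script>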


## 📝 Types of Bias

> **🤖 Algorithmic Bias**  
- <ins>Algorithmic bias</ins> is bias generated by a repeatable but faulty **computer system** that produces inaccurate results.
    - Example: A hiring algorithm at Amazon was trained on past hiring data in which male candidates were hired more often than female candidates. Because those historical hiring practices were biased toward men, the system learned to favor male candidates.
<div style="text-align: center; margin: 20px 0;">
    <img src="{{site.baseurl}}/images/algorithmic.jpg" style="max-width: 80%; border-radius: 10px; box-shadow: 5px 5px 15px rgba(0,0,0,0.2);">
</div>

> **📈 Data Bias**  
- <ins>Data bias</ins> occurs when the data itself includes bias caused by **incomplete or erroneous information**.
    - Example: A healthcare AI model is trained mostly on patients from one demographic. Because it has seen little data from other populations, it may predict lower disease risk for them than is accurate.
<div style="text-align: center; margin: 20px 0;">
    <img src="{{site.baseurl}}/images/data.png" style="max-width: 80%; border-radius: 10px; box-shadow: 5px 5px 15px rgba(0,0,0,0.2);">
</div>

> **🧠 Cognitive Bias**  
- <ins>Cognitive bias</ins> occurs when a person unintentionally introduces **their own bias** into the data.
    - Example: A researcher conducting a study on social media usage unconsciously selects data that supports their belief that too much screen time leads to lower grades. This is a form of cognitive bias called confirmation bias because the researcher is searching for information to support their beliefs.
<div style="text-align: center; margin: 20px 0;">
    <img src="{{site.baseurl}}/images/cognitive.jpg" style="max-width: 80%; border-radius: 10px; box-shadow: 5px 5px 15px rgba(0,0,0,0.2);">
</div>

---

## 🤔 Popcorn Hack #2

**What is an example of Data Bias?**  

A) A hiring algorithm favors male candidates because the training data contains a disproportionate number of male resumes.  
B) A system is trained on a dataset where certain groups, such as people with darker skin tones, are underrepresented.  
C) A researcher intentionally selects data that supports their own beliefs about the impact of screen time on grades.  

<div class="flip-container">
    <div class="flipper">
        <div class="front">
            <button class="button" onclick="flipCard()">Reveal Answer</button>
        </div>
        <div class="back">
            The answer is: B) A system is trained on a dataset where certain groups, such as people with darker skin tones, are underrepresented. This leads to the system performing poorly for these groups, which is an example of Data Bias.
        </div>
    </div>
</div>



<style>
    /* Style for the button */
    .button {
        padding: 6px 12px;
        font-size: 12px;
        color: white;
        background: linear-gradient(135deg, #6a89cc, #4a69bd); /* Blue gradient */
        border: none;
        border-radius: 6px;
        cursor: pointer;
        transition: background 0.3s, transform 0.2s, box-shadow 0.2s;
        box-shadow: 0 4px 6px rgba(0, 0, 0, 0.1);
    }

    .button:hover {
        background: linear-gradient(135deg, #4a69bd, #1e3799); /* Darker blue gradient */
        transform: scale(1.03);
        box-shadow: 0 6px 10px rgba(0, 0, 0, 0.15);
    }

    /* Container for the flip effect */
    .flip-container {
        perspective: 1000px;
        margin: 20px 0;
        display: flex;
        justify-content: center;
    }

    .flipper {
        width: 100%;
        height: 70px;
        transform-style: preserve-3d;
        transition: transform 0.6s;
        display: flex;
        justify-content: center;
        align-items: center;
    }

    .front, .back {
        position: absolute;
        backface-visibility: hidden;
        width: 100%;
        height: 100%;
        display: flex;
        justify-content: center;
        align-items: center;
        font-size: 16px;
        font-family: 'Arial', sans-serif;
        padding: 15px;
        border-radius: 8px;
    }

    .front {
        background-color: #6a89cc;
        color: white;
    }

    .back {
        background-color: #4a69bd;
        color: white;
        transform: rotateY(180deg);
    }

    .flipped .flipper {
        transform: rotateY(180deg);
    }

    .question-container {
        margin: 20px;
        font-size: 18px;
        color: #333;
        font-family: 'Arial', sans-serif;
        line-height: 1.6;
        font-weight: 500;
    }

    /* Heading styling */
    .heading {
        font-size: 24px;
        color: #333;
        font-family: 'Arial', sans-serif;
        margin-bottom: 20px;
    }
</style>

## Intentional Bias vs. Unintentional Bias

> **Intentional Bias:** The deliberate introduction of prejudice or unfairness into algorithms or systems, often by individuals or organizations, to achieve a specific outcome or advantage.  
> 
> **Example:** A hiring algorithm designed to favor candidates from certain backgrounds by prioritizing certain keywords associated with privileged groups.

Example: Imagine a company using a hiring algorithm to screen job applicants.

<img src="{{site.baseurl}}/images/interviewer.webp">
- **Goal of the algorithm:** Select the most qualified candidates based on their resumes and experience.
- However, the people who create this algorithm might intentionally include factors that are biased toward certain groups.

For example, if the algorithm is designed to prioritize resumes with certain words or experiences that are more common among a specific gender or ethnic group, it might unfairly favor candidates from that group over others.

<span>
<img src="{{site.baseurl}}/images/resume.png" width="500" height="350">
<img src="{{site.baseurl}}/images/racism.jpg" width="300" height="350">
</span>
<br>
Also, if the algorithm gives extra weight to leadership positions at high-profile companies that are predominantly male or white, it may disadvantage women or people of color who have the same qualifications but worked in different environments. The effect may look accidental, but it is intentional when the developers chose that weighting deliberately.
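
A keyword-weighted screener like the one described above might look like the following sketch. The keywords, the weights, and the `score_resume` function are hypothetical and chosen only to show how a deliberate weighting skews results.

```python
# Hypothetical keyword weights chosen by the developers.
# "lacrosse" and "golf club" say nothing about job skills, but they
# correlate with certain backgrounds, so weighting them skews the score.
KEYWORD_WEIGHTS = {"python": 2, "team lead": 2, "lacrosse": 3, "golf club": 3}

def score_resume(text):
    """Sum the weights of every keyword that appears in the resume text."""
    lowered = text.lower()
    return sum(weight for keyword, weight in KEYWORD_WEIGHTS.items()
               if keyword in lowered)

# Two hypothetical candidates with the same job-relevant skills.
resume_a = "Python developer, team lead, captain of the lacrosse team"
resume_b = "Python developer, team lead, volunteer tutor"

print(score_resume(resume_a))
print(score_resume(resume_b))
```

Both candidates match the same job-relevant keywords, yet the first scores 7 and the second scores 4. The gap comes entirely from a keyword unrelated to the job.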

> **Unintentional Bias:** Occurs when algorithms, often trained on flawed or incomplete data, produce results that unfairly discriminate against certain groups.

Example: Facial recognition software.
- **Goal of the program:** Designed to identify people based on their facial features.
- However, if the software is trained using a large dataset of photos primarily of one race, it can have trouble identifying individuals who look different.

## Let's see a real-life example of this!

<iframe width="560" height="315" src="https://www.youtube.com/embed/t4DT3tQqgRM?si=_9lYOfF6leUYKgug" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" referrerpolicy="strict-origin-when-cross-origin" allowfullscreen></iframe>

For example, if the software is trained using pictures of people but the majority of those photos are of lighter-skinned individuals, the system may have trouble accurately recognizing people with darker skin tones.

This bias is unintentional: the developers did not purposefully exclude people with darker skin; the dataset they used simply happened to be unbalanced.

As a result, the system works better for lighter-skinned people and struggles with darker-skinned people, even though the goal is to treat everyone equally.
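
A gap like this is easy to miss if only overall accuracy is reported. The sketch below uses invented test results (not data from any real system) and a hypothetical `accuracy_by_group` helper to show how breaking accuracy down by group exposes the problem.

```python
# Hypothetical evaluation results: (group, was the prediction correct?)
results = (
    [("lighter", True)] * 95 + [("lighter", False)] * 5
    + [("darker", True)] * 14 + [("darker", False)] * 6
)

def accuracy_by_group(records):
    """Return {group: fraction of correct predictions}."""
    totals, correct = {}, {}
    for group, ok in records:
        totals[group] = totals.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + int(ok)
    return {g: correct[g] / totals[g] for g in totals}

overall = sum(ok for _, ok in results) / len(results)
per_group = accuracy_by_group(results)

print(f"Overall accuracy: {overall:.0%}")
for group, acc in per_group.items():
    print(f"{group}: {acc:.0%}")
```

Because the test set is mostly lighter-skinned faces, the overall number stays high even though accuracy for the smaller group is much lower.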

---

## 🤔 Popcorn Hack #3

**What is an example of Unintentional Bias?**  
A) A social media algorithm prioritizes content from a specific group of influencers because of their background.  
B) A facial recognition system works better for lighter-skinned individuals because of an unbalanced dataset.  
C) A hiring algorithm is designed to give preference to candidates from a specific ethnicity.  

<div class="flip-container">
    <div class="flipper">
        <div class="front">
            <button class="button" onclick="flipCard()">Show Answer</button>
        </div>
        <div class="back">
            B) A facial recognition system works better for lighter-skinned individuals because of an unbalanced dataset. This is an example of unintentional bias, as the system was not purposefully designed to favor one group over another, but the dataset led to biased results.
        </div>
    </div>
</div>



## 🌟 **Mitigation Strategies**  

Mitigation strategies aim to **prevent computing bias** by gathering and using more **diverse and representative data** throughout the algorithm's lifecycle.  

---

### 🔍 **1. Pre-processing Phase (Model Planning & Preparation)**  
- **Purpose:** Identify and fix issues in data collection to ensure accurate model training.  
- **Actions:**  
  - Managing missing data.  
  - Ensuring data diversity.  
  - Selecting relevant variables.  
- ✅ **Outcome:** Prevents biased data from being used to train the model.  
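
As a small illustration of managing missing data, the sketch below fills in missing ages with the median of the known ages. The records and the `fill_missing_ages` helper are hypothetical, and median imputation is only one simple option with its own trade-offs.

```python
from statistics import median

# Hypothetical records; None marks a missing value.
records = [
    {"age": 34, "group": "A"},
    {"age": None, "group": "B"},
    {"age": 51, "group": "B"},
    {"age": 29, "group": "A"},
    {"age": None, "group": "A"},
]

def fill_missing_ages(rows):
    """Return new rows with missing ages replaced by the median known age."""
    known = [r["age"] for r in rows if r["age"] is not None]
    fallback = median(known)
    return [{**r, "age": fallback if r["age"] is None else r["age"]}
            for r in rows]

cleaned = fill_missing_ages(records)
print([r["age"] for r in cleaned])
```

Simply dropping incomplete rows would remove one of only two group B records here, shrinking a group that is already small. Filling the gap keeps that record in the training data.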

---

### 🧩 **2. In-processing Phase (Algorithm Development & Validation)**  
- **Purpose:** Address biases during training and validation of AI algorithms.  
- **Actions:**  
  - Inserting synthetic samples representing minority cases.  
  - Using cross-validation strategies.  
- ✅ **Outcome:** Promotes equal representation across demographics.  
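
Inserting synthetic minority samples can be sketched as interpolation between existing minority points. This is loosely inspired by oversampling methods such as SMOTE but is a deliberate simplification, not that algorithm; the data and the `synthesize` function are assumptions for illustration.

```python
import random

def synthesize(minority, n_new, seed=0):
    """Create n_new points, each lying on the line segment between
    two randomly chosen minority samples."""
    rng = random.Random(seed)
    new_points = []
    for _ in range(n_new):
        a, b = rng.sample(minority, 2)
        t = rng.random()  # position between a and b, in [0, 1)
        new_points.append(tuple(x + t * (y - x) for x, y in zip(a, b)))
    return new_points

# Hypothetical 2-feature dataset: 8 majority samples, 3 minority samples.
majority = [(float(i), float(i % 3)) for i in range(8)]
minority = [(10.0, 1.0), (11.0, 2.0), (12.0, 1.5)]

extra = synthesize(minority, len(majority) - len(minority))
balanced_minority = minority + extra
print(len(majority), len(balanced_minority))
```

After this step both classes have the same number of samples, and every synthetic point stays within the range of the real minority data.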

---

### 🚀 **3. Post-processing Phase (Deployment & Usage)**  
- **Purpose:** Implement the model and ensure fair application in real-world settings.  
- **Actions:**  
  - Monitoring model performance in deployment.  
  - Adjusting outputs to reduce bias.  
- ✅ **Outcome:** Ensures the model functions fairly for all user groups.  
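
Monitoring in deployment can be as simple as comparing how often each group receives a positive outcome. The log below is invented, `selection_rates` is a hypothetical helper, and the 0.8 alert threshold is an assumption that echoes the "four-fifths" rule of thumb.

```python
# Hypothetical deployment log: (group, did the model give a positive outcome?)
decisions = (
    [("A", True)] * 60 + [("A", False)] * 40
    + [("B", True)] * 30 + [("B", False)] * 70
)

def selection_rates(log):
    """Return {group: share of positive outcomes}."""
    totals, positives = {}, {}
    for group, positive in log:
        totals[group] = totals.get(group, 0) + 1
        positives[group] = positives.get(group, 0) + int(positive)
    return {g: positives[g] / totals[g] for g in totals}

rates = selection_rates(decisions)
ratio = min(rates.values()) / max(rates.values())
print(rates, f"ratio={ratio:.2f}")

# Assumed alert threshold for this sketch.
ALERT_THRESHOLD = 0.8
needs_review = ratio < ALERT_THRESHOLD
if needs_review:
    print("Warning: outcomes differ a lot between groups; review the model.")
```

A low ratio does not prove the model is unfair, but it flags a difference worth investigating before adjusting outputs.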

---

> **Summary**
- **Pre-processing Phase (Model Planning & Preparation)**
    - Look for and fix problems in the data collection process so that the data is accurate for model training (managing missing data, ensuring data diversity, selecting relevant variables, etc.).
    - Prevents biased data from being used to train the model.
- **In-processing Phase (Algorithm Development & Validation)**
    - Covers all activities surrounding the training and validation of an AI algorithm (inserting synthetic samples that are representative of minority-class cases, using cross-validation strategies, etc.).
    - Identifies biases or other vulnerabilities in the model and promotes equal representation for all demographics.
- **Post-processing Phase (Deployment & Usage)**
    - Covers a model's use after it has been deployed in a live environment.
    - Collects real-world data and interprets it to ensure compliance and refine future applications.

# 📚 **Computing Bias - Homework Questions**

---

## ✅ **Multiple-Choice Questions**  

**1. What is computing bias?**  
A. A technical error in hardware causing malfunctions.  
B. A program or algorithm producing results that favor or disadvantage certain groups.  
C. A mistake in the code causing a program to crash.  
D. The act of manually inputting incorrect data.  

---

**2. What is the primary cause of bias in computing systems?**  
A. Poor internet connection.  
B. Efficient programming techniques.  
C. Increased processing power.  
D. Unrepresentative or incomplete data used to train algorithms.

---

**3. Which of the following is an example of implicit data collection?**  
A. Manually selecting your preferred language on a streaming platform.  
B. Filling out a survey about your favorite movies.  
C. Netflix tracking your watch history and suggesting similar shows.  
D. Clicking “like” on a specific genre on Netflix.  

---

**4. What is a common issue when algorithms are trained on biased datasets?**  
A. They can reinforce existing societal biases.  
B. They run faster.  
C. They become more accurate.  
D. They use less memory.  

---

**5. Which of the following could help reduce computing bias in recommendation systems?**  
A. Ignoring user preferences.  
B. Deleting all user data.  
C. Using diverse and representative training data.  
D. Only using data from popular sources.  

---

## ✍️ **Short-Answer Question**  

**Explain the difference between implicit and explicit data. Provide an example of each.**

## 💯 **Scoring Rubric:**

| Criteria                                  | Description                                                                       | Points |
|-------------------------------------------|-----------------------------------------------------------------------------------|--------|
| **Multiple-Choice Questions (0.5 points total)** | Each correct answer is worth 0.1 points.                             | 0.5    |
| Question 1                                | Award 0.1 point if the correct option is selected.                 | 0.1    |
| Question 2                                | Award 0.1 point if the correct option is selected.                 | 0.1    |
| Question 3                                | Award 0.1 point if the correct option is selected.                 | 0.1    |
| Question 4                                | Award 0.1 point if the correct option is selected.                 | 0.1    |
| Question 5                                | Award 0.1 point if the correct option is selected.                 | 0.1    |
| **Short-Answer Question (0.5 points total)** | Explanation of implicit vs. explicit data, with accurate examples. | 0.5    |
| Clarity & Accuracy                        | Clear, concise, and correct explanation of implicit vs. explicit data. | 0.25   |
| Examples Provided                         | Provides appropriate examples for both implicit and explicit data.  | 0.25   |
| **Total**                                 |                                                                     | **1.0**  |