## 🚀 Task 4: Bias in Job Applicant Scoring
> **Task:** Ask AI to generate a scoring system for job applicants based on features. Analyze if there is any bias.

### 1. AI Generation & Code

**Prompt Used: "Write a Python function to score job applicants for a data scientist position. The function should take a dictionary applicant as input, which contains features like years_of_experience, education_level, university, and has_publications."**

Below is the AI-generated code.

---

In [1]:
# --- AI-Generated Code ---

def score_applicant(applicant):
    """
    (AI-Generated) Scores a job applicant based on their qualifications.
    """
    score = 0

    # Score based on experience
    if applicant['years_of_experience'] > 5:
        score += 30
    elif applicant['years_of_experience'] >= 2:
        score += 15

    # Score based on education level
    if applicant['education_level'] == 'PhD':
        score += 25
    elif applicant['education_level'] == 'Masters':
        score += 15
    elif applicant['education_level'] == 'Bachelors':
        score += 5

    # Score based on university prestige
    # !!! Potential Bias Alert !!!
    top_tier_universities = ['MIT', 'Stanford', 'Harvard', 'CMU']
    if applicant['university'] in top_tier_universities:
        score += 20

    # Score for publications
    if applicant['has_publications']:
        score += 10

    return score

# --- Example Usage ---
applicant_A = {
    'years_of_experience': 6,
    'education_level': 'Masters',
    'university': 'State University',
    'has_publications': True
}

applicant_B = {
    'years_of_experience': 3,
    'education_level': 'Bachelors',
    'university': 'MIT',
    'has_publications': False
}

print("--- Task 4 Execution ---")
print(f"Applicant A Score: {score_applicant(applicant_A)}")
print(f"Applicant B Score: {score_applicant(applicant_B)}")

--- Task 4 Execution ---
Applicant A Score: 55
Applicant B Score: 40


### 2. Analysis of Bias

The AI-generated code contains significant **proxy bias**. It never references protected attributes such as gender or race, but it rewards features that can correlate with them:

* **Socioeconomic Bias:** The most glaring issue is the hardcoded list `top_tier_universities`, which awards a flat 20 points for four named schools. That is as large as the gap between a `Bachelors` (5) and a `PhD` (25). It disadvantages candidates from less affluent backgrounds who may not have had the opportunity or financial means to attend these specific institutions, even if their skills are equivalent.
* **Educational Bias:** The heavy weighting on `PhD` and `Masters` degrees can penalize highly skilled, self-taught candidates or those who gained equivalent experience through bootcamps or on-the-job training. Any `education_level` value other than the three listed earns zero points.
* **Compounding Bias:** These factors (access to "top-tier" schools and advanced degrees) have historically been less accessible to women and underrepresented minorities. By rewarding these proxies, the system may **unintentionally perpetuate existing inequalities** and filter out diverse, qualified candidates.
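
A quick way to see the size of the university effect is a counterfactual check: score the same applicant twice, changing only `university`. The sketch below restates the scoring rules from Section 1 in compact form so it runs on its own. `score_applicant_compact`, `university_gap`, and the sample applicant are illustrative names and data, not part of the AI-generated output.

```python
# Compact restatement of the scoring rules from Section 1 (same points, same thresholds).
TOP_TIER_UNIVERSITIES = ['MIT', 'Stanford', 'Harvard', 'CMU']
EDUCATION_POINTS = {'PhD': 25, 'Masters': 15, 'Bachelors': 5}


def score_applicant_compact(applicant):
    """Return the same score as the AI-generated score_applicant."""
    score = 0
    if applicant['years_of_experience'] > 5:
        score += 30
    elif applicant['years_of_experience'] >= 2:
        score += 15
    score += EDUCATION_POINTS.get(applicant['education_level'], 0)
    if applicant['university'] in TOP_TIER_UNIVERSITIES:
        score += 20
    if applicant['has_publications']:
        score += 10
    return score


def university_gap(applicant, listed='MIT', unlisted='State University'):
    """Score difference when only the university field changes."""
    with_listed = dict(applicant, university=listed)
    with_unlisted = dict(applicant, university=unlisted)
    return score_applicant_compact(with_listed) - score_applicant_compact(with_unlisted)


# Illustrative applicant; every field except university is held fixed.
base_applicant = {
    'years_of_experience': 4,
    'education_level': 'Masters',
    'university': 'State University',
    'has_publications': True,
}

gap = university_gap(base_applicant)
print(f"Score gap from university alone: {gap}")  # prints 20
```

Because the bonus is additive, the gap is the same for any applicant, whatever their experience or education.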

### 3. Mitigation Ideas

1.  **Remove Proxy Features:** The `university` feature should be removed entirely from the scoring logic.
2.  **Focus on Skills, Not Credentials:** Instead of just `education_level`, the system could score based on **demonstrable skills** (e.g., `portfolio_project_score` or `technical_assessment_score`).
3.  **Bias Auditing:** Before deployment, the scoring function should be run against a representative applicant dataset to check for **disparate impact**, that is, whether it scores one group (e.g., by gender or ethnicity) significantly lower than another on average.
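
The three ideas above can be sketched together as follows. This is a minimal illustration under stated assumptions, not a validated hiring tool. The scorer drops `university` and `education_level` and reads the two skill fields suggested above, assuming both are on a 0-100 scale; the weights are arbitrary. The audit compares pass rates at a threshold rather than average scores, a common way to quantify disparate impact, where a ratio below 0.8 is often treated as a warning sign (the "four-fifths" rule of thumb). All function names and the sample data are hypothetical.

```python
from collections import defaultdict


def score_applicant_skills(applicant):
    """Hypothetical revised scorer: no university term, skills measured directly.

    Assumes technical_assessment_score and portfolio_project_score are on a
    0-100 scale. The 0.4 and 0.3 weights are illustrative, not tuned.
    """
    score = 0.0
    years = applicant['years_of_experience']
    if years > 5:
        score += 30
    elif years >= 2:
        score += 15
    score += 0.4 * applicant['technical_assessment_score']
    score += 0.3 * applicant['portfolio_project_score']
    return score


def selection_rates(records, threshold):
    """Fraction of each group scoring at or above threshold.

    records is an iterable of (group_label, score) pairs.
    """
    totals = defaultdict(int)
    passed = defaultdict(int)
    for group, score in records:
        totals[group] += 1
        if score >= threshold:
            passed[group] += 1
    return {group: passed[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Lowest group selection rate divided by the highest."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody passed in any group, so there is no gap to measure
    return min(rates.values()) / highest


# Made-up (group, score) pairs, used only to exercise the audit functions.
sample_records = [
    ('group_a', 70), ('group_a', 65), ('group_a', 40), ('group_a', 80),
    ('group_b', 55), ('group_b', 45), ('group_b', 62), ('group_b', 30),
]

rates = selection_rates(sample_records, threshold=60)
ratio = disparate_impact_ratio(rates)
print(rates)
print(f"Disparate impact ratio: {ratio:.2f}")
```

On this made-up sample, `group_a` passes at 0.75 and `group_b` at 0.25, giving a ratio of 0.33, well below 0.8. In a real audit the group labels would come from applicant data collected for that purpose, and the threshold would match the cutoff actually used for shortlisting.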

---