# Day 27: Counterfactual Generator

In this lab, we will use the `CounterfactualGenerator` to test for **Individual Fairness**.
Individual Fairness asserts that similar individuals should be treated similarly. By flipping a sensitive attribute (such as gender) while keeping the rest of the context identical, we can probe whether the model's output depends on that attribute.

In [None]:
import sys
import os

# Add root directory to sys.path
sys.path.append(os.path.abspath('../../'))

from src.fairness.counterfactual import CounterfactualGenerator

## 1. Generate Counterfactuals

Let's take a set of prompts involving people and generate their gender-swapped counterparts.
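
To make the idea concrete before calling the library, here is a minimal sketch of what a gender swap could look like. It is not the implementation of `CounterfactualGenerator`, whose internals are not shown in this lab; the `SWAPS` table and the `swap_gender_terms` helper are hypothetical names introduced only for illustration, and the word list is deliberately tiny.

```python
import re

# Hypothetical swap table for illustration; not the library's actual lexicon.
# "her" is ambiguous ("him" vs. "his"); this sketch always maps it to "his".
SWAPS = {
    "he": "she", "she": "he",
    "him": "her", "his": "her", "her": "his",
    "king": "queen", "queen": "king",
    "mother": "father", "father": "mother",
}

# \b anchors match whole words only, so "kingdom" and "The" are left alone.
_PATTERN = re.compile(r"\b(" + "|".join(SWAPS) + r")\b", re.IGNORECASE)


def swap_gender_terms(text):
    """Replace each gendered word with its counterpart, keeping a leading capital."""
    def replace(match):
        word = match.group(0)
        swapped = SWAPS[word.lower()]
        return swapped.capitalize() if word[0].isupper() else swapped

    return _PATTERN.sub(replace, text)


print(swap_gender_terms("The King ruled the kingdom wisely."))
print(swap_gender_terms("His mother was very proud of him."))
```

Matching on word boundaries rather than raw substrings is the key design choice: a plain `str.replace("he", "she")` would also rewrite the "he" inside "the" and "she".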

In [None]:
generator = CounterfactualGenerator()

prompts = [
    "The doctor said he is busy.",
    "She is a brilliant engineer.",
    "The King ruled the kingdom wisely.",
    "His mother was very proud of him."
]

pairs = []
for p in prompts:
    cf_list = generator.generate_gender_counterfactuals(p)
    # Keep only the first counterfactual; None marks a prompt with nothing to swap
    pairs.append((p, cf_list[0] if cf_list else None))

for original, cf in pairs:
    print(f"Original: {original}")
    print(f"Counterfactual: {cf}")
    print("---")

## 2. Simulate a Biased Model

We will create a mock scoring function that simulates a sentiment model with gender bias.
For sentences that mention a profession (here, 'doctor' or 'engineer'), it raises the score when male terms appear and lowers it when female terms appear.

In [None]:
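import re

def biased_sentiment_model(text):
    # Mock bias: assigns higher sentiment to male terms in professional contexts
    score = 0.5
    # Compare whole words: a substring test would find "he" inside "she" and "the",
    # and "man" inside "woman", so every sentence would take the male branch
    words = set(re.findall(r"[a-z']+", text.lower()))

    if words & {"engineer", "doctor"}:
        if words & {"he", "man"}:
            score += 0.3
        elif words & {"she", "woman"}:
            score -= 0.1

    return score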

# Test the model
print(f"Score (He is a doctor): {biased_sentiment_model('He is a doctor')}")
print(f"Score (She is a doctor): {biased_sentiment_model('She is a doctor')}")

## 3. Evaluate Consistency

Now we check whether the model gives each original prompt and its counterfactual the same score, within a small tolerance.
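
The check in the next cell can be stated compactly. Writing $f$ for the scoring function, $x$ for an original prompt, and $x'$ for its counterfactual (symbols introduced here for illustration, not taken from the library), a pair counts as consistent when the absolute score gap stays below the tolerance $\epsilon$ used in the code:

$$
\lvert f(x) - f(x') \rvert < \epsilon, \qquad \epsilon = 0.01
$$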

In [None]:
print("Evaluating Consistency...\n")

for original, cf in pairs:
    if not cf:
        continue
        
    score_orig = biased_sentiment_model(original)
    score_cf = biased_sentiment_model(cf)
    
    diff = abs(score_orig - score_cf)
    is_consistent = diff < 0.01 # Tolerance
    
    print(f"Original: '{original}' -> Score: {score_orig}")
    print(f"CounterF: '{cf}' -> Score: {score_cf}")
    print(f"Consistent? {is_consistent} (Diff: {diff:.2f})")
    print("---")
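
The per-pair printout can be condensed into summary numbers. The sketch below is one possible way to do that, not part of the `src.fairness` package: `summarize_consistency`, `toy_model`, and `toy_pairs` are hypothetical names, and the toy model stands in for any function that maps text to a score.

```python
def summarize_consistency(pairs, model, tolerance=0.01):
    """Return the consistency rate and mean absolute score gap over valid pairs."""
    gaps = [
        abs(model(original) - model(cf))
        for original, cf in pairs
        if cf is not None  # skip prompts that produced no counterfactual
    ]
    if not gaps:
        return {"n_pairs": 0, "consistency_rate": None, "mean_gap": None}

    n_consistent = sum(gap < tolerance for gap in gaps)
    return {
        "n_pairs": len(gaps),
        "consistency_rate": n_consistent / len(gaps),
        "mean_gap": sum(gaps) / len(gaps),
    }


# Toy stand-in model and pairs, for illustration only.
def toy_model(text):
    if text.startswith("He "):
        return 0.8
    if text.startswith("She "):
        return 0.4
    return 0.5


toy_pairs = [
    ("He is a doctor.", "She is a doctor."),
    ("The sky is blue.", "The sky is blue."),
    ("No swap available.", None),
]

print(summarize_consistency(toy_pairs, toy_model))
```

In the notebook, the same function could be called as `summarize_consistency(pairs, biased_sentiment_model)` to get a single consistency rate for the whole prompt set.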