# Lab 6, Module 3: Tabular Saliency via Feature Perturbation

**Estimated time:** 10 minutes

---

## **Opening: From Pixels to Features**

So far, you've explored saliency for:
- **Text:** Which words matter? (Module 1)
- **Images:** Which pixels matter? (Module 2)

Now let's explore saliency for **structured/tabular data**, the kind you'd find in spreadsheets and databases.

### **Today's Question:**

> When predicting if a student will pass an exam, which features matter most: **hours studied**, **attendance rate**, **previous GPA**, or... **zip code**?

That last one should raise red flags! Saliency can reveal when models rely on **problematic proxies** for outcomes.

---

## 📘 **Tabular Data and Feature Importance**

Unlike text (discrete words) and images (continuous pixels), tabular data has:
- **Named features** with clear meanings (age, income, GPA)
- **Different scales** (years: 18-80, income: $0-$200k)
- **Mixed types** (numerical, categorical)

### **The Method: Feature Perturbation**

The idea is similar to word masking from Module 1:

1. **Get baseline prediction**
2. **Perturb one feature at a time** (set it to mean value)
3. **Measure prediction change**
4. **Large change = important feature**
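
Step 2 uses the mean, which only makes sense for numerical columns. For the mixed types mentioned above, one common choice (an assumption here, not something this lab prescribes) is to use the mean for numerical features and the most frequent value for categorical ones. The table and the helper name `reference_values` below are hypothetical:

```python
import pandas as pd

# Hypothetical mixed-type table (column names and values are made up)
df = pd.DataFrame({
    "age": [18, 25, 40, 33],
    "income": [0, 52_000, 120_000, 48_000],
    "major": ["math", "biology", "math", "history"],
})

def reference_values(frame):
    """Pick a 'typical' value per column: mean for numeric, mode otherwise."""
    refs = {}
    for col in frame.columns:
        if pd.api.types.is_numeric_dtype(frame[col]):
            refs[col] = frame[col].mean()
        else:
            # mode() can return several values; take the first one
            refs[col] = frame[col].mode().iloc[0]
    return refs

refs = reference_values(df)
print(refs)
```

The dataset in this module is purely numerical, so the rest of the lab only needs the mean.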

**Example:**
```
Student: hours_studied=8, attendance=0.9, GPA=3.5 → 90% pass probability

Perturb hours_studied → 5 (mean):  → 60% pass (change: -30%)
Perturb attendance → 0.7 (mean):   → 85% pass (change: -5%)
Perturb GPA → 3.0 (mean):          → 80% pass (change: -10%)
```

**Importance ranking:** hours_studied (30%) > GPA (10%) > attendance (5%)
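
The arithmetic behind this ranking can be reproduced in a few lines. This is only a sketch: the probabilities are the illustrative numbers from the example above, not outputs of a trained model, and the variable names are ours.

```python
# Illustrative probabilities copied from the worked example above
# (not produced by a trained model).
baseline = 0.90
perturbed = {
    "hours_studied": 0.60,  # after setting hours_studied to its mean
    "attendance": 0.85,     # after setting attendance to its mean
    "GPA": 0.80,            # after setting GPA to its mean
}

# Importance = absolute change in pass probability, in percentage points
importance = {
    name: round(abs(baseline - prob) * 100)
    for name, prob in perturbed.items()
}

# Rank features from most to least important
ranking = sorted(importance, key=importance.get, reverse=True)

for name in ranking:
    print(f"{name}: {importance[name]} percentage points")
```

Taking the absolute value means the score records how much the prediction moved, not in which direction.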

---

## 🧱 **Building a Student Performance Predictor**

We'll create a toy dataset of student exam outcomes with 4 features:
- **hours_studied:** How many hours they studied (0-10)
- **attendance_rate:** Fraction of classes attended (0-1)
- **homework_completion:** Fraction of homework completed (0-1)
- **previous_gpa:** GPA from previous semester (2.0-4.0)

Target: **passed_exam** (yes/no)

In [None]:
# Install and import libraries
!pip install scikit-learn pandas matplotlib numpy -q

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split

print("✓ Libraries loaded successfully!")

### **Step 1: Generate Synthetic Student Data**

We'll create 50 students with realistic relationships between features and outcomes:

In [None]:
# Set random seed for reproducibility
np.random.seed(42)

# Number of students
n_samples = 50

# Generate features
data = pd.DataFrame({
    'hours_studied': np.random.uniform(0, 10, n_samples),
    'attendance_rate': np.random.uniform(0.3, 1.0, n_samples),
    'homework_completion': np.random.uniform(0.4, 1.0, n_samples),
    'previous_gpa': np.random.uniform(2.0, 4.0, n_samples)
})

# Generate target (pass/fail) based on weighted combination
# More hours + higher GPA → more likely to pass
success_score = (
    data['hours_studied'] * 0.4 +
    data['previous_gpa'] * 0.3 +
    data['attendance_rate'] * 0.2 +
    data['homework_completion'] * 0.1 +
    np.random.normal(0, 0.5, n_samples)  # Add noise
)

# Convert to binary outcome
data['passed'] = (success_score > 2.5).astype(int)

print(f"Dataset created: {n_samples} students")
print(f"  Passed: {data['passed'].sum()} ({data['passed'].mean()*100:.1f}%)")
print(f"  Failed: {(1-data['passed']).sum()} ({(1-data['passed'].mean())*100:.1f}%)")
print("\nFirst 5 students:")
print(data.head())

### **Step 2: Train a Simple Decision Tree**

We'll use a decision tree because it's simple and interpretable:

In [None]:
# Prepare features and target
feature_names = ['hours_studied', 'attendance_rate', 'homework_completion', 'previous_gpa']
X = data[feature_names]
y = data['passed']

# Train decision tree
model = DecisionTreeClassifier(max_depth=4, random_state=42)
model.fit(X, y)

# Check accuracy
train_accuracy = model.score(X, y)
print(f"\n✓ Model trained successfully!")
print(f"Training accuracy: {train_accuracy*100:.1f}%")
print("\nNote: This is on training data only; this is a demonstration!")
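
Training accuracy tends to be optimistic, especially for a tree fit on only 50 rows. The sketch below shows what a held-out evaluation could look like using `train_test_split`. It rebuilds a similar toy dataset so it runs on its own; the 30% test size, the `demo` variable names, and the use of `stratify` are our assumptions, not part of the lab.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Rebuild a toy dataset in the same spirit as Step 1 so this runs on its own
rng = np.random.default_rng(42)
n = 50
demo = pd.DataFrame({
    "hours_studied": rng.uniform(0, 10, n),
    "attendance_rate": rng.uniform(0.3, 1.0, n),
    "homework_completion": rng.uniform(0.4, 1.0, n),
    "previous_gpa": rng.uniform(2.0, 4.0, n),
})
score = (0.4 * demo["hours_studied"] + 0.3 * demo["previous_gpa"]
         + 0.2 * demo["attendance_rate"] + 0.1 * demo["homework_completion"]
         + rng.normal(0, 0.5, n))
demo["passed"] = (score > 2.5).astype(int)

X_demo = demo.drop(columns="passed")
y_demo = demo["passed"]

# Hold out 30% of the students; stratify keeps the pass/fail ratio similar
X_train, X_test, y_train, y_test = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=42, stratify=y_demo
)

tree = DecisionTreeClassifier(max_depth=4, random_state=42)
tree.fit(X_train, y_train)

train_acc = tree.score(X_train, y_train)
test_acc = tree.score(X_test, y_test)
print(f"Train accuracy: {train_acc*100:.1f}%")
print(f"Test accuracy:  {test_acc*100:.1f}%")
```

With so few students the test estimate is itself noisy, so treat both numbers as rough indications.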

### **Step 3: Test Predictions on Example Students**

Let's see what the model predicts for some test cases:

In [None]:
# Create example students
examples = pd.DataFrame([
    {'hours_studied': 8.0, 'attendance_rate': 0.9, 'homework_completion': 0.95, 'previous_gpa': 3.5,
     'description': 'High-performing student'},
    {'hours_studied': 2.0, 'attendance_rate': 0.5, 'homework_completion': 0.4, 'previous_gpa': 2.2,
     'description': 'Struggling student'},
    {'hours_studied': 5.0, 'attendance_rate': 0.75, 'homework_completion': 0.7, 'previous_gpa': 3.0,
     'description': 'Average student'}
])

print("Model predictions on example students:\n")
for idx, row in examples.iterrows():
    # Keep the feature names so sklearn sees the same columns it was trained on
    features = row[feature_names].astype(float).to_frame().T
    prob = model.predict_proba(features)[0]
    prediction = "PASS" if prob[1] > 0.5 else "FAIL"
    
    print(f"{row['description']}:")
    print(f"  Hours studied: {row['hours_studied']}, Attendance: {row['attendance_rate']:.2f}, "
          f"Homework: {row['homework_completion']:.2f}, GPA: {row['previous_gpa']}")
    print(f"  → Prediction: {prediction} ({prob[1]*100:.1f}% probability of passing)\n")

---

## 🔍 **Computing Feature Importance via Perturbation**

Now let's see which features matter most for these predictions:

In [None]:
def compute_feature_importance(sample, model, feature_names, reference_data):
    """
    Compute feature importance by perturbing each feature to its mean value.
    
    Args:
        sample: 1D array of feature values, ordered as in feature_names
        model: Trained sklearn classifier with predict_proba
        feature_names: List of feature names
        reference_data: DataFrame with training data (to compute means)
    
    Returns:
        importance: Dictionary mapping feature names to importance scores
        baseline_prob: Predicted pass probability for the unperturbed sample
    """
    # Work on a float copy so assigning a mean never truncates integer inputs
    sample = np.asarray(sample, dtype=float)
    
    def pass_probability(values):
        # Wrap in a DataFrame so the model sees the column names it was fit on
        frame = pd.DataFrame([values], columns=feature_names)
        return model.predict_proba(frame)[0][1]
    
    # Get baseline prediction
    baseline_prob = pass_probability(sample)
    
    importance = {}
    
    # Perturb each feature
    for i, feature in enumerate(feature_names):
        # Create perturbed sample
        perturbed = sample.copy()
        perturbed[i] = reference_data[feature].mean()  # Set to mean
        
        # Get new prediction
        perturbed_prob = pass_probability(perturbed)
        
        # Importance = absolute change in probability
        importance[feature] = abs(baseline_prob - perturbed_prob)
    
    return importance, baseline_prob

print("✓ Feature importance function defined!")
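
Replacing a feature with its mean is only one choice of reference value. A variant, sketched here under our own assumptions, replaces the feature with every value it takes in the reference data and averages the absolute change in pass probability. The function name `importance_over_reference` and the toy data are hypothetical and independent of the notebook's variables.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

def importance_over_reference(sample, model, reference_data):
    """Average absolute change in P(class 1) when each feature is replaced,
    in turn, by every value that feature takes in reference_data."""
    columns = list(reference_data.columns)
    base = pd.DataFrame([sample], columns=columns)
    baseline_prob = model.predict_proba(base)[0, 1]

    importance = {}
    for col in columns:
        # One copy of the sample per reference row, with only `col` swapped
        perturbed = pd.concat([base] * len(reference_data), ignore_index=True)
        perturbed[col] = reference_data[col].to_numpy()
        probs = model.predict_proba(perturbed)[:, 1]
        importance[col] = float(np.mean(np.abs(baseline_prob - probs)))
    return importance, baseline_prob

# Toy data (made up): passing depends only on hours_studied
rng = np.random.default_rng(0)
toy = pd.DataFrame({
    "hours_studied": rng.uniform(0, 10, 200),
    "previous_gpa": rng.uniform(2.0, 4.0, 200),
})
toy_y = (toy["hours_studied"] > 5).astype(int)
toy_model = DecisionTreeClassifier(max_depth=2, random_state=0).fit(toy, toy_y)

scores, p0 = importance_over_reference([8.0, 3.5], toy_model, toy)
print(f"Baseline: {p0:.2f}")
print(scores)
```

Because the toy label ignores `previous_gpa`, its score should come out as zero, which is a quick sanity check that the perturbation loop behaves as intended.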

### **Step 4: Visualize Feature Importance**

In [None]:
def visualize_feature_importance(sample_dict, model, feature_names, reference_data, title=""):
    """
    Visualize feature importance as a bar chart.
    """
    sample = np.array([sample_dict[f] for f in feature_names])
    importance, baseline_prob = compute_feature_importance(sample, model, feature_names, reference_data)
    
    # Sort by importance
    sorted_features = sorted(importance.items(), key=lambda x: x[1], reverse=True)
    features = [f[0] for f in sorted_features]
    values = [f[1] for f in sorted_features]
    
    # Create bar chart
    fig, ax = plt.subplots(figsize=(10, 5))
    colors = plt.cm.viridis(np.linspace(0.3, 0.9, len(features)))
    ax.barh(features, values, color=colors)
    ax.set_xlabel('Importance Score (Change in Pass Probability)', fontsize=12)
    ax.set_title(f'Feature Importance: {title}\nBaseline: {baseline_prob*100:.1f}% probability of passing', 
                 fontsize=12)
    ax.grid(axis='x', alpha=0.3)
    plt.tight_layout()
    plt.show()
    
    # Print results
    print(f"\nFeature importance (sorted):")
    for feature, imp in sorted_features:
        print(f"  {feature:25s}: {imp:.3f} (perturbing this changes prediction by {imp*100:.1f} percentage points)")

print("✓ Visualization function defined!")

---

## 📊 **Example 1: High-Performing Student**

In [None]:
student1 = {
    'hours_studied': 8.0,
    'attendance_rate': 0.9,
    'homework_completion': 0.95,
    'previous_gpa': 3.5
}

visualize_feature_importance(student1, model, feature_names, X, "High-Performing Student")

---

## 📝 **Question 15 (Observation)**

**Q15.** Which feature had the highest importance for predicting exam success? Does this align with intuition?

*Think about: What would you expect to be most predictive of passing an exam?*

*Record your answer in the Answer Sheet.*

---

## 📊 **Example 2: Struggling Student**

In [None]:
student2 = {
    'hours_studied': 2.0,
    'attendance_rate': 0.5,
    'homework_completion': 0.4,
    'previous_gpa': 2.2
}

visualize_feature_importance(student2, model, feature_names, X, "Struggling Student")

### **What to Notice:**

Feature importance can **vary by student**! For some students, hours studied matters most. For others, GPA or attendance might dominate.
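
To make the per-student variation concrete, here is a self-contained sketch on made-up data where a student passes only with enough hours *and* a high enough GPA. It compares mean-perturbation scores for two failing students with the tree's global `feature_importances_` attribute. The thresholds, the data, and the helper `local_importance` are illustrative assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up): a student passes only with enough hours AND enough GPA
rng = np.random.default_rng(1)
toy = pd.DataFrame({
    "hours_studied": rng.uniform(0, 10, 500),
    "previous_gpa": rng.uniform(2.0, 4.0, 500),
})
toy_y = ((toy["hours_studied"] > 3) & (toy["previous_gpa"] > 2.5)).astype(int)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(toy, toy_y)

def local_importance(sample):
    """Mean-perturbation importance for one student (dict of feature values)."""
    row = pd.DataFrame([sample])[list(toy.columns)]
    baseline = tree.predict_proba(row)[0, 1]
    scores = {}
    for col in toy.columns:
        perturbed = row.copy()
        perturbed[col] = toy[col].mean()
        scores[col] = float(abs(baseline - tree.predict_proba(perturbed)[0, 1]))
    return scores

# Two failing students who fail for different reasons
low_hours = local_importance({"hours_studied": 1.0, "previous_gpa": 3.5})
low_gpa = local_importance({"hours_studied": 8.0, "previous_gpa": 2.2})

# Global, impurity-based importance: one number per feature for the whole model
global_scores = dict(zip(toy.columns, tree.feature_importances_))

print("Low-hours student:", low_hours)
print("Low-GPA student:  ", low_gpa)
print("Global (tree):    ", global_scores)
```

The global scores give a single ranking for the whole model, while the local scores point to a different feature for each of the two students.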

---

## 🧪 **Experimentation: Perturbation Sensitivity**

Let's see what happens when we manually perturb features by different amounts:

In [None]:
# Take an average student
student_avg = {
    'hours_studied': 5.0,
    'attendance_rate': 0.75,
    'homework_completion': 0.7,
    'previous_gpa': 3.0
}

sample_avg = np.array([student_avg[f] for f in feature_names])

def pass_probability(values):
    # Wrap in a DataFrame so the model sees the column names it was fit on
    return model.predict_proba(pd.DataFrame([values], columns=feature_names))[0][1]

baseline = pass_probability(sample_avg)

print(f"Average student baseline: {baseline*100:.1f}% pass probability\n")
print("Testing perturbations by ±1 standard deviation:\n")

for i, feature in enumerate(feature_names):
    std = X[feature].std()
    
    # Increase by 1 std
    perturbed_up = sample_avg.copy()
    perturbed_up[i] += std
    prob_up = pass_probability(perturbed_up)
    
    # Decrease by 1 std
    perturbed_down = sample_avg.copy()
    perturbed_down[i] -= std
    prob_down = pass_probability(perturbed_down)
    
    print(f"{feature:25s}:")
    print(f"  +1 std → {prob_up*100:5.1f}% (change: {(prob_up-baseline)*100:+.1f}%)")
    print(f"  -1 std → {prob_down*100:5.1f}% (change: {(prob_down-baseline)*100:+.1f}%)")
    print()
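
A decision tree only changes its prediction when a perturbation pushes a feature across one of its split thresholds, so several of the changes above may be exactly zero. The sketch below, on made-up data with hypothetical names, sweeps one feature across its range while holding the other fixed and counts how many distinct probability levels appear.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up) so the sketch runs on its own
rng = np.random.default_rng(2)
toy = pd.DataFrame({
    "hours_studied": rng.uniform(0, 10, 300),
    "previous_gpa": rng.uniform(2.0, 4.0, 300),
})
noisy_score = (0.4 * toy["hours_studied"] + 0.3 * toy["previous_gpa"]
               + rng.normal(0, 0.5, 300))
toy_y = (noisy_score > 2.9).astype(int)
tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(toy, toy_y)

# Sweep hours_studied from 0 to 10 while holding GPA fixed at 3.0
grid = np.linspace(0, 10, 101)
sweep = pd.DataFrame({"hours_studied": grid, "previous_gpa": 3.0})
probs = tree.predict_proba(sweep)[:, 1]

# A depth-3 tree has at most 8 leaves, so the curve is a step function
n_levels = len(np.unique(probs))
print(f"{n_levels} distinct probability levels across {len(grid)} grid points")
```

Plotting `probs` against `grid` would show the staircase shape directly; a smoother model such as logistic regression would give a gradual curve instead.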

---

## 📝 **Question 16 (Experimentation)**

**Q16.** Try perturbing different features by ±1 standard deviation (see output above). Which perturbation changed the prediction the most?

*Record your answer in the Answer Sheet.*

---

## ⚠️ **Ethical Issue: Problematic Features**

What if we added a **zip code** feature to our model? Let's explore the ethical implications:

In [None]:
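# Add a zip code feature (simulate socioeconomic proxy)
# Students who passed are more likely to live in zip codes 10026-10050,
# which makes zip code predictive (problematic!).
# The 70%/30% split below is an illustrative choice for this demo.
data_with_zip = data.copy()
in_high_zip = np.random.rand(n_samples) < np.where(data_with_zip['passed'] == 1, 0.7, 0.3)
data_with_zip['zip_code'] = np.where(
    in_high_zip,
    np.random.randint(10026, 10051, n_samples),  # upper half: 10026-10050
    np.random.randint(10001, 10026, n_samples)   # lower half: 10001-10025
)

# Center and scale zip code to roughly [-1, 1]
data_with_zip['zip_code_normalized'] = (data_with_zip['zip_code'] - 10025) / 25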

# Retrain model with zip code
feature_names_biased = feature_names + ['zip_code_normalized']
X_biased = data_with_zip[feature_names_biased]
y_biased = data_with_zip['passed']

model_biased = DecisionTreeClassifier(max_depth=4, random_state=42)
model_biased.fit(X_biased, y_biased)

print("✓ Biased model trained (includes zip code feature)\n")

# Test on student
student_test = {
    'hours_studied': 5.0,
    'attendance_rate': 0.75,
    'homework_completion': 0.7,
    'previous_gpa': 3.0,
    'zip_code_normalized': 0.5  # High-income zip code proxy
}

visualize_feature_importance(student_test, model_biased, feature_names_biased, X_biased, 
                             "Student (with zip code feature)")

### **What's Wrong Here?**

If **zip_code** has high importance, the model is using **geography as a proxy** for success. This could reflect:
- Socioeconomic status
- School district quality
- Historical redlining patterns

**This is problematic!** The model might discriminate based on where students live, not their actual ability.

---

## 📝 **Questions 17-18 (Ethics & Application)**

**Q17.** If "zip_code" had high importance, why might this be problematic for a real education system?

*Think about: What does zip code proxy for? Is it fair to judge students by their address? What historical inequities might this reflect?*

*Record your answer in the Answer Sheet.*

---

**Q18.** Name a feature that might be predictive but ethically problematic to use in a real-world model (hiring, lending, admissions).

*Examples to consider: Name, gender, age, race, zip code, university name, etc.*

*Record your answer in the Answer Sheet.*

---

## 🔗 **How Saliency Helps with Fairness Auditing**

Feature importance analysis is the **first step** in fairness auditing:

1. **Compute saliency** for all features
2. **Identify problematic features** (proxies for protected classes)
3. **Investigate why** those features are important
4. **Remove or mitigate** bias sources
5. **Retrain and verify** improvement
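
The mechanics of these five steps can be sketched with scikit-learn's `permutation_importance`. Everything else here is an assumption for illustration: the made-up data, the `zip_proxy` column, the `FLAGGED` set (which in practice comes from human review, not code), and the helper `fit_and_score`.

```python
import numpy as np
import pandas as pd
from sklearn.inspection import permutation_importance
from sklearn.tree import DecisionTreeClassifier

# Toy data (made up): "zip_proxy" leaks the outcome, standing in for a
# feature correlated with a protected attribute
rng = np.random.default_rng(3)
n = 400
hours = rng.uniform(0, 10, n)
passed = (hours + rng.normal(0, 2, n) > 5).astype(int)
audit = pd.DataFrame({
    "hours_studied": hours,
    "zip_proxy": passed + rng.normal(0, 0.3, n),  # strongly tied to outcome
})

FLAGGED = {"zip_proxy"}  # step 2: features a human reviewer marked as proxies

def fit_and_score(frame):
    """Steps 1 and 5: fit a tree and compute permutation importance."""
    tree = DecisionTreeClassifier(max_depth=3, random_state=0).fit(frame, passed)
    result = permutation_importance(tree, frame, passed, n_repeats=10, random_state=0)
    return tree, dict(zip(frame.columns, result.importances_mean))

_, before = fit_and_score(audit)
flagged_in_use = [f for f in FLAGGED if before[f] > 0]

# Step 4: drop the flagged features, then retrain and re-check
_, after = fit_and_score(audit.drop(columns=list(FLAGGED)))

print("Before:", before)
print("Flagged features the model relies on:", flagged_in_use)
print("After: ", after)
```

Step 3, investigating why a flagged feature matters, is not something this sketch automates. Dropping a column also does not remove bias that other correlated features can still carry.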

Real-world tools for this:
- **Fairlearn (Microsoft):** Bias detection and mitigation
- **AI Fairness 360 (IBM):** Comprehensive fairness toolkit
- **What-If Tool (Google):** Interactive model probing
- **SHAP:** Feature attribution based on Shapley values from game theory

---

## 📝 **Question 19 (Reflection - will be in Module 4)**

*Note: This question is part of the ethics module.*

---

## ✅ Module 3 Complete!

You've learned:
- **How to compute feature importance** via perturbation
- **Which features drive predictions** in tabular data
- **Why some features are ethically problematic** (proxies for protected classes)
- **How saliency helps with fairness auditing** (detecting bias)
- **The importance of feature selection** in responsible AI

**Key Insight:** Saliency isn't just for debugging; it's crucial for **detecting and preventing discrimination** in automated decision systems.

**Next up:** Module 4, where you'll reflect on **ethics and explainability in practice**, bringing together everything you've learned about responsible AI deployment.

---