# Lab 6, Module 4: Ethics & Explainability in Practice

**Estimated time:** 10 minutes

---

## **Opening: Bringing It All Together**

You've now explored saliency across three different modalities:
- **Module 1:** Text (word masking)
- **Module 2:** Images (gradient heatmaps)
- **Module 3:** Tabular data (feature perturbation)

In every case, saliency revealed **what drives model decisions**, sometimes confirming our intuitions and sometimes revealing problems.

Now let's step back and reflect on the **bigger picture**: Why does explainability matter for responsible AI? When should we demand it? What are its limits?

---

# 📘 **The Explainability Imperative**

## **When Explainability Is Required**

Not all AI applications need the same level of explainability. Consider these scenarios:

### **High-Stakes Decisions (Explainability CRITICAL)**

| Domain | Decision | Why Explainability Matters |
|--------|----------|----------------------------|
| **Healthcare** | Diagnosis, treatment recommendations | Doctors need to verify AI reasoning; patients have right to understand |
| **Criminal Justice** | Sentencing, parole, risk assessment | Constitutional rights to due process; detect algorithmic bias |
| **Finance** | Loan approvals, credit scoring | Legal requirements (Equal Credit Opportunity Act); consumer rights |
| **Hiring** | Resume screening, candidate ranking | Anti-discrimination laws; fairness auditing |
| **Education** | College admissions, scholarship awards | Equity concerns; detect bias against protected groups |

### **Low-Stakes Decisions (Explainability Nice-to-Have)**

| Domain | Decision | Why Less Critical |
|--------|----------|--------------------|
| **Entertainment** | Movie recommendations | Wrong recommendation = minor inconvenience |
| **Shopping** | Product suggestions | Easy to ignore, low consequence |
| **Content Ranking** | Social media feed order | User can scroll past, personalization expected |

**Key principle:** The higher the stakes, the greater the need for explainability.

---

## 📝 **Question 19 (Ethics)**

**Q19.** Think of a specific high-stakes AI application (medical, legal, financial, etc.). Why would explainability be critical in that context? What could go wrong without it?

*Consider: Legal requirements, ethical obligations, potential for harm, need for trust.*

*Record your answer in the Answer Sheet.*

---

## ⚖️ **The Explainability-Performance Tradeoff (Revisited)**

Remember from Module 0 that there's often a tradeoff:

```
Simple models (linear, trees):
  ✓ High explainability
  ✗ Lower performance

Complex models (deep learning):
  ✓ High performance
  ✗ Lower explainability
```

### **Real-World Case Study: Healthcare AI**

**Scenario:** A hospital wants to predict patient readmission risk.

**Option A: Logistic Regression**
- **Performance:** 75% accuracy
- **Explainability:** Perfect (doctors can see exact feature weights)
- **Trust:** High (doctors understand and accept the model)

**Option B: Deep Neural Network**
- **Performance:** 85% accuracy
- **Explainability:** Low (black box)
- **Trust:** Low (doctors are suspicious of "magic" predictions)

**Option C: Neural Network + Saliency**
- **Performance:** 85% accuracy
- **Explainability:** Moderate (saliency maps show important features)
- **Trust:** Medium (doctors can verify that the AI focuses on relevant factors)

**Key question:** Is a gain of 10 percentage points in accuracy worth the reduced explainability?

### **The Answer Depends On:**
- **Stakes:** Life-or-death decisions may favor explainability
- **Regulations:** Some domains legally require explainability
- **Trust:** Will users trust and adopt the system?
- **Auditability:** Can we detect when it fails?
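
To make Option A's "doctors can see exact feature weights" concrete, here is a minimal Python sketch of why a logistic regression is easy to inspect: each feature's contribution to the log-odds is simply its weight times its value. The feature names, weights, and bias below are hypothetical values chosen for illustration; they are not taken from a real readmission model.

```python
import math

# Hypothetical weights for a toy readmission model (illustrative only).
WEIGHTS = {"prior_admissions": 0.8, "age_over_65": 0.5, "has_followup_visit": -1.2}
BIAS = -1.0

def explain(patient):
    """Return the predicted risk and each feature's additive contribution to the log-odds."""
    contributions = {name: WEIGHTS[name] * patient[name] for name in WEIGHTS}
    log_odds = BIAS + sum(contributions.values())
    risk = 1.0 / (1.0 + math.exp(-log_odds))  # logistic (sigmoid) function
    return risk, contributions

patient = {"prior_admissions": 2, "age_over_65": 1, "has_followup_visit": 0}
risk, contributions = explain(patient)

print(f"Predicted readmission risk: {risk:.2f}")
for name, value in contributions.items():
    print(f"  {name}: {value:+.1f} log-odds")
```

A doctor reading this output can check each line against clinical intuition. A deep network offers no such direct decomposition, which is why Option C adds saliency on top.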

---

## 📝 **Question 20 (Tradeoffs)**

**Q20.** In a medical diagnosis system, would you choose a 75% accurate explainable model or an 85% accurate black-box model? Explain your reasoning.

*Think about: Patient safety, doctor trust, legal liability, ability to detect errors, right to explanation.*

*Record your answer in the Answer Sheet.*

---

## 🔍 **When Saliency Reveals Problems**

Throughout this lab, you've seen examples where saliency exposes issues:

### **1. Spurious Correlations (Module 2)**
- **Problem:** Medical AI focuses on hospital logos, not anatomy
- **Saliency reveals:** Heatmap highlights corner watermark
- **Action:** Remove watermarks, retrain, add data augmentation

### **2. Problematic Proxies (Module 3)**
- **Problem:** Loan model uses zip code as proxy for race
- **Saliency reveals:** Zip code is top feature
- **Action:** Remove zip code, audit for other proxies, ensure fairness
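
As a rough illustration of how a proxy can surface as the top feature, the sketch below applies a simple perturbation test: replace one feature at a time with its column mean and measure how much the prediction moves on average. The `loan_score` model, its weights, the `zip_group` encoding, and the four applicants are all hypothetical; this is one simple perturbation scheme, not necessarily the exact procedure used in Module 3.

```python
def loan_score(row):
    # Hypothetical model that leans heavily on an encoded zip code feature.
    return 0.2 * row["income"] + 0.1 * row["credit_history"] + 0.7 * row["zip_group"]

# Hypothetical applicants with features scaled to the range 0..1.
applicants = [
    {"income": 0.9, "credit_history": 0.8, "zip_group": 1.0},
    {"income": 0.4, "credit_history": 0.6, "zip_group": 0.0},
    {"income": 0.7, "credit_history": 0.3, "zip_group": 1.0},
    {"income": 0.5, "credit_history": 0.9, "zip_group": 0.0},
]

def perturbation_importance(model, rows):
    """Mean absolute change in the prediction when a feature is replaced by its column mean."""
    importance = {}
    for feature in list(rows[0]):
        mean_value = sum(row[feature] for row in rows) / len(rows)
        changes = [abs(model(row) - model({**row, feature: mean_value})) for row in rows]
        importance[feature] = sum(changes) / len(changes)
    return importance

importance = perturbation_importance(loan_score, applicants)
top_feature = max(importance, key=importance.get)
print(importance)
print("Top feature:", top_feature)
```

If the top feature is one that should not drive the decision, that is the signal to start a fairness audit rather than the end of one.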

### **3. Negation Failures (Module 1)**
- **Problem:** Sentiment model can't handle "not good"
- **Saliency reveals:** "good" has positive saliency even when negated
- **Action:** Use context-aware models (for example, transformer-based models such as BERT) that take word order and negation into account
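
The negation failure can be reproduced with a toy bag-of-words scorer and word masking. This is a minimal sketch: the word scores are hypothetical, and the model is deliberately simplistic so that the failure is easy to see.

```python
# Hypothetical word scores for a toy bag-of-words sentiment model.
WORD_SCORES = {"good": 2.0, "bad": -2.0, "not": -0.5}

def sentiment(words):
    """Sum of per-word scores; word order is ignored, so negation cannot flip a word's sign."""
    return sum(WORD_SCORES.get(word, 0.0) for word in words)

def masking_saliency(words):
    """Saliency of each word = score of the full sentence minus score with that word removed."""
    base = sentiment(words)
    saliency = []
    for i, word in enumerate(words):
        masked = words[:i] + words[i + 1:]
        saliency.append((word, base - sentiment(masked)))
    return saliency

sentence = "the movie was not good".split()
score = sentiment(sentence)
saliency = masking_saliency(sentence)

print("Score:", score)  # positive, even though the sentence is negative
for word, value in saliency:
    print(f"  {word}: {value:+.1f}")
```

The word "good" keeps its full positive saliency next to "not", which is exactly the pattern that tells you the model has no notion of context.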

### **4. Background Focus (Module 2)**
- **Problem:** Object classifier focuses on backgrounds (grass for dogs, snow for wolves)
- **Saliency reveals:** Heatmap highlights wrong regions
- **Action:** Collect diverse backgrounds, use data augmentation

**Common theme:** Saliency is a **debugging and auditing tool** that reveals failure modes invisible from accuracy alone.

---

## 📝 **Question 21 (Application)**

**Q21.** Imagine you're deploying a resume screening AI for hiring. How would you use saliency to audit for bias before deployment?

*Think about: What features should the model focus on? What features would indicate bias? What would you check in the saliency maps?*

*Record your answer in the Answer Sheet.*

---

## 🌍 **Real-World Explainability in Practice**

Let's look at how major organizations approach explainability:

### **Google: Model Cards**
- Document model purpose, training data, performance, limitations
- Include fairness metrics across demographics
- Specify intended use cases and misuse risks

### **Microsoft: Responsible AI Standard**
- Requires impact assessments for high-risk AI
- Mandates explainability for consequential decisions
- Provides Fairlearn toolkit for bias detection

### **Financial Services: FCRA & ECOA Compliance**
- Must provide "adverse action notices" for loan denials
- Explain which factors led to decision
- Face legal liability for discriminatory algorithms

### **European Union: GDPR Article 22**
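- Often summarized as a "right to explanation" for automated decisions; the exact scope is debated, since the explicit wording appears in Recital 71 rather than in Article 22 itself
- Right to obtain human intervention and to contest the decision
- Restrictions on solely automated decisions with legal or similarly significant effects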

### **Healthcare: FDA AI/ML Guidance**
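- Emphasizes transparency for AI-enabled medical devices, so that clinicians and patients receive clear information about what a device does and how well it performs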
- Continuous monitoring and retraining protocols
- Clinical validation and interpretability standards

---

## 📝 **Question 22 (Regulation)**

**Q22.** Why do you think regulations like GDPR give people rights around automated decisions, often summarized as a "right to explanation"? What problem is this trying to solve?

*Think about: Power imbalances, accountability, discrimination, human dignity, trust in institutions.*

*Record your answer in the Answer Sheet.*

---

## ⚠️ **Limitations of Saliency (Important!)**

While powerful, saliency has important limitations:

### **1. Saliency ≠ Causality**
- **What saliency shows:** Feature X affects prediction
- **What it doesn't show:** Whether X *causes* outcome
- **Example:** "Umbrella usage" is salient for "rain prediction", but umbrellas don't cause rain!

### **2. Saliency Can Be Misleading**
- **Adversarial examples:** Small, imperceptible changes can fool models
- **Saliency can be manipulated:** Attackers can create inputs with misleading saliency
- **Example:** An image still looks like a panda to humans, but the model predicts gibbon; a saliency map for that prediction may highlight regions that mean nothing to a human viewer

### **3. Saliency Doesn't Explain Interactions**
- **Shows:** Individual feature importance
- **Doesn't show:** How features combine ("age" + "income" together)
- **Example:** "Hours studied" alone has a moderate effect, but "hours studied" and "attendance" together have a strong one; single-feature saliency misses this
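
The interaction problem can be shown with a few lines of arithmetic. In this sketch, `pass_score` is a hypothetical model with small individual effects and a large joint effect; the coefficients are made up for illustration.

```python
def pass_score(hours, attendance):
    # Hypothetical model: small individual effects plus a large interaction term.
    return 0.2 * hours + 0.2 * attendance + 0.6 * hours * attendance

base = pass_score(0, 0)

# Single-feature perturbation: switch on one feature at a time.
effect_hours = pass_score(1, 0) - base
effect_attendance = pass_score(0, 1) - base

# Joint perturbation: switch on both features together.
effect_both = pass_score(1, 1) - base

# Whatever the individual effects cannot account for is the interaction.
interaction = effect_both - effect_hours - effect_attendance

print(f"hours alone:      {effect_hours:.1f}")
print(f"attendance alone: {effect_attendance:.1f}")
print(f"both together:    {effect_both:.1f}")
print(f"interaction:      {interaction:.1f}")
```

A method that only perturbs one feature at a time reports the two small individual effects and never sees the larger joint term.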

### **4. Multiple Methods, Inconsistent Results**
- **Gradients:** One saliency map
- **Integrated Gradients:** Different map
- **LIME:** Yet another map
- **SHAP:** Still different
- **Problem:** Which one is "right"? (They all answer slightly different questions)
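
To see how two methods can disagree on the same input, the sketch below compares a plain gradient with Integrated Gradients on a tiny saturating function. The `model` function, the input point, and the all-zeros baseline are assumptions made for illustration; gradients are estimated with finite differences so the example needs no libraries.

```python
def model(x):
    # Hypothetical model: the first feature saturates at 1.0, the second has a small linear effect.
    return min(x[0], 1.0) + 0.1 * x[1]

def numerical_gradient(f, x, eps=1e-6):
    """Central finite-difference estimate of the gradient of f at x."""
    grads = []
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        grads.append((f(up) - f(down)) / (2 * eps))
    return grads

def integrated_gradients(f, x, baseline, steps=300):
    """Average the gradient along a straight path from baseline to x, then scale by (x - baseline)."""
    totals = [0.0] * len(x)
    for k in range(steps):
        alpha = (k + 0.5) / steps  # midpoint rule
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(numerical_gradient(f, point)):
            totals[i] += g
    return [(xi - b) * total / steps for xi, b, total in zip(x, baseline, totals)]

x = [3.0, 1.0]
baseline = [0.0, 0.0]

grad = numerical_gradient(model, x)
ig = integrated_gradients(model, x, baseline)

print("Plain gradient:      ", [round(g, 3) for g in grad])
print("Integrated Gradients:", [round(a, 3) for a in ig])
```

At this input the plain gradient gives the saturated first feature no credit, while Integrated Gradients attributes most of the output to it. Neither is wrong: one measures local sensitivity, the other accumulated contribution relative to a baseline.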

### **5. Explanations Can Create False Confidence**
- Seeing a saliency map can make people trust the model more
- This holds even if the saliency is misleading!
- **Danger:** "Explanation theater" without real understanding

---

## 📝 **Question 23 (Limitations)**

**Q23.** Given the limitations of saliency, what else would you want to know about a model before deploying it in a high-stakes application?

*Think about: Performance on different groups, failure modes, training data quality, human oversight, recourse mechanisms.*

*Record your answer in the Answer Sheet.*

---

## 🎯 **Best Practices for Responsible AI**

Based on everything you've learned, here are key principles:

### **1. Explainability Is One Tool, Not The Solution**
- Use saliency alongside fairness metrics, performance analysis, and human oversight
- Don't rely on explanations alone

### **2. Match Explainability to Stakes**
- High-stakes → demand high explainability (even at performance cost)
- Low-stakes → can prioritize performance

### **3. Audit Early and Often**
- Check saliency during development, not just deployment
- Look for spurious correlations and problematic features
- Test on diverse data

### **4. Document Everything**
- Training data sources and biases
- Model architecture and hyperparameters
- Performance across demographics
- Known limitations and failure modes
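
"Performance across demographics" can be as simple as breaking accuracy down by group. The following is a minimal sketch with hypothetical records of the form `(group, true_label, predicted_label)`; the group names and values are invented for illustration.

```python
from collections import defaultdict

def accuracy_by_group(records):
    """Return accuracy per group from (group, true_label, predicted_label) tuples."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for group, label, prediction in records:
        total[group] += 1
        correct[group] += int(label == prediction)
    return {group: correct[group] / total[group] for group in total}

# Hypothetical evaluation records.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 0), ("B", 0, 1),
]

overall = sum(label == prediction for _, label, prediction in records) / len(records)
per_group = accuracy_by_group(records)

print(f"Overall accuracy: {overall:.2f}")
for group, accuracy in sorted(per_group.items()):
    print(f"  Group {group}: {accuracy:.2f}")
```

A single overall number can hide a large gap between groups, which is why the per-group breakdown belongs in the documentation.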

### **5. Enable Human Oversight**
- Don't fully automate consequential decisions
- Provide tools for humans to inspect and override
- Create appeal and recourse mechanisms

### **6. Monitor in Production**
- Saliency at deployment time ≠ saliency after data drift
- Continuously monitor for shifts in feature importance
- Retrain when needed
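
One possible way to watch for shifts in feature importance is to compare each feature's share of total importance between a reference snapshot and the current one. The sketch below assumes importance scores are already available as dictionaries; the function name, the 0.10 threshold, and the scores are hypothetical choices, not an established standard.

```python
def importance_shift(reference, current, threshold=0.10):
    """Return features whose share of total importance changed by more than `threshold`."""
    def shares(scores):
        total = sum(scores.values())
        return {name: value / total for name, value in scores.items()}

    ref, cur = shares(reference), shares(current)
    flagged = {}
    for name in set(ref) | set(cur):
        change = cur.get(name, 0.0) - ref.get(name, 0.0)
        if abs(change) > threshold:
            flagged[name] = change
    return flagged

# Hypothetical importance scores at deployment time and after several months in production.
at_deployment = {"income": 50, "credit_history": 40, "zip_group": 10}
in_production = {"income": 35, "credit_history": 35, "zip_group": 30}

flagged = importance_shift(at_deployment, in_production)
for name, change in sorted(flagged.items()):
    print(f"{name}: share changed by {change:+.2f}")
```

A flagged feature is a prompt for human review, not an automatic verdict; the right threshold depends on the application.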

### **7. Consider Not Using AI**
- Sometimes the right answer is "don't automate this"
- If you can't explain it, can't audit it, can't ensure fairness → don't deploy it

---

## 📝 **Question 24 (Synthesis)**

**Q24.** Reflecting on the entire lab (Modules 0-4): What is the most important takeaway about explainability and responsible AI? Why does this matter for your future work with AI systems?

*Think about: What surprised you? What changed your thinking? How will you approach AI differently now?*

*Record your answer in the Answer Sheet.*

---

## 🔗 **Connection to the Lab Sequence**

Let's see how Lab 6 completes your journey through the DATA 1010 labs:

**Lab 1:** You learned what models are and how they make predictions

**Lab 2:** You learned how models learn through gradient descent

**Lab 3:** You learned how activation functions enable nonlinearity

**Lab 4:** You learned how hidden layers create powerful representations

**Lab 5:** You learned how embeddings encode meaning as geometry

**Lab 6:** You learned how to understand **what models actually do** and **whether they should be trusted**

---

## 🚀 **Where to Go Next**

If you're interested in explainability and responsible AI, explore:

### **Advanced Explainability Methods**
- **SHAP (SHapley Additive exPlanations):** Game theory approach to feature importance
- **LIME (Local Interpretable Model-Agnostic Explanations):** Local approximations
- **Integrated Gradients:** More stable gradient method
- **Attention Visualization:** For transformers and language models

### **Fairness and Bias**
- **Fairlearn (Microsoft):** Bias detection and mitigation toolkit
- **AI Fairness 360 (IBM):** Comprehensive fairness library
- **Aequitas:** Bias and fairness audit toolkit

### **Responsible AI Frameworks**
- **Google Model Cards:** Documentation standard
- **Microsoft Responsible AI Standard:** Company-wide principles
- **Partnership on AI:** Industry consortium

### **Academic Resources**
- **"Interpretable Machine Learning" by Christoph Molnar:** Free online book
- **"Fairness and Machine Learning" by Barocas, Hardt, Narayanan:** Comprehensive textbook
- **ACM FAccT Conference:** Top venue for fairness, accountability, transparency

### **Regulatory Developments**
- **EU AI Act:** Risk-based regulation framework
- **NIST AI Risk Management Framework:** US government guidance
- **Algorithmic Accountability Acts:** Proposed US legislation

---

## ✅ Lab 6 Complete!

Congratulations! You've completed Lab 6 and learned:

### **Technical Skills:**
- ✓ How to compute saliency for text, images, and tabular data
- ✓ How to visualize feature importance
- ✓ How to interpret saliency maps

### **Conceptual Understanding:**
- ✓ Why explainability matters for responsible AI
- ✓ When explainability is critical vs. nice-to-have
- ✓ The interpretability-performance tradeoff
- ✓ Limitations of saliency methods

### **Ethical Reasoning:**
- ✓ How to detect spurious correlations
- ✓ How to identify problematic features
- ✓ How to audit models for bias
- ✓ When not to deploy AI systems

**Most importantly:** You understand that building accurate models is not enough; we must also ensure they're **fair, transparent, and accountable**.

---

## 📋 **Before You Submit**

Make sure you have:

- [ ] Completed Module 0 (What Is Saliency?)
- [ ] Completed Module 1 (Text Saliency)
- [ ] Completed Module 2 (Image Saliency)
- [ ] Completed Module 3 (Tabular Saliency)
- [ ] Completed Module 4 (Ethics & Explainability)
- [ ] Answered all 24 questions (Q1-Q24)
- [ ] Experimented with your own text, images, or data
- [ ] Reflected on the ethical implications
- [ ] Understood connections to Labs 1-5

---

## 🎓 **Final Reflection**

Take a moment to appreciate what you've learned across all six labs:

You started by manually adjusting weights to fit a line to data (Lab 1). Now you understand how to audit billion-parameter models for bias and fairness (Lab 6).

You've gone from **building** AI to **understanding** AI to **responsibly deploying** AI.

That's a remarkable journey.

**Thank you for taking responsible AI seriously.** The field needs more people like you who ask not just "can we build this?" but "should we build this?" and "how do we build this responsibly?"

---

**Congratulations on completing Lab 6!**

---