# Exercise 01: NLP Applications and Ethics
## AIAT 121 - Natural Language Processing

---

## Learning Objectives

In this exercise, you will:
- Build a real-world NLP application (chatbot, text summarizer, or sentiment analysis system)
- Detect and analyze bias in NLP models
- Implement bias mitigation techniques
- Evaluate NLP systems for fairness and ethical concerns

---

## Real-World Context

You are working for a customer service company that wants to deploy an NLP-powered chatbot. Before deployment, you need to verify that the system treats all customer groups equitably and does not produce biased outputs.

**Task**: Build an NLP application and conduct an ethical audit.

---

## Task 1: Build NLP Application (40 points)

Choose ONE of the following applications:

**Option A: Chatbot**
- Build a simple chatbot using sequence-to-sequence or transformer models
- Handle customer queries
- Implement conversation flow

**Option B: Text Summarization**
- Implement extractive or abstractive summarization
- Summarize customer feedback or support tickets
- Evaluate summary quality

**Option C: Sentiment Analysis System**
- Build a sentiment analysis system for customer reviews
- Handle multiple languages if possible
- Provide confidence scores

**Requirements:**
- Use appropriate NLP libraries (NLTK, spaCy, Transformers)
- Preprocess text properly
- Train or fine-tune a model
- Evaluate performance with task-appropriate metrics (for example, accuracy or F1 for classification, ROUGE for summarization)
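
For the sentiment analysis option, the evaluation requirement could be met with per-class precision, recall, and F1. The sketch below is a minimal, dependency-free illustration only: `per_class_prf`, `macro_f1`, and the toy labels are hypothetical names and data, not part of the assignment. In practice a library such as scikit-learn would usually be used instead.

```python
from collections import Counter


def per_class_prf(y_true, y_pred):
    """Return {label: (precision, recall, f1)} for every label seen."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted p, but the true label was different
            fn[t] += 1  # the true label t was missed
    scores = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = tp[label] + fp[label]
        actual = tp[label] + fn[label]
        precision = tp[label] / predicted if predicted else 0.0
        recall = tp[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        scores[label] = (precision, recall, f1)
    return scores


def macro_f1(scores):
    """Unweighted mean of per-class F1, so small classes count equally."""
    return sum(f1 for _, _, f1 in scores.values()) / len(scores)


# Hypothetical toy labels, for illustration only
y_true = ["pos", "pos", "neg", "neg", "neu", "pos"]
y_pred = ["pos", "neg", "neg", "neg", "neu", "pos"]

scores = per_class_prf(y_true, y_pred)
for label, (p, r, f1) in scores.items():
    print(f"{label}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
print(f"macro F1: {macro_f1(scores):.2f}")
```

Macro-averaging is used here because it weights every class equally, which makes weak performance on a rare class visible instead of hiding it behind overall accuracy.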


In [1]:
# Setup
%pip install nltk transformers torch pandas numpy matplotlib seaborn -q

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from collections import Counter

print('✅ Setup complete!')


Note: you may need to restart the kernel to use updated packages.


✅ Setup complete!


## Task 2: Bias Detection (30 points)

Analyze your NLP application for potential biases:

1. **Dataset Bias Analysis**
   - Analyze training data distribution
   - Check for representation gaps
   - Identify demographic biases

2. **Model Bias Detection**
   - Test model on different demographic groups
   - Measure performance disparities
   - Identify biased predictions

3. **Word Embedding Bias**
   - Analyze word embeddings for gender/racial bias
   - Use WEAT (Word Embedding Association Test) if applicable
   - Visualize bias patterns
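
The WEAT mentioned above compares how strongly two sets of target words (X, Y) associate with two sets of attribute words (A, B), using cosine similarity between embeddings. The following is a minimal sketch of the effect-size calculation, assuming the embeddings are already available as plain lists of floats. The 2-dimensional vectors are made-up placeholders, not real embeddings, and the use of the sample standard deviation is an assumption about the normalization.

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(w, attrs_a, attrs_b):
    """s(w, A, B): mean similarity of w to A minus mean similarity to B."""
    sim_a = [cosine(w, a) for a in attrs_a]
    sim_b = [cosine(w, b) for b in attrs_b]
    return mean(sim_a) - mean(sim_b)


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b):
    """Difference of mean associations for X and Y, scaled by the
    standard deviation of associations over all target words."""
    s_x = [association(x, attrs_a, attrs_b) for x in targets_x]
    s_y = [association(y, attrs_a, attrs_b) for y in targets_y]
    # Assumption: sample standard deviation (n - 1 in the denominator)
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Made-up toy vectors; replace with embeddings looked up from your model
X = [[1.0, 0.1], [0.9, 0.2]]  # target set X (e.g. career words)
Y = [[0.1, 1.0], [0.2, 0.9]]  # target set Y (e.g. family words)
A = [[1.0, 0.0]]              # attribute set A (e.g. male terms)
B = [[0.0, 1.0]]              # attribute set B (e.g. female terms)

d = weat_effect_size(X, Y, A, B)
print(f"WEAT effect size: {d:.3f}")
```

A value near zero suggests no measurable differential association, while a large positive or negative value suggests that X is more associated with A or with B respectively. The full test also includes a permutation-based significance check, which this sketch omits.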

In [2]:
# TODO: Implement bias detection
# YOUR CODE HERE

# Example: Load and analyze dataset
# dataset = pd.read_csv('your_dataset.csv')
# Analyze demographic distribution, etc.
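
One possible starting point for the dataset and model checks in this task is a per-group report: each group's share of the data, plus accuracy, true positive rate, and false positive rate. This is a sketch under stated assumptions: it handles binary labels only, and `group_report`, `max_gap`, and the toy group names and labels are hypothetical. Substitute the demographic attribute and predictions from your own application.

```python
from collections import defaultdict


def group_report(groups, y_true, y_pred, positive=1):
    """Per-group data share, accuracy, TPR, and FPR for binary labels."""
    counts = defaultdict(lambda: {"n": 0, "correct": 0,
                                  "tp": 0, "fn": 0, "fp": 0, "tn": 0})
    for g, t, p in zip(groups, y_true, y_pred):
        c = counts[g]
        c["n"] += 1
        c["correct"] += int(t == p)
        if t == positive:
            c["tp" if p == positive else "fn"] += 1
        else:
            c["fp" if p == positive else "tn"] += 1

    total = sum(c["n"] for c in counts.values())
    report = {}
    for g, c in counts.items():
        positives = c["tp"] + c["fn"]
        negatives = c["fp"] + c["tn"]
        report[g] = {
            "share": c["n"] / total,  # representation in the dataset
            "accuracy": c["correct"] / c["n"],
            "tpr": c["tp"] / positives if positives else float("nan"),
            "fpr": c["fp"] / negatives if negatives else float("nan"),
        }
    return report


def max_gap(report, metric):
    """Largest difference in a metric between any two groups."""
    values = [r[metric] for r in report.values()]
    return max(values) - min(values)


# Hypothetical toy data: group membership, true labels, model predictions
groups = ["A"] * 6 + ["B"] * 4
y_true = [1, 1, 1, 0, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 1, 1, 0, 1, 0]

report = group_report(groups, y_true, y_pred)
for g, r in report.items():
    print(g, {k: round(v, 2) for k, v in r.items()})
print("TPR gap:", round(max_gap(report, "tpr"), 2))
print("FPR gap:", round(max_gap(report, "fpr"), 2))
```

Large TPR or FPR gaps between groups indicate a performance disparity worth investigating. With small groups the estimates are noisy, so report group sizes alongside the rates.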


## Task 3: Bias Mitigation (20 points)

Implement at least ONE bias mitigation technique:

1. **Data-Level Mitigation**
   - Balance dataset representation
   - Augment underrepresented groups
   - Remove biased examples

2. **Model-Level Mitigation**
   - Use debiased embeddings
   - Add fairness constraints
   - Fine-tune with balanced data

3. **Post-Processing Mitigation**
   - Apply fairness filters
   - Adjust predictions for fairness
   - Apply equalized-odds post-processing (for example, group-specific decision thresholds)

**Document your approach and results.**
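
As one illustration of data-level mitigation, the sketch below balances group representation by random oversampling: rows from smaller groups are duplicated at random until every group matches the largest one. It is a minimal example, not a recommended pipeline. The function name, the `dialect` field, and the toy rows are hypothetical.

```python
import random
from collections import Counter, defaultdict


def oversample_to_balance(rows, group_key, seed=0):
    """Duplicate randomly chosen rows from smaller groups until every
    group has as many rows as the largest group."""
    rng = random.Random(seed)  # fixed seed keeps the result reproducible
    by_group = defaultdict(list)
    for row in rows:
        by_group[row[group_key]].append(row)

    target = max(len(members) for members in by_group.values())
    balanced = []
    for members in by_group.values():
        balanced.extend(members)  # keep every original row
        # Sample with replacement to fill the gap (k may be 0)
        balanced.extend(rng.choices(members, k=target - len(members)))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy training rows; "dialect" stands in for whichever
# demographic attribute your audit identified as underrepresented
rows = (
    [{"text": f"standard example {i}", "dialect": "standard", "label": i % 2}
     for i in range(5)]
    + [{"text": f"regional example {i}", "dialect": "regional", "label": i % 2}
       for i in range(2)]
)

balanced = oversample_to_balance(rows, "dialect")
print("before:", Counter(r["dialect"] for r in rows))
print("after: ", Counter(r["dialect"] for r in balanced))
```

Oversampling should be applied to the training split only, after the train/test split, so that duplicated rows do not leak into evaluation. Because it repeats existing examples rather than adding new information, compare per-group metrics before and after to check whether it actually helped.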

In [3]:
# TODO: Implement bias mitigation
# YOUR CODE HERE


## Task 4: Ethical Audit Report (10 points)

Create a comprehensive ethical audit report covering:

1. **Bias Analysis Summary**
   - Key findings
   - Affected groups
   - Impact assessment

2. **Mitigation Results**
   - Techniques applied
   - Effectiveness evaluation
   - Remaining concerns

3. **Recommendations**
   - Best practices for deployment
   - Ongoing monitoring strategies
   - Responsible AI guidelines

**Submission**: The completed notebook with all tasks, plus the ethical audit report

**Grading**: 100 points total
