# 🎓 Capstone: Fine-Tuning a Language Model for Emotion Classification

Welcome to the final workshop notebook! In this exercise, you'll fine-tune a pretrained transformer model on the **Emotion** dataset. This dataset includes short text samples labeled with one of six emotions:

- joy, sadness, anger, fear, surprise, love

You'll use all the skills you've practiced: tokenization, model loading, training, and evaluation — plus new ideas like zero-shot classification and planning your own projects.

## 📂 Part 1: Load and Explore the Dataset

In [2]:
from datasets import load_dataset
dataset = load_dataset('emotion')
dataset['train'][0]


{'text': 'i didnt feel humiliated', 'label': 0}

### 🔍 Challenge: How many examples are there per emotion?
Try using `dataset['train'].features['label'].names` to map label IDs to emotion names, then summarize the label counts with a simple loop or `pandas`.
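One possible way to approach the challenge is sketched below with a tiny made-up list of label IDs, so it runs without downloading anything. The label order is the one printed later in this notebook; the helper name `count_per_emotion` and the toy data are assumptions for illustration only.

```python
from collections import Counter

# Label names in the order the dataset reports them (ID 0 is 'sadness').
label_names = ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']

# Hypothetical label IDs standing in for dataset['train']['label'].
toy_labels = [0, 1, 1, 3, 0, 5, 1, 2]

def count_per_emotion(label_ids, names):
    """Return a dict mapping each emotion name to its number of examples."""
    counts = Counter(label_ids)
    # Include emotions with zero examples so every class is visible.
    return {name: counts.get(i, 0) for i, name in enumerate(names)}

summary = count_per_emotion(toy_labels, label_names)
print(summary)
```

On the real data, the same helper could be called as `count_per_emotion(dataset['train']['label'], dataset['train'].features['label'].names)`.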

## ✂️ Part 2: Tokenize the Dataset

In [4]:
from transformers import DistilBertTokenizerFast
tokenizer = DistilBertTokenizerFast.from_pretrained('distilbert-base-uncased')

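def tokenize(batch):
    # With batched=True, batch['text'] is a list of strings.
    # padding='max_length' pads every example to the model's maximum length
    # (512 tokens for DistilBERT): simple, but slower than dynamic padding.
    return tokenizer(batch['text'], truncation=True, padding='max_length')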

tokenized = dataset.map(tokenize, batched=True)
tokenized = tokenized.remove_columns(['text'])
tokenized.set_format('torch')

Map: 100%|██████████| 2000/2000 [00:00<00:00, 6520.09 examples/s]


## 🧠 Part 3: Load Pretrained Model

In [5]:
from transformers import AutoModelForSequenceClassification

# TODO: Load model with 6 output labels (for the 6 emotions in the dataset)
model = AutoModelForSequenceClassification.from_pretrained(
    'distilbert-base-uncased',
    num_labels=6
)

Some weights of DistilBertForSequenceClassification were not initialized from the model checkpoint at distilbert-base-uncased and are newly initialized: ['classifier.bias', 'classifier.weight', 'pre_classifier.bias', 'pre_classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.


## 🚀 Part 4: Fine-Tune the Model

Fill in the arguments below to build your Trainer to fine-tune the model. Some parts are filled in for you already.

**NOTE:** This is a larger dataset, so avoid training for too many epochs. Start with a small number (2 or 3) and increase it only if the validation loss is still improving.
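To get a feel for what an extra epoch costs, you can estimate the number of optimizer steps before launching a run. This is a minimal sketch that assumes a single device, no gradient accumulation, and a training split of 16,000 examples (a figure not stated in this notebook); the helper name `total_training_steps` is hypothetical.

```python
import math

def total_training_steps(num_examples, batch_size, num_epochs):
    """Estimate optimizer steps (single device, no gradient accumulation)."""
    steps_per_epoch = math.ceil(num_examples / batch_size)
    return steps_per_epoch * num_epochs

# Assumed training-set size of 16,000 and the batch size of 16 used below.
steps_2_epochs = total_training_steps(16_000, 16, 2)
steps_3_epochs = total_training_steps(16_000, 16, 3)
print(steps_2_epochs, steps_3_epochs)
```

Under these assumptions each additional epoch adds about 1,000 steps, so training time grows roughly linearly with the epoch count.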

In [6]:
from transformers import TrainingArguments, Trainer, DataCollatorWithPadding

args = TrainingArguments(
    output_dir='./results',
    eval_strategy='steps', # Evaluate every eval_steps
    eval_steps=100,
    logging_steps=100,
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    num_train_epochs=3,
    weight_decay=0.01,
    logging_dir='./logs'
)

data_collator = DataCollatorWithPadding(tokenizer=tokenizer)

# TODO: Fill in the Trainer constructor
trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized['train'],
    eval_dataset=tokenized['validation'],
    tokenizer=tokenizer,
    data_collator=data_collator
)

trainer.train()


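| Step | Training Loss | Validation Loss |
|------|---------------|-----------------|
| 100 | No log | 1.067747 |
| 200 | No log | 0.628343 |
| 300 | No log | 0.422641 |
| 400 | No log | 0.316672 |
| 500 | 0.698500 | 0.255906 |
| 600 | 0.698500 | 0.235214 |
| 700 | 0.698500 | 0.236051 |
| 800 | 0.698500 | 0.212493 |
| 900 | 0.698500 | 0.185259 |
| 1000 | 0.238500 | 0.184721 |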


TrainOutput(global_step=3000, training_loss=0.2369939422607422, metrics={'train_runtime': 1158.3983, 'train_samples_per_second': 41.437, 'train_steps_per_second': 2.59, 'total_flos': 6358888710144000.0, 'train_loss': 0.2369939422607422, 'epoch': 3.0})

## 📈 Part 5: Evaluate and Predict

In [7]:
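import torch

def predict_emotion(text):
    """Print the predicted emotion and class probabilities for one text."""
    model.eval()  # switch off dropout so predictions are deterministic
    inputs = tokenizer(text, return_tensors='pt', truncation=True, padding=True)
    # Move the inputs to the same device as the model (CPU or GPU).
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    with torch.no_grad():  # gradients are not needed for inference
        outputs = model(**inputs)
    # Softmax turns the six logits into probabilities that sum to 1.
    probs = outputs.logits.softmax(dim=-1).squeeze().tolist()
    labels = dataset['train'].features['label'].names
    predicted = labels[probs.index(max(probs))]
    print(f"\n📝 Input: {text}")
    print(f"🤖 Predicted Emotion: {predicted} ({max(probs)*100:.2f}% confidence)")
    print(f"📊 Probabilities: {dict(zip(labels, [f'{p:.3f}' for p in probs]))}")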

In [13]:
# write in your own statement to evaluate the model!
predict_emotion("I can't believe you did that!")


📝 Input: I can't believe you did that!
🤖 Predicted Emotion: anger (58.01% confidence)
📊 Probabilities: {'sadness': '0.024', 'joy': '0.343', 'love': '0.004', 'anger': '0.580', 'fear': '0.024', 'surprise': '0.024'}
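The probabilities above come from applying softmax to the model's six output logits and picking the largest one. The sketch below reproduces that step in plain Python; the logit values are made up for illustration and are not the model's real outputs.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

labels = ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
toy_logits = [-1.0, 1.5, -2.5, 2.0, -1.0, -1.0]  # hypothetical values

probs = softmax(toy_logits)
# Index of the highest probability is the predicted class.
best = max(range(len(probs)), key=probs.__getitem__)
print(labels[best], f"{probs[best]:.3f}")
```

Because softmax only rescales the logits, the predicted label is always the one with the largest logit; the probabilities mainly tell you how close the runner-up was.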


## 🔮 Part 6: Zero-Shot Classification Comparison

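Zero-shot classification uses a model trained on **Natural Language Inference** (NLI) to label text without any additional training. We'll use `facebook/bart-large-mnli`: the pipeline turns each candidate label into a hypothesis (by default, "This example is {label}.") and scores how strongly the input text entails it.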
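To make the idea concrete, here is a rough sketch of the scoring logic in plain Python. It assumes the pipeline's default hypothesis template and single-label mode, where the entailment scores are normalized across the candidate labels. The entailment logits are invented for illustration, and the helper names are hypothetical; the real pipeline gets these scores from the NLI model.

```python
import math

def build_hypotheses(labels, template="This example is {}."):
    """Turn each candidate label into an NLI hypothesis sentence."""
    return [template.format(label) for label in labels]

def rank_labels(labels, entailment_logits):
    """Softmax the entailment logits across labels and sort best-first."""
    peak = max(entailment_logits)
    exps = [math.exp(x - peak) for x in entailment_logits]
    total = sum(exps)
    scored = [(label, e / total) for label, e in zip(labels, exps)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

candidates = ['sadness', 'joy', 'love', 'anger', 'fear', 'surprise']
hypotheses = build_hypotheses(candidates)

# Hypothetical entailment logits, one per hypothesis.
toy_entailment = [0.1, 0.3, -1.2, 1.4, 0.0, 2.6]
ranking = rank_labels(candidates, toy_entailment)

print(hypotheses[0])
print(ranking[0][0])
```

Since the hypothesis wording is part of the input, rephrasing the labels or the template can change the ranking even though no weights are updated.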

In [12]:
from transformers import pipeline
zero_shot = pipeline('zero-shot-classification', model='facebook/bart-large-mnli')

candidate_labels = dataset['train'].features['label'].names
text = "I can’t believe you did that!"
result = zero_shot(text, candidate_labels=candidate_labels)
print(f"\n📝 Input: {text}")
print(f"🤖 Predicted: {result['labels'][0]} ({result['scores'][0]*100:.2f}% confidence)")

Device set to use cuda:0



📝 Input: I can’t believe you did that!
🤖 Predicted: surprise (66.37% confidence)


### 💬 Discussion: Do the two models agree?
Compare the results from your fine-tuned model and the zero-shot model. Do they agree? Does one make more sense to you than the other?
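A single sentence is a small sample, so one way to ground the discussion is to run both models on several texts and measure how often they agree. The sketch below assumes you have already collected the predicted labels into two lists; the predictions shown are invented placeholders and `agreement_rate` is a hypothetical helper.

```python
def agreement_rate(preds_a, preds_b):
    """Fraction of positions where two prediction lists give the same label."""
    if len(preds_a) != len(preds_b):
        raise ValueError("prediction lists must have the same length")
    if not preds_a:
        return 0.0
    matches = sum(a == b for a, b in zip(preds_a, preds_b))
    return matches / len(preds_a)

# Hypothetical predictions for the same four texts.
fine_tuned_preds = ['anger', 'joy', 'sadness', 'fear']
zero_shot_preds = ['surprise', 'joy', 'sadness', 'anger']

rate = agreement_rate(fine_tuned_preds, zero_shot_preds)
# Keep the disagreements: these are the interesting cases to read closely.
disagreements = [
    (i, a, b)
    for i, (a, b) in enumerate(zip(fine_tuned_preds, zero_shot_preds))
    if a != b
]
print(rate, disagreements)
```

Agreement alone does not say which model is right; checking the disagreements against the true labels in the test split would show that.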

## Part 7: What’s Next?

You’ve fine-tuned a real language model and explored zero-shot reasoning. What’s next?
Here are some cool project ideas you can try:

| Task | Dataset | Link | Description |
|------|---------|------|-------------|
| Emotion Detection | `emotion` | [🔗](https://huggingface.co/datasets/emotion) | Classify tweets by emotional tone |
| Sarcasm Detection | `sarcasm` | [🔗](https://huggingface.co/datasets/sarcasm) | Detect whether something is sarcastic |
| Toxic Comments | `jigsaw_toxicity_pred` | [🔗](https://huggingface.co/datasets/jigsaw_toxicity_pred) | Label toxic or abusive language |
| Tweet Tasks | `tweet_eval` | [🔗](https://huggingface.co/datasets/tweet_eval) | Sentiment, hate, stance, emojis |
| News Topics | `ag_news` | [🔗](https://huggingface.co/datasets/ag_news) | Classify short news into topics |
| Fake News | `liar` | [🔗](https://huggingface.co/datasets/liar) | Rate the truthfulness of political statements (six levels) |
| Dialogue Emotion | `daily_dialog` | [🔗](https://huggingface.co/datasets/daily_dialog) | Intent and emotion in conversations |
| Hate Speech | `hate_speech18` | [🔗](https://huggingface.co/datasets/hate_speech18) | Detect hate in text posts |

## 🌍 How to Keep Exploring

Now that you've fine-tuned and evaluated your first model, here’s how you can take your skills even further:

- 🔎 **Explore new datasets**: [Hugging Face Datasets Hub](https://huggingface.co/datasets)
- 🧠 **Try different models**: [Hugging Face Models Hub](https://huggingface.co/models)
- 📚 **Read the docs**: [Transformers Documentation](https://huggingface.co/docs/transformers)
- 💻 **Use GPUs for free**: Google Colab provides a free GPU runtime — perfect for experimenting.
- 🚀 **Share your work**: Upload your model to [Hugging Face Hub](https://huggingface.co/docs/hub) or build a simple demo with [Gradio](https://www.gradio.app/) or [Hugging Face Spaces](https://huggingface.co/spaces).

## 💡 Project Ideas by Theme

If you're looking for inspiration, here are a few directions you can take your next model:

| Theme | Ideas |
|-------|-------|
| 💬 Emotion / Mental Health | Detect mood in journaling apps, supportive chatbot for students |
| ⚠️ Content Moderation | Toxic comment filter, hate speech detector, safe content assistant |
| ✨ Creativity | Poetry classifier, sarcasm generator, meme captioner |
| 🎮 Games & Fun | NPC emotion predictor, story tone classifier, in-game chat sentiment |
| 📢 Advocacy | Fake news detector, flag biased language, classify protest messaging |
| 🗣️ Cultural NLP | Analyze slang, code-switching detection, dialect identification |