<a href="https://colab.research.google.com/github/muhammadibrahim313/genai-cohort-labs/blob/main/fine%20tuning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 🚀 Complete Fine-Tuning Guide: Teaching AI Your Language
## From Pre-trained Models to Custom Solutions - A Step-by-Step Journey

---

### 👋 Welcome to Fine-Tuning!

Today we're going to take a journey from a general AI model to YOUR specialized AI assistant.

---

### 🎯 What Will We Achieve Today?

By the end of this session, you will:
1. **Understand** what fine-tuning really means (with lots of examples!)
2. **Learn** every step of the fine-tuning process
3. **Build** your own sentiment analyzer for movie reviews
4. **Deploy** your model so anyone can use it
5. **Test** with real examples and see it work!

---

## 🤔 Part 1: What is Fine-Tuning? Let's Really Understand It!

### 🎓 The University Analogy

Imagine you have a university graduate (that's our pre-trained model):
- ✅ They know general stuff: math, science, language
- ❌ But they don't know YOUR specific job

Fine-tuning is like:
- 📚 Giving them specialized training for YOUR company
- 🎯 Teaching them YOUR specific terminology
- 💼 Making them an expert in YOUR field

---

### 🔌 For Electrical Engineers: The Specialist Analogy

Think of it this way:

```
General Electrician (Pre-trained Model):
  - Knows: Basic wiring, safety, circuits
  - Can do: General electrical work
  
        ⬇️ FINE-TUNING ⬇️
        
Power Grid Specialist (Your Fine-tuned Model):
  - Knows: Transformer stations, grid management, load balancing
  - Expert in: YOUR specific area
```

---

## 📊 Real Examples: Before vs After Fine-Tuning

Let's see what happens when we fine-tune:

### Example 1: General vs Specialized

| Input Text | General Model Says | Fine-tuned Model Says | Why? |
|------------|-------------------|----------------------|------|
| "The transformer failed" | "Something changed form" 🤷 | "Equipment malfunction" ⚡ | Knows electrical context |
| "High resistance in circuit" | "Someone is resisting" 🤷 | "Electrical issue detected" ⚡ | Understands terminology |
| "The plot was shocking" | "Electricity involved?" 🤷 | "Surprising story" 🎬 | Trained on movie reviews |

### Example 2: Confidence Levels

For the text: "This movie is absolutely terrible" (the confidence values below are illustrative, not measured):

- **General Model**:
  - Negative: 65% confidence
  - "I think it's negative but not sure"
  
- **Fine-tuned Model**:
  - Negative: 98% confidence
  - "Definitely negative, I've seen thousands of movie reviews!"

---

## 🔄 The Complete Fine-Tuning Process - Visual Guide

Here's what we'll do today, step by step:

```
STEP 1: GET A SMART MODEL
    📦 Download pre-trained model (already knows language)
         |
         v
STEP 2: PREPARE YOUR DATA
    📝 Get your movie reviews ready
    🏷️ Label them (positive/negative)
         |
         v
STEP 3: TEACH THE MODEL
    🎓 Show examples: "This text = Positive"
    🔁 Repeat many times (epochs)
    📈 Model learns patterns
         |
         v
STEP 4: TEST IT
    🧪 Give new reviews
    ✅ Check if predictions are correct
         |
         v
STEP 5: DEPLOY
    🚀 Upload to cloud
    🌍 Anyone can use it!
```

---

## ❓ Why Not Train From Scratch?

Great question! Let's compare. The figures below are rough orders of magnitude for a model of this kind, not measurements:

### 🆚 Training From Scratch vs Fine-Tuning

| Aspect | Training From Scratch | Fine-Tuning | Winner |
|--------|----------------------|-------------|--------|
| **Data Needed** | 10+ million examples | 500-5000 examples | Fine-tuning 🏆 |
| **Time Required** | Weeks/Months | Hours/Minutes | Fine-tuning 🏆 |
| **Cost** | $10,000+ | $0-10 | Fine-tuning 🏆 |
| **Expertise Needed** | PhD level | Beginner friendly | Fine-tuning 🏆 |
| **Results** | Uncertain | Usually great | Fine-tuning 🏆 |

### 💰 Real Cost Example:
- **From Scratch**: Like building a whole university = $$$$$
- **Fine-tuning**: Like a training workshop = $

---

## 🛠️ Let's Start! First, Install What We Need

### 📦 Required Libraries Explained:

- **transformers**: The magic library from Hugging Face
- **datasets**: For handling our data efficiently
- **torch**: PyTorch for the actual training
- **scikit-learn**: For splitting data and metrics
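- **accelerate**: Handles device placement (CPU/GPU) for the training loop
- **evaluate**: For measuring model performance
- **huggingface_hub**: For uploading the finished model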
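
Before installing, it can help to see which of these packages are already present, since hosted notebooks often ship with several of them. The sketch below uses only the standard library; the helper name `is_installed` is an assumption made for illustration and is not part of any library listed above.

```python
import importlib.util

def is_installed(module_name: str) -> bool:
    """Return True if `module_name` can be imported in this environment."""
    return importlib.util.find_spec(module_name) is not None

# Import names can differ from pip names: scikit-learn imports as `sklearn`.
required = ["transformers", "datasets", "torch", "sklearn", "accelerate"]
status = {name: is_installed(name) for name in required}

for name, present in status.items():
    print(f"{name:<14} {'already installed' if present else 'missing'}")
```

Anything reported as missing is what the install cell needs to add; the rest is simply re-checked by `pip`.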

Let's install everything:

In [None]:
# 📦 INSTALLATION CELL - Run this first!
# This installs all the AI libraries we need

print("🚀 Starting installation of required packages...\n")
print("This will take about 1-2 minutes...\n")

# Install transformers - this is the main library for fine-tuning
!pip install -q transformers
print("✅ Transformers installed - this gives us pre-trained models")

# Install datasets - for efficient data handling
!pip install -q datasets
print("✅ Datasets installed - for managing our training data")

# Install accelerate - makes training faster
!pip install -q accelerate
print("✅ Accelerate installed - for faster training")

# Install evaluate - for measuring performance
!pip install -q evaluate
print("✅ Evaluate installed - to measure how good our model is")

# Install huggingface_hub - for uploading our model
!pip install -q huggingface_hub
print("✅ Hugging Face Hub installed - for model deployment")

# Install scikit-learn - for data splitting and metrics
!pip install -q scikit-learn
print("✅ Scikit-learn installed - for data management")

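# Install PyTorch, the deep learning framework.
# Colab already ships with PyTorch, so this line mainly matters for local setups.
# The cu118 index serves CUDA 11.8 wheels; pick the index that matches your CUDA version.
!pip install -q torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118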
print("✅ PyTorch installed - the engine that powers everything")

print("\n" + "="*50)
print("🎉 ALL PACKAGES INSTALLED SUCCESSFULLY!")
print("="*50)
print("\n📝 Note: You might see some warnings - that's normal!")

## 📚 Now Let's Import Everything We Need

### What Each Import Does:

- **pandas**: For handling our data like Excel sheets
- **numpy**: For numerical operations
- **torch**: The deep learning engine
- **transformers**: For loading and training models
- **sklearn**: For splitting data and calculating metrics

In [None]:
# 📚 IMPORT CELL - Load all our tools
# Each import is explained so you know what it does!

print("📚 Importing libraries...\n")

# Data handling libraries
import pandas as pd  # Like Excel for Python
import numpy as np   # For numerical operations
print("✅ Data libraries imported")

# Deep learning framework
import torch  # PyTorch - the engine for our AI
print("✅ PyTorch imported")

# Hugging Face transformers - the star of the show!
from transformers import (
    AutoTokenizer,  # Converts text to numbers
    AutoModelForSequenceClassification,  # The actual model
    TrainingArguments,  # Settings for training
    Trainer,  # The training manager
    DataCollatorWithPadding  # Makes all texts same length
)
print("✅ Transformers components imported")

# Dataset handling
from datasets import Dataset, DatasetDict  # Efficient data handling
print("✅ Dataset tools imported")

# For splitting data and measuring performance
from sklearn.model_selection import train_test_split  # Split data into train/test
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
print("✅ Scikit-learn tools imported")

# Visualization libraries
import matplotlib.pyplot as plt  # For creating plots
import seaborn as sns  # For pretty visualizations
print("✅ Visualization tools imported")

# Ignore warnings to keep output clean
import warnings
warnings.filterwarnings('ignore')
print("✅ Warning filter set")

print("\n" + "="*50)
print("🎉 All libraries imported successfully!")
print("="*50)

## 🖥️ Check If We Have a GPU (Graphics Card)

### Why GPU Matters:
- **GPU** = Graphics Processing Unit (very fast for AI)
- **CPU** = Regular processor (slower for AI)

Think of it like:
- **GPU**: Like having 1000 workers doing simple tasks = FAST! ⚡
- **CPU**: Like having 8 smart workers doing complex tasks = SLOWER 🐢

Don't worry if you don't have GPU - it will still work!

In [None]:
# 🖥️ CHECK HARDWARE CELL - See what computer power we have

print("🔍 Checking available hardware...\n")

# Check if CUDA (GPU support) is available
if torch.cuda.is_available():
    # We have a GPU! This is good news
    device = torch.device('cuda')
    print("🎮 Great news! GPU is available!")
    print(f"📊 GPU Name: {torch.cuda.get_device_name(0)}")
    print(f"💾 GPU Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")
    print("\n⚡ Training will be FAST!")

    # Estimate training time
    estimated_time = "5-10 minutes"
else:
    # No GPU, we'll use CPU
    device = torch.device('cpu')
    print("💻 No GPU found - using CPU")
    print("⏰ Training will be slower but will still work!")

    # Estimate training time
    estimated_time = "15-30 minutes"

print(f"\n⏱️ Estimated training time: {estimated_time}")
print(f"📍 Device selected: {device}")

# Fun fact about GPUs
print("\n💡 Fun Fact:")
print("GPUs were originally made for video games, but turned out")
print("to be perfect for AI because both need lots of parallel processing!")

## 📚 Understanding Pre-trained Models

### 🤖 What Models Can We Choose?

Think of these like different types of vehicles:

| Model | Size | Speed | Accuracy | Best For | Like a... |
|-------|------|-------|----------|----------|----------|
| **DistilBERT** | 66M params | ⚡ Very Fast | Good | Quick tests, demos | Sports car 🏎️ |
| **BERT-base** | 110M params | 🏃 Medium | Better | Production use | SUV 🚙 |
| **RoBERTa-base** | 125M params | 🏃 Medium (same architecture as BERT-base) | Often best | When accuracy matters | Truck 🚛 |
| **ALBERT-base** | 12M params | 🏃 Medium (shared weights save memory, not compute) | Okay | Tight memory budgets | Motorcycle 🏍️ |

### 🎯 Why We're Using DistilBERT:

1. **Fast** - Perfect for our 2-hour session
2. **Good Performance** - Retains about 97% of BERT's benchmark scores while being 40% smaller
3. **Small** - Fits in Google Colab free tier
4. **Easy** - Simple to understand

### 📏 What Does "66M Parameters" Mean?

Parameters are like brain cells:
- More parameters = more capacity to learn, but slower and heavier on memory
- 66 million = 66,000,000 adjustable values
- Each one learns something about language!
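
To make the 66 million figure concrete, the count can be reproduced by hand from the layer shapes. The sketch below assumes the published DistilBERT-base configuration (6 layers, hidden size 768, feed-forward size 3,072, a 30,522-token vocabulary, 512 positions). Those numbers come from the model's configuration rather than from this notebook, so treat this as back-of-the-envelope arithmetic.

```python
# Assumed DistilBERT-base configuration (taken from the published model config).
VOCAB, POSITIONS, HIDDEN, FFN, LAYERS = 30_522, 512, 768, 3_072, 6

def linear(n_in: int, n_out: int) -> int:
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

layer_norm = 2 * HIDDEN  # one scale and one shift value per hidden unit

embeddings = VOCAB * HIDDEN + POSITIONS * HIDDEN + layer_norm
attention = 4 * linear(HIDDEN, HIDDEN)  # query, key, value, output projections
feed_forward = linear(HIDDEN, FFN) + linear(FFN, HIDDEN)
per_layer = attention + feed_forward + 2 * layer_norm

total = embeddings + LAYERS * per_layer
print(f"Embeddings:        {embeddings:>12,}")
print(f"Per encoder layer: {per_layer:>12,}")
print(f"Total:             {total:>12,}")
print(f"Memory at 4 bytes per value: ~{total * 4 / 1024**2:.0f} MB")
```

Roughly a third of the values sit in the embedding table alone. The small classification head that fine-tuning adds on top is not included in this count.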

---

# 🎬 Part 2: Let's Build Our Movie Sentiment Analyzer!

## Now the fun begins - hands-on time!

---

## 📊 Step 1: Load Our Movie Review Data

### What We're Loading:
- **IMDB Dataset**: 50,000 movie reviews
- **Labels**: Positive or Negative sentiment
- **Goal**: Teach our model to understand movie opinions

### Data Loading Priority:
1. First try: Load cleaned data from previous session
2. Second try: Load original IMDB data
3. Fallback: Create sample data for demo
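
The priority order above is a "first source that exists wins" pattern. Here is a minimal standard-library sketch of it; the helper name `first_existing` is hypothetical, and the two paths are the ones this notebook expects on Colab.

```python
from pathlib import Path
from typing import Optional, Sequence

def first_existing(paths: Sequence[str]) -> Optional[Path]:
    """Return the first path that exists on disk, or None if none do."""
    for candidate in paths:
        path = Path(candidate)
        if path.is_file():
            return path
    return None

# Candidates in priority order: cleaned data first, then the raw dataset.
candidates = ["/content/IMDB_cleaned.csv", "/content/IMDB Dataset.csv"]
source = first_existing(candidates)

if source is None:
    print("No CSV found, falling back to built-in sample data")
else:
    print(f"Loading reviews from {source}")
```

Keeping the candidates in a list makes the priority explicit and easy to extend, for example with a path on a mounted drive.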

In [None]:
# 📊 DATA LOADING CELL - Get our movie reviews ready

print("📂 Loading movie review data...\n")

# We'll try three different ways to get data
data_loaded = False

# ATTEMPT 1: Try to load cleaned data from previous session
try:
    df = pd.read_csv('/content/IMDB_cleaned.csv')
    print("✅ Excellent! Found cleaned data from data cleaning session!")
    print("   This data is already preprocessed and ready.")
    text_column = 'cleaned_review'  # Use the cleaned text column
    data_loaded = True
except FileNotFoundError:
    print("❌ No cleaned data found, trying original dataset...")

# ATTEMPT 2: Try to load original IMDB dataset
if not data_loaded:
    try:
        df = pd.read_csv('/content/IMDB Dataset.csv')
        print("✅ Found original IMDB dataset!")
        print("   Note: Using raw reviews (not cleaned)")
        text_column = 'review'  # Use the raw review column
        data_loaded = True
    except FileNotFoundError:
        print("❌ No IMDB dataset found either...")

# ATTEMPT 3: Create sample data for demonstration
if not data_loaded:
    print("⚠️ Creating sample data for demonstration...")
    print("   (Upload 'IMDB Dataset.csv' for real training)\n")

    # Create diverse sample reviews
    positive_reviews = [
        "This movie was absolutely fantastic! Best film I've seen all year!",
        "Amazing cinematography and brilliant acting. A masterpiece!",
        "I loved every minute of it. Highly recommend to everyone!",
        "Outstanding performance by the lead actor. Oscar-worthy!",
        "The story was captivating from start to finish. Brilliant!",
        "One of the best movies ever made. Simply perfect!",
        "Incredible visual effects and great storyline!",
        "This film touched my heart. Beautiful and moving.",
    ]

    negative_reviews = [
        "Terrible movie. Complete waste of time and money.",
        "Boring plot and awful acting. Fell asleep halfway through.",
        "Worst movie I've ever seen. Don't waste your time.",
        "Poor character development and predictable story.",
        "The dialogue was cringe-worthy. Couldn't finish it.",
        "Disappointing. Expected much better from this director.",
        "Slow pacing and confusing plot. Not recommended.",
        "Acting was wooden and story made no sense.",
    ]

    # Combine and repeat to have more samples
    all_reviews = []
    all_sentiments = []

    # Repeat each review to create a larger dataset
    for _ in range(50):  # Repeat 50 times
        all_reviews.extend(positive_reviews)
        all_reviews.extend(negative_reviews)
        all_sentiments.extend(['positive'] * len(positive_reviews))
        all_sentiments.extend(['negative'] * len(negative_reviews))

    # Create DataFrame
    df = pd.DataFrame({
        'review': all_reviews,
        'sentiment': all_sentiments
    })
    text_column = 'review'
    print(f"✅ Created {len(df)} sample reviews for demonstration")

# Display dataset information
print("\n" + "="*60)
print("📊 DATASET INFORMATION")
print("="*60)
print(f"Total reviews: {len(df):,}")
print(f"Columns: {df.columns.tolist()}")
print(f"Text column being used: '{text_column}'")

# Show sentiment distribution
print("\n🎭 Sentiment Distribution:")
sentiment_counts = df['sentiment'].value_counts()
for sentiment, count in sentiment_counts.items():
    percentage = (count / len(df)) * 100
    print(f"  {sentiment}: {count:,} reviews ({percentage:.1f}%)")

# Show sample reviews
print("\n📝 Sample Reviews:")
print("-" * 60)
for i in range(min(3, len(df))):
    review_text = df.iloc[i][text_column]
    sentiment = df.iloc[i]['sentiment']
    # Show first 80 characters of each review
    display_text = review_text[:80] + "..." if len(review_text) > 80 else review_text
    print(f"Review {i+1} ({sentiment}): {display_text}")
print("-" * 60)

## 🔢 Convert Sentiments to Numbers

### Why Convert to Numbers?

Computers don't understand "positive" or "negative" - they only understand numbers!

So we convert:
- **"negative"** → 0
- **"positive"** → 1

It's like translating English to Computer language! 🤖
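
As a minimal sketch in plain Python, the same conversion can be written with a dictionary lookup. The function name `encode_labels` is an assumption for illustration. It fails loudly on unexpected values such as a misspelled or capitalized sentiment, which a silent mapping would turn into missing labels.

```python
LABEL_MAPPING = {"negative": 0, "positive": 1}

def encode_labels(sentiments):
    """Map sentiment strings to integers, rejecting anything unexpected."""
    unknown = sorted(set(sentiments) - set(LABEL_MAPPING))
    if unknown:
        raise ValueError(f"Unexpected sentiment values: {unknown}")
    return [LABEL_MAPPING[s] for s in sentiments]

print(encode_labels(["positive", "negative", "positive"]))  # [1, 0, 1]
```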

In [None]:
# 🔢 LABEL CONVERSION CELL - Convert sentiments to numbers

print("🔄 Converting sentiments to numerical labels...\n")

# Create a mapping: negative=0, positive=1
label_mapping = {
    'negative': 0,
    'positive': 1
}

# Apply the mapping to create a new 'label' column
df['label'] = df['sentiment'].map(label_mapping)

# Verify the conversion worked
print("✅ Conversion complete!\n")
print("📊 Label Mapping:")
print("  'negative' → 0")
print("  'positive' → 1")

# Show example of the conversion
print("\n📝 Example of conversion:")
print("-" * 50)
sample_df = df[['sentiment', 'label']].head(5)
for idx, row in sample_df.iterrows():
    print(f"  {row['sentiment']} → {row['label']}")
print("-" * 50)

# Check for any issues
if df['label'].isna().any():
    print("\n⚠️ Warning: Some labels couldn't be converted!")
    print(f"   Missing labels: {df['label'].isna().sum()}")
else:
    print("\n✅ All sentiments successfully converted to numbers!")

## 📉 Select a Sample for Faster Training

### Why Use a Sample?

For learning purposes, we'll use a smaller sample:
- **Full dataset**: 50,000 reviews → Hours of training
- **Our sample**: 2,000 reviews → Minutes of training

In real production, you'd use all the data!

### The Tradeoff:
- **More Data** = Better accuracy but slower
- **Less Data** = Faster but slightly less accurate

For learning, fast is better! ⚡
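
A plain random sample does not guarantee an even mix of positive and negative reviews. If an exact balance matters, one option is to draw the same number of reviews from each class. The sketch below shows the idea on toy data using only the standard library; `balanced_sample` is a hypothetical helper, not a pandas function.

```python
import random
from collections import defaultdict

def balanced_sample(rows, label_key, per_class, seed=42):
    """Draw up to `per_class` rows from each label group, then shuffle."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for row in rows:
        groups[row[label_key]].append(row)

    sample = []
    for label in sorted(groups):
        members = groups[label]
        sample.extend(rng.sample(members, min(per_class, len(members))))
    rng.shuffle(sample)
    return sample

# Toy data: 30 positive and 10 negative reviews (deliberately lopsided).
rows = [{"review": f"review {i}", "label": 1 if i < 30 else 0} for i in range(40)]
sample = balanced_sample(rows, "label", per_class=8)
print(sum(r["label"] for r in sample), "positive of", len(sample))  # 8 positive of 16
```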

In [None]:
# 📉 SAMPLING CELL - Take a subset for faster training

print("📊 Preparing data sample for training...\n")

# Decide how many samples to use
# You can change this number!
SAMPLE_SIZE = min(2000, len(df))  # Use 2000 or all data if less than 2000

print(f"Original dataset size: {len(df):,} reviews")
print(f"Sample size for training: {SAMPLE_SIZE:,} reviews")
print(f"Sampling ratio: {(SAMPLE_SIZE/len(df)*100):.1f}%\n")

# Take a simple random sample. This does not force a 50/50 split,
# so the balance check below reports how even the classes came out.
# random_state=42 ensures reproducible results
df_sample = df.sample(n=SAMPLE_SIZE, random_state=42)

# Check the balance of our sample
positive_count = (df_sample['label'] == 1).sum()
negative_count = (df_sample['label'] == 0).sum()

print("🎭 Sample Distribution:")
print(f"  Positive reviews: {positive_count} ({positive_count/SAMPLE_SIZE*100:.1f}%)")
print(f"  Negative reviews: {negative_count} ({negative_count/SAMPLE_SIZE*100:.1f}%)")

# Check if balanced
balance_ratio = min(positive_count, negative_count) / max(positive_count, negative_count)
if balance_ratio > 0.8:
    print("\n✅ Good balance! The sample has similar amounts of positive and negative.")
else:
    print("\n⚠️ Sample is imbalanced. Model might be biased.")

# Check for missing values
missing_reviews = df_sample[text_column].isna().sum()
if missing_reviews > 0:
    print(f"\n⚠️ Found {missing_reviews} missing reviews. Removing them...")
    df_sample = df_sample.dropna(subset=[text_column])
    print(f"✅ Cleaned! Final sample size: {len(df_sample)}")
else:
    print("\n✅ No missing values found!")

print("\n" + "="*50)
print(f"📊 Final sample ready: {len(df_sample)} reviews")
print("="*50)

## 🔪 Step 2: Split Data into Training and Testing Sets

### Why Split the Data?

Imagine you're teaching someone math:
1. **Training Set (80%)**: Problems they study and practice with
2. **Test Set (20%)**: New problems for the final exam

We need to test on NEW data the model hasn't seen!

### The Split:
```
All Data (100%)
    |
    ├── Training Set (80%) - For learning
    |
    └── Test Set (20%) - For evaluation
```
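
To show what a stratified split does under the hood, here is a dependency-free sketch: it splits each class separately so both sets keep the same positive/negative ratio. The function name and toy data are assumptions for illustration. The notebook itself relies on scikit-learn's `train_test_split` for this.

```python
import random
from collections import defaultdict

def stratified_split(texts, labels, test_size=0.2, seed=42):
    """Split so that each label keeps the same share in train and test."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for index, label in enumerate(labels):
        by_label[label].append(index)

    train_idx, test_idx = [], []
    for label in sorted(by_label):
        indices = by_label[label]
        rng.shuffle(indices)
        n_test = round(len(indices) * test_size)
        test_idx.extend(indices[:n_test])
        train_idx.extend(indices[n_test:])

    rng.shuffle(train_idx)
    rng.shuffle(test_idx)

    def pick(idx, seq):
        return [seq[i] for i in idx]

    return (pick(train_idx, texts), pick(test_idx, texts),
            pick(train_idx, labels), pick(test_idx, labels))

demo_texts = [f"review {i}" for i in range(100)]
demo_labels = [1] * 60 + [0] * 40
tr_x, te_x, tr_y, te_y = stratified_split(demo_texts, demo_labels)
print(len(tr_x), len(te_x))   # 80 20
print(sum(tr_y) / len(tr_y))  # 0.6
print(sum(te_y) / len(te_y))  # 0.6
```

Both sets end up 60% positive, matching the toy dataset as a whole.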

In [None]:
# 🔪 DATA SPLITTING CELL - Divide into train and test

print("✂️ Splitting data into training and testing sets...\n")

# Extract texts and labels from our sample
texts = df_sample[text_column].tolist()  # Convert to list
labels = df_sample['label'].tolist()      # Convert to list

print(f"Total texts: {len(texts)}")
print(f"Total labels: {len(labels)}\n")

# Split the data: 80% for training, 20% for testing
# stratify ensures both sets have same ratio of positive/negative
train_texts, test_texts, train_labels, test_labels = train_test_split(
    texts,                # Our review texts
    labels,               # Our labels (0 or 1)
    test_size=0.2,        # 20% for testing
    random_state=42,      # For reproducible results
    stratify=labels       # Keep same positive/negative ratio
)

# Display the split results
print("📊 DATA SPLIT COMPLETE!")
print("="*50)

print("\n🏋️ TRAINING SET:")
print(f"  Total samples: {len(train_texts)}")
print(f"  Positive reviews: {sum(train_labels)} ({sum(train_labels)/len(train_labels)*100:.1f}%)")
print(f"  Negative reviews: {len(train_labels) - sum(train_labels)} ({(len(train_labels) - sum(train_labels))/len(train_labels)*100:.1f}%)")

print("\n🧪 TEST SET:")
print(f"  Total samples: {len(test_texts)}")
print(f"  Positive reviews: {sum(test_labels)} ({sum(test_labels)/len(test_labels)*100:.1f}%)")
print(f"  Negative reviews: {len(test_labels) - sum(test_labels)} ({(len(test_labels) - sum(test_labels))/len(test_labels)*100:.1f}%)")

# Show example from each set
print("\n📝 Example from TRAINING set:")
print(f"  Text: \"{train_texts[0][:100]}...\"")
print(f"  Label: {train_labels[0]} ({'positive' if train_labels[0] == 1 else 'negative'})")

print("\n📝 Example from TEST set:")
print(f"  Text: \"{test_texts[0][:100]}...\"")
print(f"  Label: {test_labels[0]} ({'positive' if test_labels[0] == 1 else 'negative'})")

print("\n💡 Remember: The model will NEVER see the test set during training!")
print("   We keep it hidden to fairly evaluate performance.")

## 📊 Visualize the Data Split

Let's see our data split visually!

In [None]:
# 📊 VISUALIZATION CELL - Show the split graphically

print("📊 Creating visualization of data split...\n")

# Create subplots
fig, axes = plt.subplots(1, 3, figsize=(15, 5))

# Define colors
colors = ['#e74c3c', '#2ecc71']  # Red for negative, Green for positive

# Plot 1: Overall split
split_sizes = [len(train_texts), len(test_texts)]
axes[0].pie(split_sizes, labels=['Train', 'Test'],
            autopct='%1.0f%%', colors=['#3498db', '#9b59b6'],
            startangle=90)
axes[0].set_title('Overall Data Split', fontsize=12, fontweight='bold')

# Plot 2: Training set distribution
train_pos = sum(train_labels)
train_neg = len(train_labels) - train_pos
axes[1].pie([train_neg, train_pos], labels=['Negative', 'Positive'],
            autopct='%1.0f%%', colors=colors, startangle=90)
axes[1].set_title('Training Set Sentiment', fontsize=12, fontweight='bold')

# Plot 3: Test set distribution
test_pos = sum(test_labels)
test_neg = len(test_labels) - test_pos
axes[2].pie([test_neg, test_pos], labels=['Negative', 'Positive'],
            autopct='%1.0f%%', colors=colors, startangle=90)
axes[2].set_title('Test Set Sentiment', fontsize=12, fontweight='bold')

plt.suptitle('📊 Data Split Visualization', fontsize=16, fontweight='bold')
plt.tight_layout()
plt.show()

print("\n✅ Visualization complete!")
print("📝 Note: Both sets have similar sentiment distributions - this is good!")

## 🤖 Step 3: Load the Pre-trained Model

### What's Happening Here?

We're downloading a model that already knows English!

Think of it like:
1. **Hiring an English teacher** (pre-trained model)
2. **Teaching them about movies** (fine-tuning)
3. **They become a movie critic** (specialized model)

### Model Components:

1. **Tokenizer**: Converts text → numbers
   - "great movie" → [2307, 3185]
   
2. **Model**: The actual brain
   - Takes numbers, outputs predictions

Let's load them! 🚀

In [None]:
# 🤖 MODEL LOADING CELL - Download and prepare the pre-trained model

print("🤖 LOADING PRE-TRAINED MODEL\n")
print("="*60)

# Choose which model to use
MODEL_NAME = 'distilbert-base-uncased'

print(f"📦 Model selected: {MODEL_NAME}")
print("\n📝 About this model:")
print("  - Created by: Hugging Face")
print("  - Size: ~250 MB")
print("  - Parameters: 66 million")
print("  - Language: English")
print("  - Special: 40% smaller than BERT, 97% as good!")

print("\n⬇️ Downloading model components...")
print("  (This may take 1-2 minutes on first run)\n")

# STEP 1: Load the tokenizer
print("📝 Loading tokenizer...")
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
print("✅ Tokenizer loaded successfully!")
print(f"   Vocabulary size: {tokenizer.vocab_size:,} tokens (words and word pieces)")

# STEP 2: Load the model
print("\n🧠 Loading model...")
model = AutoModelForSequenceClassification.from_pretrained(
    MODEL_NAME,
    num_labels=2,  # We have 2 classes: positive and negative
    id2label={0: "NEGATIVE", 1: "POSITIVE"},  # Map numbers to labels
    label2id={"NEGATIVE": 0, "POSITIVE": 1}   # Map labels to numbers
)
print("✅ Model loaded successfully!")

# STEP 3: Move model to appropriate device (GPU or CPU)
print(f"\n🚀 Moving model to {device}...")
model = model.to(device)
print(f"✅ Model is now on {device}")

# Calculate model statistics
total_params = sum(p.numel() for p in model.parameters())
trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)

print("\n📊 MODEL STATISTICS:")
print("="*40)
print(f"Total parameters: {total_params:,}")
print(f"Trainable parameters: {trainable_params:,}")
print(f"Frozen parameters: {total_params - trainable_params:,}")
print(f"Model size in memory: ~{total_params * 4 / 1024**2:.1f} MB")

print("\n✨ Model is ready for fine-tuning!")

## 🔤 Understanding Tokenization

### What is Tokenization?

Computers don't understand words - only numbers! Tokenization converts:

```
"This movie is great!"
         ↓
[101, 2023, 3185, 2003, 2307, 999, 102]
```

Each number represents a word or part of a word:
- 101 = [CLS] (marks the start)
- 2023 = "this"
- 3185 = "movie"
- 2003 = "is"
- 2307 = "great"
- 999 = "!"
- 102 = [SEP] (marks the end)
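
The lookup can be imitated with a tiny hand-made vocabulary. This is a toy sketch, not the real tokenizer. It reuses the IDs listed above, names the start and end markers `[CLS]` and `[SEP]` as BERT-style vocabularies do, and sends anything it does not know to `[UNK]`, whereas the real tokenizer first tries to break unknown words into smaller pieces.

```python
import re

# Tiny hand-made vocabulary using the IDs listed above; the real
# DistilBERT vocabulary has about 30,000 entries.
TOY_VOCAB = {"[UNK]": 100, "[CLS]": 101, "[SEP]": 102,
             "this": 2023, "movie": 3185, "is": 2003, "great": 2307, "!": 999}

def toy_encode(text: str) -> list:
    """Lowercase, split into words and punctuation, then look up each ID."""
    pieces = re.findall(r"\w+|[^\w\s]", text.lower())
    ids = [TOY_VOCAB.get(piece, TOY_VOCAB["[UNK]"]) for piece in pieces]
    return [TOY_VOCAB["[CLS]"], *ids, TOY_VOCAB["[SEP]"]]

print(toy_encode("This movie is great!"))
# [101, 2023, 3185, 2003, 2307, 999, 102]
print(toy_encode("This movie is dull!"))
# [101, 2023, 3185, 2003, 100, 999, 102]
```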

Let's see it in action! 👀

In [None]:
# 🔤 TOKENIZATION DEMO CELL - See how text becomes numbers

print("🔤 TOKENIZATION DEMONSTRATION\n")
print("="*60)

# Example sentences to tokenize
example_sentences = [
    "This movie is amazing!",
    "Terrible film",
    "😍 Best movie ever!!! #MustWatch"
]

print("Let's see how the tokenizer converts text to numbers:\n")

for i, sentence in enumerate(example_sentences, 1):
    print(f"Example {i}: \"{sentence}\"")
    print("-" * 50)

    # Tokenize the sentence
    tokens = tokenizer(
        sentence,
        padding=True,      # Add padding if needed
        truncation=True,   # Cut if too long
        return_tensors='pt'  # Return PyTorch tensors
    )

    # Get the token IDs
    token_ids = tokens['input_ids'][0].tolist()

    # Look up each ID's vocabulary entry ('##' marks a piece that continues a word)
    print("  Token breakdown:")
    for token_id in token_ids:
        word = tokenizer.convert_ids_to_tokens(token_id)
        # Clean up special tokens for display
        if word in ['[CLS]', '[SEP]', '[PAD]']:
            print(f"    {token_id:5d} → {word} (special token)")
        else:
            print(f"    {token_id:5d} → '{word}'")

    print(f"\n  Total tokens: {len(token_ids)}")
    print(f"  Full decoded: \"{tokenizer.decode(token_ids)}\"")
    print("\n")

print("💡 Note: [CLS] = start, [SEP] = end, [PAD] = padding")
print("   These special tokens help the model understand structure!")

## 🔢 Step 4: Tokenize All Our Data

### Now Let's Convert ALL Our Reviews!

We need to tokenize:
- Training reviews → numbers for learning
- Test reviews → numbers for evaluation

### Important Settings:
- **max_length=256**: Maximum tokens per review
  - Too short = lose information
  - Too long = slow training
- **padding=True**: Make all reviews same length
- **truncation=True**: Cut reviews that are too long
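
What padding and truncation do to a batch can be sketched in a few lines of plain Python. The helper names are hypothetical, and the sketch assumes `0` is the padding ID and that the last token of each sequence is the end marker, which is kept when a sequence is cut.

```python
def truncate(ids, max_length):
    """Keep the first tokens but preserve the final end marker."""
    if len(ids) <= max_length:
        return list(ids)
    return ids[: max_length - 1] + ids[-1:]

def pad_and_truncate(batch, max_length, pad_id=0):
    """Truncate to `max_length`, then pad every sequence to the longest one left."""
    truncated = [truncate(ids, max_length) for ids in batch]
    width = max(len(ids) for ids in truncated)
    input_ids = [ids + [pad_id] * (width - len(ids)) for ids in truncated]
    attention_mask = [[1] * len(ids) + [0] * (width - len(ids)) for ids in truncated]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

batch = [[101, 2307, 102], [101, 2023, 3185, 2003, 2307, 999, 102]]
encoded = pad_and_truncate(batch, max_length=5)
print(encoded["input_ids"])       # [[101, 2307, 102, 0, 0], [101, 2023, 3185, 2003, 102]]
print(encoded["attention_mask"])  # [[1, 1, 1, 0, 0], [1, 1, 1, 1, 1]]
```

The attention mask is what lets the model ignore the padded positions, so padding changes the shape of the batch without changing what each review says.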

In [None]:
# 🔢 TOKENIZATION CELL - Convert all texts to numbers

print("🔄 TOKENIZING ALL DATA\n")
print("="*60)

# Define tokenization function
def tokenize_function(texts):
    """
    Convert texts to tokens that the model can understand.

    Args:
        texts: List of review texts
    Returns:
        Dictionary with input_ids and attention_mask
    """
    return tokenizer(
        texts,
        padding=True,      # Add padding to make all same length
        truncation=True,   # Cut texts that are too long
        max_length=256     # Maximum length (in tokens)
    )

# TOKENIZE TRAINING DATA
print("📝 Tokenizing training data...")
print(f"   Processing {len(train_texts)} reviews...")
train_encodings = tokenize_function(train_texts)
print("✅ Training data tokenized!")
print(f"   Shape: {len(train_encodings['input_ids'])} reviews")
print(f"   Max length: {len(train_encodings['input_ids'][0])} tokens\n")

# TOKENIZE TEST DATA
print("📝 Tokenizing test data...")
print(f"   Processing {len(test_texts)} reviews...")
test_encodings = tokenize_function(test_texts)
print("✅ Test data tokenized!")
print(f"   Shape: {len(test_encodings['input_ids'])} reviews")
print(f"   Max length: {len(test_encodings['input_ids'][0])} tokens\n")

# Show what the tokenizer created
print("📊 What the tokenizer created:")
print("  1. input_ids: The token numbers")
print("  2. attention_mask: Which tokens to pay attention to")
print("     (1 = real token, 0 = padding)\n")

# Example of tokenized data
print("📝 Example of tokenized review:")
print("-" * 50)
print(f"Original: \"{train_texts[0][:100]}...\"")
print(f"\nTokenized (first 20 tokens):")
print(f"  {train_encodings['input_ids'][0][:20]}")
print(f"\nAttention mask (first 20):")
print(f"  {train_encodings['attention_mask'][0][:20]}")
print("-" * 50)

print("\n✨ All data successfully tokenized and ready for training!")

## 📦 Create Dataset Objects

### Why Dataset Objects?

The Hugging Face library needs data in a special format.
Think of it like packaging your data in the right box for shipping! 📦

Each dataset contains:
- **input_ids**: The tokenized text
- **attention_mask**: Which parts to focus on
- **labels**: The correct answers (0 or 1)
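
Conceptually, the dataset is a table stored column by column: one list per field, all of the same length, where sample `i` is element `i` of every list. A small sketch with made-up token IDs shows how the columns line up into per-sample records:

```python
columns = {
    "input_ids": [[101, 2307, 102], [101, 6659, 102]],  # illustrative IDs
    "attention_mask": [[1, 1, 1], [1, 1, 1]],
    "labels": [1, 0],
}

# Every column must describe the same number of samples.
lengths = {name: len(values) for name, values in columns.items()}
assert len(set(lengths.values())) == 1, f"Column lengths differ: {lengths}"

# Row i is built by taking element i from each column.
rows = [dict(zip(columns, values)) for values in zip(*columns.values())]
print(rows[0])
# {'input_ids': [101, 2307, 102], 'attention_mask': [1, 1, 1], 'labels': 1}
```

A length mismatch between texts and labels is one of the most common causes of errors at this step, which is why the check comes first.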

In [None]:
# 📦 DATASET CREATION CELL - Package data for training

print("📦 Creating Dataset Objects\n")
print("="*60)

# Create training dataset
print("🏋️ Creating training dataset...")
train_dataset = Dataset.from_dict({
    'input_ids': train_encodings['input_ids'],        # The tokenized text
    'attention_mask': train_encodings['attention_mask'],  # What to pay attention to
    'labels': train_labels                            # The correct answers
})
print(f"✅ Training dataset created with {len(train_dataset)} samples")

# Create test dataset
print("\n🧪 Creating test dataset...")
test_dataset = Dataset.from_dict({
    'input_ids': test_encodings['input_ids'],
    'attention_mask': test_encodings['attention_mask'],
    'labels': test_labels
})
print(f"✅ Test dataset created with {len(test_dataset)} samples")

# Show dataset structure
print("\n📊 Dataset Structure:")
print("-" * 40)
print("Each sample contains:")
print("  • input_ids: Token numbers")
print("  • attention_mask: Focus indicators")
print("  • labels: Correct sentiment (0 or 1)")

# Show an example from the dataset
print("\n📝 Example from training dataset:")
print("-" * 40)
sample = train_dataset[0]
print(f"Label: {sample['labels']} ({'POSITIVE' if sample['labels'] == 1 else 'NEGATIVE'})")
print(f"Input IDs (first 10): {sample['input_ids'][:10]}...")
print(f"Attention (first 10): {sample['attention_mask'][:10]}...")

print("\n✅ Datasets are ready for training!")
print("   These will be fed to the model during training.")

## ⚙️ Step 5: Configure Training Settings

### 🎛️ Training Parameters Explained:

Think of these like recipe instructions:

| Parameter | What it means | Our Value | Analogy |
|-----------|---------------|-----------|----------|
| **Epochs** | How many times to see all data | 3 | Reading a book 3 times |
| **Batch Size** | Samples processed together | 16 | Students in a class |
| **Learning Rate** | How fast to learn | 2e-5 | Walking speed vs running |
| **Warmup Steps** | Gentle start | 30 (about 10% of all steps) | Stretching before exercise |

### 📊 What Happens During Training:

```
Epoch 1: See all data once → Learn basics
Epoch 2: See all data again → Improve understanding  
Epoch 3: See all data again → Perfect the knowledge
```
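
The number of training steps follows directly from the dataset size, batch size, and epochs, and it limits how long the warmup can sensibly be. The sketch below assumes 1,600 training reviews and a learning rate that ramps up linearly and then decays linearly to zero. The function `linear_schedule` is written here for illustration and is not a library call.

```python
import math

def linear_schedule(step, total_steps, warmup_steps, peak_lr):
    """Learning rate at `step`: linear ramp up, then linear decay to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)

# Assumed run: 1,600 training reviews, batch size 16, 3 epochs.
steps_per_epoch = math.ceil(1600 / 16)
total_steps = steps_per_epoch * 3
print(f"{steps_per_epoch} steps per epoch, {total_steps} steps in total")

for warmup in (30, 500):
    highest = max(linear_schedule(s, total_steps, warmup, 2e-5)
                  for s in range(total_steps))
    print(f"warmup={warmup:>3}: highest learning rate reached = {highest:.2e}")
```

With a warmup of 30 steps the full rate of 2e-5 is reached early on. With a warmup longer than the run itself, training ends while the rate is still ramping up, so the model never trains at the intended speed.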

In [None]:
# ⚙️ TRAINING CONFIGURATION CELL - Set all training parameters

print("⚙️ CONFIGURING TRAINING PARAMETERS\n")
print("="*60)

# Define all training arguments
training_args = TrainingArguments(
    # WHERE TO SAVE
    output_dir='./results',              # Where to save model checkpoints

    # TRAINING SETTINGS
    num_train_epochs=3,                  # How many times to see all data
    per_device_train_batch_size=16,      # How many samples to process together
    per_device_eval_batch_size=32,       # Batch size for evaluation (can be larger)

    # LEARNING SETTINGS
    warmup_steps=30,                     # Gradual learning-rate ramp-up; must stay well below the ~300 total steps
    weight_decay=0.01,                   # Regularization to prevent overfitting
    learning_rate=2e-5,                  # How fast to learn (0.00002)

    # LOGGING SETTINGS
    logging_dir='./logs',                # Where to save training logs
    logging_steps=10,                    # Log every 10 steps

    # EVALUATION SETTINGS
    eval_strategy="epoch",               # Evaluate after each epoch (older transformers releases call this evaluation_strategy)
    save_strategy="epoch",                # Save model after each epoch

    # BEST MODEL SETTINGS
    load_best_model_at_end=True,         # Load the best model at the end
    metric_for_best_model="eval_loss",   # What metric to use for "best"
    greater_is_better=False,              # Lower loss is better

    # OTHER SETTINGS
    push_to_hub=False,                   # Don't auto-upload (we'll do manually)
    report_to="none",                    # Don't use tracking tools
)

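# Estimate total optimizer steps (single device, no gradient accumulation).
# The last partial batch of each epoch still counts as a step, so round up.
import math
steps_per_epoch = math.ceil(len(train_dataset) / training_args.per_device_train_batch_size)
total_training_steps = int(steps_per_epoch * training_args.num_train_epochs)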

print("📊 TRAINING CONFIGURATION SUMMARY:")
print("-" * 40)
print(f"📚 Epochs: {training_args.num_train_epochs}")
print(f"📦 Batch size: {training_args.per_device_train_batch_size} samples")
print(f"🎯 Learning rate: {training_args.learning_rate}")
print(f"🔥 Warmup steps: {training_args.warmup_steps}")
print(f"📈 Total training steps: ~{total_training_steps}")

print("\n⏱️ ESTIMATED TRAINING TIME:")
if device.type == 'cuda':
    estimated_time = total_training_steps * 0.5 / 60  # ~0.5 seconds per step on GPU
    print(f"  With GPU: ~{estimated_time:.1f} minutes")
else:
    estimated_time = total_training_steps * 2 / 60  # ~2 seconds per step on CPU
    print(f"  With CPU: ~{estimated_time:.1f} minutes")

print("\n💾 SAVING STRATEGY:")
print(f"  Model will be saved after each epoch to: {training_args.output_dir}")
print(f"  Best model will be selected based on: {training_args.metric_for_best_model}")

print("\n✅ Training configuration complete!")

## 📏 Step 6: Define How to Measure Success

### Metrics We'll Track:

1. **Accuracy**: What percentage did we get right?
   - Example: 90% = 9 out of 10 correct
   
2. **Loss**: How wrong were our predictions?
   - Lower is better
   - Like a golf score!

### Success Criteria:
- **Outstanding**: >95% accuracy
- **Excellent**: 90-95% accuracy
- **Very good**: 85-90% accuracy
- **Good**: 80-85% accuracy
- **Room for improvement**: <80% accuracy
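
To see how accuracy and loss relate, here is a small hand-rolled sketch that computes both from raw model scores (logits). The toy logits and labels are made up for illustration, and the helper names `softmax` and `accuracy_and_loss` are not part of the notebook; during training the `Trainer` computes the loss for us.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]  # subtract max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def accuracy_and_loss(batch_logits, labels):
    """Return (accuracy, mean cross-entropy loss) for a batch of examples."""
    correct = 0
    total_loss = 0.0
    for logits, label in zip(batch_logits, labels):
        probs = softmax(logits)
        predicted = probs.index(max(probs))   # class with the highest probability
        correct += int(predicted == label)
        total_loss += -math.log(probs[label])  # small when the true class gets high probability
    return correct / len(labels), total_loss / len(labels)


# Made-up scores for 4 reviews: [negative_score, positive_score]
toy_logits = [[2.0, -1.0], [0.5, 1.5], [1.0, 0.8], [-0.3, 2.2]]
toy_labels = [0, 1, 1, 1]

acc, loss = accuracy_and_loss(toy_logits, toy_labels)
print(f"accuracy = {acc:.2f}, loss = {loss:.4f}")
```

Accuracy only counts right or wrong, while loss also rewards being confidently right and punishes being confidently wrong. That is why the two can move differently from epoch to epoch.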

In [None]:
# 📏 METRICS CELL - Define how to measure model performance

print("📏 DEFINING EVALUATION METRICS\n")
print("="*60)

def compute_metrics(eval_pred):
    """
    Calculate metrics to evaluate model performance.

    Args:
        eval_pred: Contains predictions and true labels

    Returns:
        Dictionary with calculated metrics
    """
    # Extract predictions and labels
    predictions, labels = eval_pred

    # Get the predicted class (index of the largest logit)
    # predictions shape: (num_samples, 2) - raw logits (unnormalized scores) for each class
    predictions = np.argmax(predictions, axis=1)

    # Calculate accuracy
    accuracy = accuracy_score(labels, predictions)

    # Return metrics
    return {
        'accuracy': accuracy,
    }

print("✅ Metrics function defined!\n")

print("📊 Metrics we'll track during training:")
print("-" * 40)
print("1. ACCURACY:")
print("   • What it measures: Percentage of correct predictions")
print("   • Range: 0% to 100%")
print("   • Goal: As high as possible!")
print("\n2. LOSS:")
print("   • What it measures: How wrong the predictions are")
print("   • Range: 0 to infinity")
print("   • Goal: As low as possible!")

print("\n🎯 Success Benchmarks:")
print("   >95% accuracy = 🏆 Outstanding!")
print("   90-95% = 🎉 Excellent")
print("   85-90% = 😊 Very Good")
print("   80-85% = 👍 Good")
print("   <80% = 📈 Room for improvement")

print("\n✅ Ready to evaluate model performance!")

## 🎓 Step 7: Create the Trainer

### What is the Trainer?

The Trainer is like a personal coach for your model:
- 📚 Shows training examples
- 📝 Tests understanding
- 📈 Tracks progress
- 💾 Saves best version
- 🎯 Optimizes learning

It handles all the complex training logic for us!
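
To demystify that "complex training logic", here is a toy sketch of the same loop on a tiny one-feature classifier. It is not how `Trainer` is implemented; the model, the data, and the names `toy_train` and `mean_loss` are all made up for illustration. The shape of the loop is the point: show examples, measure the error, adjust, evaluate, keep the best.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def mean_loss(w, b, data):
    """Average cross-entropy loss of the toy model on (x, label) pairs."""
    total = 0.0
    for x, y in data:
        p = min(max(sigmoid(w * x + b), 1e-12), 1 - 1e-12)  # keep log() finite
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(data)


def toy_train(train_data, eval_data, epochs=3, batch_size=4, lr=0.5, seed=0):
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    best = {"epoch": 0, "eval_loss": float("inf"), "w": w, "b": b}
    history = []
    for epoch in range(1, epochs + 1):
        shuffled = list(train_data)
        rng.shuffle(shuffled)
        for start in range(0, len(shuffled), batch_size):
            batch = shuffled[start:start + batch_size]            # 1. show examples
            errors = [sigmoid(w * x + b) - y for x, y in batch]   # 2. how wrong are we?
            grad_w = sum(e * x for e, (x, _) in zip(errors, batch)) / len(batch)
            grad_b = sum(errors) / len(batch)
            w -= lr * grad_w                                      # 3. adjust the model
            b -= lr * grad_b
        eval_loss = mean_loss(w, b, eval_data)                    # 4. evaluate each epoch
        history.append(eval_loss)
        if eval_loss < best["eval_loss"]:                         # 5. remember the best version
            best = {"epoch": epoch, "eval_loss": eval_loss, "w": w, "b": b}
    return best, history


# Made-up data: negative x means label 0, positive x means label 1
train_data = [(-2.0, 0), (-1.5, 0), (-1.0, 0), (-0.5, 0), (0.5, 1), (1.0, 1), (1.5, 1), (2.0, 1)]
eval_data = [(-1.2, 0), (0.8, 1)]

best, history = toy_train(train_data, eval_data)
print("eval loss per epoch:", [round(value, 4) for value in history])
print(f"best epoch: {best['epoch']} (eval loss {best['eval_loss']:.4f})")
```

The real `Trainer` adds batching on GPU, the learning rate schedule, checkpointing to disk, and logging on top of this basic pattern.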

In [None]:
# 🎓 TRAINER CREATION CELL - Set up the training manager

print("🎓 CREATING THE TRAINER\n")
print("="*60)

print("📦 Assembling all components...\n")

# Create the Trainer with all our components
trainer = Trainer(
    model=model,                          # Our DistilBERT model
    args=training_args,                   # Training configuration
    train_dataset=train_dataset,          # Training data
    eval_dataset=test_dataset,            # Test data for evaluation
    tokenizer=tokenizer,                  # Tokenizer for processing text
    data_collator=DataCollatorWithPadding(tokenizer=tokenizer),  # Handles padding
    compute_metrics=compute_metrics,      # Our metrics function
)

print("✅ Trainer successfully created!\n")

print("📊 TRAINER CONFIGURATION:")
print("-" * 40)
print(f"🤖 Model: {MODEL_NAME}")
print(f"📚 Training samples: {len(train_dataset)}")
print(f"🧪 Test samples: {len(test_dataset)}")
print(f"🔄 Epochs to train: {training_args.num_train_epochs}")
print(f"📦 Batch size: {training_args.per_device_train_batch_size}")

print("\n🎯 What the Trainer will do:")
print("  1. Show model training examples")
print("  2. Calculate how wrong predictions are")
print("  3. Adjust model to improve")
print("  4. Repeat for all epochs")
print("  5. Save best version")

print("\n" + "="*60)
print("🚀 READY TO START FINE-TUNING!")
print("="*60)
print("\n⏱️ Training will begin in the next cell...")
print("☕ This is a good time to grab coffee!")

## 🚀 Step 8: FINE-TUNE THE MODEL!

### This is the Main Event! 🎯

What happens during training:
1. **Epoch 1**: Model sees all data once, starts learning patterns
2. **Epoch 2**: Reinforces learning, improves accuracy
3. **Epoch 3**: Fine-tunes understanding, perfects predictions

Watch the loss go down and accuracy go up! 📈

In [None]:
# 🚀 TRAINING CELL - Fine-tune the model!

import time

print("🚀 STARTING FINE-TUNING PROCESS!\n")
print("="*60)
print("📚 Training Details:")
print(f"  • Model: {MODEL_NAME}")
print(f"  • Training samples: {len(train_dataset)}")
print(f"  • Epochs: {training_args.num_train_epochs}")
print(f"  • Device: {device}")
print("="*60)

print("\n⏱️ Starting timer...")
print("🔄 Training in progress...\n")
print("You'll see updates every few steps:")
print("  • loss = how wrong the model is (lower is better)")
print("  • learning_rate = how fast we're learning")
print("  • epoch = which round of training\n")

# Record start time
start_time = time.time()

# START TRAINING!
train_result = trainer.train()

# Calculate training duration
training_time = time.time() - start_time
minutes = int(training_time // 60)
seconds = int(training_time % 60)

# Display results
print("\n" + "="*60)
print("🎉 FINE-TUNING COMPLETE! 🎉")
print("="*60)

print("\n📊 TRAINING SUMMARY:")
print("-" * 40)
print(f"⏱️ Total time: {minutes} minutes {seconds} seconds")
print(f"📉 Average training loss: {train_result.training_loss:.4f}")
print(f"📈 Total steps: {train_result.global_step}")
print(f"🔄 Epochs completed: {training_args.num_train_epochs}")

# Performance interpretation
if train_result.training_loss < 0.3:
    print("\n✨ Excellent! Very low loss - model learned well!")
elif train_result.training_loss < 0.5:
    print("\n👍 Good! Reasonable loss - model is performing well!")
else:
    print("\n📈 Model trained, but might benefit from more epochs!")

print("\n💾 Model checkpoints saved to: ./results")
print("🎯 Best model automatically selected based on validation loss!")

## 🧪 Step 9: Evaluate Model Performance

### Time to Test Our Model! 🎯

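Now we test on the held-out test set, which the model was never trained on.
This is the real test - like a final exam!

One caveat: the same test set was also used after each epoch to pick the best checkpoint, so treat the score as slightly optimistic.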

We'll check:
- **Accuracy**: How many did we get right?
- **Confusion Matrix**: Where did we make mistakes?
- **Classification Report**: Detailed performance metrics
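
In this notebook the Trainer receives `test_dataset` as its `eval_dataset` and picks the best checkpoint by `eval_loss`, so the test set has a small influence on which model we keep. A common remedy is a three-way split, where a separate validation set does the checkpoint picking. The sketch below only shuffles indices; the helper name `three_way_split`, the 10% fractions, and the 1,000-sample size are assumptions for illustration.

```python
import random


def three_way_split(num_samples, val_fraction=0.1, test_fraction=0.1, seed=42):
    """Shuffle indices once and cut them into train / validation / test groups."""
    indices = list(range(num_samples))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible
    n_test = int(num_samples * test_fraction)
    n_val = int(num_samples * val_fraction)
    test_idx = indices[:n_test]
    val_idx = indices[n_test:n_test + n_val]
    train_idx = indices[n_test + n_val:]
    return train_idx, val_idx, test_idx


train_idx, val_idx, test_idx = three_way_split(1000)  # hypothetical dataset size
print(f"train={len(train_idx)}, validation={len(val_idx)}, test={len(test_idx)}")
```

With a Hugging Face `Dataset`, index lists like these can be applied with `dataset.select(indices)`. The validation part would then go to the Trainer as `eval_dataset`, and the test part would be touched only once, at the very end.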

In [None]:
# 🧪 EVALUATION CELL - Test the model's performance

print("🧪 EVALUATING MODEL PERFORMANCE\n")
print("="*60)
print("Testing on unseen data...\n")

# Evaluate the model
eval_results = trainer.evaluate()

# Display results
print("📊 EVALUATION RESULTS:")
print("="*40)
print(f"✅ Accuracy: {eval_results['eval_accuracy']*100:.2f}%")
print(f"📉 Loss: {eval_results['eval_loss']:.4f}")
print("="*40)

# Interpret the results
accuracy_percent = eval_results['eval_accuracy'] * 100

print("\n🎯 PERFORMANCE INTERPRETATION:")
if accuracy_percent >= 95:
    print(f"🏆 OUTSTANDING! {accuracy_percent:.1f}% accuracy is exceptional!")
    print("   Your model is performing at professional level!")
elif accuracy_percent >= 90:
    print(f"🎉 EXCELLENT! {accuracy_percent:.1f}% accuracy is very impressive!")
    print("   Your model is ready for production use!")
elif accuracy_percent >= 85:
    print(f"😊 VERY GOOD! {accuracy_percent:.1f}% accuracy is solid!")
    print("   Your model is performing well!")
elif accuracy_percent >= 80:
    print(f"👍 GOOD! {accuracy_percent:.1f}% accuracy is respectable!")
    print("   Your model has learned the patterns!")
else:
    print(f"📈 ROOM FOR IMPROVEMENT: {accuracy_percent:.1f}% accuracy")
    print("   Consider: more data, more epochs, or different parameters")

# Compare to baseline
print("\n📊 CONTEXT:")
print("  • Random guessing: 50% accuracy (with balanced classes)")
print("  • Pre-trained model before fine-tuning: ~50% accuracy (its classification head starts out random)")
print(f"  • Your fine-tuned model: {accuracy_percent:.1f}% accuracy")
print(f"  • Improvement: +{accuracy_percent - 50:.1f} percentage points over random guessing!")

## 📊 Confusion Matrix - Where Did We Make Mistakes?

The confusion matrix shows:
- **True Negatives**: Correctly predicted negative
- **True Positives**: Correctly predicted positive
- **False Positives**: Wrongly said positive (was negative)
- **False Negatives**: Wrongly said negative (was positive)
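
Here is a tiny sketch that counts those four boxes by hand, so the heatmap in the next cell is less of a black box. The labels and predictions are made up, and `confusion_counts` is a hypothetical helper; the notebook itself uses scikit-learn's `confusion_matrix`.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count (true_neg, false_pos, false_neg, true_pos) for binary labels."""
    tn = fp = fn = tp = 0
    for truth, guess in zip(y_true, y_pred):
        if truth == positive and guess == positive:
            tp += 1   # said positive, was positive
        elif truth == positive:
            fn += 1   # said negative, was positive
        elif guess == positive:
            fp += 1   # said positive, was negative
        else:
            tn += 1   # said negative, was negative
    return tn, fp, fn, tp


# Made-up example: 0 = negative review, 1 = positive review
toy_true = [0, 0, 0, 1, 1, 1, 1, 0]
toy_pred = [0, 1, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_counts(toy_true, toy_pred)
print(f"TN={tn}  FP={fp}  FN={fn}  TP={tp}")
```

The order (TN, FP, FN, TP) matches what `cm.ravel()` returns for a two-class scikit-learn confusion matrix with labels 0 and 1.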

In [None]:
# 📊 CONFUSION MATRIX CELL - Visualize prediction errors

print("📊 CREATING CONFUSION MATRIX\n")
print("="*60)

# Get predictions for test set
print("🔮 Getting model predictions...")
predictions = trainer.predict(test_dataset)
y_pred = np.argmax(predictions.predictions, axis=1)
y_true = predictions.label_ids

# Calculate confusion matrix
cm = confusion_matrix(y_true, y_pred)

# Create visualization
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues',
            xticklabels=['Negative', 'Positive'],
            yticklabels=['Negative', 'Positive'],
            cbar_kws={'label': 'Count'},
            annot_kws={'size': 14})

plt.title('Confusion Matrix - How Well Did We Predict?', fontsize=16, fontweight='bold', pad=20)
plt.ylabel('True Label (Actual)', fontsize=12)
plt.xlabel('Predicted Label (Model\'s Guess)', fontsize=12)

# Add text boxes explaining each quadrant
plt.text(0.5, -0.15, 'Perfect diagonal = Perfect predictions!',
         ha='center', transform=plt.gca().transAxes, fontsize=10, style='italic')

plt.tight_layout()
plt.show()

# Calculate and display metrics from confusion matrix
true_neg, false_pos, false_neg, true_pos = cm.ravel()
total = true_neg + false_pos + false_neg + true_pos

print("\n📊 CONFUSION MATRIX BREAKDOWN:")
print("="*50)
print(f"✅ True Negatives:  {true_neg:4d} (Correctly predicted negative)")
print(f"✅ True Positives:  {true_pos:4d} (Correctly predicted positive)")
print(f"❌ False Positives: {false_pos:4d} (Said positive, was negative)")
print(f"❌ False Negatives: {false_neg:4d} (Said negative, was positive)")
print("="*50)
print(f"Total Correct: {true_neg + true_pos}/{total} ({(true_neg + true_pos)/total*100:.1f}%)")
print(f"Total Wrong:   {false_pos + false_neg}/{total} ({(false_pos + false_neg)/total*100:.1f}%)")

## 📋 Detailed Performance Report

### Metrics Explained:
- **Precision**: When we say positive, how often are we right?
- **Recall**: Of all actual positives, how many did we find?
- **F1-Score**: Balance between precision and recall
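
These three metrics come straight from the confusion matrix counts. The sketch below shows the arithmetic; the counts (90, 10, 30) are invented for illustration and `precision_recall_f1` is a hypothetical helper, not something the notebook defines.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall, and F1 for the positive class from raw counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # of our 'positive' calls, how many were right?
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # of the real positives, how many did we find?
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of the two
    return precision, recall, f1


# Invented counts: 90 true positives, 10 false positives, 30 false negatives
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(f"precision={precision:.3f}  recall={recall:.3f}  f1={f1:.3f}")
```

Because F1 is a harmonic mean, it sits closer to the weaker of the two numbers, so a model cannot hide poor recall behind high precision.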

In [None]:
# 📋 CLASSIFICATION REPORT CELL - Detailed metrics

print("📋 DETAILED CLASSIFICATION REPORT\n")
print("="*60)

# Generate classification report
report = classification_report(y_true, y_pred,
                              target_names=['Negative', 'Positive'],
                              digits=3)

print(report)

print("\n📖 HOW TO READ THIS REPORT:")
print("="*50)
print("PRECISION: Of all reviews we labeled as X, what % were correct?")
print("  • High precision = Few false positives")
print("  • Example: 0.90 = 90% of our 'positive' predictions were right")

print("\nRECALL: Of all actual X reviews, what % did we find?")
print("  • High recall = We found most of them")
print("  • Example: 0.85 = We found 85% of all positive reviews")

print("\nF1-SCORE: Harmonic mean of precision and recall")
print("  • Balance between precision and recall")
print("  • Higher is better (max = 1.0)")

print("\nSUPPORT: How many samples in each category")
print("  • Shows data distribution in test set")

print("\n🎯 QUICK ASSESSMENT:")
# Calculate average F1 score
report_dict = classification_report(y_true, y_pred,
                                   target_names=['Negative', 'Positive'],
                                   output_dict=True)
avg_f1 = report_dict['macro avg']['f1-score']

if avg_f1 > 0.9:
    print(f"  F1-Score of {avg_f1:.3f} = Excellent balanced performance!")
elif avg_f1 > 0.8:
    print(f"  F1-Score of {avg_f1:.3f} = Good balanced performance!")
else:
    print(f"  F1-Score of {avg_f1:.3f} = Room for improvement")

## 🎬 Step 10: Test with Real Movie Reviews!

### Let's See Our Model in Action! 🍿

Time to test with some real reviews and see how confident our model is!

In [None]:
# 🎬 TESTING CELL - Try the model with custom reviews

def predict_sentiment(text, model, tokenizer):
    """
    Predict sentiment for any text.
    Returns the sentiment label, confidence, per-class probabilities, and an emoji.
    """
    # Tokenize the input text
    inputs = tokenizer(text, return_tensors="pt",
                      truncation=True, padding=True,
                      max_length=256)

    # Put the model in inference mode (turns off dropout)
    model.eval()

    # Move inputs to the same device as the model
    inputs = {k: v.to(model.device) for k, v in inputs.items()}

    # Get prediction (no gradient calculation needed)
    with torch.no_grad():
        outputs = model(**inputs)

    # Calculate probabilities
    probs = torch.nn.functional.softmax(outputs.logits, dim=-1)
    prediction = torch.argmax(probs, dim=-1)

    # Get confidence score
    confidence = probs[0][prediction].item()

    # Determine sentiment with emoji
    if prediction.item() == 1:
        sentiment = "POSITIVE 😊"
        emoji = "👍"
    else:
        sentiment = "NEGATIVE 😞"
        emoji = "👎"

    return sentiment, confidence, probs[0].cpu().numpy(), emoji

# Test reviews - mix of easy and challenging
test_reviews = [
    # Clear positive
    "This movie was absolutely phenomenal! Best film I've seen in years!",

    # Clear negative
    "Terrible waste of time. I want my money back. Worst movie ever.",

    # Subtle positive
    "A thoughtful and well-crafted story that stays with you.",

    # Subtle negative
    "The plot was confusing and the pacing felt off throughout.",

    # Mixed sentiment (challenging)
    "Great acting but terrible script. Not sure how I feel about it.",

    # Very short
    "Loved it!",

    # Sarcastic (challenging)
    "Oh great, another superhero movie. Just what we needed.",
]

print("🎬 TESTING WITH CUSTOM MOVIE REVIEWS\n")
print("="*70)

for i, review in enumerate(test_reviews, 1):
    sentiment, confidence, probs, emoji = predict_sentiment(review, model, tokenizer)

    # Display review
    print(f"Review #{i}:")
    print(f"📝 \"{review}\"")
    print()

    # Display prediction
    print(f"🤖 Model says: {sentiment} {emoji}")
    print(f"💪 Confidence: {confidence*100:.1f}%")

    # Visual confidence bar
    bar_length = 20
    filled = int(bar_length * confidence)
    bar = '█' * filled + '░' * (bar_length - filled)
    print(f"📊 [{bar}]")

    # Detailed scores
    print(f"📈 Scores: Negative={probs[0]*100:.1f}% | Positive={probs[1]*100:.1f}%")

    # Interpretation
    if confidence > 0.95:
        print("✨ Very confident prediction!")
    elif confidence > 0.8:
        print("👍 Confident prediction")
    elif confidence > 0.6:
        print("🤔 Somewhat uncertain")
    else:
        print("😕 Very uncertain - could go either way")

    print("-"*70)

print("\n💡 Note: Lower confidence on mixed/sarcastic reviews is expected!")

## 🎮 Interactive Testing - Try Your Own Reviews!

### Your Turn to Test the Model! 🎯

Type any movie review and see what the model thinks!

In [None]:
# 🎮 INTERACTIVE TESTING CELL - User can input their own reviews

print("🎮 INTERACTIVE SENTIMENT ANALYZER")
print("="*60)
print("Test the model with your own movie reviews!")
print("Type 'quit' to exit\n")

# Example prompts to inspire users
print("💡 Try different types of reviews:")
print("  • Clear positive: 'Amazing movie!'")
print("  • Clear negative: 'Boring and slow'")
print("  • Mixed feelings: 'Good acting but weak plot'")
print("  • Your actual opinion about a movie!\n")

# Interactive loop
review_count = 0
while review_count < 5:  # Limit to 5 reviews in notebook
    user_input = input(f"\n👤 Enter review #{review_count + 1} (or 'quit'): ")

    if user_input.lower() == 'quit':
        print("\n👋 Thanks for testing the model!")
        break

    if len(user_input.strip()) < 3:
        print("⚠️ Please enter a longer review (at least 3 characters)")
        continue

    # Get prediction
    sentiment, confidence, probs, emoji = predict_sentiment(user_input, model, tokenizer)

    # Display results
    print(f"\n🤖 ANALYSIS:")
    print(f"   Sentiment: {sentiment} {emoji}")
    print(f"   Confidence: {confidence*100:.1f}%")

    # Visual confidence meter
    bar_length = 30
    filled = int(bar_length * confidence)
    bar = '█' * filled + '░' * (bar_length - filled)
    print(f"   Confidence: [{bar}]")

    # Detailed breakdown
    print(f"\n   📊 Probability Breakdown:")
    print(f"      Negative: {probs[0]*100:.1f}%")
    print(f"      Positive: {probs[1]*100:.1f}%")

    review_count += 1

    if review_count < 5:
        print("\n" + "-"*60)

if review_count == 5:
    print("\n📝 Reached 5 reviews limit. Restart cell to test more!")

print("\n" + "="*60)
print("🎉 Great job testing the model!")

## 💾 Step 11: Save Your Fine-tuned Model

### Time to Save Your Work! 📦

We'll save:
1. The model weights (the learned knowledge)
2. The tokenizer (text processor)
3. Configuration files

This lets you use the model later without retraining!

In [None]:
# 💾 SAVE MODEL CELL - Save your fine-tuned model locally

print("💾 SAVING YOUR FINE-TUNED MODEL\n")
print("="*60)

# Define save directory
save_directory = "./my_movie_sentiment_model"

print(f"📁 Save location: {save_directory}")
print("\n📦 Saving components:")

# Save the model
print("  1. Saving model weights...")
trainer.save_model(save_directory)
print("     ✅ Model saved!")

# Save the tokenizer
print("  2. Saving tokenizer...")
tokenizer.save_pretrained(save_directory)
print("     ✅ Tokenizer saved!")

print("\n✅ Model successfully saved!\n")

# Check what was saved
import os
saved_files = os.listdir(save_directory)

print("📁 SAVED FILES:")
print("-" * 40)
total_size = 0
for file in saved_files:
    file_path = os.path.join(save_directory, file)
    file_size = os.path.getsize(file_path) / (1024*1024)  # Convert to MB
    total_size += file_size
    print(f"  📄 {file:30s} ({file_size:6.1f} MB)")

print("-" * 40)
print(f"  📊 Total size: {total_size:.1f} MB")

print("\n📝 HOW TO LOAD THIS MODEL LATER:")
print("```python")
print("from transformers import AutoModelForSequenceClassification, AutoTokenizer")
print(f"model = AutoModelForSequenceClassification.from_pretrained('{save_directory}')")
print(f"tokenizer = AutoTokenizer.from_pretrained('{save_directory}')")
print("```")

print("\n✨ Your model is saved and ready to use anytime!")

## 🚀 Step 12: Deploy to Hugging Face Hub (Optional)

### Share Your Model with the World! 🌍

By uploading to Hugging Face:
- Anyone can use your model
- You get a model card page
- Free hosting for public models
- Version control included

### Prerequisites:
1. Create account at https://huggingface.co
2. Get token from https://huggingface.co/settings/tokens

In [None]:
# 🚀 DEPLOYMENT CELL - Upload to Hugging Face Hub

print("🚀 DEPLOYING TO HUGGING FACE HUB\n")
print("="*60)

print("📝 SETUP INSTRUCTIONS:")
print("1. Go to: https://huggingface.co/join (create account)")
print("2. Go to: https://huggingface.co/settings/tokens")
print("3. Click 'New token'")
print("4. Name it, select 'write' permission")
print("5. Copy the token\n")

# Login to Hugging Face
from huggingface_hub import notebook_login

print("🔐 Please login to Hugging Face:")
print("(Paste your token in the box below)\n")

# This creates a login widget
notebook_login()

In [None]:
# 📤 UPLOAD MODEL CELL - Push to Hugging Face

print("📤 UPLOADING MODEL TO HUGGING FACE\n")
print("="*60)

# Choose a name for your model
model_name = "my-imdb-sentiment-analyzer"  # Change this to your preference!

print(f"📦 Model name: {model_name}")
print("\n🌐 Uploading... (this may take 1-2 minutes)\n")

try:
    # Upload model
    print("📤 Uploading model...")
    model.push_to_hub(model_name, use_temp_dir=True)
    print("✅ Model uploaded!")

    # Upload tokenizer
    print("\n📤 Uploading tokenizer...")
    tokenizer.push_to_hub(model_name, use_temp_dir=True)
    print("✅ Tokenizer uploaded!")

    print("\n" + "="*60)
    print("🎉 SUCCESS! Model deployed to Hugging Face Hub!")
    print("="*60)

    print(f"\n🌐 Your model page: https://huggingface.co/YOUR_USERNAME/{model_name}")
    print("   (Replace YOUR_USERNAME with your Hugging Face username)")

    print("\n📝 HOW OTHERS CAN USE YOUR MODEL:")
    print("```python")
    print("from transformers import pipeline")
    print(f"classifier = pipeline('sentiment-analysis', model='YOUR_USERNAME/{model_name}')")
    print("result = classifier('This movie is amazing!')")
    print("print(result)")
    print("```")

    print("\n🎊 Congratulations! You've deployed an AI model!")

except Exception as e:
    print(f"\n⚠️ Upload failed: {str(e)}")
    print("\nTroubleshooting:")
    print("1. Make sure you're logged in with a valid token")
    print("2. Check your internet connection")
    print("3. Verify your token has 'write' permissions")
    print("\n💡 Your model is still saved locally and works fine!")

## 🎓 Final Summary - What You've Accomplished!

### 🏆 Congratulations! You've Successfully:

✅ **Loaded** a pre-trained DistilBERT model  
✅ **Prepared** movie review data for training  
✅ **Fine-tuned** the model on sentiment analysis  
✅ **Achieved** the accuracy reported in Step 9!  
✅ **Tested** with custom reviews  
✅ **Saved** your model locally  
✅ **Deployed** to Hugging Face (optional)  

### 📊 Your Model's Journey:

```
Start: Generic language model (~50% accuracy on sentiment, same as guessing)
                    ↓
         Fine-tuning with YOUR data
                    ↓
End: Specialized sentiment expert (your measured accuracy from Step 9)
```

### 🔑 Key Skills You've Learned:

1. **Transfer Learning** - Using pre-trained models
2. **Data Preparation** - Converting text to model format
3. **Fine-tuning** - Specializing models for your task
4. **Evaluation** - Measuring model performance
5. **Deployment** - Sharing models with others

### 🚀 What Can You Do Next?

#### With Your Current Model:
- Build a web app for movie review analysis
- Create an API endpoint
- Analyze movie review datasets
- Add to your portfolio!

#### Apply to Your Field (Electrical Engineering):
- **Equipment logs**: Classify failure types
- **Maintenance reports**: Predict urgency levels
- **Customer feedback**: Analyze satisfaction
- **Technical docs**: Categorize by topic

### 📚 Resources for Continued Learning:

1. **Hugging Face Course** (Free!): https://huggingface.co/course
2. **Model Hub**: https://huggingface.co/models
3. **Documentation**: https://huggingface.co/docs
4. **Community**: https://discuss.huggingface.co

### 💡 Final Thoughts:

> "You've just trained an AI model that understands human sentiment!
> This same technique powers ChatGPT, Google Search, and countless
> AI applications. You're now part of the AI revolution!"

### 🎉 Amazing Work! You're Now an AI Practitioner!

---

## Questions? Let's Discuss! 🙋‍♂️🙋‍♀️

Feel free to ask about:
- How to improve accuracy
- Different model architectures
- Applying to your specific use case
- Troubleshooting issues
- Next steps in your AI journey

**Thank you for joining this session!** 🙏