# 03: Fine-Tuning - Going Deeper

**Course:** 21CSE558T - Deep Neural Network Architectures  
**Module 4:** CNNs & Transfer Learning (Week 12)  
**Estimated Time:** 10-12 minutes  
**Prerequisites:** Notebook 02  
**Goal:** Unfreeze top layers for 92-95% accuracy

---

## 📚 What You'll Learn

In this notebook, you will:
1. Recall the feature-extraction baseline from Notebook 02 (~90% accuracy)
2. Unfreeze the top 20% of ResNet50 layers
3. Fine-tune with a lower learning rate (1e-5)
4. Aim for **92-95% accuracy** (roughly 2-5 percentage points above the baseline)

**Key Concept:** _"Carefully adjust pre-trained features for your specific task"_

---

In [None]:
# Setup
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.applications import ResNet50
from tensorflow.keras import Sequential
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense

print(f"TensorFlow version: {tf.__version__}")
print("Ready for fine-tuning!\n")

## Step 1: Load Dataset (Same as Before)

We'll use the same TF Flowers dataset.

In [None]:
# Load dataset
(train_ds, val_ds), info = tfds.load(
    'tf_flowers',
    split=['train[:80%]', 'train[80%:]'],
    as_supervised=True,
    with_info=True
)

num_classes = info.features['label'].num_classes
class_names = info.features['label'].names

# Preprocess
IMG_SIZE = 224
BATCH_SIZE = 32

def preprocess(image, label):
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    # Use ResNet50-specific preprocessing
    image = tf.keras.applications.resnet50.preprocess_input(image)
    return image, label

train_ds = train_ds.map(preprocess).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)
val_ds = val_ds.map(preprocess).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE)

print("Dataset ready!")
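**What does `preprocess_input` do?** For ResNet50, Keras uses "caffe"-style preprocessing: it reorders the channels from RGB to BGR and subtracts the ImageNet mean of each channel, without rescaling to [0, 1]. The sketch below mimics that for a single pixel. The helper name `resnet50_preprocess_pixel` is hypothetical and only for illustration; in the pipeline above the real `preprocess_input` does this for whole image tensors.

```python
# ImageNet per-channel means in BGR order, as used by the "caffe"-style
# preprocessing that Keras applies for ResNet50.
IMAGENET_BGR_MEANS = (103.939, 116.779, 123.68)


def resnet50_preprocess_pixel(rgb):
    """Mimic ResNet50 preprocessing for one RGB pixel with values in [0, 255].

    Hypothetical helper for illustration only (not part of Keras).
    """
    r, g, b = rgb
    bgr = (b, g, r)  # swap channel order: RGB -> BGR
    # Subtract each channel's mean; there is no division by 255 or by a std.
    return tuple(value - mean for value, mean in zip(bgr, IMAGENET_BGR_MEANS))


white = resnet50_preprocess_pixel((255.0, 255.0, 255.0))
black = resnet50_preprocess_pixel((0.0, 0.0, 0.0))
pure_red = resnet50_preprocess_pixel((255.0, 0.0, 0.0))

print("white    ->", white)
print("black    ->", black)
print("pure red ->", pure_red)
```

Note that the outputs are centered around zero rather than squeezed into [0, 1], which is why dividing by 255 before calling `preprocess_input` would give the network inputs it was never trained on.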

## Step 2: Build Model with Unfrozen Top Layers

**Key Difference from Notebook 02:**
- Notebook 02: ALL layers frozen
- Notebook 03: Top 20% layers UNFROZEN

**Strategy:**
1. Load ResNet50 with ImageNet weights
2. Set `base_model.trainable = True` (enable training)
3. Freeze ONLY the first 80% of layers
4. Let top 20% layers adjust to our flower images

In [None]:
# Load ResNet50
base_model = ResNet50(
    weights='imagenet',
    include_top=False,
    input_shape=(IMG_SIZE, IMG_SIZE, 3)
)

# Enable training for the base model
base_model.trainable = True

# Count total layers
total_layers = len(base_model.layers)
freeze_until = int(0.8 * total_layers)  # Freeze first 80%

# Freeze first 80% of layers
for layer in base_model.layers[:freeze_until]:
    layer.trainable = False

# Count trainable layers
trainable_layers = sum([layer.trainable for layer in base_model.layers])

print(f"Total layers: {total_layers}")
print(f"Frozen layers: {freeze_until} (80%)")
print(f"Trainable layers: {trainable_layers} (20%)")
print("\nTop 20% of ResNet50 is now unfrozen!")
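**Optional variation: keep BatchNormalization frozen.** The cell above unfreezes every layer in the top 20%, including any `BatchNormalization` layers there. A common recommendation when fine-tuning is to leave those layers frozen so their running statistics are not overwritten by a small dataset. The sketch below is one way to do that, not the method this notebook uses. The function `freeze_for_fine_tuning` and the two stand-in classes are hypothetical; the stand-ins exist only so the example runs without TensorFlow or downloaded weights, and layers are matched by class name as a simplifying assumption.

```python
def freeze_for_fine_tuning(layers, frozen_fraction=0.8, keep_batchnorm_frozen=True):
    """Freeze the first `frozen_fraction` of `layers`.

    If `keep_batchnorm_frozen` is True, BatchNormalization layers stay frozen
    even when they sit in the unfrozen top part.
    Returns the number of layers left trainable.
    """
    layers = list(layers)
    freeze_until = int(frozen_fraction * len(layers))
    for index, layer in enumerate(layers):
        # Assumption: identify batch-norm layers by their class name.
        is_batchnorm = type(layer).__name__ == "BatchNormalization"
        in_top_part = index >= freeze_until
        layer.trainable = in_top_part and not (keep_batchnorm_frozen and is_batchnorm)
    return sum(layer.trainable for layer in layers)


# Stand-in layer classes (hypothetical) so the sketch runs without TensorFlow.
class Conv2D:
    trainable = True


class BatchNormalization:
    trainable = True


# Ten stand-in layers alternating Conv2D, BatchNormalization, ...
demo_layers = [cls() for _ in range(5) for cls in (Conv2D, BatchNormalization)]

trainable_with_bn_frozen = freeze_for_fine_tuning(demo_layers)
trainable_with_bn_free = freeze_for_fine_tuning(demo_layers, keep_batchnorm_frozen=False)

print("Trainable (BatchNorm kept frozen):", trainable_with_bn_frozen)
print("Trainable (BatchNorm unfrozen):  ", trainable_with_bn_free)
```

With the real model, the equivalent call would be `freeze_for_fine_tuning(base_model.layers)` after setting `base_model.trainable = True`.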

## Step 3: Build Complete Model

In [None]:
# Build model
model = Sequential([
    base_model,
    GlobalAveragePooling2D(),
    Dense(num_classes, activation='softmax')
], name='ResNet50_FineTuning')

model.summary()
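**Reading the summary:** the only brand-new weights in this model belong to the `Dense` head. As a quick sanity check on the parameter count that `model.summary()` reports for it, the arithmetic can be sketched as below. It assumes the 2048-channel output of ResNet50's last block and the 5 classes of TF Flowers; the helper `dense_head_params` is a hypothetical name.

```python
def dense_head_params(feature_dim, num_classes):
    """Parameters in a Dense layer: one weight per (feature, class) pair plus one bias per class."""
    return feature_dim * num_classes + num_classes


RESNET50_FEATURES = 2048  # channels left after GlobalAveragePooling2D
FLOWER_CLASSES = 5        # daisy, dandelion, roses, sunflowers, tulips

head_params = dense_head_params(RESNET50_FEATURES, FLOWER_CLASSES)
print(f"Dense head parameters: {head_params:,}")  # Dense head parameters: 10,245
```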

## Step 4: Compile with LOWER Learning Rate

**CRITICAL:** Use 100× lower learning rate!

- Feature extraction (Notebook 02): LR = 1e-3 (default Adam)
- Fine-tuning (Notebook 03): LR = 1e-5 (100× smaller)

**Why lower LR?**
- Pre-trained weights already encode useful features
- We want small adjustments that adapt those features, not large updates that overwrite them
- A high LR can wipe out the learned features within a few batches!
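
To make "small adjustments" concrete, here is a minimal sketch of how far one update moves a single weight at each learning rate. It uses a plain gradient-descent step, which is a simplification: Adam rescales the gradient, but the learning rate still sets the scale of each step. The weight and gradient values are made up for illustration.

```python
def sgd_step(weight, gradient, learning_rate):
    """One plain gradient-descent update (a simplification of what Adam does)."""
    return weight - learning_rate * gradient


pretrained_weight = 0.25  # hypothetical pre-trained value
gradient = 2.0            # hypothetical gradient from one batch

step_sizes = {}
for lr in (1e-3, 1e-5):
    new_weight = sgd_step(pretrained_weight, gradient, lr)
    step_sizes[lr] = abs(new_weight - pretrained_weight)
    print(f"lr={lr:g}: weight moves by {step_sizes[lr]:.6f}")

ratio = step_sizes[1e-3] / step_sizes[1e-5]
print(f"The 1e-3 step is about {ratio:.0f}x larger than the 1e-5 step")
```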

In [None]:
# Compile with LOWER learning rate
model.compile(
    optimizer=tf.keras.optimizers.Adam(1e-5),  # 100× smaller than default!
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

print("Model compiled with learning rate: 1e-5")
print("(100× lower than feature extraction!)")

## Step 5: Fine-Tune!

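**Expected Results:**
- Baseline to beat: ~90% accuracy (feature extraction, Notebook 02)
- First epoch: may score below the baseline, because the new `Dense` head in this notebook starts from random weights
- After 10 epochs: 92-95% accuracy
- Improvement: roughly 2-5 percentage points from fine-tuning!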

**Training time:** ~5-7 minutes (slower than feature extraction)
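
For a sense of scale, the number of weight updates in this run can be worked out from the dataset size. The sketch below assumes the 3,670 images in the TFDS `tf_flowers` `train` split, the 80/20 split used in Step 1, and the batch size and epoch count from this notebook.

```python
import math

TOTAL_IMAGES = 3670  # images in the tf_flowers 'train' split
BATCH_SIZE = 32
EPOCHS = 10

train_images = TOTAL_IMAGES * 80 // 100      # 'train[:80%]'
val_images = TOTAL_IMAGES - train_images     # 'train[80%:]'

steps_per_epoch = math.ceil(train_images / BATCH_SIZE)
val_steps = math.ceil(val_images / BATCH_SIZE)
total_updates = steps_per_epoch * EPOCHS

print(f"Train images: {train_images}, val images: {val_images}")
print(f"Steps per epoch: {steps_per_epoch} (train), {val_steps} (val)")
print(f"Total weight updates over {EPOCHS} epochs: {total_updates}")
```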

In [None]:
# Train!
print("🔥 Fine-tuning ResNet50...\n")
print("Watch accuracy improve beyond 90%!\n")

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=10,  # More epochs than feature extraction
    verbose=1
)

print("\n✅ Fine-tuning complete!")

## Step 6: Compare All Three Approaches

In [None]:
# Plot results
plt.figure(figsize=(14, 5))

# Accuracy
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train', marker='o')
plt.plot(history.history['val_accuracy'], label='Val', marker='s')
plt.axhline(y=0.50, color='red', linestyle='--', label='Scratch (50%)', alpha=0.7)
plt.axhline(y=0.90, color='orange', linestyle='--', label='Feature Extraction (90%)', alpha=0.7)
plt.title('Fine-Tuning: Best Accuracy!', fontsize=14, fontweight='bold')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True, alpha=0.3)

# Loss
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train', marker='o')
plt.plot(history.history['val_loss'], label='Val', marker='s')
plt.title('Fine-Tuning: Lower Loss', fontsize=14, fontweight='bold')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Print final comparison.
# The ~50% and ~90% rows are the reference figures from Notebooks 01 and 02,
# not values measured in this run. Gains are in percentage points ("pts").
final_val_acc = history.history['val_accuracy'][-1]
gain_vs_scratch = (final_val_acc - 0.50) * 100
gain_vs_feature_extraction = (final_val_acc - 0.90) * 100

print("\n" + "="*70)
print("🎯 FINAL COMPARISON - ALL THREE APPROACHES")
print("="*70)
print(f"\n{'Approach':<30} {'Val Accuracy':<20} {'Improvement'}")
print("-"*70)
print(f"{'Notebook 01 (Scratch)':<30} {'~50%':<20} {'Baseline'}")
print(f"{'Notebook 02 (Feature Extract)':<30} {'~90%':<20} {'+40 pts'}")
print(f"{'Notebook 03 (Fine-Tuning)':<30} {f'{final_val_acc:.1%}':<20} {gain_vs_scratch:+.0f} pts")
print("-"*70)
print(f"\n✨ Fine-tuning achieved: {final_val_acc:.1%}")
print(f"   Difference vs. feature extraction: {gain_vs_feature_extraction:+.1f} pts")
print("="*70)
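
The comparison cell above reports the accuracy of the final epoch, which is not always the best one. A small helper like the following can show which epoch peaked. The name `best_epoch` is hypothetical, and the accuracy values are invented for illustration; they are not results from this notebook.

```python
def best_epoch(val_accuracies):
    """Return (1-based epoch number, accuracy) of the highest validation accuracy."""
    best_index = max(range(len(val_accuracies)), key=val_accuracies.__getitem__)
    return best_index + 1, val_accuracies[best_index]


# Hypothetical values for illustration only, not measured results.
example_val_accuracy = [0.81, 0.88, 0.91, 0.93, 0.92]

epoch, accuracy = best_epoch(example_val_accuracy)
print(f"Best epoch: {epoch} ({accuracy:.1%}); final epoch: {example_val_accuracy[-1]:.1%}")
```

After training, the same helper could be called as `best_epoch(history.history['val_accuracy'])`.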

## Step 7: When to Use Each Strategy?

**Decision Matrix:**

| Dataset Size | Time Budget | Accuracy Need | Strategy |
|--------------|-------------|---------------|----------|
| < 1,000 | Low | Moderate | Feature Extraction ⭐ |
| 1,000-10,000 | Medium | High | Fine-Tuning ⭐⭐ |
| 10,000-100,000 | High | Maximum | Fine-Tuning + More layers |
| > 100,000 | High | Maximum | Consider training from scratch |

**For our TF Flowers (about 3,670 images):**
- ✅ Fine-tuning is a good fit!
- Gets us to roughly 92-95% accuracy
- Takes only about 5-7 minutes
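
The dataset-size column of the decision matrix can be sketched as a small lookup. This is a rule of thumb, not a hard rule: the function name `suggest_strategy` is hypothetical, the exact boundary handling (for example, treating exactly 10,000 images as "Fine-Tuning") is an assumption, and the time-budget and accuracy columns are ignored here.

```python
def suggest_strategy(num_images):
    """Map a dataset size to the strategy suggested by the decision matrix."""
    if num_images < 1_000:
        return "Feature Extraction"
    if num_images <= 10_000:
        return "Fine-Tuning"
    if num_images <= 100_000:
        return "Fine-Tuning + More layers"
    return "Consider training from scratch"


for size in (500, 3_670, 50_000, 1_000_000):
    print(f"{size:>9,} images -> {suggest_strategy(size)}")
```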

---

## 🎓 Summary: What You Learned

### Fine-Tuning Strategy:
1. ✅ **Start with pre-trained model:** ResNet50 with ImageNet
2. ✅ **Unfreeze top layers:** Top 20% can adjust
3. ✅ **Use LOWER learning rate:** 1e-5 (100× smaller)
4. ✅ **Train longer:** 10 epochs instead of 5
5. ✅ **Gain an extra 2-5 points:** From ~90% to 92-95%!

### Comparison:
| Aspect | Feature Extraction | Fine-Tuning |
|--------|-------------------|-------------|
| Freeze | All layers | First 80% only |
| Learning Rate | 1e-3 (default) | 1e-5 (lower) |
| Epochs | 5 | 10 |
| Training Time | 3 min | 5-7 min |
| Accuracy | 88-92% | 92-95% |
| Best For | <1K images | 1K-10K images |

### Key Insight:
**"Fine-tuning = gentle adjustments to pre-trained features"**

- Don't destroy ImageNet knowledge
- Just adapt it slightly for your domain
- Lower LR is CRITICAL!

---

## ✅ Key Takeaways

- ✅ **When to fine-tune:** 1K-10K images, need high accuracy
- ✅ **How to fine-tune:** Unfreeze top 20%, use LR = 1e-5
- ✅ **Why lower LR:** Prevent destroying learned features
- ✅ **Expected gain:** Roughly 2-5 percentage points over feature extraction

---

## 🚀 Next Steps

**Ready to compare different models?**

👉 Open **Notebook 04** to compare:
- VGG16 (simple but large)
- ResNet50 (default choice)
- MobileNetV2 (fast and small)

Learn which model to use for which scenario!

---

**End of Notebook 03**

**Status:** ✅ Fine-tuning mastered!

**Achievement Unlocked:** 🏆 93%+ accuracy with fine-tuning

**Time spent:** ~10-12 minutes

**Next:** Notebook 04 - Model Zoo 🦁