# 01: The Problem - Why Transfer Learning?

**Course:** 21CSE558T - Deep Neural Network Architectures  
**Module 4:** CNNs & Transfer Learning (Week 12)  
**Estimated Time:** 5-7 minutes  
**Goal:** Understand why training from scratch fails

---

## What You'll Learn

1. Train a CNN from scratch on a small dataset
2. Observe poor accuracy and overfitting
3. Understand the data hunger problem
4. See why we need transfer learning

---

In [1]:
# Setup
import tensorflow as tf
import tensorflow_datasets as tfds
import numpy as np
import matplotlib.pyplot as plt

print(f"TensorFlow version: {tf.__version__}")
print(f"GPU available: {len(tf.config.list_physical_devices('GPU')) > 0}")
print("\nSetup complete!")

TensorFlow version: 2.16.2
GPU available: False

Setup complete!


## Step 1: Load Dataset

We'll use the TensorFlow Flowers dataset (`tf_flowers`):
- Total: 3,670 images
- Classes: 5 (daisy, dandelion, roses, sunflowers, tulips)
- **Question:** Can we train a CNN from scratch with only about 3,000 training images?

In [2]:
# Load flowers dataset
print("Loading TF Flowers dataset...")
(train_ds, val_ds), info = tfds.load(
    'tf_flowers',
    split=['train[:80%]', 'train[80%:]'],
    as_supervised=True,
    with_info=True
)

num_classes = info.features['label'].num_classes
class_names = info.features['label'].names

print(f"\nClasses: {num_classes}")
print(f"Names: {class_names}")
print(f"Training samples: {info.splits['train[:80%]'].num_examples:,}")
print(f"Validation samples: {info.splits['train[80%:]'].num_examples:,}")

Loading TF Flowers dataset...

Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to /Users/rameshbabu/tensorflow_datasets/tf_flowers/3.0.1...

Dataset tf_flowers downloaded and prepared to /Users/rameshbabu/tensorflow_datasets/tf_flowers/3.0.1. Subsequent calls will reuse this data.

Classes: 5
Names: ['dandelion', 'daisy', 'tulips', 'sunflowers', 'roses']
Training samples: 2,936
Validation samples: 734


## Step 2: Visualize Samples

In [None]:
# Show sample images
plt.figure(figsize=(12, 8))
for i, (image, label) in enumerate(train_ds.take(9)):
    plt.subplot(3, 3, i + 1)
    plt.imshow(image.numpy().astype("uint8"))
    plt.title(class_names[label.numpy()])
    plt.axis('off')
plt.suptitle('Sample Flower Images', fontsize=16)
plt.tight_layout()
plt.show()

print("Beautiful flowers! But only ~3,000 training images...")
print("Is this enough? Let's find out!")

## Step 3: Preprocess Data

In [None]:
# Preprocessing
IMG_SIZE = 128
BATCH_SIZE = 32

def preprocess(image, label):
    image = tf.image.resize(image, (IMG_SIZE, IMG_SIZE))
    image = image / 255.0
    return image, label

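# Shuffle only the training set; map in parallel and prefetch to keep the input pipeline busy
train_ds = (train_ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
            .shuffle(1000).batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE))
val_ds = (val_ds.map(preprocess, num_parallel_calls=tf.data.AUTOTUNE)
          .batch(BATCH_SIZE).prefetch(tf.data.AUTOTUNE))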

print(f"Image size: {IMG_SIZE}x{IMG_SIZE}x3")
print(f"Batch size: {BATCH_SIZE}")
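
To make these settings concrete, here is a small sketch of what they imply for one epoch. It assumes the 80/20 split of the 3,670 images from Step 1 and uses a random NumPy array as a stand-in for a real resized image. The names `n_train`, `n_val`, and `fake_image` are illustrative and not part of the notebook.

```python
import math
import numpy as np

TOTAL_IMAGES = 3670          # size of tf_flowers, as stated in Step 1
IMG_SIZE, BATCH_SIZE = 128, 32

# 80/20 split, mirroring split=['train[:80%]', 'train[80%:]']
n_train = TOTAL_IMAGES * 80 // 100
n_val = TOTAL_IMAGES - n_train

# Batches per epoch (the last batch may be smaller than BATCH_SIZE)
train_steps = math.ceil(n_train / BATCH_SIZE)
val_steps = math.ceil(n_val / BATCH_SIZE)

# Stand-in for one resized image: uint8 pixels in [0, 255] (assumption: random data)
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)
scaled = fake_image.astype(np.float32) / 255.0   # same scaling as preprocess()

print(f"Train images: {n_train}, batches per epoch: {train_steps}")
print(f"Val images:   {n_val}, batches per epoch: {val_steps}")
print(f"Scaled pixel range: [{scaled.min():.2f}, {scaled.max():.2f}]")
```

With 10 epochs, the model sees each training image only 10 times, which is part of why a network trained from scratch struggles here.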

## Step 4: Build Simple CNN

In [None]:
# Build CNN from scratch
# Keras 3 (bundled with TF 2.16) prefers an explicit Input layer over input_shape=
model = tf.keras.Sequential([
    tf.keras.Input(shape=(IMG_SIZE, IMG_SIZE, 3)),
    tf.keras.layers.Conv2D(32, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(64, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Conv2D(128, 3, activation='relu'),
    tf.keras.layers.MaxPooling2D(2),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(num_classes, activation='softmax')
], name='CNN_From_Scratch')

model.compile(
    optimizer='adam',
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy']
)

model.summary()
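
As a rough cross-check of `model.summary()`, the parameter count can be worked out by hand. The sketch below assumes the Keras defaults used above (`'valid'` padding and stride 1 for `Conv2D`, 2x2 pooling with floor rounding for `MaxPooling2D`) and takes 2,936 as the training-set size (80% of 3,670). The helper names are illustrative, not Keras APIs.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def conv_params(in_ch, out_ch, kernel=3):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weight matrix plus one bias per output unit."""
    return n_in * n_out + n_out

IMG_SIZE, NUM_CLASSES, N_TRAIN = 128, 5, 2936  # N_TRAIN: 80% of 3,670 images

size, channels, total = IMG_SIZE, 3, 0
for filters in (32, 64, 128):
    total += conv_params(channels, filters)
    size = conv_out(size) // 2      # 3x3 conv, then 2x2 max pooling (floor division)
    channels = filters

flat = size * size * channels       # length of the Flatten output
total += dense_params(flat, 128) + dense_params(128, NUM_CLASSES)

print(f"Flatten output: {flat:,} features")
print(f"Trainable parameters: {total:,}")
print(f"Parameters per training image: {total / N_TRAIN:,.0f}")
```

Almost all of the weights sit in the first `Dense` layer, and the model has on the order of a thousand trainable parameters per training image, which makes memorizing the training set easy.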

## Step 5: Train From Scratch

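**Prediction:** With only about 3,000 training images, expect:
- Low validation accuracy (45-55%; random guessing over 5 classes gives 20%)
- Severe overfitting (training accuracy far above validation accuracy)
- Results that are not good enough for real use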

In [None]:
# Train
print("Training CNN from scratch...\n")
print("This will take 2-3 minutes.\n")

history = model.fit(
    train_ds,
    validation_data=val_ds,
    epochs=10,
    verbose=1
)

print("\nTraining complete!")

## Step 6: Analyze Results

In [None]:
# Plot results
plt.figure(figsize=(14, 5))

plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'], label='Train', marker='o')
plt.plot(history.history['val_accuracy'], label='Val', marker='s')
plt.title('Accuracy (Training from Scratch)')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()
plt.grid(True, alpha=0.3)

plt.subplot(1, 2, 2)
plt.plot(history.history['loss'], label='Train', marker='o')
plt.plot(history.history['val_loss'], label='Val', marker='s')
plt.title('Loss (Training from Scratch)')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()
plt.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Print results
final_train_acc = history.history['accuracy'][-1]
final_val_acc = history.history['val_accuracy'][-1]
gap = final_train_acc - final_val_acc

print("\n" + "="*50)
print("FINAL RESULTS:")
print("="*50)
print(f"Training Accuracy:   {final_train_acc:.2%}")
print(f"Validation Accuracy: {final_val_acc:.2%}")
print(f"Overfitting Gap:     {gap:.2%}")
print("="*50)
print("\nTHE PROBLEM:")
print(f"  Low validation accuracy ({final_val_acc:.0%})")
print(f"  Large overfitting gap ({gap:.0%})")
print("\nDIAGNOSIS: NOT ENOUGH DATA!")
print("Training from scratch typically needs 100,000+ images.")
print("We only have about 3,000.")
print("\nSOLUTION: Transfer Learning!")
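
The cell above reports only the final epoch, but when a model overfits, validation accuracy often peaks earlier. The sketch below is a hypothetical helper (not part of the notebook) that finds the best validation epoch in a dictionary shaped like `history.history`. The example numbers are made up for illustration and are not results from this run.

```python
def summarize_history(hist):
    """Return the best validation epoch and the train/val gap at that epoch."""
    val_acc = hist["val_accuracy"]
    best = max(range(len(val_acc)), key=val_acc.__getitem__)
    return {
        "best_epoch": best + 1,                      # 1-based, as Keras prints it
        "best_val_acc": val_acc[best],
        "gap_at_best": hist["accuracy"][best] - val_acc[best],
        "final_gap": hist["accuracy"][-1] - val_acc[-1],
    }

# Made-up numbers for illustration only; pass history.history in the notebook
example = {
    "accuracy":     [0.35, 0.52, 0.63, 0.74, 0.85, 0.93],
    "val_accuracy": [0.38, 0.48, 0.53, 0.55, 0.54, 0.52],
}
summary = summarize_history(example)
print(summary)
```

In the notebook, `summarize_history(history.history)` would give the same summary for the real run. A gap that keeps growing after the best epoch is the overfitting signature this step is looking for.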

## Step 7: The Data Hunger Problem

In [None]:
# Visualize data requirements
# NOTE: illustrative values for a typical from-scratch CNN, not measured in this notebook
dataset_sizes = [500, 1000, 3000, 10000, 50000, 100000, 1000000]
accuracy_scratch = [25, 30, 50, 65, 78, 85, 92]

plt.figure(figsize=(12, 6))
plt.plot(dataset_sizes, accuracy_scratch, marker='o', linewidth=2, label='From Scratch')
plt.axvline(x=3000, color='red', linestyle='--', linewidth=2, label='Your Data (3,000)')
plt.axhline(y=50, color='orange', linestyle='--', alpha=0.5)

plt.scatter([3000], [50], color='red', s=200, zorder=5)
plt.text(3500, 53, 'You are here: 50% accuracy', fontsize=12, color='red')
plt.text(100000, 87, 'Need 100K+ images for 85%!', fontsize=11, color='green')

plt.xscale('log')
plt.xlabel('Dataset Size (images)')
plt.ylabel('Accuracy (%)')
plt.title('Data Hunger Problem: Training From Scratch')
plt.grid(True, alpha=0.3)
plt.legend()
plt.tight_layout()
plt.show()

print("\nKEY INSIGHT:")
print("In this illustrative curve, 85% accuracy takes 100,000+ images (~33x more!)")
print("\nREALITY: Most people don't have this much data!")
print("\nSOLUTION: Transfer Learning - use pre-trained models!")
print("\nNext: Notebook 02 will show about 90% accuracy with the same 3,000 images!")
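
The "33x more" figure can be read off the curve programmatically. The sketch below reuses the hard-coded values from the cell above (hand-picked illustrative numbers, not measurements) and interpolates on a log10 size axis to match the plot. The function name `images_needed` is hypothetical.

```python
import numpy as np

# Illustrative values copied from the cell above (not measurements)
dataset_sizes = np.array([500, 1000, 3000, 10000, 50000, 100000, 1000000])
accuracy_scratch = np.array([25, 30, 50, 65, 78, 85, 92])

def images_needed(target_acc):
    """Interpolate on a log10 size axis, matching the plot's log-scaled x-axis."""
    log_size = np.interp(target_acc, accuracy_scratch, np.log10(dataset_sizes))
    return 10 ** log_size

current = 3000
for target in (65, 85):
    needed = images_needed(target)
    print(f"{target}% accuracy: ~{needed:,.0f} images ({needed / current:.1f}x current data)")
```

Because the curve is only a sketch of the trend, these estimates show the order of magnitude rather than exact requirements.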

---

## Summary

### The Problem:
- Small dataset (about 3,000 training images) = poor results
- Only 45-55% validation accuracy expected
- Severe overfitting
- Training from scratch typically needs 100,000-1,000,000 images for good results

### The Question:
**How can we build accurate models without millions of images?**

### The Answer:
**Transfer Learning!** See Notebook 02

---

## Next Steps

Open **Notebook 02: Feature Extraction** to see:
- How to use ResNet50 pre-trained on ImageNet
- How to get **88-92% accuracy** with the same 3,000 images!
- The improvement: 45% → 90% (2x better!)

**Ready for the magic?** 🎉