# Day-67: Transfer Learning with Pretrained Models

Welcome to Day 67! We are diving into one of the most powerful, time-saving, and widely used techniques in computer vision: Transfer Learning! Let's smash this killer concept, guys!

Imagine you’re training a CNN from scratch — it’s like teaching a baby to recognize animals from zero. That takes time, tons of images, and lots of computation!

But what if I told you there’s a shortcut?
You can take a model that has already seen over a million images, like ResNet or MobileNet, and just adapt it to your task. That's transfer learning: the secret weapon behind a large share of modern computer vision success stories.

## Topics Covered

- What is Transfer Learning?

- Feature Extraction vs Fine-Tuning

- Pretrained Models: ResNet & MobileNet

- Practical Analogy

- Hands-on Code Example (Keras/TensorFlow)

## What is Transfer Learning?

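Transfer Learning is the idea of reusing a model that was already trained on a massive, general dataset for a smaller, specific task (like classifying hot dogs vs. not hot dogs). The usual source is ImageNet: the full collection has about 14 million images, and the pretrained weights that ship with Keras come from its 1,000-class subset of roughly 1.28 million images.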

- `Analogy`: Learning to Drive a Truck. 
    - If you already know how to drive a car (trained on general road rules and mechanics), you don't start from scratch when you learn to drive a truck. 
    - You already have the foundational skills (steering, braking, mirrors). You just need to fine-tune those skills for a bigger vehicle.

## Feature Extraction vs Fine-Tuning

### Method 1: Feature Extraction (The Quick Way)

This is the fastest method, perfect when your new dataset is small and very similar to the original ImageNet data.

`Process`: 
1. We take the entire pre-trained CNN and freeze all the Convolutional layers. 
2. We only train the final FFNN layers (the Dense layers) that we add on top.

`Analogy`:
It’s like using a trained chef’s knife skills but asking them to learn only your recipe — no need to relearn how to chop onions.

`Benefit`: Since you're only training a tiny fraction of the parameters, it trains very fast and lowers the risk of overfitting on small datasets.
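
To make the "tiny fraction of the parameters" point concrete, here is a minimal sketch that freezes a base model and counts what is left to train. It is an illustration under assumptions, not the exact setup used later: it uses MobileNetV2 because it is small, passes `weights=None` so nothing is downloaded (use `weights="imagenet"` for real transfer learning), and the 96x96 input size and the 2-class head are arbitrary choices.

```python
import math

import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import MobileNetV2

# weights=None keeps this sketch offline; pass weights="imagenet" in real use.
base = MobileNetV2(weights=None, include_top=False, input_shape=(96, 96, 3))
base.trainable = False  # freeze every layer of the convolutional base

inputs = tf.keras.Input(shape=(96, 96, 3))
x = base(inputs, training=False)  # run the frozen base in inference mode
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)  # 2 classes is an assumption
model = Model(inputs, outputs)

# Count parameters that will and will not be updated during training.
trainable = sum(math.prod(w.shape) for w in model.trainable_weights)
frozen = sum(math.prod(w.shape) for w in model.non_trainable_weights)
print(f"trainable: {trainable:,} | frozen: {frozen:,}")
```

Only the new Dense head (1280 inputs x 2 outputs, plus 2 biases) is trainable here; everything inside the base stays fixed.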

### Method 2: Fine-Tuning (The Best Performance)

This method pays off when your new dataset is reasonably large and differs somewhat from ImageNet.

`Process`: 
1. We unfreeze the last few Convolutional blocks of the pre-trained model. 
2. We then train the new Dense layers together with the unfrozen Conv layers (the rest of the base stays frozen) using an extremely low learning rate (e.g., $10^{-5}$), so the pretrained weights shift only slightly.

`Analogy`: Tweaking the Engine. You let the model slightly adjust the final, high-level features it learned (like dog noses or car doors) to better suit the specific objects in your new dataset.

`Benefit`: This usually gives the best accuracy, but it takes longer to train than simple Feature Extraction and can overfit if the dataset is too small.
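
The unfreezing step can be sketched roughly like this. Treat it as one reasonable way to do it rather than the only one: `N_UNFREEZE = 20` is a hypothetical choice, `weights=None` is used only to keep the sketch offline, and keeping the BatchNormalization layers frozen is a common precaution (updating their statistics on a small dataset can undo what the base has learned).

```python
import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import MobileNetV2

# weights=None keeps this sketch offline; pass weights="imagenet" in real use.
base = MobileNetV2(weights=None, include_top=False, input_shape=(96, 96, 3))

N_UNFREEZE = 20  # hypothetical: how many layers at the top of the base to unfreeze

base.trainable = True
for layer in base.layers[:-N_UNFREEZE]:
    layer.trainable = False  # early, generic feature layers stay frozen
for layer in base.layers[-N_UNFREEZE:]:
    if isinstance(layer, layers.BatchNormalization):
        layer.trainable = False  # keep BatchNorm layers frozen

inputs = tf.keras.Input(shape=(96, 96, 3))
x = base(inputs, training=False)  # BatchNorm runs in inference mode
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(2, activation="softmax")(x)  # 2 classes is an assumption
model = Model(inputs, outputs)

# Re-compile after changing `trainable`, with a very low learning rate.
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

In practice you would first train the new head with the base fully frozen, and only then run this step.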

## Choosing the Right Pre-trained Model (ResNet vs. MobileNet)

When selecting a model, we look at performance versus size:

| Model | Focus | Key Innovation | When to Use |
| :------ | :-------- | :------------------------------------------------------------ | :------------------------------------------------------------- |
| **ResNet (e.g., ResNet50)** | Performance | Uses **Skip Connections (Residual Blocks)** to allow training of 100+ layers without vanishing gradients. | Best accuracy; use when you have sufficient **GPU power**. |
| **MobileNet (e.g., MobileNetV2)** | Efficiency | Uses **Depthwise Separable Convolutions** to drastically reduce computation and model size. | Best for **mobile devices**, **web apps**, and **quick training** where speed matters. |
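
To see why depthwise separable convolutions shrink a model, a quick back-of-the-envelope parameter count helps. The sketch below is plain Python arithmetic; the layer sizes (3x3 kernel, 128 input channels, 256 output channels) are made-up example values, and biases are ignored.

```python
def standard_conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_separable_params(k, c_in, c_out):
    """One k x k depthwise filter per input channel, then a 1x1 pointwise conv."""
    return k * k * c_in + c_in * c_out


k, c_in, c_out = 3, 128, 256  # example layer sizes (assumed)
std = standard_conv_params(k, c_in, c_out)
sep = depthwise_separable_params(k, c_in, c_out)
print(f"standard: {std:,} | separable: {sep:,} | ratio: {std / sep:.1f}x")
# prints: standard: 294,912 | separable: 33,920 | ratio: 8.7x
```

For this one layer the separable version needs roughly 8.7 times fewer weights, which is where MobileNet's small size comes from.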


## Code Example: Feature Extraction & Fine-Tuning

```python
# Day 67 - Transfer Learning using ResNet50
# (To try MobileNetV2 instead, swap in MobileNetV2 and the preprocess_input
#  from tensorflow.keras.applications.mobilenet_v2.)
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import Adam

# Step 1: Load pretrained model (without top layers)
base_model = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

# Step 2: Freeze base layers (Feature Extraction)
for layer in base_model.layers:
    layer.trainable = False

# Step 3: Add custom layers
x = base_model.output
x = GlobalAveragePooling2D()(x)
x = Dense(128, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)  # 2 output classes

model = Model(inputs=base_model.input, outputs=predictions)

# Step 4: Compile and Train
model.compile(optimizer=Adam(learning_rate=0.001),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Data generator: use ResNet50's own preprocessing instead of rescale=1./255,
# because the ImageNet weights expect inputs in that format.
# 'data/train' must exist and contain one sub-folder per class.
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_gen = train_datagen.flow_from_directory('data/train',
                                              target_size=(224, 224),
                                              batch_size=32,
                                              class_mode='categorical')

model.fit(train_gen, epochs=5)

# Step 5: Fine-tuning (unfreeze top few layers)
for layer in base_model.layers[-10:]:
    layer.trainable = True

# Re-compile so the change in trainable layers takes effect,
# and drop the learning rate so the pretrained weights shift only slightly.
model.compile(optimizer=Adam(learning_rate=1e-5),
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.fit(train_gen, epochs=3)
```
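
`flow_from_directory` expects the training folder to exist and to hold one sub-folder per class. A small pre-flight check like the sketch below can catch a wrong path before training starts. The helper name `list_class_folders` and the class names `hot_dog` / `not_hot_dog` are made up for illustration, and the demo runs on a temporary directory rather than real data.

```python
import tempfile
from pathlib import Path


def list_class_folders(root):
    """Return the sorted class sub-folder names under root, or raise a clear error."""
    root = Path(root)
    if not root.is_dir():
        raise FileNotFoundError(f"Training directory not found: {root.resolve()}")
    classes = sorted(p.name for p in root.iterdir() if p.is_dir())
    if not classes:
        raise ValueError(f"No class sub-folders inside {root}")
    return classes


# Demo on a throwaway directory that mimics the expected layout:
#   data/train/hot_dog/...   data/train/not_hot_dog/...
with tempfile.TemporaryDirectory() as tmp:
    train_dir = Path(tmp) / "data" / "train"
    for name in ("hot_dog", "not_hot_dog"):
        (train_dir / name).mkdir(parents=True)
    demo_classes = list_class_folders(train_dir)
    print(demo_classes)  # prints: ['hot_dog', 'not_hot_dog']
```

Calling `list_class_folders('data/train')` before building the generator gives a readable error (with the resolved absolute path) if the folder is missing.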



## Summary of Day 67

Smashing job today, guys! We've unlocked the secret weapon of modern Deep Learning: Transfer Learning!

We learned the two core methods:

1. Feature Extraction: Freeze the base and train only the new Dense layers on top. The fastest method.

2. Fine-Tuning: Unfreeze the top Conv layers and train them with a tiny learning rate. Usually the most accurate method.

We also know why models like ResNet and MobileNet are the industry standard for this task!

## What's Next (Day 68)

We now have the architecture (CNN) and the method (Transfer Learning). But what if your dataset is still too small, even for Feature Extraction?

Tomorrow, on Day 68, we will learn the final technique to fight overfitting and massively increase the diversity of your data without collecting a single new image: Data Augmentation! We'll cover Flipping, Cropping, Rotation, and Noise Injection!