# 📘 TRANSFER LEARNING – PRACTICAL GUIDE (Keras + PyTorch)

### Apply AlexNet, VGG, GoogLeNet, ResNet, Inception, Xception, MobileNet to Real Projects

This notebook collects **ready-to-use templates** for transfer learning using multiple
popular CNN architectures in **TensorFlow/Keras** and **PyTorch**.

Use it as:
- Teaching material for students
- A personal reference while building image projects
- A starting point for custom transfer learning pipelines

---

## 🌍 1. What is Transfer Learning?

**Transfer Learning** means:
- Start from a **pretrained model** (usually trained on ImageNet: 1.2M images, 1000 classes).
- Reuse it as a **feature extractor** or fine-tune it for a **new task**.

Typical workflow:
1. Load a pretrained backbone (e.g., ResNet50, VGG16, MobileNetV2).
2. **Freeze** some or all of the base layers.
3. **Replace** the final classifier head with a new one for your number of classes.
4. Train on your custom dataset (cats vs dogs, medical images, etc.).
5. Optionally **unfreeze the last few backbone layers** for fine-tuning.

Transfer learning works best when:
- Your dataset is too small to train a deep network from scratch.
- Your data is somewhat similar to natural images.
- You want good accuracy without training from scratch.

## 🧱 2. General Template – Keras (TensorFlow)

Below is a **generic transfer learning template** you can adapt for any Keras application model.

In [None]:
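# Explicit imports instead of wildcards, so it is clear where each name comes from
from tensorflow.keras.applications import ResNet50
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam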

# Example: ResNet50 backbone
num_classes = 5  # change to your number of classes

base = ResNet50(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False  # Freeze backbone for feature extraction

x = GlobalAveragePooling2D()(base.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.4)(x)
out = Dense(num_classes, activation='softmax')(x)

model = Model(inputs=base.input, outputs=out)
model.compile(optimizer=Adam(1e-3), loss='categorical_crossentropy', metrics=['accuracy'])

model.summary()

## 🧱 3. General Template – PyTorch

Generic pattern for using a pretrained model (ResNet50 example) as a feature extractor.

In [None]:
import torch
import torch.nn as nn
from torchvision import models

device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

num_classes = 5  # change to your number of classes

# Load pretrained ResNet50
model = models.resnet50(weights="IMAGENET1K_V2")

# Freeze all layers
for param in model.parameters():
    param.requires_grad = False

# Replace the final fully-connected layer
model.fc = nn.Sequential(
    nn.Linear(model.fc.in_features, 256),
    nn.ReLU(),
    nn.Dropout(0.4),
    nn.Linear(256, num_classes)
)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-3)

model = model.to(device)
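
The template above stops before any training happens. A minimal sketch of one training epoch and one evaluation pass might look like the following. The helper names `train_one_epoch` and `evaluate` are illustrative (they are not part of PyTorch or torchvision), and a tiny stand-in model with random tensors is used so the block runs without downloading weights or data; in practice you would pass the `model`, `criterion`, and `optimizer` defined above together with your own loaders.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset


def train_one_epoch(model, loader, criterion, optimizer, device):
    """Run one pass over `loader` and return the mean training loss."""
    model.train()
    total_loss, n_samples = 0.0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
        total_loss += loss.item() * images.size(0)
        n_samples += images.size(0)
    return total_loss / n_samples


@torch.no_grad()
def evaluate(model, loader, device):
    """Return the top-1 accuracy of `model` on `loader`."""
    model.eval()
    correct, n_samples = 0, 0
    for images, labels in loader:
        images, labels = images.to(device), labels.to(device)
        preds = model(images).argmax(dim=1)
        correct += (preds == labels).sum().item()
        n_samples += labels.size(0)
    return correct / n_samples


# Stand-in data and model (assumptions, for illustration only).
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
demo_data = TensorDataset(torch.randn(16, 3, 32, 32), torch.randint(0, 5, (16,)))
demo_loader = DataLoader(demo_data, batch_size=8, shuffle=True)
demo_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 5)).to(device)
demo_criterion = nn.CrossEntropyLoss()
demo_optimizer = torch.optim.Adam(demo_model.parameters(), lr=1e-3)

demo_loss = train_one_epoch(demo_model, demo_loader, demo_criterion, demo_optimizer, device)
demo_acc = evaluate(demo_model, demo_loader, device)
print(f"loss={demo_loss:.4f}  acc={demo_acc:.2f}")
```

Because `nn.CrossEntropyLoss` expects raw logits, the PyTorch heads in this notebook end in a plain `nn.Linear` without a softmax, unlike the Keras heads.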

---
# 🧩 4. Transfer Learning per Architecture

For each model below, we show:
- Short notes / when to use
- Keras snippet (if available)
- PyTorch snippet

⚠ These are **skeletons** – you still need to plug in your own dataloaders / generators.

## 🔹 A. AlexNet (2012)

**Notes**
- Historically important: the first deep CNN to win the ImageNet challenge (ILSVRC 2012).
- Today, mainly used for teaching and small experiments.
- Not included in Keras Applications; available in torchvision.

### ✅ PyTorch – AlexNet Transfer Learning


In [None]:
import torch  # needed for torch.optim below
import torch.nn as nn
from torchvision import models

num_classes = 5  # your number of classes

alexnet = models.alexnet(weights="IMAGENET1K_V1")

# Freeze all features
for param in alexnet.features.parameters():
    param.requires_grad = False

# Replace the final classifier layer
in_features = alexnet.classifier[6].in_features
alexnet.classifier[6] = nn.Linear(in_features, num_classes)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(alexnet.classifier.parameters(), lr=1e-3)

### ✅ Keras – Use VGG16 as AlexNet Alternative

Keras doesn't ship AlexNet; for teaching, you can use **VGG16** as a similar
"old-school" deep CNN.

In [None]:
from tensorflow.keras.applications import VGG16
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Model

num_classes = 5

base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
base.trainable = False

x = GlobalAveragePooling2D()(base.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

vgg_model = Model(inputs=base.input, outputs=out)

## 🔹 B. ZFNet (2013)

ZFNet is not directly available in Keras or torchvision. For practical work,
we usually **replace it with VGG or ResNet**.

**Recommendation**: Use **VGG16/VGG19** or **ResNet50** instead.

## 🔹 C. VGG16 / VGG19 (2014)

**Notes**
- Deep, simple architecture (stacked 3×3 convs).
- Great as a **feature extractor**.
- Heavy: VGG16 has roughly 138M parameters, most of them in the fully connected layers.

### ✅ Keras – VGG16 Transfer Learning

In [None]:
from tensorflow.keras.applications import VGG16

num_classes = 5

vgg_base = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
vgg_base.trainable = False

x = GlobalAveragePooling2D()(vgg_base.output)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

vgg_model = Model(inputs=vgg_base.input, outputs=out)

### ✅ PyTorch – VGG16 Transfer Learning

In [None]:
from torchvision import models

num_classes = 5

vgg16 = models.vgg16(weights="IMAGENET1K_V1")

# Freeze features
for param in vgg16.features.parameters():
    param.requires_grad = False

# Replace final classifier layer
in_features = vgg16.classifier[6].in_features
vgg16.classifier[6] = nn.Linear(in_features, num_classes)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(vgg16.classifier.parameters(), lr=1e-3)

## 🔹 D. GoogLeNet / Inception v1

**Notes**
- Introduced **Inception modules** with multi-scale convolutions.
- Efficient for its time.
- In practice, we now prefer **InceptionV3** (newer) or other modern models.

### ✅ PyTorch – GoogLeNet Transfer Learning

In [None]:
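googlenet = models.googlenet(weights="IMAGENET1K_V1")

# Freeze the pretrained backbone
for param in googlenet.parameters():
    param.requires_grad = False

# Replace the final FC layer (the new layer is trainable by default)
googlenet.fc = nn.Linear(googlenet.fc.in_features, num_classes)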

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(googlenet.fc.parameters(), lr=1e-3)

### ✅ Keras – Use InceptionV3 as Successor

In [None]:
from tensorflow.keras.applications import InceptionV3

inception_base = InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
inception_base.trainable = False

x = GlobalAveragePooling2D()(inception_base.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.4)(x)
out = Dense(num_classes, activation='softmax')(x)

inception_model = Model(inputs=inception_base.input, outputs=out)

## 🔹 E. InceptionV3 (2015)

**Notes**
- High accuracy with good efficiency.
- Uses factorized convolutions (e.g., 3×3 split into 1×3 + 3×1).
- Good general-purpose backbone.

### ✅ Keras – InceptionV3 Transfer Learning

In [None]:
from tensorflow.keras.applications import InceptionV3

inception_base = InceptionV3(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
inception_base.trainable = False

x = GlobalAveragePooling2D()(inception_base.output)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

inception_model = Model(inputs=inception_base.input, outputs=out)

### ✅ PyTorch – InceptionV3 Transfer Learning

In [None]:
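# InceptionV3 expects 299x299 inputs
inception3 = models.inception_v3(weights="IMAGENET1K_V1")

# In training mode torchvision's InceptionV3 also returns auxiliary logits.
# Drop the auxiliary head so the model returns a single tensor.
inception3.aux_logits = False
inception3.AuxLogits = None

# Replace final FC
inception3.fc = nn.Linear(inception3.fc.in_features, num_classes)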

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(inception3.fc.parameters(), lr=1e-3)

## 🔹 F. Xception (2017)

**Notes**
- Based on **depthwise separable convolutions**.
- Very strong performance on many image tasks.
- Included in Keras Applications; not in torchvision, so PyTorch users rely on third-party implementations.

### ✅ Keras – Xception Transfer Learning

In [None]:
from tensorflow.keras.applications import Xception

xception_base = Xception(weights='imagenet', include_top=False, input_shape=(299, 299, 3))
xception_base.trainable = False

x = GlobalAveragePooling2D()(xception_base.output)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

xception_model = Model(inputs=xception_base.input, outputs=out)

### ✅ PyTorch

Xception is not included in `torchvision.models`. You can either:
- Use **MobileNet** / **EfficientNet** as modern lightweight alternatives, or
- Install a third-party implementation from GitHub.


## 🔹 G. ResNet50 / ResNet50V2

**Notes**
- Uses **residual connections** (skip connections).
- Stable training even when very deep.
- Excellent general-purpose backbone.
- Common default choice when you don't know what to pick.


### ✅ Keras – ResNet50V2 Transfer Learning

In [None]:
from tensorflow.keras.applications import ResNet50V2

resnet_base = ResNet50V2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
resnet_base.trainable = False

x = GlobalAveragePooling2D()(resnet_base.output)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

resnet_model = Model(inputs=resnet_base.input, outputs=out)

### ✅ PyTorch – ResNet50 Transfer Learning

In [None]:
resnet50 = models.resnet50(weights="IMAGENET1K_V2")

# Freeze all layers
for param in resnet50.parameters():
    param.requires_grad = False

# Replace FC
resnet50.fc = nn.Linear(resnet50.fc.in_features, num_classes)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(resnet50.fc.parameters(), lr=1e-3)

## 🔹 H. DenseNet

**Notes**
- Each layer receives input from **all previous layers** in a block.
- Very parameter-efficient and strong feature extractor.
- Good alternative when ResNet50 is larger than you need (DenseNet121 has roughly 8M parameters versus about 25M).


### ✅ Keras – DenseNet121 Transfer Learning

In [None]:
from tensorflow.keras.applications import DenseNet121

densenet_base = DenseNet121(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
densenet_base.trainable = False

x = GlobalAveragePooling2D()(densenet_base.output)
x = Dense(512, activation='relu')(x)
x = Dropout(0.5)(x)
out = Dense(num_classes, activation='softmax')(x)

densenet_model = Model(inputs=densenet_base.input, outputs=out)

### ✅ PyTorch – DenseNet121 Transfer Learning

In [None]:
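densenet121 = models.densenet121(weights="IMAGENET1K_V1")

# Freeze the pretrained feature extractor
for param in densenet121.features.parameters():
    param.requires_grad = False

# Replace classifier
densenet121.classifier = nn.Linear(densenet121.classifier.in_features, num_classes)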

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(densenet121.classifier.parameters(), lr=1e-3)

## 🔹 I. MobileNet (V1/V2/V3)

**Notes**
- Designed for **mobile / edge devices**.
- Uses depthwise separable convolutions.
- Excellent trade-off between speed and accuracy.


### ✅ Keras – MobileNetV2 Transfer Learning

In [None]:
from tensorflow.keras.applications import MobileNetV2

mobilenet_base = MobileNetV2(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
mobilenet_base.trainable = False

x = GlobalAveragePooling2D()(mobilenet_base.output)
x = Dense(256, activation='relu')(x)
x = Dropout(0.3)(x)
out = Dense(num_classes, activation='softmax')(x)

mobilenet_model = Model(inputs=mobilenet_base.input, outputs=out)

### ✅ PyTorch – MobileNetV2 Transfer Learning

In [None]:
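mobilenet_v2 = models.mobilenet_v2(weights="IMAGENET1K_V1")

# Freeze the pretrained feature extractor
for param in mobilenet_v2.features.parameters():
    param.requires_grad = False

# Replace classifier
in_features = mobilenet_v2.classifier[1].in_features
mobilenet_v2.classifier[1] = nn.Linear(in_features, num_classes)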

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(mobilenet_v2.classifier.parameters(), lr=1e-3)

---
# 🧪 5. Optional Fine-Tuning Stage

After training only the top classifier head, you might want to **unfreeze the last few layers of the backbone**
to adapt the features to your domain. This is called **fine-tuning**.

Typical workflow:
1. Train with base frozen → stabilize classifier.
2. Unfreeze last N layers of the base.
3. Retrain with a **smaller learning rate**.

### 🔁 Keras Fine-Tuning Example


In [None]:
# Assume `base` is your pretrained backbone (e.g., ResNet50) and `model` is built.

# 1. Unfreeze the last 30 layers of the base model
for layer in base.layers[:-30]:
    layer.trainable = False
for layer in base.layers[-30:]:
    layer.trainable = True

# 2. Compile with a lower learning rate for fine-tuning
model.compile(optimizer=Adam(1e-5), loss='categorical_crossentropy', metrics=['accuracy'])

### 🔁 PyTorch Fine-Tuning Example (ResNet)

In [None]:
# Example: fine-tuning last residual block (layer4) of ResNet50

# 1. Freeze all
for param in resnet50.parameters():
    param.requires_grad = False

# 2. Unfreeze layer4 and the classifier head (step 1 froze `fc` as well)
for module in (resnet50.layer4, resnet50.fc):
    for param in module.parameters():
        param.requires_grad = True

# 3. Optimize only trainable parameters
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, resnet50.parameters()), lr=1e-5)

---
# 📌 6. Data Loading – Keras vs PyTorch

Here are quick templates to load image data from **folders** for transfer learning.

### 🗂 Keras – ImageDataGenerator from Directories

In [None]:
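from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Each Keras application expects the input normalization it was trained with,
# so use the backbone's own preprocess_input rather than a plain 1/255 rescale.
# Shown here for ResNet50; import the matching function for other backbones.
from tensorflow.keras.applications.resnet50 import preprocess_input

img_size = (224, 224)  # use (299, 299) for InceptionV3 / Xception
batch_size = 32

train_datagen = ImageDataGenerator(
    preprocessing_function=preprocess_input,
    rotation_range=20,
    width_shift_range=0.1,
    height_shift_range=0.1,
    horizontal_flip=True,
)

# No augmentation for validation data, only the same preprocessing
val_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)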

train_gen = train_datagen.flow_from_directory(
    'data/train',
    target_size=img_size,
    batch_size=batch_size,
    class_mode='categorical'
)

val_gen = val_datagen.flow_from_directory(
    'data/val',
    target_size=img_size,
    batch_size=batch_size,
    class_mode='categorical'
)

### 🗂 PyTorch – Datasets and DataLoaders

In [None]:
from torchvision import datasets, transforms
from torch.utils.data import DataLoader

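# ImageNet channel statistics used to train torchvision's pretrained weights
imagenet_mean = [0.485, 0.456, 0.406]
imagenet_std = [0.229, 0.224, 0.225]

transform_train = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(imagenet_mean, imagenet_std),
])

transform_val = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
    transforms.Normalize(imagenet_mean, imagenet_std),
])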

train_dataset = datasets.ImageFolder('data/train', transform=transform_train)
val_dataset = datasets.ImageFolder('data/val', transform=transform_val)

train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, shuffle=False)

---
# 🔥 7. Which Model Should I Use?

| Use Case                       | Recommended Models                          |
|-------------------------------|---------------------------------------------|
| **Small datasets**            | VGG16, ResNet50                             |
| **High accuracy**             | ResNet50V2, InceptionV3, Xception           |
| **Fast & light**              | MobileNetV2, EfficientNetB0                 |
| **Mobile / Edge deployment**  | MobileNetV2/V3, EfficientNet-Lite           |
| **Very deep feature learning**| ResNet101/152, DenseNet121/169              |

General rule of thumb:
- Start with **ResNet50** or **MobileNetV2**.
- If overfitting → add regularization / data augmentation.
- If too slow → switch to MobileNet / smaller EfficientNet.
- If accuracy too low → try deeper ResNet / Xception / InceptionV3.


---
## ✅ Next Steps / Ideas

- Wrap these templates into functions (e.g., `get_model('resnet', num_classes)`).
- Build a **unified training script** that:
  - Takes model name as argument
  - Loads corresponding backbone
  - Trains and logs metrics
- Add **experiment tracking** (Weights & Biases, MLflow, TensorBoard).
- Compare different models on the **same dataset** and plot:
  - Accuracy
  - Training time
  - Inference speed.

Use this notebook as a **base template** for your transfer learning experiments 💪.