<a href="https://colab.research.google.com/github/nmermigas/PyTorch/blob/main/06_PyTorch_Transfer_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 06. PyTorch Transfer Learning

What is Transfer Learning?

Transfer learning involves taking the parameters that one model has learned on one dataset and applying them to another dataset or problem.

* Pretrained models = foundation models
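
As a minimal sketch of this idea (using two tiny, hypothetical `nn.Sequential` models rather than a real pretrained network), the parameters learned by one model can be copied into the matching layer of another model, which then gets a fresh output layer for the new task:

```python
import torch
from torch import nn

torch.manual_seed(0)

# Hypothetical "source" model, standing in for a network trained on a large dataset
source_model = nn.Sequential(
    nn.Linear(4, 8),   # feature layer
    nn.ReLU(),
    nn.Linear(8, 10),  # output layer for the original 10-class problem
)

# Hypothetical "target" model: same feature layer, new 3-class output layer
target_model = nn.Sequential(
    nn.Linear(4, 8),
    nn.ReLU(),
    nn.Linear(8, 3),
)

# Transfer the learned feature parameters (layer 0) from source to target
target_model[0].load_state_dict(source_model[0].state_dict())

# The feature layer now holds the source model's weights
weights_match = torch.equal(target_model[0].weight, source_model[0].weight)
print(weights_match)
```

The rest of this notebook does the same thing at a larger scale, with a pretrained `torchvision` model playing the role of the source.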

In [1]:
import torch
import torchvision

print(torch.__version__)
print(torchvision.__version__)


2.1.0+cu118
0.16.0+cu118


Download the `going_modular` scripts (from notebook 05)

In [2]:
# Continue with regular imports
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms

# Try to get torchinfo, install it if it doesn't work
try:
    from torchinfo import summary
except ImportError:
    print("[INFO] Couldn't find torchinfo... installing it.")
    !pip install -q torchinfo
    from torchinfo import summary

# Try to import the going_modular directory, download it from GitHub if it doesn't work
try:
    from going_modular.going_modular import data_setup, engine
except ImportError:
    # Get the going_modular scripts
    print("[INFO] Couldn't find going_modular scripts... downloading them from GitHub.")
    !git clone https://github.com/mrdbourke/pytorch-deep-learning
    !mv pytorch-deep-learning/going_modular .
    !rm -rf pytorch-deep-learning
    from going_modular.going_modular import data_setup, engine

[INFO] Couldn't find torchinfo... installing it.
[INFO] Couldn't find going_modular scripts... downloading them from GitHub.
Cloning into 'pytorch-deep-learning'...
remote: Enumerating objects: 4036, done.
remote: Counting objects: 100% (1224/1224), done.
remote: Compressing objects: 100% (225/225), done.
remote: Total 4036 (delta 1068), reused 1086 (delta 996), pack-reused 2812
Receiving objects: 100% (4036/4036), 651.02 MiB | 15.89 MiB/s, done.
Resolving deltas: 100% (2361/2361), done.
Updating files: 100% (248/248), done.


In [3]:
# Setup device agnostic code
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [4]:
!nvidia-smi

Mon Nov 20 20:48:37 2023       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 525.105.17   Driver Version: 525.105.17   CUDA Version: 12.0     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   52C    P8    10W /  70W |      3MiB / 15360MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

## 1. Get data

We again need the pizza, steak, sushi dataset to build our transfer learning model on.

In [5]:
import os
import zipfile

from pathlib import Path

import requests

# Setup path to data folder
data_path = Path("data/")
image_path = data_path / "pizza_steak_sushi"

# If the image folder doesn't exist, download it and prepare it...
if image_path.is_dir():
    print(f"{image_path} directory exists.")
else:
    print(f"Did not find {image_path} directory, creating one...")
    image_path.mkdir(parents=True, exist_ok=True)

    # Download pizza, steak, sushi data
    with open(data_path / "pizza_steak_sushi.zip", "wb") as f:
        request = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
        print("Downloading pizza, steak, sushi data...")
        f.write(request.content)

    # Unzip pizza, steak, sushi data
    with zipfile.ZipFile(data_path / "pizza_steak_sushi.zip", "r") as zip_ref:
        print("Unzipping pizza, steak, sushi data...")
        zip_ref.extractall(image_path)

    # Remove .zip file
    os.remove(data_path / "pizza_steak_sushi.zip")

Did not find data/pizza_steak_sushi directory, creating one...
Downloading pizza, steak, sushi data...
Unzipping pizza, steak, sushi data...


In [6]:
# Setup directory path
train_dir = image_path / "train"
test_dir = image_path / "test"

train_dir,test_dir

(PosixPath('data/pizza_steak_sushi/train'),
 PosixPath('data/pizza_steak_sushi/test'))

## 2. Create Datasets and DataLoaders

Now that we have the data, we can create DataLoaders using `data_setup.py` from the `going_modular` scripts.

Note: We need to think about **transforms** before loading the data.

There are two ways of doing that:

1. Manually created transforms - you define what transforms you want your data to go through.
2. Automatically created transforms - the transforms for your data are defined by the model you will use.

Important: When using a pretrained model, it is important that the data you pass to it is transformed in the same way as the data the model was trained on.

In [7]:
from going_modular.going_modular import data_setup

### 2.1 Creating a transform for `torchvision.models` (manual creation)

`torchvision.models` contains pretrained models.

In [8]:
from torchvision import transforms
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

manual_transforms = transforms.Compose([
                                      transforms.Resize((224,224)), # resize image into 224 * 224
                                      transforms.ToTensor(), # get images into range[0,1]
                                      normalize]) # make sure images have the same distribution as ImageNet
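
To make the normalization step concrete, here is a small pure-Python sketch of the per-channel arithmetic that `transforms.Normalize` applies, `(value - mean) / std`. The helper name `normalize_pixel` is hypothetical and used only for illustration.

```python
# ImageNet channel statistics (RGB)
IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb, mean=IMAGENET_MEAN, std=IMAGENET_STD):
    """Normalize one RGB pixel whose values are already scaled to [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]

# A pixel equal to the channel means maps to zero in every channel
mean_pixel = normalize_pixel(IMAGENET_MEAN)
print(mean_pixel)

# A pure white pixel ends up more than two standard deviations above the mean
white_pixel = normalize_pixel([1.0, 1.0, 1.0])
print(white_pixel)
```

Note that each standard deviation must be a single non-zero float per channel, which is why the three values are written with decimal points.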

In [9]:
from going_modular.going_modular import data_setup
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir = train_dir,
                                                                               test_dir = test_dir,
                                                                               batch_size = 32,
                                                                               transform = manual_transforms
                                                                               )
train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x79a26a5f6260>,
 <torch.utils.data.dataloader.DataLoader at 0x79a26a5f7010>,
 ['pizza', 'steak', 'sushi'])

### 2.2 Creating a transform for `torchvision.models` (auto creation)

As of `torchvision` v0.13+, there is support for automatic data transform creation based on the pretrained model weights you are using.

In [10]:
# Get a set of pretrained model weights
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT # DEFAULT = best available weights
weights

EfficientNet_B0_Weights.IMAGENET1K_V1

In [11]:
# Get the transforms used to create our pretrained weights
auto_transforms = weights.transforms()
auto_transforms

ImageClassification(
    crop_size=[224]
    resize_size=[256]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)

In [12]:
# Create dataloaders using the automatic transforms
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir = train_dir,
                                                                               test_dir = test_dir,
                                                                               batch_size = 32,
                                                                               transform = auto_transforms
                                                                               )
train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x79a26a5f7b80>,
 <torch.utils.data.dataloader.DataLoader at 0x79a26a5f6590>,
 ['pizza', 'steak', 'sushi'])

## 3. Getting a pretrained model

There are various places to get a pretrained model, such as:
1. PyTorch domain libraries
2. Libraries like `timm` (PyTorch Image Models)
3. HuggingFace Hub
4. Papers with Code

### 3.1 Which pretrained model should you use?

*Experiment, experiment, experiment!!!*

The whole idea of transfer learning: take an already well-performing model from a problem space similar to your own and then customize it to your own problem.

3 things to consider:
1. Speed - how fast does it run?
2. Size - how big is the model?
3. Performance - how well does it perform on your chosen problem?

Where does the model live? On the device or on a server?

We'll use EffNetB0 for our case (deploying FoodVision Mini on a mobile device)
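
Two of these considerations, size and speed, are easy to measure directly. The sketch below does so for a small stand-in model (a hypothetical `demo_model`, not EffNetB0) on the CPU; the same few lines should work for any `nn.Module`. Performance is not shown, since it requires evaluating on your own test data.

```python
import time

import torch
from torch import nn

# Small stand-in model (hypothetical); swap in any nn.Module you want to compare
demo_model = nn.Sequential(
    nn.Conv2d(3, 8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(8, 3),
)

# Size: total number of parameters and their approximate memory footprint
num_params = sum(p.numel() for p in demo_model.parameters())
size_mb = sum(p.numel() * p.element_size() for p in demo_model.parameters()) / 1024**2

# Speed: average time of a forward pass on a single image
demo_model.eval()
dummy_image = torch.randn(1, 3, 224, 224)
with torch.inference_mode():
    demo_model(dummy_image)  # warm-up pass, not timed
    start = time.perf_counter()
    for _ in range(10):
        demo_model(dummy_image)
    seconds_per_image = (time.perf_counter() - start) / 10

print(f"params: {num_params}, size: {size_mb:.4f} MB, time: {seconds_per_image:.4f} s/image")
```

Timings depend heavily on the hardware, so compare candidate models on the device you plan to deploy to.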

### 3.2 Setting up a pretrained model
We want to create an instance of a pretrained EffNetB0.

In [20]:
# OLD: Setup the model with pretrained weights and send it to the target device (this was prior to torchvision v0.13)
# model = torchvision.models.efficientnet_b0(pretrained=True).to(device) # OLD method (with pretrained=True)

# NEW: Setup the model with pretrained weights and send it to the target device (torchvision v0.13+)
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT # EffNetB0 weights, matching the transforms from section 2.2
model = torchvision.models.efficientnet_b0(weights=weights).to(device)

#model # uncomment to output (it's very long)


In [22]:
model.classifier

Sequential(
  (0): Dropout(p=0.2, inplace=True)
  (1): Linear(in_features=1280, out_features=1000, bias=True)
)

### 3.3 Getting a summary of the model with `torchinfo.summary()`

In [28]:
from torchinfo import summary

summary(model=model,
        input_size=(1,3,224,224),
        col_names=['input_size',"output_size","num_params",'trainable'],
        col_width=20,
        row_settings=['var_names'])

Layer (type (var_name))                                      Input Shape          Output Shape         Param #              Trainable
EfficientNet (EfficientNet)                                  [1, 3, 224, 224]     [1, 3]               --                   Partial
├─Sequential (features)                                      [1, 3, 224, 224]     [1, 1280, 7, 7]      --                   False
│    └─Conv2dNormActivation (0)                              [1, 3, 224, 224]     [1, 32, 112, 112]    --                   False
│    │    └─Conv2d (0)                                       [1, 3, 224, 224]     [1, 32, 112, 112]    (864)                False
│    │    └─BatchNorm2d (1)                                  [1, 32, 112, 112]    [1, 32, 112, 112]    (64)                 False
│    │    └─SiLU (2)                                         [1, 32, 112, 112]    [1, 32, 112, 112]    --                   --
│    └─Sequential (1)                                        [1, 32, 112, 112]    [1, 1

### 3.4 Freezing the base model and changing the output layer to suit our needs

Typically, with a feature extraction model, we 'freeze' the base layers and just update the last layers (classifier).

In [25]:
for param in model.features.parameters():
  param.requires_grad = False
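
A quick way to check the effect of freezing is to count trainable and frozen parameters. The sketch below uses a hypothetical `TinyNet` with the same `features`/`classifier` layout rather than the real model, so it runs without downloading any weights.

```python
import torch
from torch import nn

class TinyNet(nn.Module):
    """Hypothetical stand-in with a `features` base and a `classifier` head."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.classifier = nn.Linear(8, 3)

    def forward(self, x):
        return self.classifier(self.features(x))

tiny_model = TinyNet()

# Freeze the base layers, exactly as in the loop above
for param in tiny_model.features.parameters():
    param.requires_grad = False

# Count parameters by whether they will receive gradient updates
trainable = sum(p.numel() for p in tiny_model.parameters() if p.requires_grad)
frozen = sum(p.numel() for p in tiny_model.parameters() if not p.requires_grad)
print(f"trainable: {trainable}, frozen: {frozen}")
```

Only the classifier's parameters remain trainable, which is what the `Trainable` column in the `torchinfo` summary reports for the real model.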

In [27]:
# Update the classifier head of our model
from torch import nn

torch.manual_seed(42)
torch.cuda.manual_seed(42)
model.classifier = nn.Sequential(
    nn.Dropout(p=0.2, inplace=True), # randomly zero some activations (each with probability p) to reduce overfitting
    nn.Linear(in_features=1280, # feature vector coming in
              out_features = len(class_names)) # how many classes do we have
).to(device)

model.classifier

Sequential(
  (0): Dropout(p=0.2, inplace=True)
  (1): Linear(in_features=1280, out_features=3, bias=True)
)
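
To illustrate what "just update the last layers" means during training, the sketch below runs a single optimizer step on a toy model with a frozen base and a new 3-class head. The names `base`, `head` and `toy_model`, the `Adam` optimizer and the random data are all assumptions chosen for illustration, not part of the notebook's training setup.

```python
import torch
from torch import nn

torch.manual_seed(42)

# Hypothetical frozen base producing a 1280-dimensional feature vector
base = nn.Linear(10, 1280)
for p in base.parameters():
    p.requires_grad = False

# New classifier head with the same layout as above (dropout + linear, 3 classes)
head = nn.Sequential(nn.Dropout(p=0.2), nn.Linear(1280, 3))
toy_model = nn.Sequential(base, head)

# Give the optimizer only the parameters that are still trainable
optimizer = torch.optim.Adam(
    [p for p in toy_model.parameters() if p.requires_grad], lr=0.001
)

base_before = base.weight.clone()
head_before = head[1].weight.clone()

# One training step on random data
logits = toy_model(torch.randn(4, 10))
loss = nn.functional.cross_entropy(logits, torch.tensor([0, 1, 2, 0]))
loss.backward()
optimizer.step()

print(logits.shape)  # one score per class for each of the 4 samples
base_unchanged = torch.equal(base.weight, base_before)
head_changed = not torch.equal(head[1].weight, head_before)
print(base_unchanged, head_changed)
```

The frozen base keeps its weights while the head moves, which is the behaviour a feature extraction setup relies on.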