<a href="https://colab.research.google.com/github/catalinapesquet/PINNS-Code-and-Notes/blob/main/06_PyTorch_Transfer_Learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 06. PyTorch Transfer Learning

## What is transfer learning?
Transfer learning allows us to take the patterns (also called weights) another model has learned from another problem and use them for our own problem.

For example, we can take the patterns a computer vision model has learned from datasets such as ImageNet (millions of images of different objects) and use them to power a FoodVision Mini model.

## Why use transfer learning?
There are two main benefits to using transfer learning:

1. Can leverage an existing model (usually a neural network architecture) proven to work on problems similar to our own.
2. Can leverage a working model which has already learned patterns on similar data to our own. This often results in achieving great results with less custom data.


## Where to find pretrainded models

* PyTorch domain libraries: each of domain libraries come with pretrained models
* HuggingFace Hub: pretrained models on many different domains from around the world, plenty of datasets too
* timm(PyTorch Image Models library): all latest and greatest computer vision models in PyTorch code
* Paperswithcode: collection of latest state-of-the-art ML papers with code implementations attached

## 0. Getting setup

Let's get started by importing/downloading the required modules for this section.

To save us writing extra code, we're going to be leveraging some of the Python scripts (such as data_setup.py and engine.py) we created in the previous section, 05. PyTorch Going Modular.

In [1]:
try: # let's us test a block of code for errors
    import torch
    import torchvision
    # assert: test if a condition is true
    assert int(torch.__version__.split(".")[1]) >= 12, "torch version should be 1.12+"
    assert int(torchvision.__version__.split(".")[1]) >= 13, "torchvision version should be 0.13+"
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")
except: # let's us handle the error
    print(f"[INFO] torch/torchvision versions not as required, installing nightly versions.")
    !pip3 install -U torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu113
    import torch
    import torchvision
    print(f"torch version: {torch.__version__}")
    print(f"torchvision version: {torchvision.__version__}")

[INFO] torch/torchvision versions not as required, installing nightly versions.
Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu113
torch version: 2.5.1+cu121
torchvision version: 0.20.1+cu121


Regular Imports

In [2]:
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms

# Try to get torchinfo, install it if it doesnt work
try:
  from torchinfo import summary
except:
  print("[INFO] Couldn't find torchinfo...Installing it.")
  !pip install -q torchinfo
  from torchinfo import summary

# Try to import the going_modular directory, download it from GitHub if it doesn't work
try:
    from going_modular.going_modular import data_setup, engine
except:
    # Get the going_modular scripts
    print("[INFO] Couldn't find going_modular scripts... downloading them from GitHub.")
    !git clone https://github.com/mrdbourke/pytorch-deep-learning
    !mv pytorch-deep-learning/going_modular .
    !rm -rf pytorch-deep-learning
    from going_modular.going_modular import data_setup, engine

[INFO] Couldn't find torchinfo...Installing it.
[INFO] Couldn't find going_modular scripts... downloading them from GitHub.
Cloning into 'pytorch-deep-learning'...
remote: Enumerating objects: 4356, done.[K
remote: Counting objects: 100% (185/185), done.[K
remote: Compressing objects: 100% (65/65), done.[K
remote: Total 4356 (delta 154), reused 120 (delta 120), pack-reused 4171 (from 3)[K
Receiving objects: 100% (4356/4356), 654.37 MiB | 22.69 MiB/s, done.
Resolving deltas: 100% (2584/2584), done.
Updating files: 100% (248/248), done.


Let's setup device agnostic code.

In [3]:
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

## 1. Get data

We first need a Dataset.
To compare with a model we'll download the same dataset we've been using for FoodVision Mini.

In [4]:
import os
import zipfile

from pathlib import Path

import requests

# Setup path to data folder
data_path = Path("data/")
image_path = data_path / "pizza_steak_sushi"

# If the image folder doesn't exist, download it and prepare it.
if image_path.is_dir():
  print(f"{image_path} directory exists.")
else:
  print(f"Did not find {image_path} directory, creating one...")
  image_path.mkdir(parents=True, exist_ok=True)

  # Download pizza, steak, sushi data
  with open(data_path / "pizza_steak_sushi.zip", "wb") as f:
    request = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
    print("Downloading pizza, steak, sushi data...")
    f.write(request.content)

  # Unzip pizza, steak, sushi data
  with zipfile.ZipFile(data_path / "pizza_steak_sushi.zip", "r") as zip_ref:
    print("Unzipping pizza, steak, sushi data...")
    zip_ref.extractall(image_path)

  # Remove .zip file
  os.remove(data_path / "pizza_steak_sushi.zip")

Did not find data/pizza_steak_sushi directory, creating one...
Downloading pizza, steak, sushi data...
Unzipping pizza, steak, sushi data...


Now we've got the same dataset we've been using before.
Let's create paths to our training and test directories

In [5]:
# Setup Dirs
train_dir = image_path / "train"
test_dir = image_path / "test"

## Create Datasets and DataLoaders

Since we've downloaded going_modular we can use the data_setup.py script we created before to prepare and setup our DataLoaders.

But since we'll be using a pretrained model from torchvision.models, ther's a specific transform we need to prepare our images first.



### 2.1 Creating a transform for torchvision.models (manual creation)

When using a pretrained model, it's important that the custom data going into the model is prepared in the same way as the original training data that went into the model.



The documentation states:

*All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224.*

*The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].*

We can achieve the above transformations with a combination of:
1. Mini-batches of size [batch_size, 3, height, width]: torchvision.transforms.Resize() + torch.utils.data.DataLoader()
2. Values between 0 and 1: torchvision.transforms.ToTensor()
3. A mean of [0.485, 0.456, 0.406]: torchvision.transforms.Normalize(mean=...)
4. A standard deviation of [0.229, 0.224, 0.225]: torchvision.transforms.Normalize(std=...)

In [6]:
# Create a transforms pipeline manually
manual_transforms = transforms.Compose([
    transforms.Resize((224,224)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225])
])

Let's create training and testing DataLoaders usin create_dataloaders function from the data_setup.py script.

create_dataloader takes in a training and testing path and turns them into PyTorch Datasets and then into PyTorch DataLoaders.

In [7]:
# Create training and testing DataLoaders as well as get a list of class names
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                               test_dir=test_dir,
                                                                               transform=manual_transforms, # resize, convert images to between 0 & 1 and normalize them
                                                                               batch_size=32) # set mini-batch size to 32

train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x7e9ee1ab6200>,
 <torch.utils.data.dataloader.DataLoader at 0x7e9e1ff32aa0>,
 ['pizza', 'steak', 'sushi'])

### 2.2 Creating a transform for torchvision.models (auto creation)

When using a pretrained model we need to prepare our custom data in the same way as the original training data that went into the model.

Above we saw how to manually create a transform for a pretrained model.

But as of torchvision v0.13+, an automatic transform creation feature has been added.

In [8]:
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT

Where:
* EfficientNet_B0_Weights: model architecture weights we'd like to use.
* DEFAULT means the best available weights (the best performance in ImageNet).

Let's try it out.

In [10]:
# Get a set of pretrained model weights
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT
weights

EfficientNet_B0_Weights.IMAGENET1K_V1

And now to access the transforms associated with our weights, we can use the transforms() method.

In [11]:
# Get the transforms used to create our pretrained weights
auto_transforms = weights.transforms()
auto_transforms

ImageClassification(
    crop_size=[224]
    resize_size=[256]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)

We can use auto_transforms to create DataLoaders with create_dataloaders() just as before.

In [12]:
# Create training and testing DataLoaders as well as get a list of class names
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                               test_dir=test_dir,
                                                                               transform=auto_transforms,
                                                                               batch_size=32)

train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x7e9e20621900>,
 <torch.utils.data.dataloader.DataLoader at 0x7e9e20623070>,
 ['pizza', 'steak', 'sushi'])

## 3. Getting a pretrained model

Since we're working on a computer vision problem (image classification with FoodVision Mini), we can find pretrained classification models in torchvision.models.

### 3.1 Which pretrained model should we use ?

Depends on the problem/device we're working on.

You might think better performance is always better, right?

That's true but some better performing models are too big for some devices.

For example, say we'd like to run our model on a mobile-device, we'll have to take into account the limited compute resources on the device, thus we'd be looking for a smaller model.

### 3.2 Setting up a pretrained model

The pretrained model we're going to be using is torchvision.models.efficientnet_b0().

This means the model has already been trained on millions of images and has a good base representation of image data.

We send it to the target device:

In [13]:
# NEW: Setup the model with pretrained weights and send it to the target device (torchvision v0.13+)
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT # .DEFAULT = best available weights
model = torchvision.models.efficientnet_b0(weights=weights).to(device)

#model # uncomment to output (it's very long)

Downloading: "https://download.pytorch.org/models/efficientnet_b0_rwightman-7f5810bc.pth" to /root/.cache/torch/hub/checkpoints/efficientnet_b0_rwightman-7f5810bc.pth
100%|██████████| 20.5M/20.5M [00:00<00:00, 67.3MB/s]


If we print the model, we get something similar to the following:Lots and lots and lots of layers.

This is one of the benefits of transfer learning, taking an existing model, that's been crafted by some of the best engineers in the world and applying to our own problem.

The efficientnet_b0 comes in three main parts:
1. features: A collection of convolutional layers and other various activation layers to learn a base representation of vision data
2. avgpool: Takes the average of the output of the features layer(s) and turns it into a feature vector.
3. classifier: Turns the feature vector into a vector with the same dimensionality as the number of required output classes