<a href="https://colab.research.google.com/github/anupj/PyTorchForDeepLearningBootcamp/blob/main/06_pytorch_transfer_learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# 06. PyTorch Transfer Learning

What is **transfer learning**?
Transfer learning is taking the parameters that one model has learned on another dataset and applying them to our own problem.

* Pretrained models are sometimes referred to as foundation models.

For example, we can take the patterns a computer vision model has learned from datasets such as ImageNet (millions of images of different objects) and use them to power our FoodVision Mini model.
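To make the idea concrete, here is a minimal sketch of transferring parameters between two tiny networks. Both models are hypothetical stand-ins (a real workflow would use a pretrained computer vision model rather than randomly initialised layers), and the layer sizes are chosen only for illustration.

```python
import torch
from torch import nn

# Hypothetical "pretrained" model: one feature layer plus a 10-class head
pretrained = nn.Sequential(
    nn.Linear(8, 16),   # feature layer (index 0)
    nn.ReLU(),
    nn.Linear(16, 10),  # original head for 10 classes (index 2)
)

# New model for our own 3-class problem, with a matching feature layer
new_model = nn.Sequential(
    nn.Linear(8, 16),
    nn.ReLU(),
    nn.Linear(16, 3),   # new head sized for our classes
)

# Transfer: copy the feature-layer parameters into the new model
new_model[0].load_state_dict(pretrained[0].state_dict())

# Freeze the transferred layer so only the new head would be trained
for param in new_model[0].parameters():
    param.requires_grad = False

# List the parameters that are still trainable
trainable = [name for name, p in new_model.named_parameters() if p.requires_grad]
print(trainable)
```

Only the new head remains trainable here; the copied feature layer keeps whatever it had learned.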

## 0. Getting Setup

In [1]:
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms

# Try to get torchinfo, install it if it doesn't work
try:
  from torchinfo import summary
except ImportError:
  print("[INFO] Couldn't find torchinfo....installing it.")
  !pip install -q torchinfo
  from torchinfo import summary

[INFO] Couldn't find torchinfo....installing it.


Instead of writing all of the code again, let's download the `going_modular` directory that we created in the previous module.

In [2]:
# Try to import the going_modular directory, download it from GitHub if it doesn't work
try:
  from going_modular.going_modular import data_setup, engine
except ImportError:
  # Get the going_modular scripts
  print("[INFO] Couldn't find going_modular scripts...downloading them from Github.")
  !git clone https://github.com/mrdbourke/pytorch-deep-learning
  !mv pytorch-deep-learning/going_modular .
  !rm -rf pytorch-deep-learning
  from going_modular.going_modular import data_setup, engine

[INFO] Couldn't find going_modular scripts...downloading them from Github.
Cloning into 'pytorch-deep-learning'...
remote: Enumerating objects: 4356, done.
remote: Counting objects: 100% (185/185), done.
remote: Compressing objects: 100% (66/66), done.
remote: Total 4356 (delta 154), reused 119 (delta 119), pack-reused 4171 (from 3)
Receiving objects: 100% (4356/4356), 654.37 MiB | 27.92 MiB/s, done.
Resolving deltas: 100% (2583/2583), done.
Updating files: 100% (248/248), done.


In [3]:
# Now let's set up device-agnostic code
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cuda'

## 1. Get Data

Let's write some code to download the `pizza_steak_sushi.zip` dataset.

In [4]:
import os
import zipfile

from pathlib import Path

import requests

# Setup path to data folder
data_path = Path("data/")
image_path = data_path / "pizza_steak_sushi"

# If the image folder doesn't exist, download it and prepare it...
if image_path.is_dir():
  print(f"{image_path} directory exists.")
else:
  print(f"Did not find {image_path} directory, creating one...")
  image_path.mkdir(parents=True, exist_ok=True)

  # Download pizza, steak, sushi data
  with open(data_path / "pizza_steak_sushi.zip", "wb") as f:
    request = requests.get("https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip")
    print("Downloading pizza, steak, sushi data...")
    f.write(request.content)

  # Unzip pizza, steak, sushi data
  with zipfile.ZipFile(data_path / "pizza_steak_sushi.zip", "r") as zip_ref:
    print("Unzipping pizza, steak, sushi data...")
    zip_ref.extractall(image_path)

  # Remove .zip file
  os.remove(data_path / "pizza_steak_sushi.zip")



Did not find data/pizza_steak_sushi directory, creating one...
Downloading pizza, steak, sushi data...
Unzipping pizza, steak, sushi data...


In [5]:
# Set up directory paths
train_dir = image_path / "train"
test_dir = image_path / "test"
train_dir, test_dir

(PosixPath('data/pizza_steak_sushi/train'),
 PosixPath('data/pizza_steak_sushi/test'))

## 2. Create Datasets and DataLoaders

We will use `data_setup.py` and the `create_dataloaders()` function we made in `05. PyTorch Going Modular`.

There's one thing we have to think about when loading our data: how do we **transform** it?

There are two ways to go about this:
1. Manually created transforms - you define what transforms you want your data to go through
2. Automatically created transforms - the transforms for your data are defined by the model you'd like to use

> Important point: when using a pretrained model, the data you pass through it (including your custom data) must be transformed in the same way as the data the model was trained on.

### 2.1 Creating a transform for `torchvision.models` (manual creation)

When using a pretrained model, it's important that your custom data going into the model is prepared in the same way as the original training data that went into the model.

Prior to torchvision v0.13, the documentation stated the following about creating a transform for a pretrained model in `torchvision.models`:

> All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224.

> The images have to be loaded in to a range of `[0, 1]` and then normalized using mean = `[0.485, 0.456, 0.406]` and std = `[0.229, 0.224, 0.225]`.

>You can use the following transform to normalize:
```
normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])
```
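As a rough illustration of what this normalization does, the sketch below applies `(value - mean) / std` to a single RGB pixel in plain Python. The pixel values are made up for the example; the mean and standard deviation are the ImageNet statistics quoted above.

```python
# ImageNet channel statistics quoted above (RGB order)
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]


def normalize_pixel(rgb):
    """Apply (value - mean) / std to one RGB pixel already scaled to [0, 1]."""
    return [(value - m) / s for value, m, s in zip(rgb, mean, std)]


# Hypothetical 8-bit pixel, first scaled to [0, 1] (the range the docs require)
pixel_uint8 = [255, 128, 0]
pixel_01 = [v / 255 for v in pixel_uint8]

normalized = normalize_pixel(pixel_01)
print([round(v, 3) for v in normalized])
```

A pixel exactly equal to the channel means maps to zero in every channel, so after normalization the values are centred around the statistics of the original training data.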

Let's compose a series of `torchvision.transforms` to perform the above steps.



In [6]:
from torchvision import transforms
# Create a transforms pipeline manually (required for torchvision < 0.13)
manual_transforms = transforms.Compose([
    transforms.Resize((224, 224)), # 1. Resize all images to 224x224 (though some models may require different sizes)
    transforms.ToTensor(), # 2. Turn image values to between 0 & 1
    transforms.Normalize(mean=[0.485, 0.456, 0.406], # 3. A mean of [0.485, 0.456, 0.406] (across each colour channel)
                         std=[0.229, 0.224, 0.225]) # 4. A standard deviation of [0.229, 0.224, 0.225] (across each colour channel),
])

Now that we've got a manually created series of transforms ready to prepare our images, let's create training and testing DataLoaders.

We can create these using the `create_dataloaders` function from the `data_setup.py` script we created before.

We'll set `batch_size=32` so our model sees mini-batches of 32 samples at a time.

And we can transform our images using the transform pipeline we created above by setting `transform=manual_transforms`.

In [7]:
# Create training and testing DataLoaders as well as get a list of class names
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(
    train_dir=train_dir,
    test_dir=test_dir,
    transform=manual_transforms, # resize, convert images to between 0 & 1 and normalize them
    batch_size=32) # set mini-batch size to 32

train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x7b20f3277220>,
 <torch.utils.data.dataloader.DataLoader at 0x7b20f3276ef0>,
 ['pizza', 'steak', 'sushi'])

### 2.2 Creating a transform for `torchvision.models` (auto creation)

As previously stated, when using a pretrained model, it's important that your custom data going into the model is prepared in the same way as the original training data that went into the model.

Above we saw how to manually create a transform for a pretrained model.

But as of torchvision v0.13, an automatic transform creation feature has been added.

When you set up a model from `torchvision.models`, you select the pretrained model weights you'd like to use. For example, say we'd like to use:
```
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT
```
Where,

`EfficientNet_B0_Weights` specifies the model architecture whose weights we'd like to use (there are many different model architecture options in `torchvision.models`).

`DEFAULT` means the best available weights (the best performance on ImageNet).

>Note: Depending on the model architecture you choose, you may also see other options such as `IMAGENET1K_V1` and `IMAGENET1K_V2`, where generally the higher the version number, the better. Though if you want the best available, `DEFAULT` is the easiest option. See the `torchvision.models` documentation for more.



In [8]:
# Get a set of pretrained model weights
# .DEFAULT = best available weights from pretraining on ImageNet
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT

weights

EfficientNet_B0_Weights.IMAGENET1K_V1

In [12]:
# Get the transforms used to create our pretrained weights
auto_transforms = weights.transforms()
auto_transforms

ImageClassification(
    crop_size=[224]
    resize_size=[256]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)

In [11]:
# Create dataloaders using automatic transforms
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(
    train_dir=train_dir,
    test_dir=test_dir,
    transform=auto_transforms,
    batch_size=32)
train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x7b20f3274b20>,
 <torch.utils.data.dataloader.DataLoader at 0x7b20f3276200>,
 ['pizza', 'steak', 'sushi'])

## 3. Getting a pretrained model

There are various places to get a pretrained model, such as:
1. PyTorch domain libraries
2. Libraries like `timm` (torch image models)
3. Hugging Face Hub
4. Papers with Code

But how do you choose a model?

*Experiment! Experiment! Experiment!*

Four things to consider:
1. Speed - how fast does it run?
2. Size - how big is the model?
3. Performance - how well does it perform on your chosen problem space?
4. Deployment target -
  - Is it on device? (like a self-driving car)
  - or does it live on a server?

For our case, EffNetB0 is one of our best options in terms of performance vs size.