# 06. PyTorch Transfer Learning

What is transfer learning?

Transfer learning involves taking the parameters that one model has learned on another dataset and applying them to our own problem.

* Pretrained models = foundation models

In [1]:
import torch
import torchvision

print(torch.__version__)
print(torchvision.__version__)

2.1.0+cu118
0.16.0+cu118


Let's import the code we've written in previous sections so that we don't have to write it all again.

In [2]:
# Continue with regular imports
import matplotlib.pyplot as plt
import torch
import torchvision

from torch import nn
from torchvision import transforms

# Try to get torchinfo, install it if it doesn't work
try:
    from torchinfo import summary
except ImportError:
    print("[INFO] Couldn't find torchinfo... installing it.")
    !pip install -q torchinfo
    from torchinfo import summary

# Try to import the going_modular directory, download it from GitHub if it doesn't work
try:
    from going_modular.going_modular import data_setup, engine
except ImportError:
    # Get the going_modular scripts
    print("[INFO] Couldn't find going_modular scripts... downloading them from GitHub.")
    !git clone https://github.com/mrdbourke/pytorch-deep-learning
    !mv pytorch-deep-learning/going_modular .
    !rm -rf pytorch-deep-learning
    from going_modular.going_modular import data_setup, engine

[INFO] Couldn't find torchinfo... installing it.
[INFO] Couldn't find going_modular scripts... downloading them from GitHub.
Cloning into 'pytorch-deep-learning'...
remote: Enumerating objects: 4036, done.
remote: Counting objects: 100% (1224/1224), done.
remote: Compressing objects: 100% (225/225), done.
remote: Total 4036 (delta 1068), reused 1078 (delta 996), pack-reused 2812
Receiving objects: 100% (4036/4036), 651.02 MiB | 32.77 MiB/s, done.
Resolving deltas: 100% (2361/2361), done.
Updating files: 100% (248/248), done.


In [3]:
# Set up device-agnostic code
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cpu'

In [4]:
!nvidia-smi

/bin/bash: line 1: nvidia-smi: command not found


## 1. Get data

We need our pizza, steak, sushi data to build a transfer learning model on.


In [5]:
import os
import zipfile
from pathlib import Path
import requests


# Set up data path
data_path = Path('data/')
image_path = data_path / 'pizza_steak_sushi' # Images from a subset of classes from the Food101 dataset

# If the image folder doesn't exist, download it and prepare it...
if image_path.is_dir():
  print(f"{image_path} directory exists, skipping re-download.")
else:
  print(f"Did not find {image_path}, downloading it...")
  image_path.mkdir(parents=True, exist_ok=True)

  # Download pizza, steak, sushi data
  with open(data_path/'pizza_steak_sushi.zip', 'wb') as f:
    request = requests.get('https://github.com/mrdbourke/pytorch-deep-learning/raw/main/data/pizza_steak_sushi.zip')
    print("Downloading pizza, steak, sushi data...")
    f.write(request.content)

  # Unzip pizza, steak, sushi data
  with zipfile.ZipFile(data_path/'pizza_steak_sushi.zip', 'r') as zip_ref:
    print("Unzipping pizza, steak, sushi data...")
    zip_ref.extractall(image_path)

  # Remove .zip file
  os.remove(data_path/'pizza_steak_sushi.zip')

Did not find data/pizza_steak_sushi, downloading it...
Downloading pizza, steak, sushi data...
Unzipping pizza, steak, sushi data...


In [6]:
# Set up directory paths
train_dir = image_path/'train'
test_dir = image_path/'test'

train_dir, test_dir

(PosixPath('data/pizza_steak_sushi/train'),
 PosixPath('data/pizza_steak_sushi/test'))
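
Before building DataLoaders, it can help to confirm that the download produced the folder layout we expect. The sketch below is a hypothetical helper (`count_images_per_class()` is not part of `going_modular`) and assumes the standard image classification layout of `train/<class_name>/*.jpg`.

```python
from pathlib import Path
from typing import Dict


def count_images_per_class(split_dir: Path, pattern: str = "*.jpg") -> Dict[str, int]:
    """Count the files matching `pattern` in each class subfolder of `split_dir`.

    Returns an empty dict if `split_dir` does not exist.
    """
    split_dir = Path(split_dir)
    if not split_dir.is_dir():
        return {}
    return {
        class_dir.name: len(list(class_dir.glob(pattern)))
        for class_dir in sorted(split_dir.iterdir())
        if class_dir.is_dir()
    }

# In this notebook you could call, for example:
# count_images_per_class(train_dir)
# count_images_per_class(test_dir)
```

An empty dict or a missing class name here usually points to a wrong path or a failed unzip.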

## 2. Create Datasets and DataLoaders

Now that we've got some data, we want to turn it into PyTorch DataLoaders.

To do so, we can use `data_setup.py` and the `create_dataloaders()` function we made in 05. PyTorch Going Modular.

There's one thing we have to think about when loading the data: how to **transform** it?

With `torchvision` 0.13+ there are two ways to do this:

1. Manually created transforms - you define what transforms you want your data to go through.
2. Automatically created transforms - the transforms for your data are defined by the model you'd like to use.

Important point: when using a pretrained model, it's important that the data (including your custom data) that you pass through it is **transformed** in the same way as the data the model was trained on.

### 2.1 Creating a transform for `torchvision.models` (manual creation)

`torchvision.models` contains pretrained models (models ready for transfer learning) right within `torchvision`.

All pre-trained models expect input images normalized in the same way, i.e. mini-batches of 3-channel RGB images of shape (3 x H x W), where H and W are expected to be at least 224.

The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225].

You can use the following transform to normalize: `normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])`
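
To make the normalization step concrete, here's a minimal pure-Python sketch of the arithmetic `transforms.Normalize` applies to each channel: `(value - mean) / std`. The pixel values are made up for illustration and `normalize_pixel()` is a hypothetical helper, not a `torchvision` function.

```python
# ImageNet channel statistics quoted above, in (R, G, B) order
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]


def normalize_pixel(pixel, mean, std):
    """Apply (value - mean) / std to each channel of one RGB pixel scaled to [0, 1]."""
    return [(value - m) / s for value, m, s in zip(pixel, mean, std)]


# Hypothetical pixel, already scaled to [0, 1] (which is what transforms.ToTensor() does)
pixel = [0.485, 0.680, 0.181]
normalized = normalize_pixel(pixel, mean, std)
print([round(v, 3) for v in normalized])  # [0.0, 1.0, -1.0]
```

A channel value equal to the ImageNet mean maps to 0, and a value one standard deviation above or below the mean maps to +1 or -1.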



In [7]:
from torchvision import transforms

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])

manual_transforms = transforms.Compose([
    transforms.Resize((224, 224)), # Resize image to 224x224 (height x width)
    transforms.ToTensor(), # Get image into range [0, 1] in tensor format
    normalize # Make sure images have the same distribution as ImageNet (which our pretrained models were trained on)
])

In [8]:
from going_modular.going_modular import data_setup
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                               test_dir=test_dir,
                                                                               transform=manual_transforms,
                                                                               batch_size=32)

In [9]:
train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x79981afa2f50>,
 <torch.utils.data.dataloader.DataLoader at 0x79981afa2ef0>,
 ['pizza', 'steak', 'sushi'])

In [10]:
# Get a set of pretrained model weights
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT # Default -> Best available weights
weights

EfficientNet_B0_Weights.IMAGENET1K_V1

In [11]:
# Get the transforms used to create our pretrained weights
auto_transforms = weights.transforms()
auto_transforms

ImageClassification(
    crop_size=[224]
    resize_size=[256]
    mean=[0.485, 0.456, 0.406]
    std=[0.229, 0.224, 0.225]
    interpolation=InterpolationMode.BICUBIC
)

In [12]:
# Create DataLoaders using the automatic transforms
train_dataloader, test_dataloader, class_names = data_setup.create_dataloaders(train_dir=train_dir,
                                                                               test_dir=test_dir,
                                                                               transform=auto_transforms,
                                                                               batch_size=32)

In [13]:
train_dataloader, test_dataloader, class_names

(<torch.utils.data.dataloader.DataLoader at 0x79981afa14b0>,
 <torch.utils.data.dataloader.DataLoader at 0x79981afa34f0>,
 ['pizza', 'steak', 'sushi'])

## 3. Getting a pretrained model

There are various places to get a pretrained model, such as:
1. PyTorch domain libraries
2. Libraries like `timm` (PyTorch Image Models)
3. Hugging Face Hub (for plenty of different models)
4. Papers with Code (for models across different problems/domains)


### 3.1 Which pretrained model should you use?

**Experiment, experiment, experiment!**

The whole idea of transfer learning: take an already well-performing model from a
problem space similar to your own and then customize it to your own problem.

Three things to consider:

1. Speed - how fast does it run?
2. Size - how big is the model?
3. Performance - how well does it go on your chosen problem (e.g. how well does it
classify food images for FoodVision Mini)?
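
Speed and size can be estimated before committing to a model. Here's a rough sketch with two hypothetical helpers (`model_size_mb()` and `mean_latency_ms()` are illustrative names, not library functions), run on a toy stand-in model so that nothing needs downloading. The timing only makes sense on CPU as written.

```python
import time

import torch
from torch import nn


def model_size_mb(model: nn.Module) -> float:
    """Approximate in-memory size of a model's parameters in megabytes."""
    n_bytes = sum(p.numel() * p.element_size() for p in model.parameters())
    return n_bytes / (1024 ** 2)


def mean_latency_ms(model: nn.Module, example: torch.Tensor, n_runs: int = 10) -> float:
    """Average CPU forward-pass time in milliseconds.

    Timing on a GPU would also need torch.cuda.synchronize() around the timed region.
    """
    model.eval()
    with torch.inference_mode():
        model(example)  # Warm-up run, not timed
        start = time.perf_counter()
        for _ in range(n_runs):
            model(example)
    return (time.perf_counter() - start) / n_runs * 1000


# Toy stand-in model (hypothetical); swap in any candidate from torchvision.models
toy_model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 32 * 32, 3))
example_batch = torch.randn(1, 3, 32, 32)

print(f"Size: {model_size_mb(toy_model):.4f} MB")
print(f"Latency: {mean_latency_ms(toy_model, example_batch):.3f} ms per image")
```

Performance, the third consideration, still has to be measured by training and evaluating on your own data.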

Where does the model live?

Is it on device (like a self-driving car)?

Or does it live on a server?

Looking at: https://pytorch.org/vision/stable/models#table-of-all-available-classification-weights

Which model should we choose?

For our case (deploying FoodVision Mini on a mobile device), it looks like EffNetB0
is one of our best options in terms of performance vs size.

However, in light of The Bitter Lesson, if we had infinite compute, we'd likely
pick the biggest model + most parameters + most general we could.
http://www.incompleteideas.net/IncIdeas/BitterLesson.html

### 3.2 Setting up a pretrained model

We want to create an instance of a pretrained EffNetB0 - https://pytorch.org/vision/stable/models/generated/torchvision.models.efficientnet_b0.html#torchvision.models.EfficientNet_B0_Weights

In [None]:
!pip install torchvision==0.15.2 # Pin to 0.15.2: 0.16.0 raised an error when downloading models

Collecting torchvision==0.15.2
  Downloading torchvision-0.15.2-cp310-cp310-manylinux1_x86_64.whl (6.0 MB)
Collecting torch==2.0.1 (from torchvision==0.15.2)
  Downloading torch-2.0.1-cp310-cp310-manylinux1_x86_64.whl (619.9 MB)

In [None]:
print(f"torch version: {torch.__version__}")
print(f"torchvision version: {torchvision.__version__}")

In [None]:
# Old method of creating a pretrained model (prior to torchvision v0.13)
# model = torchvision.models.efficientnet_b0(pretrained=True)

# New method of creating a pretrained model (torchvision v0.13+)
weights = torchvision.models.EfficientNet_B0_Weights.DEFAULT # DEFAULT -> best available weights
model = torchvision.models.efficientnet_b0(weights=weights).to(device)

model

In [None]:
# Feature extractor, also called the "backbone"
model.features

In [None]:
# Compress the features into a single vector
model.avgpool

In [None]:
model.classifier

### 3.3 Getting a summary of our model with `torchinfo.summary()`


In [None]:
# Print with torchinfo
from torchinfo import summary

summary(model=model,
        input_size=[1,3,224,224], # example of [batch_size, color_channels, height, width]
        col_names=['input_size', 'output_size', 'num_params', 'trainable'],
        col_width=20,
        row_settings=['var_names'])

### 3.4 Freezing the base model and changing the output layer to suit our needs

Freezing a layer means that the layer's weights will not be updated during training.

With a feature extractor model, you typically "freeze" the base layers of
a pretrained/foundation model and update the output layers to suit your own problem.
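
To see what freezing does to the trainable parameter count, here's a small sketch on a toy network. `ToyNet` and `count_params()` are hypothetical names made up for illustration; the toy model only mimics the `features`/`classifier` split of EfficientNet.

```python
from torch import nn


class ToyNet(nn.Module):
    """Tiny stand-in with a `features` backbone and a `classifier` head."""

    def __init__(self, num_classes: int = 3):
        super().__init__()
        self.features = nn.Sequential(nn.Conv2d(3, 8, kernel_size=3), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.classifier = nn.Linear(8, num_classes)

    def forward(self, x):
        x = self.pool(self.features(x)).flatten(1)
        return self.classifier(x)


def count_params(model: nn.Module):
    """Return (total parameters, trainable parameters)."""
    total = sum(p.numel() for p in model.parameters())
    trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
    return total, trainable


toy = ToyNet()
print(count_params(toy))  # (251, 251) -> everything is trainable

# Freeze the backbone: its weights keep their values but no longer receive gradients
for param in toy.features.parameters():
    param.requires_grad = False

print(count_params(toy))  # (251, 27) -> only the classifier head is trainable
```

The total stays the same; only the number of parameters that training can update shrinks.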

In [None]:
# Freeze all of the base layers in EffNetB0
for param in model.features.parameters():
  # print(param)
  param.requires_grad = False

In [None]:
# Print with torchinfo
from torchinfo import summary

summary(model=model,
        input_size=[1,3,224,224], # example of [batch_size, color_channels, height, width]
        col_names=['input_size', 'output_size', 'num_params', 'trainable'],
        col_width=20,
        row_settings=['var_names'])

In [None]:
# Update the classifier head of our model to suit our problem
from torch import nn

torch.manual_seed(42)
torch.cuda.manual_seed(42)

model.classifier = nn.Sequential(
    nn.Dropout(p=0.2, inplace=True), # During training, randomly zero each element with probability 0.2
    nn.Linear(in_features=1280, # Feature vector coming in
              out_features=len(class_names)), # How many classes do we have?
).to(device)

model.classifier
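
A quick way to sanity-check a new head is to pass a dummy feature batch through it and look at the output shape. The sketch below builds a standalone head with the same layer sizes as above and feeds it random tensors standing in for the pooled 1280-dimensional features. The batch size of 4 and the variable names are arbitrary choices for illustration.

```python
import torch
from torch import nn

num_classes = 3  # pizza, steak, sushi

# Standalone head with the same layer sizes as the classifier above
head = nn.Sequential(
    nn.Dropout(p=0.2),
    nn.Linear(in_features=1280, out_features=num_classes),
)

head.eval()  # Turn off dropout for the check
with torch.inference_mode():
    dummy_features = torch.randn(4, 1280)  # [batch_size, feature_vector_length]
    logits = head(dummy_features)

print(logits.shape)  # torch.Size([4, 3]) -> one logit per class for each sample
```

If the last dimension doesn't match the number of classes, the loss function will fail or silently train against the wrong targets.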

## 4. Train model

In [None]:
# Define loss and optimizer
loss_fn = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
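
Passing `model.parameters()` works with frozen layers, because parameters with `requires_grad=False` never receive gradients and so are not updated. An optional variation, sketched here on a toy model with made-up names, is to hand the optimizer only the trainable parameters so the intent is explicit.

```python
import torch
from torch import nn

# Toy stand-in model (hypothetical): first layer frozen, last layer trainable
model_sketch = nn.Sequential(nn.Linear(10, 5), nn.ReLU(), nn.Linear(5, 3))
for param in model_sketch[0].parameters():
    param.requires_grad = False

# Keep only the parameters that can still be updated
trainable_params = [p for p in model_sketch.parameters() if p.requires_grad]
optimizer_sketch = torch.optim.Adam(trainable_params, lr=1e-3)

# Count the tensors the optimizer is tracking (weight + bias of the last layer)
n_tensors = sum(len(group["params"]) for group in optimizer_sketch.param_groups)
print(n_tensors)  # 2
```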

In [None]:
# Import train function
from going_modular.going_modular import engine

# Set the manual seeds
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Start the timer
from timeit import default_timer as timer
start_time = timer()

# Setup training and save the results
results = engine.train(model=model,
                       train_dataloader=train_dataloader,
                       test_dataloader=test_dataloader,
                       optimizer=optimizer,
                       loss_fn=loss_fn,
                       epochs=5,
                       device=device)

# End the timer and print out how long it took
end_time = timer()
print(f"[INFO] Total training time: {end_time-start_time:.3f} seconds")

In [None]:
results

## 5. Evaluate model by plotting loss curves

In [None]:
# Get the plot_loss_curves() function from helper_functions.py, download the file if we don't have it
try:
    from helper_functions import plot_loss_curves
except ImportError:
    print("[INFO] Couldn't find helper_functions.py, downloading...")
    with open("helper_functions.py", "wb") as f:
        import requests
        request = requests.get("https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/helper_functions.py")
        f.write(request.content)
    from helper_functions import plot_loss_curves

# Plot the loss curves of our model
plot_loss_curves(results)

## 6. Make predictions on images from the test set

Let's adhere to the data explorer's motto of *Visualize, Visualize, Visualize*!

And make some qualitative predictions on our test set.

Some things to keep in mind when making predictions/inference on test data/custom
data.

We have to make sure that our test/custom data is:
* Same shape - images need to be the same shape as the images the model was trained on
* Same datatype - custom data should be in the same data type
* Same device - custom data/test data should be on the same device as the model
* Same transform - if you've transformed your training data, ideally you will
transform the test data and custom data in the same way

To do all of this automagically, let's create a function called `pred_and_plot_image()` that:

1. Takes in a trained model, a list of class names, a filepath to a target image,
an image size, a transform and a target device
2. Opens the image with `PIL.Image.open()`
3. Creates a transform if one doesn't exist
4. Makes sure the model is on the target device
5. Turns on `model.eval()` mode to make sure the model is ready for inference
(this will turn off things like `nn.Dropout()`)
6. Transforms the target image and makes sure its dimensionality is suited to the model (this mainly relates to batch size)
7. Makes a prediction on the image by passing it to the model
8. Converts the model's output logits to prediction probabilities using `torch.softmax()`
9. Converts the model's prediction probabilities to prediction labels using `torch.argmax()`
10. Plots the image with `matplotlib` and sets the title to the prediction label from step 9 and the prediction probability from step 8
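
Steps 8 and 9 are the only numerical part of the list, so here they are worked by hand in plain Python. This is a sketch of the arithmetic behind `torch.softmax()` and `torch.argmax()`, not how PyTorch implements them, and the logits are made up.

```python
import math


def softmax(logits):
    """Turn raw logits into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]  # Subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


toy_class_names = ["pizza", "steak", "sushi"]
toy_logits = [2.0, 0.5, -1.0]  # Hypothetical raw model outputs for one image

# Step 8: logits -> prediction probabilities
probs = softmax(toy_logits)

# Step 9: prediction probabilities -> predicted label index (argmax)
label_idx = max(range(len(probs)), key=probs.__getitem__)

print(toy_class_names[label_idx], round(probs[label_idx], 3))  # pizza 0.786
```

The probability is a value between 0 and 1, so it should be multiplied by 100 before being shown with a percent sign.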

In [None]:
import matplotlib.pyplot as plt
from typing import List, Tuple
from PIL import Image
from torchvision import transforms

# 1. Take in params
def pred_and_plot_image(model: nn.Module,
                        image_path: str,
                        class_names:List[str],
                        image_size: Tuple[int, int]=(224, 224),
                        transform:transforms.Compose|None=None,
                        device=device):
  # 2. Open the image with PIL
  image = Image.open(image_path)

  # 3. Create a transform if one doesn't exist and transform image
  if transform is None:
    normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
    transform = transforms.Compose([
        transforms.Resize(image_size), # Resize image to image_size (height x width)
        transforms.ToTensor(),
        normalize
    ])

  # 4. Send model to target device
  model.to(device)

  # 5. Eval mode
  model.eval()

  # 6. Transform the image and add batch dimension
  transformed_image = transform(image)
  transformed_image = transformed_image.unsqueeze(dim=0)

  # 7. Pass image through the model
  with torch.inference_mode():
    y_logits = model(transformed_image.to(device))

    # 8. Raw logits to prediction probabilities
    y_pred = torch.softmax(y_logits, dim=1)

    # 9. Prediction probabilities to prediction labels
    label_idx = int(torch.argmax(y_pred, dim=1).item())

    # 10. Plot image
    plt.figure(figsize=(10,7))
    plt.imshow(image)
    plt.title(f"Pred: {class_names[label_idx]} | Prob: {y_pred[0][label_idx]:.4f}")
    plt.axis(False)


In [None]:
# Get a random list of image paths from the test set
import random

num_images_to_plot = 3
test_image_path_list = list(Path(test_dir).glob('*/*.jpg'))
random_sample_paths = random.sample(population=test_image_path_list, k=num_images_to_plot)
for sample_path in random_sample_paths:
  pred_and_plot_image(model=model,
                      image_path=str(sample_path),
                      class_names=class_names,
                      image_size=(224, 224),
                      device=device)

### 6.1 Making predictions on a custom image

In [None]:
pred_and_plot_image(model=model,
                    image_path='./test.jpg',
                    class_names=class_names,
                    device=device,
                    image_size=(224, 224))

In [None]:
# Download custom image
import requests

# Setup custom image path
custom_image_path = data_path / "04-pizza-dad.jpeg"

# Download the image if it doesn't already exist
if not custom_image_path.is_file():
    with open(custom_image_path, "wb") as f:
        # When downloading from GitHub, need to use the "raw" file link
        request = requests.get("https://raw.githubusercontent.com/mrdbourke/pytorch-deep-learning/main/images/04-pizza-dad.jpeg")
        print(f"Downloading {custom_image_path}...")
        f.write(request.content)
else:
    print(f"{custom_image_path} already exists, skipping download.")

# Predict on custom image
pred_and_plot_image(model=model,
                    image_path=custom_image_path,
                    class_names=class_names)