 # Home Work 2

In this hw assignment you have to train a classifier to distinguish between different food variations.


## But first... Theory
Solve the theoretical problems. Type the solutions here, using Latex

### Problem 1 (2 points)

Compute the result of convolution operation with kernel K to a matrix X.
Convolution params are: stride=2, dilation=2, padding=1 (padding with zeros)


$        X = \begin{bmatrix}
        1 & 0 & -4 & 2 \\
        5 & 2 & 3 & 0 \\
        -1 & 0 & 1 & 4 \\
        0 & -3 & 2 & -1
    \end{bmatrix}
    K = \begin{bmatrix}
        2 & 1 \\
        -1 & -2
    \end{bmatrix}
$

#### Solution

Add padding, so we have:
$    X^\text{pad} = \begin{bmatrix}
        0 & 0 & 0 & 0 & 0 & 0 \\
        0 & 1 & 0 & -4 & 2 & 0 \\
        0 & 5 & 2 & 3 & 0 & 0 \\
        0 & -1 & 0 & 1 & 4 & 0 \\
        0 & 0 & -3 & 2 & -1 & 0 \\
        0 & 0 & 0 & 0 & 0 & 0
    \end{bmatrix}
$

Add dilation to the kernel K:
$     K^\text{dil} = \begin{bmatrix}
        2 & 0 & 1 \\
        0 & 0 & 0 \\
        -1 & 0 & -2
    \end{bmatrix}
$

Stride means we move the kernel $K$ for 2 elements. Thus we do:

$   \begin{bmatrix}
        0 & 0 & 0  \\
        0 & 1 & 0  \\
        0 & 5 & 2 \\
    \end{bmatrix}
$ *
$   \begin{bmatrix}
        2 & 0 & 1 \\
        0 & 0 & 0 \\
        -1 & 0 & -2
    \end{bmatrix}   
$ = $0 + (-2)*2 = -4$

Move kernel 2 elements right:

$   \begin{bmatrix}
        0 & 0 & 0 \\
        0 & -4 & 2 \\
        2 & 3 & 0 \\
    \end{bmatrix}
$ *
$   \begin{bmatrix}
        2 & 0 & 1 \\
        0 & 0 & 0 \\
        -1 & 0 & -2
    \end{bmatrix}  
$ = $ 0 + (-1)*2 = -2$

\\
Move kernel 2 elements down: $\textbf{I'll not suffer any longer and write this fully}$. We get $2 + 6 = 8$, move kernel 2 elements right again and obtain $4 + 3 + 2 = 9$.

In result we have
$   \begin{bmatrix}
        -4 & -2 \\
        8 & 9
    \end{bmatrix}
$

In [1]:
import numpy as np
import torch
import torch.nn.functional as F

In [2]:
X = np.array([
    [1, 0, -4, 2],
    [5, 2, 3, 0],
    [-1, 0, 1, 4],
    [0, -3, 2, -1]])

K = np.array([
    [2, 1],
    [-1, -2]])

# Add extra dimension for the channel to fit conv2d
X_tensor = torch.Tensor(X).unsqueeze(0).unsqueeze(0)
K_tensor = torch.Tensor(K).unsqueeze(0).unsqueeze(0)

# conv2d
output = F.conv2d(X_tensor, K_tensor, stride=2, padding=1, dilation=2, )
print(output[0,0])

tensor([[-4., -2.],
        [ 8.,  9.]])


### Problem 2 (2 points)

Count the number of trainable params in nn:

        model = nn.Sequential(
            nn.Conv2d(
                in_channels=3, out_channels=16, kernel_size=5,
                stride=2, padding=0, dilation=1, bias=True
            ),
            nn.BatchNorm2d(num_features=16),
            nn.LeakyReLU(0.1),
            nn.Conv2d(
                in_channels=16, out_channels=32, kernel_size=5,
                stride=1, padding=1, dilation=2, bias=False
            ),
            nn.BatchNorm2d(num_features=32),
            nn.Sigmoid(),
        )
  

#### Solution

We'll need to sum up all the parameters in each layer

1. `nn.Conv2d(in_channels=3, out_channels=16, kernel_size=5, stride=2, padding=0, dilation=1, bias=True)`:
   - Number of parameters = (kernel_size^2 * in_channels + bias) * out_channels
   - Number of parameters = $(5^2 \times 3 + 1) \times 16 = 1216$

2. `nn.BatchNorm2d(num_features=16)`:
   - Number of parameters = num_features * 2
   - Number of parameters = $16 \times 2 = 32$

3. `nn.LeakyReLU(0.1)` does not have trainable parameters.

4. `nn.Conv2d(in_channels=16, out_channels=32, kernel_size=5, stride=1, padding=1, dilation=2, bias=False)`:
   - Number of parameters = (kernel_size^2 * in_channels) * out_channels
   - Number of parameters = $(5^2 \times 16) \times 32 = 12800$

5. `nn.BatchNorm2d(num_features=32)`:
   - Number of parameters = num_features * 2
   - Number of parameters = $32 \times 2 = 64$

6. `nn.Sigmoid()` does not have trainable parameters.

Thus, $1216 + 32 + 12800 + 64 = 14112$

In [3]:
import torch.nn as nn

model = nn.Sequential(
    nn.Conv2d(
        in_channels=3, out_channels=16, kernel_size=5,
        stride=2, padding=0, dilation=1, bias=True
    ),
    nn.BatchNorm2d(num_features=16),
    nn.LeakyReLU(0.1),
    nn.Conv2d(
        in_channels=16, out_channels=32, kernel_size=5,
        stride=1, padding=1, dilation=2, bias=False
    ),
    nn.BatchNorm2d(num_features=32),
    nn.Sigmoid(),
)

total_params = 0
for name, parameter in model.named_parameters():
    if not parameter.requires_grad: continue
    param = parameter.numel()
    print(f"{name}:\t{param}")
    total_params += param

print(f"Total number of trainable parameters: {total_params}")

0.weight:	1200
0.bias:	16
1.weight:	16
1.bias:	16
3.weight:	12800
4.weight:	32
4.bias:	32
Total number of trainable parameters: 14112


### Practical problem


Solve multicalss classification problem for Food101 dataset

####  Helper code

In [4]:
import copy
import random
import os
import shutil
import tarfile
from urllib.request import urlretrieve

import numpy as np
import pandas as pd
import cv2
from tqdm import tqdm

import torch
import torch.nn as nn
import torch.optim as optim
import torchvision

from torchvision import datasets, models
from torchvision.transforms import v2
from torch.utils.data import random_split, DataLoader, Dataset

pd.set_option('display.max_colwidth', None)

In [5]:
class TqdmUpTo(tqdm):
    def update_to(self, b=1, bsize=1, tsize=None):
        if tsize is not None:
            self.total = tsize
        self.update(b * bsize - self.n)


def download_url(url, filepath):
    directory = os.path.dirname(os.path.abspath(filepath))
    os.makedirs(directory, exist_ok=True)
    if os.path.exists(filepath):
        print("Filepath already exists. Skipping download.")
        return

    with TqdmUpTo(unit="B", unit_scale=True, unit_divisor=1024, miniters=1, desc=os.path.basename(filepath)) as t:
        urlretrieve(url, filename=filepath, reporthook=t.update_to, data=None)
        t.total = t.n


def extract_archive(filepath):
    extract_dir = os.path.dirname(os.path.abspath(filepath))
    shutil.unpack_archive(filepath, extract_dir)

In [6]:
url = "http://data.vision.ee.ethz.ch/cvl/food-101.tar.gz"

In [7]:
dataset_directory = os.path.join(os.environ["HOME"], "datasets/food101")

In [8]:
filepath = os.path.join(dataset_directory, "food101.tar.gz")
download_url(
    url=url,
    filepath=filepath,
)
%time extract_archive(filepath)

food101.tar.gz: 100%|██████████| 4.65G/4.65G [02:28<00:00, 33.6MB/s]


CPU times: user 49.9 s, sys: 17.9 s, total: 1min 7s
Wall time: 1min 7s


In [9]:
filepath

'/root/datasets/food101/food101.tar.gz'

In [10]:
!ls /root/datasets/food101/food-101/meta

classes.txt  labels.txt  test.json  test.txt  train.json  train.txt


In [11]:
!head -n10 /root/datasets/food101/food-101/meta/classes.txt

apple_pie
baby_back_ribs
baklava
beef_carpaccio
beef_tartare
beet_salad
beignets
bibimbap
bread_pudding
breakfast_burrito


### Dataloaders (1 point)


**Изменил `transforms` на `transforms.v2`**

In [12]:
NUM_WORKERS = os.cpu_count()


def split_data(dataset, val_size=0.2, seed=42):
    """
    """
    generator = torch.Generator().manual_seed(seed)
    train_data, test_data = random_split(dataset, [1 - val_size, val_size], generator=generator)

    return train_data, test_data


def create_dataloaders(
    data_dir: str,
    transform: v2.Compose,
    batch_size: int,
    num_workers: int=NUM_WORKERS):
  """Creates training and testing DataLoaders.

  Takes in a training directory and testing directory path and turns
  them into PyTorch Datasets and then into PyTorch DataLoaders.

  Args:
    train_dir: Path to training directory.
    test_dir: Path to testing directory.
    transform: torchvision transforms to perform on training and testing data.
    batch_size: Number of samples per batch in each of the DataLoaders.
    num_workers: An integer for number of workers per DataLoader.

  Returns:
    A tuple of (train_dataloader, test_dataloader, class_names).
    Where class_names is a list of the target classes.
    Example usage:
      train_dataloader, test_dataloader, class_names = \
        = create_dataloaders(train_dir=path/to/train_dir,
                             test_dir=path/to/test_dir,
                             transform=some_transform,
                             batch_size=32,
                             num_workers=4)
  """
  # Use ImageFolder of define your Dataset class here
    
  test_transform = v2.Compose([v2.Resize((224, 224)),
                              v2.ConvertImageDtype(torch.float32),
                              v2.ToTensor(),
                              v2.Normalize(mean=[0.485, 0.456, 0.406],std=[0.229, 0.224, 0.225])])

  train_dataset = torchvision.datasets.ImageFolder(image_path, transform=transform)
  test_dataset = torchvision.datasets.ImageFolder(image_path, transform=test_transform)

  train_set, _ = split_data(train_dataset)
  _, test_set = split_data(test_dataset)

  with open("/root/datasets/food101/food-101/meta/classes.txt") as fuck:
    class_names = np.array([x.strip() for x in fuck.readlines()])

  # Turn images into data loaders
  train_dataloader = DataLoader(
      train_set,
      batch_size=batch_size,
      shuffle=True,
      num_workers=num_workers,
      pin_memory=True, 
  )
  test_dataloader = DataLoader(
      test_set,
      batch_size=batch_size,
      shuffle=True,
      num_workers=num_workers,
      pin_memory=True,
  )

  return train_dataloader, test_dataloader, class_names

In [13]:
try:
    from torchinfo import summary
except:
    print("[INFO] Couldn't find torchinfo... installing it.")
    !pip install -q torchinfo
    from torchinfo import summary

In [14]:
from pathlib import Path

data_path = Path(dataset_directory)
image_path = data_path / "food-101" / "images"

Define transforms. You can add augmentations for better perfomance.

 You can either use Albumentation or torchvision

Don't forget to normalize images

In [15]:
# Create a transforms pipeline manually (required for torchvision < 0.13)
manual_transforms = v2.Compose([
    #YOUR CODE HERE#, # 1. Reshape data HxW to fit model
    v2.Resize((224, 224)),
    #YOUR CODE HERE#, # 2. Turn image values to between 0 & 1
    v2.ConvertImageDtype(torch.float32),
    v2.RandomHorizontalFlip(p=0.5),
    v2.ColorJitter(brightness=0.1, contrast=0.1, saturation=0.1, hue=0.1),
    v2.RandomRotation(degrees=15),
    v2.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=(0.9, 1.1), shear=5),
    # v2.ToDtype(torch.float32, scale=True),
    v2.ToTensor(),
    v2.Normalize(mean=[0.485, 0.456, 0.406], # 3. A mean of [0.485, 0.456, 0.406] (across each colour channel)
                         std=[0.229, 0.224, 0.225]) # 4. A standard deviation of [0.229, 0.224, 0.225] (across each colour channel),
])



In [16]:
train_data, test_data = split_data(torchvision.datasets.ImageFolder(image_path, transform=manual_transforms))

In [17]:
!ls /root/datasets/food101/food-101/images

apple_pie	    eggs_benedict	     onion_rings
baby_back_ribs	    escargots		     oysters
baklava		    falafel		     pad_thai
beef_carpaccio	    filet_mignon	     paella
beef_tartare	    fish_and_chips	     pancakes
beet_salad	    foie_gras		     panna_cotta
beignets	    french_fries	     peking_duck
bibimbap	    french_onion_soup	     pho
bread_pudding	    french_toast	     pizza
breakfast_burrito   fried_calamari	     pork_chop
bruschetta	    fried_rice		     poutine
caesar_salad	    frozen_yogurt	     prime_rib
cannoli		    garlic_bread	     pulled_pork_sandwich
caprese_salad	    gnocchi		     ramen
carrot_cake	    greek_salad		     ravioli
ceviche		    grilled_cheese_sandwich  red_velvet_cake
cheese_plate	    grilled_salmon	     risotto
cheesecake	    guacamole		     samosa
chicken_curry	    gyoza		     sashimi
chicken_quesadilla  hamburger		     scallops
chicken_wings	    hot_and_sour_soup	     seaweed_salad
chocolate_cake	    hot_dog		     shrimp_and_grits
cho

In [18]:
train_data.dataset, len(train_data.indices), test_data.dataset, len(test_data.indices)

(Dataset ImageFolder
     Number of datapoints: 101000
     Root location: /root/datasets/food101/food-101/images
     StandardTransform
 Transform: Compose(
                  Resize(size=[224, 224], interpolation=InterpolationMode.BILINEAR, antialias=warn)
                  ConvertImageDtype()
                  RandomHorizontalFlip(p=0.5)
                  ColorJitter(brightness=(0.9, 1.1), contrast=(0.9, 1.1), saturation=(0.9, 1.1), hue=(-0.1, 0.1))
                  RandomRotation(degrees=[-15.0, 15.0], interpolation=InterpolationMode.NEAREST, expand=False, fill=0)
                  RandomAffine(degrees=[0.0, 0.0], translate=(0.1, 0.1), scale=(0.9, 1.1), shear=[-5.0, 5.0], interpolation=InterpolationMode.NEAREST, fill=0)
                  ToTensor()
                  Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], inplace=False)
            ),
 80800,
 Dataset ImageFolder
     Number of datapoints: 101000
     Root location: /root/datasets/food101/food-101/images
  

In [19]:
batch_size = 128
num_workers = 2

 #### Train function (2 points)




In [20]:
!pip -q install torcheval torcheval-nightly

In [21]:
import torch
import torch.nn.functional as F

from tqdm.auto import tqdm
from typing import Dict, List, Tuple

def train_step(model: torch.nn.Module,
               dataloader: torch.utils.data.DataLoader,
               loss_fn: torch.nn.Module,
               optimizer: torch.optim.Optimizer,
               device: torch.device) -> Tuple[float, float]:
    """Trains a PyTorch model for a single epoch.

    Turns a target PyTorch model to training mode and then
    runs through all of the required training steps (forward
    pass, loss calculation, optimizer step).

    Args:
    model: A PyTorch model to be trained.
    dataloader: A DataLoader instance for the model to be trained on.
    loss_fn: A PyTorch loss function to minimize.
    optimizer: A PyTorch optimizer to help minimize the loss function.
    device: A target device to compute on (e.g. "cuda" or "cpu").

    Returns:
    A tuple of training loss and training accuracy metrics.
    In the form (train_loss, train_accuracy). For example:

    (0.1112, 0.8743)
    """
    # Put model in train mode
    model.train()

    # Setup train loss and train accuracy values
    # f1 = multiclass_f1_score()

    # Loop through data loader data batches
    losses, acc = 0., 0.
    for batch, (X, y) in enumerate(dataloader):
        # Send data to target device
        X, y = X.to(device), y.to(device)

        # 1. Forward pass
        output = model(X)
        # output = F.softmax(output, dim=-1)
        # classes = torch.argmax(output, dim=-1)

        # 2. Calculate and accumulate loss
        loss = loss_fn(output, y.long())
        losses += loss.item()

        # 3. Optimizer zero grad
        optimizer.zero_grad()

        # 4. Loss backward
        loss.backward()

        # 5. Optimizer step
        optimizer.step()

        # Calculate and accumulate accuracy metric across all batches
        acc += (output.argmax(dim=1) == y).float().mean().item()
    # Adjust metrics to get average loss and accuracy per batch
    return (losses/len(dataloader), acc/len(dataloader))

@torch.inference_mode()
def test_step(model: torch.nn.Module,
              dataloader: torch.utils.data.DataLoader,
              loss_fn: torch.nn.Module,
              device: torch.device) -> Tuple[float, float]:
    """Tests a PyTorch model for a single epoch.

    Turns a target PyTorch model to "eval" mode and then performs
    a forward pass on a testing dataset.

    Args:
    model: A PyTorch model to be tested.
    dataloader: A DataLoader instance for the model to be tested on.
    loss_fn: A PyTorch loss function to calculate loss on the test data.
    device: A target device to compute on (e.g. "cuda" or "cpu").

    Returns:
    A tuple of testing loss and testing accuracy metrics.
    In the form (test_loss, test_accuracy). For example:

    (0.0223, 0.8985)
    """
    # Put model in eval mode
    model.eval()

    # Do the rest
    # Hint: torch.no_grad() / torch.inference_mode()
    losses, acc = 0., 0.
    for batch, (X, y) in enumerate(dataloader):
        X, y = X.to(device), y.to(device)
        output = model(X)
        # output = F.softmax(output, dim=-1)

        losses += loss_fn(output, y.long()).item()
        acc += (output.argmax(dim=1) == y).float().mean().item()

    return (losses/len(dataloader), acc/len(dataloader))

def train(model: torch.nn.Module,
          train_dataloader: torch.utils.data.DataLoader,
          test_dataloader: torch.utils.data.DataLoader,
          optimizer: torch.optim.Optimizer,
          scheduler: torch.optim.lr_scheduler._LRScheduler,
          loss_fn: torch.nn.Module,
          epochs: int,
          device: torch.device) -> Dict[str, List]:
    """Trains and tests a PyTorch model.

    Passes a target PyTorch models through train_step() and test_step()
    functions for a number of epochs, training and testing the model
    in the same epoch loop.

    Calculates, prints and stores evaluation metrics throughout.

    Args:
    model: A PyTorch model to be trained and tested.
    train_dataloader: A DataLoader instance for the model to be trained on.
    test_dataloader: A DataLoader instance for the model to be tested on.
    optimizer: A PyTorch optimizer to help minimize the loss function.
    loss_fn: A PyTorch loss function to calculate loss on both datasets.
    epochs: An integer indicating how many epochs to train for.
    device: A target device to compute on (e.g. "cuda" or "cpu").

    Returns:
    A dictionary of training and testing loss as well as training and
    testing accuracy metrics. Each metric has a value in a list for
    each epoch.
    In the form: {train_loss: [...],
              train_acc: [...],
              test_loss: [...],
              test_acc: [...]}
    For example if training for epochs=2:
             {train_loss: [2.0616, 1.0537],
              train_acc: [0.3945, 0.3945],
              test_loss: [1.2641, 1.5706],
              test_acc: [0.3400, 0.2973]}
    """
    # Create empty results dictionary
    results = {"train_loss": [],
               "train_acc": [],
               "test_loss": [],
               "test_acc": []
    }

    # Make sure model on target device
    model.to(device)

    # Loop through training and testing steps for a number of epochs
    for epoch in tqdm(range(epochs)):
        train_loss, train_acc = train_step(model=model,
                                          dataloader=train_dataloader,
                                          loss_fn=loss_fn,
                                          optimizer=optimizer,
                                          device=device)

        test_loss, test_acc = test_step(model=model,
                                        dataloader=test_dataloader,
                                        loss_fn=loss_fn,
                                        device=device)
        
        scheduler.step(test_loss)

        # Print out what's happening
        print(
          f"Epoch: {epoch+1} | "
          f"train_loss: {train_loss:.4f} | "
          f"train_acc: {train_acc:.4f} | "
          f"test_loss: {test_loss:.4f} | "
          f"test_acc: {test_acc:.4f}"
        )

        # Update results dictionary
        results["train_loss"].append(train_loss)
        results["train_acc"].append(train_acc)

        results["test_loss"].append(test_loss)
        results["test_acc"].append(test_acc)

        # YOUR #
        # GOES #
        # CODE #
        # HERE #


        torch.save({
                'epoch': epochs,
                'model_state_dict': model.state_dict(),
                'optimizer_state_dict': optimizer.state_dict(),
                'loss': loss_fn,
                }, f"{epochs}_iter.pth")

    # Return the filled results at the end of the epochs
    return results

Choose a model

### ResNet

In [22]:
model = models.resnet50(pretrained=True)
model

Downloading: "https://download.pytorch.org/models/resnet50-0676ba61.pth" to /root/.cache/torch/hub/checkpoints/resnet50-0676ba61.pth
100%|██████████| 97.8M/97.8M [00:00<00:00, 139MB/s]


ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 

In [23]:
for param in model.layer1.parameters():
    param.requires_grad = False
    
for param in model.layer2.parameters():
    param.requires_grad = False
    
for param in model.layer3.parameters():
    param.requires_grad = False
    
model.fc = nn.Linear(2048, 101)

In [24]:
summary(model,
        input_size=(32, 3, 224, 224), # make sure this is "input_size", not "input_shape" (batch_size, color_channels, height, width)
        verbose=0,
        col_names=["input_size", "output_size", "num_params", "trainable"],
        col_width=20,
        row_settings=["var_names"]
)

Layer (type (var_name))                  Input Shape          Output Shape         Param #              Trainable
ResNet (ResNet)                          [32, 3, 224, 224]    [32, 101]            --                   Partial
├─Conv2d (conv1)                         [32, 3, 224, 224]    [32, 64, 112, 112]   9,408                True
├─BatchNorm2d (bn1)                      [32, 64, 112, 112]   [32, 64, 112, 112]   128                  True
├─ReLU (relu)                            [32, 64, 112, 112]   [32, 64, 112, 112]   --                   --
├─MaxPool2d (maxpool)                    [32, 64, 112, 112]   [32, 64, 56, 56]     --                   --
├─Sequential (layer1)                    [32, 64, 56, 56]     [32, 256, 56, 56]    --                   False
│    └─Bottleneck (0)                    [32, 64, 56, 56]     [32, 256, 56, 56]    --                   False
│    │    └─Conv2d (conv1)               [32, 64, 56, 56]     [32, 64, 56, 56]     (4,096)              False
│    │    └─

### Train (1 point)

In [25]:
from torch.optim.lr_scheduler import ReduceLROnPlateau

In [26]:
loss_fn = nn.CrossEntropyLoss()
# optimizer = torch.optim.Adam(list(model.features.parameters())[-layers_to_unfreeze:] +
#                       list(model.classifier.parameters()), lr=5e-2)
optimizer = torch.optim.Adam(list(model.layer4.parameters()), lr=5e-2)
scheduler = ReduceLROnPlateau(optimizer, mode="max", min_lr=1e-4, factor=0.5, patience=2)
n_epochs = 25

In [27]:
manual_transforms

Compose(
      Resize(size=[224, 224], interpolation=InterpolationMode.BILINEAR, antialias=warn)
      ConvertImageDtype()
      RandomHorizontalFlip(p=0.5)
      ColorJitter(brightness=(0.9, 1.1), contrast=(0.9, 1.1), saturation=(0.9, 1.1), hue=(-0.1, 0.1))
      RandomRotation(degrees=[-15.0, 15.0], interpolation=InterpolationMode.NEAREST, expand=False, fill=0)
      RandomAffine(degrees=[0.0, 0.0], translate=(0.1, 0.1), scale=(0.9, 1.1), shear=[-5.0, 5.0], interpolation=InterpolationMode.NEAREST, fill=0)
      ToTensor()
      Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225], inplace=False)
)

In [28]:
train_dataloader, test_dataloader, class_names = create_dataloaders(None, manual_transforms, batch_size, num_workers)

In [29]:
device = 'cuda' if torch.cuda.is_available() else 'cpu'
device

'cuda'

In [30]:
torch.manual_seed(42)
torch.cuda.manual_seed(42)

# Start the timer
from timeit import default_timer as timer
start_time = timer()

# Setup training and save the results
results = train(model=model,
                       train_dataloader=train_dataloader,
                       test_dataloader=test_dataloader,
                       optimizer=optimizer,
                       scheduler=scheduler,
                       loss_fn=loss_fn,
                       epochs=n_epochs,
                       device=device)

# End the timer and print out how long it took
end_time = timer()
print(f"[INFO] Total training time: {end_time-start_time:.3f} seconds")

  0%|          | 0/25 [00:00<?, ?it/s]

Epoch: 1 | train_loss: 2.2032 | train_acc: 0.4471 | test_loss: 1.7583 | test_acc: 0.5568
Epoch: 2 | train_loss: 1.4863 | train_acc: 0.6104 | test_loss: 1.4084 | test_acc: 0.6401
Epoch: 3 | train_loss: 1.2916 | train_acc: 0.6568 | test_loss: 1.2804 | test_acc: 0.6698
Epoch: 4 | train_loss: 1.1672 | train_acc: 0.6888 | test_loss: 1.2816 | test_acc: 0.6747
Epoch: 5 | train_loss: 0.9711 | train_acc: 0.7381 | test_loss: 1.1184 | test_acc: 0.7118
Epoch: 6 | train_loss: 0.9134 | train_acc: 0.7522 | test_loss: 1.0733 | test_acc: 0.7250
Epoch: 7 | train_loss: 0.8672 | train_acc: 0.7618 | test_loss: 1.0815 | test_acc: 0.7262
Epoch: 8 | train_loss: 0.7576 | train_acc: 0.7907 | test_loss: 1.0238 | test_acc: 0.7408
Epoch: 9 | train_loss: 0.7197 | train_acc: 0.8008 | test_loss: 1.0155 | test_acc: 0.7455
Epoch: 10 | train_loss: 0.6921 | train_acc: 0.8060 | test_loss: 1.0351 | test_acc: 0.7440
Epoch: 11 | train_loss: 0.6228 | train_acc: 0.8276 | test_loss: 0.9809 | test_acc: 0.7562
Epoch: 12 | train_l

### Results (2 points in total)
Plot train and val losses

Inference model on unique subset of images


It must be very obvious that you specificaly using images that model has never seen neither during training nor evaluation steps




#### Graphs (1 point)
Plot graphs for train and val loss



In [31]:
import plotly.express as px
import pandas as pd

In [32]:
df = pd.DataFrame(results)

fig = px.line(df, x=df.index, y=["train_loss", "test_loss"], title="Training and Testing Loss Over Epochs", color_discrete_sequence=["darkred", "green"])
fig.show()

fig = px.line(df, x=df.index, y=["train_acc", "test_acc"], title="Training and Testing Accuracy Over Epochs", color_discrete_sequence=["darkred", "green"])
fig.show()

#### Demonstration (1 point)

Show model prediction on several images