# ResNet

This notebook is a guide to fine-tuning the **ResNet50** model. The actual training ran on **Google Colab** in order to use a GPU, so only some of the cells below have been executed.

In [1]:
import torch

from footvid.utils.env import check_repository_path


REPOSITORY_PATH = check_repository_path()
print(
    f"Is cuda available? {torch.cuda.is_available()}."
)
DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
N_POS_TRAIN = 1174
N_NEG_TRAIN = 893

Is cuda available? False.


## Data preparation

Video frames should be prepared according to the previous [01-train-valid-split](https://github.com/mrtovsky/footvid/blob/master/notebooks/01-train-valid-split.ipynb) notebook and the final data **processed** folder structure should look as follows:

In [2]:
!tree -d ../data/processed/

../data/processed/
├── train
│   ├── neg
│   └── pos
└── valid
    ├── neg
    └── pos

6 directories


In [3]:
from torchvision import datasets

from footvid.preprocessing import TEST_TRANSFORMS, TRAIN_TRANSFORMS


train_images = datasets.ImageFolder(
    root=REPOSITORY_PATH.joinpath("data", "processed", "train"),
    transform=TRAIN_TRANSFORMS,
)

valid_images = datasets.ImageFolder(
    root=REPOSITORY_PATH.joinpath("data", "processed", "valid"),
    transform=TEST_TRANSFORMS,
)

Let's check which index corresponds to which class, so that it is clear which label the model treats as the positive one.

In [4]:
print("Train: ", train_images.class_to_idx)
print("Valid: ", valid_images.class_to_idx)

Train:  {'neg': 0, 'pos': 1}
Valid:  {'neg': 0, 'pos': 1}
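
The mapping above follows from how `ImageFolder` assigns indices: it sorts the class directory names and enumerates them. The sketch below reproduces that rule in plain Python. The helper name `class_to_idx_from_names` is hypothetical and exists only for illustration.

```python
def class_to_idx_from_names(class_names):
    """Map class names to indices the way ImageFolder does: sorted, then enumerated."""
    return {name: idx for idx, name in enumerate(sorted(class_names))}


# The directory listing order does not matter, because names are sorted first.
mapping = class_to_idx_from_names(["pos", "neg"])
print(mapping)  # {'neg': 0, 'pos': 1}
```

Because `neg` sorts before `pos`, the positive class gets index 1, which is what a single-logit binary classifier expects.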


In [5]:
from torch.utils.data import DataLoader


train_dataloader = DataLoader(
    dataset=train_images, batch_size=64, shuffle=True, num_workers=2
)
valid_dataloader = DataLoader(
    dataset=valid_images, batch_size=64, shuffle=False, num_workers=2
)

## Modeling

The native **PyTorch** implementation of **ResNet50** has been slightly modified to fit the problem. The output layer now has a single unit to match the binary classification task, and a hook has been added after the convolutional layers to make **Grad-CAM** visualization easier. The pre-trained weights are left unchanged.

In [6]:
from footvid.models import ResNet


model = ResNet(output_size=1)
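
To illustrate the two modifications described above, here is a minimal sketch on a toy network. It is not the `footvid.models.ResNet` code: `TinyBinaryNet` and its layer sizes are hypothetical, and registering a tensor hook inside `forward` is just one common way to capture the gradients that Grad-CAM needs.

```python
import torch
import torch.nn as nn


class TinyBinaryNet(nn.Module):
    """Hypothetical stand-in for a CNN with a single-logit head and a gradient hook."""

    def __init__(self, output_size: int = 1):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, padding=1),
            nn.ReLU(),
        )
        self.avgpool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(8, output_size)  # one logit for binary classification
        self.gradients = None  # filled in during the backward pass

    def _save_gradients(self, grad):
        self.gradients = grad

    def forward(self, x):
        activations = self.features(x)
        # Capture d(output)/d(activations) after the convolutional part.
        if activations.requires_grad:
            activations.register_hook(self._save_gradients)
        pooled = self.avgpool(activations).flatten(1)
        return self.fc(pooled)


torch.manual_seed(0)
net = TinyBinaryNet(output_size=1)
logits = net(torch.randn(2, 3, 16, 16))
logits.sum().backward()
print(tuple(logits.shape), tuple(net.gradients.shape))
```

After the backward pass, `net.gradients` has the same shape as the last convolutional feature map, which is the quantity Grad-CAM averages to weight the activation channels.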

## Fine-tuning

Drawing inspiration from [Karpathy et al.](https://static.googleusercontent.com/media/research.google.com/en//pubs/archive/42455.pdf), we want to test three fine-tuning variants:
- fine-tune the fully-connected layer (FCL) only,
- fine-tune the top two CNN layers as well as the FCL,
- fine-tune all layers.

The [footvid.arena.freeze_layers](https://github.com/mrtovsky/footvid/blob/master/footvid/arena.py#L20) function will come in handy.

In [7]:
from footvid.arena import freeze_layers


model_fcl = freeze_layers(model, last_layer="avgpool", inplace=False)
model_fcl = model_fcl.to(DEVICE)
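
For readers without the repository at hand, the idea behind freezing can be sketched as follows. This is an assumption about the behaviour, not the actual `footvid.arena.freeze_layers` implementation: the hypothetical `freeze_up_to` helper disables gradients for every top-level child up to and including `last_layer`, and works on a copy unless `inplace=True`.

```python
import copy

import torch.nn as nn


def freeze_up_to(model: nn.Module, last_layer: str, inplace: bool = False) -> nn.Module:
    """Freeze top-level children up to and including `last_layer` (hypothetical helper)."""
    names = [name for name, _ in model.named_children()]
    if last_layer not in names:
        raise ValueError(f"Unknown layer {last_layer!r}; expected one of {names}.")
    target = model if inplace else copy.deepcopy(model)
    for name, child in target.named_children():
        for param in child.parameters():
            param.requires_grad = False
        if name == last_layer:
            break
    return target


# Toy model that mimics the conv -> avgpool -> fc layout.
toy = nn.Sequential()
toy.add_module("conv1", nn.Conv2d(3, 4, kernel_size=3))
toy.add_module("avgpool", nn.AdaptiveAvgPool2d(1))
toy.add_module("flatten", nn.Flatten())
toy.add_module("fc", nn.Linear(4, 1))

frozen = freeze_up_to(toy, last_layer="avgpool")
trainable = [name for name, p in frozen.named_parameters() if p.requires_grad]
print(trainable)  # ['fc.weight', 'fc.bias']
```

Only the final linear layer stays trainable, and the original `toy` model is untouched because the helper operated on a deep copy.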

In [None]:
import torch.nn as nn
import torch.optim as optim
from torch.utils.tensorboard import SummaryWriter

from footvid.arena import run_experiment, TrainTestDataloaders


# Directory for the experiment artifacts; create missing parent folders as well.
artifacts_dir = REPOSITORY_PATH.joinpath("models", "fcl-resnet-fine-tuning")
artifacts_dir.mkdir(parents=True, exist_ok=True)
optimizer = optim.SGD(model_fcl.parameters(), lr=1e-3)
# Multiply the learning rate by 0.1 at epochs 5 and 10.
scheduler = optim.lr_scheduler.MultiStepLR(optimizer, milestones=[5, 10], gamma=0.1)
# The model outputs a single logit, so the sigmoid is folded into the loss.
objective = nn.BCEWithLogitsLoss()
train_test_dataloaders = TrainTestDataloaders(train=train_dataloader, test=valid_dataloader)
writer = SummaryWriter(log_dir=REPOSITORY_PATH.joinpath("logs", "fcl-resnet-fine-tuning"))

run_experiment(
    model=model_fcl,
    dataloaders=train_test_dataloaders,
    device=DEVICE,
    optimizer=optimizer,
    objective=objective,
    epochs=20,
    threshold=N_POS_TRAIN / (N_POS_TRAIN + N_NEG_TRAIN),
    scheduler=scheduler,
    artifacts_dir=artifacts_dir,
    writer=writer,
)
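
Two of the settings above are easier to follow with numbers. The sketch below computes the learning rate that `MultiStepLR` yields per epoch and the decision threshold derived from the class counts. It assumes that `run_experiment` steps the scheduler once per epoch and that `threshold` is a cutoff applied to the sigmoid of the logit; neither is confirmed by this notebook. The helpers `multistep_lr` and `predict_label` are hypothetical.

```python
import math
from bisect import bisect_right

BASE_LR = 1e-3
MILESTONES = [5, 10]
GAMMA = 0.1
N_POS_TRAIN = 1174
N_NEG_TRAIN = 893


def multistep_lr(epoch: int) -> float:
    """Closed form of MultiStepLR: decay by GAMMA at every milestone already reached."""
    return BASE_LR * GAMMA ** bisect_right(MILESTONES, epoch)


def sigmoid(logit: float) -> float:
    return 1.0 / (1.0 + math.exp(-logit))


# Share of positive frames in the training set, used as the decision cutoff.
threshold = N_POS_TRAIN / (N_POS_TRAIN + N_NEG_TRAIN)


def predict_label(logit: float, cutoff: float = threshold) -> int:
    """Assumed decision rule: positive if the predicted probability reaches the cutoff."""
    return int(sigmoid(logit) >= cutoff)


for epoch in (0, 4, 5, 9, 10, 19):
    print(epoch, f"{multistep_lr(epoch):.0e}")
print(f"threshold = {threshold:.3f}")
```

Under these assumptions, epochs 0 to 4 train with a learning rate of 1e-3, epochs 5 to 9 with 1e-4, and the remaining epochs with 1e-5. The threshold is roughly 0.568, so a logit of 0 (probability 0.5) is classified as negative.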