<a href="https://colab.research.google.com/github/jdh4/resnet50/blob/master/day5_computer_vision_hackathon_notebook3_transfer_learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Introduction to Machine Learning  
**Computer Vision Hackathon  
Wintersession  
Tuesday, January 24, 2023**

In this notebook you will use transfer learning to reach very high accuracy on the cats-versus-dogs problem. The idea is to take a large CNN trained on vast amounts of data, freeze its lower layers, and retrain only the top layers. In this way the learning done previously is transferred to our problem. We will use the ResNet-50 model.

# About Your Colab Session

Learn about the CPU cores for your session:

In [None]:
!cat /proc/cpuinfo

In [None]:
import os
num_cores = min(os.cpu_count(), 2)
print(num_cores)

Let's see which GPU we are using (probably a Tesla T4):

In [None]:
!nvidia-smi

# Data Preparation

In [None]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torchvision import datasets, transforms, models
from torch.optim.lr_scheduler import StepLR
from PIL import Image

We want to use a GPU when one is available:

In [None]:
use_cuda = torch.cuda.is_available()
print(use_cuda)

In [None]:
torch.manual_seed(42)
device = torch.device("cuda") if use_cuda else torch.device("cpu")

train_kwargs = {'batch_size': 64}
test_kwargs  = {'batch_size': 128}
if use_cuda:
    cuda_kwargs = {'num_workers': num_cores, 'pin_memory': True}
    train_kwargs.update(cuda_kwargs)
    test_kwargs.update(cuda_kwargs)

Download and unpack the data:

In [None]:
!wget https://tigress-web.princeton.edu/~jdh4/cats_vs_dogs.tar
!tar xf cats_vs_dogs.tar

In [None]:
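# Normalize with the ImageNet channel means and standard deviations,
# which match the data the pretrained ResNet-50 weights were trained on
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))])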
dataset1 = datasets.ImageFolder(root="./training_set/", transform=transform)
dataset2 = datasets.ImageFolder(root="./test_set/", transform=transform)

train_loader = torch.utils.data.DataLoader(dataset1, shuffle=True, **train_kwargs)
test_loader  = torch.utils.data.DataLoader(dataset2, shuffle=False, **test_kwargs)  # order does not matter for evaluation

There are roughly 4000 cat images and 4000 dog images in the training set. The test set has roughly 1000 images of each. All images have dimensions 128x128. The cat and dog images are in color, so each is composed of three channels (red, green, blue). The MNIST data set was grayscale, so only a single channel was needed per image.

In [None]:
img = Image.open("./training_set/dogs/resized-dog.1001.jpg")
print(f"Image height: {img.height}") 
print(f"Image width: {img.width}")
img

In [None]:
img = Image.open("./training_set/cats/resized-cat.1001.jpg")
print(f"Image height: {img.height}") 
print(f"Image width: {img.width}")
img
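
The shapes described above can be checked with a small sketch. It assumes synthetic random tensors in place of real images (no files are read), laid out the way `transforms.ToTensor()` produces them: channels first, then height, then width. The variable names are illustrative.

```python
import torch

# Synthetic stand-in for one color image after ToTensor(): (channels, height, width)
color_image = torch.rand(3, 128, 128)

# Synthetic stand-in for one grayscale MNIST image: a single channel of 28x28 pixels
gray_image = torch.rand(1, 28, 28)

# A DataLoader with batch_size=64 stacks images along a new leading dimension
batch = torch.stack([color_image] * 64)

print(tuple(color_image.shape))
print(tuple(gray_image.shape))
print(tuple(batch.shape))
print(color_image.numel())  # values per color image: 3 * 128 * 128
```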

# Model Definition

Below the model is downloaded. We turn off gradient tracking for all model parameters and then replace the final layer `fc` with two new linear layers, which are the only trainable ones. The model is moved to the device (a GPU if one is available) and the optimizer is created.

In [None]:
model = models.resnet50(weights='DEFAULT')
for param in model.parameters():
    param.requires_grad = False
# use print(model) to see that the name of the last layer is fc
# we redefine fc in the next line
model.fc = nn.Sequential(nn.Linear(2048, 128), nn.ReLU(inplace=True), nn.Linear(128, 2))
model = model.to(device)
optimizer = optim.Adadelta(model.fc.parameters(), lr=1.0)
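
To confirm that freezing worked, it helps to count trainable and frozen parameters. The following is a minimal sketch that assumes a tiny stand-in network instead of ResNet-50 so that nothing needs to be downloaded; `backbone`, `head`, `toy_model` and `count_parameters` are hypothetical names used only for illustration.

```python
import torch.nn as nn

# Hypothetical stand-in for a pretrained backbone (this is NOT ResNet-50)
backbone = nn.Sequential(nn.Linear(8, 4), nn.ReLU())
for param in backbone.parameters():
    param.requires_grad = False  # freeze the "pretrained" layers

# New head, created after freezing, so its parameters stay trainable
head = nn.Sequential(nn.Linear(4, 3), nn.ReLU(inplace=True), nn.Linear(3, 2))
toy_model = nn.Sequential(backbone, head)

def count_parameters(module):
    """Return (trainable, frozen) parameter counts for a module."""
    trainable = sum(p.numel() for p in module.parameters() if p.requires_grad)
    frozen = sum(p.numel() for p in module.parameters() if not p.requires_grad)
    return trainable, frozen

trainable, frozen = count_parameters(toy_model)
print(f"trainable: {trainable}, frozen: {frozen}")
```

The same helper can be called on the notebook's `model`. The new `fc` head there has 2048*128 + 128 + 128*2 + 2 = 262,530 trainable parameters.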

In [None]:
from torchsummary import summary
summary(model, input_size=(3, 128, 128))

# Train and Test Methods

In [None]:
def train(model, device, train_loader, optimizer, epoch):
    model.train() # sets the model in training mode (i.e., dropout enabled)
    for batch_idx, (data, target) in enumerate(train_loader):
        data, target = data.to(device), target.to(device)
        optimizer.zero_grad()
        output = model(data)
        loss = F.nll_loss(F.log_softmax(output, dim=1), target)
        loss.backward()
        optimizer.step()
        if batch_idx % 100 == 0:
            print('Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}'.format(
                epoch, batch_idx * len(data), len(train_loader.dataset),
                100. * batch_idx / len(train_loader), loss.item()))
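
The training loop applies `F.log_softmax` before `F.nll_loss`. Here is a minimal sketch, with made-up logits, showing that this two-step form matches `F.cross_entropy` on raw logits, whereas `F.nll_loss` applied directly to raw logits gives a different number.

```python
import torch
import torch.nn.functional as F

# Made-up logits for a batch of three images and two classes
logits = torch.tensor([[2.0, 0.5], [0.1, 1.2], [-0.3, 0.4]])
target = torch.tensor([0, 1, 0])

# Two-step form used in the training loop
loss_two_step = F.nll_loss(F.log_softmax(logits, dim=1), target)

# Equivalent one-step form that takes raw logits
loss_one_step = F.cross_entropy(logits, target)

# nll_loss expects log-probabilities; raw logits give a meaningless value
loss_on_raw_logits = F.nll_loss(logits, target)

print(loss_two_step.item(), loss_one_step.item(), loss_on_raw_logits.item())
```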

In [None]:
def test(model, device, test_loader):
    model.eval() # sets the model in evaluation mode (i.e., dropout disabled)
    test_loss = 0
    correct = 0
    with torch.no_grad():
        for data, target in test_loader:
            data, target = data.to(device), target.to(device)
            output = model(data)
            # the model returns raw logits, so use cross_entropy (log_softmax + nll_loss)
            test_loss += F.cross_entropy(output, target, reduction='sum').item()  # sum up batch loss
            pred = output.argmax(dim=1, keepdim=True)  # index of the largest logit, i.e., the predicted class
            correct += pred.eq(target.view_as(pred)).sum().item()

    test_loss /= len(test_loader.dataset)

    print('\nTest set: Average loss: {:.4f}, Accuracy: {}/{} ({:.0f}%)\n'.format(
        test_loss, correct, len(test_loader.dataset),
        100. * correct / len(test_loader.dataset)))
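
The accuracy bookkeeping in `test` can be traced on a toy batch. This is a sketch with made-up logits and labels, not output from the real model.

```python
import torch

# Made-up logits for four images and two classes
output = torch.tensor([[2.0, 0.5], [0.1, 1.2], [0.3, 0.2], [-1.0, 0.0]])
target = torch.tensor([0, 1, 1, 1])

# keepdim=True gives shape (4, 1), so target is reshaped to match before comparing
pred = output.argmax(dim=1, keepdim=True)
correct = pred.eq(target.view_as(pred)).sum().item()

accuracy = 100.0 * correct / len(target)
print(f"{correct}/{len(target)} correct ({accuracy:.0f}%)")
```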

Train for some number of epochs, reporting the loss and accuracy on the test set after each epoch:

In [None]:
epochs = 12
scheduler = StepLR(optimizer, step_size=1, gamma=0.7)
for epoch in range(1, epochs + 1):
    train(model, device, train_loader, optimizer, epoch)
    test(model, device, test_loader)
    scheduler.step()
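
`StepLR` with `step_size=1` and `gamma=0.7` multiplies the learning rate by 0.7 after every epoch. The sketch below reproduces the resulting schedule in plain Python arithmetic, starting from `lr=1.0`; it does not call the scheduler itself, and the variable names are illustrative.

```python
initial_lr = 1.0
gamma = 0.7
epochs = 12

# Learning rate in effect during each epoch (epochs are numbered from 1)
lrs = [initial_lr * gamma ** (epoch - 1) for epoch in range(1, epochs + 1)]

for epoch, lr in enumerate(lrs, start=1):
    print(f"epoch {epoch:2d}: lr = {lr:.4f}")
```

By the last epoch the learning rate has fallen to roughly 2% of its starting value, so later epochs make only small adjustments to the new layers.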