# RTML Midterm With Solution 2022

## Question 1 (20 points)

In Labs 04 and 05, you developed your own PyTorch implementations of YOLOv4 and YOLOR.
Download the image at http://www.cs.ait.ac.th/~mdailey/ait-orientation.jpg and run it
through your YOLOv4 and YOLOR models. Provide your source code to load the model, image,
get the result, and display the result here. Display the resulting bounding boxes.


In [4]:
# Code to run YOLOv4 on one image:

import torch
import pickle as pkl
import cv2
from darknet import MyDarknet
from util import load_classes, prep_image, non_max_suppression, write_dets

num_classes = 80
classes = load_classes("data/coco.names")

print("Loading network.....")
model = MyDarknet('cfg/yolov4.cfg')
model.load_weights('yolov4.weights')

model.net_info["height"] = 608
model.net_info["width"] = 608

model = model.cuda()

model.eval()

img = cv2.imread('ait-orientation.jpg')
img_tensor = prep_image(img, 608).cuda()
img_dim = (img.shape[1], img.shape[0])
prep_img_dim = (img_tensor.shape[3], img_tensor.shape[2])
predictions = model(img_tensor, True)
predictions = non_max_suppression(predictions, conf_thres=0.4, iou_thres=0.5)[0]
print(predictions.shape)

print(img_dim, prep_img_dim)
scaling_factor = max(img_dim) / max(prep_img_dim)
print(scaling_factor)
pad = ((prep_img_dim[1] - img_dim[1] / scaling_factor) / 2.0)
print(pad)
predictions[:,1:5] = predictions[:,1:5] - pad
predictions[:,1:5] *= scaling_factor

colors = pkl.load(open("pallete", "rb"))
for i in range(predictions.shape[0]):
  write_dets(torch.cat((torch.zeros((1)).cuda(), predictions[i,:]), 0), [img], colors, classes)
cv2.imwrite('ait-orientation-det.jpg', img)

I got the following result:

<img src=ait-orientation-det.jpg>

In [None]:
# Code for YOLOR needs just a few changes:

...

model = MyDarknet('cfg/yolor_p6.cfg')
model.load_state_dict(torch.load('yolor_p6.pt')['model'])

...

model.net_info["height"] = 1280
model.net_info["width"] = 1280

...

img_tensor = prep_image(img, 1280).cuda()

...

cv2.imwrite('ait-orientation-det-yolor.jpg', img)

This gave the following results:

<img src=ait-orientation-det-yolor.jpg>

## Question 2 (10 points)

In Labs 02-03, you became familiar with different image classification models and the
technique of retraining/fine-tuning a pre-trained model on a new dataset. Let's create
a ResNet model for classifying images in the CIFAR100 dataset.
   
First, create dataset objects for the CIFAR100 training and test sets. You'll find
documentation at [the torchvision datasets page](https://pytorch.org/vision/stable/datasets.html).
To use the already-downloaded dataset on puffer/gourami/guppy, use the following dataset location:

    train_dataset = torchvision.datasets.CIFAR100('/home/fidji/mdailey/Datasets/CIFAR100', train=True)

Write some code to get one of the samples from the dataset object. Show your code here, and display
the image print its attributes here.


In [5]:
# Code to extract a sample from the dataset

import torchvision

train_dataset = torchvision.datasets.CIFAR100('/datasets/CIFAR100', train=True)
val_dataset = torchvision.datasets.CIFAR100('/datasets/CIFAR100', train=False)

(img, img_class) = train_dataset[0]

img.save('cifar100-train-0.png')
print('Image class:', img_class)

The image is the following:

<img src=cifar100-train-0.png width=200>

and the class for this image is 19 ("cattle").

## Question 3 (20 points)

Next, create data loaders for the training dataset and validation dataset (no need to use the test set).
Use a batch size of 4 and appropriate transforms for the training and validation sets.

Put your code to create the data loaders, sample one minibatch from the training set, and output the
shapes of the tensors comprising the minibatch here.

In [6]:
# Code to create dataloaders, sample a minibatch, and print out tensor shape here

import torch
import torchvision

train_transform = torchvision.transforms.Compose([
  torchvision.transforms.Resize(36),
  torchvision.transforms.RandomHorizontalFlip(0.5),
  torchvision.transforms.RandomCrop(32),
  torchvision.transforms.ToTensor(),
  torchvision.transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

val_transform = torchvision.transforms.Compose([
  torchvision.transforms.Resize(36),
  torchvision.transforms.CenterCrop(32),
  torchvision.transforms.ToTensor(),
  torchvision.transforms.Normalize((0.485, 0.456, 0.406), (0.229, 0.224, 0.225))
])

train_dataset = torchvision.datasets.CIFAR100('/datasets/CIFAR100', train=True, transform=train_transform)
val_dataset = torchvision.datasets.CIFAR100('/datasets/CIFAR100', train=False, transform=val_transform)

train_dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=4, shuffle=True, num_workers=2)
val_dataloader = torch.utils.data.DataLoader(val_dataset, batch_size=4, shuffle=False, num_workers=2)

(imgs, labels) = iter(train_dataloader).next()

print('Image tensor batch shape:', imgs.shape, 'labels shape:', labels.shape)

I got output

    Image tensor batch shape: torch.Size([4, 3, 32, 32]) labels shape: torch.Size([4])

## Question 4 (20 points)

Next, create a ResNet-50 model with pretrained weights from ImageNet using the [torchvision ResNet class](https://pytorch.org/vision/stable/models.html#id10).
Remove the classification layer and replace it with a layer appropriate for identification in CIFAR100. Show that your resulting model can process a minibatch
from your validation dataloader and output (incorrect) identities.

In [7]:
# Code to create a ResNet 50 model, remove classification layer, replace with a CIFAR100 identity layer, and run in evaluation model on a validation minibatch

model = torchvision.models.resnet50(pretrained=True).eval()
model.fc = torch.nn.Linear(2048, 100, bias=True)

(val_imgs, val_labels) = iter(val_dataloader).next()

output = model(val_imgs)
_, pred = torch.max(output, 1)

print('First validation minibatch targets', val_labels, 'predictions', pred)

The output is

    First validation minibatch targets tensor([49, 33, 72, 51]) predictions tensor([48, 77, 74, 74])


## Question 5 (20 points)

Next, write a training function, create an optimizer and loss function, and show training loss and validation loss for one epoch.

Show the new ouptut identities for the validation minibatch used in Question 4.

In [8]:
# Code for training and validation for one epoch optimizer, loss function, and new result on validation minibatch

def val(model, loader, criterion):
  running_loss = 0
  running_corrects = 0
  running_n = 0
  model.eval()
  it = 0
  for (imgs, labels) in iter(loader):
    outputs = model(imgs.to(device))
    _, preds = torch.max(outputs, 1)
    loss = criterion(outputs, labels.to(device))
    running_loss += loss.item()
    running_n += imgs.shape[0]
    corrects = (preds.detach().to('cpu') == labels).sum().item()
    running_corrects += corrects
    running_n += imgs.shape[0]
    if it % 100 == 0:
      print('Iter', it, 'loss', running_loss / running_n, 'acc', running_corrects / running_n)
    it += 1
  return running_loss / running_n, running_corrects / running_n

def train(model, loader, criterion, optimizer):
  running_loss = 0
  running_corrects = 0
  running_n = 0
  model.train()
  it = 0
  for (imgs, labels) in iter(loader):
    model.zero_grad()
    outputs = model(imgs.to(device))
    _, preds = torch.max(outputs, 1)
    loss = criterion(outputs, labels.to(device))
    running_loss += loss.item()
    running_n += imgs.shape[0]
    loss.backward()
    optimizer.step()
    corrects = (preds.detach().to('cpu') == labels).sum().item()
    running_corrects += corrects
    running_n += imgs.shape[0]
    if it % 100 == 0:
      print('Iter', it, 'loss', running_loss / running_n, 'acc', running_corrects / running_n)
    it += 1
  return running_loss / running_n, running_corrects / running_n

(val_imgs, val_labels) = iter(val_dataloader).next()
output = model(val_imgs.to(device))
_, pred = torch.max(output, 1)
print('Val predictions before:', pred.detach().cpu())

criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.01, weight_decay=0.0005)
val_loss, val_acc = val(model, val_dataloader, criterion)
print('Initial val loss', val_loss, ' val accuracy', val_acc)
train_loss, train_acc = train(model, train_dataloader, criterion, optimizer)
print('Epoch 0: train loss', train_loss, 'train accuracy', train_acc)
val_loss, val_acc = val(model, val_dataloader, criterion)
print('Epoch 0: val loss', val_loss, 'val accuracy', val_acc)

(val_imgs, val_labels) = iter(val_dataloader).next()
output = model(val_imgs.to(device))
_, pred = torch.max(output, 1)
print('First validation minibatch targets', val_labels, 'predictions after 1 epoch of training', pred.detach().cpu())

I get the output

    Val predictions before: tensor([79, 93, 98, 49])
    Epoch 0: val_loss 0.5976760782122612 accuracy 0
    Epoch 0: train_loss 0.5824451119947434 accuracy 0.00855
    Epoch 1: val_loss 0.8617366856455803 accuracy 0.0102
    First validation minibatch targets tensor([49, 33, 72, 51]) predictions after epoch of training tensor([47, 64, 59, 19])


## Question 6 (10 points)

Explain how you could use the model you just created as a classifier model in a Control GAN.

The Control GAN utilizes three models, a generator, a discriminator, and a classifier. The generator receives noise and a class. The discriminator receives real or generated inputs without the class and has to classify them as real or fake. The classifier receives generated or real samples and has to classify them. If we used the ResNet50 model as the classifier, it would be very strong from the beginning and should give the generator good feedback for learning to generate samples that are correctly classified according to the conditional input.