<a href="https://colab.research.google.com/github/nalet/bme.Advanced-Topics-in-Machine-Learning/blob/master/Assignments/Assignment2/Assignment2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Name: Nalet Meinen

# Assignment 2 ATML 2020
## Classification with limited data
ImageNet is a well-known dataset with 1000 image classes. We will work on a subset of the dataset (60k images, 100 classes, 600 images per class, 80$\times$80 pixels, RGB) and train a model to classify an image into one of the 100 classes. The dataset is located under the "data" directory. The training and validation splits are under the "data/train" and "data/val" directories, respectively. Each split consists of 100 directories, one per object category.
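A quick sanity check of the directory layout described above can be sketched as follows. The helper name `count_images_per_class` is hypothetical, and the sketch assumes every entry inside a class directory is an image file.

```python
import os

def count_images_per_class(split_dir):
    """Return {class_directory_name: number_of_files} for one split."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if os.path.isdir(class_dir):
            counts[class_name] = len(os.listdir(class_dir))
    return counts

# Only runs when the data has already been unzipped next to the notebook.
for split in ("data/train", "data/val"):
    if os.path.isdir(split):
        counts = count_images_per_class(split)
        print(split, len(counts), "classes,", sum(counts.values()), "images")
```

Comparing the per-class counts across directories is a cheap way to spot an incomplete unzip before any training time is spent.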

In [0]:
import numpy as np
import torch
from torchvision import transforms
from torch.utils.data import DataLoader
import torch.nn as nn
from tqdm.notebook import tqdm
import matplotlib.pyplot as plt

In [2]:
!nvidia-smi

Sat Apr 18 15:19:38 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00    Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   38C    P0    26W / 250W |      0MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|  No ru

In [3]:
import os.path
if not os.path.isfile('data.zip'):
  from google.colab import drive
  drive.mount('/content/drive')
  !cp -r "/content/drive/My Drive/ATML/Assignments/Assignment2/data.zip" "data.zip"
  !unzip -qq data.zip

Mounted at /content/drive


## Task 1. Implement ImageNetLimited class for data loading in datasets.py file

In [0]:
from PIL import Image
import glob
import torch
import os.path

class ImageNetLimited(torch.utils.data.Dataset):
    """ImageNet Limited dataset."""
    
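    def __init__(self, root_dir, transform, instances=1, selected_classes=None):
        # Every image is transformed once here and cached in memory, so random
        # augmentations are fixed per stored instance, not redrawn each epoch.
        data = []

        for path in tqdm(glob.glob(root_dir + "/**/*")):
          # The parent directory name is the integer class label.
          target = int(os.path.basename(os.path.dirname(path)))

          if selected_classes is not None and target not in selected_classes:
            continue

          # Force three channels so Normalize also works for non-RGB files.
          _image = Image.open(path).convert('RGB')

          # Store `instances` independently augmented copies of each image.
          for _ in range(instances):
            data.append((transform(_image), target))

        self.n = len(data)
        self.data = data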

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        return self.data[idx]
        

In [5]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
if device.type == 'cuda':
    print("This notebook ran on",device.type,"With number of GPU:",torch.cuda.device_count())

train_dir = 'data/train'
validation_dir = 'data/val'

# write your code
_transforms_train = transforms.Compose([
    transforms.RandomRotation([-45,45]),
    transforms.RandomCrop(64),
    transforms.RandomHorizontalFlip(),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

_transforms_val = transforms.Compose([
    transforms.CenterCrop(64),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])

selected_classes = None # [0,1,2,3,4,5,6,7,8,9]

train_dataset = ImageNetLimited(train_dir,_transforms_train,4,selected_classes=selected_classes)
val_dataset = ImageNetLimited(validation_dir,_transforms_val,selected_classes=selected_classes)
print("Train dataset length:",len(train_dataset),"Validation dataset length:",len(val_dataset))

This notebook ran on cuda With number of GPU: 1


Train dataset length: 240000 Validation dataset length: 29855
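Both transform pipelines end with `transforms.Normalize`, which computes `(value - mean) / std` per channel on the `[0, 1]` tensor produced by `ToTensor`. The effect on a single RGB pixel can be sketched in plain Python; the function name `normalize_pixel` is illustrative only.

```python
# Per-channel statistics used in the transforms above (R, G, B).
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply (value - mean) / std per channel to one RGB pixel in [0, 1]."""
    return [(v - m) / s for v, m, s in zip(rgb, MEAN, STD)]

print(normalize_pixel(MEAN))             # a pixel equal to the mean maps to zeros
print(normalize_pixel([1.0, 1.0, 1.0]))  # a white pixel maps to positive values
```

Because the same statistics are used for training and validation, both splits reach the network on the same scale.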


In [0]:
batch_size = 512
train_dataloader = DataLoader(train_dataset, batch_size=batch_size, shuffle=True, num_workers=4)
val_dataloader = DataLoader(val_dataset, batch_size=batch_size, shuffle=False, num_workers=4)
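With the dataset lengths printed earlier (240000 training samples and 29855 validation samples), the number of iterations per epoch follows from the batch size. A minimal sketch, assuming the `DataLoader` default `drop_last=False`; the helper name `batches_per_epoch` is hypothetical.

```python
import math

def batches_per_epoch(n_samples, batch_size, drop_last=False):
    """Number of batches a loader yields for one pass over the data."""
    if drop_last:
        return n_samples // batch_size
    # The final, smaller batch is kept when drop_last is False.
    return math.ceil(n_samples / batch_size)

print(batches_per_epoch(240000, 512))  # 469 training iterations per epoch
print(batches_per_epoch(29855, 512))   # 59 validation iterations
```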

## Task 2. CNN Architecture
Design and implement a Convolutional Neural Network architecture for image classification in a **ConvNet** class in the notebook. Some examples of popular classification models are: AlexNet, VGG, ResNet, ... Justify your design choices in the report. The input to your model must be an image of size $64 \times 64$ pixels.

In [0]:
class Block(nn.Module):

  def __init__(self, in_planes, planes, stride=1):
    super(Block, self).__init__()
    self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
    self.bn1 = nn.BatchNorm2d(planes)
    self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)
    self.bn2 = nn.BatchNorm2d(planes)

    self.shortcut = nn.Sequential()
    if stride != 1 or in_planes != planes:
      self.shortcut = nn.Sequential(
          nn.Conv2d(
              in_planes,
              planes,
              kernel_size=1,
              stride=stride,
              bias=False), nn.BatchNorm2d(planes))

  def forward(self, x):
    out = nn.functional.relu(self.bn1(self.conv1(x)))
    out = self.bn2(self.conv2(out))
    out += self.shortcut(x)
    out = nn.functional.relu(out)
    return out

class ConvNet(nn.Module):

  def __init__(self, block=Block, num_blocks=(2, 2, 2, 2)):
    super(ConvNet, self).__init__()
    self.in_planes = 64

    self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
    self.bn1 = nn.BatchNorm2d(64)
    self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
    self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
    self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
    self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
    self.linear = nn.Linear(2048, 100)

  def _make_layer(self, block, planes, num_blocks, stride):
    strides = [stride] + [1] * (num_blocks - 1)
    layers = []
    for stride in strides:
      layers.append(block(self.in_planes, planes, stride))
      self.in_planes = planes
    return nn.Sequential(*layers)

  def forward(self, x):
    out = nn.functional.relu(self.bn1(self.conv1(x)))
    out = self.layer1(out)
    out = self.layer2(out)
    out = self.layer3(out)
    out = self.layer4(out)
    out = nn.functional.avg_pool2d(out, 4)
    out = torch.flatten(out, 1)
    out = self.linear(out)
    return nn.functional.log_softmax(out, dim=1)
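The `nn.Linear(2048, 100)` input size follows from the spatial resolution left after the four stages. The arithmetic can be traced without running the model; this sketch assumes the layout above (3×3 convolutions with padding 1, stage strides 1, 2, 2, 2, then a 4×4 average pool), and the helper names are illustrative.

```python
def conv_out(size, kernel, stride, padding):
    """Output width/height of a convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1

def trace_feature_size(input_size=64, stage_strides=(1, 2, 2, 2),
                       pool=4, channels=512):
    """Return (final spatial size, flattened feature count)."""
    size = conv_out(input_size, kernel=3, stride=1, padding=1)  # stem conv
    for stride in stage_strides:
        # Only the first conv of a stage uses the stage stride; the
        # remaining 3x3, stride-1, padding-1 convs keep the size unchanged.
        size = conv_out(size, kernel=3, stride=stride, padding=1)
    size = conv_out(size, kernel=pool, stride=pool, padding=0)  # avg_pool2d(out, 4)
    return size, channels * size * size

print(trace_feature_size(64))  # (2, 2048)
```

A 64×64 input shrinks to 8×8 after the last stage, the pool reduces it to 2×2, and 512 channels × 2 × 2 gives the 2048 features expected by the linear layer.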

## Task 3. Train Model
Implement training and evaluation code for your model. Choose an appropriate loss function and evaluate the model on the validation set using classification accuracy. You are not allowed to use a pre-trained model; the model must be trained from scratch on the provided data.<br>
<font color='red'>Your model should achieve an accuracy of at least 40.0% on the validation set (a model with accuracy below 40.0% will receive 0 points for this task).</font><br>

In [0]:
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

def train(model, train_loader, optimizer, loss_fn):
    '''
    Trains the model for one epoch
    '''
    model.train()
    losses = []
    n_correct = 0
    for iteration, (images, labels) in enumerate(tqdm(train_loader)):
        images = images.to(device)
        labels = labels.to(device)
        output = model(images)
        optimizer.zero_grad()
        loss = loss_fn(output, labels)
        loss.backward()
        optimizer.step()
        losses.append(loss.item())
        n_correct += torch.sum(output.argmax(1) == labels).item()
    accuracy = 100.0 * n_correct / len(train_loader.dataset)
    return np.mean(np.array(losses)), accuracy
            
def test(model, test_loader, loss_fn):
    '''
    Tests the model on data from test_loader
    '''
    model.eval()
    test_loss = 0
    n_correct = 0
    with torch.no_grad():
        for images, labels in test_loader:
            images = images.to(device)
            labels = labels.to(device)
            output = model(images)
            loss = loss_fn(output, labels)
            test_loss += loss.item()
            n_correct += torch.sum(output.argmax(1) == labels).item()

    average_loss = test_loss / len(test_loader)
    accuracy = 100.0 * n_correct / len(test_loader.dataset)
    print('Test average loss: {:.4f}, accuracy: {:.3f}'.format(average_loss, accuracy))
    return average_loss, accuracy


def fit(train_dataloader, val_dataloader, model, optimizer, loss_fn, n_epochs, scheduler=None):
    train_losses, train_accuracies = [], []
    val_losses, val_accuracies = [], []

    for epoch in tqdm(range(n_epochs)):
        train_loss, train_accuracy = train(model, train_dataloader, optimizer, loss_fn)
        val_loss, val_accuracy = test(model, val_dataloader, loss_fn)
        train_losses.append(train_loss)
        train_accuracies.append(train_accuracy)
        val_losses.append(val_loss)
        val_accuracies.append(val_accuracy)
        if scheduler:
            scheduler.step() # argument only needed for ReduceLROnPlateau
        print('Epoch {}/{}: train_loss: {:.4f}, train_accuracy: {:.4f}, val_loss: {:.4f}, val_accuracy: {:.4f}'.format(
            epoch + 1, n_epochs, train_losses[-1], train_accuracies[-1], val_losses[-1], val_accuracies[-1]))
    
    return train_losses, train_accuracies, val_losses, val_accuracies

In [9]:
model = ConvNet()
model = model.to(device)
learning_rate = 0.02
optimizer = torch.optim.SGD(model.parameters(), learning_rate, momentum=0.9, weight_decay=5e-4)
#optimizer = torch.optim.Adam(model.parameters(), lr=0.0001)
#optimizer = torch.optim.Adagrad(model.parameters(), lr=learning_rate)
n_epochs = 10
#loss_fn = nn.CrossEntropyLoss()
loss_fn = nn.NLLLoss()
print(model)
curves_conv2 = fit(train_dataloader, val_dataloader, model, optimizer, loss_fn, n_epochs)

ConvNet(
  (conv1): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (layer1): Sequential(
    (0): Block(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (shortcut): Sequential()
    )
    (1): Block(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, tra

Test average loss: 3.0886, accuracy: 22.639
Epoch 1/10: train_loss: 3.5379, train_accuracy: 15.5875, val_loss: 3.0886, val_accuracy: 22.6394


Test average loss: 2.8026, accuracy: 29.878
Epoch 2/10: train_loss: 2.6976, train_accuracy: 30.2167, val_loss: 2.8026, val_accuracy: 29.8777


Test average loss: 2.5940, accuracy: 34.122
Epoch 3/10: train_loss: 2.2611, train_accuracy: 39.4642, val_loss: 2.5940, val_accuracy: 34.1216


Test average loss: 2.3275, accuracy: 39.869
Epoch 4/10: train_loss: 1.9374, train_accuracy: 46.7592, val_loss: 2.3275, val_accuracy: 39.8694


Test average loss: 2.4470, accuracy: 38.191
Epoch 5/10: train_loss: 1.6750, train_accuracy: 52.9229, val_loss: 2.4470, val_accuracy: 38.1913


Test average loss: 2.1338, accuracy: 44.773
Epoch 6/10: train_loss: 1.4387, train_accuracy: 58.9275, val_loss: 2.1338, val_accuracy: 44.7731


Test average loss: 2.4088, accuracy: 42.593
Epoch 7/10: train_loss: 1.2274, train_accuracy: 64.4971, val_loss: 2.4088, val_accuracy: 42.5925


Test average loss: 2.4442, accuracy: 41.869
Epoch 8/10: train_loss: 1.0284, train_accuracy: 69.8275, val_loss: 2.4442, val_accuracy: 41.8690


Test average loss: 2.6240, accuracy: 41.849
Epoch 9/10: train_loss: 0.8246, train_accuracy: 75.6679, val_loss: 2.6240, val_accuracy: 41.8489


Test average loss: 2.4723, accuracy: 43.259
Epoch 10/10: train_loss: 0.6458, train_accuracy: 80.9483, val_loss: 2.4723, val_accuracy: 43.2591
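The run above pairs a `log_softmax` model output with `nn.NLLLoss`. In PyTorch, `nn.CrossEntropyLoss` applied to raw logits computes the same quantity, so the commented-out loss would only be equivalent if the final `log_softmax` were removed from the model. The identity can be checked in plain Python; the functions below are illustrative re-implementations for a single sample, not the library code.

```python
import math

def log_softmax(logits):
    """Numerically stable log-softmax over a list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - log_sum for z in logits]

def nll_loss(log_probs, target):
    """Negative log-likelihood of the target class."""
    return -log_probs[target]

def cross_entropy(logits, target):
    """Cross-entropy computed directly from raw logits."""
    m = max(logits)
    return m + math.log(sum(math.exp(z - m) for z in logits)) - logits[target]

logits = [2.0, 0.5, -1.0]
print(abs(nll_loss(log_softmax(logits), 0) - cross_entropy(logits, 0)) < 1e-12)

# A uniform guess over 100 classes gives ln(100), about 4.605, which is a
# useful baseline for reading the loss values in the log above.
print(cross_entropy([0.0] * 100, 0))
```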



## Task 4. Ablations
Try to find the best-performing model by tuning the model design and hyper-parameters on the validation set. Perform ablation experiments to illustrate the effect of the most important hyper-parameters. Some examples of ablations: training parameters (e.g., optimizer, learning rate, batch size), network architecture (e.g., number of layers, number of units, activation function, normalization layers), model regularization (e.g., data augmentation, dropout, weight decay, early stopping), test-time augmentation, etc.<br>**Perform at least 5 ablations and report the performance of each on the validation set.**
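As one illustration of the early-stopping ablation, the validation accuracies logged in Task 3 can be replayed through a patience rule. This is a sketch of one common variant (stop after `patience` epochs without a new best), not a prescribed method; the function name `early_stop_epoch` is hypothetical.

```python
# Validation accuracies (%) copied from the Task 3 training log, epochs 1-10.
val_acc = [22.6394, 29.8777, 34.1216, 39.8694, 38.1913,
           44.7731, 42.5925, 41.8690, 41.8489, 43.2591]

def early_stop_epoch(val_accuracies, patience):
    """Return (stop_epoch, best_epoch, best_accuracy), epochs 1-indexed."""
    best_acc, best_epoch = float("-inf"), 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            return epoch, best_epoch, best_acc  # no improvement for `patience` epochs
    return len(val_accuracies), best_epoch, best_acc

print(early_stop_epoch(val_acc, patience=3))  # (9, 6, 44.7731)
print(early_stop_epoch(val_acc, patience=1))  # (5, 4, 39.8694)
```

With a patience of 3 the run would stop at epoch 9 and keep the epoch-6 weights, while a patience of 1 would stop too early and miss the best epoch. Varying the patience is itself a small ablation.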

In [0]:
# write your code

## Task 5. Model Errors
Evaluate the trained model on the validation set and plot 10 random mistakes that your model made.
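Selecting the mistakes to plot reduces to sampling from the indices where prediction and label disagree. A minimal sketch, assuming the predictions and labels for the validation set have already been collected into two equal-length sequences; the names `sample_mistakes`, `predictions`, and `labels` are illustrative.

```python
import random

def sample_mistakes(predictions, labels, k=10, seed=0):
    """Return up to k random indices where prediction != label."""
    wrong = [i for i, (p, y) in enumerate(zip(predictions, labels)) if p != y]
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    return rng.sample(wrong, min(k, len(wrong)))

# Toy data: indices 3, 5 and 7 are misclassified.
predictions = [3, 1, 4, 1, 5, 9, 2, 6]
labels = [3, 1, 4, 2, 5, 8, 2, 7]
print(sorted(sample_mistakes(predictions, labels, k=10)))  # [3, 5, 7]
```

Before plotting the selected images, the normalization has to be undone (`value * std + mean` per channel), otherwise the colours will look distorted.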

In [0]:
# write your code

## Task 6. Competition time!
Read the images from the "data/test" folder. There are no labels for these images. Run your best model on these images and save the image IDs (names) and predicted labels in a file LastName.csv. You will receive a link via email to upload the CSV file to an online system, which will report the score of your model on the held-out test set. The top 5 students with at least 40% classification accuracy will obtain bonus points.
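Writing the submission file can be sketched with the standard `csv` module. The exact format expected by the online system is not stated here, so the header names (`id`, `label`) and the example file names below are assumptions, and `write_predictions` is a hypothetical helper.

```python
import csv
import io

def write_predictions(rows, fileobj):
    """Write (image_id, predicted_label) pairs as CSV rows."""
    writer = csv.writer(fileobj)
    writer.writerow(["id", "label"])  # assumed header; adjust to the required format
    for image_id, label in rows:
        writer.writerow([image_id, int(label)])

# Demonstration with an in-memory buffer and made-up file names.
buffer = io.StringIO()
write_predictions([("img_0001.JPEG", 17), ("img_0002.JPEG", 3)], buffer)
print(buffer.getvalue())
```

For the real submission, the buffer would be replaced by `open("LastName.csv", "w", newline="")`, with the rows built from the test file names and the model's `argmax` predictions.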

In [0]:
# write your code