<a href="https://colab.research.google.com/github/Radi4/DL_colab/blob/master/hometask_part2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Deep learning for computer vision


This notebook will teach you to build and train convolutional networks for image recognition. Brace yourselves.

# Tiny ImageNet dataset
This week we focus on image recognition on the Tiny ImageNet dataset:
* 100k images of shape 3x64x64
* 200 different classes: snakes, spiders, cats, trucks, grasshoppers, gulls, etc.


In [0]:
import subprocess
import os
from collections import defaultdict

In [2]:
!wget https://raw.githubusercontent.com/yandexdataschool/Practical_DL/spring2019/week03_convnets/tiny_img.py -O tiny_img.py

--2019-03-14 07:26:39--  https://raw.githubusercontent.com/yandexdataschool/Practical_DL/spring2019/week03_convnets/tiny_img.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.0.133, 151.101.64.133, 151.101.128.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.0.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3378 (3.3K) [text/plain]
Saving to: ‘tiny_img.py’


2019-03-14 07:26:39 (76.3 MB/s) - ‘tiny_img.py’ saved [3378/3378]



In [3]:
from tiny_img import download_tinyImg200
data_path = '.'
download_tinyImg200(data_path)

./tiny-imagenet-200.zip


In [4]:
import subprocess
import os
from collections import defaultdict
from tqdm import tqdm_notebook, trange, tqdm

if os.path.exists('./tiny-imagenet-200/val/val_annotations.txt'):
    classes = defaultdict(list)

    with open('./tiny-imagenet-200/val/val_annotations.txt', 'r') as f:
        for line in f:
            line = line.strip()
            name, clas, data = line.split('\t', 2)
            classes[clas].append(([name, data]))
    
    subprocess.call(['rm', '-r', './tiny-imagenet-200/val/val_annotations.txt'])

    val_dir = os.path.join('tiny-imagenet-200', 'val', 'images')

    for clas in tqdm_notebook(classes):
    
        subprocess.call(['mkdir', os.path.join('tiny-imagenet-200', 'val', clas)])
        subprocess.call(['mkdir', os.path.join('tiny-imagenet-200', 'val', clas, 'images')])

        new_dir = os.path.join('tiny-imagenet-200', 'val', clas)
        new_file = os.path.join(new_dir, clas + '_boxes.txt')
        new_dir = os.path.join(new_dir, 'images')

        subprocess.call(['touch', new_file])
        with open(new_file, 'w') as out:
            for name, data in classes[clas]:
                out.write(name + '\t' + data + '\n')
                subprocess.call(['mv', os.path.join(val_dir, name), new_dir + '/'])
    
    subprocess.call(['rm', '-r', val_dir])

In [0]:
import torchvision
import torch, torch.nn as nn
import torch.nn.functional as F
from torch.autograd import Variable
from torchvision import transforms
import numpy as np

## Image examples ##



<tr>
    <td> <img src="tinyim3.png" alt="Drawing" style="width:90%"/> </td>
    <td> <img src="tinyim2.png" alt="Drawing" style="width:90%"/> </td>
</tr>


<tr>
    <td> <img src="tiniim.png" alt="Drawing" style="width:90%"/> </td>
</tr>

# Building a network

Simple neural networks whose layers are applied one after another can be implemented with `torch.nn.Sequential`: add the pre-built modules in order and train the result.
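As a minimal sketch of this idea, the toy model below stacks a few pre-built modules and pushes a dummy batch through them. The layer sizes and the names `toy_model` and `dummy_batch` are illustrative assumptions, not the network trained later in this notebook, and `nn.Flatten` assumes a reasonably recent PyTorch release.

```python
import torch
import torch.nn as nn

# Hypothetical toy model: sizes are chosen only to illustrate nn.Sequential.
toy_model = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3, padding=1),
    nn.ReLU(),
    nn.MaxPool2d(2),               # 64x64 -> 32x32
    nn.Flatten(),                  # [batch, 8, 32, 32] -> [batch, 8192]
    nn.Linear(8 * 32 * 32, 200),   # one logit per class
)

dummy_batch = torch.zeros(4, 3, 64, 64)   # 4 fake RGB images of size 64x64
toy_logits = toy_model(dummy_batch)
print(toy_logits.shape)
```

Each module receives the output of the previous one, so the only thing to get right is that neighbouring shapes agree.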

In [0]:
np.random.seed(42)
torch.manual_seed(42)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False

In [0]:
# a small module that flattens [batch, channel, height, width] into [batch, units]
class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)

As in our basic tutorial, we train the model with negative log-likelihood, also known as cross-entropy.
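To see why the two names refer to the same quantity, here is a small check on random data: cross-entropy computed from raw logits matches the negative log-likelihood of the log-softmax probabilities. The tensors below are made-up stand-ins for a real batch.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 200)             # fake scores: 5 samples, 200 classes
targets = torch.randint(0, 200, (5,))    # fake integer class labels

# Cross-entropy on raw logits ...
ce = F.cross_entropy(logits, targets)
# ... equals the negative log-likelihood of the log-softmax probabilities.
nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(ce.item(), nll.item())
```

By default `F.cross_entropy` uses `reduction='mean'`, so the result is already a scalar averaged over the batch.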

In [0]:
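def compute_loss(X_batch, y_batch):
    # move the batch to the GPU; relies on the global `model` defined further down
    X_batch = Variable(torch.FloatTensor(X_batch)).cuda()
    y_batch = Variable(torch.LongTensor(y_batch)).cuda()
    logits = model(X_batch)
    # F.cross_entropy already averages over the batch by default
    return F.cross_entropy(logits, y_batch)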

### Training on minibatches
* We have 100k images, which is far too many for full-batch gradient descent, so we train on minibatches instead
* Below is a function that splits the training sample into minibatches
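The function itself does not appear in this version of the notebook; the training loop further down uses `torch.utils.data.DataLoader` instead. As a minimal sketch of the idea, assuming the data fits in memory as NumPy arrays, a minibatch iterator could look like this. The name `iterate_minibatches` and its parameters are illustrative.

```python
import numpy as np

def iterate_minibatches(X, y, batch_size, shuffle=True, seed=None):
    """Yield (X_batch, y_batch) pairs that together cover the data exactly once."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(X)) if shuffle else np.arange(len(X))
    for start in range(0, len(X), batch_size):
        batch_idx = indices[start:start + batch_size]
        yield X[batch_idx], y[batch_idx]

# Toy data standing in for images and labels.
X_toy = np.arange(10 * 3).reshape(10, 3)
y_toy = np.arange(10)
batches = list(iterate_minibatches(X_toy, y_toy, batch_size=4, seed=0))
print([len(xb) for xb, _ in batches])  # [4, 4, 2]
```

The last batch is smaller whenever the dataset size is not a multiple of `batch_size`.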

## Task 3: Data Augmentation

** Augmenti - A spell used to produce water from a wand (Harry Potter Wiki) **

<img src="HagridsHut_PM_B6C28_Hagrid_sHutFireHarryFang.jpg" style="width:80%">

`torchvision.transforms` is a convenient tool for image preprocessing and augmentation.

Here's how it works: we define a pipeline that
* makes random crops of data (augmentation)
* randomly flips image horizontally (augmentation)
* then normalizes it (preprocessing)

When testing, we don't need random crops or flips; we only normalize with the same statistics.
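To make the three steps concrete, here is a minimal NumPy sketch of what each one does to a single `[C, H, W]` image array. It is an illustration only, not the torchvision implementation: the helper names, the padding of 4 pixels and the placeholder mean/std values are all assumptions.

```python
import numpy as np

def random_crop(img, size, pad, rng):
    """Zero-pad a [C, H, W] image by `pad` pixels, then cut a random size x size window."""
    padded = np.pad(img, ((0, 0), (pad, pad), (pad, pad)), mode="constant")
    top = rng.integers(0, padded.shape[1] - size + 1)
    left = rng.integers(0, padded.shape[2] - size + 1)
    return padded[:, top:top + size, left:left + size]

def random_hflip(img, rng, p=0.5):
    """Mirror the image left-to-right with probability p."""
    return img[:, :, ::-1] if rng.random() < p else img

def normalize(img, mean, std):
    """Subtract the per-channel mean and divide by the per-channel std."""
    mean = np.asarray(mean).reshape(-1, 1, 1)
    std = np.asarray(std).reshape(-1, 1, 1)
    return (img - mean) / std

rng = np.random.default_rng(0)
image = rng.random((3, 64, 64))   # fake image with values in [0, 1)
placeholder_mean, placeholder_std = (0.5, 0.5, 0.5), (0.25, 0.25, 0.25)

# Training: augment first, then normalize.
augmented = random_hflip(random_crop(image, size=64, pad=4, rng=rng), rng)
train_input = normalize(augmented, placeholder_mean, placeholder_std)
# Testing: normalize only, with the same statistics.
test_input = normalize(image, placeholder_mean, placeholder_std)
print(train_input.shape, test_input.shape)
```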

In [0]:
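import torchvision
from torchvision import transforms
# NOTE: these per-channel statistics match the values commonly used for CIFAR-10,
# not ones computed on Tiny ImageNet; they serve as an approximation here.
means = np.array((0.4914, 0.4822, 0.4465))
stds = np.array((0.2023, 0.1994, 0.2010))

transform_augment = transforms.Compose([
    transforms.RandomHorizontalFlip(),  # mirror the image with probability 0.5
    # on 64x64 inputs a 64x64 crop without padding returns the whole image
    transforms.RandomCrop(64),
    transforms.ToTensor(),  # PIL image -> float tensor in [0, 1], shape [C, H, W]
    transforms.Normalize(means, stds),  # per-channel (x - mean) / std
])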

In [0]:
dataset = torchvision.datasets.ImageFolder('tiny-imagenet-200/train', transform=transform_augment)

In [0]:
train_dataset, val_dataset = torch.utils.data.random_split(dataset, [90000, 10000])

In [0]:
model = nn.Sequential()
model.add_module('conv1', nn.Conv2d(in_channels=3, out_channels=64, kernel_size=3, padding = 1, bias = False))
model.add_module('norm1', nn.BatchNorm2d(64))
model.add_module('relu1', nn.ELU())
model.add_module('pool1', nn.MaxPool2d(2))
model.add_module('conv2', nn.Conv2d(in_channels=64, out_channels=128, kernel_size=3, padding = 1, bias = False))
model.add_module('norm2', nn.BatchNorm2d(128))
model.add_module('relu2', nn.ELU())
model.add_module('pool2', nn.MaxPool2d(2))
model.add_module('conv3', nn.Conv2d(in_channels=128, out_channels=256, kernel_size=3, padding = 1, bias = False))
model.add_module('norm3', nn.BatchNorm2d(256))
model.add_module('relu3', nn.ELU())
model.add_module('pool3', nn.MaxPool2d(2))
model.add_module('flat', Flatten())
# 256 channels * 8 * 8 spatial positions = 16384 flattened features
model.add_module('linear_-3', nn.Linear(16384, 4096 * 2))
model.add_module('norm_-2', nn.BatchNorm1d(4096 * 2))
model.add_module('relu_-2', nn.ReLU())
model.add_module('dropout_-2', nn.Dropout(0.35))
model.add_module('linear_-2', nn.Linear(4096 * 2, 500))
model.add_module('norm_-1', nn.BatchNorm1d(500))
model.add_module('relu_-1', nn.ReLU())
model.add_module('dropout_-1', nn.Dropout(0.35))


model.add_module('dense1_logits', nn.Linear(500, 200)) # logits for 200 classes
model = model.cuda()
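The input size of the first linear layer (16384) follows from the spatial sizes after each conv/pool stage. A short sketch of that arithmetic, using the standard output-size formula for convolutions with dilation 1; the helper names `conv_out` and `pool_out` are illustrative:

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial size after a convolution (standard formula, dilation 1)."""
    return (size + 2 * padding - kernel) // stride + 1

def pool_out(size, kernel):
    """Spatial size after max pooling with stride equal to the kernel size."""
    return size // kernel

side = 64                      # Tiny ImageNet images are 64x64
for _ in range(3):             # three conv -> pool stages
    side = conv_out(side, kernel=3, padding=1)   # 3x3 conv with padding 1 keeps the size
    side = pool_out(side, kernel=2)              # 2x2 pooling halves it

flat_features = 256 * side * side   # conv3 outputs 256 channels
print(side, flat_features)          # 8 16384
```

If a stage is added or removed, the same calculation gives the new `in_features` for the first `nn.Linear`.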

In [80]:
from torchsummary import summary
summary(model, (3, 64, 64))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1           [-1, 64, 64, 64]           1,728
       BatchNorm2d-2           [-1, 64, 64, 64]             128
               ELU-3           [-1, 64, 64, 64]               0
         MaxPool2d-4           [-1, 64, 32, 32]               0
            Conv2d-5          [-1, 128, 32, 32]          73,728
       BatchNorm2d-6          [-1, 128, 32, 32]             256
               ELU-7          [-1, 128, 32, 32]               0
         MaxPool2d-8          [-1, 128, 16, 16]               0
            Conv2d-9          [-1, 256, 16, 16]         294,912
      BatchNorm2d-10          [-1, 256, 16, 16]             512
              ELU-11          [-1, 256, 16, 16]               0
        MaxPool2d-12            [-1, 256, 8, 8]               0
          Flatten-13                [-1, 16384]               0
           Linear-14                 [-

In [0]:
opt = torch.optim.Adam(model.parameters(), lr = 1e-4)

train_loss = []
val_accuracy = []

In [0]:
import time
num_epochs = 40 # total amount of full passes over training data
batch_size = 100  # number of samples processed in one SGD iteration
train_dataset, val_dataset = torch.utils.data.random_split(dataset, [90000, 10000])

train_batch_gen = torch.utils.data.DataLoader(train_dataset, 
                                              batch_size=batch_size,
                                              shuffle=True,
                                              num_workers=1)
val_batch_gen = torch.utils.data.DataLoader(val_dataset, 
                                              batch_size=batch_size,
                                              shuffle=True,
                                              num_workers=1)

for epoch in tqdm(range(num_epochs)):
    # In each epoch, we do a full pass over the training data:
    start_time = time.time()
    model.train(True) # enable dropout / batch_norm training behavior
    for (X_batch, y_batch) in train_batch_gen:
        # train on batch
        loss = compute_loss(X_batch, y_batch)
        loss.backward()
        opt.step()
        opt.zero_grad()
        train_loss.append(loss.cpu().data.numpy())
    print(epoch)    
    model.train(False) # disable dropout / use averages for batch_norm
    with torch.no_grad(): # gradients are not needed for evaluation
        for X_batch, y_batch in val_batch_gen:
            logits = model(Variable(torch.FloatTensor(X_batch)).cuda())
            y_pred = logits.max(1)[1].data
            val_accuracy.append(np.mean( (y_batch.cpu() == y_pred.cpu()).numpy() ))

    # Then we print the results for this epoch:
    print("Epoch {} of {} took {:.3f}s".format(
        epoch + 1, num_epochs, time.time() - start_time))
    print("  training loss (in-iteration): \t{:.6f}".format(
        np.mean(train_loss[-len(train_dataset) // batch_size :])))
    print("  validation accuracy: \t\t\t{:.2f} %".format(
        np.mean(val_accuracy[-len(val_dataset) // batch_size :]) * 100))










0
Epoch 1 of 40 took 223.765s
  training loss (in-iteration): 	4.629813
  validation accuracy: 			17.27 %
1
Epoch 2 of 40 took 224.387s
  training loss (in-iteration): 	3.861960
  validation accuracy: 			23.42 %
2
Epoch 3 of 40 took 225.483s
  training loss (in-iteration): 	3.436411
  validation accuracy: 			26.78 %
3
Epoch 4 of 40 took 225.654s
  training loss (in-iteration): 	3.125570
  validation accuracy: 			27.65 %
4
Epoch 5 of 40 took 225.051s
  training loss (in-iteration): 	2.836522
  validation accuracy: 			30.69 %


For test data we need __only normalization__, with no random cropping or flipping.

In [0]:
transform_test = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(means, stds),
])

batch_size = 100
real_test_dataset = torchvision.datasets.ImageFolder('tiny-imagenet-200/val', transform=transform_test)
test_batch_gen = torch.utils.data.DataLoader(real_test_dataset, 
                                              batch_size=batch_size,
                                              shuffle=True,
                                              num_workers=1)

model.train(False) # disable dropout / use averages for batch_norm
test_batch_acc = []
with torch.no_grad(): # gradients are not needed for evaluation
    for X_batch, y_batch in test_batch_gen:
        logits = model(Variable(torch.FloatTensor(X_batch)).cuda())
        y_pred = logits.max(1)[1].data
        test_batch_acc.append(np.mean( (y_batch.cpu() == y_pred.cpu()).numpy() ))


test_accuracy = np.mean(test_batch_acc)

print("Final results:")
print("  test accuracy:\t\t{:.2f} %".format(
    test_accuracy * 100))

if test_accuracy * 100 > 70:
    print("U'r freakin' amazin'!")
elif test_accuracy * 100 > 50:
    print("Achievement unlocked: 110lvl Warlock!")
elif test_accuracy * 100 > 40:
    print("Achievement unlocked: 80lvl Warlock!")
elif test_accuracy * 100 > 30:
    print("Achievement unlocked: 70lvl Warlock!")
elif test_accuracy * 100 > 20:
    print("Achievement unlocked: 60lvl Warlock!")
else:
    print("We need more magic! Follow instructions below")

Final results:
  test accuracy:		38.62 %
Achievement unlocked: 70lvl Warlock!


## The Quest For A Better Network

See `practical_dl/homework02` for a full-scale assignment.