# Homework 2, *part 2* (60 points)

In this assignment you will build a heavy convolutional neural net (CNN) to solve Tiny ImageNet image classification. Try to achieve as high accuracy as possible.

## Deliverables

* This file,
* a "checkpoint file" from `torch.save(model.state_dict(), ...)` that contains model's weights (which a TA should be able to load to verify your accuracy).

## Grading

* 9 points for reproducible training code and a filled report below.
* 12 points for building a network that gets above 20% accuracy.
* 6.5 points for beating each of these milestones on the validation set:
  * 25.0%
  * 30.0%
  * 32.5%
  * 35.0%
  * 37.5%
  * 40.0%
    
## Restrictions

* Don't use pretrained networks.

## Tips

* One change at a time: never test several new things at once.
* Google a lot.
* Use GPU.
* Use regularization: L2, batch normalization, dropout, data augmentation.
* Use Tensorboard ([non-Colab](https://github.com/lanpa/tensorboardX) or [Colab](https://medium.com/@tommytao_54597/use-tensorboard-in-google-colab-16b4bb9812a6)) or a similar interactive tool for viewing progress.

In [13]:
import numpy as np
import matplotlib.pyplot as plt
import torch 
import torchvision
import torch.nn as nn
import pandas as pd
import os
from tqdm import tqdm_notebook as tqdm

from torch.utils.data import Dataset, DataLoader

%matplotlib inline

In [2]:
import tiny_imagenet
tiny_imagenet.download(".")

./tiny-imagenet-200 already exists, not downloading


In [3]:
path = "./tiny-imagenet-200"
folders = pd.read_csv(os.path.join(path, "wnids.txt"), header=None, names=["folders"])
f = list(folders['folders'])

Training and validation images are now in `tiny-imagenet-200/train` and `tiny-imagenet-200/val`.

## Data

In [5]:
train_data = torchvision.datasets.ImageFolder('./tiny-imagenet-200/train/', 
                                              transform=torchvision.transforms.ToTensor())
val_data = torchvision.datasets.ImageFolder('./tiny-imagenet-200/val/', 
                                            transform=torchvision.transforms.ToTensor())


batch_size = 64

train_loader = torch.utils.data.DataLoader(
    train_data, batch_size=batch_size, num_workers=2, shuffle=True, pin_memory=True)

val_loader = torch.utils.data.DataLoader(
    val_data, batch_size=batch_size, num_workers=2, shuffle=True, pin_memory=True)


## Model

In [6]:
class Flatten(nn.Module):
    def forward(self, input):
        return input.view(input.size(0), -1)

In [7]:
flat_model = nn.Sequential()

# reshape from "images" to flat vectors
flat_model.add_module('flatten', Flatten())

# dense "head"
flat_model.add_module('dense1', nn.Linear(3 * 64 * 64, 1064))
flat_model.add_module('dense2', nn.Linear(1064, 512))
flat_model.add_module('dropout0', nn.Dropout(0.05)) 
flat_model.add_module('dense3', nn.Linear(512, 256))
flat_model.add_module('dropout1', nn.Dropout(0.05))
flat_model.add_module('dense4', nn.Linear(256, 64))
flat_model.add_module('dropout2', nn.Dropout(0.05))
flat_model.add_module('dense1_relu', nn.ReLU())
flat_model.add_module('dense2_logits', nn.Linear(64, 200))

In [8]:
cnn = nn.Sequential(
    nn.Conv2d(in_channels=3, out_channels=2048, kernel_size=(3,3)),
    nn.Conv2d(in_channels=2048, out_channels=1024, kernel_size=(3,3)),
    nn.Conv2d(in_channels=1024, out_channels=512, kernel_size=(3,3)),
    nn.ReLU(),
    nn.MaxPool2d((6,6)),
    nn.Conv2d(in_channels=6, out_channels=32, kernel_size=(20,20)),
    nn.Conv2d(in_channels=32, out_channels=64, kernel_size=(20,20)),
    nn.Conv2d(in_channels=64, out_channels=128, kernel_size=(20,20)),
    nn.Softmax(),
    Flatten(),
    nn.Linear(64, 256),
    nn.Softmax(),
    nn.Linear(256, 10),
    nn.Sigmoid(),
    nn.Dropout(0.5)
    
)

# cnn.cuda()

## Loss

In [62]:
ce_loss = nn.CrossEntropyLoss()
nll_loss = nn.NLLLoss()

## Optimizer

In [63]:
opt_cnn = torch.optim.Adam(cnn.parameters())
opt_flat = torch.optim.Adam(flat_model.parameters())

## Training

In [64]:
train_loss = []
val_loss = []
val_accuracy = []
epochs = 25


import time

In [65]:
def onehotEncode(y):
    y_onehot = torch.zeros(y.shape[0], 200)
    y_onehot[ torch.arange(y.shape[0]), y] = 1
    return torch.LongTensor(y_onehot)

In [72]:
epochs = 1
for i in range(epochs):
    start_time = time.time()
    flat_model.train(True) # enable dropout / batch_norm training behavior
    for batch_idx, (x_batch, y_batch) in enumerate(tqdm(train_loader)):
        # train on batch
        activations = flat_model(x_batch)
        
#         loss = ce_loss(activations, onehotEncode(y_batch))
        loss = nll_loss(activations, y_batch)
        loss.backward()
        opt_flat.step()
        opt_flat.zero_grad()
        
        train_loss.append(loss.cpu().data.numpy())
        
    
#     flat_model.train(False) # disable dropout / use averages for batch_norm
#     for batch_idx, (x_batch, y_batch) in enumerate(tqdm(val_loader)):
#         activations = flat_model(x_batch)
#         loss = CE_loss(x_batch, y_batch)
#         val_loss.append(loss)
        

HBox(children=(IntProgress(value=0, max=1563), HTML(value='')))

In [73]:
train_loss

[array(-3420.2876, dtype=float32),
 array(-7105.568, dtype=float32),
 array(-6862.0815, dtype=float32),
 array(-9555.198, dtype=float32),
 array(-15532.112, dtype=float32),
 array(-18336.94, dtype=float32),
 array(-19861.146, dtype=float32),
 array(-29989.078, dtype=float32),
 array(-43000.867, dtype=float32),
 array(-34782.55, dtype=float32),
 array(-55339.918, dtype=float32),
 array(-64847.074, dtype=float32),
 array(-96655.23, dtype=float32),
 array(-109585.055, dtype=float32),
 array(-120643.59, dtype=float32),
 array(-104602.1, dtype=float32),
 array(-172613.17, dtype=float32),
 array(-205282.55, dtype=float32),
 array(-251104.39, dtype=float32),
 array(-280038.62, dtype=float32),
 array(-308397.72, dtype=float32),
 array(-391618.44, dtype=float32),
 array(-419643.12, dtype=float32),
 array(-441578.16, dtype=float32),
 array(-586499.56, dtype=float32),
 array(-780570.7, dtype=float32),
 array(-811777., dtype=float32),
 array(-931149.75, dtype=float32),
 array(-988923.8, dtype=floa

In [24]:
torch.arange(4)

tensor([0, 1, 2, 3])

When everything is done, please compute accuracy on the validation set and report it below.

In [None]:
val_accuracy = # Your code here
print("Validation accuracy: %.2f%%" % (val_accuracy * 100))

# Report

Below, please mention

* a brief history of tweaks and improvements;
* what is the final architecture and why?
* what is the training method (batch size, optimization algorithm, ...) and why?
* Any regularization and other techniques applied and their effects;

The reference format is:

*"I have analyzed these and these articles|sources|blog posts, tried that and that to adapt them to my problem and the conclusions are such and such".*

In [None]:
# class TinyImagaNet(Dataset):
#     def __init__(self, path):
#         self.root_dir = path
#         self.train_dir = os.path.join(path, "train")
#         self.val_dir = os.path.join(path, "val")
#         self.test_dir = os.path.join(path, "test", "images")
#         self.labels = pd.read_csv(os.path.join(path, 'words.txt'), sep='\t', header=None, names=['code', 'label'])
#         self.folders = pd.read_csv(os.path.join(path, "wnids.txt"), header=None, names=["folders"])
#     def __getitem__(self, folder, idx):
#         pass