# Q3: Even deeper! ResNet-18 for PASCAL classification (15 pts)

Hopefully we all got much better accuracy with the deeper model! Since 2012, much deeper architectures have been proposed. [ResNet](https://arxiv.org/abs/1512.03385) is one of the popular ones. In this task, we attempt to further improve the performance with the “very deep” ResNet-18 architecture.


## 3.1 Build ResNet-18 (1 pts)
Write a network module for the ResNet-18 architecture (refer to the original paper). You can use `torchvision.models` for this section, so it should be very easy!

In [1]:
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision import models
import matplotlib.pyplot as plt
%matplotlib inline

import trainer
from utils import ARGS
from simple_cnn import SimpleCNN
from voc_dataset import VOCDataset


# you could write the whole class....
# or one line :D
ResNet = models.resnet18()
# swap the 1000-way ImageNet head for the 20 PASCAL VOC classes
ResNet.fc = nn.Linear(512, 20, bias=True)
print(ResNet.fc.bias)

Parameter containing:
tensor([-0.0066, -0.0033, -0.0072,  0.0302,  0.0039,  0.0047, -0.0424, -0.0206,
        -0.0154,  0.0231,  0.0344, -0.0125,  0.0010, -0.0060, -0.0230, -0.0085,
         0.0352, -0.0417,  0.0276,  0.0302], requires_grad=True)


In [2]:
# Dry run of a per-layer grouping (e.g. for the gradient histograms in 3.2):
# collect conv tensors keyed by top-level module name (conv1, layer1, ...).
# Weights stand in for gradients here, since param.grad is None before training.
with torch.no_grad():
    grad_dict = dict()
    for name, param in ResNet.named_parameters():
        if 'conv' in name:
            layer_name = name.split('.')[0]
            grad_dict.setdefault(layer_name, []).append(param.flatten())

    # Concatenate each group into one flat tensor per layer.
    for key, grad_list in grad_dict.items():
        grad_dict[key] = torch.cat(grad_list)

## 3.2 Add Tensorboard Summaries (6 pts)
You should've already written TensorBoard summary generation code in `trainer.py` for Q1. However, you probably added only the most basic summaries. Please implement the more advanced summaries listed here:
* training loss (should be done)
* testing MAP curves (should be done)
* learning rate
* histogram of gradients
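
The last two summaries could be logged along the lines of the sketch below. It is only one possible approach, not the reference `trainer.py`: the function name `log_lr_and_grad_histograms` and the tag names are hypothetical, and `writer` is assumed to be a `torch.utils.tensorboard.SummaryWriter` (any object exposing `add_scalar` and `add_histogram` would work). Gradients are grouped by top-level module name so each ResNet stage gets one histogram.

```python
import torch


def log_lr_and_grad_histograms(writer, model, optimizer, step):
    """Log the current learning rate and one gradient histogram per top-level layer.

    Hypothetical helper: call it after loss.backward() so gradients exist.
    """
    # Learning rate of the first parameter group (there is only one group
    # when the optimizer is built from model.parameters()).
    writer.add_scalar('train/lr', optimizer.param_groups[0]['lr'], step)

    # Group flattened gradients by the first component of the parameter name,
    # e.g. 'layer1.0.conv1.weight' -> 'layer1'.
    grads_by_layer = {}
    for name, param in model.named_parameters():
        if param.grad is None:
            continue  # parameter did not take part in the last backward pass
        layer = name.split('.')[0]
        grads_by_layer.setdefault(layer, []).append(param.grad.detach().flatten())

    for layer, grads in grads_by_layer.items():
        writer.add_histogram(f'grad/{layer}', torch.cat(grads), step)
```

In a training loop this would typically be called every few hundred iterations, between `loss.backward()` and `optimizer.step()`, with `step` set to the global iteration count.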

## 3.3 Train and Test (8 pts)
Use the same hyperparameter settings as in Task 2 and train the model for 50 epochs. Report TensorBoard screenshots for *all* of the summaries listed above (for image summaries, show screenshots at $n \geq 3$ iterations).

**REMEMBER TO SAVE A MODEL AT THE END OF TRAINING**

In [3]:
args = ARGS(batch_size=32, test_batch_size=32, epochs=50, val_every=250, lr=1e-3, size=227, save_freq=10)
model = ResNet
optimizer = torch.optim.Adam(model.parameters(), lr=args.lr)
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=args.gamma)
test_ap, test_map = trainer.train(args, model, optimizer, scheduler, model_name='runs/q3/model')
print('test map:', test_map)

test map: 0.423294126007022
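
To see what the `StepLR` schedule above does to the learning rate over 50 epochs, here is a small standalone sketch. The value `gamma=0.1` is an assumption for illustration only, since the notebook reads it from `args.gamma` and the actual value is not shown here; the dummy parameter stands in for the real model.

```python
import torch

# A single dummy parameter is enough to drive the optimizer/scheduler pair.
param = torch.nn.Parameter(torch.zeros(1))
param.grad = torch.zeros_like(param)  # placeholder gradient so step() has work to do

optimizer = torch.optim.Adam([param], lr=1e-3)
# gamma=0.1 is an assumed value; the notebook takes it from args.gamma.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.1)

lrs = []
for epoch in range(50):
    lrs.append(scheduler.get_last_lr()[0])  # learning rate in effect this epoch
    optimizer.step()   # stands in for a full epoch of training
    scheduler.step()   # multiplies the lr by gamma after every 10th call

# Print the learning rate at the start of each 10-epoch block.
for epoch in range(0, 50, 10):
    print(epoch, lrs[epoch])
```

Under this assumed `gamma`, the learning rate drops by a factor of ten at epochs 10, 20, 30, and 40, which is the kind of staircase the learning-rate summary in TensorBoard should show.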


![TensorBoard: test mAP](imgs/q3_tb_map.png)
![TensorBoard: training loss](imgs/q3_tb_loss.png)