# Assignment 2

In this assignment you will implement ResNet18.
Read the comments carefully and insert your code between the markers: <br><br><b>##### START OF YOUR CODE #####</b><br><br><b>##### END OF YOUR CODE #####</b><br><br>or, for inline code, where you see<br><br><b>##### INSERT YOUR CODE HERE #####</b>

### The architecture of ResNet-18 is shown in the table.
First, we will define a convolutional block with a skip connection. Then we will build the model from these blocks.<br><br>
<img src="https://www.researchgate.net/profile/Paolo-Napoletano/publication/322476121/figure/tbl1/AS:668726449946625@1536448218498/ResNet-18-Architecture.png" width="500" alt="ResNet18 Architecture">

<br><sup>Image ref: Napoletano, Paolo, et al. ‘Anomaly Detection in Nanofibrous Materials by CNN-Based Self-Similarity’. Sensors (Basel, Switzerland), vol. 18, 01 2018, https://doi.org/10.3390/s18010209.</sup>

#### I. ConvBlock
<img src="https://www.researchgate.net/publication/334301817/figure/fig3/AS:778452965801986@1562609058538/Residual-block-of-ResNet18-with-a-1-1-convolutional-mapping-based-residual-unit-and.png"><br>
ResNet consists of convolutional (a) and identity (b) blocks. For ResNet-18 we will use only convolutional blocks. In this step you will write a class for the convolutional block. Its arguments are:

* ch_in: input channels
* ch_out: output channels
* s: stride
* act: activation function

The options for activation function are "relu", "leaky_relu" and "gelu".
<br><br>
<sup>Image ref: Owais, Muhammad, et al. ‘Artificial Intelligence-Based Classification of Multiple Gastrointestinal Diseases Using Endoscopy Videos for Clinical Diagnosis’. Journal of Clinical Medicine, vol. 8, 07 2019, p. 986, https://doi.org/10.3390/jcm8070986.</sup>
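
The two branches of the block can only be added if they produce tensors of the same spatial size. The sketch below checks this with the standard convolution output-size formula. It assumes the main branch uses 3x3 convolutions with padding 1 and the shortcut uses a 1x1 convolution with no padding. The helper name `conv_out_size` is hypothetical and not part of the assignment.

```python
def conv_out_size(n, kernel, stride, padding):
    """Output size along one spatial dimension of a convolution (dilation 1)."""
    return (n + 2 * padding - kernel) // stride + 1


for n in (64, 7):          # one even and one odd input size
    for s in (1, 2):       # stride passed to the block
        # Main branch: 3x3 conv with stride s, then 3x3 conv with stride 1
        main = conv_out_size(n, kernel=3, stride=s, padding=1)
        main = conv_out_size(main, kernel=3, stride=1, padding=1)
        # Shortcut branch: 1x1 conv with stride s
        skip = conv_out_size(n, kernel=1, stride=s, padding=0)
        print(f"input={n}, stride={s}: main={main}, shortcut={skip}")
```

Under these assumptions both branches reduce to `(n - 1) // s + 1`, so they agree for any stride.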

In [59]:
import torch
from torch import nn
from torch.nn import functional as F

class ConvBlock(nn.Module):
    def __init__(self, ch_in, ch_out, s, act):
      super(ConvBlock,self).__init__()
      # Initialize layers
      ##### START OF YOUR CODE #####
      # Map the activation name to the corresponding PyTorch module
      activations = {'relu': torch.nn.ReLU(),
                     'leaky_relu': torch.nn.LeakyReLU(),
                     'gelu': torch.nn.GELU()}
      # Shortcut branch: 1x1 convolution that matches the channels and stride of the main branch
      self.conv1 = nn.Conv2d(in_channels = ch_in, out_channels = ch_out, kernel_size = (1, 1), stride = (s, s))
      self.batch1 =  nn.BatchNorm2d(num_features  = ch_out)

      # Main branch: 3x3 convolution with stride s, then 3x3 convolution with stride 1
      self.conv2 = nn.Conv2d(in_channels = ch_in, out_channels = ch_in, kernel_size = (3, 3), stride = (s, s), padding = (1, 1))
      self.batch2 = nn.BatchNorm2d(num_features  = ch_in)
      self.act1 = activations[act]
      self.conv3 = nn.Conv2d(in_channels = ch_in, out_channels = ch_out, kernel_size = (3, 3), stride = (1, 1), padding = (1, 1))
      self.batch3 = nn.BatchNorm2d(num_features  = ch_out)
      # Activation applied after the two branches are added
      self.act2 = activations[act]

      ##### END OF YOUR CODE #####

    def forward(self, X):
      ##### START OF YOUR CODE #####

      fx = self.conv2(X)
      fx = self.batch2(fx)
      fx = self.act1(fx)
      fx = self.conv3(fx)
      fx = self.batch3(fx)
      
      hx = self.conv1(X)
      hx = self.batch1(hx)
           
      X = hx + fx
      X = self.act2(X)
      ##### END OF YOUR CODE #####
      return X
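
The number of learnable parameters in one block can be estimated by hand, which gives a quick sanity check on the layer definitions. The sketch below assumes the layout used in the `ConvBlock` above (two 3x3 convolutions in the main branch, a 1x1 convolution in the shortcut, all with bias, each followed by batch normalization). The helper names are hypothetical.

```python
def conv2d_params(ch_in, ch_out, kernel, bias=True):
    """Weights of a square-kernel Conv2d plus an optional bias per output channel."""
    return kernel * kernel * ch_in * ch_out + (ch_out if bias else 0)


def batchnorm2d_params(ch):
    """Learnable scale and shift per channel (running statistics are not learnable)."""
    return 2 * ch


def conv_block_params(ch_in, ch_out):
    # Main branch: conv 3x3 (ch_in -> ch_in), BN, conv 3x3 (ch_in -> ch_out), BN
    main = (conv2d_params(ch_in, ch_in, 3) + batchnorm2d_params(ch_in)
            + conv2d_params(ch_in, ch_out, 3) + batchnorm2d_params(ch_out))
    # Shortcut branch: conv 1x1 (ch_in -> ch_out), BN
    skip = conv2d_params(ch_in, ch_out, 1) + batchnorm2d_params(ch_out)
    return main + skip


print(conv2d_params(64, 64, 3))    # one 3x3 convolution, 64 -> 64
print(conv2d_params(64, 64, 1))    # the 1x1 shortcut convolution, 64 -> 64
print(conv_block_params(64, 64))   # whole block, 64 -> 64
```

The per-layer values can be compared against the `Param #` column that `torchsummary` prints for the model.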

#### II. ResNet18 class
Use the ConvBlock class to create ResNet18.
* Add batch normalization and activation function after the first conv layer as well.
* Examine the output sizes in the table above and use paddings and strides where needed.
* PyTorch has no layer named global average pooling (although `nn.AdaptiveAvgPool2d(1)` gives the same result). Instead, reshape the feature map to (B, C, H*W) and take the mean over the last dimension without keeping the dimension. The result is a tensor of shape (B, C).
* The fully connected layer should be 512 x 1 because we have only 2 classes, and we will use a sigmoid as the final activation.

<img src="https://www.researchgate.net/profile/Paolo-Napoletano/publication/322476121/figure/tbl1/AS:668726449946625@1536448218498/ResNet-18-Architecture.png" width="500" alt="ResNet18 Architecture">
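
The reshape-and-mean form of global average pooling described in the list above can be sketched as follows. The tensor sizes are chosen only for illustration, and the comparison against `adaptive_avg_pool2d` is an added sanity check rather than part of the assignment.

```python
import torch
from torch.nn import functional as F

# Illustrative feature map with layout (B, C, H, W)
X = torch.randn(16, 512, 8, 8)
B, C = X.shape[:2]

# Flatten the spatial dimensions, then average over them: (B, C, H*W) -> (B, C)
pooled = X.reshape(B, C, -1).mean(dim=-1)

# Reference: adaptive average pooling to a 1x1 map, flattened to (B, C)
reference = F.adaptive_avg_pool2d(X, 1).flatten(1)

print(pooled.shape)
print(torch.allclose(pooled, reference, atol=1e-5))
```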

In [60]:
class ResNet18(nn.Module):
    def __init__(self, act, drop_rate):
      super(ResNet18, self).__init__()
      # Initialize layers
      ##### START OF YOUR CODE #####
      # input: 16, 1, 256, 256 (batch size, channel, height, width) or 256 x 256 x 1
      activations = {'relu': torch.nn.ReLU(),
                     'leaky_relu': torch.nn.LeakyReLU(),
                     'gelu': torch.nn.GELU()}

      self.conv1 = nn.Conv2d(in_channels = 1, out_channels = 64, kernel_size = (7, 7), stride = (2, 2), padding = (3, 3)) # learnable parameters: 7*7*1*64 + 64 (bias) = 3200
      self.batch1 = nn.BatchNorm2d(num_features  = 64)
      self.act1 = activations[act]

      self.maxpool1 = nn.MaxPool2d(3, stride= 2, padding = (1, 1))
      self.conv21_x = ConvBlock(ch_in = 64, ch_out = 64, s = 1, act = act)
      self.conv22_x = ConvBlock(ch_in = 64, ch_out = 64, s = 1, act = act)

      self.conv31_x = ConvBlock(ch_in = 64, ch_out = 64, s = 2, act = act)
      self.conv32_x = ConvBlock(ch_in = 64, ch_out = 128, s = 1, act = act)

      self.conv41_x = ConvBlock(ch_in = 128, ch_out = 128, s = 2, act = act)
      self.conv42_x = ConvBlock(ch_in = 128, ch_out = 256, s = 1, act = act)

      self.conv51_x = ConvBlock(ch_in = 256, ch_out = 256, s = 2, act = act)
      self.conv52_x = ConvBlock(ch_in = 256, ch_out = 512, s = 1, act = act)

      self.drop = torch.nn.Dropout(p=drop_rate)
      self.fc = nn.Linear(512, 1)
      self.sg = nn.Sigmoid()

      ##### END OF YOUR CODE #####

    def forward(self, X):
      ##### START OF YOUR CODE #####
      
      # 256 x 256 x 1
      X = self.conv1(X)    # output: 128 x 128 x 64
      X = self.batch1(X)   # output: 128 x 128 x 64
      X = self.act1(X)     # output: 128 x 128 x 64
      X = self.maxpool1(X) # output: 64 x 64 x 64
      
      X = self.conv21_x(X)  # output: 64 x 64 x 64
      X = self.conv22_x(X)  # output: 64 x 64 x 64
      
      X = self.conv31_x(X)  # output: 32 x 32 x 64
      X = self.conv32_x(X)  # output: 32 x 32 x 128
    
      X = self.conv41_x(X)  # output: 16 x 16 x 128
      X = self.conv42_x(X)  # output: 16 x 16 x 256
   
      X = self.conv51_x(X)  # output: 8 x 8 x 256
      X = self.conv52_x(X)  # output: 8 x 8 x 512
      
      # Global average pooling: average each feature map, (B, C, H, W) -> (B, C, 1, 1)
      X = nn.AvgPool2d(X.size(-1))(X)
      X = X.view(X.size(0), -1)  # flatten to (B, C)
      X = self.drop(X) # dropout after average pooling

      X = self.fc(X)
      X = self.sg(X)
      ##### END OF YOUR CODE #####
      return X
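
The output sizes noted in the forward pass can be checked without running the model. The sketch below traces the spatial size of a 256 x 256 input through the layers that change it, using the kernel sizes, strides and paddings from the code above. The helper `out_size` and the `layers` list are hypothetical and written only for this check. Blocks with stride 1 keep the size and are mostly omitted.

```python
def out_size(n, kernel, stride, padding):
    """Output size of a convolution or max-pooling layer along one dimension."""
    return (n + 2 * padding - kernel) // stride + 1


# (name, kernel, stride, padding) of the first size-determining layer in each stage
layers = [
    ("conv1", 7, 2, 3),
    ("maxpool1", 3, 2, 1),
    ("conv21_x", 3, 1, 1),
    ("conv31_x", 3, 2, 1),
    ("conv41_x", 3, 2, 1),
    ("conv51_x", 3, 2, 1),
]

size = 256
trace = {}
for name, k, s, p in layers:
    size = out_size(size, k, s, p)
    trace[name] = size
    print(f"{name}: {size} x {size}")
```

The final 8 x 8 map is what the average pooling step reduces to a single value per channel.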

In [61]:
from torchsummary import summary

# Print the summary of model
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')  # fall back to CPU when no GPU is available
model = ResNet18("relu", .5).to(device)
summary(model, (1, 256, 256))

----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Conv2d-1         [-1, 64, 128, 128]           3,200
       BatchNorm2d-2         [-1, 64, 128, 128]             128
              ReLU-3         [-1, 64, 128, 128]               0
         MaxPool2d-4           [-1, 64, 64, 64]               0
            Conv2d-5           [-1, 64, 64, 64]          36,928
       BatchNorm2d-6           [-1, 64, 64, 64]             128
              ReLU-7           [-1, 64, 64, 64]               0
            Conv2d-8           [-1, 64, 64, 64]          36,928
       BatchNorm2d-9           [-1, 64, 64, 64]             128
           Conv2d-10           [-1, 64, 64, 64]           4,160
      BatchNorm2d-11           [-1, 64, 64, 64]             128
             ReLU-12           [-1, 64, 64, 64]               0
        ConvBlock-13           [-1, 64, 64, 64]               0
           Conv2d-14           [-1, 64,