<a href="https://colab.research.google.com/github/emmad225/BIACoursework/blob/main/duffyep_lab8_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# CSCI 3397 Lab 8: Image Prediction with CNN Models

**Posted:** Saturday, March 30, 2024

**Due:** Monday, April 8, 2024

__Total Points__: 11 pts

__Submission__: Please rename the .ipynb file to __\<your_username\>\_lab8.ipynb__ before you submit it to Canvas. Example: weidf_lab8.ipynb.

# <b>1. CNN models [11 pts] </b>

After learning AlexNet, you should be comfortable implementing other variants from their descriptions. Let's try one old CNN and one new CNN.

## 1.1 LeNet [5 pts]

The paper by Yann LeCun et al. (1989), <a href="http://yann.lecun.com/exdb/publis/pdf/lecun-89e.pdf">Backpropagation Applied to Handwritten Zip Code Recognition</a>, is widely considered the earliest real-world application of a convolutional neural network. Apart from the tiny dataset and model, the paper still reads as modern more than three decades later: it describes the dataset, model architecture, loss function, optimization, and classification error rates over the training and test sets. Today, let's reproduce it with PyTorch!

<img src="https://karpathy.github.io/assets/lecun/lecun1989.png"/>


In this pset, you will be reading different tutorials to implement the concepts learned in class. Happy Hacking!



The model described in the paper is a bit non-standard by now. We will implement a simplified version instead:

```
- Input: 16x16x3 (in_channels)
- Feature Extraction
  * Layer H1 (conv): kernel_size=5x5, #kernel=12, stride=2, padding=2
  * Nonlinearity: nn.Tanh
  * Layer H2 (conv): kernel_size=5x5, #kernel=12, stride=2, padding=2
  * Nonlinearity: nn.Tanh
- Reshape Layer: torch.flatten
- Classifier
  * Layer H3 (linear): #hidden units=30
  * Nonlinearity: nn.Tanh
  * Output (linear): #output units=9 (Donglai said to change this to 9)
```
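
The input size of layer H3 follows from the spatial sizes after H1 and H2. As a rough check, the sketch below applies the output-size formula from the `nn.Conv2d` documentation (with dilation 1) in plain Python. The helper name `conv2d_out_size` is made up for this illustration and is not part of PyTorch.

```python
def conv2d_out_size(size: int, kernel_size: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution with dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1

# Trace the 16x16 input through H1 and H2 (both 5x5, stride 2, padding 2).
h1 = conv2d_out_size(16, kernel_size=5, stride=2, padding=2)
h2 = conv2d_out_size(h1, kernel_size=5, stride=2, padding=2)

# H2 has 12 output channels, so the flattened vector has 12 * h2 * h2 entries.
flat_features = 12 * h2 * h2
print(h1, h2, flat_features)  # 8 4 192
```

Under these settings the feature maps shrink from 16x16 to 8x8 to 4x4, so the first linear layer should expect 12*4*4 = 192 input features.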

**TODO:**
- Read and understand the example AlexNet code [link](https://github.com/pytorch/vision/blob/main/torchvision/models/alexnet.py) (lines 18-53)
- Follow the example and implement the class `CNN1989` based on the description above.

Hints:
- For each type of layer, you can check out the documentation: [conv2d layer](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html), [linear layer](https://pytorch.org/docs/stable/generated/torch.nn.Linear.html)
- The loss function below will take care of the softmax normalization of the prediction
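
The last hint means the model should return raw scores (logits) with no final softmax layer. The plain-Python sketch below illustrates what a softmax cross-entropy loss computes from logits, assuming the loss used later in the lab behaves like `nn.CrossEntropyLoss`. The helper names and the example logits are hypothetical.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of raw scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_from_logits(logits, target):
    """Negative log-probability of the target class, computed from raw scores."""
    return -math.log(softmax(logits)[target])

# Hypothetical raw outputs of the network for one image over the 9 classes.
logits = [2.0, 0.5, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0]
probs = softmax(logits)
loss = cross_entropy_from_logits(logits, target=0)
print(round(sum(probs), 6), loss)
```

Because the normalization happens inside the loss, adding an extra softmax at the end of the model would apply it twice.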

In [None]:
import torch
import torch.nn as nn

class CNN1989(nn.Module):
  def __init__(self) -> None:
    super().__init__()

    #### your code starts ####
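    self.features = nn.Sequential(
      # H1: 3 input channels -> 12 feature maps, 16x16 -> 8x8
      nn.Conv2d(in_channels=3, out_channels=12, kernel_size=5, stride=2, padding=2),
      nn.Tanh(),
      # H2: 12 -> 12 feature maps, 8x8 -> 4x4
      nn.Conv2d(in_channels=12, out_channels=12, kernel_size=5, stride=2, padding=2),
      nn.Tanh(),
    )
    self.classifier = nn.Sequential(
      # H3: flattened 12*4*4 features -> 30 hidden units
      nn.Linear(12*4*4, 30),
      nn.Tanh(),
      # Output: 9 class scores (logits)
      nn.Linear(30, 9),
    )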
    #### your code ends ####

  def forward(self, x: torch.Tensor) -> torch.Tensor:
    #### your code starts ####
    x = self.features(x)
    x = torch.flatten(x, 1)  # flatten every dimension except the batch
    x = self.classifier(x)
    #### your code ends ####
    return x

## 1.2 ResNet [6 pts]



### (a) Basic block [3 pts]

<img height=150 src="https://neurohive.io/wp-content/uploads/2019/01/resnet-e1548261477164.png" />

In [None]:
import torch.nn.functional as F  # needed for F.relu in forward

class BasicBlock(nn.Module):
    expansion = 1
    def __init__(self, in_planes, planes, stride=1):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_planes, planes, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(planes)
        self.conv2 = nn.Conv2d(planes, planes, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(planes)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_planes != self.expansion*planes:
            # make sure the shortcut has the same dimension
            self.shortcut = nn.Sequential(
                nn.Conv2d(in_planes, self.expansion*planes, kernel_size=1, stride=stride, bias=False),
                nn.BatchNorm2d(self.expansion*planes)
            )

    def forward(self, x):
        #### Your code starts here ####
        # finish the forward pass
        out = self.conv1(x)
        out = self.bn1(out)
        out = F.relu(out)
        out = self.conv2(out)
        out = self.bn2(out)

        # Shortcut connection
        out += self.shortcut(x)
        out = F.relu(out)
        #### Your code ends here ####
        return out
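
The addition `out += self.shortcut(x)` only works when both branches produce tensors of the same shape. The sketch below checks this with plain arithmetic rather than PyTorch, using the same condition as `BasicBlock.__init__` with `expansion = 1`. The helpers `conv_out` and `basic_block_shapes` are hypothetical names introduced for this illustration.

```python
def conv_out(size, kernel_size, stride, padding):
    """Spatial output size of a convolution with dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1

def basic_block_shapes(in_planes, planes, size, stride=1):
    """Return (main_path, shortcut) shapes as (channels, height) pairs."""
    # Main path: 3x3 conv with the block stride, then 3x3 conv with stride 1.
    main = conv_out(size, 3, stride, 1)
    main = conv_out(main, 3, 1, 1)
    # Shortcut: identity unless the stride or the channel count changes,
    # in which case a 1x1 conv with the same stride (no padding) is used.
    if stride != 1 or in_planes != planes:
        shortcut = (planes, conv_out(size, 1, stride, 0))
    else:
        shortcut = (in_planes, size)
    return (planes, main), shortcut

print(basic_block_shapes(64, 64, 32, stride=1))   # identity shortcut
print(basic_block_shapes(64, 128, 32, stride=2))  # projection shortcut
```

In both cases the two branches agree, which is why the 1x1 projection is only inserted when the stride or the number of channels changes.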

### (b) ResNet-18 model [3 pts]
Let's implement the smallest version of ResNet.

<img height=400 src="https://pytorch.org/assets/images/resnet.png">

In [None]:
import torch.nn.functional as F

class ResNet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=10):
        super(ResNet, self).__init__()
        self.in_planes = 64
        self.conv1 = nn.Conv2d(3, 64, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(64)

        self.layer1 = self._make_layer(block, 64, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 128, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 256, num_blocks[2], stride=2)
        self.layer4 = self._make_layer(block, 512, num_blocks[3], stride=2)
        self.linear = nn.Linear(512*block.expansion, num_classes)

    def _make_layer(self, block, planes, num_blocks, stride):
        strides = [stride] + [1]*(num_blocks-1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_planes, planes, stride))
            self.in_planes = planes * block.expansion
        return nn.Sequential(*layers)

    def forward(self, x):
        # conv1: conv -> bn -> relu
        out = F.relu(self.bn1(self.conv1(x)))

        # other conv layers
        #### your code starts ####
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = self.layer4(out)
        #### your code ends ####

        # avg pooling (a 4x4 window covers the whole feature map when the input is 32x32)
        out = F.avg_pool2d(out, 4)
        # reshape
        out = out.view(out.size(0), -1)

        # classification
        #### your code starts ####
        out = self.linear(out)
        #### your code ends ####
        return out

def ResNet18():
    return ResNet(BasicBlock, [2, 2, 2, 2])
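
To see how `_make_layer` shapes the network, the plain-Python sketch below traces the stride and spatial size of every block for `num_blocks=[2, 2, 2, 2]`. It assumes a 32x32 input, which the code above does not state explicitly but which matches the `F.avg_pool2d(out, 4)` call. The function `trace_resnet` is a hypothetical helper, and the depth count follows the usual convention of ignoring the 1x1 shortcut convolutions.

```python
def conv_out(size, kernel_size, stride, padding):
    """Spatial output size of a convolution with dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1

def trace_resnet(num_blocks, size=32):
    """Return per-block (planes, stride, output_size) and the weighted-layer count."""
    size = conv_out(size, 3, 1, 1)  # stem conv1: 3x3, stride 1, padding 1
    trace = []
    for planes, first_stride, n in zip([64, 128, 256, 512], [1, 2, 2, 2], num_blocks):
        strides = [first_stride] + [1] * (n - 1)  # same rule as _make_layer
        for stride in strides:
            # The first conv of the block applies the stride; the second keeps the size.
            size = conv_out(size, 3, stride, 1)
            trace.append((planes, stride, size))
    # Stem conv + two 3x3 convs per block + final linear layer.
    depth = 1 + 2 * sum(num_blocks) + 1
    return trace, depth

trace, depth = trace_resnet([2, 2, 2, 2])
for planes, stride, size in trace:
    print(f"planes={planes:3d} stride={stride} output={size}x{size}")
print("depth:", depth)
```

Under this assumption the last stage outputs 512 channels at 4x4, so the 4x4 average pool reduces each channel to a single value and the linear layer receives 512 features. The count of 18 weighted layers is where the name ResNet-18 comes from.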