# Exercises 20 
*on Convolutional Neural Networks (CNN)*

At this point math is too complicated, so we use pytorch / keras already implemented solutions

### Exercise 1
- 32 x 32 picture, 
- 5 x 5 filter, 
- padding 0, 
- stride 1

**Answers**:
- New Image; 28 x 28; (32 - (5 - 1)) = 28 (for each filter applied seperately)
- To avoid reduction; 2 rows of padding! 
- Number of parameters; 25 per filter (5x5) + 1 bias per filter = 78 parameters (including biases)
- Constrain?

### Exercise 2
- 200 x 300 picture
- 100 Feature maps
- 3 x 3 Kernel in 3 layers
- stride 2
- padding "same" (i.e. 1 row here - whatever is needed to keep the output dimension the same)

This means;
Input n x 3 x 200 x 300
C1 n x 100 x 200 x 300
C2 n x 200 x 200 x 300
C3 n x 400 x 200 x 300

**Answers**:
- Parameters 
  - **C1**: One filter is 3 x 3 (x 3 dimensions RGB)(x 100 feature maps) + 100 biases = 2.800
  - **C2**: 3 x 3 x 100 x 200 + 200 = 180.200
  - **C3**: 3 x 3 x 200 x 400 + 400 = 720.400
  - **Total**: 903.400
- 903.400 * 32 (bits) / 8 (bytes) / 1000 (kB) / 1000 (MB) = 6.49 MB (Just for storing parameters for 1 image)
- 50 x 200 x 300 x 3 x 32 / 8 / 1000 / 1000 = 36 MB For storing 50 Images :)
- Max pooling has WAY less parameters & Compresses features to be less volatile

In [1]:
import torch
import torch.nn as nn
import numpy as np

from torch.utils.data import Dataset, DataLoader

### Coding - Training

In [2]:
# Create class Data that holds fashion mnist data
## Default 60.000 x 1 x 28 x 28 Train & 10.000 x 1 x 28 x 28 Test
class Data(Dataset):
    def __init__(self, n, train = True):
        # Load N fashion mnist images + targets
        x = [] # Images (n x 784)
        y = [] # Class labels
        with open(f'data/fashion-mnist_{"train" if train else "test"}.csv', 'r') as f:
            f.readline() # Header, w. column names
            for i in range(n):
                line = f.readline()
                sample = list(map(int, line.strip().split(',')))
                trg = sample[0]
                img = sample[1:]
                x.append(img)
                y.append(trg)
        
        # Reshape and convert to tensors
        self.x = torch.tensor(x).reshape(n, 1, 28, 28).float()
        self.y = torch.tensor(y).long()

    def __len__(self):
        return len(self.y)

    def __getitem__(self, i):
        return self.x[i], self.y[i]

In [3]:
# Load Data
train = Data(n=10**3, train=True)
test = Data(n=10**2, train=False)

# Initialise data loader
batchtrain = DataLoader(train, batch_size=len(train))
minibatchtrain = DataLoader(train, batch_size=2**7, shuffle=True)

  self.x = torch.tensor(x).reshape(n, 1, 28, 28).float()


### Coding - Modeling

In [4]:
# LeNet CNN
class LeNet(nn.Module):
    def __init__(self):
        # Inherited constructor
        super().__init__()

        # Input dimension: n (sample) x 1 (input channels) x 28 x 28 (image dimensions)
        self.c1 = nn.Conv2d(in_channels=1, out_channels=6, kernel_size=5, padding=2) # n x 6 x 28 x 28
        self.s2 = nn.AvgPool2d(kernel_size=2, stride=2) # n x 6 x 14 x 14

        self.c3 = nn.Conv2d(in_channels=6, out_channels=16, kernel_size=5) # n x 16 x 10 x 10
        self.s4 = nn.AvgPool2d(kernel_size=2, stride=2) # n x 16 x 5 x 5

        self.c5 = nn.Conv2d(in_channels=16, out_channels=120, kernel_size=5) # n x 120 x 1 x 1 (i.e. n x 120)

        self.f6 = nn.Linear(in_features=120, out_features=84) # n x 84
        self.out = nn.Linear(in_features=84, out_features=10)

        # Activation Function
        self.tanh = nn.Tanh()
        self.softmax = nn.Softmax(dim=1)

    def forward(self, x):
        # First Comvolutional layer + avg. pooling
        x = self.tanh(self.c1(x))
        x = self.tanh(self.s2(x))

        x = self.tanh(self.c3(x))
        x = self.tanh(self.s4(x))

        x = self.tanh(self.c5(x)).reshape(len(x), -1)

        x = self.tanh(self.f6(x))

        x = self.softmax(self.out(x))

        return x
    
    def predict(self, x):
        torch.argmax(self.forward(x), 1)


In [5]:
img, trg = next(iter(minibatchtrain))
print(img.shape, trg.shape)

torch.Size([128, 1, 28, 28]) torch.Size([128])


In [6]:
# Initialise model
lenet = LeNet()

# Optimizer & Cost function
optimiser = torch.optim.Adam(lenet.parameters(), lr=10e-3)
cost = nn.CrossEntropyLoss()

In [7]:
epochs = 50

acc = lambda y, yhat: (y == yhat) / len(y)

for epoch in range(epochs):
    for xbatch, ybatch in minibatchtrain:
        # zero out gradients
        optimiser.zero_grad()

        # Make predictions
        pred = lenet(xbatch)

        # Compute loss
        loss = cost(pred, ybatch)

        # Backpropagate loss
        loss.backward()
        optimiser.step()
    
    xtrain, ytrain = next(iter(batchtrain))
    ypred = lenet.predict(xtrain)
    print(f"Epoch {epoch+1} Loss: {loss} Train Accuracy: {acc(ytrain,ypred)}")

Epoch 1 Loss: 2.0980172157287598 Train Accuracy: 0.0
Epoch 2 Loss: 1.9181586503982544 Train Accuracy: 0.0
Epoch 3 Loss: 1.797006368637085 Train Accuracy: 0.0
Epoch 4 Loss: 1.8298213481903076 Train Accuracy: 0.0
Epoch 5 Loss: 1.7904880046844482 Train Accuracy: 0.0
Epoch 6 Loss: 1.7941962480545044 Train Accuracy: 0.0
Epoch 7 Loss: 1.6947968006134033 Train Accuracy: 0.0
Epoch 8 Loss: 1.6915745735168457 Train Accuracy: 0.0
Epoch 9 Loss: 1.7028948068618774 Train Accuracy: 0.0
Epoch 10 Loss: 1.6342530250549316 Train Accuracy: 0.0
Epoch 11 Loss: 1.7676767110824585 Train Accuracy: 0.0
Epoch 12 Loss: 1.7009880542755127 Train Accuracy: 0.0
Epoch 13 Loss: 1.7352887392044067 Train Accuracy: 0.0
Epoch 14 Loss: 1.645838975906372 Train Accuracy: 0.0
Epoch 15 Loss: 1.7700735330581665 Train Accuracy: 0.0
Epoch 16 Loss: 1.6923681497573853 Train Accuracy: 0.0
Epoch 17 Loss: 1.7134931087493896 Train Accuracy: 0.0
Epoch 18 Loss: 1.6652005910873413 Train Accuracy: 0.0
Epoch 19 Loss: 1.6950690746307373 Train