# Deep Learning
## Exercise 4 - Convolutional Neural Networks

### 0. Neural Networks with PyTorch
Continuing from the last exercise, work through the **What is torch.nn really?** [tutorial notebook](https://pytorch.org/tutorials/beginner/nn_tutorial.html).


### 1. CNN Forward Pass

In this exercise we revisit the convolutional neural network (CNN) with one-dimensional input from the exercise *CNNs Forward Pass*, but this time we compute the forward pass with `torch` instead of by hand.

While two-dimensional CNNs can be used, for example, for grayscale images, one-dimensional CNNs can be used for time series such as temperature or humidity readings. The concepts for the 1D case carry over directly from 2D networks. We interpret the data in our network as three-dimensional arrays in which a row denotes a feature map, a column denotes a single dimension of the observation, and the depth of the array represents different observations. As we will only work with a single input vector, the depth will always be one.

In the following steps you will build a CNN for the following input:
* Input $I$: Matrix of size $1 \times 1 \times 12$. We therefore have one input consisting of a single feature map with twelve dimensions.


In [None]:
import torch
import torch.nn as nn

In [None]:
#ToDo: Define the input tensor

In [None]:
x = torch.tensor([0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0], dtype=torch.float).unsqueeze(0).unsqueeze(0)
print(f'Input:\n{x}')

#### 1. Define the layers of your CNN using `torch.nn`

The CNN consists of the following layers:
1. Convolutional layer with kernel of size $1 \times 3 \times 2$.
2. Max-pooling layer with stride $2$ and filter size $2$. Note that max-pooling pools each feature map separately.
3. Convolutional layer with convolutional kernel of size $2 \times 3 \times 1$.
4. Fully connected layer that maps all of its inputs to two outputs.
5. A final sigmoid activation function.

Omit all bias terms and use no padding.

In [None]:
#ToDo: Define all layers

In [None]:
conv_1 = nn.Conv1d(in_channels=1, out_channels=2, kernel_size=3, bias=False, padding=0)
max_pool = nn.MaxPool1d(kernel_size=2, stride=2, padding=0)
conv_2 = nn.Conv1d(in_channels=2, out_channels=1, kernel_size=3, bias=False, padding=0)
full_connected = nn.Linear(in_features=3, out_features=2, bias=False)
activation = nn.Sigmoid()
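The value `in_features=3` of the fully connected layer follows from how each layer shortens the sequence. Here is a minimal sketch, assuming the standard output-length formula for convolution and pooling with dilation 1, $\lfloor (L + 2p - k)/s \rfloor + 1$. The helper `out_length` is a hypothetical name used only for this illustration:

```python
def out_length(length, kernel_size, stride=1, padding=0):
    """Output length of a 1D convolution or pooling layer (dilation 1)."""
    return (length + 2 * padding - kernel_size) // stride + 1

lengths = []
length = 12                                            # the input has twelve dimensions
length = out_length(length, kernel_size=3)             # first convolution
lengths.append(length)
length = out_length(length, kernel_size=2, stride=2)   # max-pooling
lengths.append(length)
length = out_length(length, kernel_size=3)             # second convolution
lengths.append(length)
print(lengths)  # [10, 5, 3]
```

The second convolution has a single output channel, so the fully connected layer receives three values.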

#### 2. Manually set the weights.

To assign new weights, you have to wrap a tensor in `nn.Parameter()`.
The layers have the following weights:
* The first convolutional layer has filters: $F_0^1 = (-1, 0, 1)$ and $F_1^1 = (1, 0, -1)$
* The second convolutional layer has the filter: $F_0^2 = ((-1, 0, 1), (1, 0, -1))$.
* The first output of the fully connected layer is calculated as the negative sum of all its inputs, and the second output is calculated as the positive sum of all its inputs.

In [None]:
#ToDo: Set all weights

In [None]:
conv_1.weight = nn.Parameter(torch.tensor([[[-1,0,1]],[[1,0,-1]]], dtype=torch.float))
conv_2.weight = nn.Parameter(torch.tensor([[[-1,0,1],[1,0,-1]]], dtype=torch.float))
full_connected.weight = nn.Parameter(torch.tensor([[-1, -1, -1], [1, 1, 1]], dtype=torch.float))
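The nesting of the weight lists matters: `nn.Conv1d` stores its weight with shape `(out_channels, in_channels, kernel_size)` and `nn.Linear` with shape `(out_features, in_features)`. The following sketch checks the nesting without `torch`. The helper `shape_of` and the `*_weights` names are hypothetical and only serve this illustration:

```python
def shape_of(nested):
    """Return the shape of a regularly nested list as a tuple."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)

conv_1_weights = [[[-1, 0, 1]], [[1, 0, -1]]]  # two filters, one input channel each
conv_2_weights = [[[-1, 0, 1], [1, 0, -1]]]    # one filter spanning two input channels
fc_weights = [[-1, -1, -1], [1, 1, 1]]         # two outputs, three inputs

print(shape_of(conv_1_weights))  # (2, 1, 3)
print(shape_of(conv_2_weights))  # (1, 2, 3)
print(shape_of(fc_weights))      # (2, 3)
```

If a shape does not match the layer definition, the assignment itself still succeeds, but the forward pass will most likely fail with a shape error.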

#### 3. Perform a forward pass, printing the output after each layer.

In [None]:
#ToDo: Implement a forward pass

In [None]:
h1 = conv_1(x)
print(h1)
h2 = max_pool(h1)
print(h2)
h3 = conv_2(h2)
print(h3)
h4 = full_connected(h3)
print(h4)
output = activation(h4)
print(output)
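To cross-check the printed tensors, the same forward pass can be reproduced in plain Python. This is a sketch under the assumption that the layers behave as defined above (unpadded, stride-1 cross-correlation without bias, and max-pooling with size 2 and stride 2). The functions `conv1d` and `max_pool1d` and the variables `m1` to `m_out` are hypothetical names chosen for this illustration:

```python
import math

def conv1d(channels, filters):
    """Unpadded, stride-1 cross-correlation over a list of feature maps.

    `filters` holds one entry per output map; each entry holds one kernel per input map.
    """
    k = len(filters[0][0])
    out_len = len(channels[0]) - k + 1
    return [
        [
            sum(kernel[j] * channel[i + j]
                for kernel, channel in zip(filt, channels)
                for j in range(k))
            for i in range(out_len)
        ]
        for filt in filters
    ]

def max_pool1d(channels, size=2, stride=2):
    """Max-pooling applied to each feature map separately."""
    return [
        [max(ch[i:i + size]) for i in range(0, len(ch) - size + 1, stride)]
        for ch in channels
    ]

x_manual = [[0, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0]]
m1 = conv1d(x_manual, [[[-1, 0, 1]], [[1, 0, -1]]])  # two maps of length 10
m2 = max_pool1d(m1)                                  # two maps of length 5
m3 = conv1d(m2, [[[-1, 0, 1], [1, 0, -1]]])          # [[0, -4, 0]]
m4 = [-sum(m3[0]), sum(m3[0])]                       # [4, -4]
m_out = [1 / (1 + math.exp(-v)) for v in m4]         # roughly [0.982, 0.018]
print(m1, m2, m3, m4, m_out, sep='\n')
```

The values should agree with `h1` to `output` from the `torch` version, apart from the extra leading dimensions and floating-point formatting.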


### 2. Training an Image Classifier
The CIFAR-10 dataset is an image dataset containing images from 10 classes: ‘airplane’, ‘automobile’, ‘bird’, ‘cat’, ‘deer’, ‘dog’, ‘frog’, ‘horse’, ‘ship’, ‘truck’. Your task is to train a CNN that classifies an input image into one of these 10 classes.


The images in CIFAR-10 are of size 3x32x32, i.e. 3-channel color images of 32x32 pixels. This means we have to create a CNN with 2D convolutions and 3 input channels. More information on CIFAR-10 can be found [here](https://www.cs.toronto.edu/~kriz/cifar.html).

The dataset is already loaded for you. For development and debugging, you can set `use_subset = True` to work with a small subset of the data (100 examples). For your final training run, set it back to `False`.




In [None]:
import torch
import torchvision
import torchvision.transforms as transforms

use_subset = False # Set this to True for debugging purposes.

transform = transforms.ToTensor()

train_dataset = torchvision.datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
val_dataset = torchvision.datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
classes = train_dataset.classes

if use_subset:
    train_dataset = torch.utils.data.Subset(train_dataset, torch.arange(0, 100))
    val_dataset = torch.utils.data.Subset(val_dataset, torch.arange(0, 100))

print(f'classes: {classes}\nnumber of instances:\n\ttrain: {len(train_dataset)}\n\tval: {len(val_dataset)}')

Let us visualize a few examples. Note that the images are only 32x32 pixels, so they look quite pixelated.

In [None]:
import matplotlib.pyplot as plt

def show_examples(n):
    for i in range(n):
        index = torch.randint(0, len(train_dataset), size=(1,)).item() # select a random example as a plain Python int
        image, target = train_dataset[index]
        print(f'image of shape: {image.shape}')
        print(f'label: {classes[target]}')
        plt.imshow(image.permute(1,2,0).numpy())
        plt.show()

show_examples(4)

#### 1.  Create a training and validation dataloader.

Use `torch.utils.data.DataLoader` to retrieve batches of data. Use a batch size of 32. Make sure the training set is reshuffled at every epoch.

In [None]:
from torch.utils.data import DataLoader

In [None]:
#ToDo: create a dataloader for the training and validation data

In [None]:
train_loader = DataLoader(train_dataset, batch_size=32, shuffle=True)
val_loader = DataLoader(val_dataset, batch_size=32, drop_last=False)
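With a batch size of 32, neither split divides evenly, so each loader ends with a smaller batch unless `drop_last=True` is set. The following sketch does the arithmetic, assuming the full CIFAR-10 splits of 50,000 training and 10,000 test images (`use_subset = False`). The variable names are hypothetical:

```python
import math

batch_size = 32
n_train, n_val = 50_000, 10_000   # CIFAR-10 split sizes without the debugging subset

# With drop_last=False (the default) the final, partial batch is kept.
train_batches = math.ceil(n_train / batch_size)
val_batches = math.ceil(n_val / batch_size)
last_train_batch = n_train - (train_batches - 1) * batch_size
last_val_batch = n_val - (val_batches - 1) * batch_size

print(train_batches, last_train_batch)  # 1563 16
print(val_batches, last_val_batch)      # 313 16
```

These counts should match `len(train_loader)` and `len(val_loader)`.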


#### 2.  Define a CNN using `torch.nn`.

You can use `nn.Sequential` as a container for your layers or define your own `nn.Module`. The network should have the following architecture:

1. A 2-d convolutional layer with 48 output channels, a kernel size of (3x3) and (1x1) padding.
2. A second 2-d convolutional layer with 96 output channels, a kernel size of (3x3) and (1x1) padding.
3. A Max-Pooling layer with a 2x2 kernel.
4. A third 2-d convolutional layer with 192 output channels, a kernel size of (3x3) and (1x1) padding.
5. A Max-Pooling layer with a 2x2 kernel.
6. A fully connected layer with output dimension of 64.
7. A final, fully connected classification layer, with output dimension of 10 (one for each class).

Directly after every convolutional and fully connected layer except the final one, apply a ReLU non-linearity. Do not apply a non-linearity after the final layer.
    
Remember to flatten your hidden representations before the first fully connected layer. Hint: `nn.Flatten`

In [None]:
import torch.nn as nn

In [None]:
#ToDo: Define the CNN


#Test on one Batch:
image_batch, target_batch = next(iter(train_loader))
print(model(image_batch).shape)

In [None]:
model = nn.Sequential(
            nn.Conv2d(3, 48, (3,3), padding=(1,1)),
            nn.ReLU(),
            nn.Conv2d(48, 96, (3,3), padding=(1,1)),
            nn.ReLU(),
            nn.MaxPool2d((2,2)),
            nn.Conv2d(96, 192, (3,3), padding=(1,1)),
            nn.ReLU(),
            nn.MaxPool2d((2,2)),
            nn.Flatten(),
            nn.Linear(192*8*8 ,64),
            nn.ReLU(),
            nn.Linear(64,10)
            )

#Test on one Batch:
image_batch, target_batch = next(iter(train_loader))
print(model(image_batch).shape)
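The input size `192*8*8` of the first fully connected layer can be derived by tracing the spatial size through the network. This is a minimal sketch, assuming the usual output-size formula with dilation 1 and that `nn.MaxPool2d` uses a stride equal to its kernel size by default. The helper `conv_out` is a hypothetical name:

```python
def conv_out(size, kernel_size, padding=0, stride=1):
    """Output size along one spatial axis of a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

size = 32                              # CIFAR-10 images are 32x32
size = conv_out(size, 3, padding=1)    # conv 1: stays 32
size = conv_out(size, 3, padding=1)    # conv 2: stays 32
size = conv_out(size, 2, stride=2)     # max-pool: 16
size = conv_out(size, 3, padding=1)    # conv 3: stays 16
size = conv_out(size, 2, stride=2)     # max-pool: 8
flat_features = 192 * size * size      # channels * height * width
print(size, flat_features)             # 8 12288
```

A 3x3 kernel with padding 1 preserves the spatial size, so only the two pooling layers reduce it.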

#### 3.  Define loss function and optimizer.

Use cross entropy loss and SGD as the optimizer. Use a learning rate of 0.01 and set the momentum parameter of `torch.optim.SGD` to 0.9.
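`nn.CrossEntropyLoss` expects raw, unnormalized scores (logits) and applies the log-softmax itself, which is why the network has no non-linearity after its final layer. The sketch below computes the loss for a single example in plain Python. The function `cross_entropy` and the example scores are hypothetical and only illustrate the formula:

```python
import math

def cross_entropy(logits, target):
    """Negative log-softmax probability of the target class for one example."""
    max_logit = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = max_logit + math.log(sum(math.exp(z - max_logit) for z in logits))
    return log_sum_exp - logits[target]

uniform_loss = cross_entropy([0.0] * 10, target=3)           # equal scores for all 10 classes
confident_loss = cross_entropy([8.0] + [0.0] * 9, target=0)  # high score for the correct class
print(f'{uniform_loss:.4f}')  # 2.3026, i.e. ln(10)
print(confident_loss < 0.01)  # True
```

A training loss close to $\ln 10 \approx 2.30$ therefore indicates predictions that are no better than uniform guessing over the ten classes.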

In [None]:
#ToDo: Define loss and optimizer

In [None]:
loss = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
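The effect of the momentum parameter can be seen in a scalar example. This sketch assumes the update rule that `torch.optim.SGD` uses with its default settings (no dampening, no Nesterov momentum, no weight decay): the velocity accumulates gradients, and the parameter moves along the velocity. The function `sgd_momentum_step` and the toy objective are hypothetical:

```python
def sgd_momentum_step(param, grad, velocity, lr=0.01, momentum=0.9):
    """One SGD step with momentum on a scalar parameter."""
    velocity = momentum * velocity + grad   # running accumulation of gradients
    param = param - lr * velocity           # step along the accumulated direction
    return param, velocity

# Toy objective f(w) = w**2 with gradient 2 * w, minimized at w = 0.
w, v = 1.0, 0.0
first_w, first_v = sgd_momentum_step(w, 2 * w, v)
for _ in range(200):
    w, v = sgd_momentum_step(w, 2 * w, v)
print(first_w, first_v)  # 0.98 2.0
print(abs(w) < 1e-3)     # True
```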


#### 4.  Train the CNN for 5 epochs and validate your trained model by computing the accuracy on the validation dataset.


To accelerate the training, you can use a GPU.

*Hint:* If you want to monitor the progress of training, take a look at `tqdm`.
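Using a GPU requires the model and every batch to be on the same device. Below is a minimal sketch of the usual pattern, which falls back to the CPU when no GPU is available. `demo_model` and `demo_batch` are hypothetical stand-ins for the CNN and an image batch:

```python
import torch
import torch.nn as nn

# Pick the GPU if one is available, otherwise stay on the CPU.
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

demo_model = nn.Linear(4, 2).to(device)      # stand-in for the CNN
demo_batch = torch.randn(8, 4).to(device)    # inputs must be on the same device as the model
demo_output = demo_model(demo_batch)
print(demo_output.device)
```

In a training loop this amounts to calling `model.to(device)` once before training and moving `img` and `label` with `.to(device)` inside the loop.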

In [None]:
#ToDo: Implement Training and Validation

In [None]:
from tqdm import tqdm

def train(epochs, model, loss_function, opt, train_loader):
    for epoch in range(epochs):
        model.train()
        cum_loss = 0
        num_batches = 0
        for img, label in tqdm(train_loader, desc='Train Iteration', ascii=True):
            opt.zero_grad()                        # clear gradients from the previous step
            output = model(img)                    # forward pass
            loss = loss_function(output, label)
            loss.backward()                        # backpropagate
            opt.step()                             # update the weights
            cum_loss += loss.item()
            num_batches += 1
        print(f"Epoch {epoch} \t ----> \t Loss {cum_loss/num_batches:.5f}")
              
train(5, model, loss, optimizer, train_loader)

def evaluate(model, loss_funct, val_loader):
    # Named `evaluate` to avoid shadowing Python's built-in `eval`.
    model.eval()
    total_matches = 0
    val_entries = 0
    cum_loss = 0
    with torch.no_grad():  # no gradients are needed for validation
        for img, label in tqdm(val_loader, desc='Val Iteration', ascii=True):
            output = model(img)
            loss = loss_funct(output, label)
            cum_loss += loss.item()
            prediction = torch.argmax(output, dim=1)  # predicted class per image
            total_matches += (prediction == label).sum().item()
            val_entries += len(img)
    accuracy = total_matches / val_entries
    loss = cum_loss / len(val_loader)  # mean of the per-batch mean losses
    return accuracy, loss

acc, val_loss = evaluate(model, loss, val_loader)
print(f"Validation Accuracy: {acc:.4f}, Validation Loss: {val_loss:.4f}")
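Note that `cum_loss/len(val_loader)` averages the per-batch mean losses, so a smaller final batch counts as much as a full one. The sketch below uses made-up numbers to show how this differs from a per-sample mean. The difference is usually small in practice, and all values here are hypothetical:

```python
batch_sizes = [32, 32, 16]       # hypothetical: two full batches and one partial batch
batch_losses = [1.0, 1.0, 2.0]   # hypothetical mean loss of each batch

# Mean of batch means: every batch has equal weight.
mean_of_batch_means = sum(batch_losses) / len(batch_losses)

# Per-sample mean: every batch is weighted by its number of examples.
per_sample_mean = sum(l * n for l, n in zip(batch_losses, batch_sizes)) / sum(batch_sizes)

print(f'{mean_of_batch_means:.4f}')  # 1.3333
print(f'{per_sample_mean:.4f}')      # 1.2000
```

The accuracy is not affected, since it is computed from the total number of matches and the total number of examples.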
            