Assignment 6: Convolutional Networks
====================================


Microsoft Forms Document: https://forms.office.com/r/89d0K2m5yr


Before we start, we should ensure that CUDA is enabled; otherwise, training might take a very long time.
In Google Colaboratory:

1. Open Runtime -> Change Runtime Type in the menu at the top of the page.
2. In the popup window, select GPU as the hardware accelerator.

Afterward, the following command should run successfully:

In [None]:
import torch
if torch.cuda.is_available():
  print("Successfully enabled CUDA processing")
else:
  print("CUDA processing not available. Things will be slow :-(")

Task 1: Dataset Loading
-----------------------

Here, we use the MNIST dataset of handwritten digits for categorical classification.

Write a function that returns the training and the test set of MNIST, using the given transform.

In [None]:
import torch
import torchvision

def datasets(transform):
  trainset = torchvision.datasets.MNIST(...)
  testset = torchvision.datasets.MNIST(...)

  return trainset, testset

Test 1: Data Types
------------------

Create the dataset with `transform=None`. Check that all inputs are of type `PIL.Image.Image`, and that all targets are integers.

In [None]:
import PIL
trainset, testset = datasets(...)

for x,t in trainset:
  # check datatype of input x
  ...
  # check datatype of target t
  ...

Task 2: Data Loaders
--------------------

Create the dataset with `transform=torchvision.transforms.ToTensor()`. Create two data loaders, one for the training set and one for the test set. The training batch size should be $B=64$; for the test set, you can choose any batch size.


In [None]:
trainset, testset = datasets(...)

B = 64
trainloader = torch.utils.data.DataLoader(...)
testloader = torch.utils.data.DataLoader(...)

Test 2: Batches
---------------

Check that all batches generated by the training set data loader have a batch size of $B$, except for the last batch, which can be smaller. Check that all inputs and targets are of type `torch.Tensor`. Check that all input values are in range $[0,1]$. Check that all target values are in range $[0,9]$.
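
The kind of checks described above can be tried out on synthetic data first. The following is only a sketch: the tensors `inputs` and `targets` are made-up stand-ins for a real dataset (150 samples, chosen so that the last batch is smaller than 64), and the names are not part of the assignment template.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Synthetic stand-in for a dataset (assumption, not MNIST):
# 150 random "images" with values in [0,1) and labels in 0..9
inputs = torch.rand(150, 1, 28, 28)
targets = torch.randint(0, 10, (150,))
loader = DataLoader(TensorDataset(inputs, targets), batch_size=64, shuffle=True)

batch_sizes = []
for x, t in loader:
  # each batch is a pair of tensors with the same first dimension
  assert isinstance(x, torch.Tensor) and isinstance(t, torch.Tensor)
  assert x.shape[0] == t.shape[0]
  # value ranges can be checked on the whole batch at once
  assert x.min() >= 0 and x.max() <= 1
  assert t.min() >= 0 and t.max() <= 9
  batch_sizes.append(len(x))

print(batch_sizes)  # [64, 64, 22]
```

Since 150 is not a multiple of 64, only the last batch is smaller; the same pattern should carry over to the real training loader.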

In [None]:
for x,t in trainloader:
  # check datatype, size and content of x
  ...

  # check datatype, size and content of t
  ...

Task 3: Fully-Connected Network
-------------------------------

Implement a function that returns a three-layer fully-connected network in PyTorch.
Use $\tanh$ as the activation function between the fully-connected layers, and provide the possibility to change the number of inputs $D$, the number of hidden neurons $K$ and the number of outputs $O$.
Use the following layers:

1. A `torch.nn.Flatten` layer to turn the $28\times28$ pixel image (2D) into a vector of $28\cdot28$ pixels (1D).
2. A fully-connected layer with $D$ input neurons and $K$ outputs.
3. A $\tanh$ activation function.
4. A fully-connected layer with $K$ input neurons and $K$ outputs.
5. A $\tanh$ activation function.
6. A fully-connected layer with $K$ input neurons and $O$ outputs.
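
To see what the flattening step does to the tensor shapes, a dummy batch can be passed through a `torch.nn.Flatten` layer and one `torch.nn.Linear` layer. This is a minimal sketch with made-up sizes (a batch of 4 images and 5 outputs), not the network asked for above.

```python
import torch

# A dummy batch of 4 single-channel 28x28 images (shape: B x C x H x W)
x = torch.zeros(4, 1, 28, 28)

# Flatten keeps the batch dimension and merges all remaining dimensions
flatten = torch.nn.Flatten()
y = flatten(x)
print(y.shape)  # torch.Size([4, 784])

# A fully-connected layer maps the last dimension from 784 to 5 (5 is arbitrary)
linear = torch.nn.Linear(784, 5)
z = linear(y)
print(z.shape)  # torch.Size([4, 5])
```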

In [None]:
def fully_connected(D, K, O):
  return torch.nn.Sequential(
    ...
  )

Task 4: Convolutions Output (theoretical question)
--------------------------------------------------

Consider the network as defined in Task 5.
Assume that the input is a $28\times28$ grayscale image.
How many input neurons does the final fully-connected layer need for a given number $Q_2$ of output channels of the second convolution?
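
As a reminder, the side length of the output of a convolution or pooling layer (without dilation) follows $\lfloor (W + 2P - K) / S \rfloor + 1$ for input size $W$, kernel size $K$, stride $S$ and padding $P$. The helper below is a small sketch of this formula; the function name `output_size` is hypothetical, and the example values are deliberately different from the ones in this assignment.

```python
def output_size(size, kernel, stride=1, padding=0):
  # floor((size + 2*padding - kernel) / stride) + 1
  return (size + 2 * padding - kernel) // stride + 1

# a side length of 32 pixels, 3x3 kernel, stride 1, no padding
print(output_size(32, kernel=3))  # 30
# 2x2 pooling with stride 2 halves the side length
print(output_size(30, kernel=2, stride=2))  # 15
```

The number of values after flattening is then the number of channels times the output height times the output width.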

...

Task 5: Convolutional Network
-----------------------------

Implement a function that generates a convolutional network with the following layers:

1. 2D convolutional layer with $Q_1$ channels, kernel size $5\times5$, stride 1 and padding 2.
2. 2D maximum pooling with pooling size $2\times2$ and stride 2
3. $\tanh$ activation
4. 2D convolutional layer with $Q_2$ channels, kernel size $5\times5$, stride 1 and padding 2.
5. 2D maximum pooling with pooling size $2\times2$ and stride 2
6. $\tanh$ activation
7. A flattening layer to turn the 3D image into a 1D vector
8. A fully-connected layer with the appropriate number of inputs and $O$ outputs.
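
When assembling such a network, it can help to push a dummy input through single layers and print the resulting shapes. The sketch below does this for one convolution and one pooling layer; the 4 channels and the $3\times3$ kernel with padding 1 are arbitrary illustration values, not the ones required above.

```python
import torch

# one grayscale 28x28 image, shape B x C x H x W
x = torch.zeros(1, 1, 28, 28)

# illustration values only: 4 output channels, 3x3 kernel, padding 1
conv = torch.nn.Conv2d(in_channels=1, out_channels=4, kernel_size=3, stride=1, padding=1)
pool = torch.nn.MaxPool2d(kernel_size=2, stride=2)

with torch.no_grad():
  h = conv(x)
  p = pool(h)

print(h.shape)  # torch.Size([1, 4, 28, 28])
print(p.shape)  # torch.Size([1, 4, 14, 14])
```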

In [None]:
def convolutional(Q1, Q2, O):
  return torch.nn.Sequential(
    ...
  )

Task 6: Training and Validation Loop
------------------------------------

Implement a function that takes the network, the number of epochs and the learning rate.
Select the correct loss function for categorical classification and an SGD optimizer.
Iterate the following steps for the given number of epochs:

1. Train the network with all batches of the training data
2. Compute the test set loss and test set accuracy
3. Append both values to their respective lists

What do we need to take care of?

Finally, return the lists of validation losses and accuracies.
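
One detail of the accuracy computation can be sketched in isolation: since batches may differ in size, it is safer to count correct predictions and samples over all batches than to average per-batch accuracies. The logits and targets below are made-up values for illustration only.

```python
import torch

# Made-up logits and targets for two batches of different size (3 classes)
batches = [
  (torch.tensor([[2.0, 0.1, 0.3], [0.2, 1.5, 0.1], [0.3, 0.2, 0.9]]), torch.tensor([0, 1, 1])),
  (torch.tensor([[0.1, 0.2, 3.0]]), torch.tensor([2])),
]

correct, total = 0, 0
for z, t in batches:
  # the predicted class is the index of the largest logit
  correct += (z.argmax(dim=1) == t).sum().item()
  total += len(t)

accuracy = correct / total
print(accuracy)  # 0.75
```

Averaging the two per-batch accuracies instead ($2/3$ and $1$) would give a different value, because the smaller batch would be weighted too strongly.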

In [None]:
def train(network, epochs=100, eta=0.01):
  # select loss function and optimizer
  loss = ...
  optimizer = ...

  # instantiate the correct device
  device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
  network = network.to(device)

  # collect loss values and accuracies over the training epochs
  val_loss, val_acc = [], []

  for epoch in range(epochs):
    # train network on training data
    for x,t in trainloader:
      ...

    # test network on test data
    with torch.no_grad():
      for x,t in testloader:
        ...

  # return loss and accuracy values
  return val_loss, val_acc

Task 7: Fully-Connected Training
--------------------------------

Create a fully-connected network with $K=10$ hidden and $O=10$ output neurons.
Train the network for 10 epochs with $\eta=0.01$ and store the obtained test losses and accuracies.
Brave people can also train for 100 epochs (takes up to 30 minutes).

In [None]:
fc = fully_connected(...)
fc_loss, fc_acc = train(fc, epochs=10, eta=0.01)

Task 8: Convolutional Training
------------------------------

Create a convolutional network with $Q_1=32$ and $Q_2=64$ convolutional channels and $O=10$ output neurons.
Train the network for 10 epochs with $\eta=0.01$ and store the obtained test losses and accuracies.
Brave people can also train for 100 epochs (takes up to 30 minutes).

In [None]:
cv = convolutional(...)
cv_loss, cv_acc = train(cv, epochs=10, eta=0.01)

Task 9: Plotting
----------------

Plot the two lists of loss values in one plot. Plot the two lists of accuracy values in another.

In [None]:
from matplotlib import pyplot
pyplot.figure(figsize=(10,3))
ax = pyplot.subplot(121)
# plot loss values of FC and CV network over epochs
...

ax = pyplot.subplot(122)
# plot accuracy values of FC and CV network over epochs
...


Task 10: Learnable Parameters 
-----------------------------

Estimate roughly how many learnable parameters the two networks have by analytically computing and adding up the number of parameters in each layer.
Then compute the exact number of parameters in the networks by summing the number of parameters in each layer using PyTorch functionality.
The `numel()` method of a `torch.Tensor` returns the number of elements stored in that tensor, which for a parameter tensor is its number of learnable parameters.
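
As a small illustration of `numel()`, consider a single fully-connected layer with 3 inputs and 2 outputs (sizes chosen arbitrarily). Analytically, it has $3\cdot2$ weights plus 2 biases; the sketch below confirms this count.

```python
import torch

layer = torch.nn.Linear(3, 2)

# weight matrix: 2 x 3 = 6 entries, bias vector: 2 entries
for name, p in layer.named_parameters():
  print(name, tuple(p.shape), p.numel())

total = sum(p.numel() for p in layer.parameters())
print(total)  # 8
```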

Fully-connected Network:
- first fully-connected layer: ...
- second fully-connected layer: ...
- third fully-connected layer: ...
- total: ...

Convolutional Network:
- first convolutional layer: ...
- second convolutional layer: ...
- fully-connected layer: ...
- total: ...


In [None]:
def parameter_count(network):
  return ...

print("Fully-connected Network:", parameter_count(fc))
print("Convolutional Network:", parameter_count(cv))