# Autoencoder for MNIST in PyTorch Lightning

In this notebook, we will train an autoencoder on the MNIST dataset, which is a dataset of handwritten digits. We will use the PyTorch Lightning framework, which makes everything much more convenient!

In case you haven't done so yet, you should definitely check out the **PyTorch Lightning Introduction** in **Exercise 7**!

## What we will do:

One application of autoencoders is unsupervised pretraining on unlabeled data, followed by finetuning the encoder on labeled data. This can improve performance when only a little labeled data but a lot of unlabeled data is available.

In this exercise, we use the MNIST dataset with 60,000 labeled images, but we will pretend that we only have the labels for 300 of those images.

We will then train our autoencoder to reproduce the unlabeled images. 

Then we will transfer the pretrained encoder weights and finetune a classifier on the labeled data for classifying the handwritten digits. This is called ***transfer learning***.

**Note**: If you are running this in a Google Colab notebook, we recommend that you enable GPU usage:

> **Runtime**   →   **Change runtime type**   →   **Hardware Accelerator: GPU**

If you are running in Colab, you should install the dependencies by running the following cell:

In [None]:
!pip install pytorch-lightning==0.7.6 > /dev/null

## Imports

In [None]:
import os
import numpy as np

import matplotlib.pyplot as plt

import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, random_split
import torchvision
import torchvision.transforms as transforms

import pytorch_lightning as pl
from pytorch_lightning.loggers import TensorBoardLogger

%load_ext autoreload
%autoreload 2

### Get Device
In this exercise, we'll use PyTorch Lightning to build an autoencoder and an image classifier for the MNIST dataset. As you know from exercise 06, processing a large set of images is computationally expensive. Luckily, with PyTorch we can use our GPU to speed things up significantly!

In case you don't have a GPU, you can run this notebook on Google Colab where you can access a GPU for free! 

Of course, you can also run this notebook on your CPU only - though this is definitely not recommended.


In [None]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

## Setup TensorBoard
In exercise 07 you've already learned how to use TensorBoard. Let's use it again to make the debugging of our network and training process more convenient! Throughout this notebook, feel free to add further logs or visualizations to your TensorBoard!

In [None]:
%load_ext tensorboard
%tensorboard --logdir lightning_logs --port 6008

## The MNIST dataset

First we will download the MNIST dataset and have a look at the data.
Because MNIST is such a common toy dataset, torchvision provides a class to download it and save it to the machine you are running the notebook on.

In [None]:
from torchvision.datasets import MNIST # import torchvision's class to download the data
dataset = MNIST(os.getcwd(), download=True, transform=transforms.ToTensor()) # if the dataset does not already exist in the working directory, it is downloaded and saved there

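The dataset consists of tuples of a 28x28 pixel image and a label that is an integer from 0 to 9. Because we pass `transforms.ToTensor()`, each image is returned as a tensor of shape 1x28x28 rather than as a PIL image.

Let's turn a few of the images into numpy arrays to look at their shape and visualize them.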

In [None]:
plt.rcParams['figure.figsize'] = (6,6) # Make the figures a bit bigger

for i in range(9):
    image = np.array(dataset[i][0].squeeze()) # get the image of the data sample
    label = dataset[i][1] # get the label of the data sample
    plt.subplot(3,3,i+1)
    plt.imshow(image, cmap='gray', interpolation='none')
    plt.title("Class {}".format(label))
    
plt.tight_layout()
print('The shape of our greyscale images: ', image.shape)

Now let us split our data into 59,700 unlabeled images (where we remove the labels), which we will use for pretraining, and 300 labeled images.

We also split the 300 labeled images into training, validation and test data, so we only have 100 training samples! Little labeled training data is a common situation, for instance in the medical domain.

So let us see how well we can perform with only 100 labeled samples if we have a lot of unlabeled pretraining data available.

In [None]:
unlabeled, train, val, test = random_split(dataset, [59700, 100, 100, 100])
unlabeled = [i[0] for i in unlabeled] # only take the first entry of the tuple that contains the image and forget the label
unlabeled_train, unlabeled_val = random_split(unlabeled, [58700, 1000])
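
To see how `random_split` behaves without downloading anything, here is a minimal sketch on a made-up dataset. The `toy_*` names and the random tensors are assumptions for illustration only; the sizes are scaled down, but the logic mirrors the cell above.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Hypothetical stand-in for MNIST: 60 random "images" with random labels.
images = torch.rand(60, 1, 28, 28)
labels = torch.randint(0, 10, (60,))
toy_dataset = TensorDataset(images, labels)

# The split sizes must sum to len(toy_dataset),
# just like 59700 + 100 + 100 + 100 = 60000 above.
toy_unlabeled, toy_train, toy_val, toy_test = random_split(toy_dataset, [30, 10, 10, 10])

# Drop the labels by keeping only the first tuple entry (the image).
toy_unlabeled = [sample[0] for sample in toy_unlabeled]

# A plain list supports len() and indexing, so it can be split again.
toy_unlabeled_train, toy_unlabeled_val = random_split(toy_unlabeled, [25, 5])

print(len(toy_unlabeled_train), len(toy_unlabeled_val), len(toy_train), len(toy_val), len(toy_test))
print(toy_unlabeled_train[0].shape)
```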

## Define your Network

Do you remember the good old times when we used to implement everything in plain numpy?

Luckily, these times are over and we're using PyTorch Lightning, which makes everything MUCH easier!

Instead of implementing your own model, solver and dataloader, all you have to do is define a `LightningModule`.

We've prepared the module `exercise_code/models.py` for you, which you'll now finalize to build an autoencoder and an image classifier with PyTorch Lightning.

### 1. Define your model
Next, let's define your encoder and decoder in the `models.py` file. 

Think about a good network architecture. You're completely free here and can come up with any network you like! (\*)

Have a look at the documentation of `torch.nn` at https://pytorch.org/docs/stable/nn.html to learn how to use this module to build your network!

Then implement your architecture: initialize it in `__init__()` and assign it to `self.model`. This is particularly easy using `nn.Sequential()`, to which you only have to pass your layers. 

To make your model customizable and support parameter search, don't use hardcoded hyperparameters. Instead, pass them as a dictionary `hparams` (here, `n_hidden` is the number of neurons in the hidden layer) when initializing your models.

Here's an easy example:

```python
        self.model = nn.Sequential(
            nn.Linear(input_size, self.hparams["n_hidden"]),
            nn.ReLU(),            
            nn.Linear(self.hparams["n_hidden"], num_classes)
        )
```
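
For the autoencoder, the same pattern is used twice: once for the encoder and once for the decoder. The following is only a sketch of one possible layout, not the expected solution. The hyperparameter key `latent_dim`, the layer sizes and the `toy_*` names are assumptions made for this example.

```python
import torch
import torch.nn as nn

# Hypothetical hyperparameters; the key "latent_dim" is an assumption of this sketch.
toy_hparams = {"n_hidden": 128, "latent_dim": 20}
input_size = 28 * 28  # a flattened MNIST image

# Encoder: compress the flattened image into a small latent code.
toy_encoder = nn.Sequential(
    nn.Linear(input_size, toy_hparams["n_hidden"]),
    nn.ReLU(),
    nn.Linear(toy_hparams["n_hidden"], toy_hparams["latent_dim"]),
)

# Decoder: mirror the encoder and map the latent code back to pixel space.
toy_decoder = nn.Sequential(
    nn.Linear(toy_hparams["latent_dim"], toy_hparams["n_hidden"]),
    nn.ReLU(),
    nn.Linear(toy_hparams["n_hidden"], input_size),
    nn.Sigmoid(),  # pixel values produced by ToTensor() lie in [0, 1]
)

batch = torch.rand(4, 1, 28, 28)        # stand-in for a batch of images
flat = batch.view(batch.shape[0], -1)   # flatten to shape (4, 784)
latent = toy_encoder(flat)
reconstruction = toy_decoder(latent)
print(latent.shape, reconstruction.shape)
```

The decoder's output size equals the encoder's input size, so the reconstruction can be compared to the flattened input directly.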

Have a look at the forward pass in `forward(self, x)`, which is so easy that you don't need to implement it yourself. 

As PyTorch automatically computes the gradients, that's all we need to do! No need anymore to manually calculate derivatives for the backward pass! :)
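
As a tiny illustration of automatic differentiation, the sketch below calls `backward()` on a scalar loss and reads the gradients that PyTorch filled in. The layer and input are made up for this example.

```python
import torch
import torch.nn as nn

layer = nn.Linear(3, 1)
x = torch.ones(2, 3)        # two samples, each with three features equal to 1

loss = layer(x).sum()       # scalar: sum over both samples of (w . x + b)
loss.backward()             # autograd computes d(loss)/d(parameter) for us

# d(loss)/d(weight) is the sum of the inputs over the batch, here 2 per entry.
print(layer.weight.grad)
# d(loss)/d(bias) is the number of samples, here 2.
print(layer.bias.grad)
```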


____
\* *The size of your final model must be less than 20 MB, which corresponds to approximately 5 million parameters. Note that this limit is quite lenient; you will probably need far fewer parameters!*

*Also, don't use convolutional layers, as they have not been covered in the lecture yet. Build your network with fully connected layers (```nn.Linear()```)!*
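
The relation between the two limits comes from 32-bit floats: 5 million parameters at 4 bytes each is 20 MB. A minimal sketch for estimating this yourself is shown below. The helper `count_parameters` is a hypothetical name for this example and is not the course's `printModelInfo`; the estimate ignores any overhead of the saved file.

```python
import torch.nn as nn

def count_parameters(model: nn.Module) -> int:
    """Total number of elements in all parameter tensors of the model."""
    return sum(p.numel() for p in model.parameters())

# Made-up example network: 784 -> 256 -> 10.
toy_model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))

n_params = count_parameters(toy_model)
size_mb = n_params * 4 / 1e6   # float32 parameters take 4 bytes each

# (784 * 256 + 256) + (256 * 10 + 10) = 203530 parameters
print(n_params, "parameters, roughly", round(size_mb, 2), "MB")
```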

### 2. Training & Validation Step
Have a look at the functions `training_step` and `validation_step`, which take a batch as input and calculate the loss. 
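
For an autoencoder, the loss compares the reconstruction with the input itself, so no labels are needed. The sketch below assumes a mean squared error loss and a hypothetical helper `reconstruction_loss`; the prepared `training_step` in `models.py` may compute its loss differently.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def reconstruction_loss(encoder, decoder, images):
    """Encode and decode a batch, then compare the result with the input."""
    flat = images.view(images.shape[0], -1)    # (batch, 784)
    reconstruction = decoder(encoder(flat))
    return F.mse_loss(reconstruction, flat)    # assumption: MSE as the loss

# Made-up encoder, decoder and batch to try the function.
demo_encoder = nn.Linear(28 * 28, 8)
demo_decoder = nn.Linear(8, 28 * 28)
demo_batch = torch.rand(4, 1, 28, 28)

demo_loss = reconstruction_loss(demo_encoder, demo_decoder, demo_batch)
print(demo_loss.item())
```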

### 3. Optimizer
Lastly, implement the function `configure_optimizers()` to define your optimizer. Here the documentation of `torch.optim` at https://pytorch.org/docs/stable/optim.html might be helpful.
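
A possible shape of this method is sketched below on a plain `nn.Module` stand-in, so that it runs without Lightning. The class `ToyModule`, the choice of Adam and the hyperparameter key `learning_rate` are assumptions for illustration; any optimizer from `torch.optim` can be used in the same way.

```python
import torch
import torch.nn as nn

class ToyModule(nn.Module):
    """Stand-in for a LightningModule; only the optimizer part is sketched."""

    def __init__(self, hparams):
        super().__init__()
        self.hparams = hparams
        self.model = nn.Linear(28 * 28, 10)

    def configure_optimizers(self):
        # Read the learning rate from hparams instead of hardcoding it.
        return torch.optim.Adam(self.parameters(), lr=self.hparams["learning_rate"])

toy_module = ToyModule({"learning_rate": 1e-3})
toy_optimizer = toy_module.configure_optimizers()
print(type(toy_optimizer).__name__, toy_optimizer.param_groups[0]["lr"])
```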


That's it! You've now finalized your `LightningModule` which has (at least) the same functionality as your previous numpy-powered image classifier!

Now let's create an instance of your autoencoder.

In [None]:
from exercise_code.models import Autoencoder, Encoder, Decoder

hparams = {}
########################################################################
# TODO: Define your hyper parameters here!                             #
########################################################################

pass

########################################################################
#                           END OF YOUR CODE                           #
########################################################################
encoder_pretrained = Encoder(hparams)
encoder = Encoder(hparams)
decoder = Decoder(hparams)
ae_logger = TensorBoardLogger(save_dir='lightning_logs')
autoencoder = Autoencoder(hparams, encoder_pretrained, decoder, unlabeled_train, unlabeled_val, ae_logger)

Some tests to check whether we'll accept your model.

In [None]:
from exercise_code.Util import printModelInfo
_ = printModelInfo(autoencoder)

## Fit Model with Trainer
Now it's time to train your model. 

Have a look at the documentation of `pl.Trainer` at https://pytorch-lightning.readthedocs.io/en/latest/trainer.html to find out what arguments you can pass to define your training process. 

Then, start the training with `trainer.fit(autoencoder)` and have a look at the loss and the reconstructed images in TensorBoard.
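
To get a feeling for what the trainer takes off your hands, here is a rough plain-PyTorch sketch of a training loop on random data. It is a simplified illustration, not Lightning's actual implementation; the tiny network, the random "images" and all names are made up for this example.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader

torch.manual_seed(0)

# Made-up unlabeled data: a list of image tensors, like `unlabeled_train`.
toy_images = [torch.rand(1, 28, 28) for _ in range(32)]
loader = DataLoader(toy_images, batch_size=8, shuffle=True)

toy_autoencoder = nn.Sequential(
    nn.Linear(784, 16), nn.ReLU(), nn.Linear(16, 784), nn.Sigmoid()
)
optimizer = torch.optim.Adam(toy_autoencoder.parameters(), lr=1e-3)

epoch_losses = []
for epoch in range(3):
    running_loss = 0.0
    for images in loader:
        flat = images.view(images.shape[0], -1)
        loss = F.mse_loss(toy_autoencoder(flat), flat)  # the "training step"
        optimizer.zero_grad()   # reset gradients from the previous batch
        loss.backward()         # backward pass
        optimizer.step()        # parameter update
        running_loss += loss.item()
    epoch_losses.append(running_loss / len(loader))

print(epoch_losses)
```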

In [None]:
trainer = None

########################################################################
# TODO: Define your trainer! Don't forget the logger.                  #
########################################################################


pass

########################################################################
#                           END OF YOUR CODE                           #
########################################################################
trainer.fit(autoencoder)

### Let's have a look at the reconstructed validation images (if you have not already looked at them in TensorBoard)

In [None]:
reconstructions = autoencoder.getReconstructions()
for i in range(64):
    plt.subplot(8,8,i+1)
    plt.axis('off')
    plt.imshow(reconstructions[i], cmap='gray', interpolation='none')
    
plt.tight_layout()

## The Classifier

Now let's get to the classifier in the `Classifier` class. Have a look at the forward pass.

Notice that `self.model` takes the output of the encoder as input, so these sizes have to match. 

Define your classifier; we then initialize two instances of it with the same hyperparameters.

The difference is that one uses the pretrained encoder from our autoencoder, while the other is trained from scratch, so that we can compare them.
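
The idea can be sketched in plain PyTorch as follows. This is a minimal illustration under assumed sizes (`latent_dim = 20`, one hidden layer) with hypothetical helper names; it is not the structure of the provided `Classifier` class.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim, num_classes = 20, 10   # assumed sizes for this sketch

def make_encoder():
    return nn.Sequential(nn.Linear(784, 64), nn.ReLU(), nn.Linear(64, latent_dim))

pretrained_encoder = make_encoder()   # imagine this one was trained inside the autoencoder
fresh_encoder = make_encoder()        # randomly initialized, trained from scratch

def make_classifier(encoder):
    # The head's input size must equal the encoder's output size.
    head = nn.Sequential(nn.ReLU(), nn.Linear(latent_dim, num_classes))
    return nn.Sequential(encoder, head)

clf_pretrained = make_classifier(pretrained_encoder)
clf_scratch = make_classifier(fresh_encoder)

x = torch.rand(5, 784)
logits = clf_pretrained(x)
print(logits.shape)

# The classifier reuses the very same encoder object, so finetuning
# continues from the pretrained weights.
print(clf_pretrained[0] is pretrained_encoder)
```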

In [None]:
from exercise_code.models import Classifier

hparams = {}
########################################################################
# TODO: Define your hyper parameters here!                             #
########################################################################

pass

########################################################################
#                           END OF YOUR CODE                           #
########################################################################
classifier_pretrained = Classifier(hparams, encoder_pretrained, train, val, test)
classifier = Classifier(hparams, encoder, train, val, test)

## Training the Classifiers

Now specify another trainer that we will use to train first the standard classifier and then the pretrained classifier, so that we can compare their performance.

In [None]:
import copy
trainer = None

########################################################################
# TODO: Define your trainer! Don't forget the logger.                  #
########################################################################


pass

########################################################################
#                           END OF YOUR CODE                           #
########################################################################
trainer2 = copy.deepcopy(trainer)
trainer.fit(classifier) # train the standard classifier

In [None]:
trainer2.fit(classifier_pretrained) # train the pretrained classifier

Let's have a look at the validation accuracy of the two classifiers and compare them. And don't forget that you can also monitor your training in TensorBoard.

We will only look at the test accuracy and compare our two classifiers with respect to that in the very end.

In [None]:
print("Validation accuracy when training from scratch: {}%".format(classifier.getAcc(classifier.val_dataloader())[1]*100))
print("Validation accuracy with pretraining: {}%".format(classifier_pretrained.getAcc(classifier_pretrained.val_dataloader())[1]*100))
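
Accuracy is the fraction of samples whose predicted class matches the label. The sketch below shows this on a handful of made-up logits; it only illustrates the computation and is not the implementation of `getAcc`.

```python
import torch

# Made-up network outputs for 4 samples and 3 classes.
toy_logits = torch.tensor([[2.0, 0.1, 0.3],
                           [0.2, 1.5, 0.1],
                           [0.3, 0.2, 0.9],
                           [1.2, 0.4, 0.1]])
toy_labels = torch.tensor([0, 1, 2, 1])

toy_preds = toy_logits.argmax(dim=1)    # predicted class = index of the largest logit
toy_accuracy = (toy_preds == toy_labels).float().mean().item()

print("Accuracy: {}%".format(toy_accuracy * 100))   # 3 of 4 correct: 75.0%
```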

Now that everything is working, feel free to play around with different architectures. As you've seen, it's really easy to define your model or make changes to it.

You can now check out notebook 5, which shows you how to use the framework Optuna to perform **hyperparameter tuning with PyTorch Lightning!**

To pass this submission, you'll need **50%** accuracy.

# Save your model & Report Test Accuracy

When you're done with your **hyperparameter tuning (see notebook 5)**, have achieved **at least 50% validation accuracy** and are happy with your final model, you can save it here.

Before that, we will check again whether the number of parameters is below 5 million and the file size is below 20 MB.

When your final model is saved, we'll lastly report the test accuracy.

In [None]:
from exercise_code.Util import test_and_save

print("Test accuracy when training from scratch: {}%".format(classifier.getAcc()[1]*100))
print('\nNow to the pretrained classifier:')
test_and_save(classifier_pretrained)

# Submission Instructions

Congrats! You've now finished your first autoencoder and transferred the weights to a classifier! Much easier than in plain numpy, right? Time to get started with some more complex neural networks. See you at the next exercise!

1. Go to [our submission page](https://dvl.in.tum.de/teaching/submission/), register for an account and log in. We use your matriculation number and send an email with the login details to the associated mail account. When in doubt, log in to TUMonline and check your mails there. You will get an ID which we need in the next step.
2. Navigate to the `exercise_code` directory and run the `create_submission.sh` file to create the zip file of your model. This will create a single `zip` file that you need to upload. Alternatively, you can zip it manually if you don't want to use the bash script. However, **make sure that the structure of the zip file is the same** as it would be when generated with the bash script.
3. Log in to [our submission page](https://dvl.in.tum.de/teaching/submission/) with your account details and upload the `zip` file. Once successfully uploaded, you should be able to see the submitted file selectable on the top.
4. Click on this file and run the submission script. You will get an email with your score as well as a message if you have surpassed the threshold.

# Submission Goals

- Goal: Successfully implement a fully connected autoencoder for MNIST with PyTorch Lightning and transfer the encoder weights to a classifier.

- Passing Criteria: This time, there are no unit tests that check specific components of your code. The only requirement to pass the submission is that your model reaches at least **50% accuracy** on __our__ test dataset. The submission system will show you a number between 0 and 100 which corresponds to your accuracy.

- Submission start: __Thursday, June 11, 2020 12.00__
- Submission deadline: __June 17, 2020 23.59__ 
- You can make **$\infty$** submissions until the deadline. Your __best submission__ will be considered for the bonus.