# Leren en Beslissen - Open Set Recognition 1

This is the last notebook of the project. It covers the following topic:  

4. Retraining for Outlier Exposure 

This part is based on the paper [Deep Anomaly Detection with Outlier Exposure](https://arxiv.org/abs/1812.04606) by Hendrycks et al. (2019). Do you recognize his name? Make sure you understand the method. It is fairly simple, but as always, scientific papers can make it sound harder than it really is. It is especially useful to understand the Outlier Exposure implementation and loss (section 3) and the Maximum Softmax Probability part (first paragraph of 4.3). Density estimation (4.4) is not relevant for you. The Discussion (5) of _flexibility_ and _closeness_ is also very interesting, especially since it offers a nice intuitive view on training and testing data. 
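
As a minimal sketch of the Maximum Softmax Probability idea (not the notebook's own code), the snippet below turns raw logits into one confidence score per sample. The function name `max_softmax_probability` and the example logits are made up for illustration.

```python
import torch
import torch.nn.functional as F


def max_softmax_probability(logits: torch.Tensor) -> torch.Tensor:
    """Return the largest softmax probability per sample (shape: [batch])."""
    probs = F.softmax(logits, dim=1)
    return probs.max(dim=1).values


# Hypothetical logits for two inputs over three classes.
example_logits = torch.tensor([[8.0, 0.0, 0.0],   # one class clearly dominates
                               [1.0, 1.0, 1.0]])  # all classes equally likely
msp = max_softmax_probability(example_logits)
print(msp)
```

A score close to 1 means the model is confident about one class; with K classes the score can never drop below 1/K.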

> **Exercise 4a** How do the unknown training images (the outliers) influence the softmax scores? In which direction do they push the softmax scores? Discuss the views in the paper. 

In [None]:
# Standard 
import os  # needed for os.makedirs and os.path.join below
import numpy as np 
import math 
import time 
import pickle 


# Plotting 
import matplotlib.pyplot as plt
import seaborn as sns
sns.set()

# Progress bar 
import tqdm 

# Pytorch 
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.utils.data as data
import torch.optim as optim

# A simple MLP 
from simpleMLP import *
from finetune_MLP import * 

In [None]:
# Folder where the datasets are/should be downloaded to. 
DATA_PATH = '../data'
# Folder where to save pretrained models to. 
TRAINED_PATH = '../trained_models'

# Make the trained path folder if it doesn't already exist 
os.makedirs(TRAINED_PATH, exist_ok=True)

## Outlier Exposure 

For Outlier Exposure, you train on two different data sets: the known data set and an outlier data set. The outlier data set is used to train your model to distinguish between known and unknown data, so it serves as an 'unknown' data set during training. Of course it is not really unknown, since it is available during training, but the goal of your training function is to make sure your model labels the outlier data as unknown. 

It is very important to note that you cannot use the _unknown_ TEST data as the outlier training set, because this would be cheating. Rather, you pick a third dataset that serves as your outlier data for training. 

I will show you an example where MNIST serves as the known data set and CIFAR10 serves as the outlier data set for training.

First, import the datasets. 

In [None]:
# Import MNIST dataset 
from torchvision.datasets import MNIST
from torchvision.transforms import ToTensor

mnist_train = MNIST(root=DATA_PATH, train=True, download=True, transform=ToTensor())
mnist_test_set = MNIST(root=DATA_PATH, train=False, download=True, transform=ToTensor())

# Define the train-val split 
mnist_train_set, mnist_val_set = torch.utils.data.random_split(mnist_train, [50000, 10000])

# Create dataloaders for the val, test and train sets - put them in a dictionary 
mnist_loader = {}
mnist_loader['train'] = data.DataLoader(mnist_train_set, batch_size=1024, shuffle=True, drop_last=False)
mnist_loader['val'] = data.DataLoader(mnist_val_set, batch_size=1024, shuffle=True, drop_last=False)
mnist_loader['test'] = data.DataLoader(mnist_test_set, batch_size=1024, shuffle=True, drop_last=False)


In [None]:
# Import CIFAR10 datasets 
from torchvision.datasets import CIFAR10
import torchvision.transforms as transforms

# function for transforming CIFAR10 to a 1x28x28 pixel image
cifar_transform = transforms.Compose([transforms.Grayscale(num_output_channels=1), 
                                      transforms.RandomCrop((28,28)), 
                                      transforms.ToTensor()])

# load the dataset, it will be downloaded if it is not yet in your datapath
# (train=False loads the 10,000-image CIFAR10 test split)
cifar_set = CIFAR10(root=DATA_PATH, train=False, download=True, transform=cifar_transform)

# create val and train loaders
cifar_loader = {}
cifar_train_set, cifar_val_set = torch.utils.data.random_split(cifar_set, [7000, 3000])
cifar_loader['train'] = data.DataLoader(cifar_train_set, batch_size=1024, shuffle=True, drop_last=False)
cifar_loader['val'] = data.DataLoader(cifar_val_set, batch_size=1024, shuffle=True, drop_last=False)

Load your preferred MLP model 

In [None]:
# Define which pretrained model you want to load 
model_name = 'MLP_1'
model_file_path = os.path.join(TRAINED_PATH, model_name+'.tar')

# load the pretrained model 
model = SimpleMLP(28*28, [128,256], 10)
model.load_state_dict(torch.load(model_file_path))
model.eval()

### Finetune MLP

Finetune your MLP model with CIFAR10 as the outlier dataset. The example below trains on all CIFAR10 classes as outliers and all MNIST classes as 'known'. It trains for 10 epochs. 

> **Exercise 4b** Implement the Outlier Exposure model. What should be the loss objective? Perform your own experiments.  

As you can see in the provided code, I already wrote the loss objective for you. However, I did not write the validation part yet (where the trained model is evaluated on the validation data). The validation part is valuable for model selection: it lets you pick the model that generalizes best to unseen data. Without it, the function simply returns the last trained MLP model, which risks overfitting. 
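
For reference, here is a minimal sketch of an Outlier Exposure style loss for classification, following the form described in the paper: cross-entropy on the known batch plus a term that pulls the softmax output on outliers towards the uniform distribution. The function name `outlier_exposure_loss`, the weight `lam=0.5` and the random batches are assumptions for illustration; the loss inside the provided `finetune_MLP_OE` may be written differently.

```python
import torch
import torch.nn.functional as F


def outlier_exposure_loss(known_logits, known_labels, outlier_logits, lam=0.5):
    """Cross-entropy on known data plus a uniformity penalty on outliers."""
    # Standard classification loss on the known (in-distribution) batch.
    known_loss = F.cross_entropy(known_logits, known_labels)
    # Cross-entropy between the uniform distribution and the softmax output:
    # the mean over classes of -log softmax, averaged over the outlier batch.
    outlier_loss = -F.log_softmax(outlier_logits, dim=1).mean(dim=1).mean()
    return known_loss + lam * outlier_loss


# Hypothetical batch: 4 known samples and 4 outliers, 10 classes.
torch.manual_seed(0)
known_logits = torch.randn(4, 10, requires_grad=True)
known_labels = torch.tensor([0, 3, 7, 9])
outlier_logits = torch.randn(4, 10, requires_grad=True)

loss = outlier_exposure_loss(known_logits, known_labels, outlier_logits)
loss.backward()  # gradients flow to both batches
print(loss.item())
```

The outlier term is smallest when all class probabilities are equal, so a confident prediction on an outlier is penalized.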


In [None]:
# Finetune your model for a given number of epochs, learning rate and threshold 

finetuned_model, val_performances, train_losses = finetune_MLP_OE(mnist_loader, cifar_loader, model, epochs=10, lr=0.01, threshold=0)

In [None]:
print(f'Calculated validation performances are {val_performances}, since the validation part is not used yet')

As you can see, the code works fine without the validation part, but implementing it is strongly recommended. This requires you to write a function that evaluates the MLP model on the validation outlier and known data sets. 

One way of evaluating could be: given a threshold, how many _good_ predictions does the model make? That is, how many of the outlier cases does it classify as unknown, and how many of the known samples does it assign to their correct class (0 to 9 for MNIST)?
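
The evaluation idea above could be sketched as follows. This is only one possible reading, assuming that a sample is rejected as unknown when its maximum softmax probability falls below the threshold. The function name `evaluate_open_set` and the toy logits are hypothetical; in the notebook you would collect the logits by running the model over the validation loaders.

```python
import torch
import torch.nn.functional as F


def evaluate_open_set(known_logits, known_labels, outlier_logits, threshold):
    """Fraction of good predictions on known and outlier data combined."""
    known_conf, known_pred = F.softmax(known_logits, dim=1).max(dim=1)
    outlier_conf = F.softmax(outlier_logits, dim=1).max(dim=1).values
    # A known sample is good if it is accepted and assigned the right class.
    known_good = ((known_conf >= threshold) & (known_pred == known_labels)).sum()
    # An outlier is good if it is rejected (confidence below the threshold).
    outlier_good = (outlier_conf < threshold).sum()
    total = known_labels.numel() + outlier_conf.numel()
    return (known_good + outlier_good).item() / total


# Toy example with three classes: three known samples and two outliers.
known_logits = torch.tensor([[5.0, 0.0, 0.0],
                             [0.0, 5.0, 0.0],
                             [0.1, 0.0, 0.0]])  # low confidence, wrong class
known_labels = torch.tensor([0, 1, 2])
outlier_logits = torch.tensor([[0.0, 0.0, 0.0],   # rejected: good
                               [6.0, 0.0, 0.0]])  # accepted: bad
score = evaluate_open_set(known_logits, known_labels, outlier_logits, threshold=0.5)
print(score)
```

Reporting the two counts separately, instead of one combined fraction, can make it easier to see whether a threshold mainly hurts the known or the outlier data.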

Once you have this evaluation function, it is straightforward to implement the validation part in your train function. I have already written code for this, and you can also look at the code for the SimpleMLP. 
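
A minimal sketch of how validation can drive model selection is given below. It is not the code from `finetune_MLP.py`; the names `train_with_model_selection`, `train_one_epoch` and `validate` are hypothetical, and the demo uses a tiny stand-in model with scripted validation scores so that it runs on its own.

```python
import copy

import torch
import torch.nn as nn


def train_with_model_selection(model, train_one_epoch, validate, epochs):
    """Train for `epochs` epochs and restore the best-validating weights."""
    best_score = float('-inf')
    best_state = copy.deepcopy(model.state_dict())
    val_scores = []
    for _ in range(epochs):
        train_one_epoch(model)
        score = validate(model)
        val_scores.append(score)
        if score > best_score:
            # Keep a copy of the weights that scored best so far.
            best_score = score
            best_state = copy.deepcopy(model.state_dict())
    model.load_state_dict(best_state)
    return model, val_scores


# Stand-in model and scripted scores, purely for illustration.
demo_model = nn.Linear(2, 2)
start_weight = demo_model.weight.detach().clone()
scripted_scores = iter([0.5, 0.9, 0.7])


def fake_epoch(m):
    # Pretend training: shift all weights by 1 each epoch.
    with torch.no_grad():
        m.weight.add_(1.0)


demo_model, scores = train_with_model_selection(
    demo_model, fake_epoch, lambda m: next(scripted_scores), epochs=3)
print(scores)
```

In this demo the second epoch has the highest score, so the returned model carries the weights from that epoch rather than the last one.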

In [None]:
# train your model again, now after implementing the validation part 

### OE Experiments 

Now it is time to perform experiments. Compare which outlier data sets benefit the model most. How does this compare to your baseline model? How does this compare to your ODIN implementation? 
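
To compare settings without fixing a threshold, one option is the AUROC over confidence scores, the kind of threshold-free metric the paper reports. The sketch below is an assumption about how you might compute it, not part of the provided code: the function name `auroc` and the example scores are made up, and the pairwise comparison is only practical for small samples because it builds a matrix with one entry per known/unknown pair.

```python
import torch


def auroc(known_scores: torch.Tensor, unknown_scores: torch.Tensor) -> float:
    """Probability that a known sample scores higher than an unknown one."""
    # Compare every known score with every unknown score.
    diff = known_scores.unsqueeze(1) - unknown_scores.unsqueeze(0)
    wins = (diff > 0).float().mean()
    ties = (diff == 0).float().mean()
    return (wins + 0.5 * ties).item()


# Hypothetical maximum softmax probabilities for a handful of samples.
known_scores = torch.tensor([0.9, 0.8, 0.6])
unknown_scores = torch.tensor([0.7, 0.3])
result = auroc(known_scores, unknown_scores)
print(result)
```

A value of 0.5 means the scores do not separate known from unknown data at all, while 1.0 means perfect separation.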

> **Exercise 4c** Set up your own experiments. How does the closeness of the three data sets (outlier/known train, unknown test) relate to the performance of the OE model? Define a couple of settings with different levels of closeness. Report performance on each of the settings and represent your observations in a meaningful way. Which conclusions can you draw? 



In [None]:
# Perform your experiments here. 

