# Introducing Noise Experiment

#### Link to Readme section: 

https://git.cs.vt.edu/sdeepti/facial-expression-recognition/-/blob/main/README.md#introducing-noise

#### Citations:

- https://discuss.pytorch.org/t/how-to-add-noise-to-mnist-dataset-when-using-pytorch/59745

**Motivation**: Noisy images are more representative of real-world data, which are rarely uniform and often contain confounding details. Our goal for this experiment was therefore to evaluate our model's performance on test images containing varying levels of noise.

This was achieved by applying Gaussian Noise with different levels of variance to our test set. We predict that if our model is robust, then performance should not decrease unless a very large amount of noise is applied.

#### 1. Initial Set-Up

This cell adds all the imports needed for the code to run. 'torch' and 'torchvision' are needed to work with our model and load our datasets, and 'sklearn' provides the evaluation metrics we report. Note that we import 'random_noise' from 'skimage.util' to generate the noise.

In [None]:
import os
import numpy as np

import torch
from torchvision import datasets, transforms, models
from torchvision.datasets import ImageFolder
from torch.utils.data import DataLoader

from sklearn.metrics import confusion_matrix, classification_report, accuracy_score, f1_score

from skimage.util import random_noise

The dataset being used is the **KDEF Dataset**, which can be found at the following link:
https://www.kdef.se/

For this experiment, we analyze how our model's performance varies when different levels of noise (different values of variance) are applied to our test set. The variable **variance** is the value to modify when changing the level of noise applied to the dataset. Initially we set it to 0.05.

In [None]:
variance = 0.05
print(f'using variance of {variance}')

model_path = '../main_resnet50/FEC_resnet50_trained_face_images_80_10_10.pt'
# model_path = '../dataset_size_experiment/dataset_size_70/FEC_resnet50_trained_face_images_70_10_20.pt'
data_dir = '../data/face_images_80_10_10'
num_classes = 7

device = 'cuda' if torch.cuda.is_available() else 'cpu'

First we load the trained model, transfer it to a GPU if one is available, and set it to evaluation mode.

In [None]:
# load the trained model
model = models.resnet50(num_classes=num_classes)
# transfer model to gpu if available
model = model.to(device)
model.load_state_dict(torch.load(model_path, map_location='cpu'))
# set model to evaluation mode
model.eval()

#### 2. Custom Noise Transformation

Now we create our custom noise transformation to add Gaussian Noise to our test set. 

In [None]:
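class GaussianNoise(object):
	"""Transform that adds Gaussian noise with the given mean and variance to an image tensor."""
	def __init__(self, mean=0., var=1.):
		self.var = var
		self.mean = mean

	def __call__(self, tensor):
		# random_noise returns a NumPy array, so convert the result back to a float tensor
		noisy = random_noise(tensor, mode='gaussian', mean=self.mean, var=self.var, clip=True)
		return torch.tensor(noisy, dtype=torch.float)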

	def __repr__(self):
		return self.__class__.__name__ + f'(mean={self.mean}, var={self.var})'

Apply the Gaussian Noise along with the other transformations applied to the test set. 

In [None]:
# test transformations with noise added
test_transforms = transforms.Compose([
	transforms.Resize(size=(224, 224)),
	transforms.ToTensor(),
	# use ImageNet standard mean and std dev for transfer learning
	transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225]),
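	# add noise after normalization; note that random_noise with clip=True clips
	# its output to [-1, 1] when the input contains negative values, as normalized images do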
	GaussianNoise(mean=0, var=variance)
])
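A note on the ordering above: because the noise is applied after normalization, the tensor passed to `random_noise` contains negative values. To the best of our understanding of scikit-image, `clip=True` then clips the output to [-1, 1], while ImageNet-normalized pixels span roughly [-2.1, 2.6]. The sketch below imitates that rule in plain Python. `noisy_clipped` is a hypothetical helper, not part of this notebook or of scikit-image, and the clipping rule it encodes is our assumption about how `random_noise` behaves.

```python
import random


def noisy_clipped(values, mean=0.0, var=0.05, seed=0):
    """Add Gaussian noise to a flat list of floats, then clip the result.

    Assumed clipping rule (imitating random_noise with clip=True):
    clip to [-1, 1] if any input value is negative, otherwise to [0, 1].
    """
    rng = random.Random(seed)
    std = var ** 0.5
    low = -1.0 if min(values) < 0 else 0.0
    return [min(1.0, max(low, v + rng.gauss(mean, std))) for v in values]


# Darkest, zero, and brightest values of the ImageNet-normalized red channel
normalized = [(0.0 - 0.485) / 0.229, 0.0, (1.0 - 0.485) / 0.229]
out = noisy_clipped(normalized)
print('before:', normalized)
print('after: ', out)
```

If this clipping is unintended, one option is to apply `GaussianNoise` before `transforms.Normalize`, where pixel values lie in [0, 1]. Doing so would change the results reported in this notebook.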

#### 3. Load Test Dataset and Create Dataloader

Now we load our test dataset to which we applied transformations, as well as our Gaussian Noise. Then we create the dataloader. 

In [None]:
# load test dataset and create dataloader
batch_size = 16
test_set = datasets.ImageFolder(os.path.join(data_dir, 'test'), transform=test_transforms)
test_loader = DataLoader(test_set, batch_size=batch_size, shuffle=True)


#### 4. Test Model Performance on Test Set and Compute Metrics

The function below evaluates our model on the test set with Gaussian Noise added, then computes and prints the metrics we decided to use for all experiments: a Confusion Matrix, Test Accuracy, weighted F1 Score, and Classification Report.

In [None]:
# tests performance on test set and computes metrics
def test(model, test_loader):
	# list of predicted labels of all batches
	predicted_labels = torch.zeros(0, dtype=torch.long, device='cpu')
	# list of actual labels of all batches
	actual_labels = torch.zeros(0, dtype=torch.long, device='cpu')

	with torch.no_grad():
		model.eval()
		# get batch of inputs (image) and outputs (expression label) from test_loader
		for inputs, labels in test_loader:
			inputs = inputs.to(device)
			labels = labels.to(device)

			# use model to predict label
			outputs = model(inputs)
			_, preds = torch.max(outputs, dim=1)

			# append batch prediction labels and actual labels
			predicted_labels = torch.cat([predicted_labels, preds.view(-1).cpu()])
			actual_labels = torch.cat([actual_labels, labels.view(-1).cpu()])

	print('\nTest Metrics:')
	# print confusion matrix
	print('Confusion Matrix:')
	print(confusion_matrix(actual_labels.numpy(), predicted_labels.numpy()))

	print('Test Accuracy:', accuracy_score(actual_labels.numpy(), predicted_labels.numpy()))
	print('F1 score:', f1_score(actual_labels.numpy(), predicted_labels.numpy(), average='weighted'))
	# print classification report
	print('Classification Report:')
	print(classification_report(actual_labels.numpy(), predicted_labels.numpy()))

	return predicted_labels
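To make the `average='weighted'` F1 score above concrete, here is a minimal pure-Python sketch of the same calculation: the per-class F1 scores are averaged, each weighted by the number of true samples in that class. The helper name `weighted_f1` and the toy labels are illustrative assumptions, not part of our experiment.

```python
from collections import Counter


def weighted_f1(actual, predicted):
    """Average the per-class F1 scores, weighted by each class's support."""
    labels = sorted(set(actual) | set(predicted))
    support = Counter(actual)
    total = len(actual)
    score = 0.0
    for label in labels:
        pairs = list(zip(actual, predicted))
        tp = sum(1 for a, p in pairs if a == label and p == label)
        fp = sum(1 for a, p in pairs if a != label and p == label)
        fn = sum(1 for a, p in pairs if a == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support[label] / total
    return score


# Toy labels for three classes (illustrative only, not experiment data)
toy_actual = [0, 0, 1, 1, 2, 2]
toy_predicted = [0, 1, 1, 1, 2, 0]
toy_score = weighted_f1(toy_actual, toy_predicted)
print('weighted F1:', toy_score)
```

In this toy case the per-class F1 scores are 0.5, 0.8, and 2/3, and every class has the same support, so the weighted score is their plain average.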

The following results were obtained with **variance** set to 0.01 rather than the initial value of 0.05.

<div>
<img src="../Images/noise-exp-var-0.01.png" width="550"/>
</div>

In [None]:
test(model, test_loader)

Note that the code for this experiment was run multiple times with varying values of variance to understand how our model performed when different amounts of noise were applied to our test set. The variance values that we used were: 0.01, 0.05, 0.07, 0.1, 0.15, 0.2. The results across these values were plotted as follows:

<div>
<img src="https://git.cs.vt.edu/sdeepti/facial-expression-recognition/-/raw/main/Images/noise-experiment-line-graph.png"  width="450" height="320">
</div>
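The repeated manual runs described above could also be automated with a small loop. The sketch below shows only the control flow: `sweep_variances` and `dummy_evaluate` are hypothetical names, and the placeholder evaluator returns no accuracy at all. In this notebook, a real evaluator would rebuild `test_transforms` with `GaussianNoise(mean=0, var=var)`, recreate `test_loader`, and return the test accuracy.

```python
def sweep_variances(variances, evaluate):
    """Call evaluate(var) once per variance and collect the results in order."""
    return {var: evaluate(var) for var in variances}


# Variance values used in this experiment
variances = [0.01, 0.05, 0.07, 0.1, 0.15, 0.2]


def dummy_evaluate(var):
    # Placeholder so the sketch runs on its own; it deliberately returns no
    # accuracy. Replace with a function that evaluates the model at this variance.
    return None


results = sweep_variances(variances, dummy_evaluate)
for var, acc in results.items():
    print(f'variance={var}: accuracy={acc}')
```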

It can be seen that test accuracy decreased gradually as the variance increased, indicating that our model was not very robust to noise and would not necessarily perform well on real-world data.

There are multiple techniques we could apply to fix this. An obvious option is to retrain our model with a small random amount of noise added to our training images as a data augmentation. By training with noisy images, our model should be more agnostic to confounding details and perform better on real world images. Another option is to limit overfitting in our model using techniques such as dropout, early stopping, and loss regularization.


#### 5. Noise Experiment - Statistical Significance Study

We also performed a Statistical Significance Study on the noise experiment for a variance level of 0.1. We repeated the experiment 10 times with the same noise level applied to our test set, evaluating our model on the noisy test set each time, and plotted the resulting test accuracies as a box plot, as seen below:

<div align="center">
<img src="https://git.cs.vt.edu/sdeepti/facial-expression-recognition/-/raw/main/Images/noise-statistical-sig.png">
</div>

To summarize our findings, the median accuracy of our ten runs was 0.67, the minimum accuracy was 0.65, the maximum accuracy was 0.68, and there were no outliers.
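The summary statistics read off a box plot can also be computed directly. The following is a minimal sketch using only the standard library, assuming the common 1.5 × IQR rule for flagging outliers. The helper `box_plot_summary` is a hypothetical name, and the input values are synthetic numbers chosen to show an outlier being flagged; they are not the accuracies from our ten runs.

```python
import statistics


def box_plot_summary(values, whisker=1.5):
    """Return the median, min, max, and outliers under the whisker * IQR rule."""
    q1, _, q3 = statistics.quantiles(values, n=4, method='inclusive')
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    return {
        'median': statistics.median(values),
        'min': min(values),
        'max': max(values),
        'outliers': [v for v in values if v < low_fence or v > high_fence],
    }


# Synthetic values for illustration only (NOT our experiment's accuracies)
synthetic = [0.60, 0.62, 0.61, 0.63, 0.62, 0.64, 0.61, 0.63, 0.62, 0.90]
summary = box_plot_summary(synthetic)
print(summary)
```

An empty `outliers` list corresponds to a box plot with no points drawn beyond the whiskers.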