# Discussion

## Performance Metric

The F1-score is the harmonic mean of precision and recall, which makes it a suitable measure of success when precision and recall are equally important. Because we wanted to weight each class label equally, we used the macro-averaged F1-score to evaluate our models.
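To make the metric concrete, the following is a minimal sketch of how a macro-averaged F1-score can be computed from true and predicted labels. It is not the code used in the project: the function names `per_class_f1` and `macro_f1` are illustrative, the labels are toy values, and scoring a class with no true or predicted samples as 0 is an assumption.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, labels):
    """Return a dict mapping each label to its one-vs-rest F1-score."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # the predicted class gains a false positive
            fn[truth] += 1  # the true class gains a false negative
    scores = {}
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN), the harmonic mean of precision and recall
        denom = 2 * tp[label] + fp[label] + fn[label]
        # Assumption: a class with no true or predicted samples scores 0
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred, labels):
    """Unweighted mean of the per-class F1-scores."""
    scores = per_class_f1(y_true, y_pred, labels)
    return sum(scores.values()) / len(labels)


# Toy labels (not project data): class 0 dominates, class 1 is always missed
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 0, 2, 2]
accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
score = macro_f1(y_true, y_pred, labels=[0, 1, 2])
print(f"accuracy={accuracy:.3f}, macro F1={score:.3f}")
```

In this toy case the accuracy is 0.833 while the macro F1 is 0.556, because the completely missed minority class counts as much as the dominant one. This is the behaviour we wanted when treating every class label as equally important.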

## Carlini-Wagner Attacks

The basic CNN model performed very poorly on the test set generated by the C-W attack algorithm. This was unsurprising, as the adversarial images were specifically designed to fool this model. When the C-W test set was passed through the model with defences, our performance metric increased substantially, which was an encouraging result.

## Universal-Scoped Attacks

We found that the classifier with the defences was less successful than the basic one when classifying the unaltered test set and the universal-scoped attack images. This is a common problem: improvements in one area can lead to decreases in performance in another. To fully assess the success of our defences model, we would need to weigh up carefully how much loss in performance on these images we are willing to accept in return for the observed improvement on the C-W images. Treating each set of test images as equally important, we concluded that the defences model was the better classifier, but we recognise that its level of success is not close to what would be necessary in a driverless car system.

## Limitations

We recognise that our project is not fully polished and there are limitations to the conclusions we have drawn.
* The original dataset was reasonably small (40,000 images) and the adversarial images were generated from that same dataset. Increasing the dataset size and using completely distinct images to create the adversarial attacks could improve the robustness of our results. In addition, because of the dataset size, we did not want to reuse the images that trained the adversarial classifier to produce probability labels as inputs for the Defence CNN, as this could have introduced systematic bias. A larger dataset could therefore have allowed us to provide probabilities rather than binary labels as additional features to the Defence CNN.
* As the Carlini-Wagner attacks were formulated based on the architecture of the basic CNN model, we expected the basic CNN to perform poorly on this test set. However, we cannot be certain whether the improved performance of the Defence CNNs on the C-W test set was attributable to the added defences or to the fact that the attacks were not designed specifically for the Defence CNN.
* No hyperparameter tuning was conducted.

## Future Work

Given the time constraints and difficulties we encountered, there are numerous avenues for further exploration with our data. Some of them are listed below:
* We did not tune the hyperparameters of any of our models (autoencoder, classifier, neural networks); this could be explored in future iterations of the work.
* Repeat the testing stage, this time including images generated by the Carlini-Wagner attack specifically against the Defence CNN.
* Investigate Generative Adversarial Networks.
* In future iterations, it would be interesting to weight the classes (0-42) by importance; for example, misclassifying a stop sign may have greater consequences than misclassifying a bumpy road sign. This could be done through the model's architecture or through how the evaluation is carried out.
* Furthermore, we could explore different architectures for our neural network, such as ResNets or DenseNets, which have been successful in other application domains for image classification.
* If we had more time and more training data available, we would attempt the adversarial classifier method again with some changes, such as using a separate training set and passing probabilities to the CNN instead of binary labels. We might also look into creating two CNN models, one for images classified as adversarial and one for the others. This could prevent a decrease in our success at classifying standard images.
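
The idea of weighting classes by importance at evaluation time could be sketched as follows. This is only an illustration under stated assumptions: the function name `importance_weighted_score`, the class names, the per-class scores and the importance weights are all hypothetical and are not results from this project.

```python
def importance_weighted_score(per_class_scores, importance):
    """Average per-class scores using class-importance weights.

    Both arguments are dicts keyed by class label. With equal weights this
    reduces to the ordinary macro average.
    """
    total_weight = sum(importance[c] for c in per_class_scores)
    if total_weight <= 0:
        raise ValueError("importance weights must sum to a positive value")
    weighted_sum = sum(per_class_scores[c] * importance[c] for c in per_class_scores)
    return weighted_sum / total_weight


# Hypothetical per-class F1-scores (placeholders, not measured values)
per_class_scores = {"stop": 0.5, "bumpy_road": 1.0, "other": 1.0}

# Equal weights reproduce the macro average used in this report
uniform = {"stop": 1.0, "bumpy_road": 1.0, "other": 1.0}

# Hypothetical weights that treat stop-sign errors as three times as costly
safety = {"stop": 3.0, "bumpy_road": 1.0, "other": 1.0}

print(f"macro average:       {importance_weighted_score(per_class_scores, uniform):.3f}")
print(f"importance-weighted: {importance_weighted_score(per_class_scores, safety):.3f}")
```

With these placeholder numbers the score falls from 0.833 to 0.700 once the poorly classified stop sign is given more weight, which shows how such a scheme would penalise errors on safety-critical classes more heavily. Choosing the weights themselves would require domain judgement that is outside the scope of this sketch.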