This repository contains the code base for the project for the Advanced Machine Learning (02460) course at DTU.
Please refer to our paper here for further and more detailed explanations.
The collaborators were:
- Grigor Spalj
- Alessandro Contini
- Anders David Lægdsgaard Lassen
- Mads Birch Sørensen
Deep Neural Networks (DNNs) have often been found to be poorly calibrated due to overconfident predictions. At the same time, the increasing complexity of modern DNNs has led to these models being considered black boxes. For that reason, various explanation methods have been proposed to uncover which features influence the predictions of DNNs. Bayesian Neural Networks (BNNs) infer the entire posterior distribution over the weights, meaning that uncertainty about predictions is inherent. In this project we investigate how a Bayesian approach can improve the calibration and explainability of modern DNNs. We implemented and trained a Convolutional Neural Network (CNN) on the MURA dataset to perform a classification task and found that the Bayesian framework resulted in a significant reduction in calibration error and improved the interpretability of the implemented visual explanation methods.
We found that adopting a Bayesian framework resulted in a dramatic decrease in calibration error, indicating that the models' confidence more closely matches the actual uncertainty of their predictions.
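The calibration error referred to above is typically measured with a metric such as Expected Calibration Error (ECE); see the paper for the exact setup used in this project. As a minimal sketch (the function name, binning scheme, and binary setting are illustrative assumptions, not taken from this repository):

```python
import numpy as np

def expected_calibration_error(probs, labels, n_bins=10):
    """Sketch of ECE for binary classification: bin predictions by
    confidence, then average |accuracy - confidence| weighted by bin size."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    # Confidence is the probability assigned to the predicted class.
    confidences = np.where(probs >= 0.5, probs, 1.0 - probs)
    predictions = (probs >= 0.5).astype(int)
    bin_edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            acc = (predictions[mask] == labels[mask]).mean()
            conf = confidences[mask].mean()
            ece += mask.mean() * abs(acc - conf)
    return ece
```

A perfectly calibrated model (confidence equal to empirical accuracy in every bin) yields an ECE of zero; overconfident models yield larger values.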
By visually investigating the predictions of the models using Class Activation Maps (CAMs), we discovered that the models had learned optimal but different solutions. By examining this diverse set of learned representations, we gained valuable information about the certainty of the predictions.
By aggregating the CAMs using the average, union, and intersection methods, we further investigated the uncertainty of the model predictions. By investigating which information is most important to the models when making predictions, we gained valuable insight into the certainty of the learned representations, and thereby into the certainty of the models' predictions. Hence, by adopting a Bayesian framework we successfully reduced the black-box nature of modern CNNs.
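The three aggregation schemes above can be sketched as elementwise operations over the per-model CAMs. This is a minimal sketch, assuming union and intersection are implemented as elementwise max and min over normalized activation maps (the function name is ours, not from this repository):

```python
import numpy as np

def aggregate_cams(cams):
    """Aggregate CAMs from multiple posterior samples / models.

    cams: sequence of 2D arrays of identical shape (H, W).
    Returns the average, union (elementwise max), and intersection
    (elementwise min) maps."""
    stack = np.stack(cams)  # shape: (n_models, H, W)
    return {
        "average": stack.mean(axis=0),
        "union": stack.max(axis=0),
        "intersection": stack.min(axis=0),
    }
```

Regions with high intersection values are those all sampled models agree on, while a large gap between union and intersection highlights regions where the models disagree, i.e. where the explanation itself is uncertain.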