# Bayesian Neural Networks 

This script explores using PyTorch and Pyro to build, train, and test a Bayesian neural network. The advantage and novelty of this model is that inference is done probabilistically, which lets us quantify how certain the model is about its results. It is a bit like a built-in sensitivity analysis.

I like the way [this tutorial](https://towardsdatascience.com/making-your-neural-network-say-i-dont-know-bayesian-nns-using-pyro-and-pytorch-b1c24e6ab8cd) [3] phrases it:

> Making Your Neural Network Say “I Don’t Know”

In this tutorial, we'll go over:
- the key differences between `Bayesian Neural Networks (BNN)` and `Neural Networks (NN)`
- how BNNs do inference and training
- a simple walkthrough of building a BNN using `pytorch` and `pyro`
- the difference in training and prediction time complexity

# BNNs vs NNs 

## Classic Neural Network 


## Bayesian Neural Network 

![image.png](https://www.researchgate.net/profile/Florian_Haese/publication/329843608/figure/fig2/AS:713727343067138@1547177267367/llustration-of-a-Bayesian-Neural-Network-BNN-A-A-Bayesian-neuron-defines-a_W640.jpg)
Taken from [1]. 


# BNN advantage


## What happens when you give a cutting-edge neural network an image of noise?

![image.png](https://miro.medium.com/max/802/0*HG51qQU8I34_fUgB.jpg)

--- 

The most obvious advantage of BNNs is that they let the model propagate its uncertainty about a prediction. [This tutorial](https://krasserm.github.io/2019/03/14/bayesian-neural-networks/) [2] has a great plot that shows this.

![image.png](https://krasserm.github.io/img/2019-03-14/output_9_1.png)

From this we can see that the BNN knows where in its feature space it makes accurate predictions and where it does not. I've been thinking about this as a built-in sensitivity analysis. With a few extra lines of code, we can have the model say, "Hey, I don't know what the answer is, don't trust me", which is a key feature when making important decisions.
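One way the "I don't know" behaviour could look in code is sketched below. It assumes we already have class probabilities from several sampled networks (here they are hand-written numbers, not real model output) and abstains when the entropy of the averaged prediction is high. The helper name `predict_or_abstain`, the `max_entropy_frac` threshold, and the `-1` sentinel are illustrative choices, not part of Pyro or PyTorch.

```python
import math
import torch

def predict_or_abstain(sampled_probs, max_entropy_frac=0.5):
    """Hypothetical helper.

    sampled_probs has shape (n_samples, n_inputs, n_classes); each slice along
    dim 0 holds the class probabilities produced by one sampled set of weights.
    """
    # Average the predictions of all sampled networks
    mean_probs = sampled_probs.mean(dim=0)
    # Entropy of the averaged prediction: high when the samples disagree
    # or when every sample is individually unsure
    entropy = -(mean_probs * mean_probs.clamp_min(1e-12).log()).sum(dim=-1)
    # Entropy of a uniform distribution is the largest possible value
    max_entropy = math.log(sampled_probs.shape[-1])
    preds = mean_probs.argmax(dim=-1)
    # Use -1 as the "I don't know" answer (an assumed convention)
    preds[entropy > max_entropy_frac * max_entropy] = -1
    return preds, entropy

# Three sampled networks, two inputs, three classes (made-up numbers)
sampled = torch.tensor([
    [[0.98, 0.01, 0.01], [0.70, 0.20, 0.10]],
    [[0.97, 0.02, 0.01], [0.10, 0.80, 0.10]],
    [[0.99, 0.005, 0.005], [0.20, 0.10, 0.70]],
])
preds, entropy = predict_or_abstain(sampled)
print(preds)    # first input is classified, second is refused
print(entropy)
```

The sampled networks agree on the first input, so it gets a class label. They disagree on the second, so the averaged prediction is close to uniform and the helper refuses to answer. The threshold is a tuning knob that would need to be chosen on validation data.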

Other advantages include being able to place priors on the weight distributions, although it can be tough to know how to do this well. We could also place priors on the distribution of X, which could be as simple as estimating p(x) over each mini-batch of the training set and using it as the prior on test sets.
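To make the idea of a prior on the weights concrete, here is a minimal sketch that places an independent zero-mean Gaussian prior on every parameter of a small layer and evaluates its log density with `torch.distributions`. The Gaussian choice, the `scale` value, and the `log_prior` helper are assumptions for illustration; the notebook does not fix a particular prior. The sketch also checks the standard identity that the negative log of this prior is an L2 penalty plus a constant.

```python
import math
import torch
from torch import nn
from torch.distributions import Normal

torch.manual_seed(0)
layer = nn.Linear(4, 3)  # 4 inputs, like the iris features; 3 outputs

def log_prior(module, scale=1.0):
    """Hypothetical helper: log density of all parameters under N(0, scale^2)."""
    prior = Normal(0.0, scale)
    return sum(prior.log_prob(p).sum() for p in module.parameters())

scale = 1.0
n_params = sum(p.numel() for p in layer.parameters())

# -log N(w; 0, scale^2) = w^2 / (2 scale^2) + log(scale) + 0.5 * log(2 pi)
l2_penalty = sum((p ** 2).sum() for p in layer.parameters()) / (2 * scale ** 2)
constant = n_params * (math.log(scale) + 0.5 * math.log(2 * math.pi))
neg_log_prior = -log_prior(layer, scale)

print(neg_log_prior.item(), (l2_penalty + constant).item())
```

The two printed numbers should match, which shows why a Gaussian prior on the weights behaves like weight decay when only a point estimate is fitted. A smaller `scale` expresses a stronger belief that the weights are near zero.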

# Training a BNN 

## Variational Inference


## Sampling 


# Prediction 

In [5]:
# Import relevant packages
import torch
import torch.nn.functional as nnf
from torch.utils.data import random_split
from torch.utils.data.dataloader import DataLoader
from torch.optim import SGD 
from torch.distributions import constraints
import torchvision as torchv
import torchvision.transforms as torchvt
from torchvision.datasets.mnist import MNIST
from torch import nn
from pyro.infer import SVI, TraceMeanField_ELBO
import pyro
from pyro import poutine
import pyro.optim as pyroopt
import pyro.distributions as dist
import pyro.contrib.bnn as bnn
import matplotlib.pyplot as plt
import seaborn as sns
from torch.distributions.utils import lazy_property
import math
from torch.utils import data

from sklearn.datasets import load_iris

In [6]:
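import os

X, Y = load_iris(return_X_y=True)

# Save each sample as a float32 tensor, since nn.Linear expects float32 inputs
os.makedirs('./data', exist_ok=True)
for i, x in enumerate(X):
    torch.save(torch.tensor(x, dtype=torch.float32), './data/%d.pt' % i)

n_classes = len(set(Y))

# Iris has 150 samples sorted by class, so shuffle the indices before splitting;
# otherwise the training set would never contain the third class
torch.manual_seed(0)
ids = torch.randperm(len(X)).tolist()
partition = {'train': ids[:100],
             'val': ids[100:125],
             'test': ids[125:]}

# Map sample ID (as a string) to its integer class label
labels = {str(i): int(j) for i, j in enumerate(Y)}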

In [7]:
class Dataset(data.Dataset):
    'Characterizes a dataset for PyTorch'
    def __init__(self, list_IDs, labels):
        'Initialization'
        self.labels = labels
        self.list_IDs = list_IDs

    def __len__(self):
        'Denotes the total number of samples'
        return len(self.list_IDs)

    def __getitem__(self, index):
        'Generates one sample of data'
        # Select sample (IDs are integers)
        ID = self.list_IDs[index]

        # Load data and get label; format the integer ID into the file name
        # instead of concatenating it to a string, which raises a TypeError
        X = torch.load('data/%d.pt' % ID).float()
        y = self.labels[str(ID)]  # labels are keyed by the ID as a string

        return X, y

In [28]:
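# Device for PyTorch (CPU only here)
device = torch.device('cpu')

# Parameters
# num_workers=0 loads data in the main process, which avoids worker start-up
# problems when the Dataset class is defined inside a notebook
params = {'batch_size': 50,
          'shuffle': True,
          'num_workers': 0}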

# Generators
training_set = Dataset(partition['train'], labels)
train_loader = data.DataLoader(training_set, **params)

validation_set = Dataset(partition['val'], labels)
validation_generator = data.DataLoader(validation_set, **params)

test_set = Dataset(partition['test'], labels)
test_generator = data.DataLoader(test_set, **params)
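Since the data set is small enough to fit in memory, a simpler alternative to saving one file per sample might be `TensorDataset` with `random_split` (which is imported above but otherwise unused). The sketch below uses random stand-in tensors with the same shape as iris rather than the real data, and the `demo_*` names are placeholders.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset, random_split

torch.manual_seed(0)

# Stand-in data with the iris shape: 150 samples, 4 features, 3 classes
demo_features = torch.randn(150, 4)
demo_targets = torch.arange(150) % 3

demo_dataset = TensorDataset(demo_features, demo_targets)

# random_split shuffles the indices before cutting them into subsets
demo_train, demo_val, demo_test = random_split(demo_dataset, [100, 25, 25])

demo_train_loader = DataLoader(demo_train, batch_size=50, shuffle=True)
demo_val_loader = DataLoader(demo_val, batch_size=50)
demo_test_loader = DataLoader(demo_test, batch_size=50)

x, y = next(iter(demo_train_loader))
print(x.shape, y.shape)
```

With the real data, `demo_features` and `demo_targets` would be built from the arrays returned by `load_iris`, cast to `float32` and `int64` respectively.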

In [29]:
class FCN(nn.Module):
    def __init__(self, n_classes=n_classes):
        super(FCN, self).__init__()
        self.fc = nn.Sequential(nn.Dropout(p=0.2),
                                nn.Linear(4, 20),
                                nn.Dropout(p=0.9),
                                nn.LeakyReLU(),
                                nn.Linear(20,50),
                                nn.Dropout(p=0.9),
                                nn.LeakyReLU(),
                                nn.Linear(50, 20),
                                nn.Dropout(p=0.9),
                                nn.LeakyReLU(),
                                nn.Linear(20, n_classes),
                                nn.LogSoftmax(dim=-1))
    
    def forward(self, inp):
        return self.fc(inp)

FC_NN = FCN()

In [30]:
optim = SGD(FC_NN.parameters(), lr=0.1, momentum=0.95)
epochs = 100

for i in range(epochs):
    total_loss = 0.0
    total = 0
    correct = 0
    for x, y in train_loader:
        optim.zero_grad()
        pred = FC_NN(x.float())
        # The network ends in LogSoftmax, so pair it with the negative
        # log-likelihood loss rather than a loss that expects raw logits
        loss = nnf.nll_loss(pred, y)
        loss.backward()
        optim.step()

        # .item() keeps a plain float so the graph is not retained across batches
        total_loss += loss.item()
        total += y.size(0)  # number of samples in this batch
        correct += (pred.argmax(-1) == y).sum().item()

    print('epoch: %d | loss: %.3f | acc: %.5f' % ((i+1), total_loss, correct/total*100))


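After training, accuracy on held-out data could be measured with a small helper along these lines. This is a sketch: the `accuracy` function is hypothetical, and the demo model and loader are random stand-ins so the block runs on its own. With the notebook's objects, the call would presumably look like `accuracy(FC_NN, validation_generator)`.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def accuracy(model, loader):
    """Hypothetical helper: fraction of correctly classified samples."""
    model.eval()  # switch off dropout so predictions are deterministic
    correct, total = 0, 0
    with torch.no_grad():  # no gradients are needed for evaluation
        for x, y in loader:
            correct += (model(x).argmax(dim=-1) == y).sum().item()
            total += y.size(0)
    return correct / total

# Stand-in model and data, only to show the call
torch.manual_seed(0)
demo_model = nn.Sequential(nn.Linear(4, 3), nn.LogSoftmax(dim=-1))
demo_loader = DataLoader(
    TensorDataset(torch.randn(20, 4), torch.randint(0, 3, (20,))),
    batch_size=5,
)
acc = accuracy(demo_model, demo_loader)
print('accuracy: %.3f' % acc)
```

The helper leaves the model in evaluation mode, so `model.train()` should be called again before any further training. This matters for a network with dropout layers, because dropout behaves differently in the two modes.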

# References 

1. How machine learning can assist the interpretation of ab initio molecular dynamics simulations and conceptual understanding of chemistry. ResearchGate. https://www.researchgate.net/publication/329843608_How_machine_learning_can_assist_the_interpretation_of_ab_initio_molecular_dynamics_simulations_and_conceptual_understanding_of_chemistry/figures?lo=1

2. Variational inference for Bayesian neural networks. Martin Krasser’s Blog. https://krasserm.github.io/2019/03/14/bayesian-neural-networks/

3. Chopra, P. Making Your Neural Network Say “I Don’t Know” — Bayesian NNs using Pyro and PyTorch. Medium (2019). https://towardsdatascience.com/making-your-neural-network-say-i-dont-know-bayesian-nns-using-pyro-and-pytorch-b1c24e6ab8cd

