### Workflow

#### Goal

* Get first-hand experience in generative modeling by training a Variational Autoencoder (VAE) on the MNIST dataset.

#### Steps

1. Train a VAE that generates images of handwritten digits on the MNIST dataset. You might want to start with a simple autoencoder and build up from there.
1. To test a regular autoencoder, pass images from the test set through it. To test the generative capabilities of a VAE, however, feed vectors of random noise to the decoder. If you plan to display the generated images throughout training, keep the noise vector constant so that you can directly follow the progress of your generative model.
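
The constant-noise idea from the second step can be sketched as follows. This is a minimal illustration, not part of the notebook's model: the `decoder` below is an untrained stand-in, and `latent_dim`, the hidden size of 400, and the `preview` helper are all assumed names and values chosen for the example.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the noise reproducible across runs

latent_dim = 20   # assumed latent size
image_dim = 784   # a 28x28 MNIST image, flattened

# Stand-in decoder (hypothetical): in practice this is the trained VAE's decoder.
decoder = nn.Sequential(
    nn.Linear(latent_dim, 400),
    nn.ReLU(),
    nn.Linear(400, image_dim),
    nn.Sigmoid(),
)

# Sample the noise once, before training, and reuse it for every preview.
fixed_noise = torch.randn(16, latent_dim)


def preview(decoder, noise):
    """Decode noise into a batch of 1x28x28 images without tracking gradients."""
    was_training = decoder.training
    decoder.eval()
    with torch.no_grad():
        images = decoder(noise).view(-1, 1, 28, 28)
    decoder.train(was_training)  # restore the previous mode
    return images


samples = preview(decoder, fixed_noise)
```

Calling `preview(decoder, fixed_noise)` at the end of each epoch yields images that differ only because the weights changed, not because the noise did. The returned batch can be written to disk with `save_image` from `torchvision.utils`.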

#### Deliverable

* This Notebook, with examples of generated output from a VAE.

#### Resources

* [AutoEncoders - Dimensionality Reduction](https://medium.com/analytics-vidhya/dimension-manipulation-using-autoencoder-in-pytorch-on-mnist-dataset-7454578b018)
* [Variational Autoencoders Explained](http://kvfrans.com/variational-autoencoders-explained/)
* [Variational Autoencoders in PyTorch](https://vxlabs.com/2017/12/08/variational-autoencoder-in-pytorch-commented-and-annotated/)
* [VAE Implementation on Colab by github user smartgeometry-ucl](https://colab.research.google.com/github/smartgeometry-ucl/dl4g/blob/master/variational_autoencoder.ipynb#scrollTo=gGtbRxtJEuw6)



In [2]:
import os
import torch
import torch.utils.data
from torch import nn, optim
from torch.autograd import Variable
from torch.nn import functional as F
from torchvision import datasets, transforms
from torchvision.utils import save_image

In [3]:
path = 'data/MelanomaDetection'

In [4]:
# might be fun to write this in fast.ai?
# model is the same. 
# optimizer is just adam
# loss function is harder.
# seems like the databunch or transforms would be easy.

# for fast.ai you can either extend BaseLoss or extend Module and include a forward function.  See losses.py 



In [5]:
# you could also try to write this in trax.  

# try defining the model first.  If you feel like that went well you can try loading the data in a different notebook.



### Model and Helper Functions

In [7]:
class VAE(nn.Module):
    
    def __init__(self, input_dims, fc_dims):
        
        super(VAE, self).__init__()
        
        #shrink - encoder
        self.fc1  = nn.Linear(input_dims, fc_dims[0])
        self.relu = nn.ReLU() #reuse in encoder and decoder
        self.fc_mu  = nn.Linear(fc_dims[0], fc_dims[1]) #mean
        self.fc_lv  = nn.Linear(fc_dims[0], fc_dims[1]) #log of the variance (not std dev: log-variance can be any real number, so the layer output needs no positivity constraint)
                  
        #grow - decoder
        self.fc3 = nn.Linear(fc_dims[1], fc_dims[0])
        self.fc4 = nn.Linear(fc_dims[0], input_dims)
        
        
        #squashes the decoder output into [0, 1], as the Binary Cross Entropy reconstruction loss requires.
        self.sigmoid = nn.Sigmoid()

    
    # you may not need to pass in the target here. 
    def forward(self, src):
        
        mu, logvar = self.encode(src)
        sample     = self.latent_sample(*self.reparameterize(mu, logvar)) 
        decoded    = self.decode(sample)
        return (decoded, mu, logvar)
    
    def encode(self, sample):
        
        x  = self.relu(self.fc1(sample))
        mu = self.fc_mu(x)
        lv = self.fc_lv(x)
        
        return mu, lv
    
    def decode(self, sample):
        
        x = self.relu(self.fc3(sample))
        x = self.fc4(x)
        return self.sigmoid(x)
        
    
    # converts the log variance to a std dev and draws the noise term eps ~ N(0, 1)
    def reparameterize(self, mu: Variable, logvar: Variable) -> tuple:
        
        std = logvar.mul(0.5).exp_() #turn the logvar into the std dev.  
        eps = torch.empty_like(std).normal_()
            
        return (mu, std, eps) 
        
    def latent_sample(self, mu: Variable, std: Variable, eps: Variable) -> Variable:
       
        if self.training:
            return eps.mul(std).add_(mu)
        
        else: #inference   
            '''
            Options
            1. spit out mu (b/c in normal distribution it has the highest prob). Deterministic
            2. Use mu and std to create a distributions to sample from. Then perform sampling. Stochastic
            
            If you want consistent predictions you shouldn't use option 2.
            '''
            return mu #idea: __init__ takes a param that controls which option above to perform during inference.
        


def loss_function():
    pass
    # LEFT OFF HERE.
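
One common way to fill in the stub above is the negative ELBO: a Binary Cross Entropy reconstruction term plus the closed-form KL divergence between the encoder's Gaussian and a standard normal prior. The sketch below assumes that choice; the name `vae_loss` and its argument order are hypothetical, picked to match the `(decoded, mu, logvar)` tuple returned by `forward`.

```python
import torch
from torch.nn import functional as F


def vae_loss(recon_x, x, mu, logvar):
    """Negative ELBO summed over the batch: reconstruction term plus KL term."""
    # Reconstruction: per-pixel binary cross-entropy. Both tensors must hold
    # values in [0, 1]; the target is reshaped to match the flattened output.
    bce = F.binary_cross_entropy(recon_x, x.reshape(recon_x.shape), reduction='sum')
    # KL(N(mu, sigma^2) || N(0, I)) in closed form, with logvar = log(sigma^2):
    # -0.5 * sum(1 + logvar - mu^2 - exp(logvar))
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return bce + kld
```

With `mu = 0` and `logvar = 0` the KL term vanishes and the loss reduces to the reconstruction error alone, which is a quick sanity check. Summing rather than averaging keeps the two terms on the same scale; dividing the result by the batch size is an equally reasonable convention.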
    

### Fast.ai

Update: I can't remember what the goal was here.

* Write a Fast.ai training loop?
* Just grab a DataLoader that I could use with a PyTorch training loop. Can't remember the interoperability of PyTorch Datasets and DataLoaders. I think they'll work.

Let's just write the full training loop in Fast.ai.

In [1]:
# load the data. No transforms. Just a simple DataBlock?
# TODO: might I want transforms here?

from fastai.data.all import *
from fastai.vision.all import *

dblock = DataBlock(blocks    = (ImageBlock, ImageBlock),
                   get_items = get_image_files,
                   splitter  = RandomSplitter())

In [4]:


dsets = dblock.datasets(path + '/unlabeled')
dsets.train[0]

(PILImage mode=RGB size=32x32, PILImage mode=RGB size=32x32)

#### Fastai Training loop

In [None]:
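# VAE() needs its layer sizes. The items above are 32x32 RGB images, so a flattened
# input has 32*32*3 values; the hidden and latent sizes here are placeholder assumptions.
# Should also move out of fast.ai loop.
model = VAE(input_dims=32*32*3, fc_dims=[400, 20])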

In [None]:
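# this is wrong. fix: CrossEntropyLossFlat and accuracy are classification pieces,
# and dls and siamese_splitter are not defined in this notebook.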
learn = Learner(dls, model, loss_func=CrossEntropyLossFlat(), splitter=siamese_splitter, metrics=accuracy)


### PyTorch