<a href="https://colab.research.google.com/github/sharon-lin/ML-notebooks/blob/master/MIC_Workshop_6_GANs.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# MIC Workshop 6: Generative Adversarial Networks


## Background
Hope you're enjoying the Deep Learning workshops so far! For these workshops, we'll typically use Google Colab, an online coding environment. This is so that we don't have to worry about installing all of the libraries on everyone's different computers. 

_____
You're now working in a Notebook. Notebooks have **cells**, each of which can be run by hitting Shift+Enter. Try it on the cell below!

_You will see the output of the particular cell right below it_

In [1]:
print("Notebooks are so much fun!!!")

Notebooks are so much fun!!!


In general, notebooks are a very suitable tool for machine learning and data science. We would also recommend trying [Jupyter Notebook](https://jupyter.org/install) if you haven't already.

_For later workshops, we might opt for a more involved environment like Docker in case we want to do anything fancier._
____
## Installing PyTorch
Don't worry too much about the contents of this cell. It just installs the packages you need to run the PyTorch code below.

If this cell is causing problems for you (like `tcmalloc` errors), make sure you click "Connect to hosted runtime" from the dropdown menu in the top right.


In [2]:
# Install torchvision (which pulls in PyTorch) and the helper packages used below.
# Don't worry about the code in this cell.
# http://pytorch.org/
# The old platform/CUDA detection lines were removed: their results were never used,
# and `wheel.pep425tags` no longer exists in recent versions of `wheel`.
!pip3 install torchvision
!pip install unidecode
!pip install tensorboardX

#!pip install tqdm

Collecting unidecode
  Downloading https://files.pythonhosted.org/packages/31/39/53096f9217b057cb049fe872b7fc7ce799a1a89b76cf917d9639e7a558b5/Unidecode-1.0.23-py2.py3-none-any.whl (237kB)
    100% |████████████████████████████████| 245kB 7.4MB/s 
Installing collected packages: unidecode
Successfully installed unidecode-1.0.23
Collecting tensorboardX
  Downloading https://files.pythonhosted.org/packages/5c/76/89dd44458eb976347e5a6e75eb79fecf8facd46c1ce259bad54e0044ea35/tensorboardX-1.6-py2.py3-none-any.whl (129kB)
    100% |████████████████████████████████| 133kB 3.8MB/s 
Installing collected packages: tensorboardX
Successfully installed tensorboardX-1.6


In [0]:
import torch
import math
import torch.nn as nn
import torch.optim as optim
import numpy as np
from torch.autograd import Variable
from IPython import display
import tqdm
import time
from tensorboardX import SummaryWriter
from matplotlib import pyplot as plt

use_cuda = False

It is notoriously difficult to get a stable GAN. For that reason, let's look at a very simple example today.

### Let's try to approximate a Gaussian with a mean and variance of our choice.

**Without loss of generality, let's say that $\mu = 4$ and $\sigma = 1.25$ (these are the values of `data_mean` and `data_stddev` set further down).**

*A note on probability:*

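If we had a source of randomness that gave us a number $x$ drawn uniformly from the interval $[0,1]$, we could just take $\sigma\Phi^{-1}(x) + \mu$, where $\Phi(\cdot)$ is the CDF of the standard normal and $\Phi^{-1}$ is its inverse. The result is normally distributed with mean $\mu$ and standard deviation $\sigma$.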
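To make the note concrete, here is a minimal sketch of that inverse-CDF trick using only the Python standard library. The function name `sample_gaussian`, the seed, and the sample count are illustrative assumptions; the mean and standard deviation are the `data_mean` and `data_stddev` values set later in this notebook.

```python
import random
from statistics import NormalDist, mean, stdev

mu, sigma = 4.0, 1.25  # same values as data_mean / data_stddev later on


def sample_gaussian(n, mu, sigma, rng):
    """Turn n uniform draws into Gaussian draws via sigma * Phi^{-1}(x) + mu."""
    std_normal = NormalDist()  # standard normal: mean 0, std 1
    samples = []
    for _ in range(n):
        x = rng.random()       # uniform on [0, 1)
        while x == 0.0:        # inv_cdf is only defined for 0 < x < 1
            x = rng.random()
        samples.append(sigma * std_normal.inv_cdf(x) + mu)
    return samples


rng = random.Random(0)  # fixed seed so the run is repeatable
samples = sample_gaussian(20000, mu, sigma, rng)
print("mean: %.3f, std: %.3f" % (mean(samples), stdev(samples)))
```

The printed mean and standard deviation should land close to 4 and 1.25. This mapping from uniform noise to Gaussian samples is the kind of function we hope the generator will approximate.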

In other words, we wouldn't necessarily need fancy machinery for this. Nevertheless, let's see how well the GANs we just learned about perform on this task.

To build our GAN model, we need a `Generator` and a `Discriminator`.



**Let's first make sure we have access to an actual Gaussian.**

In [0]:
def get_distribution_sampler(mu, sigma):
    return lambda n: torch.Tensor(np.random.normal(mu, sigma, (1, n)))

Now we can try feeding our **Generator** with values sampled from a uniform distribution on $[0,1]$. Just like this: 

In [0]:
def get_generator_input_sampler():
    return lambda m, n: torch.rand(m, n) 

In [8]:
get_generator_input_sampler()(5,3)

tensor([[0.6124, 0.4040, 0.3029],
        [0.7193, 0.3918, 0.6619],
        [0.9301, 0.2627, 0.0198],
        [0.7621, 0.0982, 0.4237],
        [0.3819, 0.8969, 0.4771]])

## Discussion question:
What would then be the job of the generator (together with the discriminator)? What function are they trying to learn?

Suppose we have the following two networks. Feel free to modify them in any way you want.

In [0]:
class Generator(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, f):
        super(Generator, self).__init__()
        self.map1 = nn.Linear(input_size, hidden_size)
        self.map2 = nn.Linear(hidden_size, hidden_size)
        self.map3 = nn.Linear(hidden_size, output_size)
        self.f = f  # stored but never applied: forward() leaves the output layer linear
        self.hidden_f = torch.tanh # change?

    def forward(self, x):
        x = self.map1(x)
        x = self.hidden_f(x)
        x = self.map2(x)
        x = self.hidden_f(x)
        x = self.map3(x)
        return x

In [0]:
class Discriminator(nn.Module):
    def __init__(self, input_size, hidden_size, output_size, f):
        super(Discriminator, self).__init__()
        self.map1 = nn.Linear(input_size, hidden_size)
        self.map2 = nn.Linear(hidden_size, hidden_size)
        self.map3 = nn.Linear(hidden_size, output_size)
        self.f = f
        self.hidden_f = torch.tanh # change?

    def forward(self, x):
        x = self.hidden_f(self.map1(x))
        x = self.hidden_f(self.map2(x))
        return self.f(self.map3(x))
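Before training, it can help to check the tensor shapes flowing through the two networks. The sketch below uses `nn.Sequential` stand-ins with the same layer layout as the classes above; they are illustrative, not the workshop's actual objects. It assumes the "Only 4 moments" preprocessing chosen later in the notebook, so the discriminator sees 4 input features, and the `features` tensor is just a random placeholder for those moments.

```python
import torch
import torch.nn as nn

hidden = 30  # same hidden size as used in train()

# Stand-in generator: 1 noise value in, 1 generated sample out (linear output).
G = nn.Sequential(
    nn.Linear(1, hidden), nn.Tanh(),
    nn.Linear(hidden, hidden), nn.Tanh(),
    nn.Linear(hidden, 1),
)

# Stand-in discriminator: 4 summary statistics in, 1 probability out.
D = nn.Sequential(
    nn.Linear(4, hidden), nn.Tanh(),
    nn.Linear(hidden, hidden), nn.Tanh(),
    nn.Linear(hidden, 1), nn.Sigmoid(),
)

noise = torch.rand(500, 1)   # one uniform noise value per generated sample
fake = G(noise)              # shape (500, 1): a minibatch of generated values
features = torch.rand(4)     # placeholder for the four moments of a minibatch
score = D(features)          # shape (1,): probability that the batch is real

print(tuple(fake.shape), tuple(score.shape))
```

Note that the discriminator scores a whole minibatch at once (through its summary statistics) rather than individual samples.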


With that, we are ready to go and train our model.

In [0]:
def train():
    # Model parameters
    g_input_size = 1      # Random noise dimension coming into generator, per output vector
    g_output_size = 1     # Size of generated output vector
    d_input_size = 500    # Minibatch size - cardinality of distributions
    d_output_size = 1     # Single dimension for 'real' vs. 'fake' classification
    minibatch_size = d_input_size

    num_epochs = 2000  
    print_interval = 100

    # Which of the above can we try to tweak?
    g_hidden_size = 30     # Generator complexity
    d_hidden_size = 30    # Discriminator complexity

    d_learning_rate = 1e-3
    g_learning_rate = 1e-3
    sgd_momentum = 0.9
    
    d_steps = 10
    g_steps = 30

    
    dfe, dre, ge = 0, 0, 0
    d_real_data, d_fake_data, g_fake_data = None, None, None

    # Should we change these?
    discriminator_activation_function = torch.sigmoid
    generator_activation_function = torch.tanh

#     -------------------
#     Nothing interesting happening here
    d_sampler = get_distribution_sampler(data_mean, data_stddev)
    gi_sampler = get_generator_input_sampler()
    G = Generator(input_size=g_input_size,
                  hidden_size=g_hidden_size,
                  output_size=g_output_size,
                  f=generator_activation_function)
    D = Discriminator(input_size=d_input_func(d_input_size),
                      hidden_size=d_hidden_size,
                      output_size=d_output_size,
                      f=discriminator_activation_function)
    criterion = nn.BCELoss()  # Binary cross entropy: http://pytorch.org/docs/nn.html#bceloss
    d_optimizer = optim.SGD(D.parameters(), lr=d_learning_rate, momentum=sgd_momentum)
    g_optimizer = optim.SGD(G.parameters(), lr=g_learning_rate, momentum=sgd_momentum)
#     -------------------

    for epoch in range(num_epochs):
        for d_index in range(d_steps):
            # 1. Train D on real+fake
            D.zero_grad()

            #  1A: Train D on real
            d_real_data = Variable(d_sampler(d_input_size))
            d_real_decision = D(preprocess(d_real_data))
            d_real_error = criterion(d_real_decision, torch.ones_like(d_real_decision))  # ones = true; target shape matches D's output
            d_real_error.backward() # compute/store gradients, but don't change params

            #  1B: Train D on fake
            d_gen_input = Variable(gi_sampler(minibatch_size, g_input_size))
            d_fake_data = G(d_gen_input).detach()  # detach to avoid training G on these labels
            d_fake_decision = D(preprocess(d_fake_data.t()))
            d_fake_error = criterion(d_fake_decision, torch.zeros_like(d_fake_decision))  # zeros = fake; target shape matches D's output
            d_fake_error.backward()
            d_optimizer.step()     # Only optimizes D's parameters; changes based on stored gradients from backward()

            dre, dfe = extract(d_real_error)[0], extract(d_fake_error)[0]

        for g_index in range(g_steps):
            # 2. Train G on D's response (but DO NOT train D on these labels)
            G.zero_grad()

            gen_input = Variable(gi_sampler(minibatch_size, g_input_size))
            g_fake_data = G(gen_input)
            dg_fake_decision = D(preprocess(g_fake_data.t()))
            g_error = criterion(dg_fake_decision, torch.ones_like(dg_fake_decision))  # Train G to pretend it's genuine

            g_error.backward()
            g_optimizer.step()  # Only optimizes G's parameters
            ge = extract(g_error)[0]

        if epoch % print_interval == 0:
            print("Epoch %s: D (%s real_err, %s fake_err) G (%s err); Real Dist (%s),  Fake Dist (%s) " %
                  (epoch, dre, dfe, ge, stats(extract(d_real_data)), stats(extract(d_fake_data))))

    print("Plotting the generated distribution...")
    values = extract(g_fake_data)
    print(" Values: %s" % (str(values)))
    plt.hist(values, bins=50)
    plt.xlabel('Value')
    plt.ylabel('Count')
    plt.title('Histogram of Generated Distribution')
    plt.grid(True)
    plt.show()
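The `.detach()` call in step 1B is what keeps the discriminator update from pushing gradients into the generator. Here is a minimal sketch of that effect, using tiny `nn.Linear` layers as hypothetical stand-ins for `G` and `D` and a plain mean as a placeholder loss:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
G = nn.Linear(1, 1)  # stand-in generator
D = nn.Linear(1, 1)  # stand-in discriminator

noise = torch.rand(8, 1)

# Discriminator step: the fake batch is detached from G's graph.
fake = G(noise).detach()
D(fake).mean().backward()
g_grad_after_d_step = G.weight.grad          # stays None: nothing flowed back into G
d_got_grad = D.weight.grad is not None       # D's parameters did receive gradients

# Generator step: no detach, so gradients flow through D into G.
D.zero_grad()
D(G(noise)).mean().backward()
g_got_grad = G.weight.grad is not None

print(g_grad_after_d_step, d_got_grad, g_got_grad)
```

In the generator step, `D` also accumulates gradients, but only `g_optimizer.step()` is called, so `D`'s parameters are left unchanged.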

In [0]:
# helper functions, don't worry about them
def extract(v):
    return v.detach().flatten().tolist()  # flat Python list of the tensor's values

def stats(d):
    return [np.mean(d), np.std(d)]



In [0]:
def get_moments(d):
    # Return the first 4 moments of the data provided
    mean = torch.mean(d)
    diffs = d - mean
    var = torch.mean(torch.pow(diffs, 2.0))
    std = torch.pow(var, 0.5)
    zscores = diffs / std
    skews = torch.mean(torch.pow(zscores, 3.0)) # should be 0
    kurtoses = torch.mean(torch.pow(zscores, 4.0)) - 3.0  # should be 0
    final = torch.cat((mean.reshape(1,), std.reshape(1,), skews.reshape(1,), kurtoses.reshape(1,)))
    return final
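As a sanity check on the "should be 0" comments above, here is a plain-Python sketch that computes the same four statistics for a large Gaussian sample. The function name `moments`, the seed, and the sample size are illustrative assumptions; the formulas mirror `get_moments` without using torch.

```python
import random
from statistics import fmean


def moments(values):
    """Mean, std, skewness and excess kurtosis, computed like get_moments."""
    m = fmean(values)
    diffs = [v - m for v in values]
    var = fmean([d * d for d in diffs])      # population variance
    std = var ** 0.5
    zscores = [d / std for d in diffs]
    skew = fmean([z ** 3 for z in zscores])
    kurt = fmean([z ** 4 for z in zscores]) - 3.0  # excess kurtosis
    return m, std, skew, kurt


rng = random.Random(0)
sample = [rng.gauss(4.0, 1.25) for _ in range(100_000)]
mean_, std_, skew_, kurt_ = moments(sample)
print("mean=%.3f std=%.3f skew=%.3f kurtosis=%.3f" % (mean_, std_, skew_, kurt_))
```

For a true Gaussian the last two values should be near zero, which is why these four numbers give the discriminator a compact summary of how "Gaussian" a minibatch looks.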

In [0]:
def decorate_with_diffs(data, exponent, remove_raw_data=False):
    mean = torch.mean(data.data, 1, keepdim=True)
    mean_broadcast = torch.mul(torch.ones(data.size()), mean.tolist()[0][0])
    diffs = torch.pow(data - Variable(mean_broadcast), exponent)
    if remove_raw_data:
        return torch.cat([diffs], 1)
    else:
        return torch.cat([data, diffs], 1)

In [0]:
#(name, preprocess, d_input_func) = ("Raw data", lambda data: data, lambda x: x)
#(name, preprocess, d_input_func) = ("Data and variances", lambda data: decorate_with_diffs(data, 2.0), lambda x: x * 2)
#(name, preprocess, d_input_func) = ("Data and diffs", lambda data: decorate_with_diffs(data, 1.0), lambda x: x * 2)
(name, preprocess, d_input_func) = ("Only 4 moments", lambda data: get_moments(data), lambda x: 4)

In [0]:
data_mean = 4
data_stddev = 1.25

In [0]:
train()



Epoch 0: D (0.5926645398139954 real_err, 0.5785809755325317 fake_err) G (0.811711311340332 err); Real Dist ([4.020793837428093, 1.2057239547654734]),  Fake Dist ([0.10046711359918117, 0.03323015332170268]) 
Epoch 100: D (0.671867847442627 real_err, 0.6368222236633301 fake_err) G (0.7088874578475952 err); Real Dist ([3.983368607908487, 1.2783110133968822]),  Fake Dist ([6.642817407608033, 1.7451584440179528]) 
Epoch 200: D (0.7550183534622192 real_err, 0.7303162217140198 fake_err) G (0.6406751871109009 err); Real Dist ([3.928498724579811, 1.1893773463249993]),  Fake Dist ([3.621812135219574, 1.6435753700509457]) 
Epoch 300: D (0.6464150547981262 real_err, 0.702130138874054 fake_err) G (0.6871411204338074 err); Real Dist ([4.003934324860573, 1.260248641146983]),  Fake Dist ([4.0868424091339115, 1.4377602182681655]) 
Epoch 400: D (0.6771487593650818 real_err, 0.6674289703369141 fake_err) G (0.6904081702232361 err); Real Dist ([3.9244540418386458, 1.2040759169827442]),  Fake Dist ([3.98519
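The error values in the log above settle around 0.69. One plausible reading is that this is $\ln 2$, the binary cross-entropy of a discriminator that outputs 0.5 and so cannot tell real from fake. The sketch below works through the formula by hand; the helper `bce` is an illustrative single-sample re-implementation, not PyTorch's `nn.BCELoss`.

```python
import math


def bce(p, y):
    """Binary cross-entropy for one predicted probability p and one label y."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))


undecided = bce(0.5, 1.0)        # D outputs 0.5: loss is ln 2, about 0.693
confident_right = bce(0.9, 1.0)  # D is confident and correct: small loss
confident_wrong = bce(0.1, 1.0)  # D is confident and wrong: large loss

print(undecided, confident_right, confident_wrong)
```

When both of the discriminator's errors and the generator's error hover near this value, neither side is clearly winning, which is roughly what we hope to see once the generated distribution matches the real one.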

Adapted from https://github.com/devnag/pytorch-generative-adversarial-networks/blob/master/gan_pytorch.py