### Rapid-prototype your own sampler

We now implement a second-order Geometric Langevin Algorithm (GLA2) sampler.

This shows how the `simulation` interface serves as a lightweight layer that lends itself to rapid prototyping of samplers on loss manifolds.

In [1]:
import sys
sys.path.insert(1, '/home/heber/packages/TATi-unstable/lib/python3.5/site-packages')


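The GLA2 sampler consists of the following steps, where $\lambda$ is the step width, $\alpha = \exp(-\gamma \lambda)$ with friction constant $\gamma$, $\beta$ is the inverse temperature, and $\eta_n$ is standard normal noise:

1. $p_{n+\tfrac 1 2} = p_n - \tfrac{\lambda}{2} \nabla_x L(x_n)$
2. $x_{n+1} = x_n + \lambda p_{n+\tfrac 1 2}$
3. $\widehat{p}_{n+1} = p_{n+\tfrac 1 2} - \tfrac{\lambda}{2} \nabla_x L(x_{n+1})$
4. $p_{n+1} = \alpha \widehat{p}_{n+1} + \sqrt{\frac{1-\alpha^2}{\beta}} \cdot \eta_n$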

Now we implement the GLA2 sampler in plain Python code. It acts on a given set of `parameters` and `momenta`, using gradients obtained via a `gradients()` call.

In [3]:
import math
import numpy as np

def gla2_update_step(nn, momenta, old_gradients, step_width, beta, gamma):
    """Implementation of GLA2 update step using TATi's simulation interface.
    
    Note:
        Parameters are contained inside nn. For momenta we use
        python variables as the evaluation of the loss does not
        depend on them.

    Args:
      nn: ref to tati simulation instance
      momenta: numpy array of momenta, one per parameter
      old_gradients: gradients evaluated at last step
      step_width: step width for sampling step
      beta: inverse temperature
      gamma: friction constant

    Returns:
      updated gradients and momenta

    """

    # 1. p_{n+\tfrac 1 2} = p_n - \tfrac {\lambda}{2} \nabla_x L(x_n)
    momenta -= .5*step_width * old_gradients

    # 2. x_{n+1} = x_n + \lambda p_{n+\tfrac 1 2}
    nn.parameters = nn.parameters + step_width * momenta

    # \nabla_x L(x_{n+1})
    gradients = nn.gradients()

    # 3. \widehat{p}_{n+1} = p_{n+\tfrac 1 2} - \tfrac {\lambda}{2} \nabla_x L(x_{n+1})
    momenta -= .5*step_width * gradients

    # 4. p_{n+1} = \alpha \widehat{p}_{n+1} + \sqrt{\frac{1-\alpha^2}{\beta}} \cdot \eta_n
    alpha = math.exp(-gamma*step_width)
    momenta = alpha * momenta + \
              math.sqrt((1.-math.pow(alpha,2.))/beta) * np.random.standard_normal(momenta.shape)

    return gradients, momenta


The function `gla2_update_step()` performs one integration step of GLA2.
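
To check the update step without a TATi installation, one can run the same four steps against a small stand-in object. The sketch below makes several assumptions: `QuadraticModel` is a hypothetical class (not part of TATi) that exposes only a `parameters` attribute and a `gradients()` method, the loss is $L(x) = \tfrac 1 2 |x|^2$, and `gla2_step()` is a re-statement of the update that takes an explicit random generator. For this loss the sampled parameters should have a variance close to $1/\beta$, up to a small error from the finite step width.

```python
import math
import numpy as np

class QuadraticModel:
    """Hypothetical stand-in for the simulation interface (not part of TATi)."""

    def __init__(self, num_parameters):
        self.parameters = np.zeros(num_parameters)

    def gradients(self):
        # gradient of the assumed loss L(x) = 0.5 * |x|^2
        return self.parameters.copy()

def gla2_step(model, momenta, old_gradients, step_width, beta, gamma, rng):
    """Same four steps as gla2_update_step(), without in-place updates."""
    # 1. first half kick with the gradients from the previous position
    momenta = momenta - 0.5 * step_width * old_gradients
    # 2. full position update
    model.parameters = model.parameters + step_width * momenta
    gradients = model.gradients()
    # 3. second half kick with the gradients at the new position
    momenta = momenta - 0.5 * step_width * gradients
    # 4. friction and noise
    alpha = math.exp(-gamma * step_width)
    noise = rng.standard_normal(momenta.shape)
    momenta = alpha * momenta + math.sqrt((1.0 - alpha**2) / beta) * noise
    return gradients, momenta

rng = np.random.default_rng(426)
model = QuadraticModel(num_parameters=3)
momenta = np.zeros(3)
gradients = model.gradients()
beta = 1e3

samples = []
for _ in range(50000):
    gradients, momenta = gla2_step(
        model, momenta, gradients,
        step_width=1e-1, beta=beta, gamma=10.0, rng=rng)
    samples.append(model.parameters.copy())
samples = np.array(samples)

# discard the first steps as burn-in, then compare against 1/beta
empirical_variance = np.mean(samples[1000:] ** 2)
print("empirical variance:", empirical_variance, "target about:", 1.0 / beta)
```

A quadratic loss is chosen here only because its target variance is known in closed form, which makes the comparison easy to read off.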

Next, we need to instantiate the interface, handing over the options and thereby defining the neural network.

In [4]:
import TATi.simulation as tati
nn = tati(
    batch_data_files=["dataset-twoclusters.csv"],
    batch_size=5,
    seed=426,
)
print(nn.num_parameters())

3


Note that we set a `batch_size` smaller than the dataset size. We use this to illustrate a point later on.

Before the iteration loop, we define some parameters needed by the GLA2 sampler.

In [16]:
gamma = 10
beta = 1e3
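
To get a feeling for these values, the short sketch below evaluates the two coefficients that appear in the last line of `gla2_update_step()`. It assumes the step width of `1e-1` that is passed in the sampling loop of this section. With these numbers, `alpha` is roughly 0.37 and the noise amplitude roughly 0.029, so each step keeps about a third of the previous momentum and adds a comparatively small random kick.

```python
import math

gamma = 10         # friction constant, as above
beta = 1e3         # inverse temperature, as above
step_width = 1e-1  # assumed: the value used in the sampling loop

# damping factor applied to the momenta in each step
alpha = math.exp(-gamma * step_width)
# standard deviation of the noise added to each momentum component
noise_scale = math.sqrt((1.0 - alpha**2) / beta)

print("alpha       = %.4f" % alpha)
print("noise scale = %.4f" % noise_scale)
```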

Moreover, we need temporary storage for the momentum and for the gradients.

In [27]:
momenta = np.zeros((nn.num_parameters()))
old_gradients = np.zeros((nn.num_parameters()))

As the sampler needs a source of random noise, we use `numpy`'s random number generator. For reproducible runs, we fix its seed.

In [28]:
np.random.seed(426)
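
As a minimal illustration of what fixing the seed achieves, the sketch below draws the same kind of noise twice and resets the seed in between. The variable names `first` and `second` are only for illustration. Note that this affects only `numpy`'s global generator. The `seed=426` passed to the interface earlier is a separate setting.

```python
import numpy as np

np.random.seed(426)
first = np.random.standard_normal(3)

np.random.seed(426)  # reset the global generator to the same state
second = np.random.standard_normal(3)

# both draws are identical because they start from the same seed
print(np.array_equal(first, second))
```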

Finally, we set the neural network's `parameters` to the location of the minimum found during training. Then we proceed with the actual sampling iteration, which calls `gla2_update_step()` and prints the loss, parameters, and gradients.

In [29]:
nn.parameters = np.array([0.14637233, 0.32722256, -0.045677684])
for i in range(10):
    old_gradients, momenta = gla2_update_step(
        nn, momenta, old_gradients, step_width=1e-1, beta=beta, gamma=gamma)
    print("Step #"+str(i)+": "+str(nn.loss())+" at " \
        +str(nn.parameters)+", gradients "+str(old_gradients))


Step #0: 0.055072468 at [0.14637233, 0.32722256, -0.045677684], gradients [-0.4977227  -0.5074096   0.01946941]
Step #1: 0.05215037 at [0.14822862, 0.3330697, -0.051350962], gradients [-0.47487277 -0.48493218  0.01374783]
Step #2: 0.046247073 at [0.15635721, 0.3414384, -0.05003026], gradients [-0.43204674 -0.441189    0.01249683]
Step #3: 0.044368546 at [0.15870395, 0.34475836, -0.05025943], gradients [-0.4180484  -0.4269408   0.01165654]
Step #4: 0.042139463 at [0.16248181, 0.34795862, -0.048004877], gradients [-0.40186507 -0.4101999   0.01238546]
Step #5: 0.04139779 at [0.16116455, 0.35154322, -0.04999301], gradients [-0.39624512 -0.40452045  0.01101012]
Step #6: 0.039960828 at [0.16130517, 0.35606495, -0.051266517], gradients [-0.38541412 -0.39347222  0.0098615 ]
Step #7: 0.03677634 at [0.16559826, 0.36285028, -0.04894313], gradients [-0.3616403  -0.36890057  0.01036192]
Step #8: 0.036284395 at [0.16536659, 0.3648386, -0.049754377], gradients [-0.35780987 -0.3650126   0.00977268]
St