# Bayesian Normal Density

This notebook illustrates how to use a Bayesian Normal density model with the [beer framework](https://github.com/beer-asr/beer). The Normal distribution is a fairly basic model, but it is used extensively as a building block in other models.

In [1]:
# Add "beer" to the PYTHONPATH
import sys
sys.path.insert(0, '../')

import beer
import numpy as np
import torch

# For plotting.
from bokeh.io import show, output_notebook
from bokeh.plotting import figure, gridplot
output_notebook()

# Convenience functions for plotting.
import plotting

%load_ext autoreload
%autoreload 2

## Data

Generate some normally distributed data:

In [2]:
mean = np.array([-1.5, 4]) 
cov = np.array([
    [2, 1],
    [1, .75]
])
data = np.random.multivariate_normal(mean, cov, size=100)

fig = figure(
    title='Data',
    width=400,
    height=400,
    x_range=(mean[0] - 5, mean[0] + 5),
    y_range=(mean[1] - 5, mean[1] + 5)
)
fig.circle(data[:, 0], data[:, 1])
plotting.plot_normal(fig, mean, cov)

show(fig)
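As a side note on how such samples can be produced, the following is a minimal sketch (plain NumPy, independent of beer) that draws from the same distribution using the Cholesky factor of the covariance matrix. The names `chol`, `z` and `samples` are illustrative only, and the large sample size is chosen just so that the empirical moments land close to the true ones.

```python
import numpy as np

rng = np.random.default_rng(0)
mean = np.array([-1.5, 4.0])
cov = np.array([[2.0, 1.0], [1.0, 0.75]])

# The Cholesky factorization only succeeds for a symmetric positive
# definite matrix, so it doubles as a validity check on `cov`.
chol = np.linalg.cholesky(cov)

# If z ~ N(0, I), then mean + chol @ z ~ N(mean, chol @ chol.T) = N(mean, cov).
z = rng.standard_normal(size=(100_000, 2))
samples = mean + z @ chol.T

sample_mean = samples.mean(axis=0)
sample_cov = np.cov(samples, rowvar=False)
print(sample_mean)
print(sample_cov)
```

With 100 points, as in the cell above, the empirical covariance is noticeably noisier than with the 100,000 points used in this sketch.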

## Model Creation

We create two types of Normal distribution: one with a diagonal covariance matrix and one with a full covariance matrix.

In [3]:
normal_diag = beer.NormalDiagonalCovariance.create(torch.ones(2), torch.eye(2), prior_count=1e-3)
normal_full = beer.NormalFullCovariance.create(torch.zeros(2), torch.eye(2), prior_count=1e-3)
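To make the difference between the two parametrizations concrete, here is a small sketch that does not use beer at all: it computes the maximum-likelihood covariance estimates under each constraint with plain NumPy. The variable names are illustrative, and this is a non-Bayesian simplification of what the two models can represent.

```python
import numpy as np

rng = np.random.default_rng(0)
true_cov = np.array([[2.0, 1.0], [1.0, 0.75]])
data = rng.multivariate_normal([-1.5, 4.0], true_cov, size=100)

# Full covariance: 2 mean + 3 covariance parameters in two dimensions.
full_cov = np.cov(data, rowvar=False, bias=True)

# Diagonal covariance: 2 mean + 2 variance parameters. The per-dimension
# variances are kept, the correlation between dimensions is dropped.
diag_cov = np.diag(data.var(axis=0))

print(full_cov)
print(diag_cov)
```

Both estimates agree on the diagonal; only the full model can capture the tilt of the data cloud.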

## Variational Bayes Training 

In [4]:
# Training.
beer.train_loglinear_model(normal_diag, torch.from_numpy(data).float())
beer.train_loglinear_model(normal_full, torch.from_numpy(data).float())

fig = figure(
    title='Trained model',
    width=400,
    height=400,
    x_range=(mean[0] - 5, mean[0] + 5),
    y_range=(mean[1] - 5, mean[1] + 5)
)
fig.circle(data[:, 0], data[:, 1])
plotting.plot_normal(fig, normal_diag.mean.numpy(), normal_diag.cov.numpy(), alpha=.5, color='red')
plotting.plot_normal(fig, normal_full.mean.numpy(), normal_full.cov.numpy(), alpha=.5, color='green')

show(fig)


In [5]:
normal_full.cov


 2.0152  1.0291
 1.0291  0.8246
[torch.FloatTensor of size (2,2)]

## Model Comparison

We generate data for various correlation parameters:

$$
X_{\lambda} \sim \mathcal{N}(
    \begin{pmatrix} 
    0 \\
    0
    \end{pmatrix}, 
    \begin{pmatrix} 
    1 & \lambda \\
    \lambda & 1
    \end{pmatrix})
$$

and, for each value of $\lambda$, we compare the evidence of the Normal model with a full covariance matrix against that of the Normal model with a diagonal covariance matrix through the log Bayes factor:

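$$
\ln B_{\lambda} = \ln \frac{p(X_\lambda | \mathcal{M}_{\text{full}})}{p(X_\lambda | \mathcal{M}_{\text{diag}})} =
     \ln \frac{\int_{\theta} p(X_\lambda | \theta, \mathcal{M}_{\text{full}}) p(\theta | \mathcal{M}_{\text{full}}) d\theta}{\int_{\theta}p(X_\lambda | \theta, \mathcal{M}_{\text{diag}})p(\theta | \mathcal{M}_{\text{diag}}) d\theta} = \left[ A_{\text{full}}(\xi + \sum_{n=1}^N T(x_n)) - A_{\text{full}}(\xi) \right]
     - \left[ A_{\text{diag}}(\xi + \sum_{n=1}^N T(x_n)) - A_{\text{diag}}(\xi) \right]
$$

Here $A$ is the log-normalizer of the conjugate prior, $\xi$ its natural parameters and $T(x_n)$ the sufficient statistics of the data, each taken for the corresponding model. The base measure of the Normal likelihood is the same for both models and cancels in the ratio.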
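The log Bayes factor can also be cross-checked in closed form without beer. The sketch below assumes a textbook Normal-inverse-Wishart prior for the full model and an independent one-dimensional prior of the same family for each dimension of the diagonal model. The hyper-parameters (`kappa0`, `nu0`, `S0`, `s0`) and the function names are hypothetical choices for illustration and are not claimed to match beer's parametrization or its `prior_count` argument.

```python
import numpy as np
from scipy.special import multigammaln


def log_evidence_full(X, m0, kappa0, nu0, S0):
    """Log marginal likelihood of X under a Normal-inverse-Wishart prior
    (textbook parametrization, assumed here for illustration)."""
    N, D = X.shape
    xbar = X.mean(axis=0)
    centered = X - xbar
    scatter = centered.T @ centered
    # Conjugate posterior hyper-parameters.
    kappa_n = kappa0 + N
    nu_n = nu0 + N
    diff = (xbar - m0)[:, None]
    S_n = S0 + scatter + (kappa0 * N / kappa_n) * (diff @ diff.T)
    _, logdet_0 = np.linalg.slogdet(S0)
    _, logdet_n = np.linalg.slogdet(S_n)
    # Ratio of posterior to prior normalizers, times the base measure.
    return (
        -0.5 * N * D * np.log(np.pi)
        + multigammaln(0.5 * nu_n, D) - multigammaln(0.5 * nu0, D)
        + 0.5 * nu0 * logdet_0 - 0.5 * nu_n * logdet_n
        + 0.5 * D * (np.log(kappa0) - np.log(kappa_n))
    )


def log_evidence_diag(X, m0, kappa0, nu0, s0):
    """Diagonal model: sum of independent one-dimensional log evidences."""
    return sum(
        log_evidence_full(X[:, [d]], m0[[d]], kappa0, nu0, s0[d] * np.eye(1))
        for d in range(X.shape[1])
    )


def log_bayes_factor(lam, rng, n_samples=1000):
    """Sample data with correlation `lam` and compare full vs. diagonal."""
    cov = np.array([[1.0, lam], [lam, 1.0]])
    X = rng.multivariate_normal(np.zeros(2), cov, size=n_samples)
    m0 = np.zeros(2)
    full = log_evidence_full(X, m0, kappa0=1e-3, nu0=3.0, S0=np.eye(2))
    diag = log_evidence_diag(X, m0, kappa0=1e-3, nu0=2.0, s0=np.ones(2))
    return full - diag


rng = np.random.default_rng(0)
log_bf_uncorrelated = log_bayes_factor(0.0, rng)
log_bf_correlated = log_bayes_factor(0.9, rng)
print(log_bf_uncorrelated, log_bf_correlated)
```

For strongly correlated data the full-covariance model should win by a large margin, while near $\lambda = 0$ the two evidences are close and the sign depends on the chosen priors.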

In [7]:
from scipy.special import logsumexp
import copy

lambdas = np.linspace(-.99, .99, 100)
lBs = []
    
# For each value of lambda.
for l in lambdas:
    
    # Generate the data.
    cov = np.array([
        [1, l],
        [l, 1]
    ])
    X = np.random.multivariate_normal(np.zeros(2), cov, size=1000)
    X = torch.from_numpy(X).float()
    
    # Fit both models
    normal_diag = beer.NormalDiagonalCovariance.create(torch.zeros(2), torch.eye(2), prior_count=1e-3)
    beer.train_loglinear_model(normal_diag, X)
    normal = beer.NormalFullCovariance.create(torch.zeros(2), torch.eye(2), prior_count=1e-3)
    beer.train_loglinear_model(normal, X)
    
    # Compute the log Bayes factor.
    llh_M1 = normal.posterior.log_norm - normal.prior.log_norm
    llh_M2 = normal_diag.posterior.log_norm - normal_diag.prior.log_norm
    lBs.append((llh_M1 - llh_M2))
    
lBs = np.array(lBs)
fig1 = figure(
    title='Model Comparison',
    x_axis_label='λ',
    y_axis_label='log Bayes factor',
    width=400,
    height=400
)
fig1.line(lambdas, lBs)

show(fig1)

In [8]:
np.random.multivariate_normal(np.zeros(2), np.array([[2., 1.], [1., 2]]))

array([-0.86515736, -1.05361082])

In [9]:
cov = np.array([[2., 1.], [1., 2]])
np.linalg.eigh(cov)

(array([1., 3.]), array([[-0.70710678,  0.70710678],
        [ 0.70710678,  0.70710678]]))