# Analysis of the posterior distribution of a dataset with known $\sigma^2$ and unknown $\mu$

We assume a dataset $X\sim\mathcal{N}(\mu, \sigma^2)$ with known $\sigma^2$ and unknown $\mu$. The goal is to estimate $\mu$ or, more precisely, to find its distribution given the data.

Since we do not have information about $\mu$, we place a _prior_ distribution over it, taking into account that we know the distribution of the dataset $X$. For mathematical convenience we choose a conjugate prior, meaning that the posterior belongs to the same distribution family as the prior. For a Gaussian likelihood with known variance, the conjugate prior for the mean is itself Gaussian, so we assume $\mu \sim \mathcal{N}(\mu_0, \sigma^2_0)$.

It can be shown that, after observing $N$ data points with sample mean $\bar x$, the posterior distribution of $\mu$, denoted $\mu_N$, is of the form

$$
    \mu_N \sim \mathcal{N}\left(\frac{\sigma^2\mu_0 + N\sigma_0^2\bar x}{\sigma^2 + N\sigma_0^2}, \left(\frac{1}{\sigma_0^2} + \frac{N}{\sigma^2} \right)^{-1}\right)
$$
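
The closed-form posterior above can be evaluated directly for a whole batch of observations. The following is a minimal sketch, not part of the original notebook: the function name `batch_posterior` and the argument names `var0` (prior variance) and `var` (known data variance) are illustrative choices, and it assumes at least one observation.

```python
import numpy as np

def batch_posterior(X, mu0, var0, var):
    """Closed-form posterior N(mu_N, var_N) of the mean for known data variance `var`."""
    X = np.asarray(X, dtype=float)
    n = X.size          # number of observations N
    xbar = X.mean()     # sample mean
    mu_n = (var * mu0 + n * var0 * xbar) / (var + n * var0)
    var_n = 1.0 / (1.0 / var0 + n / var)
    return mu_n, var_n

# Four identical observations at 2 with a N(0, 1) prior and unit data variance
mu_n, var_n = batch_posterior([2.0, 2.0, 2.0, 2.0], mu0=0.0, var0=1.0, var=1.0)
print(mu_n, var_n)
```

In this toy case the posterior mean is pulled from the prior mean 0 most of the way towards the sample mean 2 (to about 1.6), and the posterior variance drops from 1 to about 0.2.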

In [1]:
import plotly.graph_objs as go
from plotly.offline import iplot
import numpy as np
from scipy.stats import norm

In [2]:
mu_0, sigma_0 = 0, 1
xrange = np.linspace(-3, 3, 300)
yrange = norm.pdf(xrange, loc=mu_0, scale=sigma_0)

In [3]:
data = [
    {"type": "scatter", "x": xrange, "y": yrange}
]

fig = go.FigureWidget(data=data)
iplot(fig)

Suppose a stream of data reaches our system one observation at a time. We know that the data arriving at our system is Gaussian distributed with $\sigma^2=0.8$ and unknown $\mu$.

In order to estimate $\mu$, we assume the prior distribution of $\mu$ to be Gaussian with mean $\mu_0=0$ and variance $\sigma_0^2 = 1$. In this example, $\mu_0$ is our initial estimate of $\mu$ and $\sigma_0^2$ expresses how uncertain we are about that estimate.
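
When observations arrive one at a time, the posterior formula can be applied with $N=1$ and $\bar x = x$, the single new observation. Writing $\mu_1$ and $\sigma_1^2$ for the updated parameters (labels introduced here only for illustration), this gives

$$
    \mu_1 = \frac{\sigma^2\mu_0 + \sigma_0^2 x}{\sigma^2 + \sigma_0^2}, \qquad
    \sigma_1^2 = \left(\frac{1}{\sigma_0^2} + \frac{1}{\sigma^2}\right)^{-1}
$$

The posterior $\mathcal{N}(\mu_1, \sigma_1^2)$ then plays the role of the prior for the next observation, and the same two expressions are applied again.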

In [81]:
def gaussian_learning(X, mu0, sigma0, sigma):
    """
    Given a dataset X ~ N(mu, sigma) with known variance sigma
    and unknown mean mu, estimate the parameter mu using
    a Gaussian prior mu ~ N(mu0, sigma0).
    ** Observations are assumed to arrive one at a time **
    
    Parameters
    ----------
    X: numpy.array
        Observations seen over time
    mu0: float
        initial prior mean
    sigma0: float
        initial prior variance
    sigma: float
        known variance of the distribution
        
    Returns
    -------
    dict: Dictionary with keys 'mu' and 'sigma', each a list of
          the learned posterior mean and variance over time
    """
    priors = {
        "mu": [mu0],
        "sigma": [sigma0]
    }
    
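    for obs in X:
        # Posterior after a single new observation (N = 1). The mean is
        # updated first because it uses the previous prior variance.
        mu0 = (sigma * mu0 + sigma0 * obs) / (sigma + sigma0)
        sigma0 = 1 / (1 / sigma0 + 1 / sigma)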
        
        priors["mu"].append(mu0)
        priors["sigma"].append(sigma0)
    
    return priors

def plot_gaussian_learning(priors):
    xvals = np.ones_like(priors["mu"]).cumsum()
    data = [
        {"type": "scatter", "x": xvals, "y": priors["mu"], "name": "mu0"},
        {"type": "scatter", "x": xvals, "y": priors["sigma"], "name": "sigma0", "yaxis": "y2"}
    ]

    layout = {
        "yaxis": {
            "title": "mu"
        },

        "yaxis2": {
            "title": "sigma",
            "side": "right",
            "overlaying": "y"
        }
    }

    fig = go.Figure(data=data, layout=layout)
    
    return fig

def animate_learned_gaussian(priors, xmin, xmax):
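    # Pair up (mean, variance) per step; the first pair is the initial prior
    (mu0, sigma0), *dataset = list(zip(priors["mu"], priors["sigma"]))
    # Evaluate the densities on the requested range instead of the global `xrange`
    xvals = np.linspace(xmin, xmax, 300)

    # norm.pdf expects a standard deviation, so take the square root of the variance
    data = [{"type": "scatter", "x": xvals, "y": norm.pdf(xvals, loc=mu0, scale=np.sqrt(sigma0))}]
    layout = {
        "xaxis": {
            "range": [xmin, xmax], "autorange": False,
        },
        "yaxis": {
            "range": [0, 10], "autorange": False
        },
        "updatemenus": [{
            "type": "buttons",
            "buttons": [{
                "label": "Learn",
                "method": "animate",
                "args": [None]
            }]
        }]
    }

    # One frame per observation, showing the posterior density after that update
    frames = [
        {"data": [{"type": "scatter", "x": xvals, "y": norm.pdf(xvals, loc=mu, scale=np.sqrt(sigma))}],
         "layout": {"title": f"mu={mu:0.2f} sigma^2={sigma:0.2f}"}}
    for mu, sigma in dataset]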

    fig = go.Figure(data=data, layout=layout, frames=frames)
    
    return fig

In [82]:
nobs = 20
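mu, sigma = 2, 1  # true (unknown) mean and known variance of the stream
mu_0, sigma_0 = 0, 1  # mean and variance of the initial prior
# randn has unit variance, so scale by the standard deviation sqrt(sigma)
stream = np.random.randn(nobs) * np.sqrt(sigma) + mu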
priors = gaussian_learning(stream, mu_0, sigma_0, sigma)

In [83]:
iplot(plot_gaussian_learning(priors))
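
As a sanity check, updating the posterior one observation at a time should give the same result as applying the closed-form posterior to the whole stream at once. The sketch below is self-contained and illustrative: `sequential_posterior`, the seed, and the variable names are assumptions made for this example, not part of the notebook's own code.

```python
import numpy as np

def sequential_posterior(X, mu0, var0, var):
    """Apply the N = 1 posterior update once per observation."""
    for obs in X:
        mu0 = (var * mu0 + var0 * obs) / (var + var0)  # uses the old var0
        var0 = 1.0 / (1.0 / var0 + 1.0 / var)
    return mu0, var0

rng = np.random.default_rng(0)
X = rng.normal(loc=2.0, scale=1.0, size=20)
mu0, var0, var = 0.0, 1.0, 1.0

# Closed-form posterior using all N observations at once
n, xbar = X.size, X.mean()
batch = ((var * mu0 + n * var0 * xbar) / (var + n * var0),
         1.0 / (1.0 / var0 + n / var))

seq = sequential_posterior(X, mu0, var0, var)
print(np.allclose(seq, batch))  # the two routes agree up to rounding
```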

In [84]:
fig = animate_learned_gaussian(priors, xmin=0, xmax=3)
iplot(fig)

What happens if the mean of the distribution changes partway through the stream?

In [85]:
nobs = 20
# Parameters of our initial prior distribution
mu_0, sigma_0 = 0, 1
sigma = 0.8  # our known variance
mu0, mu1 = 2, -2  # our *unknown* means, before and after the change
# randn has unit variance, so scale by the standard deviation sqrt(sigma)
stream0 = np.random.randn(nobs) * np.sqrt(sigma) + mu0
stream1 = np.random.randn(nobs) * np.sqrt(sigma) + mu1
stream = np.r_[stream0, stream1]

priors = gaussian_learning(stream, mu_0, sigma_0, sigma)
fig = plot_gaussian_learning(priors)
iplot(fig)

In [86]:
fig = animate_learned_gaussian(priors, xmin=-2, xmax=2)
iplot(fig)
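
One way to reason about this experiment is to rewrite the single-observation update as $\mu_0 + K(x - \mu_0)$ with $K = \sigma_0^2 / (\sigma_0^2 + \sigma^2)$, so $K$ is the weight given to each new observation. Because the posterior variance only ever shrinks, $K$ shrinks too, and under this model a change in the mean late in the stream is followed more slowly than one at the start. The sketch below computes these weights for the prior and known variance used above; `update_weights` is a hypothetical helper introduced for illustration.

```python
import numpy as np

def update_weights(n_obs, var0, var):
    """Weight K placed on each successive observation by the sequential update."""
    weights = []
    for _ in range(n_obs):
        weights.append(var0 / (var0 + var))      # K for the incoming observation
        var0 = 1.0 / (1.0 / var0 + 1.0 / var)    # posterior variance becomes the new prior
    return np.array(weights)

weights = update_weights(40, var0=1.0, var=0.8)
# Weight on the first observation versus the first observation after the change
print(weights[0], weights[20])
```

With these settings the first observation receives a weight of roughly 0.56, while the first observation after the change (the 21st overall) receives roughly 0.05.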