Instructions:

Implement Bayesian inference
1) Randomly generate n (2D) samples from an MVN with mean [-1, 1] and covariance [2, 1.3; 1.3, 4]
2) Use Gibbs sampling to infer the unknown parameters: mean and covariance

In [4]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import random

from scipy.stats import invwishart

# 1 Generate Data from MVN

In [5]:
random_seed = 123
rng = np.random.default_rng(random_seed)

In [6]:
μ = np.array([-1, 1])
Σ = np.array([[2, 1.3], [1.3, 4]])
# Draw from the seeded generator so the data are reproducible
X = rng.multivariate_normal(μ, Σ, size=1000)
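
As a quick sanity check on the generated data, the sample mean and sample covariance should land close to the true parameters. The following is a minimal sketch that regenerates the data with its own seeded generator; the names `rng_check`, `X_check`, `mu_true` and `Sigma_true` are illustrative and not part of the notebook.

```python
import numpy as np

# Illustrative names; the seed mirrors the notebook's random_seed.
rng_check = np.random.default_rng(123)
mu_true = np.array([-1.0, 1.0])
Sigma_true = np.array([[2.0, 1.3], [1.3, 4.0]])
X_check = rng_check.multivariate_normal(mu_true, Sigma_true, size=1000)

sample_mean = X_check.mean(axis=0)
sample_cov = np.cov(X_check, rowvar=False)  # rows are observations

print(sample_mean)
print(sample_cov)
```

With 1000 samples the estimates should typically agree with the true values to within about a tenth for the mean and a few tenths for the covariance entries.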

# 2 Inference

## 2.1 Semi Conjugate Prior

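Assume a semi-conjugate prior, in which $\mu$ and $\Sigma$ are independent a priori:

$
\qquad p(\mu, \Sigma) = p(\mu)p(\Sigma)
$

where

$
\qquad \mu \sim \mathcal{N}(\mathbf{m}_0, \mathbf{V}_0)
$  
$
\qquad \Sigma \sim \mathcal{IW}(\mathbf{S}_0, \nu_0)
$

In the code below, $\mathbf{m}_0$, $\mathbf{V}_0$, $\mathbf{S}_0$ and $\nu_0$ correspond to `μ_0`, `V_0`, `Ψ_0` and `ν_0`.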

Conditional Posterior for $\mu$ (Murphy 128)  
$$\begin{align}
    p(\mu|D, \Sigma) &= \mathcal{N}(\mathbf{m}_N, \mathbf{V}_N)\\
    \mathbf{V}_N &= (\mathbf{V}_0^{-1} + N\Sigma^{-1})^{-1}\\
    \mathbf{m}_N &= \mathbf{V}_N(\Sigma^{-1}(N\bar{\mathbf{x}}) + \mathbf{V}_0^{-1}\mathbf{m}_0)
\end{align}$$  

Conditional Posterior for $\Sigma$ (Murphy 129)
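$$\begin{align}
    p(\Sigma | D, \mu) &= \mathcal{IW}(\Sigma|\mathbf{S}_N, \nu_N)\\
    \nu_N &= \nu_0 + N\\
    \mathbf{S}_N &= \mathbf{S}_0 + \mathbf{S}_{\mu}\\
    \mathbf{S}_{\mu} &= \sum_{i=1}^{N} (\mathbf{x}_i - \mu)(\mathbf{x}_i - \mu)^T
\end{align}$$

Here $\mathbf{S}_N$ is the scale matrix in the parameterization used by `scipy.stats.invwishart`, so the update adds the scatter matrix directly and no inverse appears.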
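
The conditional posterior for $\mu$ has a simple limit: as the prior covariance $\mathbf{V}_0$ grows, $\mathbf{V}_0^{-1} \to 0$, so $\mathbf{V}_N \to \Sigma / N$ and $\mathbf{m}_N \to \bar{\mathbf{x}}$. The sketch below checks this numerically. The helper `mu_conditional` and the sample mean `x_bar` are made up for illustration; they are not part of the notebook.

```python
import numpy as np

def mu_conditional(x_bar, n, Sigma, m_0, V_0):
    """Return (m_N, V_N) of p(mu | D, Sigma) under a N(m_0, V_0) prior."""
    Sigma_inv = np.linalg.inv(Sigma)
    V_0_inv = np.linalg.inv(V_0)
    V_N = np.linalg.inv(V_0_inv + n * Sigma_inv)
    m_N = V_N @ (Sigma_inv @ (n * x_bar) + V_0_inv @ m_0)
    return m_N, V_N

Sigma = np.array([[2.0, 1.3], [1.3, 4.0]])
x_bar = np.array([-0.9, 1.1])  # made-up sample mean, for illustration only
n = 1000

# A very diffuse prior, similar in spirit to V_0 = 1e10 * I
m_N, V_N = mu_conditional(x_bar, n, Sigma, m_0=np.zeros(2), V_0=1e10 * np.eye(2))

print(np.allclose(m_N, x_bar))      # posterior mean matches the sample mean
print(np.allclose(V_N, Sigma / n))  # posterior covariance matches Sigma / n
```

This is why a very large `V_0` acts as an effectively uninformative prior on the mean.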

In [1]:
def gibbs(X, num_iters, num_burn):
    n, d = X.shape
    x_bar = X.mean(axis=0)

    μ_0 = np.zeros(d)
    V_0 = np.eye(d) * 1e10  # uninformative prior on mu
    ν_0 = d + 2
    Ψ_0 = np.eye(d)

    μ_samples = np.zeros((num_iters + num_burn, d))
    Σ_samples = np.zeros((num_iters + num_burn, d, d))

    # Initialize at the sample estimates
    μ = x_bar.copy()
    Σ = np.cov(X.T)

    # The prior precision is constant, so invert it once outside the loop
    V_0_inv = np.linalg.inv(V_0)

    for i in range(num_iters + num_burn):
        # Sample μ | Σ, X ~ N(μ_n, V_n)
        Σ_inv = np.linalg.inv(Σ)
        V_n = np.linalg.inv(V_0_inv + n * Σ_inv)
        μ_n = V_n @ ((Σ_inv @ (n * x_bar)) + (V_0_inv @ μ_0))
        μ = np.random.multivariate_normal(mean=μ_n, cov=V_n)

        # Sample Σ | μ, X ~ IW(Ψ_n, ν_n)
        Ψ_μ = X - μ  # residuals about the current μ, shape (n, d)
        Ψ_n = Ψ_0 + Ψ_μ.T @ Ψ_μ
        ν_n = ν_0 + n
        Σ = invwishart.rvs(df=ν_n, scale=Ψ_n)

        # save
        μ_samples[i] = μ
        Σ_samples[i] = Σ

    return μ_samples[num_burn:], Σ_samples[num_burn:]


In [7]:
mus, sigmas = gibbs(X, num_iters=10000, num_burn=2500)

In [8]:
print(mus.mean(axis=0), sigmas.mean(axis=0))

[-0.9845891  1.0000674] [[2.08023097 1.33088953]
 [1.33088953 4.0109295 ]]
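
The posterior means above are point summaries. The same draws can also be summarized with credible intervals. The sketch below is one way to do this, assuming an equal-tailed interval is wanted. The helper `credible_interval` is hypothetical, and `fake_mus` is stand-in data used only so the snippet runs on its own; in the notebook the function would be applied to `mus` or `sigmas`.

```python
import numpy as np

def credible_interval(samples, level=0.95):
    """Equal-tailed interval along axis 0 of an array of posterior draws."""
    tail = (1.0 - level) / 2.0
    lower = np.quantile(samples, tail, axis=0)
    upper = np.quantile(samples, 1.0 - tail, axis=0)
    return lower, upper

# Stand-in draws for illustration only; these are NOT the Gibbs output.
rng_demo = np.random.default_rng(0)
fake_mus = rng_demo.normal(loc=[-1.0, 1.0], scale=[0.045, 0.063], size=(10000, 2))

lower, upper = credible_interval(fake_mus)
print(lower, upper)
```

Because the quantiles are taken along the first axis, the same helper returns a `(d, d)` pair of bounds when given the `(num_iters, d, d)` array of covariance draws.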


## 2.2 Fully Conjugate Prior

Assume a fully conjugate normal-inverse-Wishart (NIW) prior, in which the prior on $\mu$ depends on $\Sigma$:
$$\begin{align}
    p(\boldsymbol{\mu}, \boldsymbol{\Sigma}) = p(\boldsymbol{\mu} | \boldsymbol{\Sigma})p(\boldsymbol{\Sigma}) = \mathcal{NIW}(\mu,\Sigma | \mathbf{m}_0, \kappa_0, \nu_0, \mathbf{S}_0)
\end{align}$$

Joint Posterior

$$\begin{align}
    p(\mu,\Sigma | D) &= \mathcal{N}(\mu|\mathbf{m}_N, \frac{1}{\kappa_N}\Sigma) \times \mathcal{IW}(\Sigma | \mathbf{S}_N, \nu_N)\\
    &= \mathcal{NIW}(\mu,\Sigma | \mathbf{m}_N, \kappa_N, \nu_N, \mathbf{S}_N)\\\\
    \mathbf{m}_N &= \frac{\kappa_0\mathbf{m}_0 + N\bar{\mathbf{x}}}{\kappa_0 + N}\\
    \kappa_N &= \kappa_0 + N\\
    \nu_N &= \nu_0 + N \\
    \mathbf{S}_N &= \mathbf{S}_0 + \mathbf{S}_{\bar{\mathbf{x}}} + \frac{\kappa_0N}{\kappa_0 + N}(\bar{\mathbf{x}} - \mathbf{m}_0)(\bar{\mathbf{x}} - \mathbf{m}_0)^T\\
    &= \mathbf{S}_0 + \mathbf{S} + \kappa_0\mathbf{m}_0\mathbf{m}_0^T - \kappa_N\mathbf{m}_N\mathbf{m}_N^T \\
    \mathbf{S}_{\bar{\mathbf{x}}} &= \sum_{i=1}^{N} (\mathbf{x}_i - \bar{\mathbf{x}})(\mathbf{x}_i - \bar{\mathbf{x}})^T\\
    \mathbf{S} &= \sum_{i=1}^{N} \mathbf{x}_i\mathbf{x}_i^T
\end{align}$$
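
The two expressions for $\mathbf{S}_N$ agree when $\mathbf{S}_{\bar{\mathbf{x}}}$ is the scatter matrix about the sample mean and $\mathbf{S}$ is the uncentered sum of squares $\sum_i \mathbf{x}_i\mathbf{x}_i^T$. The sketch below verifies this numerically under those definitions; the data and the hyperparameter values (`m_0`, `kappa_0`, `S_0`) are chosen arbitrarily for illustration.

```python
import numpy as np

rng_demo = np.random.default_rng(0)
X_demo = rng_demo.multivariate_normal([-1.0, 1.0], [[2.0, 1.3], [1.3, 4.0]], size=200)
N, d = X_demo.shape
x_bar = X_demo.mean(axis=0)

# Arbitrary illustrative hyperparameters
m_0 = np.array([0.5, -0.5])
kappa_0 = 2.0
S_0 = np.eye(d)

kappa_N = kappa_0 + N
m_N = (kappa_0 * m_0 + N * x_bar) / kappa_N

# Form 1: centered scatter plus the prior-to-data mean discrepancy term
S_xbar = (X_demo - x_bar).T @ (X_demo - x_bar)
S_N_centered = (
    S_0 + S_xbar + (kappa_0 * N / kappa_N) * np.outer(x_bar - m_0, x_bar - m_0)
)

# Form 2: uncentered sum of squares
S = X_demo.T @ X_demo
S_N_uncentered = S_0 + S + kappa_0 * np.outer(m_0, m_0) - kappa_N * np.outer(m_N, m_N)

print(np.allclose(S_N_centered, S_N_uncentered))
```

The centered form is usually the safer one to implement, since the uncentered form subtracts two large, nearly equal matrices.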

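Inference Process: draw $\Sigma$ from its marginal posterior, then draw $\mu$ conditional on that $\Sigma$.  
$
\qquad \Sigma \sim \mathcal{IW}(\mathbf{S}_N, \nu_N)
$  
  
$
\qquad \mu \mid \Sigma \sim \mathcal{N}(\mathbf{m}_N, \frac{\Sigma}{\kappa_N})
$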

In [9]:
def exact_inference(X):
    """Draw one joint sample (μ, Σ) from the NIW posterior given data X."""
    n, d = X.shape
    x_bar = X.mean(axis=0)

    k_0 = 0.01         # Prior confidence in μ_0
    k_n = k_0 + n
    μ_0 = np.zeros(d)  # Prior mean vector
    ν_0 = d + 1        # DOF for inverse Wishart
    ν_n = ν_0 + n
    S_0 = np.eye(d)    # Prior scatter matrix

    # Draw Σ from its marginal posterior IW(S_n, ν_n)
    S = (X - x_bar).T @ (X - x_bar)  # scatter about the sample mean
    diff = (x_bar - μ_0).reshape(-1, 1)
    S_n = S_0 + S + (k_0 * n / k_n) * (diff @ diff.T)
    Σ = invwishart.rvs(df=ν_n, scale=S_n)

    # Draw μ given Σ from N(μ_n, Σ / k_n)
    μ_n = (k_0 * μ_0 + n * x_bar) / k_n
    μ = np.random.multivariate_normal(mean=μ_n, cov=Σ / k_n)

    return μ, Σ

In [14]:
mu, sigma = exact_inference(X)

In [15]:
print(mu,sigma)

[-0.99017225  1.0004372 ] [[2.20515488 1.39655234]
 [1.39655234 4.14845622]]
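
The values printed above are a single draw from the posterior, so they change from run to run. Under the NIW posterior the means are also available in closed form: $E[\mu | D] = \mathbf{m}_N$ and $E[\Sigma | D] = \mathbf{S}_N / (\nu_N - d - 1)$ for $\nu_N > d + 1$. The sketch below computes them with a hypothetical helper `niw_posterior_means`, using the same hyperparameter values as `exact_inference` and data regenerated locally so the snippet runs on its own.

```python
import numpy as np

def niw_posterior_means(X, m_0, kappa_0, nu_0, S_0):
    """Closed-form posterior means of mu and Sigma under an NIW prior."""
    n, d = X.shape
    x_bar = X.mean(axis=0)
    kappa_n = kappa_0 + n
    nu_n = nu_0 + n
    m_n = (kappa_0 * m_0 + n * x_bar) / kappa_n
    centered = X - x_bar
    diff = x_bar - m_0
    S_n = S_0 + centered.T @ centered + (kappa_0 * n / kappa_n) * np.outer(diff, diff)
    # E[Sigma | D] = S_n / (nu_n - d - 1), valid when nu_n > d + 1
    return m_n, S_n / (nu_n - d - 1)

# Illustrative data, regenerated here rather than taken from the notebook
rng_demo = np.random.default_rng(123)
X_demo = rng_demo.multivariate_normal([-1.0, 1.0], [[2.0, 1.3], [1.3, 4.0]], size=1000)

mu_mean, Sigma_mean = niw_posterior_means(
    X_demo, m_0=np.zeros(2), kappa_0=0.01, nu_0=3, S_0=np.eye(2)
)
print(mu_mean)
print(Sigma_mean)
```

Averaging many calls to `exact_inference` should approach these values, which gives a simple check on the sampler.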
