# Bayesian PCA

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.datasets import load_iris

In [2]:
%config InlineBackend.figure_format = "retina"

* A fully probabilistic approach to PCA allows us to automatically choose the dimensionality of the principal subspace
* In this notebook, we consider a model in which only $\bf W$ is given a prior: an independent zero-mean Gaussian over each column ${\bf w}_m$ of $\bf W$, with its own precision $\alpha_m$

$$
    p({\bf W}\vert\boldsymbol\alpha) = \prod_{m=1}^M \left(\frac{\alpha_m}{2\pi}\right)^{D/2} \exp\left(-\frac{\alpha_m}{2}{\bf w}_m^T{\bf w}_m\right)
$$
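As a quick check on the prior above, its logarithm can be evaluated directly. This is a minimal sketch; the function name `log_prior_W` and the demo values are hypothetical and not part of the original notebook.

```python
import numpy as np

def log_prior_W(W, alpha):
    """Log of p(W | alpha): one isotropic Gaussian per column w_m, precision alpha_m."""
    D, M = W.shape
    sq_norms = np.sum(W**2, axis=0)  # w_m^T w_m for each column m
    return np.sum(0.5 * D * np.log(alpha / (2 * np.pi)) - 0.5 * alpha * sq_norms)

# Hypothetical demo values (not the notebook's W and alpha)
rng = np.random.default_rng(0)
W_demo = rng.standard_normal((4, 2))
alpha_demo = np.array([0.5, 2.0])
log_p = log_prior_W(W_demo, alpha_demo)
```

A large $\alpha_m$ concentrates the prior of ${\bf w}_m$ around zero, which is how the model can effectively switch off an unneeded direction.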

To choose the values of $\{\alpha_m\}_{m=1}^M$, we maximize the marginal likelihood once $\bf W$ has been integrated out. That is, we want to maximize 

$$
    p({\bf X}\vert\boldsymbol\alpha, \boldsymbol\mu, \sigma^2) = \int \prod_{n=1}^N \mathcal{N}({\bf x}_n \vert \boldsymbol\mu,{\bf C}) \prod_{m=1}^M \mathcal{N}({\bf w}_m\vert {\bf 0}, \alpha_m^{-1}{\bf I}) \ \text{d}{\bf W} \label{eq:a}\tag{1}
$$

where
* ${\bf C} = {\bf W}{\bf W}^T + \sigma^2{\bf I}$
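For a fixed $\bf W$, the Gaussian factor over the data has the closed form $-\frac{N}{2}\left\{D\ln 2\pi + \ln\vert{\bf C}\vert + \text{Tr}({\bf C}^{-1}{\bf S})\right\}$ when $\boldsymbol\mu$ is set to the sample mean. The sketch below evaluates it; the function name and demo data are hypothetical, and it assumes ${\bf S}$ is the covariance normalized by $N$ (`bias=True`), whereas `np.cov` normalizes by $N-1$ by default.

```python
import numpy as np

def ppca_log_likelihood(X, W, sigma2):
    """ln p(X | W, mu, sigma2) with mu fixed to the sample mean of X."""
    N, D = X.shape
    S = np.cov(X.T, bias=True)            # covariance normalized by N (assumption)
    C = W @ W.T + sigma2 * np.eye(D)      # marginal covariance of each x_n
    _, logdet_C = np.linalg.slogdet(C)    # stable log-determinant
    trace_term = np.trace(np.linalg.solve(C, S))  # Tr(C^{-1} S)
    return -0.5 * N * (D * np.log(2 * np.pi) + logdet_C + trace_term)

# Hypothetical demo data (not the iris data used in the notebook)
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((50, 4))
W_demo = rng.standard_normal((4, 2))
ll_demo = ppca_log_likelihood(X_demo, W_demo, sigma2=1.0)
```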


The integral in $\eqref{eq:a}$ is intractable, so we use the Laplace approximation.
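In Bishop's treatment (reference 1), the Laplace approximation yields the re-estimation $\alpha_m = D / {\bf w}_m^T{\bf w}_m$, which is interleaved with EM updates for ${\bf W}$ and $\sigma^2$ in which the M-step for ${\bf W}$ gains an extra $\sigma^2{\bf A}$ term, with ${\bf A} = \text{diag}(\alpha_m)$. The following is a rough sketch of that scheme under those assumptions, not the notebook author's implementation; the function name, the initialization, the fixed iteration count, and the small floor that keeps $\alpha_m$ finite are all choices made here for illustration.

```python
import numpy as np

def bayesian_pca_em(X, M, n_iter=100, seed=0):
    """Sketch of EM for Bayesian PCA with alpha_m = D / (w_m^T w_m)."""
    rng = np.random.default_rng(seed)
    N, D = X.shape
    Xc = X - X.mean(axis=0)               # center the data (mu = sample mean)
    W = rng.standard_normal((D, M))       # arbitrary initialization (assumption)
    sigma2 = 1.0
    alpha = np.ones(M)
    for _ in range(n_iter):
        # E-step: posterior moments of the latent variables z_n
        Minv = np.linalg.inv(W.T @ W + sigma2 * np.eye(M))
        Ez = Xc @ W @ Minv                     # rows are E[z_n]
        sum_Ezz = N * sigma2 * Minv + Ez.T @ Ez  # sum_n E[z_n z_n^T]
        # M-step: the sigma2 * diag(alpha) term comes from the prior on W
        W = (Xc.T @ Ez) @ np.linalg.inv(sum_Ezz + sigma2 * np.diag(alpha))
        sigma2 = (np.sum(Xc**2)
                  - 2 * np.sum((Xc @ W) * Ez)
                  + np.trace(sum_Ezz @ W.T @ W)) / (N * D)
        # Re-estimate the precisions; the floor avoids division by zero
        alpha = D / np.maximum(np.sum(W**2, axis=0), 1e-10)
    return W, sigma2, alpha

# Hypothetical demo: 4-dimensional data with one dominant latent direction
rng = np.random.default_rng(1)
X_demo = (rng.standard_normal((200, 1)) @ rng.standard_normal((1, 4))
          + 0.1 * rng.standard_normal((200, 4)))
W_fit, sigma2_fit, alpha_fit = bayesian_pca_em(X_demo, M=3)
```

Columns of ${\bf W}$ whose $\alpha_m$ grows very large are driven towards zero, so inspecting `alpha_fit` gives an indication of how many directions the model retains.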

In [3]:
iris = load_iris()
X_iris, y_iris = iris["data"], iris["target"]

In [4]:
N, D = X_iris.shape
M = 2
sigma2 = 1

In [7]:
W = np.random.randn(D, M)
S = np.cov(X_iris.T)
C = W @ W.T + np.eye(D) * sigma2
alpha = np.random.rand(M)

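The term $\sum_{m}\alpha_m{\bf w}^T_m{\bf w}_m$ that appears in the exponent of the prior can be computed in two equivalent ways: as a direct sum with `einsum`, or as the trace $\text{Tr}({\bf A}{\bf W}^T{\bf W})$ with ${\bf A} = \text{diag}(\alpha_1, \ldots, \alpha_M)$.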

In [21]:
np.einsum("m,jm,jm",alpha, W, W)

3.5301657232594517

In [24]:
# alpha * np.identity(M) broadcasts to diag(alpha), so this is Tr(A W^T W)
np.trace(alpha * np.identity(M) @ W.T @ W)

3.530165723259452
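The two results agree because the $m$-th diagonal entry of ${\bf A}{\bf W}^T{\bf W}$ is $\alpha_m{\bf w}_m^T{\bf w}_m$. A small self-contained check of the identity, including a third form based on column norms, might look as follows; the `_chk` variables are hypothetical stand-ins so the notebook's own `W` and `alpha` are left untouched.

```python
import numpy as np

rng = np.random.default_rng(0)
W_chk = rng.standard_normal((4, 2))   # stand-in for W, shape (D, M)
alpha_chk = rng.random(2)             # stand-in for alpha, shape (M,)

via_einsum = np.einsum("m,jm,jm", alpha_chk, W_chk, W_chk)   # sum_m alpha_m sum_j W_jm^2
via_trace = np.trace(np.diag(alpha_chk) @ W_chk.T @ W_chk)   # Tr(A W^T W)
via_norms = alpha_chk @ np.sum(W_chk**2, axis=0)             # alpha . squared column norms

all_equal = np.allclose([via_einsum, via_trace], via_norms)
```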

In [47]:
# Evaluate (S C^{-1} - I)^{-1} C W, for comparison with the current W shown in the next cell
np.linalg.inv(S @ np.linalg.inv(C) - np.eye(D)) @ C @ W

array([[-1.20960248, 12.64040337],
       [-2.29046053,  5.90228928],
       [ 2.55310206, -6.50051632],
       [-2.7095394 ,  1.46510875]])

In [48]:
W

array([[-0.3130791 , -1.96893035],
       [ 0.55427098, -0.80387089],
       [-1.00525531,  0.89190196],
       [ 0.70597081, -0.14137084]])

# References 

1. https://papers.nips.cc/paper/1549-bayesian-pca.pdf
2. https://www.cs.toronto.edu/~rsalakhu/STA4273_2015/notes/Lecture8_2015.pdf
3. https://haralick.org/ML/Neural_Networks_for_Pattern_Recognition_Christopher_Bishop.pdf
4. http://www.miketipping.com/papers/met-mppca.pdf