---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2024-01-19 00:51:30 -0800

title: "Posterior distribution for Bayesian linear regression with known covariance"
chapter: "Statistical Models"
section: "Univariate normal data"
topic: "Bayesian linear regression with known covariance"
theorem: "Posterior distribution"

sources:
- authors: "Bishop CM"
  year: 2006
  title: "Bayesian linear regression"
  in: "Pattern Recognition and Machine Learning"
  pages: "pp. 152-161, eqs. 3.49-3.51, ex. 3.7"
  url: ""
- authors: "Penny WD"
  year: 2012
  title: "Comparing Dynamic Causal Models using AIC, BIC and Free Energy"
  in: "NeuroImage"
  pages: "vol. 59, iss. 2, pp. 319-330, eq. 27"
  url: ""
  doi: "10.1016/j.neuroimage.2011.07.039"

proof_id: "P433"
shortcut: "blrkc-post"
username: "JoramSoch"
---

**Theorem:** Let

$$ \label{eq:GLM} y = X \beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \Sigma) $$

be a linear regression model with measured $n \times 1$ data vector $y$, known $n \times p$ design matrix $X$ and known $n \times n$ covariance matrix $\Sigma$, as well as unknown $p \times 1$ regression coefficients $\beta$. Moreover, assume a multivariate normal distribution over the model parameter $\beta$:

$$ \label{eq:GLM-N-prior} p(\beta) = \mathcal{N}(\beta; \mu_0, \Sigma_0) \; . $$

Then, the posterior distribution is also a multivariate normal distribution

$$ \label{eq:GLM-N-post} p(\beta|y) = \mathcal{N}(\beta; \mu_n, \Sigma_n) $$

and the posterior hyperparameters are given by

$$ \label{eq:GLM-N-post-par} \begin{split} \mu_n &= \Sigma_n (X^\mathrm{T} \Sigma^{-1} y + \Sigma_0^{-1} \mu_0) \\ \Sigma_n &= \left( X^\mathrm{T} \Sigma^{-1} X + \Sigma_0^{-1} \right)^{-1} \; . \end{split} $$
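As a practical illustration, the posterior hyperparameters can be computed directly from \eqref{eq:GLM-N-post-par}. The following Python/NumPy sketch assumes inputs `X`, `y`, `Sigma`, `mu0` and `Sigma0` with the dimensions given above; the function name and test data are illustrative, not part of the theorem:

```python
import numpy as np

def posterior_hyperparameters(X, y, Sigma, mu0, Sigma0):
    """Posterior mean mu_n and covariance Sigma_n for Bayesian
    linear regression with known covariance (illustrative sketch)."""
    P  = np.linalg.inv(Sigma)    # noise precision, n x n
    P0 = np.linalg.inv(Sigma0)   # prior precision, p x p
    # Sigma_n = (X^T Sigma^-1 X + Sigma_0^-1)^-1
    Sigma_n = np.linalg.inv(X.T @ P @ X + P0)
    # mu_n = Sigma_n (X^T Sigma^-1 y + Sigma_0^-1 mu_0)
    mu_n = Sigma_n @ (X.T @ P @ y + P0 @ mu0)
    return mu_n, Sigma_n

# example with simulated data
rng = np.random.default_rng(42)
n, p = 20, 3
X = rng.standard_normal((n, p))
Sigma = 0.5 * np.eye(n)
y = X @ np.array([1.0, -2.0, 0.5]) + rng.multivariate_normal(np.zeros(n), Sigma)
mu_n, Sigma_n = posterior_hyperparameters(X, y, Sigma,
                                          mu0=np.zeros(p), Sigma0=10.0 * np.eye(p))
```

Note that in the limit of a flat prior, $\Sigma_0^{-1} \to 0$, the posterior mean reduces to the generalized least squares estimate $(X^\mathrm{T} \Sigma^{-1} X)^{-1} X^\mathrm{T} \Sigma^{-1} y$.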

**Proof:** According to Bayes' theorem, the posterior distribution is given by

$$ \label{eq:GLM-N-BT} p(\beta|y) = \frac{p(y|\beta) \, p(\beta)}{p(y)} \; . $$

Since $p(y)$ is just a normalization factor, the posterior is proportional to the numerator:

$$ \label{eq:GLM-N-post-JL} p(\beta|y) \propto p(y|\beta) \, p(\beta) = p(y,\beta) \; . $$

Equation \eqref{eq:GLM} implies the following likelihood function:

$$ \label{eq:GLM-LF} p(y|\beta) = \mathcal{N}(y; X \beta, \Sigma) = \sqrt{\frac{1}{(2 \pi)^n |\Sigma|}} \, \exp\left[ -\frac{1}{2} (y-X\beta)^\mathrm{T} \Sigma^{-1} (y-X\beta) \right] \; . $$

Combining the likelihood function \eqref{eq:GLM-LF} with the prior distribution \eqref{eq:GLM-N-prior} using the probability density function of the multivariate normal distribution, the joint likelihood of the model is given by

$$ \label{eq:GLM-N-JL-s1} \begin{split} p(y,\beta) = \; & p(y|\beta) \, p(\beta) \\ = \; & \sqrt{\frac{1}{(2 \pi)^n |\Sigma|}} \, \exp\left[ -\frac{1}{2} (y-X\beta)^\mathrm{T} \Sigma^{-1} (y-X\beta) \right] \cdot \\ \; & \sqrt{\frac{1}{(2 \pi)^p |\Sigma_0|}} \, \exp\left[ -\frac{1}{2} (\beta-\mu_0)^\mathrm{T} \Sigma_0^{-1} (\beta-\mu_0) \right] \; . \end{split} $$

Collecting identical variables gives:

$$ \label{eq:GLM-N-JL-s2} \begin{split} p(y,\beta) = \; & \sqrt{\frac{1}{(2 \pi)^{n+p} |\Sigma| |\Sigma_0|}} \cdot \\ & \exp\left[ -\frac{1}{2} \left( (y-X\beta)^\mathrm{T} \Sigma^{-1} (y-X\beta) + (\beta-\mu_0)^\mathrm{T} \Sigma_0^{-1} (\beta-\mu_0) \right) \right] \; . \end{split} $$

Expanding the products in the exponent gives:

$$ \label{eq:GLM-N-JL-s3} \begin{split} p(y,\beta) = \; & \sqrt{\frac{1}{(2 \pi)^{n+p} |\Sigma| |\Sigma_0|}} \cdot \\ & \exp\left[ -\frac{1}{2} \left( y^\mathrm{T} \Sigma^{-1} y - y^\mathrm{T} \Sigma^{-1} X \beta - \beta^\mathrm{T} X^\mathrm{T} \Sigma^{-1} y + \beta^\mathrm{T} X^\mathrm{T} \Sigma^{-1} X \beta + \right. \right. \\ & \hphantom{\exp \left[ -\frac{1}{2} \right.} \; \left. \left. \beta^\mathrm{T} \Sigma_0^{-1} \beta - \beta^\mathrm{T} \Sigma_0^{-1} \mu_0 - \mu_0^\mathrm{T} \Sigma_0^{-1} \beta + \mu_0^\mathrm{T} \Sigma_0^{-1} \mu_0 \right) \right] \; . \end{split} $$

Regrouping the terms in the exponent gives:

$$ \label{eq:GLM-N-JL-s4} \begin{split} p(y,\beta) = \; & \sqrt{\frac{1}{(2 \pi)^{n+p} |\Sigma| |\Sigma_0|}} \cdot \\ & \exp\left[ -\frac{1}{2} \left( \beta^\mathrm{T} [ X^\mathrm{T} \Sigma^{-1} X + \Sigma_0^{-1} ] \beta - 2 \beta^\mathrm{T} [X^\mathrm{T} \Sigma^{-1} y + \Sigma_0^{-1} \mu_0] + \right. \right. \\ & \hphantom{\exp \left[ -\frac{1}{2} \right.} \; \left. \left. y^\mathrm{T} \Sigma^{-1} y + \mu_0^\mathrm{T} \Sigma_0^{-1} \mu_0 \right) \right] \; . \end{split} $$
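The next step relies on the standard completing-the-square identity: with the shorthands $A = X^\mathrm{T} \Sigma^{-1} X + \Sigma_0^{-1}$ and $b = X^\mathrm{T} \Sigma^{-1} y + \Sigma_0^{-1} \mu_0$ (notation introduced here for clarity), we have for symmetric positive definite $A$

$$ \beta^\mathrm{T} A \beta - 2 \beta^\mathrm{T} b = (\beta - A^{-1} b)^\mathrm{T} A \, (\beta - A^{-1} b) - b^\mathrm{T} A^{-1} b \; , $$

such that identifying $\Sigma_n^{-1} = A$ and $\mu_n = A^{-1} b$ produces the correction term $b^\mathrm{T} A^{-1} b = \mu_n^\mathrm{T} \Sigma_n^{-1} \mu_n$ appearing below.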

Completing the square over $\beta$, we finally have

$$ \label{eq:GLM-N-JL-s5} \begin{split} p(y,\beta) = \; & \sqrt{\frac{1}{(2 \pi)^{n+p} |\Sigma| |\Sigma_0|}} \cdot \\ & \exp\left[ -\frac{1}{2} \left( (\beta-\mu_n)^\mathrm{T} \Sigma_n^{-1} (\beta-\mu_n) + (y^\mathrm{T} \Sigma^{-1} y + \mu_0^\mathrm{T} \Sigma_0^{-1} \mu_0 - \mu_n^\mathrm{T} \Sigma_n^{-1} \mu_n) \right) \right] \end{split} $$

with the posterior hyperparameters

$$ \label{eq:GLM-N-post-par-qed} \begin{split} \mu_n &= \Sigma_n (X^\mathrm{T} \Sigma^{-1} y + \Sigma_0^{-1} \mu_0) \\ \Sigma_n &= \left( X^\mathrm{T} \Sigma^{-1} X + \Sigma_0^{-1} \right)^{-1} \; . \end{split} $$

Ergo, the joint likelihood is proportional to

$$ \label{eq:GLM-N-JL-s6} p(y,\beta) \propto \exp\left[ -\frac{1}{2} (\beta-\mu_n)^\mathrm{T} \Sigma_n^{-1} (\beta-\mu_n) \right] \; , $$

such that the posterior distribution over $\beta$ is given by

$$ \label{eq:GLM-N-post-qed} p(\beta|y) = \mathcal{N}(\beta; \mu_n, \Sigma_n) $$

with the posterior hyperparameters given in \eqref{eq:GLM-N-post-par-qed}.
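This result can also be checked numerically: since the normalization factor $p(y)$ cancels, differences of the log joint likelihood between any two values of $\beta$ must equal the corresponding log-density differences under $\mathcal{N}(\beta; \mu_n, \Sigma_n)$. Below is a minimal sketch using SciPy, with illustrative test values:

```python
import numpy as np
from scipy.stats import multivariate_normal as mvn

rng = np.random.default_rng(1)
n, p = 15, 2
X = rng.standard_normal((n, p))
Sigma = 0.3 * np.eye(n)                     # known noise covariance
mu0, Sigma0 = np.zeros(p), 5.0 * np.eye(p)  # prior hyperparameters
y = X @ np.array([2.0, -1.0]) + rng.multivariate_normal(np.zeros(n), Sigma)

# posterior hyperparameters as derived above
Sigma_n = np.linalg.inv(X.T @ np.linalg.inv(Sigma) @ X + np.linalg.inv(Sigma0))
mu_n = Sigma_n @ (X.T @ np.linalg.inv(Sigma) @ y + np.linalg.inv(Sigma0) @ mu0)

def log_joint(beta):
    """log p(y|beta) + log p(beta) = log p(y,beta)"""
    return (mvn.logpdf(y, mean=X @ beta, cov=Sigma)
            + mvn.logpdf(beta, mean=mu0, cov=Sigma0))

b1, b2 = rng.standard_normal(p), rng.standard_normal(p)
lhs = log_joint(b1) - log_joint(b2)
rhs = (mvn.logpdf(b1, mean=mu_n, cov=Sigma_n)
       - mvn.logpdf(b2, mean=mu_n, cov=Sigma_n))
assert np.isclose(lhs, rhs)  # log p(y) cancels; the kernels agree
```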