---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2024-01-19 00:51:30 -0800

title: "Posterior distribution for Bayesian linear regression with known covariance"
chapter: "Statistical Models"
section: "Univariate normal data"
topic: "Bayesian linear regression with known covariance"
theorem: "Posterior distribution"

sources:
  - authors: "Bishop CM"
    year: 2006
    title: "Bayesian linear regression"
    in: "Pattern Recognition and Machine Learning"
    pages: "pp. 152-161, eqs. 3.49-3.51, ex. 3.7"
    url: ""
  - authors: "Penny WD"
    year: 2012
    title: "Comparing Dynamic Causal Models using AIC, BIC and Free Energy"
    in: "NeuroImage"
    pages: "vol. 59, iss. 2, pp. 319-330, eq. 27"
    url: ""
    doi: "10.1016/j.neuroimage.2011.07.039"

proof_id: "P433"
shortcut: "blrkc-post"
username: "JoramSoch"
---
**Theorem:** Let
$$ \label{eq:GLM}
y = X \beta + \varepsilon, \; \varepsilon \sim \mathcal{N}(0, \Sigma)
$$
be a linear regression model with measured $n \times 1$ data vector $y$, known $n \times p$ design matrix $X$ and known $n \times n$ covariance matrix $\Sigma$ as well as unknown $p \times 1$ regression coefficients $\beta$. Moreover, assume a multivariate normal distribution over the model parameter $\beta$:
$$ \label{eq:GLM-N-prior}
p(\beta) = \mathcal{N}(\beta; \mu_0, \Sigma_0) \; .
$$
Then, the posterior distribution is also a multivariate normal distribution
$$ \label{eq:GLM-N-post}
p(\beta|y) = \mathcal{N}(\beta; \mu_n, \Sigma_n)
$$
and the posterior hyperparameters are given by
$$ \label{eq:GLM-N-post-par}
\begin{split}
\mu_n &= \Sigma_n (X^\mathrm{T} \Sigma^{-1} y + \Sigma_0^{-1} \mu_0) \\
\Sigma_n &= \left( X^\mathrm{T} \Sigma^{-1} X + \Sigma_0^{-1} \right)^{-1} \; .
\end{split}
$$
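
As a minimal numerical sketch, the posterior hyperparameters \eqref{eq:GLM-N-post-par} can be computed directly with NumPy; the dimensions, design matrix, noise covariance and prior values below are arbitrary illustrative choices, not taken from the sources above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 20, 3

X      = rng.standard_normal((n, p))             # known design matrix
Sigma  = 0.5 * np.eye(n)                         # known noise covariance
mu0    = np.zeros(p)                             # prior mean
Sigma0 = 10.0 * np.eye(p)                        # prior covariance
beta   = np.array([1.0, -0.5, 2.0])              # ground-truth coefficients
y      = X @ beta + rng.multivariate_normal(np.zeros(n), Sigma)

P, P0   = np.linalg.inv(Sigma), np.linalg.inv(Sigma0)    # precision matrices
Sigma_n = np.linalg.inv(X.T @ P @ X + P0)                # posterior covariance
mu_n    = Sigma_n @ (X.T @ P @ y + P0 @ mu0)             # posterior mean
print(mu_n)   # concentrates around beta as n grows
```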
**Proof:** According to Bayes' theorem, the posterior distribution is given by
$$ \label{eq:GLM-N-BT}
p(\beta|y) = \frac{p(y|\beta) \, p(\beta)}{p(y)} \; .
$$
Since $p(y)$ is just a normalization factor, the posterior is proportional to the numerator:
$$ \label{eq:GLM-N-post-JL}
p(\beta|y) \propto p(y|\beta) \, p(\beta) = p(y,\beta) \; .
$$
Equation \eqref{eq:GLM} implies the following likelihood function:
$$ \label{eq:GLM-LF}
p(y|\beta) = \mathcal{N}(y; X \beta, \Sigma) = \sqrt{\frac{1}{(2 \pi)^n |\Sigma|}} \, \exp\left[ -\frac{1}{2} (y-X\beta)^\mathrm{T} \Sigma^{-1} (y-X\beta) \right] \; .
$$
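
This is simply the multivariate normal density evaluated at $y$ with mean $X\beta$; a self-contained sketch checking the explicit formula above against SciPy's `scipy.stats.multivariate_normal` (all inputs are arbitrary examples):

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(0)
n, p = 5, 2
X     = rng.standard_normal((n, p))
Sigma = np.diag(rng.uniform(0.5, 1.5, size=n))   # known noise covariance
beta  = rng.standard_normal(p)
y     = rng.standard_normal(n)

# hand-written density from the equation above
r = y - X @ beta
manual = (np.sqrt(1.0 / ((2 * np.pi) ** n * np.linalg.det(Sigma)))
          * np.exp(-0.5 * r @ np.linalg.inv(Sigma) @ r))
print(np.isclose(manual, multivariate_normal.pdf(y, mean=X @ beta, cov=Sigma)))  # True
```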
Combining the likelihood function \eqref{eq:GLM-LF} with the prior distribution \eqref{eq:GLM-N-prior} using the probability density function of the multivariate normal distribution, the joint likelihood of the model is given by
$$ \label{eq:GLM-N-JL-s1}
\begin{split}
p(y,\beta) = \; & p(y|\beta) \, p(\beta) \\
= \; & \sqrt{\frac{1}{(2 \pi)^n |\Sigma|}} \, \exp\left[ -\frac{1}{2} (y-X\beta)^\mathrm{T} \Sigma^{-1} (y-X\beta) \right] \cdot \\
\; & \sqrt{\frac{1}{(2 \pi)^p |\Sigma_0|}} \, \exp\left[ -\frac{1}{2} (\beta-\mu_0)^\mathrm{T} \Sigma_0^{-1} (\beta-\mu_0) \right] \; .
\end{split}
$$
Collecting identical variables gives:
$$ \label{eq:GLM-N-JL-s2}
\begin{split}
p(y,\beta) = \; & \sqrt{\frac{1}{(2 \pi)^{n+p} |\Sigma| |\Sigma_0|}} \cdot \\
& \exp\left[ -\frac{1}{2} \left( (y-X\beta)^\mathrm{T} \Sigma^{-1} (y-X\beta) + (\beta-\mu_0)^\mathrm{T} \Sigma_0^{-1} (\beta-\mu_0) \right) \right] \; .
\end{split}
$$
Expanding the products in the exponent gives:
$$ \label{eq:GLM-N-JL-s3}
\begin{split}
p(y,\beta) = \; & \sqrt{\frac{1}{(2 \pi)^{n+p} |\Sigma| |\Sigma_0|}} \cdot \\
& \exp\left[ -\frac{1}{2} \left( y^\mathrm{T} \Sigma^{-1} y - y^\mathrm{T} \Sigma^{-1} X \beta - \beta^\mathrm{T} X^\mathrm{T} \Sigma^{-1} y + \beta^\mathrm{T} X^\mathrm{T} \Sigma^{-1} X \beta + \right. \right. \\
& \hphantom{\exp \left[ -\frac{1}{2} \right.} \; \left. \left. \beta^\mathrm{T} \Sigma_0^{-1} \beta - \beta^\mathrm{T} \Sigma_0^{-1} \mu_0 - \mu_0^\mathrm{T} \Sigma_0^{-1} \beta + \mu_0^\mathrm{T} \Sigma_0^{-1} \mu_0 \right) \right] \; .
\end{split}
$$
Regrouping the terms in the exponent gives:
$$ \label{eq:GLM-N-JL-s4}
\begin{split}
p(y,\beta) = \; & \sqrt{\frac{1}{(2 \pi)^{n+p} |\Sigma| |\Sigma_0|}} \cdot \\
& \exp\left[ -\frac{1}{2} \left( \beta^\mathrm{T} [ X^\mathrm{T} \Sigma^{-1} X + \Sigma_0^{-1} ] \beta - 2 \beta^\mathrm{T} [X^\mathrm{T} \Sigma^{-1} y + \Sigma_0^{-1} \mu_0] + \right. \right. \\
& \hphantom{\exp \left[ -\frac{1}{2} \right.} \; \left. \left. y^\mathrm{T} \Sigma^{-1} y + \mu_0^\mathrm{T} \Sigma_0^{-1} \mu_0 \right) \right] \; .
\end{split}
$$
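Abbreviating $A = X^\mathrm{T} \Sigma^{-1} X + \Sigma_0^{-1}$ and $b = X^\mathrm{T} \Sigma^{-1} y + \Sigma_0^{-1} \mu_0$ (shorthands used only in this step), the $\beta$-dependent part of the exponent has the form $\beta^\mathrm{T} A \beta - 2 \beta^\mathrm{T} b$, so the standard identity for symmetric invertible $A$
$$
\beta^\mathrm{T} A \beta - 2 \beta^\mathrm{T} b = (\beta - A^{-1} b)^\mathrm{T} A \, (\beta - A^{-1} b) - b^\mathrm{T} A^{-1} b
$$
applies with $\Sigma_n = A^{-1}$ and $\mu_n = \Sigma_n b$, where $b^\mathrm{T} A^{-1} b = \mu_n^\mathrm{T} \Sigma_n^{-1} \mu_n$.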
Completing the square over $\beta$, we finally have
$$ \label{eq:GLM-N-JL-s5}
\begin{split}
p(y,\beta) = \; & \sqrt{\frac{1}{(2 \pi)^{n+p} |\Sigma| |\Sigma_0|}} \cdot \\
& \exp\left[ -\frac{1}{2} \left( (\beta-\mu_n)^\mathrm{T} \Sigma_n^{-1} (\beta-\mu_n) + (y^\mathrm{T} \Sigma^{-1} y + \mu_0^\mathrm{T} \Sigma_0^{-1} \mu_0 - \mu_n^\mathrm{T} \Sigma_n^{-1} \mu_n) \right) \right]
\end{split}
$$
with the posterior hyperparameters
$$ \label{eq:GLM-N-post-par-qed}
\begin{split}
\mu_n &= \Sigma_n (X^\mathrm{T} \Sigma^{-1} y + \Sigma_0^{-1} \mu_0) \\
\Sigma_n &= \left( X^\mathrm{T} \Sigma^{-1} X + \Sigma_0^{-1} \right)^{-1} \; .
\end{split}
$$
Ergo, the joint likelihood is proportional to
$$ \label{eq:GLM-N-JL-s6}
p(y,\beta) \propto \exp\left[ -\frac{1}{2} (\beta-\mu_n)^\mathrm{T} \Sigma_n^{-1} (\beta-\mu_n) \right] \; ,
$$
such that the posterior distribution over $\beta$ is given by
$$ \label{eq:GLM-N-post-qed}
p(\beta|y) = \mathcal{N}(\beta; \mu_n, \Sigma_n)
$$
with the posterior hyperparameters given in \eqref{eq:GLM-N-post-par-qed}.
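
Since \eqref{eq:GLM-N-BT} implies $\log p(y|\beta) + \log p(\beta) - \log p(\beta|y) = \log p(y)$ for every $\beta$, the whole result can also be checked numerically: with the hyperparameters \eqref{eq:GLM-N-post-par-qed}, this difference must not depend on $\beta$. A sketch of this check under arbitrary simulated inputs:

```python
import numpy as np
from scipy.stats import multivariate_normal

rng = np.random.default_rng(2)
n, p = 10, 3
X      = rng.standard_normal((n, p))
Sigma  = np.cov(rng.standard_normal((n, 3 * n))) + np.eye(n)   # known SPD covariance
Sigma0 = np.cov(rng.standard_normal((p, 3 * p))) + np.eye(p)   # prior covariance
mu0    = rng.standard_normal(p)                                # prior mean
y      = X @ rng.standard_normal(p) + rng.multivariate_normal(np.zeros(n), Sigma)

P, P0   = np.linalg.inv(Sigma), np.linalg.inv(Sigma0)
Sigma_n = np.linalg.inv(X.T @ P @ X + P0)                      # posterior covariance
mu_n    = Sigma_n @ (X.T @ P @ y + P0 @ mu0)                   # posterior mean

def log_py(beta):
    """log p(y|beta) + log p(beta) - log p(beta|y), which equals log p(y)."""
    return (multivariate_normal.logpdf(y, X @ beta, Sigma)
            + multivariate_normal.logpdf(beta, mu0, Sigma0)
            - multivariate_normal.logpdf(beta, mu_n, Sigma_n))

b1, b2 = rng.standard_normal(p), rng.standard_normal(p)
print(np.isclose(log_py(b1), log_py(b2)))   # True: the value is independent of beta
```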