Merged
4 changes: 3 additions & 1 deletion I/Table_of_Contents.md
@@ -73,10 +73,12 @@ Proofs by **[Number](/I/Proof_by_Number.html)** and **[Topic](/I/Proof_by_Topic.
&emsp;&ensp; 4.1.1. *[Definition](/D/mvn.html)* <br>
&emsp;&ensp; 4.1.2. **[Probability density function](/P/mvn-pdf.html)** <br>
&emsp;&ensp; 4.1.3. **[Linear transformation theorem](/P/mvn-ltt.html)** <br>
+ &emsp;&ensp; 4.1.4. **[Marginal distributions](/P/mvn-marg.html)** <br>

4.2. Normal-gamma distribution <br>
&emsp;&ensp; 4.2.1. *[Definition](/D/ng.html)* <br>
- &emsp;&ensp; 4.2.2. **[Kullback-Leibler divergence](/P/ng-kl.html)** <br>
+ &emsp;&ensp; 4.2.2. **[Marginal distributions](/P/ng-marg.html)** <br>
+ &emsp;&ensp; 4.2.3. **[Kullback-Leibler divergence](/P/ng-kl.html)** <br>

5. Matrix-variate continuous distributions

51 changes: 51 additions & 0 deletions P/mvn-marg.md
@@ -0,0 +1,51 @@
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-01-29 15:12:00

title: "Marginal distributions of the multivariate normal distribution"
chapter: "Probability Distributions"
section: "Multivariate continuous distributions"
topic: "Multivariate normal distribution"
theorem: "Marginal distributions"

sources:

proof_id: "P35"
shortcut: "mvn-marg"
username: "JoramSoch"
---


**Theorem:** Let $x$ follow a [multivariate normal distribution](/D/mvn.html):

$$ \label{eq:mvn}
x \sim \mathcal{N}(\mu, \Sigma) \; .
$$

Then, the marginal distribution of any subset vector $x_s$ is also a multivariate normal distribution

$$ \label{eq:mvn-marg}
x_s \sim \mathcal{N}(\mu_s, \Sigma_s)
$$

where $\mu_s$ drops the irrelevant variables (the ones not in the subset, i.e. marginalized out) from the mean vector $\mu$ and $\Sigma_s$ drops the corresponding rows and columns from the covariance matrix $\Sigma$.


**Proof:** Let $x$ be an $n \times 1$ vector and $x_s$ an $m \times 1$ subset of it. Define an $m \times n$ subset matrix $S$ such that $s_{ij} = 1$ if the $i$-th element of $x_s$ corresponds to the $j$-th element of $x$, and $s_{ij} = 0$ otherwise. Then,

$$ \label{eq:xs}
x_s = S x
$$

and we can apply the [linear transformation theorem](/P/mvn-ltt.html) to give

$$ \label{eq:mvn-marg-qed}
x_s \sim \mathcal{N}(S \mu, S \Sigma S^\mathrm{T}) \; .
$$

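Finally, because left-multiplication by $S$ selects rows and right-multiplication by $S^\mathrm{T}$ selects columns, we see that $S \mu = \mu_s$ and $S \Sigma S^\mathrm{T} = \Sigma_s$.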
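
The construction of $S$ can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `marginalize`, the example values of `mu` and `Sigma`, and the zero-based index list `subset` are illustrative and not part of the proof.

```python
import numpy as np

def marginalize(mu, Sigma, subset):
    """Build the m x n subset matrix S and return (S, S mu, S Sigma S^T)."""
    n = mu.size
    m = len(subset)
    S = np.zeros((m, n))
    # s_ij = 1 if the i-th element of x_s is the j-th element of x
    S[np.arange(m), subset] = 1.0
    return S, S @ mu, S @ Sigma @ S.T

# Hypothetical example values (assumptions, chosen only for illustration)
mu = np.array([1.0, 2.0, 3.0, 4.0])
Sigma = np.array([[2.0, 0.5, 0.1, 0.0],
                  [0.5, 1.5, 0.3, 0.2],
                  [0.1, 0.3, 1.5, 0.4],
                  [0.0, 0.2, 0.4, 1.2]])
subset = [0, 2]  # keep the first and third variable (zero-based)

S, mu_s, Sigma_s = marginalize(mu, Sigma, subset)

# S mu drops entries of mu; S Sigma S^T drops rows and columns of Sigma
print(np.allclose(mu_s, mu[subset]))
print(np.allclose(Sigma_s, Sigma[np.ix_(subset, subset)]))
```

Indexing with `mu[subset]` and `Sigma[np.ix_(subset, subset)]` is the direct way to drop variables; the matrix $S$ is only needed to connect the operation to the linear transformation theorem.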
91 changes: 91 additions & 0 deletions P/ng-marg.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,91 @@
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2020-01-29 21:42:00

title: "Marginal distributions of the normal-gamma distribution"
chapter: "Probability Distributions"
section: "Multivariate continuous distributions"
topic: "Normal-gamma distribution"
theorem: "Marginal distributions"

sources:

proof_id: "P36"
shortcut: "ng-marg"
username: "JoramSoch"
---


**Theorem:** Let $x$ and $y$ follow a [normal-gamma distribution](/D/ng.html):

$$ \label{eq:ng}
x,y \sim \mathrm{NG}(\mu, \Lambda, a, b) \; .
$$

Then, the marginal distribution of $y$ is a gamma distribution

$$ \label{eq:ng-marg-y}
y \sim \mathrm{Gam}(a, b)
$$

and the marginal distribution of $x$ is a multivariate t-distribution

$$ \label{eq:ng-marg-x}
x \sim \mathrm{t}\left( \mu, \left(\frac{a}{b} \Lambda \right)^{-1}, 2a \right) \; .
$$


**Proof:** The [probability density function of the normal-gamma distribution](/P/ng-pdf.html) is given by

$$ \label{eq:ng-pdf}
\begin{split}
p(x,y) &= p(x|y) \cdot p(y) \\
p(x|y) &= \mathcal{N}(x; \mu, (y \Lambda)^{-1}) \\
p(y) &= \mathrm{Gam}(y; a, b) \; .
\end{split}
$$
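
The factorization $p(x,y) = p(x|y) \cdot p(y)$ also suggests a way to draw samples: first $y$ from the gamma distribution, then $x$ given $y$ from the normal distribution. The following is a minimal sketch, assuming NumPy; the function name `sample_normal_gamma` and all parameter values are hypothetical and chosen for illustration only.

```python
import numpy as np

def sample_normal_gamma(mu, Lambda, a, b, size, rng):
    """Draw `size` pairs (x, y) via p(x, y) = p(x | y) p(y)."""
    mu = np.asarray(mu, dtype=float)
    # NumPy parameterizes the gamma by shape and scale, so scale = 1 / rate
    y = rng.gamma(shape=a, scale=1.0 / b, size=size)
    # z ~ N(0, Lambda^{-1}); dividing by sqrt(y) gives covariance (y Lambda)^{-1}
    z = rng.multivariate_normal(np.zeros(mu.size), np.linalg.inv(Lambda), size=size)
    x = mu + z / np.sqrt(y)[:, None]
    return x, y

# Hypothetical parameter values (assumptions)
mu = np.array([0.0, 1.0])
Lambda = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
a, b = 3.0, 2.0

rng = np.random.default_rng(0)
x, y = sample_normal_gamma(mu, Lambda, a, b, size=20000, rng=rng)

print(y.mean())        # sample mean of y, to compare with a / b
print(x.mean(axis=0))  # sample mean of x, to compare with mu
```

The sample mean of `y` can be compared with the gamma mean $a/b$, and the sample mean of `x` with $\mu$, as a rough sanity check on the two marginals.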

<br>
Using the law of marginal probability, the marginal distribution of $y$ can be derived as

$$ \label{eq:ng-marg-y-qed}
\begin{split}
p(y) &= \int p(x,y) \, \mathrm{d}x \\
&= \int \mathcal{N}(x; \mu, (y \Lambda)^{-1}) \, \mathrm{Gam}(y; a, b) \, \mathrm{d}x \\
&= \mathrm{Gam}(y; a, b) \int \mathcal{N}(x; \mu, (y \Lambda)^{-1}) \, \mathrm{d}x \\
&= \mathrm{Gam}(y; a, b)
\end{split}
$$

which is the [probability density function of the gamma distribution](/P/ng-pdf.html) with shape parameter $a$ and rate parameter $b$.

<br>
Using the law of marginal probability, the marginal distribution of $x$ can be derived as

$$ \label{eq:ng-marg-x-qed}
\begin{split}
p(x) &= \int p(x,y) \, \mathrm{d}y \\
&= \int \mathcal{N}(x; \mu, (y \Lambda)^{-1}) \, \mathrm{Gam}(y; a, b) \, \mathrm{d}y \\
&= \int \sqrt{\frac{|y \Lambda|}{(2 \pi)^n}} \, \exp \left[ -\frac{1}{2} (x-\mu)^\mathrm{T} (y \Lambda) (x-\mu) \right] \cdot \frac{b^a}{\Gamma(a)} \, y^{a-1} \exp[-b y] \, \mathrm{d}y \\
&= \int \sqrt{\frac{y^n |\Lambda|}{(2 \pi)^n}} \, \exp \left[ -\frac{1}{2} (x-\mu)^\mathrm{T} (y \Lambda) (x-\mu) \right] \cdot \frac{b^a}{\Gamma(a)} \, y^{a-1} \exp[-b y] \, \mathrm{d}y \\
&= \int \sqrt{\frac{|\Lambda|}{(2 \pi)^n}} \cdot \frac{b^a}{\Gamma(a)} \cdot y^{a+\frac{n}{2}-1} \cdot \exp \left[ -\left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right) y \right] \mathrm{d}y \\
&= \int \sqrt{\frac{|\Lambda|}{(2 \pi)^n}} \cdot \frac{b^a}{\Gamma(a)} \cdot \frac{\Gamma\left( a+\frac{n}{2} \right)}{\left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{a+\frac{n}{2}}} \cdot \mathrm{Gam}\left( y; a+\frac{n}{2}, b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right) \mathrm{d}y \\
&= \sqrt{\frac{|\Lambda|}{(2 \pi)^n}} \cdot \frac{b^a}{\Gamma(a)} \cdot \frac{\Gamma\left( a+\frac{n}{2} \right)}{\left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{a+\frac{n}{2}}} \int \mathrm{Gam}\left( y; a+\frac{n}{2}, b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right) \mathrm{d}y \\
&= \sqrt{\frac{|\Lambda|}{(2 \pi)^n}} \cdot \frac{b^a}{\Gamma(a)} \cdot \frac{\Gamma\left( a+\frac{n}{2} \right)}{\left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{a+\frac{n}{2}}} \\
&= \frac{\sqrt{|\Lambda|}}{(2 \pi)^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot b^a \cdot \left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-\left( a+\frac{n}{2} \right)} \\
&= \frac{\sqrt{|\Lambda|}}{\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( \frac{1}{b} \right)^{-a} \cdot \left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-a} \cdot 2^{-\frac{n}{2}} \cdot \left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-\frac{n}{2}} \\
&= \frac{\sqrt{|\Lambda|}}{\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( 1 + \frac{1}{2b} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-a} \cdot \left( 2b + (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-\frac{n}{2}} \\
&= \frac{\sqrt{|\Lambda|}}{\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( \frac{1}{2a} \right)^{-a} \cdot \left( 2a + (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-a} \cdot \left( \frac{b}{a} \right)^{-\frac{n}{2}} \cdot \left( 2a + (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{n}{2}} \\
&= \frac{\sqrt{\left( \frac{a}{b} \right)^n |\Lambda|}}{(2a)^{-a}\,\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( 2a + (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-a} \cdot \left( 2a + (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{n}{2}} \\
&= \frac{\sqrt{\left( \frac{a}{b} \right)^n |\Lambda|}}{(2a)^{-a}\,\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot (2a)^{-a} \cdot \left( 1 + \frac{1}{2a} (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-a} \cdot (2a)^{-\frac{n}{2}} \cdot \left( 1 + \frac{1}{2a} (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{n}{2}} \\
&= \frac{\sqrt{\left( \frac{a}{b} \right)^n |\Lambda|}}{(2a)^\frac{n}{2}\,\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( 1 + \frac{1}{2a} (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{2a+n}{2}} \\
&= \sqrt{\frac{\left| \frac{a}{b}\,\Lambda \right|}{(2a\,\pi)^n}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( 1 + \frac{1}{2a} (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{2a+n}{2}}
\end{split}
$$

which is the [probability density function of a multivariate t-distribution](/P/mvt-pdf.html) with mean vector $\mu$, shape matrix $\left( \frac{a}{b}\Lambda \right)^{-1}$ and $2a$ degrees of freedom.
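
As a sanity check on the final expression, the integral over $y$ can be approximated numerically and compared with the closed form. The following is a minimal sketch for the scalar case $n = 1$ (so that $\Lambda$ is a positive number, here called `lam`), using only the Python standard library; the function names, the integration settings `upper` and `steps`, and the parameter values are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, precision):
    """Univariate normal density with mean mu and the given precision."""
    return math.sqrt(precision / (2.0 * math.pi)) * math.exp(-0.5 * precision * (x - mu) ** 2)

def gamma_pdf(y, a, b):
    """Gamma density with shape a and rate b, evaluated on the log scale."""
    return math.exp(a * math.log(b) - math.lgamma(a) + (a - 1.0) * math.log(y) - b * y)

def marginal_x_numeric(x, mu, lam, a, b, upper=40.0, steps=100_000):
    """Midpoint-rule approximation of the integral of p(x|y) p(y) over y in (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h
        total += normal_pdf(x, mu, y * lam) * gamma_pdf(y, a, b)
    return total * h

def marginal_x_closed_form(x, mu, lam, a, b):
    """Last line of the derivation with n = 1: t-density with 2a degrees of freedom."""
    prec = (a / b) * lam   # inverse of the shape matrix ((a/b) Lambda)^{-1}
    nu = 2.0 * a           # degrees of freedom
    log_norm = (0.5 * math.log(prec / (nu * math.pi))
                + math.lgamma((nu + 1.0) / 2.0) - math.lgamma(nu / 2.0))
    return math.exp(log_norm) * (1.0 + prec * (x - mu) ** 2 / nu) ** (-(nu + 1.0) / 2.0)

# Hypothetical parameter values (assumptions)
mu, lam, a, b = 0.5, 2.0, 3.0, 2.0
for x in (-1.0, 0.5, 2.0):
    print(x, marginal_x_numeric(x, mu, lam, a, b), marginal_x_closed_form(x, mu, lam, a, b))
```

The two columns should agree up to the discretization error of the midpoint rule; the truncation at `upper` is harmless here because the integrand decays exponentially in $y$.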