StatProofBook · Jan 29, 2020
diff --git a/I/Table_of_Contents.md b/I/Table_of_Contents.md
@@ -73,10 +73,12 @@ Proofs by **[Number](/I/Proof_by_Number.html)** and **[Topic](/I/Proof_by_Topic.
    &emsp;&ensp; 4.1.1. *[Definition](/D/mvn.html)* <br>
    &emsp;&ensp; 4.1.2. **[Probability density function](/P/mvn-pdf.html)** <br>
    &emsp;&ensp; 4.1.3. **[Linear transformation theorem](/P/mvn-ltt.html)** <br>
+   &emsp;&ensp; 4.1.4. **[Marginal distributions](/P/mvn-marg.html)** <br>
 
    4.2. Normal-gamma distribution <br>
    &emsp;&ensp; 4.2.1. *[Definition](/D/ng.html)* <br>
-   &emsp;&ensp; 4.2.2. **[Kullback-Leibler divergence](/P/ng-kl.html)** <br>
+   &emsp;&ensp; 4.2.2. **[Marginal distributions](/P/ng-marg.html)** <br>
+   &emsp;&ensp; 4.2.3. **[Kullback-Leibler divergence](/P/ng-kl.html)** <br>
 
 5. Matrix-variate continuous distributions
 

diff --git a/P/mvn-marg.md b/P/mvn-marg.md
@@ -0,0 +1,51 @@
+---
+layout: proof
+mathjax: true
+
+author: "Joram Soch"
+affiliation: "BCCN Berlin"
+e_mail: "joram.soch@bccn-berlin.de"
+date: 2020-01-29 15:12:00
+
+title: "Marginal distributions of the multivariate normal distribution"
+chapter: "Probability Distributions"
+section: "Multivariate continuous distributions"
+topic: "Multivariate normal distribution"
+theorem: "Marginal distributions"
+
+sources:
+
+proof_id: "P35"
+shortcut: "mvn-marg"
+username: "JoramSoch"
+---
+
+
+**Theorem:** Let $x$ follow a [multivariate normal distribution](/D/mvn.html):
+
+$$ \label{eq:mvn}
+x \sim \mathcal{N}(\mu, \Sigma) \; .
+$$
+
+Then, the marginal distribution of any subset vector $x_s$ is also a multivariate normal distribution
+
+$$ \label{eq:mvn-marg}
+x_s \sim \mathcal{N}(\mu_s, \Sigma_s)
+$$
+
+where $\mu_s$ drops the irrelevant variables (the ones not in the subset, i.e. marginalized out) from the mean vector $\mu$ and $\Sigma_s$ drops the corresponding rows and columns from the covariance matrix $\Sigma$.
+
+
+**Proof:** Let $x$ be an $n \times 1$ vector and $x_s$ an $m \times 1$ vector. Define an $m \times n$ subset matrix $S$ such that $s_{ij} = 1$, if the $i$-th element in $x_s$ corresponds to the $j$-th element in $x$, and $s_{ij} = 0$ otherwise. Then,
+
+$$ \label{eq:xs}
+x_s = S x
+$$
+
+and we can apply the [linear transformation theorem](/P/mvn-ltt.html) to give
+
+$$ \label{eq:mvn-marg-qed}
+x_s \sim \mathcal{N}(S \mu, S \Sigma S^\mathrm{T}) \; .
+$$
+
+Finally, we see that $S \mu = \mu_s$ and $S \Sigma S^\mathrm{T} = \Sigma_s$.
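The subset-matrix argument in `P/mvn-marg.md` can be sanity-checked numerically. The following is a minimal sketch in plain Python, not part of the commit: the helper names (`subset_matrix`, `matmul`, `transpose`) and the example values for $\mu$ and $\Sigma$ are assumptions chosen for illustration. It builds $S$ for a chosen index subset and confirms that $S \mu$ and $S \Sigma S^\mathrm{T}$ coincide with simply dropping the unused entries, rows and columns.

```python
def subset_matrix(keep, n):
    """Return the m x n matrix S with s_ij = 1 iff the i-th kept element is x_j."""
    return [[1.0 if j == k else 0.0 for j in range(n)] for k in keep]

def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Transpose a matrix given as a list of rows."""
    return [list(col) for col in zip(*A)]

# Hypothetical example parameters (mu as a 4 x 1 column vector).
mu = [[1.0], [2.0], [3.0], [4.0]]
Sigma = [[4.0, 1.0, 0.5, 0.2],
         [1.0, 3.0, 0.3, 0.1],
         [0.5, 0.3, 2.0, 0.4],
         [0.2, 0.1, 0.4, 1.0]]
keep = [0, 2]  # zero-based indices of the variables retained in x_s

S = subset_matrix(keep, len(mu))
mu_s = matmul(S, mu)                                # S mu
Sigma_s = matmul(matmul(S, Sigma), transpose(S))    # S Sigma S^T

# Dropping entries directly, as described in the theorem statement.
mu_drop = [[mu[i][0]] for i in keep]
Sigma_drop = [[Sigma[i][j] for j in keep] for i in keep]

assert mu_s == mu_drop
assert Sigma_s == Sigma_drop
```

Exact equality holds here because every product involves only the factors 0 and 1, so no rounding occurs.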
diff --git a/P/ng-marg.md b/P/ng-marg.md
@@ -0,0 +1,91 @@
+---
+layout: proof
+mathjax: true
+
+author: "Joram Soch"
+affiliation: "BCCN Berlin"
+e_mail: "joram.soch@bccn-berlin.de"
+date: 2020-01-29 21:42:00
+
+title: "Marginal distributions of the normal-gamma distribution"
+chapter: "Probability Distributions"
+section: "Multivariate continuous distributions"
+topic: "Normal-gamma distribution"
+theorem: "Marginal distributions"
+
+sources:
+
+proof_id: "P36"
+shortcut: "ng-marg"
+username: "JoramSoch"
+---
+
+
+**Theorem:** Let $x$ and $y$ follow a [normal-gamma distribution](/D/ng.html):
+
+$$ \label{eq:ng}
+x,y \sim \mathrm{NG}(\mu, \Lambda, a, b) \; .
+$$
+
+Then, the marginal distribution of $y$ is a gamma distribution
+
+$$ \label{eq:ng-marg-y}
+y \sim \mathrm{Gam}(a, b)
+$$
+
+and the marginal distribution of $x$ is a multivariate t-distribution
+
+$$ \label{eq:ng-marg-x}
+x \sim \mathrm{t}\left( \mu, \left(\frac{a}{b} \Lambda \right)^{-1}, 2a \right) \; .
+$$
+
+
+**Proof:** The [probability density function of the normal-gamma distribution](/P/ng-pdf.html) is given by
+
+$$ \label{eq:ng-pdf}
+\begin{split}
+p(x,y) &= p(x|y) \cdot p(y) \\
+p(x|y) &= \mathcal{N}(x; \mu, (y \Lambda)^{-1}) \\
+p(y) &= \mathrm{Gam}(y; a, b) \; .
+\end{split}
+$$
+
+<br>
+Using the law of marginal probability, the marginal distribution of $y$ can be derived as
+
+$$ \label{eq:ng-marg-y-qed}
+\begin{split}
+p(y) &= \int p(x,y) \, \mathrm{d}x \\
+&= \int \mathcal{N}(x; \mu, (y \Lambda)^{-1}) \, \mathrm{Gam}(y; a, b) \, \mathrm{d}x \\
+&= \mathrm{Gam}(y; a, b) \int \mathcal{N}(x; \mu, (y \Lambda)^{-1}) \, \mathrm{d}x \\
+&= \mathrm{Gam}(y; a, b)
+\end{split}
+$$
+
+which is the [probability density function of the gamma distribution](/P/gam-pdf.html) with shape parameter $a$ and rate parameter $b$.
+
+<br>
+Let $n$ denote the dimension of $x$. Using the law of marginal probability, the marginal distribution of $x$ can be derived as
+
+$$ \label{eq:ng-marg-x-qed}
+\begin{split}
+p(x) &= \int p(x,y) \, \mathrm{d}y \\
+&= \int \mathcal{N}(x; \mu, (y \Lambda)^{-1}) \, \mathrm{Gam}(y; a, b) \, \mathrm{d}y \\
+&= \int \sqrt{\frac{|y \Lambda|}{(2 \pi)^n}} \, \exp \left[ -\frac{1}{2} (x-\mu)^\mathrm{T} (y \Lambda) (x-\mu) \right] \cdot \frac{b^a}{\Gamma(a)} \, y^{a-1} \exp[-b y] \, \mathrm{d}y \\
+&= \int \sqrt{\frac{y^n |\Lambda|}{(2 \pi)^n}} \, \exp \left[ -\frac{1}{2} (x-\mu)^\mathrm{T} (y \Lambda) (x-\mu) \right] \cdot \frac{b^a}{\Gamma(a)} \, y^{a-1} \exp[-b y] \, \mathrm{d}y \\
+&= \int \sqrt{\frac{|\Lambda|}{(2 \pi)^n}} \cdot \frac{b^a}{\Gamma(a)} \cdot y^{a+\frac{n}{2}-1} \cdot \exp \left[ -\left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right) y \right] \mathrm{d}y \\
+&= \int \sqrt{\frac{|\Lambda|}{(2 \pi)^n}} \cdot \frac{b^a}{\Gamma(a)} \cdot \frac{\Gamma\left( a+\frac{n}{2} \right)}{\left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{a+\frac{n}{2}}} \cdot \mathrm{Gam}\left( y; a+\frac{n}{2}, b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right) \mathrm{d}y \\
+&= \sqrt{\frac{|\Lambda|}{(2 \pi)^n}} \cdot \frac{b^a}{\Gamma(a)} \cdot \frac{\Gamma\left( a+\frac{n}{2} \right)}{\left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{a+\frac{n}{2}}} \int \mathrm{Gam}\left( y; a+\frac{n}{2}, b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right) \mathrm{d}y \\
+&= \sqrt{\frac{|\Lambda|}{(2 \pi)^n}} \cdot \frac{b^a}{\Gamma(a)} \cdot \frac{\Gamma\left( a+\frac{n}{2} \right)}{\left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{a+\frac{n}{2}}} \\
+&= \frac{\sqrt{|\Lambda|}}{(2 \pi)^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot b^a \cdot \left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-\left( a+\frac{n}{2} \right)} \\
+&= \frac{\sqrt{|\Lambda|}}{\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( \frac{1}{b} \right)^{-a} \cdot \left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-a} \cdot 2^{-\frac{n}{2}} \cdot \left( b + \frac{1}{2} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-\frac{n}{2}} \\
+&= \frac{\sqrt{|\Lambda|}}{\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( 1 + \frac{1}{2b} (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-a} \cdot \left( 2b + (x-\mu)^\mathrm{T} \Lambda (x-\mu) \right)^{-\frac{n}{2}} \\
+&= \frac{\sqrt{|\Lambda|}}{\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( \frac{1}{2a} \right)^{-a} \cdot \left( 2a + (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-a} \cdot \left( \frac{b}{a} \right)^{-\frac{n}{2}} \cdot \left( 2a + (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{n}{2}} \\
+&= \frac{\sqrt{\left( \frac{a}{b} \right)^n |\Lambda|}}{(2a)^{-a}\,\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( 2a + (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-a} \cdot \left( 2a + (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{n}{2}} \\
+&= \frac{\sqrt{\left( \frac{a}{b} \right)^n |\Lambda|}}{(2a)^{-a}\,\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot (2a)^{-a} \cdot \left( 1 + \frac{1}{2a} (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-a} \cdot (2a)^{-\frac{n}{2}} \cdot \left( 1 + \frac{1}{2a} (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{n}{2}} \\
+&= \frac{\sqrt{\left( \frac{a}{b} \right)^n |\Lambda|}}{(2a)^\frac{n}{2}\,\pi^\frac{n}{2}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( 1 + \frac{1}{2a} (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{2a+n}{2}} \\
+&= \sqrt{\frac{\left| \frac{a}{b}\,\Lambda \right|}{(2a\,\pi)^n}} \cdot \frac{\Gamma\left( \frac{2a+n}{2} \right)}{\Gamma\left( \frac{2a}{2} \right)} \cdot \left( 1 + \frac{1}{2a} (x-\mu)^\mathrm{T} \left( \frac{a}{b}\Lambda \right) (x-\mu) \right)^{-\frac{2a+n}{2}} \\
+\end{split}
+$$
+
+which is the [probability density function of a multivariate t-distribution](/P/mvt-pdf.html) with mean vector $\mu$, shape matrix $\left( \frac{a}{b}\Lambda \right)^{-1}$ and $2a$ degrees of freedom.
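The marginalization over $y$ in `P/ng-marg.md` can be checked numerically in the univariate case $n = 1$. The sketch below is not part of the commit. It uses only the Python standard library; the function names, the parameter values and the midpoint-rule settings (`y_max`, `steps`) are assumptions for illustration. It compares a numerical approximation of $\int \mathcal{N}(x; \mu, (y \lambda)^{-1}) \, \mathrm{Gam}(y; a, b) \, \mathrm{d}y$ with the closed-form density from the last line of the derivation.

```python
import math

def normal_pdf(x, mu, prec):
    """Univariate normal density with mean mu and precision prec."""
    return math.sqrt(prec / (2.0 * math.pi)) * math.exp(-0.5 * prec * (x - mu) ** 2)

def gamma_pdf(y, a, b):
    """Gamma density with shape a and rate b, evaluated for y > 0."""
    return math.exp(a * math.log(b) - math.lgamma(a) + (a - 1.0) * math.log(y) - b * y)

def marginal_x_numeric(x, mu, lam, a, b, y_max=60.0, steps=20000):
    """Midpoint-rule approximation of the integral over y of N(x; mu, (y*lam)^-1) * Gam(y; a, b).

    y_max truncates the integration range; it is assumed large enough
    that the gamma tail beyond it is negligible.
    """
    h = y_max / steps
    total = 0.0
    for k in range(steps):
        y = (k + 0.5) * h  # midpoint of the k-th subinterval, always > 0
        total += normal_pdf(x, mu, y * lam) * gamma_pdf(y, a, b)
    return total * h

def marginal_x_closed(x, mu, lam, a, b):
    """Last line of the derivation for n = 1: t-density with 2a degrees of freedom."""
    prec = a / b * lam  # (a/b) * Lambda, the inverse of the shape parameter
    log_norm = (0.5 * math.log(prec / (2.0 * a * math.pi))
                + math.lgamma((2.0 * a + 1.0) / 2.0) - math.lgamma(a))
    kernel = (1.0 + prec * (x - mu) ** 2 / (2.0 * a)) ** (-(2.0 * a + 1.0) / 2.0)
    return math.exp(log_norm) * kernel

# Hypothetical example parameters.
mu, lam, a, b = 1.0, 2.0, 3.0, 2.0
max_diff = max(
    abs(marginal_x_numeric(x, mu, lam, a, b) - marginal_x_closed(x, mu, lam, a, b))
    for x in (-1.0, 0.5, 1.0, 2.5)
)
assert max_diff < 1e-6
```

Agreement at a handful of points does not replace the proof, but it would expose a slip in the normalizing constant such as a misplaced square root.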