**D/covmat-samp.md** (new file, 42 additions)
---
layout: definition
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2021-05-20 07:46:00

title: "Sample covariance matrix"
chapter: "General Theorems"
section: "Probability theory"
topic: "Covariance"
definition: "Sample covariance matrix"

sources:
- authors: "Wikipedia"
year: 2021
title: "Sample mean and covariance"
in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2021-05-20"
url: "https://en.wikipedia.org/wiki/Sample_mean_and_covariance#Definition_of_sample_covariance"

def_id: "D153"
shortcut: "covmat-samp"
username: "JoramSoch"
---


**Definition:** Let $x = \left\lbrace x_1, \ldots, x_n \right\rbrace$ be a [sample](/D/samp) from a [random vector](/D/rvec) $X \in \mathbb{R}^{p \times 1}$. Then, the sample covariance matrix of $x$ is given by

$$ \label{eq:cov-samp}
\hat{\Sigma} = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x}) (x_i - \bar{x})^\mathrm{T}
$$

and the unbiased sample covariance matrix of $x$ is given by

$$ \label{eq:cov-samp-unb}
S = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x}) (x_i - \bar{x})^\mathrm{T}
$$

where $\bar{x}$ is the [sample mean](/D/mean-samp).
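Both estimators can be checked numerically. The following sketch (assuming NumPy is available; the simulated data and variable names are illustrative, not part of the definition) computes $\hat{\Sigma}$ and $S$ for a simulated sample and compares them against `np.cov`:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 1000, 3
x = rng.standard_normal((n, p))     # n draws from a p-dimensional random vector

x_bar = x.mean(axis=0)              # sample mean
dev = x - x_bar                     # deviations from the sample mean

Sigma_hat = (dev.T @ dev) / n       # sample covariance matrix (1/n)
S = (dev.T @ dev) / (n - 1)         # unbiased sample covariance matrix (1/(n-1))

# np.cov uses the n-1 denominator by default; bias=True switches to 1/n
assert np.allclose(S, np.cov(x, rowvar=False))
assert np.allclose(Sigma_hat, np.cov(x, rowvar=False, bias=True))
```

The two estimators differ only by the scalar factor $\frac{n-1}{n}$, so they agree for large $n$.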
**D/nst.md** (new file, 40 additions)
---
layout: definition
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2021-05-20 07:35:00

title: "Non-standardized t-distribution"
chapter: "Probability Distributions"
section: "Univariate continuous distributions"
topic: "t-distribution"
definition: "Non-standardized t-distribution"

sources:
- authors: "Wikipedia"
year: 2021
title: "Student's t-distribution"
in: "Wikipedia, the free encyclopedia"
pages: "retrieved on 2021-05-20"
url: "https://en.wikipedia.org/wiki/Student%27s_t-distribution#Generalized_Student's_t-distribution"

def_id: "D152"
shortcut: "nst"
username: "JoramSoch"
---


**Definition:** Let $X$ be a [random variable](/D/rvar) following a [Student's t-distribution](/D/t) with $\nu$ degrees of freedom. Then, the [random variable](/D/rvar)

$$ \label{eq:Y}
Y = \sigma X + \mu
$$

is said to follow a non-standardized t-distribution with non-centrality $\mu$, scale $\sigma^2$ and degrees of freedom $\nu$:

$$ \label{eq:nct}
Y \sim \mathrm{nst}(\mu, \sigma^2, \nu) \; .
$$
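The location-scale construction in \eqref{eq:Y} can be verified by simulation. This sketch (assuming NumPy and SciPy are available; parameter values are arbitrary) draws $X$ from a Student's t-distribution, transforms it as $Y = \sigma X + \mu$, and compares against SciPy's equivalent `loc`/`scale` parametrization:

```python
import numpy as np
from scipy import stats

mu, sigma, nu = 2.0, 1.5, 7
rng = np.random.default_rng(1)

# Y = sigma * X + mu with X ~ t(nu), per the definition above
x = stats.t.rvs(df=nu, size=100_000, random_state=rng)
y = sigma * x + mu

# SciPy parameterizes the same family via loc (= mu) and scale (= sigma)
dist = stats.t(df=nu, loc=mu, scale=sigma)

# by symmetry, the median of Y equals mu
assert abs(dist.median() - mu) < 1e-9
assert abs(np.median(y) - mu) < 0.05
```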
**I/Table_of_Contents.md** (21 additions, 18 deletions)
&emsp;&ensp; 1.7.4. **[Covariance under independence](/P/cov-ind)** <br>
&emsp;&ensp; 1.7.5. **[Relationship to correlation](/P/cov-corr)** <br>
&emsp;&ensp; 1.7.6. *[Covariance matrix](/D/covmat)* <br>
&emsp;&ensp; 1.7.7. *[Sample covariance matrix](/D/covmat-samp)* <br>
&emsp;&ensp; 1.7.8. **[Covariance matrix and expected values](/P/covmat-mean)** <br>
&emsp;&ensp; 1.7.9. **[Covariance matrix and correlation matrix](/P/covmat-corrmat)** <br>
&emsp;&ensp; 1.7.10. *[Precision matrix](/D/precmat)* <br>
&emsp;&ensp; 1.7.11. **[Precision matrix and correlation matrix](/P/precmat-corrmat)** <br>

1.8. Correlation <br>
&emsp;&ensp; 1.8.1. *[Definition](/D/corr)* <br>
&emsp;&ensp; 3.2.3. **[Relation to standard normal distribution](/P/norm-snorm)** (1) <br>
&emsp;&ensp; 3.2.4. **[Relation to standard normal distribution](/P/norm-snorm2)** (2) <br>
&emsp;&ensp; 3.2.5. **[Relation to standard normal distribution](/P/norm-snorm3)** (3) <br>
&emsp;&ensp; 3.2.6. **[Relationship to chi-squared distribution](/P/norm-chi2)** <br>
&emsp;&ensp; 3.2.7. **[Gaussian integral](/P/norm-gi)** <br>
&emsp;&ensp; 3.2.8. **[Probability density function](/P/norm-pdf)** <br>
&emsp;&ensp; 3.2.9. **[Moment-generating function](/P/norm-mgf)** <br>
&emsp;&ensp; 3.2.10. **[Cumulative distribution function](/P/norm-cdf)** <br>
&emsp;&ensp; 3.2.11. **[Cumulative distribution function without error function](/P/norm-cdfwerf)** <br>
&emsp;&ensp; 3.2.12. **[Quantile function](/P/norm-qf)** <br>
&emsp;&ensp; 3.2.13. **[Mean](/P/norm-mean)** <br>
&emsp;&ensp; 3.2.14. **[Median](/P/norm-med)** <br>
&emsp;&ensp; 3.2.15. **[Mode](/P/norm-mode)** <br>
&emsp;&ensp; 3.2.16. **[Variance](/P/norm-var)** <br>
&emsp;&ensp; 3.2.17. **[Full width at half maximum](/P/norm-fwhm)** <br>
&emsp;&ensp; 3.2.18. **[Differential entropy](/P/norm-dent)** <br>
&emsp;&ensp; 3.2.19. **[Kullback-Leibler divergence](/P/norm-kl)** <br>

3.3. t-distribution <br>
&emsp;&ensp; 3.3.1. *[Definition](/D/t)* <br>
&emsp;&ensp; 3.3.2. *[Non-standardized t-distribution](/D/nst)* <br>
&emsp;&ensp; 3.3.3. **[Relationship to non-central scaled t-distribution](/P/ncst-t)** <br>

3.4. Gamma distribution <br>
&emsp;&ensp; 3.4.1. *[Definition](/D/gam)* <br>
**P/norm-chi2.md** (new file, 142 additions)
---
layout: proof
mathjax: true

author: "Joram Soch"
affiliation: "BCCN Berlin"
e_mail: "joram.soch@bccn-berlin.de"
date: 2021-05-20 10:18:00

title: "Relationship between normal distribution and chi-squared distribution"
chapter: "Probability Distributions"
section: "Univariate continuous distributions"
topic: "Normal distribution"
theorem: "Relationship to chi-squared distribution"

sources:
- authors: "Glen_b"
year: 2014
title: "Why is the sampling distribution of variance a chi-squared distribution?"
in: "StackExchange CrossValidated"
pages: "retrieved on 2021-05-20"
url: "https://stats.stackexchange.com/questions/121662/why-is-the-sampling-distribution-of-variance-a-chi-squared-distribution"
- authors: "Wikipedia"
year: 2021
title: "Cochran's theorem"
in: "Wikipedia, the free encyclopedia"
  pages: "retrieved on 2021-05-20"
url: "https://en.wikipedia.org/wiki/Cochran%27s_theorem#Sample_mean_and_sample_variance"

proof_id: "P233"
shortcut: "norm-chi2"
username: "JoramSoch"
---


**Theorem:** Let $X_1, \ldots, X_n$ be [independent](/D/ind) [random variables](/D/rvar), each following a [normal distribution](/D/norm) with mean $\mu$ and variance $\sigma^2$:

$$ \label{eq:norm}
X_i \sim \mathcal{N}(\mu, \sigma^2) \quad \text{for} \quad i = 1, \ldots, n \; .
$$

Define the [sample mean](/D/mean-samp)

$$ \label{eq:mean-samp}
\bar{X} = \frac{1}{n} \sum_{i=1}^{n} X_i
$$

and the [unbiased sample variance](/D/var-samp)

$$ \label{eq:var-samp}
s^2 = \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2 \; .
$$

Then, the [sampling distribution](/D/dist-samp) of the sample variance is given by a [chi-squared distribution](/D/chi2) with $n-1$ degrees of freedom:

$$ \label{eq:norm-chi2}
V = (n-1) \, \frac{s^2}{\sigma^2} \sim \chi^2(n-1) \; .
$$
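The claimed sampling distribution can be checked by Monte Carlo simulation before working through the proof. This sketch (assuming NumPy is available; sample size and parameters are arbitrary) compares the empirical moments of $V = (n-1)\,s^2/\sigma^2$ with the mean $k$ and variance $2k$ of a $\chi^2(k)$ distribution:

```python
import numpy as np

rng = np.random.default_rng(42)
n, mu, sigma = 10, 5.0, 2.0
reps = 200_000

x = rng.normal(mu, sigma, size=(reps, n))
s2 = x.var(axis=1, ddof=1)          # unbiased sample variance, per draw
v = (n - 1) * s2 / sigma**2         # should follow chi2(n-1)

k = n - 1                           # chi2(k) has mean k and variance 2k
assert abs(v.mean() - k) < 0.1
assert abs(v.var() - 2 * k) < 0.5
```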


**Proof:** Consider the [random variable](/D/rvar) $U_i$ defined as

$$ \label{eq:Ui}
U_i = \frac{X_i - \mu}{\sigma}
$$

which [follows a standard normal distribution](/P/norm-snorm)

$$ \label{eq:norm-snorm}
U_i \sim \mathcal{N}(0,1) \; .
$$

Then, the sum of squared random variables $U_i$ can be rewritten as

$$ \label{eq:sum-Ui2-s1}
\begin{split}
\sum_{i=1}^{n} U_i^2 &= \sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 \\
&= \sum_{i=1}^{n} \left( \frac{(X_i - \bar{X}) + (\bar{X} - \mu)}{\sigma} \right)^2 \\
&= \sum_{i=1}^{n} \frac{(X_i - \bar{X})^2}{\sigma^2} + \sum_{i=1}^{n} \frac{(\bar{X} - \mu)^2}{\sigma^2} + 2 \sum_{i=1}^{n} \frac{(X_i - \bar{X})(\bar{X} - \mu)}{\sigma^2} \\
&= \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 + \sum_{i=1}^{n} \left( \frac{\bar{X} - \mu}{\sigma} \right)^2 + \frac{2 (\bar{X} - \mu)}{\sigma^2} \sum_{i=1}^{n} (X_i - \bar{X}) \; .
\end{split}
$$

Because the following sum is zero

$$ \label{eq:Xi-Xb}
\begin{split}
\sum_{i=1}^{n} (X_i - \bar{X}) &= \sum_{i=1}^{n} X_i - n \bar{X} \\
&= \sum_{i=1}^{n} X_i - n \cdot \frac{1}{n} \sum_{i=1}^{n} X_i \\
&= \sum_{i=1}^{n} X_i - \sum_{i=1}^{n} X_i \\
&= 0 \; ,
\end{split}
$$

the third term disappears, i.e.

$$ \label{eq:sum-Ui2-s2}
\sum_{i=1}^{n} U_i^2 = \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 + \sum_{i=1}^{n} \left( \frac{\bar{X} - \mu}{\sigma} \right)^2 \; .
$$

[Cochran's theorem](/P/snorm-cochran) states that, if a sum of squared [standard normal](/D/snorm) [random variables](/D/rvar) can be written as a sum of quadratic forms

$$ \label{eq:cochran-p1}
\begin{split}
\sum_{i=1}^{n} U_i^2 = \sum_{j=1}^{m} Q_j \quad &\text{where} \quad Q_j = \sum_{k=1}^{n} \sum_{l=1}^{n} U_k B^{(j)}_{kl} U_l \\
&\text{with} \quad \sum_{j=1}^{m} B^{(j)} = I_n \\
&\text{and} \quad r_j = \mathrm{rank}(B^{(j)}) \; ,
\end{split}
$$

then the terms $Q_j$ are [independent](/D/ind) and each term $Q_j$ follows a [chi-squared distribution](/D/chi2) with $r_j$ degrees of freedom:

$$ \label{eq:cochran-p2}
Q_j \sim \chi^2(r_j) \; .
$$

We observe that \eqref{eq:sum-Ui2-s2} can be represented as

$$ \label{eq:sum-Ui2-s3}
\begin{split}
\sum_{i=1}^{n} U_i^2 &= \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 + \sum_{i=1}^{n} \left( \frac{\bar{X} - \mu}{\sigma} \right)^2 \\
&= Q_1 + Q_2 = \sum_{i=1}^{n} \left( U_i - \frac{1}{n} \sum_{j=1}^n U_j \right)^2 + \frac{1}{n} \left( \sum_{i=1}^{n} U_i \right)^2
\end{split}
$$

where, with the $n \times n$ matrix of ones $J_n$, the matrices $B^{(j)}$ are

$$ \label{eq:sum-Ui2-s3-Bj}
B^{(1)} = I_n - \frac{J_n}{n} \quad \text{and} \quad B^{(2)} = \frac{J_n}{n} \; .
$$
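The properties of these two matrices that Cochran's theorem requires can be verified directly. This sketch (assuming NumPy is available; $n = 6$ is an arbitrary example size) checks that $B^{(1)}$ and $B^{(2)}$ sum to the identity and have ranks $n-1$ and $1$:

```python
import numpy as np

n = 6
J = np.ones((n, n))     # n-by-n matrix of ones
I = np.eye(n)

B1 = I - J / n          # centering matrix
B2 = J / n              # averaging matrix

assert np.allclose(B1 + B2, I)                  # B1 + B2 decomposes the identity
assert np.linalg.matrix_rank(B1) == n - 1       # r1 = n - 1
assert np.linalg.matrix_rank(B2) == 1           # r2 = 1
# both are symmetric and idempotent, i.e. orthogonal projections
assert np.allclose(B1 @ B1, B1) and np.allclose(B2 @ B2, B2)
```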

Because all columns of $B^{(2)}$ are identical, it has rank $r_2 = 1$. Because the rows of $B^{(1)}$ sum to zero, its rank is at most $n-1$; and since $B^{(1)}$ is idempotent, its rank equals its trace, $r_1 = \mathrm{tr}(B^{(1)}) = n - 1$. Thus, the conditions of [Cochran's theorem](/P/snorm-cochran) are met and the quadratic form

$$ \label{eq:Q1}
Q_1 = \sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 = (n-1) \, \frac{1}{\sigma^2} \, \frac{1}{n-1} \sum_{i=1}^{n} \left( X_i - \bar{X} \right)^2 = (n-1) \, \frac{s^2}{\sigma^2}
$$

follows a [chi-squared distribution](/D/chi2) with $n-1$ degrees of freedom:

$$ \label{eq:norm-chi2-qed}
(n-1) \, \frac{s^2}{\sigma^2} \sim \chi^2(n-1) \; .
$$