# Definition

The Student's *t* distribution, discovered by a [beer guy](https://en.wikipedia.org/wiki/William_Sealy_Gosset), describes how the standardized sample mean of a normal distribution behaves when the variance is unknown and must be estimated from the samples. It is therefore the distribution used to build confidence intervals on that mean. More specifically, if you take a sample of *n* observations from a normal distribution, then the difference between the sample mean and the population mean, divided by the standard error computed from the estimated standard deviation, follows a *t* distribution with *n*–1 degrees of freedom.

Let's unpack that:

1\. Start with a normally distributed random variable *X* with parameters $\mu$ (mean) and $\sigma^2$ (variance):

$\quad X \sim N(\mu, \sigma^2)$

2\. For *n* data points, the sample mean $\bar{X}$ is just:

$\quad\bar{X}=\frac{1}{n}\sum^n_{i=1}X_i$

3\. If you know the variance ($\sigma^2$), then the standard error of the mean (*sem*) is:

$\quad sem=\frac{\sigma}{\sqrt{n}}$

In this case, a good test statistic that quantifies the signal-to-noise ratio is the difference between the sample mean and the actual (population) mean (the signal, in the numerator), standardized by the standard error of the mean (the noise, in the denominator):

$\quad z=\frac{\bar{X}-\mu}{\sigma/\sqrt{n}}$

This quantity (which is a z-score and thus called *z*) has a standard normal distribution (i.e., mean=0, variance=1).
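As a rough check on step 3, the sketch below simulates many samples from a normal distribution with known parameters and computes *z* for each one. The values of `mu`, `sigma`, and `n`, the number of repetitions, and the helper name `z_statistic` are arbitrary choices for illustration and are not part of the definition.

```python
import math
import random
import statistics


def z_statistic(sample, mu, sigma):
    """Return (sample mean - mu) / (sigma / sqrt(n)), with sigma known."""
    n = len(sample)
    sem = sigma / math.sqrt(n)  # standard error of the mean
    return (statistics.fmean(sample) - mu) / sem


# Illustrative (assumed) parameters, not taken from the text
rng = random.Random(0)
mu, sigma, n = 10.0, 2.0, 5

# Draw many independent samples of size n and compute z for each
z_values = [
    z_statistic([rng.gauss(mu, sigma) for _ in range(n)], mu, sigma)
    for _ in range(20000)
]

z_mean = statistics.fmean(z_values)
z_var = statistics.pvariance(z_values)
print(f"mean of z: {z_mean:.3f}, variance of z: {z_var:.3f}")
```

With enough repetitions, the printed mean should be close to 0 and the variance close to 1, whatever values of `mu`, `sigma`, and `n` you pick.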

4\. However, if you do not know the variance ($\sigma^2$), then you need to use the [Bessel-corrected](https://mathworld.wolfram.com/BesselsCorrection.html) sample standard deviation (*S*) to compute the standard error of the mean in the test statistic, now called *t*:

$\quad t=\frac{\bar{X}-\mu}{S/\sqrt{n}}, \text{ where } S=\sqrt{\frac{1}{n-1}\sum^n_{i=1}{(X_i-\bar{X})^2}}$

Note the (*n*–1) term in *S*, which comes from the Bessel correction and equals the **degrees of freedom** of the *t* distribution. Because *S* is itself estimated from the data, the distribution of *t* is slightly different from the distribution of *z*. Specifically, *t* has "heavy tails"; i.e., a higher probability of extreme values. Note that as *n* increases, *t*⟶*z* (they become more and more similar).
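The heavy tails can be seen in a small simulation. The sketch below computes *t* for many simulated samples and counts how often |*t*| exceeds a cutoff, then compares that fraction with the corresponding standard normal probability. The cutoff of 2, the sample sizes (5 and 100), the number of trials, and the helper names `t_statistic` and `tail_fraction` are assumptions made for this example only.

```python
import math
import random
import statistics


def t_statistic(sample, mu):
    """Return (sample mean - mu) / (S / sqrt(n)), with S estimated from the sample."""
    n = len(sample)
    s = statistics.stdev(sample)  # Bessel-corrected: divides by n - 1
    return (statistics.fmean(sample) - mu) / (s / math.sqrt(n))


def tail_fraction(n, cutoff=2.0, trials=10000, seed=0):
    """Estimate P(|t| > cutoff) by simulating samples of size n from N(0, 1)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample, 0.0)) > cutoff:
            hits += 1
    return hits / trials


# Two-sided tail probability of the standard normal at the same cutoff
normal_tail = 2 * (1 - statistics.NormalDist().cdf(2.0))

small_n_tail = tail_fraction(5)    # n - 1 = 4 degrees of freedom
large_n_tail = tail_fraction(100)  # n - 1 = 99 degrees of freedom

print(f"P(|z| > 2), standard normal: {normal_tail:.4f}")
print(f"P(|t| > 2), n = 5:           {small_n_tail:.4f}")
print(f"P(|t| > 2), n = 100:         {large_n_tail:.4f}")
```

For the small sample, extreme values of *t* should turn up noticeably more often than the normal distribution predicts, while for the large sample the two fractions should be close.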




# Additional Resources

Working with the *t* distribution in [Matlab](https://www.mathworks.com/help/stats/students-t-distribution.html), [R](https://stat.ethz.ch/R-manual/R-devel/library/stats/html/TDist.html), and [Python](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.t.html).

# Credits

Copyright 2021 by Joshua I. Gold, University of Pennsylvania