# T-Test

A *t-test* is a statistical test used to compare the means of two groups and determine if the difference between them is statistically significant.

It's commonly used when dealing with small sample sizes and assumes that the data follows a normal distribution.


### Variable Explanation

- $\bar{X}_1$ and $\bar{X}_2$: The sample means of group 1 and group 2, respectively.
- $s^2_1$ and $s^2_2$: The variances (squared standard deviations) of group 1 and group 2, respectively.
- $n_1$ and $n_2$: The sample sizes of group 1 and group 2, respectively.
- $t$: The t-statistic, which measures how many standard errors the observed difference between the group means is from zero.

### Mathematical Formula

The t-statistic for comparing two sample means, using a separate variance estimate for each group, is:

$$
t = \frac{\bar{X}_1 - \bar{X}_2} {\sqrt{(\frac{s^2_1} {n_1}) + (\frac{s^2_2} {n_2})}}
$$

### Formula Breakdown

1. **Difference of means**: $\bar{X}_1 - \bar{X}_2$ calculates the difference between the two sample means.
2. **Standard error**: $\sqrt{(\frac{s^2_1} {n_1}) + (\frac{s^2_2} {n_2})}$ estimates how much the difference between the two sample means would vary from one pair of samples to the next.
3. **t-statistic**: the ratio of the two, which measures how many standard errors the observed difference between the sample means is away from the expected difference (typically zero).
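The three steps above can be sketched in Python as follows. This is a minimal illustration, not a full test procedure: the function name `t_statistic` and the input numbers in the usage line are made up for this sketch, and the function expects variances ($s^2$), not standard deviations.

```python
import math


def t_statistic(mean1, var1, n1, mean2, var2, n2):
    """Two-sample t-statistic with a separate variance term for each group."""
    # Step 1: difference of the sample means
    difference = mean1 - mean2
    # Step 2: standard error of that difference
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    # Step 3: how many standard errors the difference is away from zero
    return difference / standard_error


# Hypothetical inputs chosen so the arithmetic is easy to follow:
# standard error = sqrt(4/4 + 9/9) = sqrt(2), so t = 2 / sqrt(2) = sqrt(2)
print(t_statistic(10, 4, 4, 8, 9, 9))
```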

### Example
Suppose two classes, one of math majors and one of arts/history majors, have both just written a chemistry test, and you want to see whether there is a significant difference in their scores. After gathering a random sample from each class, you perform a t-test. The calculated t-value helps you determine whether the difference in test scores is statistically significant or could plausibly be due to random chance.

**Math Major**:
- Sample mean: $\bar{X}_1$ = 85
- Sample size: $n_1$ = 10
- Standard deviation: $s_1$ = 5

**Arts/History Major**:
- Sample mean: $\bar{X}_2$ = 80
- Sample size: $n_2$ = 12
- Standard deviation: $s_2$ = 6

Using the t-test formula:

$$
t = \frac{85 - 80} {\sqrt{(\frac{5^2} {10}) + (\frac{6^2} {12})}} = \frac{5} {\sqrt{5.5}} \approx 2.13
$$

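The resulting **t-statistic** of about 2.13 means the observed difference between the two sample means is roughly 2.13 standard errors away from zero. To decide whether that difference is statistically significant, the t-statistic has to be compared against the t-distribution with the appropriate degrees of freedom.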
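The arithmetic in this example can be checked with a few lines of Python. The variable names are chosen for this sketch only, and the numbers are the ones given for the two classes above.

```python
import math

# Math majors
mean_1, sd_1, n_1 = 85, 5, 10
# Arts/history majors
mean_2, sd_2, n_2 = 80, 6, 12

# Each group's variance (standard deviation squared) divided by its sample size
term_1 = sd_1**2 / n_1  # 25 / 10 = 2.5
term_2 = sd_2**2 / n_2  # 36 / 12 = 3.0

standard_error = math.sqrt(term_1 + term_2)  # sqrt(5.5)
t_value = (mean_1 - mean_2) / standard_error

print(round(t_value, 2))  # 2.13
```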

## Interpreting the results

### Degrees of Freedom (df) in a T-Test: Basic Information

In the context of a **t-test**, the **degrees of freedom (df)** are the number of independent values in the data that are free to vary once the quantities estimated from the data (here, the sample means) are fixed. They are a critical factor because they determine the shape of the **t-distribution**, especially with small sample sizes. More degrees of freedom give a **t-distribution** closer to a normal distribution, while fewer degrees of freedom give a wider t-distribution with heavier tails.
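To make the "heavier tails" claim concrete, the sketch below evaluates the t-distribution's density far from the center (at $x = 3$) for a few degrees of freedom and compares it with the standard normal density at the same point. The helper names `t_pdf` and `normal_pdf` are introduced here for illustration; the density formula is the standard one for Student's t-distribution, written with `math.lgamma` to avoid overflow for large df.

```python
import math


def t_pdf(x, df):
    """Density of Student's t-distribution with df degrees of freedom at x."""
    log_norm = (
        math.lgamma((df + 1) / 2)
        - math.lgamma(df / 2)
        - 0.5 * math.log(df * math.pi)
    )
    return math.exp(log_norm - (df + 1) / 2 * math.log1p(x * x / df))


def normal_pdf(x):
    """Density of the standard normal distribution at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


# Tail density at x = 3: largest for small df, shrinking toward the normal value
for df in (2, 20, 200):
    print(f"df={df:>3}: {t_pdf(3.0, df):.5f}")
print(f"normal: {normal_pdf(3.0):.5f}")
```

With few degrees of freedom, values far from zero are noticeably more likely than under a normal distribution, which is why a larger t-statistic is needed to reach significance when samples are small.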

### Mathematical Formula for Degrees of Freedom

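The formula for degrees of freedom depends on the type of t-test you're using. When the two groups are not assumed to have equal variances, as with the separate variance terms in the t-statistic formula above, the degrees of freedom are usually approximated with the Welch-Satterthwaite equation. For a **two-sample t-test** that assumes equal variances in both groups, the formula is simpler: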

$$
df = (n_1 - 1) + (n_2 - 1)
$$

### Variable Explanation

- $n_1$: The sample size of group 1.
- $n_2$: The sample size of group 2.
- $df$: The degrees of freedom, which account for how many values are free to vary in estimating population parameters.

### Breakdown of the Formula

1. **$n_1 - 1$**: Group 1 provides $n_1$ independent values. Once the sample mean has been calculated from them, only $n_1 - 1$ of those values remain free to vary: if you know the mean and any $n_1 - 1$ of the values, the last value can be deduced. Group 1 therefore contributes $n_1 - 1$ degrees of freedom.
2. **$n_2 - 1$**: Similarly, for group 2, we subtract 1 from the sample size to reflect that, after calculating the mean, only $n_2 - 1$ data points are free to vary.
3. **Adding the two terms**: Summing the degrees of freedom from both groups gives the total degrees of freedom for a two-sample t-test.
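The breakdown above can be illustrated with a short Python sketch. Both function names and the four test scores are hypothetical, chosen only to show why fixing the mean removes one degree of freedom per group.

```python
def pooled_degrees_of_freedom(n1, n2):
    """Degrees of freedom for a two-sample t-test assuming equal variances."""
    return (n1 - 1) + (n2 - 1)


def recover_last_value(known_values, mean):
    """Deduce the one missing value of a sample from its mean and the other values."""
    n = len(known_values) + 1        # full sample size
    return mean * n - sum(known_values)


# With sample sizes 10 and 12: (10 - 1) + (12 - 1) = 20
print(pooled_degrees_of_freedom(10, 12))  # 20

# Hypothetical sample of 4 scores with mean 86, three of them known:
# the total must be 86 * 4 = 344, so the last score is 344 - 261 = 83
print(recover_last_value([82, 88, 91], 86))  # 83
```

Because the last value is fully determined by the mean and the other values, it carries no independent information, which is the reason each group loses exactly one degree of freedom.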