# t-Tests

z-Tests work when we ***know*** our population parameters $\mu$ and $\sigma$. t-Tests, by contrast, let us work with samples alone when we want to know things like

1. How different a sample mean is from a population mean
1. How different two sample means are from each other
    1. The two samples can either be independent or dependent

Note that the t-test is also referred to as Student's t-test ([Interesting History](https://en.wikipedia.org/wiki/Student%27s_t-test)).

### Sampling Distribution

$S$, the standard deviation of a sample of size $n$, is: $$S=\sqrt{\frac{\sum (x_i-\bar{x})^2}{n-1}}$$

When we know the population parameters $\mu$ and $\sigma$, we can reconstruct the sampling distribution. To find where a sample mean falls on this distribution, we compute the z-score: $$z = \frac{\bar{x}-\mu}{\frac{\sigma}{\sqrt{n}}}$$

where $\frac{\sigma}{\sqrt{n}}$ is the standard error of the mean, $SE$. However, when we don't know the population parameters, the $SE$ must be estimated from $S$, which produces a distribution with a lower peak and heavier tails than the normal distribution. Intuitively, this makes sense: we are more uncertain about the mean when the population standard deviation is unknown. $$t = \frac{\bar{x}-\mu}{\frac{s}{\sqrt{n}}}$$

As our sample size, $n$, gets larger, the t-distribution approaches the normal distribution (its tails get thinner) and $s \rightarrow \sigma$.

### Degrees of Freedom

In the above formula for $S$, we notice that there is an $n-1$ term in the denominator. The variance is the average squared distance of the points from the mean and is a measure of spread. When computing the standard deviation or variance for samples, rather than dividing by the number of sampled items, $n$, as is the case when we compute the standard deviation or variance of a population, we divide by the number of degrees of freedom, $n-1$.
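To make the $n-1$ versus $n$ distinction concrete, here is a minimal sketch that computes both versions by hand and compares them with Python's `statistics` module. The data values are hypothetical and the function names are only illustrative.

```python
import math
import statistics

def sample_std(values):
    """Sample standard deviation: divide the squared deviations by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / (n - 1))

def population_std(values):
    """Population standard deviation: divide the squared deviations by n."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)

data = [4.0, 6.0, 5.0, 7.0, 8.0]  # hypothetical values, for illustration only

# statistics.stdev divides by n - 1; statistics.pstdev divides by n
print(sample_std(data), statistics.stdev(data))
print(population_std(data), statistics.pstdev(data))
```

The sample version is always the larger of the two, because it divides the same sum of squares by a smaller number.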

Degrees of freedom can be defined as the number of independent pieces of information, or alternatively, the number of pieces of information that can be freely varied without violating any restrictions.

##### Example:  Degrees of Freedom in a Summation of $n$ Numbers

You have $n$ numbers that must sum to 10. $$x_1 + x_2 + \dots + x_n = 10$$ How many degrees of freedom do you have? Or in other words, how many of the numbers can you choose freely?

Let's start by imagining $n=4$, and we choose $x_1=13$. Then we have $$x_2 + x_3 + x_4 = -3$$  
Now, let's imagine we choose $x_2=8$. Then, we have $$x_3 + x_4 = -11$$  
Then, imagine we choose $x_3=3$. Then, we have $$x_4=-14$$

Thus, we were able to choose $3$ numbers, but $x_4$ was fixed once we chose the other $3$. In general, we have $n-1$ degrees of freedom for a summation of $n$ numbers.
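The same bookkeeping can be sketched in a few lines of Python. The helper name `last_value` is just an illustrative choice; it shows that once $n-1$ values are picked, the remaining one is forced by the constraint.

```python
def last_value(free_choices, total):
    """Given n - 1 freely chosen numbers, return the one value that is forced."""
    return total - sum(free_choices)

free_choices = [13, 8, 3]           # x_1, x_2, x_3 from the example above
x4 = last_value(free_choices, 10)   # the constraint leaves no choice for x_4
print(x4)
```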


##### Example:  Degrees of Freedom in Marginal Totals

Let's imagine we have a 3x3 table where each column and row must sum to 9. How many degrees of freedom do we have?

| | | | 9 |
| --- | --- | --- | ---|
| | | | 9 |
| | | | 9 |
| 9 | 9 | 9 | |

As you attempt to fill in those numbers, you will see that you can choose 4 numbers before the rest are fixed.

If we move to a 4x4 table ($n=4$), we will find that we have $(n-1)(n-1)=(n-1)^2=9$ degrees of freedom.
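As a rough illustration, the sketch below fills in a table from its freely chosen upper-left block. The function name and the cell values are hypothetical; the point is only that the last column and last row are determined by the marginal totals.

```python
def complete_table(free_cells, total):
    """Fill an n x n table whose rows and columns each sum to `total`,
    given the freely chosen (n-1) x (n-1) upper-left block."""
    # The last entry of each row is forced by the row total.
    rows = [row + [total - sum(row)] for row in free_cells]
    # The whole last row is forced by the column totals.
    last_row = [total - sum(col) for col in zip(*rows)]
    return rows + [last_row]

# Four free choices for a 3x3 table with marginal totals of 9
table = complete_table([[1, 2], [3, 4]], total=9)
for row in table:
    print(row)
```

Only the four numbers passed in were chosen; the other five cells follow from the constraints.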

##### Samples

Now, let's go back to $$S=\sqrt{\frac{\sum (x_i-\bar{x})^2}{n-1}}$$

The mean of the sample is defined as $\bar{x}=\frac{x_1+x_2+\dots+x_n}{n}$ or $\bar{x}\cdot n=x_1+x_2+\dots+x_n$. Using the summation example above as our proof, we see that, for a given mean $\bar{x}$, we have $n-1$ degrees of freedom.

Since $S$ depends on $\bar{x}$, there are only $n-1$ degrees of freedom. Stated alternatively, only $n-1$ values are independent once we have the mean. $n-1$ is called the effective sample size.

Now, thinking about the t-distribution and the law of large numbers, we see that the more degrees of freedom we have, the more closely the t-distribution approximates the normal distribution. Furthermore, by the central limit theorem, the sampling distribution of the mean is approximately normal for large $n$.

### Computing t-statistic

[t-Tables](https://s3.amazonaws.com/udacity-hosted-downloads/t-table.jpg) - Instead of a z-table, we will use a t-table when dealing with t-distributions.

##### Example 1

Imagine we want to know the t-statistic where 10% (proportion = 0.1) of the values lie to the right of it for a sample size of 10. This means we have 9 degrees of freedom. So using the t-table, we see that for $df=9$ and the *Area in Right Tail* of 0.1 the t-statistic is 1.383.

##### Example 2 - One-tailed Test

What is the t-critical value for a one-tailed alpha level of 0.05 (5% of all values lie to the right of this value OR 5% of the area under the curve lies to the right of this value) with 12 degrees of freedom?  
Answer: 1.782

##### Example 3 - Two-tailed Test

You have a sample of size 30. What are the t-critical values for a two-tailed test with $\alpha=0.05$?  
Answer: $\pm 2.045$

##### Example 4 - Obtaining Probabilities from t-statistic

Your sample is size 24 and you get a t-statistic of 2.45. The area to the right of the t-statistic is between ______ and _______?  
Answer: .02 and .01
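The table lookups in the four examples above can be approximated numerically. The sketch below is one possible approach, not how the linked table was produced: it integrates the t density with Simpson's rule and inverts the tail area by bisection, using only the standard library. All function names are illustrative, and the approach assumes $t \ge 0$ and moderate $df$ (`math.gamma` overflows for very large arguments).

```python
import math

def t_pdf(t, df):
    """Density of Student's t-distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + t * t / df) ** (-(df + 1) / 2)

def right_tail_area(t, df, steps=1000):
    """Area to the right of t (for t >= 0), using Simpson's rule on [0, t]."""
    if t == 0:
        return 0.5
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

def t_critical(area, df):
    """Find the t-value with the given right-tail area by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(40):
        mid = (lo + hi) / 2
        if right_tail_area(mid, df) > area:
            lo = mid   # tail still too large, so move right
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.10, 9), 3))        # Example 1: df = 9, right tail 0.10
print(round(t_critical(0.05, 12), 3))       # Example 2: one-tailed, alpha = 0.05
print(round(t_critical(0.025, 29), 3))      # Example 3: two-tailed, 0.025 per tail
print(round(right_tail_area(2.45, 23), 4))  # Example 4: area right of t = 2.45
```

For the two-tailed case, $\alpha$ is split evenly, so the lookup uses $0.025$ in the right tail.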



##### Understanding t-statistic

Remember $t = \frac{\bar{x}-\mu_0}{\frac{s}{\sqrt{n}}}$, where $\bar{x}$ is the mean of a sample drawn from a population with mean $\mu$ and standard deviation $\sigma$. Note: We've used $\mu_0$ in the equation to illustrate that we will typically use the t-test with a null hypothesis $H_0$ and associated population mean $\mu_0$.

##### Question 1

The <u>larger/smaller</u> the value of $\bar{x}$, the stronger the evidence that $\mu \gt \mu_0$.  
Answer: larger

##### Question 2

The <u>larger/smaller</u> the value of $\bar{x}$, the stronger the evidence that $\mu \lt \mu_0$.  
Answer: smaller

##### Question 3

The further the value of $\bar{x}$ from $\mu_0$ in either direction, the <u>stronger/weaker</u> the evidence that $\mu \neq \mu_0$.  
Answer: stronger

### One-sample t-test

$$t = \frac{\bar{x}-\mu_0}{\frac{s}{\sqrt{n}}}$$

$$H_0: \mu=\mu_0$$
$$H_A: \mu \lt \mu_0 \text{ OR } \mu \gt \mu_0 \text{  OR } \mu \neq \mu_0$$

Numerator:  $\bar{x}-\mu_0$ is the difference between the sample mean ($\bar{x}$) and null hypothesis mean ($\mu_0$), where the sample mean is the point estimate for the population mean ($\mu$).

Denominator: $\frac{s}{\sqrt{n}}$ (aka standard error) measures the amount of difference between the sample mean ($\bar{x}$) and null hypothesis mean ($\mu_0$) that we would expect by chance.

$t^*$ is the t-critical value obtained for a defined significance level $\alpha$, e.g., 0.05 (5%), 0.01 (1%), or 0.001 (0.1%). For a two-tailed test, if we obtain a sample that results in a t-statistic less than $-t^*$ or greater than $t^*$, then we reject $H_0$.

##### Example - [dataset](https://docs.google.com/spreadsheets/d/1PMtYHFuOAkKJGQTx5tsqjvbPSbA_kRBKlCD6KEqSAXk/edit#gid=0)

Imagine we have a statistic from some time ago that indicates the average beak width for finches is 6.07mm. We'd like to know: Do finches today have different-sized beak widths than before?

$$H_0: \mu=6.07$$
$$H_a: \mu \neq 6.07$$

Given the dataset, we know $n=500$, and therefore, the number of degrees of freedom, $df$, is $df=n-1=499$.

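Using the dataset to obtain $\bar{x}$ and $s$, we find that $t=22.36$. This is far beyond the two-tailed critical values at any common significance level (for example, $t^* \approx 1.96$ for $\alpha=0.05$ with this many degrees of freedom). Thus, we should reject $H_0$.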

### P-Value

A p-value is the probability, assuming $H_0$ is true, of obtaining a sample statistic at least as extreme as the one observed. Remember, $t^*$ is defined by the desired $\alpha$ level (aka significance level).

For one-tailed tests, the p-value is the area in one tail beyond the sample's t-value. It is significant when $t$ is greater than $t^*$ (if $t$ is positive) or less than $-t^*$ (if $t$ is negative).

For two-tailed tests, the p-value is the combined area in both tails beyond $\pm\lvert t \rvert$. It is significant when $t$ is less than $-t^*$ or greater than $t^*$.

We reject $H_0$ when the p-value is less than the $\alpha$.

##### Example 1

Given a hypothesized population mean $\mu_0 = 10$ and sample $\{5, 19, 11, 23, 12, 7, 3, 21\}$, determine if we should reject $H_0: \mu = 10$ in a two-tailed test for $\alpha=0.05$.

We compute:

$$\bar{x} = 12.625$$
$$s = 7.5958$$
$$t = 0.9775$$

Using our chart, a t-value of $0.977$, and $df=7$, we see that the area in each tail satisfies $.15 \lt \frac{1}{2}p \lt .20$. Since the test is two-tailed, we add both tails and get $.3 \lt p \lt .4$.

To get a more exact solution, we can use a tool called [GraphPad](https://www.graphpad.com/quickcalcs/) (see Continuous Data -> One sample t-test -> Select "Enter mean, SD, and N") to obtain $p=.3609$.

For $\alpha=0.05$, we see that $p=.3609$ is not statistically significant given $p \gt \alpha$. Therefore, we fail to reject $H_0$.
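The three computed values in this example can be reproduced with a short sketch using Python's standard library. The function name `one_sample_t` is illustrative, and the code stops at the t-statistic; the p-value still comes from the table or a calculator as described above.

```python
import math
import statistics

def one_sample_t(sample, mu_0):
    """Return (x_bar, s, t) for a one-sample t-test against mu_0."""
    n = len(sample)
    x_bar = statistics.mean(sample)
    s = statistics.stdev(sample)             # sample std, divides by n - 1
    t = (x_bar - mu_0) / (s / math.sqrt(n))  # difference over standard error
    return x_bar, s, t

sample = [5, 19, 11, 23, 12, 7, 3, 21]
x_bar, s, t = one_sample_t(sample, mu_0=10)
print(round(x_bar, 4), round(s, 4), round(t, 4))
```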


##### Example 2

Let's say in Santa Clara County the mean rent is \$1830. A rental company wants to confirm that the mean rent is accurate and that they are renting their properties for the correct amount.

They take a random sample of $n=25$ units. They obtain: $$\bar{x}=1700$$ $$s=200$$ For $\alpha=0.05$, $H_0: \mu=1830$, $H_a: \mu \neq 1830$, compute the t-critical values ($t^*$ and $-t^*$) and t-value for the sample to determine if we should reject or fail to reject $H_0$.
 
For $df=24$ and $\alpha=0.05$, we find $$t^*=2.064$$ using our t-table. We also find, for our sample, $$t=-3.25$$ Thus, we reject the null hypothesis as $t < -t^*$ ($p \lt \alpha$).
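When only summary statistics are available, as in this rent example, the same test can be sketched as follows. The function names are illustrative, and the critical value $2.064$ is taken from the t-table rather than computed.

```python
import math

def t_from_summary(x_bar, mu_0, s, n):
    """t-statistic from summary statistics: (x_bar - mu_0) / (s / sqrt(n))."""
    return (x_bar - mu_0) / (s / math.sqrt(n))

def reject_two_tailed(t, t_crit):
    """Two-tailed decision: reject H0 when |t| exceeds the critical value."""
    return abs(t) > t_crit

t = t_from_summary(x_bar=1700, mu_0=1830, s=200, n=25)
print(t)                             # t-value for the rent sample
print(reject_two_tailed(t, 2.064))   # True means we reject H0
```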

### Cohen's d - Effect Size Measure

Cohen's d is a standardized mean difference that measures the distance between two means in standard deviation units. Instead of dividing by the standard error $\frac{s}{\sqrt{n}}$, we divide by $s$. $$\text{Cohen's d} = \frac{\bar{x}-\mu_0}{s}$$

##### Example

Using the rent example above, we find that $\text{Cohen's d}=-0.65$.
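Written out with the values from the rent example (sample mean $1700$, comparison mean $1830$, and $s=200$), the arithmetic is:

$$\text{Cohen's d} = \frac{1700-1830}{200} = \frac{-130}{200} = -0.65$$

In other words, the sample mean sits $0.65$ sample standard deviations below the hypothesized mean.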

### Confidence Interval (CI)

For $\alpha=0.05$, we'll have a $95$% confidence interval with $2.5$% of values in each tail for a two-tailed test. $$\text{CI} = (\bar{x}-t^*\frac{s}{\sqrt{n}},\bar{x}+t^*\frac{s}{\sqrt{n}})$$ where $$\text{SE} = \frac{s}{\sqrt{n}}$$ is our standard error term.

##### Example

Using the rent example above with $\bar{x}=1700$ and $t^*=2.064$, what are the rent values that define our $95$% confidence interval?

Our confidence interval is then defined by $\text{CI} = (\bar{x}-2.064\frac{s}{\sqrt{n}}, \bar{x}+2.064\frac{s}{\sqrt{n}}) = (1617.44, 1782.56)$.
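A small sketch of the interval calculation, assuming the critical value is looked up separately and passed in; the function name is illustrative.

```python
import math

def confidence_interval(x_bar, s, n, t_crit):
    """Return (lower, upper) = x_bar -/+ t_crit * s / sqrt(n)."""
    margin = t_crit * s / math.sqrt(n)
    return x_bar - margin, x_bar + margin

lower, upper = confidence_interval(x_bar=1700, s=200, n=25, t_crit=2.064)
print(round(lower, 2), round(upper, 2))
```

Note that the hypothesized mean of 1830 lies outside this interval, which agrees with the decision to reject $H_0$ in the two-tailed test.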

### Margin of Error

In the above computation for our confidence interval, note that we added and subtracted the same term from $\bar{x}$. This term is known as the *margin of error* and is defined as $$\text{Margin of Error} = t^*\frac{s}{\sqrt{n}}$$

##### Example

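Sticking with the rent example, what happens if we increase $n$ to $n=100$? $$t^*=1.984$$ $$\text{SE}=20$$ $$\text{Margin of Error} = t^*\frac{s}{\sqrt{n}} = 39.68$$ The margin of error shrinks from $82.56$ (for $n=25$) to $39.68$, so a larger sample gives a narrower confidence interval.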

### Dependent t-tests

What if we want to determine the difference in paired samples where the same subject takes the test twice? Some examples include:

##### Within-subject Designs:

* Repeated Measures Design
    * $H_0: \mu_1 = \mu_2$
    * Subjects are assigned two conditions in random order, e.g., control and treatment conditions or two treatments being tested
    * Example: Errors on two types of keyboards (below)
* Pre-Test, Post-Test
    * $H_0: \mu_{\text{pre}} = \mu_{\text{post}}$
    * Take a measure of a random variable, implement a test/treatment, then measure the same random variable for the same sample again
* Growth over time (longitudinal study)
    * $H_0: \mu_{\text{time1}} = \mu_{\text{time2}}$
    * Measure a variable at two points in time

These sorts of study designs result in paired data where we usually want to compute the difference between the values.

| $x_i$ | $y_i$ | $D_i = x_i-y_i$ |
| ---- | ---- | ---- |
| $x_1$ | $y_1$ | $D_1 = x_1 - y_1$ |
| $x_2$ | $y_2$ | $D_2 = x_2 - y_2$ |
| $\dots$ | $\dots$ | $\dots$|
| $x_n$ | $y_n$ | $D_n = x_n - y_n$ |

$x_i$ could refer to the first treatment, pre-test, or initial measurement. Whereas $y_i$ could represent the second treatment, post-test, or measurement later in time.

##### Why use Dependent t-tests/samples

* Advantages
    * Controls for individual differences
        * Nuisance variations between subjects are minimized when using the same subject
    * Fewer subjects needed
    * Less expensive
    * Less time-consuming
* Disadvantages
    * Carry-over effects
        * Second measurement can be affected by the first treatment
        * Example: If a treatment concerns student learning, comparing students after two separate treatments may be difficult if they've already learned material after the first treatment.
    * Order may influence results
        * Example: A treatment that requires two pills but taking the first causes an interaction with the second pill.

The disadvantages of dependent samples (within-subject designs) are the advantages of independent samples (between-subject designs).

##### Example

Imagine we design two keyboards, one in QWERTY order and the other in alphabetical order. We want to determine the effect on the number of mistakes individuals make when typing. (See [keyboard data](/workspaces/Career-Change-Preparation/Udacity/Statistics/Data/Keyboards-Lesson-10.xlsx))

We take a sample with $n=25$ where individuals are assigned both keyboards in random order.

$$H_0: \mu_Q = \mu_A$$
$$H_a: \mu_Q \neq \mu_A$$

Using the [keyboard data](/workspaces/Career-Change-Preparation/Udacity/Statistics/Data/Keyboards-Lesson-10.xlsx), we find $\bar{x}_Q = 5.08$ and $\bar{x}_A=7.8$. Thus, the sample mean difference is $$\bar{x}_D = \bar{x}_Q - \bar{x}_A = -2.72$$ The standard deviation of the paired differences $D_i = q_i - a_i$ is $$s_D=\sqrt{\frac{\sum (D_i- \bar{x}_D)^2}{n-1}} \approx 3.69$$ and the t-statistic is $$t = \frac{\bar{x}_D}{\frac{s_D}{\sqrt{n}}} = -3.69$$

For $\alpha=0.05$, what are the t-critical values for a two-tailed test? Using the lookup table with $df=24$, we find $t^*=2.064$ and $-t^*=-2.064$. (2.5% of the area under the curve lies in each tail beyond these t-critical values.)

Thus, we reject $H_0$ as our computed t-value is statistically significant, $t \lt -t^*$.

Let's compute $\text{Cohen's d}$ now, using the sample mean difference $\bar{x}_D = -2.72$. $$\text{Cohen's d} = \frac{\bar{x}_D}{s_D} = -0.7371$$

Now compute the CI. $$\text{CI} = \bar{x}_D \pm t^*\frac{s_D}{\sqrt{n}}= (-4.2432, -1.1968)$$

##### Rewriting the Equations

Note that we could have written the equations above as: $$H_0: \mu_D = 0$$ $$H_A:\mu_D\neq 0$$ $$t=\frac{\bar{x}_D-0}{\frac{s_D}{\sqrt{n}}}$$ where the population mean difference is $\mu_D = \mu_Q-\mu_A$ and the sample mean difference is $\bar{x}_D = \bar{x}_Q-\bar{x}_A$.

This form lets us replace the $0$ in $H_0$ with some other hypothesized value for our particular question. Sometimes the null hypothesis is not that the means are the same ($\mu_D=0$) but that they differ by some predetermined amount, e.g., an amount specified in a study ($\mu_D=6$).
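A minimal sketch of the dependent-samples calculation in this general form follows. The data are hypothetical error counts, not the keyboard dataset, and the function name `paired_t` and its `mu_d0` parameter are illustrative.

```python
import math
import statistics

def paired_t(x, y, mu_d0=0.0):
    """Dependent-samples t-test pieces from paired observations.

    mu_d0 is the hypothesized mean difference under H0 (0 by default).
    """
    diffs = [xi - yi for xi, yi in zip(x, y)]    # signed differences D_i
    n = len(diffs)
    d_bar = statistics.mean(diffs)               # sample mean difference
    s_d = statistics.stdev(diffs)                # std of the differences (n - 1)
    t = (d_bar - mu_d0) / (s_d / math.sqrt(n))
    return d_bar, s_d, t

# Hypothetical paired measurements for five subjects
x = [6, 4, 7, 5, 3]
y = [8, 7, 7, 9, 6]
d_bar, s_d, t = paired_t(x, y)
print(d_bar, round(s_d, 4), round(t, 4))
```

The standard deviation is taken over the differences themselves, which is what distinguishes this from the independent-samples case, and the degrees of freedom are $n-1$ where $n$ is the number of pairs.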

### Independent t-tests

##### Independent t-tests/samples

* Advantages
    * No carry-over effects
    * No concerns for order of treatment
        * Each subject only receives one treatment
* Disadvantages
    * Many subjects
        * Must control for individual differences between groups
    * More time
    * More expensive

Two types
* Experimental
    * Give treatments to subjects
* Observational
    * Observe and compare the characteristics of two populations

##### Remember

$$H_0: \mu_1 - \mu_2 = 0$$
$$H_A: \mu_1 - \mu_2 \neq 0 \text{ OR } \mu_1 - \mu_2 \gt 0 \text{ OR } \mu_1 - \mu_2 \lt 0$$
$$t = \frac{\bar{x}_1-\bar{x}_2}{\text{SE}}$$

Reject $H_0$ if $p \lt \alpha$  
Fail to reject $H_0$ if $p \gt \alpha$

### Independent Samples' Statistics

##### Example
For two independent, normally distributed populations, we have
$$\mathcal{N}(\mu_1, \sigma_1) - \mathcal{N}(\mu_2, \sigma_2) = \mathcal{N}(\mu_1-\mu_2, \sqrt{\sigma_1^2 + \sigma_2^2})$$
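This identity can be checked empirically with a quick simulation. The sketch below uses hypothetical population parameters and a fixed random seed; it draws independent values from each population and looks at the mean and standard deviation of their differences.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

mu1, sigma1 = 10.0, 3.0   # hypothetical population 1
mu2, sigma2 = 4.0, 4.0    # hypothetical population 2

# Differences of independent draws from the two populations
diffs = [random.gauss(mu1, sigma1) - random.gauss(mu2, sigma2)
         for _ in range(50_000)]

print(statistics.fmean(diffs))   # should be close to mu1 - mu2 = 6
print(statistics.stdev(diffs))   # should be close to sqrt(3**2 + 4**2) = 5
```

The variances add even though the means subtract, which is why the standard error for independent samples combines $s_1^2$ and $s_2^2$ with a plus sign.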

##### t-Statistic

$$t=\frac{(\bar{x}_1-\bar{x}_2)-(\mu_1-\mu_2)}{\text{SE}}$$

##### Standard Error
For independent samples of the same size $n$, we have
$$\text{SE} = \frac{s}{\sqrt{n}} = \frac{\sqrt{s_1^2 + s_2^2}}{\sqrt{n}} = \sqrt{\frac{s_1^2 + s_2^2}{n}}$$
For independent samples with different sample sizes $n_1$ and $n_2$, we have
 $$\text{SE} = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

##### Degrees of Freedom

$$df = (n_1-1) + (n_2-1) = n_1+n_2-2$$
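Putting the independent-samples formulas above together, a minimal sketch might look like this. It uses the unequal-sample-size standard error and $df = n_1+n_2-2$ exactly as written in these notes, assumes $H_0: \mu_1-\mu_2=0$, and runs on hypothetical data; the function name is illustrative.

```python
import math
import statistics

def independent_t(sample1, sample2):
    """Return (t, se, df) for two independent samples under H0: mu1 - mu2 = 0."""
    n1, n2 = len(sample1), len(sample2)
    var1 = statistics.variance(sample1)      # s_1^2, divides by n1 - 1
    var2 = statistics.variance(sample2)      # s_2^2, divides by n2 - 1
    se = math.sqrt(var1 / n1 + var2 / n2)    # standard error of the difference
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / se
    df = n1 + n2 - 2
    return t, se, df

# Hypothetical measurements for two separate groups
group1 = [5, 6, 7, 8, 9]
group2 = [1, 3, 5, 7]
t, se, df = independent_t(group1, group2)
print(round(t, 4), round(se, 4), df)
```

The resulting `t` would then be compared against the t-critical value for `df` degrees of freedom, as in the one-sample and dependent cases.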