# Paired sample t-test

##  Theory

- $X_i$ and $Y_i$ are paired, and correlated
    - $Cov(X_i, Y_i) = \sigma_{XY}$
    
    
- Independence across pairs:
    - $E(X_i - Y_i) = E(\bar X - \bar Y) = \mu_X - \mu_Y$
    - $Var(\bar X - \bar Y) = \frac{1}{N}[\sigma^2_X+\sigma^2_Y - 2 \rho\sigma_X \sigma_Y]$, where $\rho$ is correlation coefficient between $X$ and $Y$.


- Normal assumption:
    - $D = X-Y \sim N(.,.)$
    - $\bar D \sim N(.,.)$


- Under a large sample size:
    - $D = X-Y$ doesn't have to be normal
    - $\bar D \xrightarrow{N\to\infty} N(.,.)$


- Test statistic (this is a **one-sample t-test** applied to the differences $D_i$):
    - $$t = \frac{\bar D - \mu_D}{s_{\bar D}} \sim t\ (df=N-1)$$
    </br>
    - $${s_{\bar D}} = \frac{s_D}{\sqrt{N}} = \frac{\sqrt{\frac{1}{N-1}\sum(D_i-\bar D)^2}}{\sqrt{N}}$$

**Comparison with independent sample t-test**
- When $\sigma_X = \sigma_Y$:
    - *Independent*: $Var(\bar X - \bar Y) = 2\sigma^2/N$
    - *Paired*:  $Var(\bar X - \bar Y) = 2\sigma^2(1-\rho)/N$
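The variance comparison above can be checked by simulation. The sketch below is only an illustration: the values of `sigma`, `rho`, `N`, the seed, and the number of simulated datasets are arbitrary choices, not values from these notes.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

sigma, rho, N = 1.0, 0.8, 20    # illustrative values (assumptions)
n_sims = 20_000

# Covariance matrix of one (X_i, Y_i) pair with equal variances
cov = sigma**2 * np.array([[1.0, rho], [rho, 1.0]])

# Shape (n_sims, N, 2): n_sims datasets, each with N correlated pairs
pairs = rng.multivariate_normal([0.0, 0.0], cov, size=(n_sims, N))
mean_diff = pairs[..., 0].mean(axis=1) - pairs[..., 1].mean(axis=1)

simulated_var = mean_diff.var(ddof=1)
paired_var = 2 * sigma**2 * (1 - rho) / N   # paired formula
independent_var = 2 * sigma**2 / N          # independent-sample formula

print(simulated_var, paired_var, independent_var)
```

With a positive $\rho$, the simulated variance should sit near the paired formula and well below the independent-sample value, which is why pairing increases power.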

**Non-parametric methods**
- Wilcoxon signed-rank test
- Assumption under $H_0$: the differences between pairs are symmetric around zero, so positive and negative signs are equally likely
- Robust to outliers
- No need for normal assumptions
- Good for small sample size
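As a rough sketch, the signed-rank test can be run with `scipy.stats.wilcoxon`. The data are the same differences `D` used in the numerical example of this section. Because `D` contains tied values, SciPy may fall back to a normal approximation depending on its version, so the exact p-value is not quoted here.

```python
from scipy.stats import wilcoxon

# Paired differences D_i = X_i - Y_i (same data as the numerical example)
D = [2, 4, 10, 12, 16, 15, 4, 27, 9, -1, 15]

# Two-sided test of H0: the differences are symmetric around zero
res = wilcoxon(D)
print(res.statistic, res.pvalue)
```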


## Numerical Example

In [2]:
import numpy as np

In [3]:
D = [2,4,10,12,16,15,4,27,9,-1,15]
N = len(D)

In [4]:
D_bar = np.mean(D)
D_bar

10.272727272727273

In [5]:
s_D = np.std(D, ddof=1)
s_D

7.976100664998018

In [6]:
s_D_bar = s_D / np.sqrt(N)
s_D_bar

2.404884835991147

In [7]:
T = D_bar / s_D_bar
T

4.271608818429545

In [8]:
from scipy.stats import t
# Two-sided p-value: algebraically 2 * (1 - cdf(T)), valid here because T > 0
p = 1 - (t.cdf(T, df = N-1)-0.5) * 2
p

0.001632849921999746

**Conclusion:** $p < 0.05$, so the mean difference is significantly different from zero at the 5% level.
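As a cross-check, the manual steps above should agree with `scipy.stats.ttest_1samp` applied to the differences. This is a minimal sketch that assumes the null value $\mu_D = 0$, as in the calculation above.

```python
from scipy.stats import ttest_1samp

D = [2, 4, 10, 12, 16, 15, 4, 27, 9, -1, 15]

# One-sample t-test on the differences, H0: mu_D = 0 (two-sided by default)
res = ttest_1samp(D, popmean=0)
print(res.statistic, res.pvalue)
```

If the raw paired measurements were available instead of the differences, `scipy.stats.ttest_rel(x, y)` would give the same result.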

# ANOVA test

## Theory

**Assumption**:
$$Y_{ij} = \mu + \alpha_i + e_{ij}, \quad i^{th}\ \text{treatment},\ j^{th}\ \text{observation}$$

- $e_{ij} \sim N(0, \sigma^2)$,  iid
- Constraint: $\sum \alpha_i=0$
- $H_0: \alpha_i=0$ for each $i$


- Decompose the variation into within-group and between-group sums of squares:
    - $SS_W = \sum_i\sum_j(Y_{ij}-\bar Y_{i.})^2$
    - $SS_B = J \sum_i(\bar Y_{..}-\bar Y_{i.})^2$


- Under normal distribution:
    $$SS_W/\sigma^2 \sim \chi^2[I(J-1)]$$


- Under $H_0$:
    $$SS_B/\sigma^2 \sim \chi^2(I-1)$$


- When two $\chi^2$ variables (with $a$ and $b$ degrees of freedom) are **independent**
$$\frac{\chi^2_a/a}{\chi^2_b/b} \sim F(a,b)$$

    
- Under normal distribution and $H_0$
$$F = \frac{SS_B/(I-1)}{SS_W/[I(J-1)]} \sim F[I-1, I(J-1)]$$
    - Under $H_0$, $E(\text{numerator}) = E(\text{denominator}) = \sigma^2$, so $F$ should be close to 1
    - Under $H_1$, when some $\alpha_i \neq 0$, $E(\text{numerator}) > \sigma^2$, so $F$ tends to be larger than 1
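The F-statistic above can be computed by hand and compared with `scipy.stats.f_oneway`. The sketch below uses simulated data; the group effects, group sizes, and seed are made-up values for illustration.

```python
import numpy as np
from scipy.stats import f, f_oneway

rng = np.random.default_rng(1)        # arbitrary seed
I, J = 3, 8                           # I treatments, J observations each (assumed)
alpha = np.array([-1.0, 0.0, 1.0])    # hypothetical treatment effects, sum to zero

# Y[i, j] = mu + alpha_i + e_ij with e_ij ~ N(0, 1)
Y = 5.0 + alpha[:, None] + rng.normal(0.0, 1.0, size=(I, J))

group_means = Y.mean(axis=1)
grand_mean = Y.mean()

SS_W = ((Y - group_means[:, None]) ** 2).sum()       # within-group
SS_B = J * ((group_means - grand_mean) ** 2).sum()   # between-group

F = (SS_B / (I - 1)) / (SS_W / (I * (J - 1)))
p = f.sf(F, I - 1, I * (J - 1))       # upper-tail probability

# Same test via SciPy; each row of Y is one group
F_scipy, p_scipy = f_oneway(*Y)
print(F, p, F_scipy, p_scipy)
```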


<img src="https://ecstep.com/wp-content/uploads/2017/12/F-distribution-2.png" width="400">

**Violation of assumptions**:
- Independence: should not be violated
- Normality: the F-test is still approximately valid for non-normal data when the sample is large
- Non-constant variance: the F-test is still approximately valid when sample sizes are equal across groups

**Another perspective of looking at ANOVA**:
- F-test for a set of parameters, comparing a full (more complicated) model against a reduced (simpler) model

**Example**
<img src="./fig/anova.png" width="800">

## Multiple Comparison

- Goal: control Type-I error


**How to define combined error rate**
- $H_0 = H_{01} \cap H_{02} \cap \cdots \cap H_{0K}$
- Option 1: Experiment-wise / family-wise error
    - Reject one or more of $H_{0i}$ while all $H_{0i}$ are true
    
    
- Option 2: False Discovery Rate (FDR)
    - Expected value of $\frac{\text{Number of Type-I errors}}{\text{Number of rejections of } H_0}$
    - e.g., a 0.05 FDR means that, on average, we allow one incorrect **Rejection** for every 19 correct **Rejections**
    - If the global $H_0$ is true (every $H_{0i}$ holds), then all discoveries are false

**Bonferroni Correction**
- Instead of $t_{df}(\alpha)$, use $t_{df}(\frac{\alpha}{M})$
    - $M$ is number of comparisons
    - Controls family-wise error at $\alpha$


- Completely general: it applies to any set of $M$ inferences, not only to multiple comparisons following ANOVA
    - Feature - Different features of a product
    - Drug - Different symptoms of a disease


- "The Bonferroni method would require p-values to be smaller than .05/100000 to declare significance. Since adjacent voxels tend to be highly correlated, this threshold is generally too stringent."
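To make the Bonferroni correction concrete, the sketch below first computes the family-wise error rate for $M$ tests under the simplifying assumption that the tests are independent and all nulls are true, then applies the $\alpha/M$ threshold to a set of made-up p-values.

```python
import numpy as np

alpha, M = 0.05, 10   # illustrative choices

# Family-wise error rate for M independent tests when every H_0i is true
fwer_uncorrected = 1 - (1 - alpha) ** M
fwer_bonferroni = 1 - (1 - alpha / M) ** M
print(fwer_uncorrected, fwer_bonferroni)

# Hypothetical p-values from M comparisons (not real data)
p_values = np.array([0.001, 0.004, 0.012, 0.03, 0.2, 0.5, 0.7, 0.04, 0.9, 0.06])

# Bonferroni: reject only when p < alpha / M
reject = p_values < alpha / M
print(reject.sum(), "of", M, "comparisons rejected")
```

Without correction the chance of at least one false rejection is far above $\alpha$, while the Bonferroni threshold keeps it just below $\alpha$ in this independent case.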

**Protected LSD (Least Significant Difference) for multiple comparisons**
- Use the ANOVA F-test first, and proceed only if it is significant
- Use t-tests as usual for **a few** planned comparisons

**Tukey’s (“honestly significant difference” or “HSD”) for multiple comparisons**
- Specifically for multiple comparisons after an ANOVA test
- Based on the distribution of the range between the largest and smallest sample means (the studentized range distribution $q(g, v)$)
    - $g$ is the number of groups
    - $v$ is the degrees of freedom for error


- Test statistic
$$Q = \max_i \frac{\bar y_{i.}}{\sqrt{MS_E/n}} - \min_j \frac{\bar y_{j.}}{\sqrt{MS_E/n}} \sim q(g, v)$$

- When sample sizes differ: <img src="https://wikimedia.org/api/rest_v1/media/math/render/svg/9d01f8ffa3951483d2acdc9a9b715377901bebdf" width="300">
- When sample sizes are the same:
$$\bar y_{i.} - \bar y_{j.} \pm \frac {q(\alpha, k, N-k)}{\sqrt{2}} \hat \sigma_{\epsilon} \sqrt{\frac{1}{n} +\frac{1}{n}}$$

- Compare with Bonferroni, where $K$ is the number of pairwise comparisons:
$$\bar y_{i.} - \bar y_{j.} \pm t_{\alpha/(2K),\,v}\ \hat \sigma_{\epsilon} \sqrt{\frac{1}{n} +\frac{1}{n}}$$
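The two interval half-widths above can be compared numerically. This is a sketch under assumed values: the number of groups `k`, the per-group sample size `n`, and `sigma_hat` are made up, and `scipy.stats.studentized_range` requires SciPy 1.7 or later.

```python
import numpy as np
from scipy.stats import studentized_range, t

alpha = 0.05
k, n = 4, 10              # hypothetical: 4 groups, 10 observations each
N = k * n
v = N - k                 # error degrees of freedom
K = k * (k - 1) // 2      # number of pairwise comparisons
sigma_hat = 1.0           # assumed residual standard deviation

se_diff = sigma_hat * np.sqrt(1 / n + 1 / n)

# Tukey HSD half-width: q(alpha, k, N-k) / sqrt(2) * se_diff
tukey_half = studentized_range.ppf(1 - alpha, k, v) / np.sqrt(2) * se_diff

# Bonferroni half-width: t_{alpha/(2K), v} * se_diff
bonf_half = t.ppf(1 - alpha / (2 * K), v) * se_diff

print(tukey_half, bonf_half)
```

For all pairwise comparisons in a balanced design like this one, the Tukey interval comes out narrower than the Bonferroni interval.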

**To be added**
- http://www2.hawaii.edu/~taylor/z631/multcomp.pdf
- http://www.stat.cmu.edu/~genovese/talks/hannover1-04.pdf
- http://personal.psu.edu/abs12//stat460/Lecture/lec10.pdf

## Two-factor ANOVA

**Assumption**
$$Y_{ijk} = \mu + \alpha_i + \beta_j + \delta_{ij} + e_{ijk}$$
**Sum-of-squares decomposition**
$$SS=SS_A+SS_B+SS_{AB}+SS_E$$
**Four $\chi^2$ Distributions** (the first three hold under the corresponding null hypotheses)
$$SS_A/\sigma^2 \sim \chi^2(I-1)$$
$$SS_B/\sigma^2 \sim \chi^2(J-1)$$
$$SS_{AB}/\sigma^2 \sim \chi^2[(I-1)(J-1)]$$
$$SS_E/\sigma^2 \sim \chi^2[IJ(K-1)]$$
**Three F-statistics**
$$F=\frac{MS_?}{MS_E} = \frac{{SS}_?/{df}_?}{SS_E/[IJ(K-1)]} \sim F[df_?, IJ(K-1)]$$
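The decomposition and the three F-statistics can be sketched with NumPy for a balanced design. The data are simulated with no true effects, and the dimensions and seed are arbitrary assumptions.

```python
import numpy as np
from scipy.stats import f

rng = np.random.default_rng(2)               # arbitrary seed
I, J, K = 2, 3, 5                            # levels of A, levels of B, replicates (assumed)
Y = rng.normal(10.0, 1.0, size=(I, J, K))    # Y[i, j, k], simulated under all nulls

grand = Y.mean()
mean_A = Y.mean(axis=(1, 2))                 # shape (I,)
mean_B = Y.mean(axis=(0, 2))                 # shape (J,)
mean_AB = Y.mean(axis=2)                     # shape (I, J), cell means

SS_A = J * K * ((mean_A - grand) ** 2).sum()
SS_B = I * K * ((mean_B - grand) ** 2).sum()
SS_AB = K * ((mean_AB - mean_A[:, None] - mean_B[None, :] + grand) ** 2).sum()
SS_E = ((Y - mean_AB[:, :, None]) ** 2).sum()
SS_total = ((Y - grand) ** 2).sum()          # should equal the sum of the four parts

df_E = I * J * (K - 1)
MS_E = SS_E / df_E
for name, ss, df in [("A", SS_A, I - 1), ("B", SS_B, J - 1), ("AB", SS_AB, (I - 1) * (J - 1))]:
    F_stat = (ss / df) / MS_E
    print(name, F_stat, f.sf(F_stat, df, df_E))
```

The identity `SS_total == SS_A + SS_B + SS_AB + SS_E` holds exactly (up to floating-point error) only when every cell has the same number of replicates.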

---
<img src ="./fig/anova_2_table.png" width="400">

# Experiment Design


## Examples of confounding
- Effect of ***Gender*** on college admission confounded by ***Major***: women tended to apply to majors that are harder to get into
- Effect of ***Coffee Drinking*** on coronary disease confounded by ***Smoking***: coffee drinkers smoke more
- **Randomization**: mitigates the impact of confounding factors by making them, on average, the *same* in both groups


## Completely Randomized Design (CRD)

- For each experimental unit $i$, randomly assign a treatment with equal probability
- Example: one-way ANOVA


## Randomized Complete Block Design (RCB)
- Goal: higher power by decreasing error variance
- Note: Blocks exist ***at the time of the randomization of treatments to units***. You ***cannot*** determine the design used in an experiment just by looking at a table of results, you have to know the randomization.

$$y_{ij} = \mu + \alpha_i + \beta_j + \epsilon_{ij}$$

- The **computation** of estimated effects, sums of squares, contrasts, and so on is done exactly as for a two-way factorial, but **design** is different.
<img src="./fig/anova_table.png" width="300">


- With a randomized block design, the experimenter divides subjects into subgroups called **blocks**, such that the variability within blocks is less than the variability between blocks. Then, subjects within each block are randomly assigned to treatment conditions. 
- Compared to a completely randomized design, this design reduces variability within treatment conditions and potential confounding, producing a better estimate of treatment effects.


- ***Example***: Paired-Sample t-test, where ***person*** is the block
- ***Example***: Fertilizer agricultural experiment, where ***field*** is the block
- ***Spatial and Temporal Blocking***

The table below shows a randomized block design for a hypothetical medical experiment.

| Gender | Treatment: Placebo | Treatment: Vaccine |
|:-:|:-:|:-:|
| Male | 250 | 250 |
| Female | 250 | 250 |

Subjects are assigned to blocks, based on gender. Then, within each block, subjects are randomly assigned to treatments (either a placebo or a cold vaccine). For this design, 250 men get the placebo, 250 men get the vaccine, 250 women get the placebo, and 250 women get the vaccine.

It is known that men and women are physiologically different and react differently to medication. This design ensures that each treatment condition has an equal proportion of men and women. As a result, differences between treatment conditions cannot be attributed to gender. This randomized block design removes gender as a potential source of variability and as a potential confounding variable.
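The randomization step of this blocked design can be sketched as follows. The roster is hypothetical (500 men and 500 women, matching the table above), and the seed is arbitrary; the key point is that shuffling happens separately inside each block.

```python
import numpy as np

rng = np.random.default_rng(3)   # arbitrary seed

# Hypothetical roster: 500 men and 500 women
gender = np.array(["Male"] * 500 + ["Female"] * 500)
treatment = np.empty(gender.size, dtype="<U7")

for block in ["Male", "Female"]:
    idx = np.flatnonzero(gender == block)            # subjects in this block
    half = idx.size // 2
    labels = np.array(["Placebo"] * half + ["Vaccine"] * (idx.size - half))
    treatment[idx] = rng.permutation(labels)         # randomize within the block only

for block in ["Male", "Female"]:
    for arm in ["Placebo", "Vaccine"]:
        print(block, arm, np.sum((gender == block) & (treatment == arm)))
```

A completely randomized design would instead shuffle the labels across all 1000 subjects at once, which balances gender only on average rather than exactly.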

# Categorical analysis

## Chi-square test
- Example: one dimension

$$Q=\sum_{i=1}^{k}\frac{(Y_i - np_i)^2}{np_i} = \sum \frac{(\text{Expected} - \text{Observed})^2}{\text{Expected}} \sim \chi^2(k-1)$$
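The statistic $Q$ can be computed directly and checked against `scipy.stats.chisquare`. The counts below are made up (100 rolls of a die assumed fair under $H_0$) and serve only as an illustration.

```python
import numpy as np
from scipy.stats import chi2, chisquare

observed = np.array([18, 22, 16, 14, 12, 18])   # hypothetical counts Y_i, n = 100
n, k = observed.sum(), observed.size
p_null = np.full(k, 1 / k)                      # H0: all faces equally likely
expected = n * p_null                           # n * p_i

Q = ((observed - expected) ** 2 / expected).sum()
p_value = chi2.sf(Q, df=k - 1)

# Same test via SciPy
res = chisquare(observed, f_exp=expected)
print(Q, p_value, res.statistic, res.pvalue)
```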


<img src="https://onlinecourses.science.psu.edu/stat414/sites/onlinecourses.science.psu.edu.stat414/files/lesson44/e14/index.gif" width="400">

- Example: two dimensions
    - tests whether the distribution over the $I$ categories changes across the $J$ groups
    
$$E_{ij} = \frac{n_{i.}n_{.j}}{n_{..}}$$
$$df = (I-1)(J-1)$$
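The expected counts $E_{ij}$ and the degrees of freedom can be computed from the margins of a contingency table and compared with `scipy.stats.chi2_contingency`. The table below is a made-up $2 \times 3$ example.

```python
import numpy as np
from scipy.stats import chi2, chi2_contingency

# Hypothetical I = 2 by J = 3 table of counts
table = np.array([[30, 20, 10],
                  [20, 30, 40]])

row = table.sum(axis=1, keepdims=True)    # n_i.
col = table.sum(axis=0, keepdims=True)    # n_.j
expected = row @ col / table.sum()        # E_ij = n_i. * n_.j / n_..

Q = ((table - expected) ** 2 / expected).sum()
df = (table.shape[0] - 1) * (table.shape[1] - 1)
p_value = chi2.sf(Q, df)

# Same test via SciPy (no continuity correction, to match the formula)
stat, p_scipy, dof, expected_scipy = chi2_contingency(table, correction=False)
print(Q, p_value, df)
```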

- Example: Paired nominal data (McNemar's test)
<img src="./fig/chi.png" width="400">

- Null hypothesis:
    - P(Negative to Positive) = P(Positive to Negative):  $p_b = p_c$
- Alternative hypothesis:
    - P(Negative to Positive) $\neq$ P(Positive to Negative): $p_b \neq p_c$
- Test Statistic:
    - $\chi ^{2}={(b-c)^{2} \over b+c}$
- Interpretation:
    - If $b > c$ and the test is significant, then Test 2 makes a difference: changes from negative to positive are more likely than changes from positive to negative
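A minimal sketch of this paired test statistic, referred to a $\chi^2$ distribution with 1 degree of freedom. The discordant counts `b` and `c` are made up, and no continuity correction is applied, to match the formula above.

```python
from scipy.stats import chi2

# Hypothetical discordant counts:
# b = negative -> positive, c = positive -> negative
b, c = 25, 10

stat = (b - c) ** 2 / (b + c)
p_value = chi2.sf(stat, df=1)   # upper-tail probability, 1 degree of freedom
print(stat, p_value)
```

Only the discordant pairs enter the statistic; pairs that give the same result on both tests carry no information about a change.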