## Week06 - ANOVA, Non-parametric tests ###

### Confidence intervals for variance
A confidence interval for the variance describes how well you know the true variance.

Calculating the confidence intervals for variance:

$ \frac{(N -1) s^2}{\chi^2 _{1-\alpha/2}} < \sigma^2 <  \frac{(N -1) s^2}{\chi^2 _{\alpha/2}} $ 

In this equation, the larger $\chi^2 _{1-\alpha/2}$ value is in the denominator of the lower bound.

Note that the $\chi^2$ distribution is not symmetric, so the two critical values are different and the interval is not centered on $s^2$:

$ \chi^2_{\alpha/2} \ne \chi^2_{1-\alpha/2} $

In addition, both $\chi^2$ values are greater than zero. This ensures that both the upper and lower limits of the confidence intervals for variance (and standard deviation) are always greater than zero.
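
As a minimal sketch of the formula above, the interval can be computed with `scipy.stats.chi2.ppf`. The sample `y`, the random seed, and $\alpha = 0.05$ are assumptions chosen only for illustration.

```python
import numpy as np
from scipy import stats

# Hypothetical sample: 20 draws from a normal distribution (true variance = 4)
rng = np.random.default_rng(0)
y = rng.normal(loc=10.0, scale=2.0, size=20)

alpha = 0.05
N = len(y)
s2 = np.var(y, ddof=1)  # sample variance s^2

# Critical values of the chi-squared distribution with N-1 degrees of freedom
chi2_small = stats.chi2.ppf(alpha / 2, df=N - 1)      # chi^2_{alpha/2}
chi2_large = stats.chi2.ppf(1 - alpha / 2, df=N - 1)  # chi^2_{1-alpha/2}

# The larger critical value goes in the denominator of the lower bound
var_lower = (N - 1) * s2 / chi2_large
var_upper = (N - 1) * s2 / chi2_small

print(f"s^2 = {s2:.2f}")
print(f"95% CI for the variance: ({var_lower:.2f}, {var_upper:.2f})")
print(f"95% CI for the standard deviation: ({np.sqrt(var_lower):.2f}, {np.sqrt(var_upper):.2f})")
```

The upper limit sits farther from $s^2$ than the lower limit, which reflects the asymmetry of the $\chi^2$ distribution.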

## ANOVA ##

### Recommended reading ###

McDonald, J.H. 2014. Handbook of Biological Statistics (3rd ed.). Sparky House Publishing, Baltimore, Maryland. Freely available online at [www.biostathandbook.com](http://www.biostathandbook.com)

__Analysis of variance__: test for a statistically significant difference between means of 3+ different groups

* Does not tell you which group is different
* Requires the use of __*post hoc*__ analysis to determine which means are different from each other

Example:
<img src='images/anova_example.png' width = '600'> http://www.biostathandbook.com/onewayanova.html

__One-Way ANOVA__ 
Fisher's LSD *post-hoc* test used to determine which populations are different from each other.

__Two-Way ANOVA__
Data are grouped by genotype and, within each genotype, by sex. Two factors therefore vary across the groups.



One-Way example:

* J Populations (or "treatments")
* N samples per population
* $H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4 $
* $H_a$: At least one mean is different from the others

Example: three different types of CTDs in a water bath, each with four measurements.
(ANOVA does not require the same number of samples within each population.)

Use the F-statistic: The ratio of the variances of two groups of samples taken from a normal distribution follows an *F* distribution

$$ F = \frac{s_1^2}{s_2^2} $$

The F distribution can be used to test whether variances are significantly different. In the case of ANOVA, we want to test whether the variance between groups is larger than the variance within groups.

Sum of Squares Between: __SSB__ $$\sum_{j=1}^J{N_j(\bar{y}_j-\bar{y})^2}$$ where $N_j$ is the number of samples in population $j$, $\bar{y}_j$ is the mean of each population and $\bar{y}$ is the mean of all samples

Mean Square Between: __MSB__ $$\frac{SSB}{J-1}$$

$J-1$ is the degrees of freedom in calculating MSB.

Sum of Squares Within: __SSW__ $$ \sum_{j=1}^J \sum_{i=1}^{N_j} (y_{ij} - \bar{y}_j)^2 $$ where $y_{ij}$ is sample $i$ in population $j$

Mean Square Within: __MSW__ $$ \frac{SSW} {(\sum_{j=1}^J { N_j}) -J } $$

$(\sum_{j=1}^J N_j) - J$ is the degrees of freedom in calculating MSW. This is the total number of samples minus the number of groups.

F-statistic: $$ F =\frac{MSB}{MSW} $$


The null hypothesis can be rejected if F is large. This is a one-tailed test, since small values of F do not lead to a rejection of the null hypothesis. The region of rejection is above some critical level, which is determined by the significance level $\alpha$ and the degrees of freedom in the numerator and denominator.
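
The calculation above can be sketched step by step and checked against `scipy.stats.f_oneway`. The CTD temperature readings below are made-up values for illustration, and the groups intentionally have unequal sizes.

```python
import numpy as np
from scipy import stats

# Hypothetical temperature readings from three CTDs in the same water bath
groups = [
    np.array([10.02, 10.05, 9.98, 10.01]),
    np.array([10.10, 10.12, 10.08, 10.11]),
    np.array([10.00, 9.97, 10.03, 10.02, 9.99]),
]

J = len(groups)                          # number of populations
N_total = sum(len(g) for g in groups)    # total number of samples
grand_mean = np.concatenate(groups).mean()

# Sums of squares between and within groups
SSB = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)
SSW = sum(((g - g.mean()) ** 2).sum() for g in groups)

MSB = SSB / (J - 1)
MSW = SSW / (N_total - J)
F = MSB / MSW

# One-tailed test: critical value and p-value from the upper tail
alpha = 0.05
F_crit = stats.f.ppf(1 - alpha, J - 1, N_total - J)
p = stats.f.sf(F, J - 1, N_total - J)

# Same test using the library routine
F_scipy, p_scipy = stats.f_oneway(*groups)

print(f"F = {F:.2f}, critical F = {F_crit:.2f}, p = {p:.4f}")
print(f"scipy: F = {F_scipy:.2f}, p = {p_scipy:.4f}")
```

If `F` exceeds `F_crit` (equivalently, if `p` is below `alpha`), the null hypothesis of equal means is rejected.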


### Popular *post-hoc* tests ###

* Fisher's LSD (least significant difference)

* Tukey HSD (honest significant difference) 
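
A minimal sketch of Fisher's LSD, assuming it is applied only after the overall ANOVA has rejected $H_0$: each pair of group means is compared using the pooled within-group variance (MSW) and a t critical value with $N - J$ degrees of freedom. The group names and data are hypothetical.

```python
import itertools
import numpy as np
from scipy import stats

# Hypothetical temperature readings from three CTDs in the same water bath
groups = {
    "CTD A": np.array([10.02, 10.05, 9.98, 10.01]),
    "CTD B": np.array([10.10, 10.12, 10.08, 10.11]),
    "CTD C": np.array([10.00, 9.97, 10.03, 10.02, 9.99]),
}

alpha = 0.05
J = len(groups)
N_total = sum(len(g) for g in groups.values())
dof_within = N_total - J

# Mean square within groups (pooled variance estimate)
MSW = sum(((g - g.mean()) ** 2).sum() for g in groups.values()) / dof_within
t_crit = stats.t.ppf(1 - alpha / 2, dof_within)

different = {}
for (name_a, a), (name_b, b) in itertools.combinations(groups.items(), 2):
    # Least significant difference for this pair of group sizes
    lsd = t_crit * np.sqrt(MSW * (1 / len(a) + 1 / len(b)))
    diff = abs(a.mean() - b.mean())
    different[(name_a, name_b)] = diff > lsd
    print(f"{name_a} vs {name_b}: |difference| = {diff:.4f}, LSD = {lsd:.4f}, "
          f"different: {diff > lsd}")
```

Fisher's LSD does not correct for multiple comparisons, which is one reason Tukey's HSD is often preferred when there are many groups.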


### Parametric vs. non-parametric statistical tests

#### Parametric test
- Based on parameters that summarize a distribution, such as mean and standard deviation
- For example, t-tests and ANOVA assume a normal distribution of samples

#### Non-parametric test
- Advantage: No assumptions about parent population (more robust)
- Disadvantage: Less power in situations where parametric assumptions are satisfied (more samples needed to draw conclusions at same confidence level)

### Testing for normality

The following figures come from a notebook on the central limit theorem and testing for normality of a distribution:

https://github.com/tompc35/oceanography-notebooks/blob/master/central_limit_theorem.ipynb

<img src='images/norm_dist_week3.png' width="500">

Blue: Sample distribution ($O_i$)<br>
Red: Normal distribution with same mean and standard deviation, expected value ($E_i$)

#### Chi squared test for normality

$$ X^2 = \sum_{i=1}^k \frac{\left(O_i - E_i\right)^2}{E_i}$$

Tests for goodness of fit

Compare this test statistic to the chi-squared value $\chi_{\nu, 1-\alpha}^2$, where $k$ is the number of bins and $\nu = k-1$ is the degrees of freedom (reduced by one more for each distribution parameter that is estimated from the sample).

- If the test statistic is larger than the chi-squared value, the null hypothesis that the sample comes from the reference distribution can be rejected. Note that this test is sensitive to the choice of bins.
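
The test statistic above can be sketched as follows. The helper `chi2_normality`, the choice of eight bins (with edges placed at multiples of the sample standard deviation), and the random samples are assumptions made for illustration, not part of the original notebook.

```python
import numpy as np
from scipy import stats


def chi2_normality(sample, z_edges=(-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5)):
    """Return (X2, nu) comparing binned counts with a fitted normal distribution."""
    sample = np.asarray(sample, dtype=float)
    mean, std = sample.mean(), sample.std(ddof=1)

    # Interior bin edges; the two outer bins extend to -inf and +inf
    edges = mean + std * np.asarray(z_edges)
    k = len(edges) + 1

    # Observed counts O_i in each of the k bins
    observed = np.bincount(np.searchsorted(edges, sample), minlength=k)

    # Expected counts E_i from the normal CDF with the same mean and std
    cdf = stats.norm.cdf(edges, loc=mean, scale=std)
    expected = len(sample) * np.diff(np.concatenate([[0.0], cdf, [1.0]]))

    X2 = np.sum((observed - expected) ** 2 / expected)
    nu = k - 1  # as in the notes; subtract 2 more to account for the fitted mean and std
    return X2, nu


rng = np.random.default_rng(1)
alpha = 0.05
samples = {
    "normal": rng.normal(size=500),
    "exponential": rng.exponential(size=500),
}
for name, sample in samples.items():
    X2, nu = chi2_normality(sample)
    critical = stats.chi2.ppf(1 - alpha, nu)
    print(f"{name}: X2 = {X2:.1f}, critical value = {critical:.1f}, "
          f"reject H0: {X2 > critical}")
```

Changing `z_edges` changes the bins and therefore the value of the statistic, which illustrates the sensitivity to bin size.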

#### Probability Plot

<img src='images/prob_dens.png' width="500">

The corresponding probability plot for this distribution is shown below:

<img src='images/prob_plot.png' width="500">

The x-axis shows the _quantiles_ of the normal distribution. If a normal distribution is split up into some discrete number of pieces with equal probability, the quantiles are the z-scores at the edges of each piece. The quantiles are tightly clustered near zero.

The y-axis is the _ordered values_ in the sample distribution.

If values are normally distributed, the quantiles should plot linearly with the ordered values. That is, most values are clustered around the mean. Note that this test is qualitative and the $R^2$ statistic does not have much meaning in this case. As we will see later, correlation statistics are only meaningful if the variables are normally distributed.
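
A short sketch of how the quantiles and ordered values can be obtained with `scipy.stats.probplot`. The two random samples are assumptions for illustration; the correlation coefficient `r` is printed only as a rough guide, in keeping with the caveat above.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
normal_sample = rng.normal(size=200)
skewed_sample = rng.exponential(size=200)

# probplot returns the theoretical quantiles (x-axis), the ordered sample
# values (y-axis), and a least-squares line fit through the points
(q_norm, v_norm), (slope_n, intercept_n, r_normal) = stats.probplot(normal_sample, dist="norm")
(q_skew, v_skew), (slope_s, intercept_s, r_skewed) = stats.probplot(skewed_sample, dist="norm")

print(f"normal sample: r = {r_normal:.3f}")
print(f"skewed sample: r = {r_skewed:.3f}")
```

Passing `plot=plt` (with `import matplotlib.pyplot as plt`) to `stats.probplot` draws the probability plot directly.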

###### Example for a non-normal distribution:

<img src='images/non_norm_dist.png' width="500">
<img src='images/non_norm_prob_plot.png' width="500">



#### Kolmogorov-Smirnov test

Can be used to compare two sample distributions, or a sample distribution and a reference distribution (normal, exponential, etc.)

Null Hypothesis: Samples are drawn from the same distribution (in the two-sample case)

##### An oceanographic example

<img src='images/km_dist.png' width="500">

_Source_: Durkin et al (2009), Chitin in diatoms and its association with the cell wall, Eukaryotic Cell

The following graph illustrates the K-S test statistic for a two-sample test.

<img src='images/KS_wiki.png' width="400">
Source: https://en.wikipedia.org/wiki/Kolmogorov%E2%80%93Smirnov_test <br>

Illustration of the two-sample Kolmogorov–Smirnov statistic. Red and blue lines each correspond to an empirical distribution function, and the black arrow is the two-sample KS statistic.

This is a very sensitive test: with lots of samples, even small differences between the distributions are enough to reject the null hypothesis.

```python
from scipy import stats

help(stats.kstest)
```
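
As a sketch of the two-sample statistic illustrated above, the maximum distance between the two empirical distribution functions can be computed by hand and compared with `scipy.stats.ks_2samp`. The samples `a` and `b` are synthetic and chosen only for illustration.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
a = rng.normal(loc=0.0, scale=1.0, size=300)
b = rng.normal(loc=1.0, scale=1.0, size=300)  # same shape, shifted mean

# Empirical distribution functions evaluated at every observed value
grid = np.sort(np.concatenate([a, b]))
edf_a = np.searchsorted(np.sort(a), grid, side="right") / len(a)
edf_b = np.searchsorted(np.sort(b), grid, side="right") / len(b)

# K-S statistic: largest vertical distance between the two curves
D_manual = np.max(np.abs(edf_a - edf_b))

# Two-sample test from scipy
D, p = stats.ks_2samp(a, b)
print(f"two-sample: D = {D:.3f} (manual {D_manual:.3f}), p = {p:.2e}")

# One-sample test of `a` against a standard normal reference distribution
D_ref, p_ref = stats.kstest(a, "norm")
print(f"one-sample vs. normal: D = {D_ref:.3f}, p = {p_ref:.3f}")
```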

### Other tests for normality

__Shapiro-Wilk__
- High Power
- Biased at __Large__ sample size

```python
from scipy import stats

help(stats.shapiro)
```

__Anderson-Darling__

These tests, along with the K-S test and probability plots, are included in the `scipy.stats` module.

```python
from scipy import stats
help(stats.anderson)
```
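
A brief usage sketch for both tests, applied to a deliberately non-normal (exponential) sample that is assumed here for illustration. `stats.shapiro` returns a statistic and a p-value, while `stats.anderson` is used here through its statistic and table of critical values; recent SciPy versions may offer additional options for `anderson`.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
sample = rng.exponential(size=100)  # clearly non-normal

# Shapiro-Wilk: small p-value -> reject the null hypothesis of normality
W, p_shapiro = stats.shapiro(sample)
print(f"Shapiro-Wilk: W = {W:.3f}, p = {p_shapiro:.2e}")

# Anderson-Darling: compare the statistic with the tabulated critical values
result = stats.anderson(sample, dist="norm")
levels = list(result.significance_level)       # significance levels in percent
crit_5 = result.critical_values[levels.index(5.0)]
print(f"Anderson-Darling: A2 = {result.statistic:.3f}, 5% critical value = {crit_5:.3f}")
```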

### Geometric mean

If you were to log-transform data and then do a t-test, you'd be testing for a difference between geometric means.


Will amplify the large values
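
The link between the log transform and the geometric mean can be sketched as follows: the geometric mean is the exponential of the mean of the logged data, so a t-test on logged data compares geometric means. The arrays below are hypothetical.

```python
import numpy as np
from scipy import stats

# Hypothetical positive-valued data
x = np.array([1.0, 2.0, 4.0, 8.0, 100.0])

# Geometric mean two ways: back-transformed mean of the logs, and scipy's gmean
gm_from_logs = np.exp(np.mean(np.log(x)))
gm_scipy = stats.gmean(x)
print(f"geometric mean = {gm_from_logs:.3f} (scipy: {gm_scipy:.3f})")

# t-test on log-transformed samples (synthetic log-normal data)
rng = np.random.default_rng(5)
a = rng.lognormal(mean=0.0, sigma=0.5, size=30)
b = rng.lognormal(mean=0.5, sigma=0.5, size=30)
t, p = stats.ttest_ind(np.log(a), np.log(b))
print(f"t-test on logs: t = {t:.2f}, p = {p:.4f}")
print(f"geometric means: {stats.gmean(a):.2f} vs. {stats.gmean(b):.2f}")
```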


### Non-parametric tests: univariate data

#### Wilcoxon signed-rank test

__$H_0$__: the median difference between pairs of observations is zero

- Rank the absolute values of the differences (smallest = 1)
- Sum the ranks of the positive differences, and sum the ranks of the negative differences separately
- The smaller of the two sums is the test statistic T
- Low values of T required for significance
- Use __Mann-Whitney__ test for unpaired data

```python
from scipy import stats

help(stats.wilcoxon)
```
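
The ranking steps listed above can be sketched by hand and compared with `stats.wilcoxon`. The paired `before` and `after` values are made up for illustration and contain no ties or zero differences.

```python
import numpy as np
from scipy import stats

# Hypothetical paired measurements (e.g. the same 8 stations sampled twice)
before = np.array([7.1, 6.8, 7.4, 7.9, 6.5, 7.2, 7.7, 6.9])
after = np.array([7.6, 7.0, 7.3, 8.7, 7.1, 7.5, 8.6, 7.3])

d = after - before
d = d[d != 0]                      # zero differences carry no sign information

ranks = stats.rankdata(np.abs(d))  # rank absolute differences, smallest = 1
T_plus = ranks[d > 0].sum()        # sum of ranks of positive differences
T_minus = ranks[d < 0].sum()       # sum of ranks of negative differences
T = min(T_plus, T_minus)           # test statistic

stat, p = stats.wilcoxon(after, before)  # two-sided by default
print(f"T (manual) = {T}, T (scipy) = {stat}, p = {p:.4f}")
```

Here only one difference is negative and it has the smallest magnitude, so T is small and the null hypothesis of zero median difference is rejected at the 5% level.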

#### Mann-Whitney test
- ranked test
- analogue of t-test for independent samples

```python
from scipy import stats

help(stats.mannwhitneyu)
```
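
A minimal sketch of the rank-based U statistic for two independent samples, compared with `stats.mannwhitneyu`. The data are hypothetical. Which U (for the first sample, or the smaller of the two) is reported has differed between SciPy versions, so the example is chosen so that both conventions give the same value.

```python
import numpy as np
from scipy import stats

# Hypothetical independent samples
x = np.array([1.2, 3.4, 2.2, 4.1, 5.2])
y = np.array([5.0, 6.3, 4.8, 7.1, 5.9, 6.6])

# Rank the pooled data, then sum the ranks belonging to x
ranks = stats.rankdata(np.concatenate([x, y]))
R_x = ranks[:len(x)].sum()

# U statistics for each sample; they always sum to len(x) * len(y)
U_x = R_x - len(x) * (len(x) + 1) / 2
U_y = len(x) * len(y) - U_x

U, p = stats.mannwhitneyu(x, y, alternative="two-sided")
print(f"U_x = {U_x}, U_y = {U_y}, scipy U = {U}, p = {p:.4f}")
```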

#### Kruskal-Wallis ANOVA

__$H_0$__: Means of ranks of groups are the same <br>
__$H_0 (II)$__: Medians of groups are the same (assuming they come from distributions with the same shape)

- Related to the Mann-Whitney rank-sum test (two groups)
- Does not assume normality, but...
- According to [McDonald](http://www.biostathandbook.com), Fisher's classic ANOVA is not actually very sensitive to non-normal distributions
- Like Fisher's classic ANOVA, testing $H_0 (II)$ assumes that the different groups have the same variance (homoscedasticity)
- Welch's ANOVA is another alternative to Fisher's ANOVA that does not assume homoscedasticity (like Welch's t-test)

```python
from scipy import stats

help(stats.kruskal)
```
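
As a sketch, the Kruskal-Wallis statistic can be computed from the rank sums $R_j$ of each group, $H = \frac{12}{N(N+1)} \sum_j \frac{R_j^2}{N_j} - 3(N+1)$, and compared with `stats.kruskal`. The three groups are hypothetical and contain no tied values, so no tie correction is needed.

```python
import numpy as np
from scipy import stats

# Hypothetical measurements in three groups (no tied values)
groups = [
    np.array([2.1, 3.5, 2.8, 3.0]),
    np.array([4.9, 5.6, 4.2, 5.1]),
    np.array([3.3, 2.5, 3.9, 2.9, 3.6]),
]

# Rank the pooled data, then split the ranks back into groups
pooled_ranks = stats.rankdata(np.concatenate(groups))
split_points = np.cumsum([len(g) for g in groups])[:-1]
group_ranks = np.split(pooled_ranks, split_points)

N = len(pooled_ranks)
H_manual = 12 / (N * (N + 1)) * sum(r.sum() ** 2 / len(r) for r in group_ranks) - 3 * (N + 1)

# p-value from the chi-squared distribution with J-1 degrees of freedom
p_manual = stats.chi2.sf(H_manual, len(groups) - 1)

H, p = stats.kruskal(*groups)
print(f"H = {H:.3f} (manual {H_manual:.3f}), p = {p:.4f} (manual {p_manual:.4f})")
```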

https://docs.scipy.org/doc/scipy-0.14.0/reference/stats.html