# MATH 3375 Examples Notebook #24

# Non-Parametric Tests

Many common statistical tests (such as the t-test) rely on a specific distribution and focus on the mean or standard deviation, or they carry underlying assumptions that may not fit the data well (e.g., the assumption of normally distributed data). Alternatives include simulation-based tests and non-parametric tests.

## 1. McNemar's Test

This test performs a comparison between 2 treatments, but for a binary result (whereas t-tests compare means for quantitative results). It is different from a 2-proportion z-test, because: 
1. A 2-proportion z-test assumes each observation is independent, whereas McNemar's Test works on paired data: each observation in the first sample is paired with (related or identical to) a corresponding observation in the second sample.
2. A 2-proportion z-test can only compare the overall proportion of a given outcome between the 2 groups (because all observations are independent). McNemar's Test focuses on the frequency of disparate results for each set of paired observations.

#### Background:
100 patients are infected with a virus. It is possible that each patient has a slightly different mutation of the virus. Samples are drawn from each of the 100 patients, and 2 cultures are prepared for each patient.  For each patient, one culture is treated with Treatment A and the other is treated with Treatment B. 

* Treatment A successfully eradicated the virus in 61 of the 100 cases.
* Treatment B successfully eradicated the virus in 68 of the 100 cases.

Note that it is not sufficient to compare the overall proportion of successes for each treatment (as we would in a 2-proportion z-test). The 2 possible scenarios below share the same overall success counts (61 for A, 68 for B), yet they illustrate how McNemar's Test is different (and more powerful).

#### Results: Scenario 1

* 32 cases: Neither treatment worked
* 61 cases: Both treatments worked
* 7 cases: B worked, A did not
* 0 cases: A worked, B did not

#### Results: Scenario 2

* 12 cases: Neither treatment worked
* 41 cases: Both treatments worked
* 27 cases: B worked, A did not
* 20 cases: A worked, B did not
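
To see why the overall proportions alone cannot separate the two scenarios, the sketch below rebuilds both outcome tables and checks their margins. The names `scenario_1`, `scenario_2`, `a_success`, and `b_success` are illustrative and not part of the original notebook. `prop.test` is used here as R's 2-proportion test; it treats the two treatments as independent samples, which is exactly the assumption that does not hold for paired cultures.

```r
# Rebuild both outcome tables; rows are Treatment A, columns are Treatment B
labels <- list(c("A Worked", "A Did Not"), c("B Worked", "B Did Not"))
scenario_1 <- matrix(c(61, 0, 7, 32), ncol = 2, byrow = TRUE, dimnames = labels)
scenario_2 <- matrix(c(41, 20, 27, 12), ncol = 2, byrow = TRUE, dimnames = labels)

# Marginal success counts for each treatment, Scenario 1 then Scenario 2
a_success <- c(rowSums(scenario_1)[["A Worked"]], rowSums(scenario_2)[["A Worked"]])
b_success <- c(colSums(scenario_1)[["B Worked"]], colSums(scenario_2)[["B Worked"]])
a_success
b_success

# A 2-proportion test sees only these margins, so it gives the same
# answer for either scenario
prop.test(x = c(61, 68), n = c(100, 100))
```

Because the margins match, any difference between the scenarios has to come from how the paired results disagree, which is the information McNemar's Test uses.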


In [None]:
# Set up outcome matrix, Scenario 1

data_tbl_1 <- matrix(c(61,0,7,32), ncol=2, byrow=TRUE)
colnames(data_tbl_1) <- c('B Worked', 'B Did Not')
rownames(data_tbl_1) <- c('A Worked', 'A Did Not')

data_tbl_1

In [None]:
# Run McNemar's Test
mcnemar.test(data_tbl_1)

In [None]:
# Set up outcome matrix, Scenario 2

data_tbl_2 <- matrix(c(41,20,27,12), ncol=2, byrow=TRUE)
colnames(data_tbl_2) <- c('B Worked', 'B Did Not')
rownames(data_tbl_2) <- c('A Worked', 'A Did Not')

data_tbl_2

In [None]:
# Run McNemar's Test
mcnemar.test(data_tbl_2)

### Test Results

Scenario 1 would provide sufficient evidence that Treatment B is better, with $p \approx 0.02 \lt 0.05$.

Scenario 2 would **not** provide sufficient evidence that either treatment is better, with $p \approx 0.38 \ge 0.05$.
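
These p-values can be reproduced by hand. As a sketch, assuming R's default continuity correction (`correct = TRUE`), the statistic depends only on the two discordant counts: $\chi^2 = (|b - c| - 1)^2 / (b + c)$ with 1 degree of freedom, where $b$ counts "A worked, B did not" and $c$ counts "B worked, A did not". The helper name `mcnemar_by_hand` and its argument names are hypothetical.

```r
# McNemar statistic with continuity correction, from the discordant counts only
mcnemar_by_hand <- function(a_only, b_only) {
  stat <- (abs(a_only - b_only) - 1)^2 / (a_only + b_only)
  p <- pchisq(stat, df = 1, lower.tail = FALSE)
  c(statistic = stat, p.value = p)
}

s1 <- mcnemar_by_hand(a_only = 0, b_only = 7)    # Scenario 1
s2 <- mcnemar_by_hand(a_only = 20, b_only = 27)  # Scenario 2
round(s1, 3)
round(s2, 3)
```

Both scenarios have $|b - c| = 7$; what differs is the number of discordant pairs (7 versus 47), so the statistic is $36/7$ in Scenario 1 and $36/47$ in Scenario 2.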


## 2. Wilcoxon Signed Rank Test

This test focuses on medians rather than means. This is especially useful for non-normal data or data with outliers or skewed distributions (where mean and standard deviation are heavily influenced by outliers or skewness).

### Two variations of the test:

#### a. Compare median of sample to specific value
The application is analogous to a 1-sample t-test, but focuses on the median instead of the mean.



In [None]:
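# Example with a single sample: petal length in the iris data set

# Overall median petal length across all three species
M <- median(iris$Petal.Length)
boxplot(Petal.Length~Species,data=iris)
# Red horizontal line marks the overall median M
abline(h=M,col="red")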



In [None]:
#Test each species separately

Petal.setosa <- iris$Petal.Length[iris$Species=="setosa"]
Petal.versicolor <- iris$Petal.Length[iris$Species=="versicolor"]
Petal.virginica <- iris$Petal.Length[iris$Species=="virginica"]

wilcox.test(Petal.setosa, mu=M)
wilcox.test(Petal.versicolor, mu=M)
wilcox.test(Petal.virginica, mu=M)
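
Strictly speaking, the signed rank test concerns the pseudomedian, which coincides with the median when the distribution is symmetric. As a hedged sketch, `wilcox.test` can report that estimate together with a confidence interval when `conf.int = TRUE`. The variable names below are illustrative, and `exact = FALSE` is an assumption made here because the petal lengths contain ties.

```r
# One-sample signed rank test with a location estimate and confidence interval
setosa <- iris$Petal.Length[iris$Species == "setosa"]
M <- median(iris$Petal.Length)   # overall median used as the reference value

res <- wilcox.test(setosa, mu = M, conf.int = TRUE, exact = FALSE)

res$estimate     # R labels this value "(pseudo)median"
res$conf.int     # approximate confidence interval for the pseudomedian
median(setosa)   # ordinary sample median, for comparison
```

Comparing `res$estimate` with `median(setosa)` gives a quick sense of how closely the two notions of "center" agree for a given sample.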


#### b. Compare median of 2 _paired_ samples to each other
The application is analogous to a matched pairs t-test, but focuses on the median difference instead of the mean difference.

The scenario below compares reaction times for participants' left and right hands in a computer-based task.

* Null hypothesis: Median difference in reaction time is zero
* Alternative: Median difference in reaction time is NOT zero

In [None]:
left <- c(50, 47, 62, 81, 77, 49, 56, 78)
right <- c(45, 48, 64, 82, 74, 42, 52, 72)

boxplot(left-right, horizontal=TRUE, main="Reaction Time Difference (Left-Right)")


In [None]:
#Non-Parametric Test for Median Difference

wilcox.test(left,right, paired = TRUE, exact=FALSE)
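
The statistic `V` reported by the paired test can be rebuilt step by step. The following is a minimal sketch of the usual signed rank recipe (rank the absolute differences, then sum the ranks of the positive ones); the intermediate names `d`, `r`, and `V` are illustrative.

```r
left <- c(50, 47, 62, 81, 77, 49, 56, 78)
right <- c(45, 48, 64, 82, 74, 42, 52, 72)

d <- left - right      # paired differences
d <- d[d != 0]         # zero differences are dropped (there are none here)
r <- rank(abs(d))      # rank the absolute differences; ties share the average rank
V <- sum(r[d > 0])     # sum of the ranks that belong to positive differences

V
wilcox.test(left, right, paired = TRUE, exact = FALSE)$statistic
```

Here `V` is 30 out of a maximum possible 36 (the sum of the ranks 1 through 8), so most of the rank weight sits on the side where the left-hand time is larger.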

## 3. Mann-Whitney Test

This test is analogous to a 2-sample t-test for independent samples, but focuses on the median instead of the mean. It is equivalent to the Wilcoxon Rank Sum Test, which is why R runs it with `wilcox.test`.

We demonstrate below using a test to compare median horsepower and mpg between manual and automatic transmission cars.

Notice how the test compares to the t-test in each case.


In [None]:
head(mtcars,3)

#### Tests for Horsepower: Median and Mean

In [None]:
hist(mtcars$hp)

In [None]:
boxplot(hp~am,data=mtcars)

In [None]:
# Median test (non-parametric)
wilcox.test(hp~am,data=mtcars,exact=FALSE)
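
The `W` statistic in this output can also be computed directly from ranks. As a sketch, assuming the convention R uses for the first group (`am = 0`, automatic): rank all cars together, sum the ranks of the first group, and subtract the smallest possible rank sum for a group of that size. The names `auto_hp`, `manual_hp`, and `n_auto` are illustrative.

```r
auto_hp   <- mtcars$hp[mtcars$am == 0]   # am = 0: automatic transmission
manual_hp <- mtcars$hp[mtcars$am == 1]   # am = 1: manual transmission

r <- rank(c(auto_hp, manual_hp))         # rank all cars together; ties share ranks
n_auto <- length(auto_hp)

# Rank sum of the automatic group minus its minimum possible value
W <- sum(r[seq_len(n_auto)]) - n_auto * (n_auto + 1) / 2

W
wilcox.test(hp ~ am, data = mtcars, exact = FALSE)$statistic
```

Because only ranks enter the calculation, a few extreme horsepower values cannot dominate the statistic the way they can dominate a mean.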

In [None]:
# Mean test (parametric, Welch's t-test by default)
t.test(hp~am,data=mtcars)

##### Observations
Notice that the t-test was not significant for this comparison at the 0.05 level, but the Mann-Whitney test was. This is related to the strongly skewed distribution and the presence of outliers in the horsepower variable.
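
A quick way to see the effect of skewness and outliers is to put the group means next to the group medians. This is only a descriptive sketch; the table name `hp_by_am` and the column labels are illustrative.

```r
# Mean and median horsepower for each transmission type
hp_by_am <- rbind(
  mean   = tapply(mtcars$hp, mtcars$am, mean),
  median = tapply(mtcars$hp, mtcars$am, median)
)
colnames(hp_by_am) <- c("automatic", "manual")   # am = 0 and am = 1

round(hp_by_am, 1)
```

The medians (175 vs 109) are further apart than the means (about 160 vs 127), because a few very powerful manual cars pull that group's mean upward.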


#### Tests for MPG: Median and Mean

In [None]:
hist(mtcars$mpg)

In [None]:
boxplot(mpg~am,data=mtcars)

In [None]:
# Median test (non-parametric)
wilcox.test(mpg~am,data=mtcars,exact=FALSE)

In [None]:
# Mean test (parametric, Welch's t-test by default)
t.test(mpg~am,data=mtcars)

##### Observations
Notice that for MPG, the distribution was closer to symmetric with no outliers. In this case, the t-test and the Mann-Whitney test were both able to detect a significant difference.
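
To compare the two variables side by side, the p-values from both tests can be collected in one small table. This is a convenience sketch rather than part of the original analysis; the name `p_values` and the row labels are illustrative.

```r
# Collect p-values from both tests for each variable
p_values <- sapply(c("hp", "mpg"), function(v) {
  c(
    mann_whitney = wilcox.test(mtcars[[v]] ~ mtcars$am, exact = FALSE)$p.value,
    welch_t      = t.test(mtcars[[v]] ~ mtcars$am)$p.value
  )
})

round(p_values, 4)
```

Each column corresponds to one variable and each row to one test, so disagreement between the tests shows up as a column with one value below 0.05 and one above.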
