# Chapter 14 Nonparametric Statistical Methods

In [1]:
import math
import itertools

import polars as pl
from polars import col, lit
from scipy import stats
import numpy as np
import altair as alt

RNG = np.random.default_rng()
DATA = {}  # input data
ANS = {}   # calculation results

## 14.1 Inferences for Single Samples

In scipy:
- The sign test is implemented as [stats.quantile_test](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.quantile_test.html).
- The Wilcoxon signed rank test is implemented as [stats.wilcoxon](https://docs.scipy.org/doc/scipy/reference/generated/scipy.stats.wilcoxon.html).

### Ex 14.1

In [2]:
DATA['14.1'] = np.array([37, 26, 31, 35, 32, 32, 27, 31, 34, 36])

#### (a)

For the exact p-value:

In [3]:
stats.quantile_test(DATA['14.1'], q=30, p=0.5, alternative='greater')

QuantileTestResult(statistic=2, statistic_type=1, pvalue=0.0546875)

The same p-value can be calculated directly from the binomial CDF:

In [36]:
stats.binom.cdf(np.count_nonzero(DATA['14.1'] <= 30), n=DATA['14.1'].size, p=0.5)

0.0546875
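To make the binomial calculation explicit, here is a minimal standard-library sketch that sums the lower tail of $\mathrm{Bin}(n, 1/2)$ by hand. The variable names (`median_0`, `s_minus`) are illustrative and not part of the notebook.

```python
import math

# Data from Ex 14.1; the hypothesized median is 30.
data = [37, 26, 31, 35, 32, 32, 27, 31, 34, 36]
median_0 = 30

n = len(data)
# Number of observations at or below the hypothesized median.
s_minus = sum(x <= median_0 for x in data)

# Lower-tail probability P(S <= s_minus) for S ~ Bin(n, 1/2):
# add up the binomial coefficients and divide by 2**n.
p_value = sum(math.comb(n, k) for k in range(s_minus + 1)) / 2**n
print(s_minus, p_value)
```

With only 2 of the 10 observations at or below 30, the tail sum is $(1 + 10 + 45)/1024$, the same value reported by `stats.quantile_test`.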

Or, using the normal approximation with continuity correction:

In [29]:
stats.norm.cdf((np.count_nonzero(DATA['14.1'] <= 30) - DATA['14.1'].size * 0.5 + 0.5) / np.sqrt(DATA['14.1'].size * 0.5 * 0.5))

0.056923149003329024

In either case the p-value exceeds α = 0.05, so we cannot reject $H_0$.

#### (b)

For the exact test:

In [36]:
stats.wilcoxon(DATA['14.1']-30, alternative='greater', method='exact')

WilcoxonResult(statistic=43.5, pvalue=0.0654296875)

Normal approximation with continuity correction:

In [38]:
stats.wilcoxon(DATA['14.1']-30, alternative='greater', method='approx', correction=True)

WilcoxonResult(statistic=43.5, pvalue=0.05671152359300298)

In either case the p-value exceeds α = 0.05, so we cannot reject $H_0$.
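The statistic 43.5 and the approximate p-value can be reproduced by hand. The sketch below assumes midranks for tied absolute differences and the usual tie-corrected variance, which subtracts $(t^3 - t)/48$ for each tie group of size $t$. The variable names are illustrative.

```python
import math

import numpy as np
from scipy import stats

# Differences from the hypothesized median of 30 (none are zero here).
d = np.array([37, 26, 31, 35, 32, 32, 27, 31, 34, 36]) - 30

ranks = stats.rankdata(np.abs(d))   # midranks for tied |d|
w_plus = ranks[d > 0].sum()         # sum of ranks with positive sign
w_minus = ranks[d < 0].sum()        # sum of ranks with negative sign

n = d.size
mean_w = n * (n + 1) / 4
# Tie-corrected null variance of W+.
_, counts = np.unique(np.abs(d), return_counts=True)
var_w = n * (n + 1) * (2 * n + 1) / 24 - np.sum(counts**3 - counts) / 48

# Upper-tail normal approximation with continuity correction.
z = (w_plus - mean_w - 0.5) / math.sqrt(var_w)
p_approx = 0.5 * math.erfc(z / math.sqrt(2))
print(w_plus, w_minus, p_approx)
```

The two rank sums add up to $n(n+1)/2 = 55$, which is a quick check on the ranking step.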

### Ex 14.2

In [42]:
DATA['14.2'] = pl.DataFrame({
    'pair': range(1, 13),
    'treated': [14, 26, 2, 4, -5, 14, 3, -1, 1, 6, 3, 4],
    'control': [8, 18, -7, -1, 2, 9, 0, -4, 13, 3, 3, 3]})
DATA['14.2']

pair,treated,control
i64,i64,i64
1,14,8
2,26,18
3,2,-7
4,4,-1
5,-5,2
6,14,9
7,3,0
8,-1,-4
9,1,13
10,6,3


In [54]:
stats.quantile_test(
    DATA['14.2'].select(col('treated')-col('control')).to_series(), 
    alternative='greater')

QuantileTestResult(statistic=3, statistic_type=1, pvalue=0.072998046875)

In [58]:
stats.wilcoxon(
    DATA['14.2'].get_column('treated'),
    DATA['14.2'].get_column('control'),
    alternative='greater')

WilcoxonResult(statistic=47.0, pvalue=0.10604514898621259)

Under both the sign test and the Wilcoxon signed rank test, the p-value exceeds 0.05, so neither test shows that VB improves IQ at the 0.05 significance level. The Wilcoxon test gives an even larger p-value because the two negative differences ($-7$ and $-12$) are large in magnitude and therefore receive high ranks, whereas the sign test uses only the signs and ignores the magnitudes.
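The role of the magnitudes can be seen by listing the signed ranks. This is a small sketch that assumes zero differences are dropped before ranking, which is the default `zero_method='wilcox'` in `stats.wilcoxon`. The array names are illustrative.

```python
import numpy as np
from scipy import stats

treated = np.array([14, 26, 2, 4, -5, 14, 3, -1, 1, 6, 3, 4])
control = np.array([8, 18, -7, -1, 2, 9, 0, -4, 13, 3, 3, 3])

d = treated - control
d = d[d != 0]                         # drop the single zero difference
ranks = stats.rankdata(np.abs(d))     # midranks of |d|

# List each difference with its rank, smallest magnitude first.
for diff, rank in sorted(zip(d, ranks), key=lambda pair: pair[1]):
    print(f"{int(diff):4d}  rank {rank}")

w_plus = ranks[d > 0].sum()
w_minus = ranks[d < 0].sum()
print(w_plus, w_minus)
```

The two negative differences take ranks 8 and 11 out of 11, so they contribute 19 of the 66 rank points even though they are only 2 of the 11 nonzero differences.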

In [67]:
(
    alt.Chart(
        DATA['14.2'].select(
            (col('treated')-col('control')).alias('diff')))
    .mark_tick()
    .encode(
        alt.X('diff:Q').axis(grid=False)))

## 14.2 Inferences for Two Independent Samples

### Ex 14.11

The table below gives the survival times of 16 mice that were randomly assigned to a control group or to a treatment group. Did the treatment prolong survival? Answer using the Wilcoxon-Mann-Whitney test at $\alpha = .10$.

Survival Times of Mice (Days)||||||||||
---|---|---|---|---|---|---|---|---|---
**Control Group** | 52 | 104 | 146 | 10 | 50 | 31 | 40 | 27 | 46
**Treatment Group** | 94 | 197 | 16 | 38 | 99 | 141 | 23 ||

In [8]:
# ✍️
DATA['14.11'] = pl.read_json('Ex14-11.json')
DATA['14.11'].head()

Control,Treatmnt
list[f64],list[f64]
"[52.0, 104.0, … 46.0]","[94.0, 197.0, … 23.0]"


In [39]:
stats.mannwhitneyu(
    DATA['14.11']['Treatmnt'].explode(),
    DATA['14.11']['Control'].explode(),
    alternative='greater')

MannwhitneyuResult(statistic=36.0, pvalue=0.3402972027972028)

The p-value is 0.34 > 0.10, so we cannot conclude that the treatment prolongs survival.
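The statistic and the exact p-value can also be obtained from first principles. The sketch below counts the (treatment, control) pairs in which the treated mouse survived longer, then enumerates all $\binom{16}{7}$ ways of assigning ranks to the treatment group. It assumes there are no ties, which holds for these data. The variable names are illustrative.

```python
import itertools
import math

import numpy as np

control = np.array([52, 104, 146, 10, 50, 31, 40, 27, 46])
treatment = np.array([94, 197, 16, 38, 99, 141, 23])

# U for the treatment group: number of pairs with treatment > control.
u_treatment = int(np.sum(treatment[:, None] > control[None, :]))

# Ranks 1..16 of the pooled data (valid because there are no ties).
n_t, n_c = treatment.size, control.size
pooled = np.concatenate([treatment, control])
ranks = pooled.argsort().argsort() + 1
r_obs = int(ranks[:n_t].sum())        # observed treatment rank sum

# Exact upper-tail p-value: share of rank assignments whose
# treatment rank sum is at least the observed one.
count = sum(
    sum(c) >= r_obs
    for c in itertools.combinations(range(1, n_t + n_c + 1), n_t)
)
p_exact = count / math.comb(n_t + n_c, n_t)
print(u_treatment, r_obs, p_exact)
```

The rank sum and $U$ are related by $U = R - n_1(n_1+1)/2$, so ordering the assignments by rank sum is equivalent to ordering them by $U$.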

## 14.6 *Resampling Methods

The inherent error of resampling could be far less serious than making wrong and often unverifiable assumptions about the population distribution.

- permutation test: draw from samples *without replacement*
- bootstrap method: draw from samples *with replacement*
- jackknife method: delete one observation at a time.
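The three schemes in the list above can be sketched with NumPy as follows. The seed and the variable names are arbitrary choices for illustration, and the data are the carbon atomic weights used in Ex 14.34.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed only so the sketch is repeatable

x = np.array([12.0129, 12.0072, 12.0064, 12.0054])
y = np.array([12.0318, 12.0246, 12.0069])

# Permutation: shuffle the pooled data and split it again,
# so every observation is used exactly once (without replacement).
pooled = rng.permutation(np.concatenate([x, y]))
x_perm, y_perm = pooled[:x.size], pooled[x.size:]

# Bootstrap: draw a sample of the original size with replacement,
# so some observations may repeat and others may be left out.
x_boot = rng.choice(x, size=x.size, replace=True)

# Jackknife: n samples, each omitting exactly one observation.
x_jack = [np.delete(x, i) for i in range(x.size)]

print(x_perm, y_perm)
print(x_boot)
print(x_jack)
```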

### Ex 14.34

Consider a subset of the carbon atomic weight data shown below from Exercise 8.11.

||||||
---|---|---|---|---
**Method 1**| 12.0129 | 12.0072 | 12.0064 | 12.0054
**Method 2**| 12.0318 | 12.0246 | 12.0069 |

Do a two-sided permutation test to compare the two methods of atomic weight determination by listing all $\binom{7}{3} = 35$ assignments of labels x and y to the data values. Use $\alpha = .10$.

In [40]:
# ✍️
method_1 = np.array([12.0129, 12.0072, 12.0064, 12.0054])
method_2 = np.array([12.0318, 12.0246, 12.0069])

In [3]:
def get_statistic(m1: np.ndarray, m2: np.ndarray) -> float:
    return np.mean(m1) - np.mean(m2)

In [42]:
res = stats.permutation_test(
    (method_1, method_2), get_statistic,
    permutation_type='independent', alternative='two-sided', n_resamples=np.inf)
res.pvalue

0.17142857142857143

The p-value (0.17) exceeds 0.10, so there is no significant difference between the two methods.
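The exercise asks for all 35 assignments to be listed. A minimal sketch of that enumeration follows. It assumes the mean difference as the statistic (as in `get_statistic`) and SciPy's documented two-sided convention of doubling the smaller one-sided p-value. The names `pooled`, `diffs`, and `tol` are illustrative.

```python
import itertools

import numpy as np

method_1 = np.array([12.0129, 12.0072, 12.0064, 12.0054])
method_2 = np.array([12.0318, 12.0246, 12.0069])
pooled = np.concatenate([method_1, method_2])
observed = method_1.mean() - method_2.mean()

# Every way of choosing which 3 of the 7 values get the "method 2" label.
diffs = []
for idx in itertools.combinations(range(pooled.size), method_2.size):
    mask = np.zeros(pooled.size, dtype=bool)
    mask[list(idx)] = True
    diffs.append(pooled[~mask].mean() - pooled[mask].mean())
diffs = np.array(diffs)

tol = 1e-12  # guards against floating-point rounding in the comparisons
p_less = np.mean(diffs <= observed + tol)
p_greater = np.mean(diffs >= observed - tol)
p_two_sided = min(1.0, 2 * min(p_less, p_greater))

# Alternative convention: compare absolute differences.
p_abs = np.mean(np.abs(diffs) >= abs(observed) - tol)
print(len(diffs), p_less, p_two_sided, p_abs)
```

Here `p_less` is 3/35, so doubling gives 6/35 ≈ 0.171, matching `res.pvalue`. Counting the assignments whose absolute difference is at least the observed one (`p_abs`) is another common convention. For these data it gives 3/35 ≈ 0.086, so the decision at $\alpha = .10$ depends on which convention is adopted.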

### Ex 14.35

Do a one-sided permutation test on the mouse data from Exercise 14.11 by using a computer package to draw 25 random samples _without replacement_. Enumerate your samples. Estimate the P-value and compare it with that obtained in Exercise 14.11.

In [55]:
# ✍️
res = stats.permutation_test(
    (DATA['14.11']['Treatmnt'].item(), DATA['14.11']['Control'].item()), 
    lambda m1, m2: m1.mean() - m2.mean(),
    permutation_type='independent', alternative='greater', n_resamples=25)
res.pvalue

0.19230769230769232

Because 25 resamples cover only a tiny fraction of the $\binom{16}{7} = 11440$ possible assignments, the estimated p-value changes from run to run. Here we get a p-value of 0.19, whereas the Wilcoxon-Mann-Whitney test in Ex 14.11 gave 0.34.
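For a repeatable run that also enumerates the samples, the Monte Carlo permutation test can be written out by hand. This is a sketch with an arbitrary seed. The estimate $(k + 1)/(B + 1)$, where $k$ is the number of resampled statistics at least as large as the observed one, is the adjustment that produces values such as 5/26 ≈ 0.19. The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(14)  # arbitrary seed, for repeatability only

treatment = np.array([94, 197, 16, 38, 99, 141, 23], dtype=float)
control = np.array([52, 104, 146, 10, 50, 31, 40, 27, 46], dtype=float)
pooled = np.concatenate([treatment, control])
observed = treatment.mean() - control.mean()

n_resamples = 25
perm_stats = np.empty(n_resamples)
for b in range(n_resamples):
    # Shuffle the pooled data; the first 7 values play the treatment group.
    shuffled = rng.permutation(pooled)
    new_t, new_c = shuffled[:treatment.size], shuffled[treatment.size:]
    perm_stats[b] = new_t.mean() - new_c.mean()
    print(b + 1, new_t, round(perm_stats[b], 2))

# One-sided (upper-tail) Monte Carlo p-value with the +1 adjustment.
k = np.count_nonzero(perm_stats >= observed)
p_value = (k + 1) / (n_resamples + 1)
print(k, p_value)
```

Changing the seed changes `k`, and the possible p-values are limited to multiples of 1/26, which is why 25 resamples give only a coarse estimate.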

### Ex 14.36

Use a computer package to draw 25 bootstrap samples from the data in Exercise 7.16 on thermostat settings. Enumerate your samples. Estimate the standard error of the sample mean and compare it with the exact value.

### Ex 14.37

Use the bootstrap samples from Exercise 14.36 to estimate the standard errors of the sample median, sample maximum, and sample minimum.

### Ex 14.38

Do a bootstrap test on the mouse data from Exercise 14.11 by using a computer package to draw 25 random samples _with replacement_ as done in Example 14.7. Enumerate your samples. Estimate the P-value and compare it with that obtained in Exercises 14.11 and 14.35.

### Ex 14.39

Show the result (14.30) which relates the bootstrap estimate of the standard error of the
sample mean to the exact value. You will need to use the following result from **finite
population sampling**: Consider a population of N numbers: $\{x_1, x_2, \ldots, x_N\}$ with mean
and variance equal to

$$
\mu = \frac{\sum_{i=1}^N x_i}{N} \quad \text{and} \quad \sigma^2 = \frac{\sum_{i=1}^N (x_i - \mu)^2}{N}.
$$

If $\bar{X}$ is the sample mean of an SRS of size n _with replacement_ from this population, then

$$
\mathrm{Var}(\bar{X}) = \frac{\sigma^2}{n} = \frac{\sum_{i=1}^N (x_i - \mu)^2}{n\,N}.
$$

### Ex 14.40

Show the result (14.32) which states that the jackknife estimate of the standard error of the sample mean is exact. Check the result numerically by enumerating the 10 jackknife samples for the data from Exercise 14.1 and calculating the jackknife estimate of the standard error of the sample mean.
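A numerical check on the Ex 14.1 data might look like the sketch below. It assumes the standard jackknife formula $\sqrt{\frac{n-1}{n}\sum_i (\bar{x}_{(i)} - \bar{x}_{(\cdot)})^2}$ and takes the "exact" value to be the usual $s/\sqrt{n}$. Whether this matches the book's (14.32) term for term is an assumption.

```python
import numpy as np

x = np.array([37, 26, 31, 35, 32, 32, 27, 31, 34, 36], dtype=float)
n = x.size

# The 10 jackknife samples: each leaves out one observation.
jack_samples = [np.delete(x, i) for i in range(n)]
jack_means = np.array([s.mean() for s in jack_samples])

# Jackknife estimate of the standard error of the sample mean.
se_jack = np.sqrt((n - 1) / n * np.sum((jack_means - jack_means.mean()) ** 2))

# Usual standard error s / sqrt(n), with s the sample standard deviation.
se_exact = x.std(ddof=1) / np.sqrt(n)
print(se_jack, se_exact)
```

The agreement follows from $\bar{x}_{(i)} - \bar{x} = (\bar{x} - x_i)/(n-1)$, which turns the jackknife sum of squares into $\sum_i (x_i - \bar{x})^2 / (n-1)^2$.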

## Advanced Exercises

### Ex 14.43

Derive the formulas (14.4) for the mean and variance of the Wilcoxon signed rank statistic
under $H_0$ by carrying out the steps below.

#### (a)

The $i$th rank can correspond to a positive sign or a negative sign. Let $Z_i$ be an indicator variable with $Z_i = 1$ if the $i$th rank corresponds to a positive sign and $Z_i = 0$ if the $i$th rank corresponds to a negative sign. Show that $W_+ = \sum_{i=1}^n i\,Z_i$.

✍️ $W_+$ is the sum of the ranks that carry a positive sign. Rank $i$ contributes $i$ to this sum when $Z_i = 1$ and nothing when $Z_i = 0$, so
$$
W_+ = \sum_{i=1}^n i\,Z_i\,.
$$

#### (b)

Note that the $Z_i$ are independent and identically distributed Bernoulli r.v.'s with success probability $p = P(X_i > \tilde{\mu}_0)$. Hence show that

$$
\mathrm{E}(W_+) = \sum_{i=1}^n i\,p = \frac{p\,n(1+n)}{2}
$$

and

$$
\mathrm{Var}(W_+) = \sum_{i=1}^n i^2 p\,(1-p) = \frac{p\,(1-p)\,n(n+1)(2n+1)}{6}\,.
$$

Now (14.4) follows by substituting $p=1/2$ in the above formulas.

✍️
$$
\begin{align*}
\mathrm{E}(W_+) &= \mathrm{E}\left[\sum_{i=1}^n i\,Z_i\right] \\
&= \sum_{i=1}^n i\,\mathrm{E}(Z_i) \\
&= p \sum_{i=1}^n i \\
&= p\,n(1+n)/2\,.
\end{align*}
$$

Similarly, because the $Z_i$ are independent with $\mathrm{Var}(Z_i) = p\,(1-p)$,

$$
\begin{align*}
\mathrm{Var}(W_+) &= \mathrm{Var}\left[\sum_{i=1}^n i\,Z_i\right] \\
&= \sum_{i=1}^n i^2\,\mathrm{Var}(Z_i) \\
&= p\,(1-p) \sum_{i=1}^n i^2 \\
&= p\,(1-p)\,n(n+1)(2n+1)/6\,.
\end{align*}
$$
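The two formulas can be checked numerically for small $n$ by enumerating all $2^n$ sign patterns. This is a brute-force sketch using only the standard library, and the helper name `w_plus_moments` is hypothetical.

```python
import itertools
import math


def w_plus_moments(n: int, p: float) -> tuple[float, float]:
    """Mean and variance of W+ = sum(i * Z_i) by full enumeration."""
    mean = 0.0
    second_moment = 0.0
    for z in itertools.product([0, 1], repeat=n):
        # Probability of this sign pattern under independent Bernoulli(p).
        prob = math.prod(p if zi else 1 - p for zi in z)
        w = sum(i * zi for i, zi in enumerate(z, start=1))
        mean += prob * w
        second_moment += prob * w * w
    return mean, second_moment - mean**2


n, p = 6, 0.5
mean, var = w_plus_moments(n, p)
mean_formula = p * n * (n + 1) / 2
var_formula = p * (1 - p) * n * (n + 1) * (2 * n + 1) / 6
print(mean, mean_formula)
print(var, var_formula)
```

With $p = 1/2$ the formulas reduce to $n(n+1)/4$ and $n(n+1)(2n+1)/24$. Other values of `p` can be passed to confirm the general expressions.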