In [42]:
import numpy as np
import matplotlib.pyplot as plt
import scipy

In [43]:
np.random.seed(42)

Part (a)
---

We perform a Wald test for the equality of the means $\mu_1$ and $\mu_2$. The null hypothesis is $H_0:\mu_1=\mu_2$. The difference $\theta=\mu_1-\mu_2$. So the null hypothesis is $H_0:\theta=0$. We compute the plug-in estimator $\hat{\theta}$ of the difference $\theta$.

In [44]:
Twain = np.array([.225, .262, .217, .240, .230, .229, .235, .217])

In [45]:
Snodgrass = np.array([.209, .205, .196, .210, .202, .207, .224, .223, .220, .201])

In [46]:
m = len(Twain)

In [47]:
n = len(Snodgrass)

In [48]:
mu_1_hat = np.mean(Twain)

In [49]:
mu_2_hat = np.mean(Snodgrass)

In [50]:
theta_hat = mu_1_hat - mu_2_hat

In [51]:
print('Mean proportion of 3-letter words in Twain essays: {0:.5f}'.format(mu_1_hat))
print('Mean proportion of 3-letter words in Snodgrass essays: {0:.5f}'.format(mu_2_hat))
print('Plug-in estimator for difference: {0:.5f}'.format(theta_hat))

Mean proportion of 3-letter words in Twain essays: 0.23187
Mean proportion of 3-letter words in Snodgrass essays: 0.20970
Plug-in estimator for difference: 0.02218


We estimate the standard error of $\hat{\theta}$ by $\widehat{\text{se}}=\sqrt{\widehat{\text{se}}(\mu_1)^2 + \widehat{\text{se}}(\mu_2)^2}$

In [52]:
se_1_hat = np.std(Twain) / np.sqrt(m)

In [53]:
se_2_hat = np.std(Snodgrass) / np.sqrt(n)

In [54]:
se_hat = np.sqrt(se_1_hat**2 + se_2_hat**2)

In [55]:
w = theta_hat / se_hat

In [56]:
print('Estimated standard error of mean proportions from Twain: {0:.5f}'.format(se_1_hat))
print('Estimated standard error of mean proportions from Snodgrass: {0:.5f}'.format(se_2_hat))
print('Estimated standard error of difference: {0:.5f}'.format(se_hat))
print('Observed Wald statistic: {0:.5f}'.format(w))

Estimated standard error of mean proportions from Twain: 0.00482
Estimated standard error of mean proportions from Snodgrass: 0.00290
Estimated standard error of difference: 0.00562
Observed Wald statistic: 3.94462


By Theorem 10.13 the p-value for the Wald test is $2\Phi(-w)$. A 95% confidence interval for $\theta$ is $(\hat{\theta} - 2\widehat{\text{se}}, \hat{\theta} + 2\widehat{\text{se}})$.

In [57]:
from scipy.stats import norm

In [58]:
wald_p_val = 2*norm.cdf(-w)

In [59]:
print('Estimated p-value: {0:.5f}'.format(wald_p_val))
print('Estimated 95% confidence interval: ({0:.5f}, {1:.5f})'.format(
    theta_hat - 2*se_hat, theta_hat + 2*se_hat))

Estimated p-value: 0.00008
Estimated 95% confidence interval: (0.01093, 0.03342)


Thus, using a size threshold of 0.05, we do reject the null hypothesis that $\mu_1 - \mu_2 = \theta = 0$ using the Wald test. I.e. we conclude that the mean proportions of three letter words are different for the two authors.

Part (b)
--

Now we use the permutation test to test the null hypothesis that $\mu_1-\mu_2=\theta=0$. Thus, we will consider the tuple $(X_1,\ldots,X_m,Y_1,\ldots,Y_n)$ where $X_i$ are the proportions of 3-letter words in the Twain essays and $Y_i$ are the proportions of 3-letter words in the Snodgrass essays. We will consider $t_{\text{obs}} = \overline{X_m} - \overline{Y_n}$. We will then permute $(X_1,\ldots,X_m,Y_1,\ldots,Y_n)$ randomly to form $(X_1',\ldots,X_m',Y_1',\ldots,Y_n')$ some large number $B$ of times. We will compute the difference $\overline{X_m'} - \overline{Y_n'}$ of means. The p-value will be estimated as the mean of $I(\overline{X_m'} - \overline{Y_n'} > t_{\text{obs}})$ over all these permutations.

In [60]:
t_obs = theta_hat

In [61]:
B = int(1e6)

In [62]:
T_boot = np.empty(B)

In [63]:
full_ds = np.concatenate((Twain, Snodgrass))

In [64]:
for i in range(B):
    perm = np.random.permutation(full_ds)
    mu = np.mean(perm[:m])
    nu = np.mean(perm[m:])
    T_boot[i] = 1 if nu-mu > t_obs else 0

In [65]:
perm_p_val = np.mean(T_boot)

In [66]:
print('p-value from permutation test: {0:.6f}'.format(perm_p_val))

p-value from permutation test: 0.000260


Since the estimated p-value $0.00026 < 0.05$, using the permutation test, we reject the null hypothesis that $\mu_1=\mu_2$. I.e. we conclude that the mean proportions of three letter words are different for the two authors.