# Week 10 - Hypothesis tests concerning two populations

This is a Jupyter notebook to explore the material in (Ross, 2017, Chp. 10). 



In [7]:
%matplotlib inline
# from now on we'll start each notebook with the library imports
# and special commands to keep these things in one place (which
# is good practice). The line above is a Jupyter command to get 
# matplotlib to plot inline (between cells)
# Next we import the libraries and give them short names
import numpy as np
import scipy.stats as stats
import matplotlib.pyplot as plt
from collections import Counter
from collections import defaultdict

## Exercise A

Complete questions 1, 5 and 6 from (Ross, 2017, Sec. 10.2 Problems). The text is repeated below for convenience:

> 1. An experiment is performed to test the difference in effectiveness of two
methods of cultivating wheat. A total of 12 patches of ground are treated
with shallow plowing and 14 with deep plowing. The average yield per
ground area of the first group is 45.2 bushels, and the average yield for
the second group is 48.6 bushels. Suppose it is known that shallow plowing
results in a ground yield having a standard deviation of 0.8 bushels,
while deep plowing results in a standard deviation of 1.0 bushels.
>
>    (a) Are the given data consistent, at the 5 percent level of significance,
with the hypothesis that the mean yield is the same for both methods?
>
>    (b) What is the p value for this hypothesis test?

> 5. In this section, we presented the test of
>
>    H 0 : μ x ≤ μ y
>
>    against
>
>    H 1 : μ x > μ y
>
>    Explain why it was not necessary to separately present the test of
>
>    H 0 : μ x ≥ μ y
>
>    against
>
>    H 1 : μ x < μ y

> 6. The device used by astronomers to measure distances results in measurements
that have a mean value equal to the actual distance of the
object being surveyed and a standard deviation of 0.5 light-years. An astronomer
is interested in testing the widely held hypothesis that asteroid
A is at least as close to the earth as is asteroid B. To test this hypothesis,
the astronomer made 8 independent measurements on asteroid A and
12 on asteroid B. If the average of the measurements for asteroid A was
22.4 light-years and the average of those for asteroid B was 21.3, will the
hypothesis be rejected at the 5 percent level of significance? What is the
p value?

*complete your answers in Markdown using the code-block below for computation*

#### Question 1.

We have two conditions, shallow (RVs $W_i$) and deep (RVs $D_i$) with unknown respective means of $\mu_W$ and $\mu_D$ and known SDs of $\sigma_W = 0.8$ and $\sigma_D = 1.0$. We draw a sample from each population with sizes $n_W = 12$ and $n_D=14$ with resulting sample means of $\bar{W} = 45.2$ and $\bar{D} = 48.6$ respectively.


>    (a) Are the given data consistent, at the 5 percent level of significance,
with the hypothesis that the mean yield is the same for both methods?

This suggests a two sided test with a null hypothesis of
$$H_0 : \mu_W = \mu_D$$

And a corresponding alternative hypothesis of
$$H_1 : \mu_W \neq \mu_D$$

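Since the SDs are known, the test statistic is standard normal under $H_0$:

$$TS = \frac{\bar{W} - \bar{D}}{\sqrt{\sigma_W^2/n_W +\sigma_D^2/n_D}} = -9.626$$

As $|-9.626| \geq z_{0.025} = 1.960$, we reject $H_0$: the data are not consistent with equal mean yields at the 5 percent level of significance.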


>    (b) What is the p value for this hypothesis test?

$$\text{p_value} = 2 \cdot Pr(Z \geq |-9.626|) = 0.0$$

The p-value is so small as to be indistinguishable from $0$ with our tools.
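The reported value of exactly $0.0$ comes from computing the tail as `1 - cdf`, which loses all precision once the CDF rounds to 1. A minimal sketch, assuming we are free to use the survival function `stats.norm.sf` instead (the names `p_naive` and `p_sf` are only for illustration):

```python
import numpy as np
import scipy.stats as stats

# Question 1 summary statistics (from the problem statement)
sigma_W, sigma_D = 0.8, 1.0
n_W, n_D = 12, 14
Wbar, Dbar = 45.2, 48.6

TS = (Wbar - Dbar) / np.sqrt(sigma_W**2 / n_W + sigma_D**2 / n_D)

# 1 - cdf suffers from cancellation: the cdf of a value this large rounds to 1.0
p_naive = 2 * (1 - stats.norm.cdf(np.abs(TS)))
# sf(z) = Pr(Z >= z) is evaluated directly, without subtracting from 1
p_sf = 2 * stats.norm.sf(np.abs(TS))

print(f"TS = {TS:.3f}")
print(f"p-value via 1 - cdf: {p_naive}")
print(f"p-value via sf:      {p_sf:.3g}")
```

The survival-function version returns a tiny but non-zero probability, which is a more honest summary than $0.0$; the conclusion of the test is unchanged.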

#### Question 5.

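The second pair of hypotheses is the first pair with the labels $X$ and $Y$ swapped: $H_0 : \mu_x \geq \mu_y$ against $H_1 : \mu_x < \mu_y$ is the same statement as $H_0 : \mu_y \leq \mu_x$ against $H_1 : \mu_y > \mu_x$. So we can relabel the two samples and apply the test already presented, with no other changes.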
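The relabelling argument can be checked numerically. The sketch below defines a hypothetical helper, `upper_tail_z_test` (not a function from Ross or SciPy), for $H_0 : \mu_x \leq \mu_y$ against $H_1 : \mu_x > \mu_y$ with known SDs, and then tests $H_1 : \mu_x < \mu_y$ simply by passing the two samples in the opposite order. The summary numbers are made up for illustration.

```python
import numpy as np
import scipy.stats as stats

def upper_tail_z_test(xbar, ybar, sigma_x, sigma_y, n_x, n_y):
    """Test H0: mu_x <= mu_y against H1: mu_x > mu_y (known SDs).

    Returns the test statistic and the one-sided p-value Pr(Z >= TS).
    """
    ts = (xbar - ybar) / np.sqrt(sigma_x**2 / n_x + sigma_y**2 / n_y)
    return ts, stats.norm.sf(ts)

# illustrative (made-up) summary statistics
xbar, sigma_x, n_x = 10.0, 2.0, 30
ybar, sigma_y, n_y = 11.0, 3.0, 40

# H1: mu_x > mu_y, the test as presented
ts_gt, p_gt = upper_tail_z_test(xbar, ybar, sigma_x, sigma_y, n_x, n_y)
# H1: mu_x < mu_y, the same function with the labels swapped
ts_lt, p_lt = upper_tail_z_test(ybar, xbar, sigma_y, sigma_x, n_y, n_x)

print(f"H1: mu_x > mu_y  TS = {ts_gt:.3f}, p = {p_gt:.3g}")
print(f"H1: mu_x < mu_y  TS = {ts_lt:.3f}, p = {p_lt:.3g}")
```

Swapping the arguments only flips the sign of the test statistic, so no separate lower-tail procedure is needed.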

#### Question 6.

Let the two RVs $A$ and $B$ be the recorded distances to the respective asteroids with true distances (means) $\mu_A$ and $\mu_B$ light-years and SDs $\sigma_A = \sigma_B = 0.5$ light-years respectively. 

Our astronomer is interested in testing the hypothesis that asteroid
$A$ is at least as close to the earth as is asteroid $B$. This suggests that a significant finding for the researcher would be that the current view is false, so the null hypothesis should be:

$$H_0 : \mu_A \leq \mu_B$$

with alternative hypothesis:

$$H_1 : \mu_A > \mu_B$$

(A one sided test.)

There are $n_A = 8$ distance measurements of asteroid $A$ with sample mean $\bar{A} = 22.4$ and $n_B = 12$ measurements of the distance of asteroid $B$ with sample mean $\bar{B} = 21.3$.

> will the hypothesis be rejected at the 5 percent level of significance?

$$TS = \frac{\bar{A} - \bar{B}}{\sqrt{\sigma_A^2/n_A +\sigma_B^2/n_B}} =  4.820$$

As $4.820 \geq z_{0.05} = 1.645$, we reject $H_0$.

>What is the p value?

$$\text{p_value} = Pr(Z \geq 4.820) = 7.18 \times 10^{-7}$$

This is a very small p-value, so the result is significant at any commonly used level.


In [84]:
## supporting code for Exercise A

# question 1
print("Question 1. (a)")
sigma_W = 0.8
sigma_D = 1.0
n_W = 12
n_D=14
Wbar = 45.2
Dbar = 48.6
print(f"sigma_W = {sigma_W}")
print(f"sigma_D = {sigma_D}")
print(f"n_W = {n_W}")
print(f"n_D = {n_D}")
print(f"Wbar = {Wbar}")
print(f"Dbar = {Dbar}")
print("H_0 : mu_W = mu_D")
print("H_1 : mu_W != mu_D")
# our test statistic
TS = (Wbar - Dbar)/np.sqrt( sigma_W**2/n_W + sigma_D**2/n_D)
print(f"TS = {TS:.3f}")
# at the alpha level of significance
alpha = 0.05
print(f"alpha = {alpha}")
# we reject if our standard normal has magnitude greater than
z0025 = stats.norm.ppf(1-alpha/2)
reject = np.abs(TS) >= z0025
print(f"|{TS:.3f}| {'>=' if reject else '<'} {z0025:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")
print()


print("Question 1. (b)")
p_value = 2*(1 - stats.norm.cdf(np.abs(TS)))
print(f"p_value = 2 Pr(Z >= |{TS:.3f}|) = {p_value}")
print()



print("Question 6.")
sigma_A = 0.5
sigma_B = 0.5
n_A = 8
n_B=12
Abar = 22.4
Bbar = 21.3
print(f"sigma_A = {sigma_A}")
print(f"sigma_B = {sigma_B}")
print(f"n_A = {n_A}")
print(f"n_B = {n_B}")
print(f"Abar = {Abar}")
print(f"Bbar = {Bbar}")
print("H_0 : mu_A <= mu_B")
print("H_1 : mu_A > mu_B")
# our test statistic
TS = (Abar - Bbar)/np.sqrt( sigma_A**2/n_A + sigma_B**2/n_B)
print(f"TS = {TS:.3f}")
# at the alpha level of significance
alpha = 0.05
print(f"alpha = {alpha}")
# we reject if our test statistic has value greater than
z005 = stats.norm.ppf(1-alpha)
reject = TS >= z005
print(f"{TS:.3f} {'>=' if reject else '<'} {z005:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")
# we can output the p-value in scientific notation to 3 significant figures
# using the g flag.
p_value = 1 - stats.norm.cdf(TS)
print(f"p_value = Pr(Z >= {TS:.3f}) = {p_value:.3g}")
print()

Question 1. (a)
sigma_W = 0.8
sigma_D = 1.0
n_W = 12
n_D = 14
Wbar = 45.2
Dbar = 48.6
H_0 : mu_W = mu_D
H_1 : mu_W != mu_D
TS = -9.626
alpha = 0.05
|-9.626| >= 1.960, so reject H0

Question 1. (b)
p_value = 2 Pr(Z >= |-9.626|) = 0.0

Question 6.
sigma_A = 0.5
sigma_B = 0.5
n_A = 8
n_B = 12
Abar = 22.4
Bbar = 21.3
H_0 : mu_A <= mu_B
H_1 : mu_A > mu_B
TS = 4.820
alpha = 0.05
4.820 >= 1.645, so reject H0
p_value = Pr(Z >= 4.820) = 7.18e-07



## Exercise B

Complete question 1 from Problems for (Ross, 2017, Sec. 10.3). The text is repeated below for convenience:

> 1. A high school is interested in determining whether two of its instructors
are equally able to prepare students for a statewide examination in geometry.
Seventy students taking geometry this semester were randomly
divided into two groups of 35 each. Instructor 1 taught geometry to the
first group, and instructor 2 to the second. At the end of the semester, the
students took the statewide examination, with the following results:
>
>    Class of instructor 1
>
>    $\bar{X} = 72.6$
>
>    $S_x^2 = 6.6$
>
>    Class of instructor 2
>
>    $\bar{Y} = 74.0$
>
>    $S_y^2 = 6.2$
>
>    Can we conclude from these results that the instructors are not equally
able in preparing students for the examinations? Use the 5 percent level
of significance. Give the null and alternative hypotheses and the resulting
p value.

*complete your answers in Markdown using the code-block below for computation*


We have two instructors, 1 (RVs $X_i$) and 2 (RVs $Y_i$), with sample sizes $n_X = n_Y = 35$ (large enough for the sample variances to be accurate estimates of the population variances), population means $\mu_X$ and $\mu_Y$, sample means $\bar{X}= 72.6$ and $\bar{Y} = 74.0$, and sample variances $S_X^2 = 6.6$ and $S_Y^2 = 6.2$.

The null hypothesis is that the instructors are equally able to prepare students, hence:

$$H_0 : \mu_X = \mu_Y$$

with alternative hypothesis:

$$H_1 : \mu_X \neq \mu_Y$$

(A two sided test.)

Our test statistic is
$$TS = \frac{\bar{X} - \bar{Y}}{\sqrt{S_X^2/n_X +S_Y^2/n_Y}} = -2.315$$


As $|-2.315| \geq 1.960$, we reject H0 at the 5% significance level.

The p-value is the probability we would see at least as extreme a difference so:

$$\text{p_value} = 2 Pr(Z \geq |-2.315|) = 0.0206$$

So a difference at least this extreme has a probability of $0.0206$ under the null hypothesis, and we conclude at the 5% level that the instructors are not equally able.

In [85]:
## supporting code for Exercise B

n_X = 35
n_Y = 35
Xbar= 72.6
Ybar = 74.0
# the problem statement gives the sample variances S_x^2 and S_y^2
S2_X = 6.6
S2_Y = 6.2
print("H_0 : mu_X = mu_Y")
print("H_1 : mu_X != mu_Y")
# our test statistic
TS = (Xbar - Ybar)/np.sqrt( S2_X/n_X + S2_Y/n_Y)
print(f"TS = {TS:.3f}")
# at the alpha level of significance
alpha = 0.05
print(f"alpha = {alpha}")
# we reject if our standard normal has magnitude greater than
z0025 = stats.norm.ppf(1-alpha/2)
# evaluate whether we reject
reject = np.abs(TS) >= z0025
# then output
print(f"|{TS:.3f}| {'>=' if reject else '<'} {z0025:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")

# p-value
p_value = 2*(1 - stats.norm.cdf(np.abs(TS)))
print(f"p_value = 2 Pr(Z >= |{TS:.3f}|) = {p_value:.3g}")
print()


H_0 : mu_X = mu_Y
H_1 : mu_X != mu_Y
TS = -2.315
alpha = 0.05
|-2.315| >= 1.960, so reject H0
p_value = 2 Pr(Z >= |-2.315|) = 0.0206



## Exercise C

Complete questions 2 and 3 from (Ross, 2017, Sec. 10.4 Problems). The text is repeated below for convenience:

> In the following problems, assume that the population distributions are normal
and have equal variances.
>
> 2. A study was instituted to learn how the diets of women changed during
the winter and the summer. A random group of 12 women were observed
during the month of July, and the percentage of each woman’s calories
that came from fat was determined. Similar observations were made on
a different randomly selected group of size 12 during the month of January.
Suppose the results were as follows:
>
>    July: 32.2, 27.4, 28.6, 32.4, 40.5, 26.2, 29.4, 25.8, 36.6, 30.3,
28.5, 32.0
>
>    January: 30.5, 28.4, 40.2, 37.6, 36.5, 38.8, 34.7, 29.5, 29.7, 37.2,
41.5, 37.0
>
>    Test the hypothesis that the mean fat intake is the same for both months.
>
>    Use the
>
>    (a) 5 percent
>
>    (b) 1 percent
>
>    level of significance.

> 3. A consumer organization has compared the time it takes a generic pain
reliever tablet to dissolve with the time it takes a name-brand tablet. Nine
tablets of each were checked. The following data resulted:
>
>    Generic: 14.2, 14.7, 13.9, 15.3, 14.8, 13.6, 14.6, 14.9, 14.2
>
>    Name: 14.3, 14.9, 14.4, 13.8, 15.0, 15.1, 14.4, 14.7, 14.9
>
>    (a) Do the given data establish, at the 5 percent level of significance,
that the name-brand tablet is quicker to dissolve?
>
>    (b) What about at the 10 percent level of significance?



*complete your answers in Markdown using the code-block below for computation*

#### Question 2

We have two RVs, $X$ and $Y$, corresponding to the summer and winter percentages of calories that came from fats. The populations have respective means $\mu_X$ and $\mu_Y$ and respective variances $\sigma_X^2$ and $\sigma_Y^2$ (all unknown).

Summer data ($X_i$s):

32.2, 27.4, 28.6, 32.4, 40.5, 26.2, 29.4, 25.8, 36.6, 30.3, 28.5, 32.0

Winter data ($Y_i$s):

30.5, 28.4, 40.2, 37.6, 36.5, 38.8, 34.7, 29.5, 29.7, 37.2, 41.5, 37.0

Sample sizes $n_X = 12$ and $n_Y = 12$. There is a two sided test with hypotheses:

$$H_0 : \mu_X = \mu_Y$$

$$H_1 : \mu_X \neq \mu_Y$$


The data-sets are small, but we can assume that the population variances (and hence SDs) are equal, i.e. $\sigma_X^2 = \sigma_Y^2$. The two sample means are:

$$\bar{X} = 30.8 \qquad \text{and} \qquad \bar{Y} = 35.1$$

The two sample variances are:

$$S^2_X = 18.5 \qquad \text{and} \qquad S^2_Y = 20.3$$

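We calculate the pooled sample variance as:

$$S^2_P = \frac{(n_X-1)S^2_X + (n_Y-1)S^2_Y}{n_X+n_Y-2} = 19.4$$

Our test-statistic is t-distributed with $n_X+n_Y -2 = 22$ degrees of freedom and value:

$$TS = \frac{\bar{X} - \bar{Y}}{\sqrt{S^2_P\,(1/n_X + 1/n_Y)}} = -2.394$$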
    
>    Test the hypothesis that the mean fat intake is the same for both months.
>
>    Use the
>
>    (a) 5 percent

For a significance-level-0.05 test:

$$t_{n_X+n_Y-2,\alpha/2} = t_{22,0.025} = 2.074$$

Since $|-2.394| \geq 2.074$, we reject H0.

>
>    (b) 1 percent

For a significance-level-0.01 test:

$$t_{n_X+n_Y-2,\alpha/2} = t_{22,0.005} = 2.819$$

Since $|-2.394| < 2.819$, we do not reject H0.

For completeness, the p-value is:

$$\text{p_value} = 2 \cdot \Pr\left(T_{n_X+n_Y-2} \geq \left|-2.394\right|\right) = 0.0256$$
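As a cross-check on the hand-rolled pooled calculation, SciPy's `stats.ttest_ind` with `equal_var=True` performs the same pooled-variance two-sample t-test. The sketch below is only a sanity check, not the method used in the supporting code; the array names `july` and `january` are chosen here for readability.

```python
import numpy as np
import scipy.stats as stats

july = np.array([32.2, 27.4, 28.6, 32.4, 40.5, 26.2,
                 29.4, 25.8, 36.6, 30.3, 28.5, 32.0])
january = np.array([30.5, 28.4, 40.2, 37.6, 36.5, 38.8,
                    34.7, 29.5, 29.7, 37.2, 41.5, 37.0])

# equal_var=True requests the pooled-variance (Student) two-sample t-test;
# the default alternative is two-sided
res = stats.ttest_ind(july, january, equal_var=True)
print(f"TS = {res.statistic:.3f}, p_value = {res.pvalue:.3g}")
```

The statistic and p-value should agree with the values $-2.394$ and $0.0256$ computed step by step.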


#### Question 3.


Call the times to dissolve for the generic brand $X_i$ and the times to dissolve for the name brand $Y_i$. The populations have respective means $\mu_X$ and $\mu_Y$ and respective variances $\sigma_X^2$ and $\sigma_Y^2$ (all unknown).

>    (a) Do the given data establish, at the 5 percent level of significance,
that the name-brand tablet is quicker to dissolve?

To be able to establish this (where the data allow), we must have the following null and alternative hypotheses:

$$H_0 : \mu_X \leq \mu_Y$$
$$H_1 : \mu_X > \mu_Y$$

This is a one-sided test, where rejection of the null hypothesis establishes that the name brand has a shorter mean time to dissolve.

The data-sets are small ($n_X = 9$ and $n_Y = 9$), but we can assume that the population variances (and hence SDs) are equal, i.e. $\sigma_X^2 = \sigma_Y^2$. The two sample means are:

$$\bar{X} = 14.5 \qquad \text{and} \qquad \bar{Y} = 14.6$$

The two sample variances are:

$$S^2_X = 0.285 \qquad \text{and} \qquad S^2_Y = 0.176$$

We calculate the pooled sample variance as:

$$S^2_P = 0.231$$

Our test-statistic is t-distributed with $16$ degrees of freedom and value:

$$TS = -0.638$$

Significance-level-0.05 test:

$$t_{n_X+n_Y-2,\alpha} = t_{16,0.05} = 1.746$$

As $-0.638 < 1.746$, we do not reject $H_0$.

>    (b) What about at the 10 percent level of significance?

Significance-level-0.1 test:

$$t_{n_X+n_Y-2,\alpha} = t_{16,0.10} = 1.337$$

As $-0.638 < 1.337$, we do not reject $H_0$.


In fact, we could have noted at the outset that the sample mean time to dissolve of the generic was actually lower than that of the name brand, so the test statistic is negative and the p-value is greater than $0.5$. As such, we would not reject the null hypothesis at any reasonable significance level.
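The claim that the p-value exceeds $0.5$ can be confirmed with a one-sided pooled t-test. This sketch assumes SciPy 1.6 or later, where `stats.ttest_ind` accepts the `alternative` keyword; the array names are chosen here for illustration.

```python
import numpy as np
import scipy.stats as stats

generic = np.array([14.2, 14.7, 13.9, 15.3, 14.8, 13.6, 14.6, 14.9, 14.2])
name = np.array([14.3, 14.9, 14.4, 13.8, 15.0, 15.1, 14.4, 14.7, 14.9])

# alternative="greater" tests H1 : mu_X > mu_Y (generic takes longer),
# with the pooled variance requested by equal_var=True
res = stats.ttest_ind(generic, name, equal_var=True, alternative="greater")
print(f"TS = {res.statistic:.3f}, one-sided p_value = {res.pvalue:.3g}")
```

Because the test statistic is negative, the upper-tail probability is more than one half, so $H_0$ cannot be rejected at the 5 or 10 percent level.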

In [86]:
## supporting code for exercise C

print("Question 2")
dataX = np.array([32.2, 27.4, 28.6, 32.4, 40.5, 26.2, 29.4, 25.8, 36.6, 30.3, 28.5, 32.0])
dataY = np.array([30.5, 28.4, 40.2, 37.6, 36.5, 38.8, 34.7, 29.5, 29.7, 37.2, 41.5, 37.0])
Xbar = np.mean(dataX)
Ybar = np.mean(dataY)
print("Hypotheses:")
print("\tH_0 : mu_X = mu_Y")
print("\tH_1 : mu_X != mu_Y")
print(f"The two sample means are:\n\tXbar = {Xbar:.1f} and Ybar = {Ybar:.1f}")
n_X = dataX.size
n_Y = dataY.size
print(f"Sample sizes are:\n\tn_X = {n_X} and n_Y = {n_Y}")
S2_X = np.var(dataX, ddof=1)
S2_Y = np.var(dataY, ddof=1)
print(f"The two sample variances are:\n\tS2_X = {S2_X:.1f} and S2_Y = {S2_Y:.1f}")
S2_P = ((n_X-1)*S2_X + (n_Y-1)*S2_Y)/(n_X + n_Y - 2)
print(f"We calculate the pooled sample variance as:\n\tS2_P = {S2_P:.1f}")
dof = n_X + n_Y - 2
TS = (Xbar - Ybar)/np.sqrt(S2_P*(1/n_X + 1/n_Y))
print(f"Our test-statistic is t-distributed with {dof} degrees of freedom", end="")
print(f" and value:\n\tTS = {TS:.3f}")
# part a
alpha = 0.05
# we reject if our test statistic has magnitude greater than
t0025 = stats.t.ppf(1-alpha/2, dof)
# evaluate whether we reject
reject = np.abs(TS) >= t0025
# then output
print(f"Significance-level-{alpha} test:")
print(f"\tt(n+m-2,alpha/2) = {t0025:.3f}")
print(f"\t|{TS:.3f}| {'>=' if reject else '<'} {t0025:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")
# part b
alpha = 0.01
# we reject if our test statistic has magnitude greater than
t0005 = stats.t.ppf(1-alpha/2, dof)
# evaluate whether we reject
reject = np.abs(TS) >= t0005
# then output
print(f"Significance-level-{alpha} test:")
print(f"\tt(n+m-2,alpha/2) = {t0005:.3f}")
print(f"\t|{TS:.3f}| {'>=' if reject else '<'} {t0005:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")

# p-value
p_value = 2*(1 - stats.t.cdf(np.abs(TS), dof))
print(f"p_value = 2 Pr(T(n+m-2) >= |{TS:.3f}|) = {p_value:.3g}")
print()
print()

print("Question 3.")

dataX = np.array([14.2, 14.7, 13.9, 15.3, 14.8, 13.6, 14.6, 14.9, 14.2])
dataY = np.array([14.3, 14.9, 14.4, 13.8, 15.0, 15.1, 14.4, 14.7, 14.9])
Xbar = np.mean(dataX)
Ybar = np.mean(dataY)
print("Hypotheses:")
print("\tH_0 : mu_X <= mu_Y")
print("\tH_1 : mu_X > mu_Y")
print(f"The two sample means are:\n\tXbar = {Xbar:.1f} and Ybar = {Ybar:.1f}")
n_X = dataX.size
n_Y = dataY.size
print(f"Sample sizes are:\n\tn_X = {n_X} and n_Y = {n_Y}")
S2_X = np.var(dataX, ddof=1)
S2_Y = np.var(dataY, ddof=1)
print(f"The two sample variances are:\n\tS2_X = {S2_X:.3f} and S2_Y = {S2_Y:.3f}")
S2_P = ((n_X-1)*S2_X + (n_Y-1)*S2_Y)/(n_X + n_Y - 2)
print(f"We calculate the pooled sample variance as:\n\tS2_P = {S2_P:.3f}")
dof = n_X + n_Y - 2
TS = (Xbar - Ybar)/np.sqrt(S2_P*(1/n_X + 1/n_Y))
print(f"Our test-statistic is t-distributed with {dof} degrees of freedom", end="")
print(f" and value:\n\tTS = {TS:.3f}")
# part a
alpha = 0.05
# we reject if our test statistic has value greater than
t005 = stats.t.ppf(1-alpha, dof)
# evaluate whether we reject
reject = TS >= t005
# then output
print(f"Significance-level-{alpha} test:")
print(f"\tt(n+m-2,alpha) = {t005:.3f}")
print(f"\t{TS:.3f} {'>=' if reject else '<'} {t005:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")
# part b
alpha = 0.1
# we reject if our test statistic has value greater than
t01 = stats.t.ppf(1-alpha, dof)
# evaluate whether we reject
reject = TS >= t01
# then output
print(f"Significance-level-{alpha} test:")
print(f"\tt(n+m-2,alpha) = {t01:.3f}")
print(f"\t{TS:.3f} {'>=' if reject else '<'} {t01:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")


Question 2
Hypotheses:
	H_0 : mu_X = mu_Y
	H_1 : mu_X != mu_Y
The two sample means are:
	Xbar = 30.8 and Ybar = 35.1
Sample sizes are:
	n_X = 12 and n_Y = 12
The two sample variances are:
	S2_X = 18.5 and S2_Y = 20.3
We calculate the pooled sample variance as:
	S2_P = 19.4
Our test-statistic is t-distributed with 22 degrees of freedom and value:
	TS = -2.394
Significance-level-0.05 test:
	t(n+m-2,alpha/2) = 2.074
	|-2.394| >= 2.074, so reject H0
Significance-level-0.01 test:
	t(n+m-2,alpha/2) = 2.819
	|-2.394| < 2.819, so do not reject H0
p_value = 2 Pr(T(n+m-2) >= |-2.394|) = 0.0256


Question 3.
Hypotheses:
	H_0 : mu_X <= mu_Y
	H_1 : mu_X > mu_Y
The two sample means are:
	Xbar = 14.5 and Ybar = 14.6
Sample sizes are:
	n_X = 9 and n_Y = 9
The two sample variances are:
	S2_X = 0.285 and S2_Y = 0.176
We calculate the pooled sample variance as:
	S2_P = 0.231
Our test-statistic is t-distributed with 16 degrees of freedom and value:
	TS = -0.638
Significance-level-0.05 test:
	t(n+m-2,alpha

## Exercise D

Complete question 6 from problems for (Ross, 2017, Sec. 10.5). The text is repeated below for convenience:

> 6. Consider Prob. 2 of Sec. 10.4. Suppose that the same women were used
for both months and that the data in each of the columns referred to the
same woman’s fat intake during the summer and winter.
>
>    (a) Test the hypothesis that there is no difference in fat intake during
summer and winter. Use the 5 percent level of significance.
>
>    (b) Repeat (a), this time using the 1 percent level.


*complete your answers in Markdown using the code-block below for computation*

#### Question 6

We have two sets of RVs, $X_i$ and $Y_i$, corresponding to the summer and winter percentages of calories that came from fats. Now, we assume that person $i$ produced the pair of results $(X_i, Y_i)$. The populations have respective means $\mu_X$ and $\mu_Y$.

>    (a) Test the hypothesis that there is no difference in fat intake during
summer and winter. Use the 5 percent level of significance.

The sample size is $n = 12$, and we have a two-sided test with hypotheses:

$$H_0 : \mu_X = \mu_Y$$

$$H_1 : \mu_X \neq \mu_Y$$

We now define the difference RVs, $D_i = X_i - Y_i$. This allows us to rewrite the hypotheses as the following:

$$H_0 : \mu_D = 0$$

$$H_1 : \mu_D \neq 0$$

The sample mean of the differences is $\bar{D} = -4.31$ and the sample size is $n = 12$. The sample SD of the differences is:
$$S_D = 6.39$$

Our test-statistic is t-distributed with $n-1=11$ degrees of freedom and value:
$$TS = \frac{\sqrt{n}\,\bar{D}}{S_D} = -2.337$$


Significance-level-0.05 test:

$$t_{n-1,\alpha/2} = t_{11,0.025} = 2.201$$

As $|-2.337| \geq 2.201$, we reject H0 at the 5% level.

>    (b) Repeat (a), this time using the 1 percent level.

Significance-level-0.01 test:

$$t_{n-1,\alpha/2} = t_{11,0.005} = 3.106$$

As $|-2.337| < 3.106$, we do not reject H0 at the 1% level.
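The paired analysis can be cross-checked with SciPy's `stats.ttest_rel`, which is equivalent to a one-sample t-test on the differences $D_i$. This is a minimal sketch for verification only; the array names are chosen here for readability.

```python
import numpy as np
import scipy.stats as stats

july = np.array([32.2, 27.4, 28.6, 32.4, 40.5, 26.2,
                 29.4, 25.8, 36.6, 30.3, 28.5, 32.0])
january = np.array([30.5, 28.4, 40.2, 37.6, 36.5, 38.8,
                    34.7, 29.5, 29.7, 37.2, 41.5, 37.0])

# paired t-test on (X_i, Y_i)
paired = stats.ttest_rel(july, january)
# the same test written as a one-sample test of H0 : mu_D = 0
one_sample = stats.ttest_1samp(july - january, 0.0)

print(f"paired:     TS = {paired.statistic:.3f}, p_value = {paired.pvalue:.3g}")
print(f"one-sample: TS = {one_sample.statistic:.3f}, p_value = {one_sample.pvalue:.3g}")
```

Both calls should give the same statistic, with a p-value between $0.01$ and $0.05$, matching the reject/do-not-reject decisions in parts (a) and (b).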


In [87]:
## supporting code for Exercise D

print("Question 6")
dataX = np.array([32.2, 27.4, 28.6, 32.4, 40.5, 26.2, 29.4, 25.8, 36.6, 30.3, 28.5, 32.0])
dataY = np.array([30.5, 28.4, 40.2, 37.6, 36.5, 38.8, 34.7, 29.5, 29.7, 37.2, 41.5, 37.0])
print("Hypotheses:")
print("\tH_0 : mu_X = mu_Y")
print("\tH_1 : mu_X != mu_Y")
dataD = dataX - dataY
Dbar = np.mean(dataD)
print("Equivalent Hypotheses:")
print("\tH_0 : mu_D = 0")
print("\tH_1 : mu_D != 0")
print(f"The sample mean of the difference is:\n\tDbar = {Dbar:.2f}")
n = dataX.size
print(f"Sample size is:\n\tn = {n}")
S2_D = np.var(dataD, ddof=1)
print(f"The sample variance of the difference is:\n\tS2_D = {S2_D:.2f}")
S_D = np.std(dataD, ddof=1)
print(f"The sample SD of the difference is:\n\tS_D = {S_D:.2f}")
dof = n-1
TS = np.sqrt(n)*Dbar/S_D
print(f"Our test-statistic is t-distributed with {dof} degrees of freedom", end="")
print(f" and value:\n\tTS = {TS:.3f}")
# part a
alpha = 0.05
# we reject if our test statistic has magnitude greater than
t0025 = stats.t.ppf(1-alpha/2, dof)
# evaluate whether we reject
reject = np.abs(TS) >= t0025
# then output
print(f"Significance-level-{alpha} test:")
print(f"\tt(n-1,alpha/2) = {t0025:.3f}")
print(f"\t|{TS:.3f}| {'>=' if reject else '<'} {t0025:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")
# part b
alpha = 0.01
# we reject if our test statistic has magnitude greater than
t0005 = stats.t.ppf(1-alpha/2, dof)
# evaluate whether we reject
reject = np.abs(TS) >= t0005
# then output
print(f"Significance-level-{alpha} test:")
print(f"\tt(n-1,alpha/2) = {t0005:.3f}")
print(f"\t|{TS:.3f}| {'>=' if reject else '<'} {t0005:.3f}, ", end="")
print(f"so {'reject H0' if reject else 'do not reject H0'}")


Question 6
Hypotheses:
	H_0 : mu_X = mu_Y
	H_1 : mu_X != mu_Y
Equivalent Hypotheses:
	H_0 : mu_D = 0
	H_1 : mu_D != 0
The sample mean of the difference is:
	Dbar = -4.31
Sample size is:
	n = 12
The sample variance of the difference is:
	S2_D = 40.77
The sample SD of the difference is:
	S_D = 6.39
Our test-statistic is t-distributed with 11 degrees of freedom and value:
	TS = -2.337
Significance-level-0.05 test:
	t(n-1,alpha/2) = 2.201
	|-2.337| >= 2.201, so reject H0
Significance-level-0.01 test:
	t(n-1,alpha/2) = 3.106
	|-2.337| < 3.106, so do not reject H0


## Exercise E

Complete questions 1, 4 and 11 from problems for (Ross, 2017, Sec. 10.6). The text is repeated below for convenience:

> 1. Two methods have been proposed for producing transistors. If method
1 resulted in 20 unacceptable transistors out of a total of 100 produced
and method 2 resulted in 12 unacceptable transistors out of a total of 100
produced, can we conclude that the proportions of unacceptable transistors
that will be produced by the two methods are different?
>
>    (a) Use the 5 percent level of significance.
>
>    (b) What about at the 10 percent level of significance?

> 4. A large swine flu vaccination program was instituted in 1976. Approximately
50 million of the roughly 220 million North Americans received
the vaccine. Of the 383 persons who subsequently contracted swine flu,
202 had received the vaccine.
>
>    (a) Test the hypothesis, at the 5 percent level, that the probability of
contracting swine flu is the same for the vaccinated portion of the
population as for the unvaccinated.
>
>    (b) Do the results of part (a) indicate that the vaccine itself was causing
the flu? Can you think of any other possible explanations?


> 11. A birthing class run by the University of California has recently added
a lecture on the importance of the use of automobile car seats for children.
This decision was made after a study of the results of an experiment
in which the lecture was given in some of the birthing classes and not in
others. A follow-up interview, carried out 1 year later, questioned 82
couples who had heard the lecture and 120 who had not. A total of 78 of the
couples who had heard the lecture stated that they always used an infant
car seat, whereas a total of 90 of those couples not attending the lecture
made the same claim.
>
>    (a) Assuming the accuracy of the given information, is the difference
significant enough to conclude that instituting the lecture will result
in increased use of car seats? Use the 5 percent level of significance.
>
>    (b) What is the p value?

*complete your answers in Markdown using the code-block below for computation*


#### Question 1
The two methods give two different populations. The proportion of population 1 (using method 1) which are unacceptable is $p_1$, and the proportion of population 2 (using method 2) is $p_2$. We are asked whether these proportions differ, which suggests a two-sided test with hypotheses:

$$H_0 : p_1 = p_2$$

$$H_1 : p_1 \neq p_2$$

We have a sample from each of size $n_1 = 100$ and $n_2 = 100$ with $X_1 = 20$ unacceptable in the sample from population 1 and $X_2=12$ unacceptable in the sample from population 2.

Estimates of the two population proportions are given by sample proportions:

$$\hat{p}_1  = 0.200
\qquad \text{and} \qquad
\hat{p}_2 = 0.120$$

The pooled proportion estimate under H0 is $\hat{p} = (X_1 + X_2)/(n_1 + n_2) = 0.160$, and our test statistic is:

$$TS = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{(1/n_1 + 1/n_2)\hat{p}(1-\hat{p})}}= 1.543$$

>    (a) Use the 5 percent level of significance.

We have a significance level test with $\alpha = 0.05$.

We will reject if our test statistic satisfies:

$$|TS| \geq z_{0.025} = 1.960$$

As $|1.543| < 1.960$, we do not reject H0 at the 5% significance level.

>    (b) What about at the 10 percent level of significance?

We have a significance level test with $\alpha = 0.1$.

We will reject if our test statistic satisfies:

$$|TS| \geq z_{0.05} = 1.645$$

As $|1.543| < 1.645$, we do not reject H0 at the 10% significance level.

For completeness:

$$\text{p_value} = 2 Pr(Z \geq |1.543|) = 0.123$$
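The same pooled two-proportion calculation recurs in every question of this exercise, so it may help to see it wrapped once as a function. The helper name `two_proportion_z_test` is hypothetical (it is not a SciPy function); the sketch simply packages the formula used in the supporting code.

```python
import numpy as np
import scipy.stats as stats

def two_proportion_z_test(x_1, n_1, x_2, n_2):
    """Two-sided test of H0: p1 = p2 using the pooled proportion estimate.

    Returns the test statistic and the two-sided p-value.
    """
    phat1 = x_1 / n_1
    phat2 = x_2 / n_2
    # pooled estimate of the common proportion under H0
    phat = (x_1 + x_2) / (n_1 + n_2)
    ts = (phat1 - phat2) / np.sqrt((1 / n_1 + 1 / n_2) * phat * (1 - phat))
    p_value = 2 * stats.norm.sf(np.abs(ts))
    return ts, p_value

# Question 1: 20 of 100 unacceptable (method 1), 12 of 100 (method 2)
ts, p_value = two_proportion_z_test(20, 100, 12, 100)
print(f"TS = {ts:.3f}, p_value = {p_value:.3g}")
for alpha in (0.05, 0.1):
    print(f"alpha = {alpha}: {'reject H0' if p_value <= alpha else 'do not reject H0'}")
```

Comparing the p-value with $\alpha$ is equivalent to comparing $|TS|$ with $z_{\alpha/2}$, so both parts of the question are answered by the single p-value.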

#### Question 4.

This requires a little interpretation. We have two samples with (very large) sizes $n_1 = 50000000 = 5e7$ and $n_2 = 220000000 = 2.2e8$ people. These can be thought of as samples from an even larger population (say if we were to institute the same practice worldwide). Of these the number of positive cases in each sample were $X_1 = 202$ and $X_2 = 383 - 202 = 181$.

>    (a) Test the hypothesis, at the 5 percent level, that the probability of
contracting swine flu is the same for the vaccinated portion of the
population as for the unvaccinated.

This suggests the following hypotheses:

$$H_0 : p_1 = p_2$$

$$H_1 : p_1 \neq p_2$$

So let's test this. Sample proportions are: 

$$\hat{p}_1 = 4.04  \times 10^{-6}
\qquad \text{and}\qquad
\hat{p}_2 = 8.23 \times 10^{-7}$$

Pooled proportion estimate under H0:
$$\hat{p} = 1.42  \times 10^{-6}$$

Our test statistic is therefore:

$$TS = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{(1/n_1 + 1/n_2)\hat{p}(1-\hat{p})}}= 17.2$$

For $\alpha = 0.05$, we will reject if our test statistic satisfies:

$$|TS| \geq z_{0.025} = 1.960$$

As $|17.242| \geq 1.960$, we reject H0 at the 5% significance level.

In fact, this strongly indicates that the two populations have different probabilities of contracting swine flu, with the vaccinated population having the much higher sample proportion. Whatever significance level we chose, this would be a significant result. The p-value is indistinguishable from $0$.

>    (b) Do the results of part (a) indicate that the vaccine itself was causing
the flu? Can you think of any other possible explanations?

There are a number of alternative explanations. It may be that higher-risk individuals were targeted for vaccination. It may be that vaccinations occurred in regions that also suffered higher prevalence of the disease.


#### Question 11.

>    (a) Assuming the accuracy of the given information, is the difference
significant enough to conclude that instituting the lecture will result
in increased use of car seats? Use the 5 percent level of significance.

We can treat the first $n_1 = 82$ couples (who had heard the lecture) as a sample from population $1$ (those who have heard the lecture), and likewise treat the remaining $n_2 = 120$ couples as a sample from population $2$ (those who have not). We then have $X_1 = 78$ and $X_2 = 90$ counts from the respective samples of those who always use an infant car seat.

Our hypotheses are:

$$H_0 : p_1 = p_2 \qquad \text{The lecture makes no difference}$$

$$H_1 : p_1 \neq p_2\qquad \text{The lecture makes a difference}$$

Sample proportions are: 

$$\hat{p}_1 = 0.951
\qquad \text{and}\qquad
\hat{p}_2 = 0.75
$$

Pooled proportion estimate under H0:
$$\hat{p} = 0.832
$$

Our test statistic is therefore:

$$TS = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{(1/n_1 + 1/n_2)\hat{p}(1-\hat{p})}}= 3.754
$$

For $\alpha = 0.05$, we will reject if our test statistic satisfies:

$$|TS| \geq z_{0.025} = 1.960$$

As $|3.754| \geq 1.960$, we reject H0 at the 5% significance level.

>    (b) What is the p value?

And we calculate the p-value as:

$$\text{p_value} = 2 Pr(Z \geq |3.754|) = 0.000174 = 1.74 \times 10^{-4}$$

This is highly suggestive that there is a difference between the conditions.
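Part (a) asks about *increased* use, so a reader might also frame the alternative as one-sided, $H_1 : p_1 > p_2$. That is not the test carried out above; the sketch below only shows how the p-value would change under that assumption. For a positive test statistic, the one-sided p-value is half the two-sided one.

```python
import numpy as np
import scipy.stats as stats

# Question 11 counts (from the problem statement)
n_1, n_2 = 82, 120
X_1, X_2 = 78, 90

phat1, phat2 = X_1 / n_1, X_2 / n_2
phat = (X_1 + X_2) / (n_1 + n_2)  # pooled estimate under H0
TS = (phat1 - phat2) / np.sqrt((1 / n_1 + 1 / n_2) * phat * (1 - phat))

p_two_sided = 2 * stats.norm.sf(np.abs(TS))  # H1 : p1 != p2, as in the text
p_one_sided = stats.norm.sf(TS)              # assumes H1 : p1 > p2 instead

print(f"TS = {TS:.3f}")
print(f"two-sided p_value = {p_two_sided:.3g}")
print(f"one-sided p_value = {p_one_sided:.3g}")
```

Either way the p-value is far below $0.05$, so the decision at the 5 percent level does not depend on this choice.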

In [97]:
## supporting code for Exercise E

print("Question 1.")
n_1 = 100
n_2 = 100
X_1 = 20
X_2=12
phat1 = X_1 / n_1 
phat2 = X_2 / n_2 
print("H0 : p1 = p2")
print("H1 : p1 != p2")
print(f"n_1 = {n_1}")
print(f"n_2 = {n_2}")
print(f"X_1 = {X_1}")
print(f"X_2 = {X_2}")
print(f"Sample proportions: phat1 = {phat1:.3f} and  phat2 = {phat2:.3f}")
phat = (X_1 + X_2)/(n_1 + n_2)
TS = (phat1 - phat2)/np.sqrt((1/n_1 + 1/n_2)*phat*(1-phat))
print("part (a)")
# at the 5% level of significance
alpha = 0.05
print(f"alpha = {alpha}")
# we reject if our standardised RV has magnitude greater than
z0025 = stats.norm.ppf(1-alpha/2)
print(f"We will reject if our test statistic:\n\tTS >= z_(0.025) = {z0025:.3f}")
# evaluate whether we reject
reject = np.abs(TS) >= z0025
# then output
print(f"|{TS:.3f}| {'>=' if reject else '<'} {z0025:.3f}, ", end="")
print(f"we {'reject H0' if reject else 'do not reject H0'} at the {alpha*100:.0f}% significance level")
print("part (b)")
# at the 10% level of significance
alpha = 0.1
print(f"alpha = {alpha}")
# we reject if our standardised RV has magnitude greater than
z005 = stats.norm.ppf(1-alpha/2)
print(f"We will reject if our test statistic:\n\tTS >= z_(0.05) = {z005:.3f}")
# evaluate whether we reject
reject = np.abs(TS) >= z005
# then output
print(f"As |{TS:.3f}| {'>=' if reject else '<'} {z005:.3f}, ", end="")
print(f"we {'reject H0' if reject else 'do not reject H0'} at the {alpha*100:.0f}% significance level")
# p-value
p_value = 2*(1 - stats.norm.cdf(np.abs(TS)))
print(f"p_value = 2 Pr(Z >= |{TS:.3f}|) = {p_value:.3g}")
print()

print("Question 4.")
n_1 = 50000000
n_2 = 220000000
X_1 = 202
X_2 = 383 - 202
print("H0 : p1 = p2")
print("H1 : p1 != p2")
print(f"n_1 = {n_1}")
print(f"n_2 = {n_2}")
print(f"X_1 = {X_1}")
print(f"X_2 = {X_2}")
phat1 = X_1 / n_1 
phat2 = X_2 / n_2
print(f"Sample proportions: phat1 = {phat1} and  phat2 = {phat2}")
phat = (X_1 + X_2)/(n_1 + n_2)
print(f"Pooled proportion estimate under H0:\n\tphat = {phat}")
TS = (phat1 - phat2)/np.sqrt((1/n_1 + 1/n_2)*phat*(1-phat))
print(f"Our test statistic is therefore:\n\tTS = {TS}")
print("part (a)")
# at the 5% level of significance
alpha = 0.05
print(f"alpha = {alpha}")
# we reject if our standardised RV has magnitude greater than
z0025 = stats.norm.ppf(1-alpha/2)
print(f"We will reject if our test statistic:\n\tTS >= z_(0.025) = {z0025:.3f}")
# evaluate whether we reject
reject = np.abs(TS) >= z0025
# then output
print(f"|{TS:.3f}| {'>=' if reject else '<'} {z0025:.3f}, ", end="")
print(f"we {'reject H0' if reject else 'do not reject H0'} at the {alpha*100:.0f}% significance level")
print("part (b)")
# at the 10% level of significance
alpha = 0.1
print(f"alpha = {alpha}")
# we reject if our standardised RV has magnitude greater than
z005 = stats.norm.ppf(1-alpha/2)
print(f"We will reject if our test statistic:\n\tTS >= z_(0.05) = {z005:.3f}")
# evaluate whether we reject
reject = np.abs(TS) >= z005
# then output
print(f"As |{TS:.3f}| {'>=' if reject else '<'} {z005:.3f}, ", end="")
print(f"we {'reject H0' if reject else 'do not reject H0'} at the {alpha*100:.0f}% significance level")
# p-value
p_value = 2*(1 - stats.norm.cdf(np.abs(TS)))
print(f"p_value = 2 Pr(Z >= |{TS:.3f}|) = {p_value:.3g}")
print()

print("Question 11.")
n_1 = 82
n_2 = 120
X_1 = 78
X_2 = 90
print("Our hypotheses are:")
print("\tH0 : p1 = p2 -- The lecture makes no difference")
print("\tH1 : p1 != p2 -- The lecture makes a difference")
print(f"n_1 = {n_1}")
print(f"n_2 = {n_2}")
print(f"X_1 = {X_1}")
print(f"X_2 = {X_2}")
phat1 = X_1 / n_1 
phat2 = X_2 / n_2
print(f"Sample proportions: phat1 = {phat1} and  phat2 = {phat2}")
phat = (X_1 + X_2)/(n_1 + n_2)
print(f"Pooled proportion estimate under H0:\n\tphat = {phat}")
TS = (phat1 - phat2)/np.sqrt((1/n_1 + 1/n_2)*phat*(1-phat))
print(f"Our test statistic is therefore:\n\tTS = {TS}")
print("part (a)")
# at the 5% level of significance
alpha = 0.05
print(f"alpha = {alpha}")
# we reject if our standardised RV has magnitude greater than
z0025 = stats.norm.ppf(1-alpha/2)
print(f"We will reject if our test statistic:\n\tTS >= z_(0.025) = {z0025:.3f}")
# evaluate whether we reject
reject = np.abs(TS) >= z0025
# then output
print(f"|{TS:.3f}| {'>=' if reject else '<'} {z0025:.3f}, ", end="")
print(f"we {'reject H0' if reject else 'do not reject H0'} at the {alpha*100:.0f}% significance level")
print("part (b)")
# p-value
p_value = 2*(1 - stats.norm.cdf(np.abs(TS)))
print(f"p_value = 2 Pr(Z >= |{TS:.3f}|) = {p_value:.3g}")



Question 1.
H0 : p1 = p2
H1 : p1 != p2
n_1 = 100
n_2 = 100
X_1 = 20
X_2 = 12
Sample proportions: phat1 = 0.200 and  phat2 = 0.120
part (a)
alpha = 0.05
We will reject if our test statistic:
	TS >= z_(0.025) = 1.960
|1.543| < 1.960, we do not reject H0 at the 5% significance level
part (b)
alpha = 0.1
We will reject if our test statistic:
	TS >= z_(0.05) = 1.645
As |1.543| < 1.645, we do not reject H0 at the 10% significance level
p_value = 2 Pr(Z >= |1.543|) = 0.123

Question 4.
H0 : p1 = p2
H1 : p1 != p2
n_1 = 50000000
n_2 = 220000000
X_1 = 202
X_2 = 181
Sample proportions: phat1 = 4.04e-06 and  phat2 = 8.227272727272728e-07
Pooled proportion estimate under H0:
	phat = 1.4185185185185185e-06
Our test statistic is therefore:
	TS = 17.241900761040032
part (a)
alpha = 0.05
We will reject if our test statistic:
	TS >= z_(0.025) = 1.960
|17.242| >= 1.960, we reject H0 at the 5% significance level
part (b)
alpha = 0.1
We will reject if our test statistic:
	TS >= z_(0.05) = 1.645
As |17.242|