# Chapter 9 Inferences for Proportions and Count Data

In [1]:
import polars as pl
from polars import col, lit
from scipy import stats, special
import numpy as np

RNG = np.random.default_rng()

## 9.1 Inferences on Proportion

This chapter begins with inference procedures for an unknown proportion $p$ in a Bernoulli population. The sample proportion $\hat{p}$ from a random sample of size $n$ is an unbiased estimate of $p$. Inferences on $p$ are based on the central limit theorem (CLT) result that, for large $n$, $\hat{p}$ is approximately normal with mean $p$ and standard deviation $\sqrt{pq/n}$, where $q = 1 - p$. A large-sample two-sided $100(1-\alpha)\%$ confidence interval for $p$ is given by

$$
\left[ \hat{p} \pm z_{\alpha /2} \sqrt{\frac{\hat{p} \hat{q}}{n}}\;\right]
$$

where $\hat{q}$ = 1 - $\hat{p}$ and $z_{\alpha/2}$ is the upper $\alpha/2$ critical point of the standard normal distribution. A large sample test on $p$ to test $H_0: p = p_0$ can be based on the test statistic

$$
z = \frac{\hat{p} - p_0}{\sqrt{\hat{p}\hat{q}/n}} \quad \text{or} \quad 
z = \frac{\hat{p} - p_0}{\sqrt{p_0 q_0 / n}}.
$$

Both these statistics are asymptotically standard normal under $H_0$.
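
As a minimal sketch of the formulas above, the following helpers compute the large-sample interval and both forms of the $z$ statistic using only the Python standard library. The function names `wald_ci` and `z_statistics` and the example counts are illustrative choices, not part of any library.

```python
from math import sqrt
from statistics import NormalDist

def wald_ci(successes, n, conf=0.95):
    """Large-sample CI: p_hat ± z_{α/2} * sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # upper α/2 critical point
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

def z_statistics(successes, n, p0):
    """Return both forms of the z statistic for H0: p = p0."""
    p_hat = successes / n
    z_estimated_se = (p_hat - p0) / sqrt(p_hat * (1 - p_hat) / n)
    z_null_se = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return z_estimated_se, z_null_se

# Illustrative counts: 46 successes in 50 trials, testing H0: p = 0.8
print(wald_ci(46, 50))
print(z_statistics(46, 50, 0.8))
```

The two statistics differ only in the standard error: the first plugs in $\hat{p}$, the second uses the null value $p_0$.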

### Ex 9.1

A business journal publisher plans to survey a sample of the subscribers to estimate the proportion $p$ with annual household incomes over \$100,000.

#### (a)

How many subscribers must be surveyed to obtain a 99% CI for $p$ with a margin of error no greater than 0.05? Assume that no prior estimate of $p$ is available.

✍️ The margin of error

$$
E = z_{\alpha/2} \sqrt{\frac{p q}{n}} \text{.}
$$

Therefore,

$$
n = \frac{z_{\alpha/2}^2\, p q}{E^2}
$$

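Because no prior estimate of $p$ is available, we use $p = 1/2$ as a conservative guess: the product $pq = p(1-p)$ attains its maximum value of $1/4$ at $p = 1/2$, so this choice gives the largest required $n$.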

In [7]:
α = 1 - 0.99
p = 1/2
n = stats.norm.ppf(1-α/2)**2 * p * (1-p)/ 0.05**2
print(np.ceil(n))

664.0


So 664 subscribers should be surveyed.
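
The sample-size calculation can be wrapped in a small reusable function. This is a sketch using only the standard library; the name `sample_size` and its parameters are hypothetical. It also shows how the required $n$ grows as the guess for $p$ approaches $1/2$.

```python
from math import ceil
from statistics import NormalDist

def sample_size(margin, conf, p_guess=0.5):
    """Smallest n with z_{α/2} * sqrt(p*q/n) <= margin for the guessed p."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return ceil(z**2 * p_guess * (1 - p_guess) / margin**2)

# Required n for a 99% CI with margin of error 0.05 under several guesses of p
for guess in (0.1, 0.3, 0.5):
    print(guess, sample_size(0.05, 0.99, guess))
```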

#### (b)

The marketing department thinks that $p$ = 0.30 would be a reasonable guess. What is the corresponding sample size?

In [10]:
α = 1 - 0.99
p = 0.3
n = stats.norm.ppf(1-α/2)**2 * p * (1-p)/ 0.05**2
print(np.ceil(n))

558.0


#### (c)

Refer to the sample size obtained in (b). If a 40% nonresponse rate is anticipated, how many surveys need to be mailed? How may such a high nonresponse rate cause bias in the estimate?

In [13]:
non_response_rate = 0.4
mails = n / (1 - non_response_rate)
print(np.ceil(mails))

929.0


A high nonresponse rate causes bias when respondents differ systematically from nonrespondents. For example, if responses tend to come from higher-income households, the result may overestimate $p$.

### Ex 9.2

While imprisoned by the Germans during World War II, the English mathematician John Kerrich tossed a coin 10,000 times and obtained 5067 heads. Let $p$ be the probability of a head on a single toss. We wish to check if the data are consistent with the hypothesis that the coin was fair.

#### (a)
Set up the hypotheses. Why should the alternative be two-sided?

✍️

$$
\begin{align*}
H_0: & p = 1/2\text{, v.s.} \\
H_1: & p \ne 1/2 \text{.}
\end{align*}
$$

$H_1$ is two-sided because if the coin is not fair, $p$ could be either $> 1/2$ or $< 1/2$.

#### (b)
Calculate the $P$-value. Can you reject $H_0$ at the .05 level?

In [28]:
n = 10_000
p = 1/2
q = 1 - p
z = (5067/n - p)/np.sqrt(p*q/n)
p_val = 2 * stats.norm.sf(abs(z))  # two-sided; valid for either sign of z
print(p_val)

0.18024534492890254


Because 0.18 > 0.05, we cannot reject $H_0$.

#### (c)
Find a 95% CI for the proportion of heads for Kerrich's coin.

In [27]:
α = 1 - 0.95
margin_of_error = float(stats.norm.ppf(1-α/2) * np.sqrt(p*q/n))
ci = (5067/n - margin_of_error, 5067/n + margin_of_error)
print(ci)

(0.49690018007729975, 0.5164998199227003)


### Ex 9.3

Calls to technical support service of a software company are monitored on a sampling basis for quality assurance. Each monitored call is classified as satisfactory or unsatisfactory by the supervisor in terms of the quality of help offered. A random sample of 100 calls was monitored over one month for a new trainee; 8 calls were classified as unsatisfactory.

#### (a)
Calculate a 95% CI for the actual proportion of unsatisfactory calls during the month.
Use both formulas (9.1) and (9.3) and compare the results.

✍️ Formula (9.1):

$$
\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}\hat{q}}{n}} 
$$

In [29]:
α = 1 - 0.95
n = 100
p = 8/100
q = 1 - p
margin_of_error = float(stats.norm.ppf(1-α/2) * np.sqrt(p*q/n))
ci = (p - margin_of_error, p + margin_of_error)
print(ci)

(0.02682751000723329, 0.1331724899927667)


Formula (9.3):

$$
\frac{\hat{p} + \frac{z^2}{2n} - \sqrt{\frac{\hat{p}\hat{q}z^2}{n} + \frac{z^4}{4n^2}}}{1 + \frac{z^2}{n}} 
\le p \le  
\frac{\hat{p} + \frac{z^2}{2n} + \sqrt{\frac{\hat{p}\hat{q}z^2}{n} + \frac{z^4}{4n^2}}}{1 + \frac{z^2}{n}} 
$$

where $z$ = $z_{\alpha/2}$.

In [34]:
α = 1 - 0.95
z = stats.norm.ppf(1-α/2)
n = 100
p = 8/100
q = 1 - p
denom = 1 + z**2/n
margin = np.sqrt(p*q*z**2/n + z**4/(4*n**2)) / denom
mid = (p + z**2/(2*n)) / denom 
ci = (float(mid - margin), float(mid + margin))
print(ci)

(0.04109346148438062, 0.14998107700948735)


The interval from formula (9.3) lies slightly higher than the one from formula (9.1): both endpoints shift to the right, and the midpoint moves by about 0.015.
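
One way to compare the two formulas beyond a single sample is to compute their exact coverage probabilities. The sketch below assumes, purely for illustration, that the true proportion is $p = 0.08$ with $n = 100$, and sums the binomial probabilities of all outcomes whose interval contains $p$. Formula (9.3) is commonly known as the Wilson score interval; the function names are hypothetical.

```python
from math import comb, sqrt
from statistics import NormalDist

def wald_ci(k, n, z):
    """Formula (9.1)."""
    p_hat = k / n
    m = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - m, p_hat + m

def score_ci(k, n, z):
    """Formula (9.3)."""
    p_hat = k / n
    denom = 1 + z**2 / n
    mid = (p_hat + z**2 / (2 * n)) / denom
    m = sqrt(p_hat * (1 - p_hat) * z**2 / n + z**4 / (4 * n**2)) / denom
    return mid - m, mid + m

def coverage(ci, n, p, z):
    """Exact probability that the interval covers p when K ~ Binomial(n, p)."""
    total = 0.0
    for k in range(n + 1):
        lo, hi = ci(k, n, z)
        if lo <= p <= hi:
            total += comb(n, k) * p**k * (1 - p)**(n - k)
    return total

z = NormalDist().inv_cdf(0.975)
wald_cov = coverage(wald_ci, 100, 0.08, z)    # assumed true p = 0.08
score_cov = coverage(score_ci, 100, 0.08, z)
print(f"Formula (9.1): {wald_cov:.3f}, formula (9.3): {score_cov:.3f}")
```

Under this assumption, formula (9.1) covers $p$ noticeably less often than the nominal 95%, while formula (9.3) stays close to it, which is the usual argument for preferring (9.3) when $\hat{p}$ is near 0 or 1.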

#### (b) 
This CI is used to test $H_0: p = 0.10$ vs. $H_1: p \ne 0.10$. If $H_0$ is not rejected, then monitoring of the trainee is continued at the same frequency; if $H_0$ is rejected in the lower tail, then monitoring frequency is reduced; and if $H_0$ is rejected in the upper tail, then the trainee is provided additional training. Based on the CI calculated in (a), what action should be taken on this trainee?

✍️ Because 0.1 is contained in the CI (for both formulas), $H_0$ is not rejected. Therefore monitoring will continue at the same frequency.

### Ex 9.4

The fraction defective in a high volume production process is to be estimated using a 95% CI with a margin of error of 0.2%.

#### (a) 
If the a priori guess at the fraction defective is 1%, how many parts should be sampled? Compare this number with the sample size that you would need if no a priori information regarding the true fraction defective is assumed.

✍️ 

$$
n = \frac{z_{\alpha/2}^2\, p q}{E^2}
$$

In [43]:
α = 1 - 0.95
p = np.array([0.01, 0.5])
n = stats.norm.ppf(1-α/2)**2 * p * (1-p)/ 0.002**2
print(np.ceil(n))

[  9508. 240092.]


#### (b)
One problem with estimating a very low fraction defective is that no defectives may be obtained in the sample, making it impossible to calculate a CI. What sampling method would you use to ensure that there will be sufficient number of defectives in the sample to provide reliable information on the true fraction defective?

✍️ Use inverse (sequential) sampling: start with a small sample and keep adding to it until a prespecified number of defectives has been observed, so that the final sample provides enough evidence about the proportion of defectives.

### Ex 9.5

A quarterback from a Big Ten college football team worked to improve his proportion of completed passes. His career average had been 46.5% completed passes. His record halfway into the new season is 82 completed passes out of 151 attempted.

#### (a) 
Set up the hypotheses to test whether his proportion of completed passes has improved. Should the alternative be one-sided or two-sided? Explain.

✍️ Because we are looking for signs of improvement, a one-sided alternative should be used.

$$
H_0: p = p_0 = 46.5\% \quad \text{v.s.} \quad H_1: p > p_0 
$$

#### (b) 
Perform a test at $\alpha$ = .05. Is there a significant improvement?

In [49]:
k, n = 82, 151
p0 = 0.465
z = (k/n - p0) / np.sqrt(p0*(1-p0)/n)
pvalue = stats.norm.sf(z)
print(pvalue)

0.027251575849474706


Or use the exact, binomial test:

In [47]:
test = stats.binomtest(k=82, n=151, p=0.465, alternative='greater')
print(test.pvalue)

0.032948694746468936


Both methods give a $P$-value < $\alpha$, so yes, there is a significant improvement.

#### (c) 
At least how many passes out of 151 should he have completed in order to demonstrate significant improvement at $\alpha$ = .025?

✍️ The $z$ statistic should be at least

In [51]:
α = .025
z = stats.norm.ppf(1-α)
print(z)

1.959963984540054


Therefore,

In [53]:
k = n * (z * np.sqrt(p0*(1-p0)/n) + p0)
print(np.ceil(k))

83.0


At $\alpha$ = .025, he should complete 83 passes or more.
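
The same question can also be answered with the exact binomial distribution instead of the normal approximation. The following standard-library sketch searches for the smallest count whose exact one-sided $P$-value is at most $\alpha$; the function names are hypothetical.

```python
from math import comb

def upper_tail(k, n, p):
    """Exact P(K >= k) for K ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

def smallest_significant_count(n, p0, alpha):
    """Smallest k whose exact one-sided P-value P(K >= k) is at most alpha."""
    for k in range(n + 1):
        if upper_tail(k, n, p0) <= alpha:
            return k
    return None  # no count is significant at this level

k_min = smallest_significant_count(151, 0.465, 0.025)
print(k_min, upper_tail(k_min, 151, 0.465))
```

The printed count can be compared with the normal-approximation answer above.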

### Ex 9.6

A blood test intended to identify patients at "high risk" of cardiac disease gave positive results on 80 out of 100 known cardiac patients, but also on 16 out of 200 known normal patients.

#### (a) 
Find a 90% CI for the sensitivity of the test, which is defined as the probability that a cardiac patient is correctly identified.

In [3]:
α = 1 - 0.9
n = 100
p = 80/100
margin = float(stats.norm.ppf(1-α/2) * np.sqrt(p*(1-p)/n))
ci = (p - margin, p + margin)
print(ci)

(0.7342058549219411, 0.865794145078059)


Compare it with

In [14]:
(
    stats.binomtest(k=80, n=100, alternative='two-sided')
    .proportion_ci(confidence_level=0.9)
)

ConfidenceInterval(low=0.7227997503290864, high=0.8633386747541124)

#### (b) 
Find a 90% CI for the specificity of the test, which is defined as the probability that a normal patient is correctly identified.

In [15]:
(
    stats.binomtest(k=200-16, n=200, alternative='two-sided')
    .proportion_ci(confidence_level=0.9)
)

ConfidenceInterval(low=0.8810282256716788, high=0.9491782829700732)

### Ex 9.7

People at high risk of sudden cardiac death can be identified using the change in a signal averaged electrocardiogram before and after prescribed activities. The current method is about 80% accurate. The method was modified, hoping to improve its accuracy. The new method is tested on 50 people and gave correct results on 46 patients. Is this convincing evidence that the new method is more accurate?

#### (a) 
Set up the hypotheses to test that the accuracy of the new method is better than that of the current method.

✍️ 

$$
H_0: p = 0.8 \quad \text{v.s.} \quad H_1: p > 0.8 .
$$

#### (b) 
Perform a test of the hypotheses at $\alpha$ = .05. What do you conclude about the accuracy of the new method?

In [9]:
n = 50
p0 = 0.8
p = 46/50
z = (p - p0) / np.sqrt(p0*(1-p0)/n)
pval = stats.norm.sf(z)
print(pval)

0.016947426762344633


In [8]:
test = stats.binomtest(k=46, n=50, p=0.8, alternative='greater')
print(test.pvalue)

0.018496015060209342


The new method is significantly more accurate.

### Ex 9.8

Refer to the previous exercise.

#### (a) 
If the new method actually has 90% accuracy, what power does a sample of 50 have to demonstrate that the new method is better, using a .05-level test?

✍️ When the true proportion is $p_1$, $\frac{\hat{p} - p_1}{\sqrt{p_1 q_1 / n}}$ is approximately $N(0,1)$, so the power is

$$
\begin{align*}
\pi &= \mathrm{P}\left\{\frac{\hat{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}} > z_{\alpha}\right\}\\
&= \mathrm{P}\left\{ \frac{\hat{p} - p_1}{\sqrt{\frac{p_1 q_1}{n}}} > 
    z_{\alpha} \sqrt{\frac{p_0 q_0}{p_1 q_1}} + \frac{p_0 - p_1}{\sqrt{\frac{p_1 q_1}{n}}} \right\} \\
&= 1-\Phi\left( z_{\alpha} \sqrt{\frac{p_0 q_0}{p_1 q_1}} + \frac{p_0 - p_1}{\sqrt{\frac{p_1 q_1}{n}}}\right) \\
&= \Phi\left( \frac{(p_1 - p_0) \sqrt{n} - z_\alpha \sqrt{p_0 q_0}}{\sqrt{p_1 q_1}} \right) \text{.}
\end{align*}
$$

In [2]:
α = .05
p0 = 0.8
p1 = 0.9
n = 50
z = ((p1-p0)*np.sqrt(n) - stats.norm.ppf(1-α)*np.sqrt(p0*(1-p0))) / np.sqrt(p1*(1-p1))
power = float(stats.norm.cdf(z))
print(power)

0.5650889396286685


#### (b) 
How many patients should be tested in order for this power to be at least 0.75?

✍️ Using the result from (a) and letting $\pi = 0.75 = 1-\beta$, we have

$$
\frac{(p_1 - p_0) \sqrt{n} - z_\alpha \sqrt{p_0 q_0}}{\sqrt{p_1 q_1}} = z_{\beta} \text{,}
$$

therefore,

$$
n = \left( \frac{z_\beta \sqrt{p_1 q_1} + z_\alpha \sqrt{p_0 q_0}}{p_1 - p_0} \right)^2 \text{.}
$$

In [3]:
β = 1 - 0.75
n = ((stats.norm.ppf(1-β)*np.sqrt(p1*(1-p1)) + stats.norm.ppf(1-α)*np.sqrt(p0*(1-p0)))/(p0 - p1))**2
print(np.ceil(n))

75.0
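
The power formula from (a) and the sample-size formula from (b) can be packaged as two functions, which makes it easy to check that the computed $n$ actually reaches the target power. This is a sketch with the standard library; the names `power_upper` and `n_for_power` are hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

N01 = NormalDist()  # standard normal distribution

def power_upper(p0, p1, n, alpha):
    """Approximate power of the upper one-sided z-test of H0: p = p0 when p = p1."""
    z_alpha = N01.inv_cdf(1 - alpha)
    arg = ((p1 - p0) * sqrt(n) - z_alpha * sqrt(p0 * (1 - p0))) / sqrt(p1 * (1 - p1))
    return N01.cdf(arg)

def n_for_power(p0, p1, alpha, power):
    """Smallest n for which the approximate power is at least `power`."""
    z_alpha = N01.inv_cdf(1 - alpha)
    z_beta = N01.inv_cdf(power)  # z_β with β = 1 - power
    root_n = (z_beta * sqrt(p1 * (1 - p1)) + z_alpha * sqrt(p0 * (1 - p0))) / (p1 - p0)
    return ceil(root_n**2)

n_needed = n_for_power(0.8, 0.9, 0.05, 0.75)
print(n_needed, power_upper(0.8, 0.9, n_needed, 0.05))
```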


### Ex 9.9

A preelection poll is to be planned for a senatorial election between two candidates. Previous polls have shown that the election is hanging in delicate balance. If there is a shift (in either direction) by more than 2 percentage points since the last poll, then the polling agency would like to detect it with probability of at least 0.80 using a .05-level test. Determine how many voters should be polled. If actually 2500 voters are polled, what is the value of this probability?

✍️ We are considering the case when the real poll percentage is either > 52% or < 48%, but not both. So it is essentially the same as Ex. 9.8 except we should substitute $z_{\alpha/2}$ for $z_\alpha$ in the formula for $n$, giving

$$
n = \left( \frac{z_\beta \sqrt{p_1 q_1} + z_{\alpha/2} \sqrt{p_0 q_0}}{p_1 - p_0} \right)^2 \text{.}
$$

Note that for $p_1 = 1/2 \pm \delta$, the result would be the same.

In [4]:
β = 1 - 0.8
α = .05
p1 = 0.52 # or 0.48, the result will be the same.
p0 = 0.5
n = ((stats.norm.ppf(1-β)*np.sqrt(p1*(1-p1)) + stats.norm.ppf(1-α/2)*np.sqrt(p0*(1-p0)))/(p0 - p1))**2
print(np.ceil(n))

4904.0


If actually 2500 voters are polled, the power

$$
\begin{align*}
\pi &= \mathrm{P}\left\{\frac{\hat{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}} > z_{\alpha/2}\right\}
    + \mathrm{P}\left\{\frac{\hat{p} - p_0}{\sqrt{\frac{p_0 q_0}{n}}} < -z_{\alpha/2}\right\}\\
&= \Phi\left( \frac{(p_1 - p_0) \sqrt{n} - z_{\alpha/2} \sqrt{p_0 q_0}}{\sqrt{p_1 q_1}} \right)
    + \Phi\left( \frac{(p_0 - p_1) \sqrt{n} - z_{\alpha/2} \sqrt{p_0 q_0}}{\sqrt{p_1 q_1}} \right) \text{.}
\end{align*}
$$

Again, for $p_1 = 1/2 \pm \delta$, the result would be the same.

In [28]:
α = .05
p0 = 0.5
p1 = 0.52 # or 0.48, the result will be the same
n = 2500
z1 = ((p1-p0)*np.sqrt(n) - stats.norm.ppf(1-α/2)*np.sqrt(p0*(1-p0))) / np.sqrt(p1*(1-p1))
z2 = ((p0-p1)*np.sqrt(n) - stats.norm.ppf(1-α/2)*np.sqrt(p0*(1-p0))) / np.sqrt(p1*(1-p1))
power = float(stats.norm.cdf(z1) + stats.norm.cdf(z2))
print(power)

0.5160175620301128
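
To see how the detection probability grows with the number of voters polled, the two-sided power formula can be evaluated over several sample sizes. The sketch below uses only the standard library; the function name `power_two_sided` and the list of sample sizes are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sided(p0, p1, n, alpha):
    """Approximate power of the two-sided z-test of H0: p = p0 when p = p1."""
    phi = NormalDist().cdf
    z = NormalDist().inv_cdf(1 - alpha / 2)
    s0, s1 = sqrt(p0 * (1 - p0)), sqrt(p1 * (1 - p1))
    upper = phi(((p1 - p0) * sqrt(n) - z * s0) / s1)  # reject in the upper tail
    lower = phi(((p0 - p1) * sqrt(n) - z * s0) / s1)  # reject in the lower tail
    return upper + lower

for n in (1000, 2500, 4904, 10000):
    print(n, round(power_two_sided(0.5, 0.52, n, 0.05), 4))
```

For a shift of 2 percentage points, nearly all of the power comes from one tail; the other term is negligible.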


## 9.2 Inferences for Comparing Two Proportions

Next we consider the problem of comparing two Bernoulli proportions, $p_1$ and $p_2$, based on two independent random samples of sizes $n_1$ and $n_2$. The basis for inferences on $p_1 - p_2$ is the result that for large $n_1$ and $n_2$, the difference in the sample proportions, $\hat{p}_1 - \hat{p}_2$, is approximately normal with mean = $p_1 - p_2$ and standard deviation = $\sqrt{p_1 q_1 / n_1 + p_2 q_2 / n_2}$ . A large sample two- sided 100(1 - $\alpha$)% confidence interval for $p_1 - p_2$ is given by

$$
\left[ \hat{p}_1 - \hat{p}_2 \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}\; \right].
$$

A large sample two-sided $z$-test can be used to test $H_0: p_1 = p_2$ vs. $H_1: p_1 \ne  p_2$ by using the test statistic

$$
z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}}}
\quad \text{or} \quad 
z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}
$$


where $\hat{p} = (n_1\hat{p}_1 + n_2\hat{p}_2)/(n_1 + n_2)$ is the pooled sample proportion. Small sample tests to compare $p_1$ and $p_2$ are also given for independent samples (Fisher's exact test) and matched pairs designs (McNemar's test).
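
As a sketch of the large-sample procedures just described, the helpers below compute the pooled two-sided $z$-test and the confidence interval for $p_1 - p_2$ from raw counts, using only the standard library. The function names and the example counts are illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test of H0: p1 = p2; returns (z, P-value)."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # equals (n1*p1_hat + n2*p2_hat) / (n1 + n2)
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1_hat - p2_hat) / se
    return z, 2 * NormalDist().cdf(-abs(z))

def difference_ci(x1, n1, x2, n2, conf=0.95):
    """Large-sample CI for p1 - p2 using the unpooled standard error."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    margin = z * sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - margin, diff + margin

# Illustrative counts: 400 of 1000 in sample 1 versus 480 of 1000 in sample 2
print(two_proportion_z_test(400, 1000, 480, 1000))
print(difference_ci(400, 1000, 480, 1000))
```

Using `-abs(z)` makes the two-sided $P$-value correct regardless of which sample has the larger proportion.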

### Ex 9.10

To gauge a change in opinion regarding the public view on bilingual education, a telephone poll was taken in September 1993 and again in September 1995. The results based on the survey of 1000 American adults contacted in each poll were that 40% from the 1993 poll and 48% from the 1995 poll favored teaching all children in English over bilingual alternatives. Has there been a significant change in opinion? Answer by doing a two-sided test for the significance of the difference in two proportions at $\alpha$ = .05. Why is a two-sided alternative appropriate here?

In [9]:
p1 = 0.4
p2 = 0.48
n1 = n2 = 1000
p = (n1 * p1 + n2 * p2) / (n1 + n2)
z = (p1 - p2) / np.sqrt(p*(1-p)*(1/n1 + 1/n2))
pval = 2 * stats.norm.sf(abs(z))  # two-sided; valid for either sign of z
print(pval)

0.00031365894525581494


$P$-value < $\alpha$, so there has been a significant change in opinion. A two-sided alternative was used because the question was to detect *change* without specifying a direction.

### Ex 9.11

A high school had 17 students receive National Merit recognition (semifinalist or commendation) out of 482 seniors in 1992 and 29 students out of 503 seniors in 1995. Does this represent a significant change in the proportion recognized at this school? Answer by doing a two-sided test for the significance of the difference in two proportions at $\alpha$ = .10. Why is a two-sided alternative appropriate here?

In [12]:
p1 = 17/482
p2 = 29/503
n1 = 482
n2 = 503
p = (n1 * p1 + n2 * p2) / (n1 + n2)
z = (p1 - p2) / np.sqrt(p*(1-p)*(1/n1 + 1/n2))
pval = 2 * stats.norm.sf(abs(z))  # two-sided; valid for either sign of z
print(pval)

0.09603175006295907


$P$-value < $\alpha$, so there has been a significant change (but barely). A two-sided alternative was appropriate because the question was to detect *change* without specifying a direction.

### Ex 9.12

The following data set from a study by the well-known chemist and Nobel Laureate Linus Pauling (1901-1994) gives the incidence of cold among 279 French skiers who were randomized to the Vitamin C and Placebo groups.


Group | Cold: Yes | Cold: No | Total
---|---|---|---
Vitamin C | 17 | 122 | 139
Placebo | 31 | 109 | 140

Is there a significant difference in the incidence rates for cold between the Vitamin C and Placebo groups at $\alpha$ = .05? What do you conclude about the effectiveness of Vitamin C in preventing cold?

In [2]:
p1 = 17/139
p2 = 31/140
n1 = 139
n2 = 140
p = (n1 * p1 + n2 * p2) / (n1 + n2)
z = (p1 - p2) / np.sqrt(p*(1-p)*(1/n1 + 1/n2))
pval = 2 * stats.norm.sf(abs(z))  # two-sided; valid for either sign of z
print(pval)

0.0282718602468226


$P$-value < $\alpha$, so there is a significant difference between the Vitamin C and Placebo groups. The lower incidence in the Vitamin C group suggests that Vitamin C is effective in preventing colds.

### Ex 9.13

The graduate degrees of faculty from a research group within a medical school were tabulated by gender, giving the following results.

Gender | Degree: M.D. | Degree: Ph.D.
---|---|---
Male | 5 | 1 
Female | 3 | 6

#### (a) 
Set up the hypotheses to determine whether the proportion of male M.D.'s differs from the proportion of female M.D.'s. Which statistical test is appropriate to test the hypotheses?

✍️ Let $p_\text{male}$ and $p_\text{female}$ denote the proportion of male and female M.D.'s respectively in that medical school.

$$
H_0: p_\text{male} = p_\text{female} \quad \text{v.s.} \quad H_1: p_\text{male} \ne p_\text{female} 
$$

Because of the small sample size, we should use Fisher's exact test for this.

#### (b)
Calculate the $P$-value of the test. What is your conclusion using $\alpha$ = .05?

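✍️ Under $H_0$, the number of male M.D.'s follows a hypergeometric distribution. The one-sided $P$-value is the probability that this number is ≥ 5, and doubling it gives the two-sided $P$-value: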

In [6]:
pval = 2 * stats.hypergeom.sf(5-1, 15, 8, 6)
print(pval)

0.16783216783216784


$P$-value > $\alpha$, so no significant difference.

Compare with `fisher_exact`, which uses a different convention for the two-sided $P$-value from the one used in the book. When the null distribution is asymmetric (as the hypergeometric distribution is here), there are several conventions for computing a two-sided $P$-value, so the result is not exactly the same, but it is still comparable.

In [12]:
test = stats.fisher_exact(table=[[5,1],[3,6]], alternative='two-sided')
print(test.pvalue)

0.11888111888111888
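
To make the difference between the two conventions concrete, the sketch below computes both from the hypergeometric counts using only the standard library. The first doubles the smaller one-sided tail; the second sums the probabilities of all tables that are no more likely than the observed one, which appears to be consistent with the `fisher_exact` output above. The function name `fisher_two_sided` is hypothetical.

```python
from math import comb

def fisher_two_sided(a, b, c, d):
    """Two conventions for the two-sided P-value of the 2x2 table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    lo, hi = max(0, row1 + col1 - total), min(row1, col1)
    # Number of tables with x in the top-left cell, given the fixed margins
    ways = {x: comb(col1, x) * comb(total - col1, row1 - x) for x in range(lo, hi + 1)}
    denom = comb(total, row1)
    upper = sum(w for x, w in ways.items() if x >= a) / denom
    lower = sum(w for x, w in ways.items() if x <= a) / denom
    doubled = min(1.0, 2 * min(upper, lower))
    # Integer comparison avoids floating-point ties between equally likely tables
    min_likelihood = sum(w for w in ways.values() if w <= ways[a]) / denom
    return doubled, min_likelihood

print(fisher_two_sided(5, 1, 3, 6))
```

When the margins make the null distribution symmetric, the two conventions coincide.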


### Ex 9.14

A study evaluated the urinary-thromboglobulin excretion in 12 normal and 12 diabetic patients. Summary results are obtained by coding values of 20 or less as "low" and values above 20 as "high", as shown in the following table.

Excretion: | Low | High
---|---|---
Normal | 10 | 2 
Diabetic | 4 | 8

#### (a) 
Set up the hypotheses to determine whether there is a difference in the urinary-thromboglobulin excretion between normal and diabetic patients. Which statistical test is appropriate to test the hypotheses?

✍️ Let $p_1$ and $p_2$ denote the proportion of "low" excretion among normal and diabetic patients, respectively.

$$
H_0: p_1 = p_2 \quad \text{v.s.} \quad H_1: p_1 \ne p_2
$$

Because the numbers are very small, we should use Fisher's exact test for this.

#### (b) 
Calculate the $P$-value of the test. What is your conclusion using $\alpha$ = .05?

In [13]:
pval = 2 * stats.hypergeom.sf(10-1, 24, 14, 12)
print(pval)
test = stats.fisher_exact(table=[[10, 2],[4, 8]], alternative='two-sided')
print(test.pvalue)

0.036074841836047915
0.03607484183604793


$P$-value < $\alpha$, so we conclude that there is a significant difference between the excretion levels of normal and diabetic patients.

Also note that in this case the null distribution is symmetric (12 patients each), and `fisher_exact` gives the same result.

### Ex 9.15

A matched pairs study was conducted to compare two topical anesthetic drugs for use in dentistry. The two drugs were applied on the oral mucous membrane of the two sides of each patient's mouth, and after a certain period of time it was noted whether or not the membrane remained anesthetized. Data on 45 patients showed the following responses.

Drug 1 \ Drug 2 | Anesthetized | Not Anesthetized
---|---|---
Anesthetized | 15 | 13 
Not Anesthetized | 3 | 14

#### (a) 
Set up the hypotheses to determine whether there is a statistically significant difference between the two drugs. Which statistical test is appropriate to test the hypotheses?

✍️ Denote by $p_i$ the probability that the membrane remains anesthetized under drug $i$.
$$
H_0: p_1 = p_2 \quad \text{v.s.} \quad H_1: p_1 \ne p_2 \text{.}
$$
McNemar's test should be appropriate.

#### (b) 
Calculate the $P$-value of the test. What is your conclusion using $\alpha$ = .05?

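✍️ The null distribution of McNemar's test is binomial: under $H_0$, each of the 13 + 3 = 16 discordant pairs is equally likely to fall in either off-diagonal cell. So here we use `binomtest` directly.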

In [16]:
test = stats.binomtest(k=13, n=16, p=1/2, alternative='two-sided')
print(test.pvalue)

0.021270751953125


$P$-value < $\alpha$, therefore there is a significant difference between the two drugs.
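
The exact McNemar calculation only needs the two discordant counts, so it can be written in a few lines without SciPy. This is a minimal sketch; the function name `mcnemar_exact` is hypothetical.

```python
from math import comb

def mcnemar_exact(b, c):
    """Exact two-sided McNemar P-value from the two discordant counts b and c."""
    n, k = b + c, max(b, c)
    # Under H0 the larger discordant count is Binomial(n, 1/2); double the upper tail
    tail = sum(comb(n, j) for j in range(k, n + 1)) / 2**n
    return min(1.0, 2 * tail)

print(mcnemar_exact(13, 3))
```

Because the Binomial$(n, 1/2)$ distribution is symmetric, doubling the tail gives the same answer whichever discordant count is listed first.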

### Ex 9.16

In a speech class two persuasive speeches, one pro and the other con, were given by two students on requiring guest lists for fraternity/sorority parties. The opinions of the other 52 students in the class were obtained on this issue before and after the speeches with the following responses.

Before \ After | Pro | Con
---|---|---
Pro | 2 | 8 
Con | 26 | 16

#### (a)
Set up the hypotheses to determine whether or not there is a change in opinion of the students. Which statistical test is appropriate to test the hypotheses?

✍️ Denote by $p_1$ the proportion of students that are pro on the topic before the speeches, and by $p_2$, after.
$$
H_0: p_1 = p_2 \quad \text{v.s.} \quad H_1: p_1 \ne p_2 \text{.}
$$
McNemar's test should be appropriate.

#### (b) 
Calculate the $P$-value of the test. What is your conclusion using $\alpha$ =.05?

In [17]:
test = stats.binomtest(k=8, n=26+8, p=1/2, alternative='two-sided')
print(test.pvalue)

0.0029350556433200836


$P$-value < $\alpha$, so we conclude that there is a change of opinion.

## 9.3 Inferences for One-way Count Data

A generalization of the test on the binomial proportion $p$ is a test on the cell probabilities of a multinomial distribution. Based on a random sample of size $n$ from a $c$-cell multinomial distribution (one-way count data) with cell probabilities $p_1, p_2, \ldots, p_c$, the test of

$$
H_0: p_1 = p_{10},\, p_2 = p_{20},\, \ldots, \, p_c = p_{c0} \quad \text{vs.} \quad
H_1: \text{At least one}\, p_i \ne p_{i0}
$$

is based on the **chi-square statistic** having the general form:

$$
\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}}
$$

where "observed" refers to the observed cell counts $n_i$ and "expected" refers to the expected cell counts $e_i = n p_{i0}$ under $H_0$. The degrees of freedom (d.f.) of the chi-square statistic are $c$ - 1. The primary use of this statistic is for the **goodness of fit** test of a specified distribution to a set of data. If any parameters of the distribution are estimated from the data, then one d.f. is deducted for each independent estimated parameter from the total d.f. $c$ - 1.
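
The statistic and its degrees of freedom can be computed directly from the definition. The sketch below uses only the standard library, so the $\chi^2$ upper-tail probability is evaluated with the closed-form series that exists for integer d.f. The function names, the `n_estimated` parameter, and the die-roll counts are hypothetical.

```python
from math import erfc, exp, pi, sqrt

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer d.f."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even d.f.: exp(-x/2) * sum over j < df/2 of (x/2)^j / j!
        term = total = 1.0
        for j in range(1, df // 2):
            term *= (x / 2) / j
            total += term
        return exp(-x / 2) * total
    # Odd d.f.: normal tail plus a finite series in odd powers of sqrt(x)
    root = sqrt(x)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    density = exp(-x / 2) / sqrt(2 * pi)  # standard normal density at sqrt(x)
    return erfc(root / sqrt(2)) + 2 * density * total

def goodness_of_fit(observed, probs, n_estimated=0):
    """Return (chi-square statistic, d.f., P-value) for the null cell probabilities."""
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1 - n_estimated  # one d.f. lost per estimated parameter
    return stat, df, chi2_sf(stat, df)

# Hypothetical data: 60 rolls of a die, testing that all faces are equally likely
print(goodness_of_fit([8, 12, 9, 11, 6, 14], [1 / 6] * 6))
```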

### Ex 9.17

Use the following data to test the hypothesis that a horse's chances of winning are unaffected by its position on the starting lineup. The data give the starting position of each of 144 winners, where position 1 is closest to the inside rail of the race track.

||||||||||
---|---|---|---|---|---|---|---|---
Starting Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 
Number of Wins | 29 | 19 | 18 | 25 | 17 | 10 | 15 | 11

State the hypotheses and perform a test at $\alpha$ = .05.

✍️ Let $p_i$ denote the chance of winning from starting position $i$.
$$
H_0: p_1 = p_2 = \ldots = p_8 = 1/8 \quad \text{v.s.} \quad H_1: p_j \ne 1/8 \text{ for some } j\text{.}
$$

In [19]:
test = stats.chisquare([29, 19, 18, 25, 17, 10, 15, 11])
print(test.pvalue)

0.022239477462390588


$P$-value < $\alpha$, so we reject $H_0$ and conclude that chances of winning are affected by starting position.

### Ex 9.18

The number of first births to 700 women are shown by month from the University Hospital of Basel, Switzerland.

||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---
Month | Jan | Feb | Mar | Apr | May | June | July | Aug | Sept | Oct | Nov | Dec
Births | 66 | 63 | 64 | 48 | 64 | 74 | 70 | 59 | 54 | 51 | 45 | 42

State the hypotheses to test that births are spread uniformly through the year. Perform a test of the hypotheses at $\alpha$ = .05.

✍️ Let $p_i$ denote the proportion of births in Month $i$.
$$
H_0: p_1 = p_2 = \ldots = p_{12} = 1/12 \quad \text{v.s.} \quad H_1: p_j \ne 1/12 \text{ for some } j\text{.}
$$

In [20]:
test = stats.chisquare([66, 63, 64, 48, 64, 74, 70, 59, 54, 51, 45, 42])
print(test.pvalue)

0.04924720128163976


The $P$-value (0.049) is slightly below $\alpha$, so $H_0$ is formally rejected at the .05 level. This is a borderline case, however, and the evidence that births are not spread uniformly through the year is weak.

### Ex 9.19

The Hutterite Brethren is a religious group that is essentially a closed population with almost all marriages within the group. The following table shows the distribution of sons in families with 7 children whose mothers were born between 1879 and 1936.

|<td colspan=8>Number of Sons in Families with Seven Children|||||||||
---|---|---|---|---|---|---|---|---
Sons | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
Count | 0 | 6 | 14 | 25 | 21 | 22 | 9 | 1

#### (a) 
State the hypotheses to test that the number of sons follows a binomial distribution with $p$ = 0.5, where $p$ is the probability that a child is male. Conduct a test of the hypotheses using a $\chi^2$-test at $\alpha$ = .10.

✍️ Denote by $p_i$ ($i = 0 \ldots 7$) the proportion of 7-children families having $i$ sons. 
$$
H_0: p_i = \binom{n}{i} p^i (1-p)^{n-i} = \frac{1}{2^7}\binom{7}{i} \quad \text{v.s.} \quad H_1: \text{ not } H_0 \text{.}
$$

The $\chi^2$ approximation requires each expected count to be ≥ 5 (here the expected counts for 0 and 7 sons are below 1), so we combine the first two cells and the last two cells before applying the test.

In [68]:
f_obs = np.array([0, 6, 14, 25, 21, 22, 9, 1])
f_obs_combined = np.concat(([0+6], f_obs[2:-2], [9+1]))
f_exp = stats.binom(n=7, p=1/2).pmf(np.arange(8)) * f_obs.sum()
f_exp_combined = np.concat(([f_exp[:2].sum()], f_exp[2:-2], [f_exp[-2:].sum()]))
test = stats.chisquare(f_obs_combined, f_exp_combined)
print(test.pvalue)

0.2800706126649032


$P$-value > $\alpha$, so we conclude that the data are consistent with a binomial distribution with $p$ = 0.5.

#### (b) 
State the hypotheses to test that the number of sons follows a binomial distribution (with unspecified $p$). Conduct a test of the hypotheses using a $\chi^2$-test at $\alpha$ = .10. How does the result of this test compare with the result from part (a)?

In [69]:
# estimate probability of sons
p = f_obs @ np.arange(8) / (f_obs.sum()*7)
print(f"estimated p = {p}")
# and use the estimated p to generate the expected counts.
f_exp = stats.binom(n=7, p=p).pmf(range(8)) * f_obs.sum()
f_exp_combined = np.concat(([f_exp[:2].sum()], f_exp[2:-2], [f_exp[-2:].sum()]))
# use ddof=1 to account for the estimated p
test = stats.chisquare(f_obs_combined, f_exp_combined, ddof=1)
print(f"P-value = {test.pvalue}")

estimated p = 0.5306122448979592
P-value = 0.5212706259986852


$P$-value > $\alpha$, so we conclude that the data are consistent with a binomial distribution, in agreement with (a). Note also that the $P$-value with the estimated parameter is greater than that with a fixed parameter (0.52 > 0.28), because the estimated $p$ fits the data better.

### Ex 9.20

A genetics experiment on characteristics of tomato plants provided the following data on the numbers of offspring expressing four phenotypes.

Phenotype | Frequency
---|---
Tall, cut-leaf | 926
Dwarf, cut-leaf | 293
Tall, potato-leaf | 288
Dwarf, potato-leaf | 104
|
Total | 1611

#### (a) 
State the hypotheses to test that theoretically the four phenotypes will appear in the proportion 9:3:3:1.

✍️ Let $p_i$ ($i = 1 \ldots 4$) correspond to the probabilities of the 4 phenotypes in the order listed.
$$
H_0: p_1 = 9/16 \text{, } p_2 = p_3 = 3/16 \text{, } p_4 = 1/16 \quad \text{v.s.} \quad H_1:\text{ not } H_0 \text{.}
$$

#### (b) 
Test the hypotheses. Use $\alpha$ = .05.

In [42]:
f_obs = np.array([926, 293, 288, 104])
f_exp = np.array([9, 3, 3, 1]) / 16 * f_obs.sum()
test = stats.chisquare(f_obs, f_exp)
print(test.pvalue)

0.6895078646142022


$P$-value > $\alpha$, conclude that data is consistent with $H_0$.

### Ex 9.21

During World War II, a 36 sq. km area of South London was gridded into 0.25 km squares to record bomb hits. The following data give the number of squares receiving 0 hits, 1 hit, etc. If hits were random, a Poisson model would fit the data. Test using $\alpha$ = .05 to see if this is the case.

||||||||||
---|---|---|---|---|---|---|---|---
Number of Hits | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7
Number of 0.25 km Squares | 229 | 211 | 93 | 35 | 7 | 0 | 0 | 1

✍️ Let $p_i$ ($i = 0 \ldots 7$) correspond to the probabilities of a 0.25 km square having $i$ hits.
$$
H_0: p_i = e^{-\mu} \frac{\mu^i}{i!} \quad \text{v.s.} \quad H_1:\text{ not } H_0 \text{.}
$$

We need to combine the last 4 cells (4 or more hits), whose expected counts are small, and then apply the $\chi^2$-test.

In [61]:
f_obs = np.array([229, 211, 93, 35, 7, 0, 0, 1])
# estimate μ
μ = np.arange(8) @ f_obs / f_obs.sum()
# combine cells
f_obs = np.append(f_obs[:4], f_obs[4:].sum())
f_exp = np.append(stats.poisson(μ).pmf(np.arange(4)), 
                  stats.poisson(μ).sf(3)) * f_obs.sum()
# ddof=1 for estimated μ
test = stats.chisquare(f_obs, f_exp, ddof=1)
print(test.pvalue)

0.7969959425043197


$P$-value > $\alpha$, conclude that Poisson is a plausible model for the data.

### Ex 9.22

Studies were made to estimate the number of passengers (other than the driver) per car in urban traffic. The numbers of passengers carried in 1011 cars traveling through the Wilshire and Bundy boulevard intersection in Los Angeles between 10:00 A.M. and 10:20 A.M. on March 24, 1959, are given below.

||||||||
---|---|---|---|---|---|---
Number of Passengers | 0 | 1 | 2 | 3 | 4 | ≥5
Frequency | 678 | 227 | 56 | 28 | 8 | 14

#### (a)
The average number of passengers per car is 0.519. Test to see if the number of passengers follows a Poisson distribution.

In [80]:
f_obs = np.array([678, 227, 56, 28, 8, 14])
μ = 0.519 # estimated from data
f_exp = np.append(stats.poisson(μ).pmf(np.arange(5)), 
                  stats.poisson(μ).sf(4)) * f_obs.sum()
test = stats.chisquare(f_obs, f_exp, ddof=1)
print(test.pvalue)

1.3474982979796068e-214


The $P$-value is essentially zero, so the number of passengers does not follow a Poisson distribution.

#### (b) 
Consider a geometric distribution for the number of occupants. (The number of occupants= 1 + the number of passengers). Recall that the geometric distribution is given by
$$
P(X = x) = (1 - p)^{x-1}p, x = 1,2, \ldots
$$
where $p = 1/\mu$. Estimate $p$ using $\hat{\mu}$ = 1.519 occupants. Test to see if the number of occupants follows a geometric distribution.

In [81]:
# The frequencies are unchanged; only the category labels shift from
# 0..4, >=5 passengers to 1..5, >=6 occupants.
f_obs = np.array([678, 227, 56, 28, 8, 14])
p = 1/1.519 # estimated from data
f_exp = np.append(stats.geom(p).pmf(np.arange(1, 6)), 
                  stats.geom(p).sf(5)) * f_obs.sum()
test = stats.chisquare(f_obs, f_exp, ddof=1)
print(f"{test.pvalue:.1e}")

5.4e-05


$P$-value is still too small, so number of passengers is not geometric either.

#### (c) 
Which distribution fits the data better?

✍️ The geometric distribution fits the data better, for it has a much larger $P$-value than the Poisson.
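
Because both fits use the same six cells and each estimates one parameter, the two $\chi^2$ statistics share 4 d.f. and can be compared directly. The following is a minimal sketch of that comparison, assuming the frequencies stay the same when passengers are relabelled as occupants (only the category labels shift by one).

```python
import numpy as np
from scipy import stats

f_obs = np.array([678, 227, 56, 28, 8, 14])
n = f_obs.sum()

# Poisson fit on passengers 0..4 and ">= 5", with mean 0.519
pois = stats.poisson(0.519)
e_pois = np.append(pois.pmf(np.arange(5)), pois.sf(4)) * n

# geometric fit on occupants 1..5 and ">= 6", with p = 1/1.519
geom = stats.geom(1 / 1.519)
e_geom = np.append(geom.pmf(np.arange(1, 6)), geom.sf(5)) * n

# ddof=1 in both cases because one parameter is estimated from the data
chi2_pois = stats.chisquare(f_obs, e_pois, ddof=1).statistic
chi2_geom = stats.chisquare(f_obs, e_geom, ddof=1).statistic
print(chi2_pois, chi2_geom)
```

The smaller statistic indicates the closer fit, although neither model passes the test.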

### Ex 9.23

Consider the problem of testing $H_0: p = p_0$ vs. $H_1: p \ne p_0$, where $p$ is the success probability of a Bernoulli population from which we have a random sample of size $n$. Equation (9.6) gives the following test statistic for this problem:
$$
z = \frac{y - n\, p_0}{\sqrt{n p_0(1-p_0)}}
$$
where $y$ is the number of successes.

#### (a) 
Show that $z^2 = \chi^2$ , where the $\chi^2$-statistic is given by (9.14). This $\chi^2$-statistic has 1 d.f.

✍️ There are 2 categories in the sample: success and failure. If we are to perform a $\chi^2$ goodness-of-fit test on the data, the observed and expected numbers will be the following:

$i=$ | 1 | 2
---|---|---
$e_i$ | $n p_0$ | $n (1-p_0)$
$n_i$ | $y$ | $n-y$

And the statistic (with 1 d.f.) will be
$$
\begin{align*}
\chi^2 & = \frac{(n_1 - e_1)^2}{e_1} + \frac{(n_2 - e_2)^2}{e_2} \\
    &= \frac{(y - n p_0)^2}{n p_0} + \frac{[n-y-n(1-p_0)]^2}{n(1-p_0)} \\
    &= (y - n p_0)^2 \left[ \frac{1}{n p_0} + \frac{1}{n(1-p_0)} \right] \\
    &= \frac{(y - np_0)^2}{np_0(1-p_0)} \\
    &= z^2 \text{.}
\end{align*}
$$
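
The identity can also be checked numerically. In this sketch the values of $n$, $y$ and $p_0$ are hypothetical and chosen only for illustration.

```python
import numpy as np
from scipy import stats

n, y, p0 = 100, 42, 0.5   # hypothetical sample, for illustration only

# z-statistic from the formula above
z = (y - n * p0) / np.sqrt(n * p0 * (1 - p0))
# chi-square goodness-of-fit statistic on the two cells (success, failure)
chi2 = stats.chisquare([y, n - y], [n * p0, n * (1 - p0)]).statistic

print(z**2, chi2)
```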

#### (b) 
Show that the two-sided $z$-test and the $\chi^2$-test are equivalent.

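✍️ The two-sided $z$-test rejects $H_0$ when
$$
|z| > z_{\alpha/2}\text{, or equivalently } z^2 > z_{\alpha/2}^2 \text{.}
$$
Using the result from (a), this is the same as rejecting $H_0$ when
$$
\chi^2 > z_{\alpha/2}^2 = \chi^2_{1,\alpha} \text{,}
$$
which is the rejection region of the $\chi^2$-test with 1 d.f. (The equality $z_{\alpha/2}^2 = \chi^2_{1,\alpha}$ holds because the square of a standard normal variable is $\chi^2$ with 1 d.f.)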
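
A quick numerical check of the critical values and of the resulting $P$-values is sketched below. The level $\alpha = 0.05$ and the observed $z = -1.6$ are hypothetical choices.

```python
import numpy as np
from scipy import stats

alpha = 0.05
z_crit = stats.norm.isf(alpha / 2)        # upper alpha/2 point of N(0, 1)
chi2_crit = stats.chi2.isf(alpha, df=1)   # upper alpha point of chi-square, 1 d.f.
print(z_crit**2, chi2_crit)

# the two P-values also agree for any observed z (z = -1.6 is hypothetical)
z = -1.6
p_z = 2 * stats.norm.sf(abs(z))
p_chi2 = stats.chi2.sf(z**2, df=1)
print(p_z, p_chi2)
```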

### Ex 9.24

The NBA final is a seven game series, and the first team to win four games wins the series. Denote by $p$ the probability that the Eastern Conference team wins a game and by $q = 1 - p$ that the Western Conference team wins a game. Assume that these probabilities remain constant from game to game and that the outcomes of the games are mutually independent. Assume the result from Exercise 2.70 that the probability that the series ends in $j$ games is given by
$$
\binom{j-1}{3} [p^4 q^{j-4} + q^4 p^{j-4}], \quad  j = 4,5,6,7.
$$
There have been 52 finals in NBA's history (from 1947 to 1998). The number of finals that have gone for 4, 5, 6 and 7 games has been as follows:

- 4 games: 6 finals,
- 5 games: 11 finals,
- 6 games: 21 finals,
- 7 games: 14 finals.

Suppose we assume that the two finalists are evenly matched, so that $p = q = 1/2$. Show that the above model fits these data well.

In [7]:
f_obs = np.array([6, 11, 21, 14])
j = np.arange(4, 8)
f_exp = special.comb(j-1, 3) / 2**(j-1) * f_obs.sum()
test = stats.chisquare(f_obs, f_exp)
print(test.pvalue)

0.5628827711754665


$P$-value > 0.1, so the model is consistent with the data.
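
The expected counts used in the test come from the general formula with $p = q = 1/2$. The sketch below wraps that formula in a helper (the function name `series_length_probs` is hypothetical) and confirms that the probabilities sum to 1 for other values of $p$ as well.

```python
import numpy as np
from scipy import special

def series_length_probs(p):
    """P(series ends in j games) for j = 4, 5, 6, 7 (hypothetical helper)."""
    q = 1 - p
    j = np.arange(4, 8)
    return special.comb(j - 1, 3) * (p**4 * q**(j - 4) + q**4 * p**(j - 4))

print(series_length_probs(0.5))        # evenly matched teams
print(series_length_probs(0.5) * 52)   # expected counts for 52 finals
print(series_length_probs(0.6).sum())  # probabilities sum to 1 for any p
```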

## 9.4 Inferences for Two-way Count Data

Two-way count data result when

1. a single sample is cross-classified based on two categorical variables into $r$ rows and $c$ columns (**multinomial sampling**), or 
2. independent samples are drawn from $r$ multinomial distributions with the same $c$ categories
(**product multinomial sampling**). 

In both cases, the data are summarized in the form of an $r \times c$ **contingency table** of counts. In case (1), the null hypothesis of interest is the **independence hypothesis** between the row and column variables; in case (2), it is the **homogeneity hypothesis**. In both cases, the chi-square statistic has the same general form given above, with the expected count for the $(i, j)$th cell (under $H_0$) being the $i$th row total times the proportion of all observations falling in the $j$th column. The d.f. of the chi-square statistic equal $(r - 1)(c - 1)$. Thus association between the row and the column variable is demonstrated at level $\alpha$ if $\chi^2 > \chi^2_{(r-1)(c-1), \alpha}$.
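
The expected counts and d.f. described above can be computed by hand and compared with `stats.chi2_contingency`. This is a minimal sketch that borrows the wallet counts from Ex. 9.28 later in this section; any $r \times c$ table of counts would do.

```python
import numpy as np
from scipy import stats

# counts from the Ex. 9.28 wallet table (rows: city type; columns: returned, kept)
table = np.array([[21, 9], [18, 12], [17, 13], [24, 6]])

row_tot = table.sum(axis=1, keepdims=True)
col_tot = table.sum(axis=0, keepdims=True)
n = table.sum()

expected = row_tot * col_tot / n   # e_ij = (row i total) * (column j total) / n
chi2 = ((table - expected) ** 2 / expected).sum()
dof = (table.shape[0] - 1) * (table.shape[1] - 1)
pvalue = stats.chi2.sf(chi2, dof)

# cross-check against scipy's built-in test
stat, pval, dof_sp, exp_sp = stats.chi2_contingency(table)
print(chi2, dof, pvalue)
```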

### Ex. 9.25

Tell in each of the following instances whether sampling is multinomial or product multinomial. State mathematically the null hypothesis of no association between the two categorical variables under study.

#### (a) 
A sample of 500 people is cross-classified according to each person's religious affiliation (Christian, Jewish, Agnostic, Other) and political party affiliation (Republican, Democrat, Independent, Other).

✍️ This is multinomial sampling.
$$
H_0: p_{ij} = p_{i \cdot}\ p_{\cdot j} \text{ for all } i, j \text{,}
$$
where $p_{ij}$ represents the probability of people having religious affiliation $i$ and political party affiliation $j$, and $p_{i \cdot}$ and $p_{\cdot j}$ the marginal probabilities.

#### (b) 
To compare the performances of different types of mutual funds, five groups of funds are sampled (municipal bond fund, corporate bond fund, equity income fund, blue chip stock fund, and aggressive growth fund) with 20 funds from each group. The sampled funds are classified according to whether the fund's return over a five year period is low (less than 5%), medium (5% to 10%), or high (greater than 10%).

✍️ This is product multinomial sampling.
$$
H_0: p_{ij} = p_j \text{ for all } i, j \text{,}
$$
where $p_{ij}$ is the probability that a fund from group $i$ falls in return category $j$, and $p_j$ is the common probability of return category $j$ across all fund groups.

### Ex. 9.26

Tell in each of the following instances whether sampling is multinomial or product multinomial. State mathematically the null hypothesis of no association between the two categorical variables under study.

#### (a) 
To see if there is an association between age and willingness to use internet grocery ordering services, 100 people are surveyed in each of four age groups (21-35, 36-50, 50-65, and over 65). The people surveyed are asked whether or not they would use the service, if they had the software.

✍️ This is product multinomial sampling, because the sample size is fixed at 100 in each age group.
$$
H_0: p_{ij} = p_j \text{ for all } i, j \text{,}
$$
where $p_{ij}$ is the probability that a person in age group $i$ gives response $j$ (would or would not use the service), and $p_j$ is the common probability of response $j$ across all age groups.

#### (b)
A sample of 1000 traffic accidents is cross-classified according to the severity of injury (none, minor, disabling, death) and the use of a safety restraint (none, seat belt, seat and shoulder belt).

✍️ This is multinomial sampling, because a single sample of 1000 accidents is cross-classified.
$$
H_0: p_{ij} = p_{i \cdot}\ p_{\cdot j} \text{ for all } i, j \text{,}
$$
where $i$ refers to the severity of injury and $j$ to the safety restraint used, and $p_{i \cdot}$ and $p_{\cdot j}$ are the marginal probabilities.

### Ex. 9.27

Evidence for authorship of a document can be based on the distribution of word lengths. In 1861 the _New Orleans Crescent_ published a set of ten letters signed Quintus Curtius Snodgrass. It has been claimed that the author was Mark Twain. A way to test this claim is to see if the word length distribution of the Q.C.S. letters matches the distribution of the word lengths in a sample of Mark Twain's writing. Here are the data for the two distributions.

Word Length | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13+
---|---|---|---|---|---|---|---|---|---|---|---|---|---
Mark Twain | 312 | 1146 | 1394 | 1177 | 661 | 442 | 367 | 231 | 181 | 109 | 50 | 24 | 12
Q.C.S. | 424 | 2685 | 2752 | 2302 | 1431 | 992 | 896 | 638 | 465 | 276 | 152 | 101 | 61

#### (a) 
Which type of sampling is used here -- multinomial or product multinomial?

✍️ This is product multinomial sampling.

#### (b) 
Do the appropriate test at $\alpha$ = .01. State your conclusion.

In [12]:
table = [
    [312, 1146, 1394, 1177, 661, 442, 367, 231, 181, 109, 50, 24, 12],
    [424, 2685, 2752, 2302, 1431, 992, 896, 638, 465, 276, 152, 101, 61]]
test = stats.chi2_contingency(table)
print(test.pvalue)

2.836740984550758e-16


$P$-value < $\alpha$, so the word-length distributions of Mark Twain and Q.C.S. differ.

### Ex. 9.28

Reader's Digest conducted an experiment to find out how honest people are in different cities. Three cities of each type were selected: big cities (Atlanta, GA, Seattle, WA, and St. Louis, MO), suburbs (Boston, MA, Houston, TX, and Los Angeles, CA), medium cities (Dayton, OH, Greensboro, PA, and Las Vegas, NV) and small cities (Cheyenne, WY, Concord, NH, and Meadville, PA). In each selected city 10 wallets were left in public places. Each wallet contained $50 cash and a telephone number and an address where the owner could be reached. A record was kept of the number of wallets returned by people who found the wallets. The data are summarized in the table below.

Type of Cities | Wallet: Returned | Wallet: Kept || Row Total
---|---|---|---|---
Big Cities | 21 | 9 || 30
Suburbs | 18 | 12 || 30
Medium Cities | 17 | 13 || 30
Small Cities | 24 | 6 || 30
||||
Column Total | 80 | 40 || 120

#### (a) 
Which type of sampling is used here -- multinomial or product multinomial?

✍️ This is product multinomial sampling.

#### (b) 
Set up the hypotheses to check whether there are significant differences between the percentages of people who returned the wallets in different types of cities.

✍️
$$
H_0: p_i = p \text{ for all } i \quad \text{vs.} \quad H_1: p_i \ne p \text{ for some $i$,}
$$
where $i$ indexes the type of city, $p_i$ is the proportion of people in cities of type $i$ who return the wallet, and $p$ is the common proportion under $H_0$.

#### (c) 
Do a $\chi^2$-test of the hypotheses. Can you conclude that there are significant differences among the return rates in different types of cities? Use $\alpha$ = .10.

In [13]:
table = [
    [21, 9],
    [18, 12],
    [17, 13],
    [24, 6]]
test = stats.chi2_contingency(table)
print(test.pvalue)

0.21229028736013367


$P$-value > $\alpha$, so there are no significant differences among the return rates in the different types of cities.

### Ex. 9.29

Biologists studying sandflies at a site in eastern Panama looked at the relationship between gender and the height at which the flies were caught in a light trap, resulting in the following data.

Gender \ Height | 3 feet | 35 feet || Row Total
---|---|---|---|---
Males | 173 | 125 || 298
Females | 150 | 73 || 223
||||
Column Total | 323 | 198 || 521

#### (a) 
Which type of sampling is used here -- multinomial or product multinomial?

#### (b) 
Set up the hypotheses to test whether the gender of the sandflies and trap height are associated.

#### (c)
Do a chi-square test of the hypotheses. Use $\alpha$ = .05.

### Ex. 9.30

The drug Dramamine was tested for its effectiveness to prevent airsickness compared to placebo. A total of 216 volunteers were randomly assigned to receive treatment (Dramamine) or control (placebo). Of the 108 volunteers receiving treatment, 31 became airsick; of the 108 volunteers receiving placebo, 60 became airsick.

#### (a) 
Make a 2 $\times$ 2 table displaying the results.

#### (b) 
Set up the hypotheses to test whether Dramamine is effective in reducing the chances of airsickness.

#### (c) 
Do a chi-square test of the hypotheses. Use $\alpha$ = .05.
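
One possible setup for this exercise, offered as a sketch rather than a worked answer: the counts of volunteers who did not become airsick are obtained by subtraction from the group sizes. Note that `stats.chi2_contingency` applies Yates' continuity correction to 2 $\times$ 2 tables by default; `correction=False` below gives the uncorrected Pearson statistic. Because the alternative in (b) is one-sided, the two-sided chi-square $P$-value is only a rough guide.

```python
import numpy as np
from scipy import stats

# rows: Dramamine, placebo; columns: airsick, not airsick
table = np.array([[31, 108 - 31],
                  [60, 108 - 60]])

# uncorrected Pearson chi-square test of homogeneity (1 d.f.)
stat, pvalue, dof, expected = stats.chi2_contingency(table, correction=False)
print(stat, pvalue, dof)
print(expected)
```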

### Ex. 9.31

The Western Collaborative Group Study investigated the risk of coronary heart disease in 3154 men. From these data, the 40 heaviest men were classified by their cholesterol measurements (mg per 100 ml) and behavior (Type A or Type B). Broadly speaking, Type A behavior is competitive, while Type B behavior is noncompetitive. There were 20 Type A men, of whom 8 had cholesterol levels above 250 mg per 100 ml. There were 20 Type B men, of whom 3 had cholesterol levels above 250 mg per 100 ml.

#### (a) 
Make a 2 $\times$ 2 table displaying the personality type by cholesterol levels (≤ 250 vs. > 250).

#### (b) 
Set up the hypotheses to test whether or not personality type is associated with cholesterol level. Conduct a test of the hypotheses. Use $\alpha$ = .10.

### Ex. 9.32

The following table gives the eye color and hair color of 592 students.

Eye Color \ Hair Color | Black | Brown | Red | Blond || Row Total
---|---|---|---|---|---|---
Brown | 68 | 119 | 26 | 7 || 220
Blue | 20 | 84 | 17 | 94 || 215
Hazel | 15 | 54 | 14 | 10 || 93
Green | 5 | 29 | 14 | 16 || 64
|||||
Column Total | 108 | 286 | 71 | 127 || 592

#### (a) 
Which type of sampling is used here -- multinomial or product multinomial?

#### (b) 
Set up the hypotheses to test that the eye color and hair color are associated. Do a chi-square test at $\alpha$ = .05. What do you conclude?

### Ex. 9.33

Following the nuclear mishap at Three Mile Island near Harrisburg, PA, a sample of 150 households was surveyed. One question asked was: "Should there have been a full evacuation of the immediate area?" The following table classifies the responses according to the distance from the accident.

Full Evacuation \ Distance (miles) | 1-3 | 4-6 | 7-9 | 10-12 | 13-15 | 15+ || Row Total
---|---|---|---|---|---|---|---|---
Yes | 7 | 11 | 10 | 5 | 4 | 29 || 66
No | 9 | 11 | 13 | 6 | 6 | 39 || 84
||||||||
Column Total | 16 | 22 | 23 | 11 | 10 | 68 || 150

#### (a) 
Set up the hypotheses to test whether distance from the accident and evacuation attitudes are associated.

#### (b)
Conduct a test of the hypotheses. Use $\alpha$ = .10.

### Ex. 9.34

To investigate whether there is a relationship between tonsil size and carriers of a particular bacterium, _Streptococcus pyogenes_, 1398 school children were examined. The following table classifies the children according to tonsil size and carrier status.

Tonsil Size \ Carrier Status | Carrier | Non carrier || Row Total
---|---|---|---|---
Normal | 19 | 497 || 516
Large | 29 | 560 || 589
Very Large | 24 | 269 || 293
||||
Column Total | 72 | 1326 || 1398

#### (a) 
Set up the hypotheses to test whether tonsil size and carrier status are associated.

#### (b) 
Conduct a test of the hypotheses. Can you conclude that tonsil size and carrier status are associated? Use $\alpha$ = .05.

### Ex. 9.35

A study was done to investigate the association between the age at which breast cancer was diagnosed and the frequency of breast self-examination. The following table classifies the women in the sample according to these two criteria.

Age \ Frequency | Monthly | Occasionally | Never
---|---|---|---
< 45 | 91 | 90 | 51
45-59 | 150 | 200 | 155
60+ | 109 | 198 | 172

#### (a) 
Set up the hypotheses to test that frequency of self-examination is related to the age of breast cancer diagnosis.

#### (b)
Conduct a test of the hypotheses. Can you conclude association using $\alpha$ = .10?

### Ex. 9.36

Refer to Exercise 9.30. Calculate a 95% CI for the odds ratio. What does this say about the odds of airsickness if Dramamine is used?

### Ex. 9.37

Refer to Exercise 9.31. Calculate a 90% CI for the odds ratio. What does this say about the odds of higher cholesterol levels for a Type A personality?

## Advanced Exercises

### Ex. 9.38

A sample of employed men aged 18 to 67 were asked if they had carried out work on their home in the preceding year for which they would have previously employed a craftsman. The following table gives the summary of responses of 906 homeowners.

Work | Home Repair | Age: < 30 | Age: 31-45 | Age: 46+ || Row Total
---|---|---|---|---|---|---
Skilled | Yes | 56 | 56 | 35 || 147
Skilled | No  | 12 | 21 | 8  || 41
Unskilled | Yes | 23 | 52 | 49 || 124
Unskilled | No  | 9  | 31 | 51 || 91
Office | Yes | 54 | 191 | 102 || 347
Office | No  | 19 | 76  | 61  || 156
||||||
Column Totals || 173 | 427 | 306 || 906

#### (a) 
For each category of work, set up the hypotheses to test that frequency of home repair is related to the age. Conduct the test using $\alpha$ = .05.

#### (b) 
Create a table of home repair by age summing across all types of work. Test for association between the frequency of home repair and age group using $\alpha$ = .05.

#### (c) 
Compare the results in (a) and (b). What information is missed in (b)?

### Ex. 9.39

In this exercise we will derive the more accurate large sample CI (9.3) for the binomial proportion $p$.

#### (a) 
Begin with the probability statement
$$
P\left[ -z_{\alpha/2} \le \frac{\hat{p}-p}{\sqrt{pq/n}} \le z_{\alpha/2} \right]
= P\left[ (\hat{p} - p)^2 \le z_{\alpha/2}^2 \frac{pq}{n} \right] \simeq 1 - \alpha
$$
and simplify it to the form
$$
P\{A p^2 + B p + C \le 0\} \simeq 1 - \alpha.
$$

#### (b) 
Find the two roots of the quadratic equation $A p^2 + B p + C = 0$, and show that they are given by
$$
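p_L = \frac{\hat{p} + \dfrac{z^2}{2n} - \sqrt{\dfrac{\hat{p}\hat{q} z^2}{n} + \dfrac{z^4}{4n^2}}}{1 + \dfrac{z^2}{n}}\text{, } \quad
p_U = \frac{\hat{p} + \dfrac{z^2}{2n} + \sqrt{\dfrac{\hat{p}\hat{q} z^2}{n} + \dfrac{z^4}{4n^2}}}{1 + \dfrac{z^2}{n}}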
$$
where $z = z_{\alpha/2}$.
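
The roots can be checked numerically. The sketch below assumes that expanding the inequality in (a) gives $A = 1 + z^2/n$, $B = -(2\hat{p} + z^2/n)$ and $C = \hat{p}^2$, and compares the output of `np.roots` with the closed form obtained from the quadratic formula. The values $\hat{p} = 0.3$ and $n = 50$ are hypothetical.

```python
import numpy as np
from scipy import stats

p_hat, n, alpha = 0.3, 50, 0.05   # hypothetical sample, for illustration only
q_hat = 1 - p_hat
z = stats.norm.isf(alpha / 2)

# coefficients of A p^2 + B p + C (assumed expansion of the inequality in (a))
A = 1 + z**2 / n
B = -(2 * p_hat + z**2 / n)
C = p_hat**2
p_L, p_U = np.sort(np.roots([A, B, C]))

# closed-form roots from the quadratic formula
centre = p_hat + z**2 / (2 * n)
half = np.sqrt(p_hat * q_hat * z**2 / n + z**4 / (4 * n**2))
closed = np.array([centre - half, centre + half]) / A

# the simpler large-sample interval (9.1), for comparison
simple = p_hat + np.array([-1, 1]) * z * np.sqrt(p_hat * q_hat / n)
print(p_L, p_U)
print(closed)
print(simple)
```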

#### (c) 
Show that $A p^2 + B p + C \le 0$ is equivalent to $p_L \le p \le p_U$. (_Hint:_ Sketch the quadratic function $A p^2 + B p + C$ and note that it falls below zero when $p$ falls between its two roots, $p_L$ and $p_U$.) Therefore $P\{p_L \le p \le p_U\} \simeq 1 - \alpha$, and thus $[p_L, p_U]$ is an approximate $(1 - \alpha)$-level CI for $p$. Why is this CI more accurate than that given in (9.1)?

#### (d) 
Show that if the terms of the order of $1/n$ are ignored in comparison to the terms of the order of $1/\sqrt{n}$ when $n$ is large, the CI $[p_L, p_U]$ simplifies to (9.1).