# Chapter 6: Confidence Intervals

### Section 6.3: Confidence Intervals for Population Proportions

##### Objective 1: Finding a Point Estimate for a Population Proportion

##### Definitions:

The <em><b>point estimate for $p$</b></em>, the population proportion of successes, is given by the proportion of successes in a sample and is denoted by $\hat{p} = \frac{x}{n}$, where $x$ is the number of successes in the sample and $n$ is the sample size.

The <em><b>point estimate for the population proportion of failures</b></em> is given by $\hat{q} = 1 - \hat{p}$.

##### Example:

In a random sample of 1000 U.S. adults, 372 had type O- blood. Find a point estimate for the population proportion of U.S. adults with type O- blood.

##### Solution:

$\hat{p} = \frac{x}{n} = \frac{372}{1000} = 0.372$. So we estimate that about $37.2\%$ of U.S. adults have type O- blood.
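The same calculation can be sketched in a few lines of Python. The function name `point_estimates` is made up for this illustration and is not part of any library; it simply applies $\hat{p} = \frac{x}{n}$ and $\hat{q} = 1 - \hat{p}$.

```python
def point_estimates(x, n):
    """Return (p_hat, q_hat) for x successes in a sample of size n.

    Hypothetical helper for illustration only.
    """
    if n <= 0 or not 0 <= x <= n:
        raise ValueError("need n > 0 and 0 <= x <= n")
    p_hat = x / n        # proportion of successes in the sample
    q_hat = 1 - p_hat    # proportion of failures in the sample
    return p_hat, q_hat


# Blood-type example: 372 successes out of 1000 adults.
p_hat, q_hat = point_estimates(372, 1000)
print(f"p_hat = {p_hat:.3f}, q_hat = {q_hat:.3f}")  # p_hat = 0.372, q_hat = 0.628
```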

##### Objective 2: Constructing and Interpreting Confidence Intervals for a Population Proportion

##### Definition:

A <em><b>$c$-confidence interval for a population proportion, $p$,</b></em> is given by $\hat{p} - E < p < \hat{p} + E$, where $E = z_c \cdot \sqrt{\frac{\hat{p}\hat{q}}{n}}$.

The probability that the confidence interval contains $p$ is $c$, assuming that the estimation process is repeated a large number of times.

We can assume the sampling distribution of $\hat{p}$ is approximately normal provided $n\hat{p} \geq 5$ <em>and</em> $n\hat{q} \geq 5$.

Here the mean is $\mu_{\hat{p}} = p$ and the standard error is $\sigma_{\hat{p}} = \sqrt{\frac{pq}{n}}$.
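The repeated-sampling interpretation of the confidence level can be illustrated with a small simulation. This is only a sketch under stated assumptions: we pretend the true population proportion is known ($p = 0.372$ is an arbitrary choice), draw many random samples, build an interval from each one, and count how often the interval contains $p$. The function name `coverage`, the number of trials, and the seed are all choices made for this illustration.

```python
import random
from math import sqrt


def coverage(p, n, z_c, trials, seed=0):
    """Fraction of simulated intervals p_hat +/- E that contain the true p."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n and count the successes.
        x = sum(rng.random() < p for _ in range(n))
        p_hat = x / n
        q_hat = 1 - p_hat
        E = z_c * sqrt(p_hat * q_hat / n)  # margin of error for this sample
        if p_hat - E < p < p_hat + E:
            hits += 1
    return hits / trials


# Assumed "true" proportion 0.372, samples of size 1000, z_c = 1.96 (95%).
rate = coverage(p=0.372, n=1000, z_c=1.96, trials=1000)
print(f"fraction of intervals containing p: {rate:.3f}")
```

The printed fraction should land near $0.95$, though it will not equal it exactly because the simulation itself is random and the normal approximation is only approximate.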

##### Constructing a Confidence Interval for a Population Proportion:

1. Identify the sample statistics $n$ and $x$.

2. Find the point estimate $\hat{p}$ (if $\hat{p}$ is given, then it is not necessary to find $x$).

3. Verify that the sampling distribution of $\hat{p}$ can be approximated by a normal distribution ($n\hat{p} \geq 5$ and $n\hat{q} \geq 5$).

4. Find the critical value $z_c$ that corresponds to the confidence level $c$ (we always use $z_c$ for confidence intervals for population proportions).

5. Find the margin of error $E = z_c \cdot \sqrt{\frac{\hat{p}\hat{q}}{n}}$.

6. Find the left and right endpoints and form the confidence interval:

\begin{eqnarray}
\text{Left endpoint:} && \hat{p} - E \\
\text{Right endpoint: } && \hat{p} + E \\
\text{Interval: } && \hat{p} - E < p < \hat{p} + E
\end{eqnarray}
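The six steps above can be sketched as a single Python function. The name `proportion_ci` is hypothetical, and the sample used in the usage lines ($x = 120$, $n = 400$, $c = 0.90$) is made up purely for illustration. The critical value is computed with `statistics.NormalDist` from the standard library rather than read from a table, so it may differ slightly from a rounded table value.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x, n, c):
    """Return (left, right) endpoints of a c-confidence interval for p."""
    # Steps 1-2: sample statistics and point estimates.
    p_hat = x / n
    q_hat = 1 - p_hat
    # Step 3: check that the normal approximation is reasonable.
    if n * p_hat < 5 or n * q_hat < 5:
        raise ValueError("need n*p_hat >= 5 and n*q_hat >= 5")
    # Step 4: critical value with area c between -z_c and z_c.
    z_c = NormalDist().inv_cdf((1 + c) / 2)
    # Step 5: margin of error.
    E = z_c * sqrt(p_hat * q_hat / n)
    # Step 6: endpoints of the interval.
    return p_hat - E, p_hat + E


# Made-up sample: 120 successes out of 400, 90% confidence.
left, right = proportion_ci(x=120, n=400, c=0.90)
print(f"{left:.3f} < p < {right:.3f}")  # 0.262 < p < 0.338
```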

##### Example:

In a random sample of 1000 U.S. adults, 372 had type O- blood. Construct a $95\%$ confidence interval for the population proportion of U.S. adults who have type O- blood.

##### Solution:

1. We know that $n = 1000$ and $x = 372$.

2. Therefore we have $\hat{p} = \frac{x}{n} = \frac{372}{1000} = 0.372 \Rightarrow \hat{q} = 1 - \hat{p} = 0.628$.

3. We have $n\hat{p} = 372 \geq 5$ and $n\hat{q} = 628 \geq 5$, so we can approximate with a normal distribution.

4. The level of confidence gives us a critical value: $c = 0.95 \Rightarrow z_c = 1.96$

5. The margin of error is: $E = z_c \cdot \sqrt{\frac{\hat{p}\hat{q}}{n}} = 1.96 \sqrt{\frac{0.372 \cdot 0.628}{1000}} \approx 0.030$

6. The interval is then:

\begin{eqnarray}
\text{Left endpoint:} && \hat{p} - E = 0.372 - 0.030 = 0.342 \\
\text{Right endpoint: } && \hat{p} + E = 0.372 + 0.030 = 0.402 \\
\text{Interval: } && \boxed{0.342 < p < 0.402} \text{ or } \boxed{(0.342,0.402)}
\end{eqnarray}

So with $95\%$ confidence, we can say that the population proportion of U.S. adults who have type O- blood is between $34.2\%$ and $40.2\%$.

##### Example:

A survey of 498 U.S. adults asking, "Who are the more dangerous drivers?" resulted in $71\%$ saying teenagers, $25\%$ saying people over the age of 65, and $4\%$ having no opinion. Construct a $99\%$ confidence interval for the population proportion of U.S. adults who think that teenagers are the more dangerous drivers.

##### Solution:

1. Here we know that $n = 498$ U.S. adults are part of the survey. We don't need to find the number of successes, $x$, because the survey reports the sample proportion directly.

2. $71\%$ said teenagers $\Rightarrow \hat{p} = 0.71 \Rightarrow \hat{q} = 1 - \hat{p} = 0.29$

3. $n\hat{p} = 498 \cdot 0.71 = 353.58 \geq 5$ and $n\hat{q} = 498 \cdot 0.29 = 144.42 \geq 5$ so we can approximate with a normal distribution.

4. The level of confidence gives a critical value: $c = 0.99 \Rightarrow z_c = 2.575$

5. The margin of error is: $E = z_c \cdot \sqrt{\frac{\hat{p}\hat{q}}{n}} = 2.575 \cdot \sqrt{\frac{0.71 \cdot 0.29}{498}} \approx 0.052$

6. The interval is then:

\begin{eqnarray}
\text{Left endpoint:} && \hat{p} - E = 0.71 - 0.052 = 0.658 \\
\text{Right endpoint: } && \hat{p} + E = 0.71 + 0.052 = 0.762 \\
\text{Interval: } && \boxed{0.658 < p < 0.762} \text{ or } \boxed{(0.658,0.762)}
\end{eqnarray}

So with $99\%$ confidence, we can say that the population proportion of U.S. adults who say teenagers are the more dangerous drivers is between $65.8\%$ and $76.2\%$.
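The critical values used in the examples above ($1.96$ and $2.575$) are table values. As a rough cross-check, they can be recovered from the standard normal distribution with Python's standard library. The helper name `critical_value` is an assumption made for this sketch; `NormalDist().inv_cdf` is the standard-library inverse cumulative distribution function.

```python
from statistics import NormalDist


def critical_value(c):
    """Return z_c such that the area between -z_c and z_c is c."""
    if not 0 < c < 1:
        raise ValueError("confidence level must be strictly between 0 and 1")
    # The central area c leaves (1 - c) / 2 in each tail,
    # so z_c sits at cumulative area (1 + c) / 2.
    return NormalDist().inv_cdf((1 + c) / 2)


for c in (0.90, 0.95, 0.99):
    print(f"c = {c:.2f}  ->  z_c = {critical_value(c):.3f}")
# c = 0.90  ->  z_c = 1.645
# c = 0.95  ->  z_c = 1.960
# c = 0.99  ->  z_c = 2.576
```

For $c = 0.99$ the computed value is about $2.576$, while tables commonly list $2.575$; the difference does not change the endpoints above when they are rounded to three decimal places.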

##### Objective 3: Determine the Minimum Sample Size Required When Estimating a Population Proportion

Given a $c$-confidence level and margin of error, $E$, the minimum sample size, $n$, needed to estimate the population proportion $p$ is found by solving the margin of error formula for $n$:

$\require{cancel}$

\begin{eqnarray}
E & = & z_c \cdot \sqrt{\frac{\hat{p}\hat{q}}{n}} \\
E & = & z_c \cdot \color{blue}{\frac{\sqrt{\hat{p}\hat{q}}}{\sqrt{n}}} \\
E \cdot \color{blue}{\sqrt{n}} & = & z_c \cdot \frac{\sqrt{\hat{p}\hat{q}}}{\sqrt{n}} \color{blue}{\cdot \sqrt{n}} \\
E \cdot \sqrt{n} & = & z_c \cdot \frac{\sqrt{\hat{p}\hat{q}}}{\cancel{\sqrt{n}}} \color{blue}{\cdot \cancel{\sqrt{n}}} \\
E \cdot \sqrt{n} & = & z_c \cdot \sqrt{\hat{p}\hat{q}} \\
\frac{E \cdot \sqrt{n}}{\color{blue}{E}} & = & z_c \cdot \frac{\sqrt{\hat{p}\hat{q}}}{\color{blue}{E}} \\
\frac{\cancel{E} \cdot \sqrt{n}}{\color{blue}{\cancel{E}}} & = & z_c \cdot \frac{\sqrt{\hat{p}\hat{q}}}{\color{blue}{E}} \\
\sqrt{n} & = & z_c \cdot \frac{\sqrt{\hat{p}\hat{q}}}{E} \\
\left( \sqrt{n} \right)^{\color{blue}{2}} & = & \left( z_c \cdot \frac{\sqrt{\hat{p}\hat{q}}}{E} \right)^{\color{blue}{2}} \\
n & = & \hat{p}\hat{q}\left( \frac{z_c}{E} \right)^2
\end{eqnarray}

So the minimum sample size we require is $\boxed{n = \hat{p}\hat{q}\left( \frac{z_c}{E} \right)^2}$.
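The boxed formula translates directly into code. The sketch below uses a hypothetical function name, `min_sample_size`, and rounds the result up with `math.ceil`, since a sample size must be a whole number that is at least as large as the computed value. The inputs in the usage lines ($z_c = 1.645$, $E = 0.05$) are made up for illustration.

```python
from math import ceil


def min_sample_size(p_hat, z_c, E):
    """Minimum n so that the margin of error is at most E."""
    if not 0 < p_hat < 1:
        raise ValueError("p_hat must be strictly between 0 and 1")
    if E <= 0:
        raise ValueError("E must be positive")
    q_hat = 1 - p_hat
    n = p_hat * q_hat * (z_c / E) ** 2
    return ceil(n)  # always round up, never to the nearest integer


# Illustrative inputs: 90% confidence (z_c = 1.645), margin of error 0.05.
print(min_sample_size(0.50, 1.645, 0.05))  # 271
print(min_sample_size(0.20, 1.645, 0.05))  # 174
```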

##### Important Notes:

Just as before when finding a minimum sample size for constructing a $c$-confidence interval, we need to round $n$ <em>up</em>.

When no preliminary information is given to find $\hat{p}$, assume that $\hat{p} = 0.50$.
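One way to see why $\hat{p} = 0.50$ is the safe default: the product $\hat{p}\hat{q}$ in the sample size formula is largest at that value, so it produces the largest (most conservative) $n$. A short sketch by completing the square:

$$\hat{p}\hat{q} = \hat{p}\left( 1 - \hat{p} \right) = \frac{1}{4} - \left( \hat{p} - \frac{1}{2} \right)^2 \leq \frac{1}{4}$$

Equality holds only when $\hat{p} = 0.50$, so any other preliminary estimate gives a smaller required sample size for the same $z_c$ and $E$.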

##### Example:

You are running a political campaign and wish to estimate, with $95\%$ confidence, the population proportion of registered voters who will vote for your candidate. Your estimate must be accurate within $3\%$ of the population proportion. Find the minimum sample size needed when (a) no preliminary estimate is available and (b) a preliminary estimate gives $\hat{p} = 0.31$. Compare your results.

##### Solution:

(a)

We are looking for the <em>minimum</em> sample size for a confidence interval involving a population proportion $\Rightarrow$ we will need $n = \hat{p}\hat{q} \cdot \left( \frac{z_c}{E} \right)^2$. With no preliminary info: $\hat{p} = 0.50 \Rightarrow \hat{q} = 0.50$. Since $c = 0.95$ our critical value is $z_c = 1.96$ and the margin of error is $E = 0.03$ (we wish to be within $3\%$ in our estimate). So we have $$n = \hat{p}\hat{q} \cdot \left( \frac{z_c}{E} \right)^2 = 0.50 \cdot 0.50 \left( \frac{1.96}{0.03} \right)^2  = 1067.11111... \approx 1068$$

Don't forget to round <em>up</em> when finding a <em>minimum</em> sample size.

(b)

We are looking for the <em>minimum</em> sample size for a confidence interval involving a population proportion $\Rightarrow$ we will need $n = \hat{p}\hat{q} \cdot \left( \frac{z_c}{E} \right)^2$. We now have preliminary info: $\hat{p} = 0.31 \Rightarrow \hat{q} = 0.69$. Since $c = 0.95$ our critical value is $z_c = 1.96$ and the margin of error is $E = 0.03$ (we wish to be within $3\%$ in our estimate). So we have $$n = \hat{p}\hat{q} \cdot \left( \frac{z_c}{E} \right)^2 = 0.31 \cdot 0.69 \left( \frac{1.96}{0.03} \right)^2 = 913.02026... \approx 914$$

Comparing the two results: with no preliminary estimate we need at least 1068 registered voters, while the preliminary estimate $\hat{p} = 0.31$ lowers the minimum sample size to 914.

##### End of Section