## Inference in a One-Way Table

### Example
In order to assess the quality of a certain make of automobile, the manufacturer surveys a random sample of 200 owners of a model with manual transmission. Among the questions asked is “Have you had to have the clutch replaced?” Of the 200 responses, 36 answered “yes”. Estimate the proportion of all such owners who have had to have the clutch replaced.

Let $X$ denote the number of sampled cars that have had the clutch replaced. Then $X \sim B(n, p)$, where $n = 200$ and $p$ is the probability that a single car has had its clutch replaced; $p$ is also the true population proportion of cars with a replaced clutch. The point estimate is $\hat p = 36/200 = 0.18$. We now need a level-$L$ confidence interval for $p$. For this example, $L = 95\%$.

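By the CLT, $X$ is approximately normal for large $n$. Standardizing with the estimated standard deviation and using a $t$ reference distribution (which for $n = 200$ is nearly identical to the standard normal), we have that:

\begin{align}
\frac{X - \mu}{\hat \sigma} \sim t_{n-2}.
\end{align}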
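
One way to read this pivot, assuming $\mu = np$ and $\hat\sigma = \sqrt{n\hat p(1-\hat p)}$ (the quantities the code below computes), is to solve it for $p$. With $t^*$ the upper $(1-L)/2$ critical value of $t_{n-2}$:

\begin{align}
-t^* \le \frac{X - np}{\hat\sigma} \le t^*
\quad\Longleftrightarrow\quad
\frac{X - t^*\hat\sigma}{n} \le p \le \frac{X + t^*\hat\sigma}{n},
\end{align}

which is the same interval as $\hat p \pm t^* \sqrt{\hat p(1-\hat p)/n}$.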


### Calculating the Statistics

In [4]:
n = 200
p = 36/200
# estimated standard deviation of the count X (count scale, not proportion scale)
se = sqrt(n*p*(1-p))
cat('SE:', round(se, digits=2), '\n')

SE: 5.43 


### CI for the Population Proportion

In [5]:
# t-critical value
t_star = qt(.025, n-2, lower.tail=FALSE)
cat('t_star:', round(t_star, digits=2), '\n')


ci_l = (n*p - se*t_star)/n
ci_u = (n*p + se*t_star)/n
cat('(', round(ci_l, digits=3), ',', round(ci_u, digits=3), ')')

t_star: 1.97 
( 0.126 , 0.234 )
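
The same interval can be computed directly on the proportion scale. The sketch below is not part of the original notebook: `prop_ci` is a hypothetical helper name, and the optional `df` argument is an assumption added so that the $t_{n-2}$ interval above can be compared with the more common normal-approximation interval.

```r
# Hypothetical helper (not from the original notebook): confidence interval
# for a single proportion, computed on the proportion scale.
prop_ci <- function(x, n, level = 0.95, df = NULL) {
  p_hat <- x / n
  se_p  <- sqrt(p_hat * (1 - p_hat) / n)   # standard error of p_hat
  alpha <- 1 - level
  # Use a t critical value when df is supplied, otherwise a normal one.
  crit <- if (is.null(df)) qnorm(1 - alpha / 2) else qt(1 - alpha / 2, df)
  c(lower = p_hat - crit * se_p, upper = p_hat + crit * se_p)
}

ci_t <- prop_ci(36, 200, df = 200 - 2)  # same reference distribution as above
ci_z <- prop_ci(36, 200)                # normal approximation
print(round(ci_t, 3))
print(round(ci_z, 3))
```

With $n = 200$ the two critical values differ only in the second decimal place, so the normal-based interval is just slightly narrower.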

## Inference in a Two-Way Table

### Example
The random sample of 200 car owners consisted of 52 who were 25 years old or younger. Of these 52, 17 had had to replace the clutch on their cars. We will estimate the difference in the proportions of younger (i.e. 25 years and under) and older drivers in the population who have had to replace the clutch.

Let $p_{y}$ and $p_{o}$ denote the proportions of young and old car owners, respectively, who have had the clutch replaced. We need an estimate of $p_{y} - p_{o}$. Assuming the two groups of owners, the young ones and the old ones, are sampled independently, the difference in sample proportions $X = \hat{p}_{y} - \hat{p}_{o}$ is approximately normally distributed with mean $p_{y} - p_{o}$. Moreover,

\begin{align}
\frac{X - (p_{y} - p_{o})}{\hat{\sigma}_{p_{y}, p_{o}}} \sim t_{n-2},
\end{align}

where $\hat{\sigma}_{p_{y}, p_{o}}$ is the estimated standard error of $X$.
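
The standard error in the denominator is not written out here. A common choice, assumed in this sketch, is the unpooled estimate with group sizes $n_y = 52$ and $n_o = 148$:

\begin{align}
\hat{\sigma}_{p_{y}, p_{o}} = \sqrt{\frac{\hat{p}_{y}(1-\hat{p}_{y})}{n_y} + \frac{\hat{p}_{o}(1-\hat{p}_{o})}{n_o}} .
\end{align}

With $\hat{p}_{y} = 17/52$ and $\hat{p}_{o} = 19/148$, this evaluates to about $0.0706$, which matches the value of `se` used in the calculation below.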

### Calculating the Statistics

In [15]:
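py_hat = 17/52    # young owners: 17 of 52 replaced the clutch
po_hat = 19/148   # older owners: 36 - 17 = 19 of 200 - 52 = 148
diff_hat = py_hat - po_hat
cat('py_hat - po_hat:', round(diff_hat, digits=4))
# unpooled standard error of the difference (approximately 0.0706)
se = sqrt(py_hat*(1-py_hat)/52 + po_hat*(1-po_hat)/148)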

py_hat - po_hat: 0.1985

### CI for the Difference in Proportions

In [18]:
# t-critical value
t_star = qt(.025, n-2, lower.tail=FALSE)
cat('t_star:', round(t_star, digits=2), '\n')

ci_l = (diff_hat - se*t_star)
ci_u = (diff_hat + se*t_star)
cat('(', round(ci_l, digits=4), ',', round(ci_u, digits=4), ')')

t_star: 1.97 
( 0.0593 , 0.3378 )
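
As a cross-check, base R's `prop.test` can produce a similar interval. This is a sketch, not part of the original analysis: with `correct = FALSE` it builds the interval from the unpooled standard error but uses a normal critical value instead of $t_{n-2}$, so its endpoints should come out slightly narrower than the interval above.

```r
# Counts of clutch replacements and group sizes (young, old), from the example.
replaced <- c(17, 19)
owners   <- c(52, 148)

# Two-sample comparison of proportions without continuity correction.
res <- prop.test(replaced, owners, conf.level = 0.95, correct = FALSE)

print(res$estimate)   # sample proportions for the two groups
print(res$conf.int)   # interval for p_y - p_o (normal critical value)
```

Both intervals exclude zero, so either version supports the conclusion that younger owners replace the clutch at a higher rate.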