# Two Sample Test of Proportions

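For large samples, the difference between the two sample proportions, $\hat{p}_1 - \hat{p}_2$, is approximately normally distributed, with mean $\pi_1 - \pi_2$ and variance equal to the sum of the two binomial variances:

$$\hat{p}_1 - \hat{p}_2 \sim N\Big(\pi_1 - \pi_2, \frac{\pi_1(1-\pi_1)}{n_1} + \frac{\pi_2(1-\pi_2)}{n_2}\Big)$$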
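
As a quick sanity check of the normal approximation above, the sketch below simulates many pairs of samples and compares the empirical mean and variance of the difference in sample proportions with the formula. The true proportions `pi1` and `pi2`, the seed, the number of replications, and all variable names are illustrative assumptions, not values taken from the example.

```r
set.seed(1)                      # arbitrary seed, for reproducibility only
pi1 <- 0.35; n1_sim <- 222       # hypothetical true proportion, group 1
pi2 <- 0.39; n2_sim <- 121       # hypothetical true proportion, group 2
reps <- 10000                    # number of simulated sample pairs

# Difference of simulated sample proportions, one value per replication
d <- rbinom(reps, n1_sim, pi1) / n1_sim - rbinom(reps, n2_sim, pi2) / n2_sim

# Mean and variance given by the normal approximation
theory_mean <- pi1 - pi2
theory_var  <- pi1 * (1 - pi1) / n1_sim + pi2 * (1 - pi2) / n2_sim

cat("mean: simulated", mean(d), "theory", theory_mean, "\n")
cat("var:  simulated", var(d),  "theory", theory_var,  "\n")
```

With sample sizes of this order, the simulated and theoretical values should agree closely; a histogram of `d` (for example `hist(d)`) gives a visual check of the bell shape.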

Example:

We collect a sample of $n_1 = 222$ women and determine that 78 of them have pre-diabetes. We collect a sample of $n_2 = 121$ men and determine that 47 of them have pre-diabetes. A 99% confidence interval (CI) for the difference between the proportions of women and men who have pre-diabetes is constructed below.


```
p1 = 78 / 222 = 0.3514 ~ 0.35
p2 = 47 / 121 = 0.3884 ~ 0.39
```

In [20]:
p1 = 78 / 222
n1 = 222
p2 = 47 / 121
n2 = 121
# Pooled proportion, equal to (78 + 47) / (222 + 121)
p  = (n1 * p1 + n2 * p2) / (n1 + n2)
cat("",
    "p1:    ", p1, "\n",
    "n1:    ", n1, "\n",
    "p2:    ", p2, "\n",
    "n2:    ", n2, "\n",
    "pooled:", p)

 p1:     0.3513514 
 n1:     222 
 p2:     0.3884298 
 n2:     121 
 pooled: 0.3644315

$$(0.3514 - 0.3884) \pm 2.575 \times \sqrt{\frac{0.35 (1 - 0.35)}{222} + \frac{0.39 (1 - 0.39)}{121}}$$
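
The interval above can be evaluated directly in R. This is a minimal sketch that uses the unrounded sample proportions and `qnorm(0.995)` in place of the tabulated 2.575, so the result may differ slightly from a hand calculation; the names `se_unpooled`, `z_crit`, and `ci99` are introduced here for illustration.

```r
p1 <- 78 / 222; n1 <- 222
p2 <- 47 / 121; n2 <- 121

# The CI uses each sample's own proportion in the standard error (not pooled)
se_unpooled <- sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)

# Two-sided 99% critical value of the standard normal, about 2.576
z_crit <- qnorm(1 - 0.01 / 2)

ci99 <- (p1 - p2) + c(-1, 1) * z_crit * se_unpooled
cat("99% CI:", ci99, "\n")
```

The interval runs from about -0.18 to 0.10. Because it contains 0, the data are consistent with no difference between the two proportions at the 1% level.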

In [21]:
# Under H0 (pi1 = pi2) both variance terms use the pooled proportion p
sp2 = p * (1 - p) / n1 + p * (1 - p) / n2
sp  = sp2^0.5
cat("", 
    "pooled var", sp2, "\n",
    "pooled std", sp)

 pooled var 0.002957563 
 pooled std 0.05438348

In [30]:
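# Test statistic; it is compared with the standard normal, so it is a z statistic
t_stat = (p1 - p2) / sp
# Its square is the chi-squared statistic (df = 1) that prop.test reports
x_sq = t_stat^2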
cat("", 
    "t statistics:", t_stat, "\n", 
    "X-squared:    ", x_sq)

 t statistics: -0.6817953 
 X-squared:     0.4648448

In [31]:
# Two-sided p-value; -abs() keeps this correct when the statistic is positive
2 * pnorm(-abs(t_stat))

# Built-in Function: [prop.test](https://www.rdocumentation.org/packages/stats/versions/3.4.3/topics/prop.test)

In [34]:
prop.test(
    x = c(78, 47), 
    n = c(222, 121), 
    alternative = "two.sided",
    conf.level = 0.95, 
    correct = FALSE)


	2-sample test for equality of proportions without continuity
	correction

data:  c(78, 47) out of c(222, 121)
X-squared = 0.46484, df = 1, p-value = 0.4954
alternative hypothesis: two.sided
95 percent confidence interval:
 -0.14424799  0.07009119
sample estimates:
   prop 1    prop 2 
0.3513514 0.3884298 
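
The call above uses `conf.level = 0.95`, whereas the worked example asks for a 99% interval. As a sketch, the same call with `conf.level = 0.99` returns that interval in the `conf.int` component of the result; `res99` is a name introduced here for illustration.

```r
res99 <- prop.test(
    x = c(78, 47),
    n = c(222, 121),
    alternative = "two.sided",
    conf.level = 0.99,
    correct = FALSE)

res99$conf.int   # 99% CI for prop 1 - prop 2
res99$p.value    # the p-value does not depend on conf.level
```

Only the confidence interval widens when `conf.level` increases; the test statistic and p-value stay the same.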


In [35]:
prop.test(
    x = c(78, 47), 
    n = c(222, 121), 
    alternative = "less",
    conf.level = 0.95, 
    correct = FALSE)


	2-sample test for equality of proportions without continuity
	correction

data:  c(78, 47) out of c(222, 121)
X-squared = 0.46484, df = 1, p-value = 0.2477
alternative hypothesis: less
95 percent confidence interval:
 -1.00000000  0.05286115
sample estimates:
   prop 1    prop 2 
0.3513514 0.3884298 


In [37]:
prop.test(
    x = c(78, 47), 
    n = c(222, 121), 
    alternative = "greater",
    conf.level = 0.95, 
    correct = FALSE)


	2-sample test for equality of proportions without continuity
	correction

data:  c(78, 47) out of c(222, 121)
X-squared = 0.46484, df = 1, p-value = 0.7523
alternative hypothesis: greater
95 percent confidence interval:
 -0.127018  1.000000
sample estimates:
   prop 1    prop 2 
0.3513514 0.3884298 
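
The one-sided p-values reported above follow from the same pooled z statistic computed by hand earlier: the lower tail for `alternative = "less"` and the upper tail for `alternative = "greater"`. A minimal sketch, recomputing the statistic so the block runs on its own (the names `p_pool`, `z`, `p_less`, `p_greater`, and `p_two` are introduced here):

```r
p1 <- 78 / 222; n1 <- 222
p2 <- 47 / 121; n2 <- 121

# Pooled proportion and z statistic under H0: pi1 = pi2
p_pool <- (78 + 47) / (n1 + n2)
z <- (p1 - p2) / sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))

p_less    <- pnorm(z)                      # H1: pi1 < pi2 (lower tail)
p_greater <- pnorm(z, lower.tail = FALSE)  # H1: pi1 > pi2 (upper tail)
p_two     <- 2 * pnorm(-abs(z))            # H1: pi1 != pi2 (both tails)

cat("less:", p_less, " greater:", p_greater, " two-sided:", p_two, "\n")
```

The two one-sided p-values sum to 1, and the two-sided p-value is twice the smaller of them.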
