# API-201 ABC REVIEW SESSION #8

**Friday, November 4**

# Table of Contents
1. [Lecture Recap](#Lecture-Recap)
2. [Exercise 1 - Applying Statistical Significance](#Exercise1)
3. [Exercise 2 - Factors Associated with Infant Birth Weight](#Exercise2)

# Lecture Recap <a class="anchor" id="Lecture-Recap"></a>

## Statistical power 

* **Type 1 error:** the null hypothesis is true, but the test incorrectly rejects it: **P(Reject|H0 is true)**. 
    * Usually set to 5%. This corresponds to the rule of assuming the null hypothesis is true and rejecting it only if results at least as extreme as those observed would occur by chance less than 5% of the time under the null.
    
* **Type 2 error:** the null hypothesis is false, but the test fails to reject it: **P(Fail to reject|H0 is false)**. 

* Type 2 error is related to power, as **Power = 1 – P(Fail to reject|H0 is false).** Statistical power is the probability of detecting an effect when there is one.
* A study is **under-powered** when it does not have enough power to detect practically significant effects. If a study finds no statistically significant effect on an outcome of interest, then either there is no effect or the study is under-powered.
* Power increases with the size of the effect (in terms of absolute value) and with the sample size. 
    * A larger true effect is further from zero, so the sampling distribution is centered on a value further from the null value. Confidence intervals are therefore less likely to include the null value of zero.
    * When you increase the sample size, the sampling distribution becomes tighter around the true effect. If there is a true effect, the narrower confidence intervals are less likely to include zero, making it more likely that we reject the null.
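
Both relationships can be checked by simulation. The sketch below is a minimal illustration rather than a formal power calculation: the true proportions and sample sizes are hypothetical, the test is assumed to be a two-sided z-test with an unpooled standard error, and `simulate_power` is an illustrative helper name, not a function from any package.

```r
# Estimate power as the share of simulated samples in which a two-sided
# z-test for a difference in proportions rejects H0: p1 - p2 = 0.
# All parameter values below are hypothetical.
set.seed(201)

simulate_power <- function(n, p1, p2, alpha = 0.05, reps = 2000) {
  # Number of "successes" in each group, one draw per replication
  x1 <- rbinom(reps, size = n, prob = p1)
  x2 <- rbinom(reps, size = n, prob = p2)
  p1_hat <- x1 / n
  p2_hat <- x2 / n

  # Unpooled standard error of the difference in sample proportions
  se <- sqrt(p1_hat * (1 - p1_hat) / n + p2_hat * (1 - p2_hat) / n)
  z <- (p1_hat - p2_hat) / se

  # Share of replications that reject the null of no difference
  mean(abs(z) > qnorm(1 - alpha / 2))
}

power_small_n      <- simulate_power(n = 100,  p1 = 0.55, p2 = 0.50)
power_large_n      <- simulate_power(n = 2000, p1 = 0.55, p2 = 0.50)
power_large_effect <- simulate_power(n = 100,  p1 = 0.70, p2 = 0.50)

c(small_n = power_small_n,
  large_n = power_large_n,
  large_effect = power_large_effect)
```

Holding the true difference fixed, the larger sample should reject far more often; holding the sample size fixed, so should the larger true effect.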



### Power calculations in R

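We can use the `pwr` package in R to perform power calculations. To calculate the power of a test of the difference between two proportions with equal sample sizes, we can use the function `pwr.2p.test`. If we provide the sample size in each group `n` and the effect size `h`, we can calculate the power: the probability that a random sample will lead us to reject the null. Note that `h` is Cohen's effect size for proportions (a difference of arcsine-transformed proportions), not the raw difference in proportions; the `pwr` function `ES.h(p1, p2)` converts two proportions into `h`.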

In [0]:
library(pwr)

# What is my power given effect size of 0.1 and sample size of 1000?
pwr.2p.test(h = 0.1, n = 1000)
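
Because `h` is on the arcsine-transformed scale, it can help to see how two proportions map to it. The sketch below uses hypothetical proportions (60% versus 50%) and computes `h` and an approximate power in base R using the normal approximation; this is intended to mirror what `pwr.2p.test` reports for a two-sided test, but treat that as an assumption and compare against the package output when it is available.

```r
# Hypothetical proportions: 60% in one group versus 50% in the other
p1 <- 0.60
p2 <- 0.50
n  <- 1000  # observations per group

# Cohen's h: difference of arcsine-transformed proportions
h <- 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))

# Normal-approximation power of a two-sided test at the 5% level
z_crit <- qnorm(1 - 0.05 / 2)
power_approx <- pnorm(h * sqrt(n / 2) - z_crit) +
  pnorm(-h * sqrt(n / 2) - z_crit)

h
power_approx

# If the pwr package is installed, print its result for comparison
if (requireNamespace("pwr", quietly = TRUE)) {
  print(pwr::pwr.2p.test(h = h, n = n))
}
```

Here a difference of 10 percentage points corresponds to an `h` of about 0.2, so `h = 0.1` represents a smaller gap than 10 percentage points when the proportions are near 50%.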

We can use the same function to calculate the sample size needed to reach a given power for a given true effect size.

In [0]:
# What sample size (per group) do I need to detect a significant difference in 80% of random samples 
# if the true effect size is h = 0.1?
pwr.2p.test(h = 0.1, power = 0.80)

Lastly, we can calculate the smallest true effect size that we could detect with a given power and sample size.

In [0]:
# For what true effect size h will I have power of 0.8 given a sample size of 1000 per group?
pwr.2p.test(n = 1000, power = 0.80)

**When would each of these calculations be helpful?**

Note that the `pwr` package also has functions for calculating the power of many other tests, including a difference of proportions with unequal sample sizes (`pwr.2p2n.test`).

# Exercise 1: Applying Statistical Significance <a class="anchor" id="Exercise1"></a>

## Example 1
**Despite following the same protocols, replications of published experiments frequently find effects of smaller magnitude or opposite sign than those in the initial studies. Provide one explanation for this fact.**



## Example 2
**When asked about health insurance coverage during the past 6 months, some people report whether they used health services in the last 6 months instead. Is this an example of measurement error? If so, do you think it is a systematic or random error?**



## Example 3 (inspired by https://www.scribbr.com/research-bias/attrition-bias/):
**You want to investigate whether an educational program can reduce the risk of drug abuse for college students. You have a treatment and control group, and will complete three waves of data collection to compare outcomes: a pre-test survey, one survey during the program, and a post-test survey. Many students drop out over time, leading to a smaller sample at each point in time. We call this phenomenon *attrition*.**


**1. You find no statistically significant differences between students who leave and those who stay while checking your study data. Is attrition a concern in this case?**


**2. According to the data, participants who leave report significantly higher levels of drug use than participants who stay. Is attrition a concern in this case?**



## Example 4
**You suspect that female HKS students are more likely to have dogs than male HKS students. However, a fellow student tells you they randomly surveyed 30 male students and 30 female students and calculated a p-value of 0.27 for the difference of proportions, so they say there is no difference. You obtain the data from your peer's study and find that 20% of female students have dogs and 10% of male students have dogs. The 95% confidence interval for the difference of proportions is [-0.08, 0.28]. Who is right?**
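
The p-value and confidence interval quoted in the question can be reproduced from the sample proportions. The sketch below assumes a two-sided z-test with an unpooled standard error; the peer's exact method is not stated, so this is one plausible reconstruction, and the variable names are illustrative.

```r
# Sample sizes and sample proportions given in the question
n_f <- 30
n_m <- 30
p_f <- 0.20
p_m <- 0.10

# Difference in sample proportions and its (unpooled) standard error
diff_hat <- p_f - p_m
se <- sqrt(p_f * (1 - p_f) / n_f + p_m * (1 - p_m) / n_m)

# 95% confidence interval for the difference in proportions
ci <- diff_hat + c(-1, 1) * qnorm(0.975) * se

# Two-sided p-value for H0: no difference
z <- diff_hat / se
p_value <- 2 * pnorm(-abs(z))

round(ci, 2)       # -0.08  0.28
round(p_value, 2)  # 0.27
```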

## Example 5
**A growing literature finds evidence of *racial concordance* in medicine, wherein black patients have better health outcomes if they have black doctors rather than doctors of another race. You are interested in whether black women are more likely to have a black doctor than black men, so you sample 500 black women and 500 black men. You present the following bar graph of the proportions with 95% confidence intervals.**

![](concordance_plot.png)

**You test the null hypothesis $H_0: p_1 - p_2 = 0$ and obtain a p-value of 0.026; therefore, you reject the null hypothesis and state that there is a statistically significant difference between black men and women. Your classmate says that cannot be true because the 95% confidence intervals overlap. Who is right?**

## Example 6
**You are put in charge of a new initiative to help women pay their medical expenses. Your organization is committed to addressing disparities between men and women, so you want your resources to go to groups with the largest difference in out-of-pocket medical costs between men and women. You compare estimates from two sources:**

1. A random sample of 400 men and 400 women ages 18-64. In the sample, women pay $500 more than men on health care on average each year. The 95% confidence interval is [-100, 1100].

2. A large administrative dataset on the health spending of Americans ages 65+. In the sample, women pay $50 more than men on average each year. The 95% confidence interval is [40, 60].

**Based on these results, would you direct funds to women under 65 or over 65?**



## Example 7
**After a long and storied career, you find yourself on an FDA panel that will decide whether the newest migraine drug will be approved. At question is a study that found no significant difference in migraine symptoms between two groups that randomly received either the drug or a placebo. The study *did find* a significant improvement in symptoms among women ages 18-29 who received the drug relative to those who received the placebo, but there was no improvement in any other gender or age group. The company developing the drug is seeking approval for use only by women ages 18-29. If there are no known safety concerns with the drug, would you approve it?**



# Exercise 2: Factors Associated with Infant Birth Weight <a class="anchor" id="Exercise2"></a>

In [0]:
# Load MASS before tidyverse so that dplyr::select() is not masked by MASS::select()
library(MASS)
library(tidyverse)
data(birthwt)

For this exercise, we will identify factors associated with low infant birth weight using data on 189 births collected at the Baystate Medical Centre, Springfield, Massachusetts during 1986. The variables of interest are `bwt` (birth weight) and the binary variable `low` (low birth weight).

## Data Dictionary

* `low`: indicator of birth weight less than 2,500 grams.
* `age`: mother's age in years.
* `lwt`: mother's weight in pounds at last menstrual period.
* `race`: mother's race (1 = white, 2 = black, 3 = other).
* `smoke`: smoking status during pregnancy.
* `ptl`: number of previous premature labours.
* `ht`: history of hypertension.
* `ui`: presence of uterine irritability.
* `ftv`: number of physician visits during the first trimester.
* `bwt`: birth weight in grams.

**1. Examine the first 10 rows of the data. What are some relationships that you would be interested in exploring?**

In [0]:
# Show the first 10 rows (head() returns only 6 by default)
head(birthwt, 10)

**2. A baby is considered to have a low birth weight when they weigh less than 2,500 grams. Using the variable `low`, calculate the number of observations and the proportion of kids with low birth weight by mother's smoking behavior.**


In [0]:
# Your answer here!



**3. Denote $\hat{p}_{NS}$ as the sample proportion of kids with low birth weight in the non-smoking group and $\hat{p}_S$ for the smoking group. Using your results from (2), calculate the difference in proportions and the standard error of $\hat{p}_{NS} - \hat{p}_S$.**

In [0]:
# Your answer here!



**4. What is the 95% confidence interval of $p_{NS} - p_S$?**

In [0]:
# Your answer here!



**5. What is the Z-score corresponding to the null hypothesis $p_{NS} - p_{S} = 0$?**

In [0]:
# Your answer here!



**6. What is the p-value corresponding to the Z-score? Is the difference in proportions statistically significant?**

In [0]:
# Your answer here!



**7. Suppose that you are interested in estimating the mean birth weight first for kids whose mother didn't smoke during pregnancy, and then for kids whose mother did. Compute a 95% confidence interval for each one. Report your mean values and confidence intervals below.**

In [0]:
# Your answer here!



**8. Plot your estimates from (7) and add error bars for the 95% confidence interval. Is the difference in means statistically significant?**

In [0]:
# Your answer here!



The two individual confidence intervals overlap, so it could go either way: overlapping intervals do not by themselves tell us whether the difference between the two parameters is statistically significant. To be sure, we need a more careful analysis, namely constructing the confidence interval for the difference in means.
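
One way to build that interval in R is with `t.test`. The sketch below is one possible approach rather than the required solution: it assumes a Welch two-sample t-test (the `t.test` default, which does not assume equal variances) and groups births by the `smoke` variable in `birthwt`.

```r
library(MASS)  # provides the birthwt data
data(birthwt)

# Welch two-sample t-test of birth weight by smoking status.
# With a 0/1 grouping variable, the difference is group 0 minus group 1,
# i.e. non-smokers minus smokers.
fit <- t.test(bwt ~ smoke, data = birthwt)

fit$estimate  # mean birth weight in each group
fit$conf.int  # 95% confidence interval for the difference in means
fit$p.value   # two-sided p-value for H0: no difference in means
```

If the interval for the difference excludes zero, the difference in means is statistically significant at the 5% level, even when the two individual intervals overlap.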

**9. Is there a positive association between mother's weight at last menstrual period and infant birth weight? Plot both variables and calculate the correlation coefficient.**

In [0]:
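# Scatter plot of mother's weight (lwt) against infant birth weight (bwt)
ggplot(birthwt) +
  geom_point(aes(x = lwt, y = bwt))

# Pearson correlation coefficient between the two variables
cor(birthwt$lwt, birthwt$bwt)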
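
To judge whether the association is statistically distinguishable from zero, the correlation can be paired with a confidence interval and p-value. The sketch below uses `cor.test` from base R's `stats` package with its default settings (Pearson correlation, two-sided test); it is offered as an optional extension, not as part of the original exercise.

```r
library(MASS)  # provides the birthwt data
data(birthwt)

# Test H0: the correlation between lwt and bwt is zero
ct <- cor.test(birthwt$lwt, birthwt$bwt)

ct$estimate  # sample correlation coefficient (same value as cor())
ct$conf.int  # 95% confidence interval for the correlation
ct$p.value   # two-sided p-value
```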