In [None]:
# Initialize Otter
import otter
grader = otter.Notebook("lab09.ipynb")

# Lab 9: Confidence Interval Part 2

Welcome to Lab 9! We will continue to investigate one of the major causes of death in the world: cardiovascular disease. The [Python Reference](http://data8.org/su22/python-reference.html) has information that will be useful for this lab.

**Lab Queue**: You can find the Lab Queue at [lab.data8.org](https://lab.data8.org/). Whenever you feel stuck or need some further clarification, add yourself to the queue to get help from a GSI or tutor! If you are in an online lab, please list your name, breakout room number, and purpose on your ticket!

**Grading**: Here are the policies for getting full credit:

1. You can either attend your assigned lab section, make progress substantial enough for your work to be checked off, and submit your lab (even if it is incomplete) by the end of the lab period.

2. OR complete the lab on your own, submit the completed lab by **Wednesday July 27th at 11am**, and pass all autograder tests (100% of tests passed) to receive credit. 

**Submission**: Once you’re finished, run all cells besides the last one, select File > Save Notebook, and then execute the final cell. The result will contain a zip file that you can use to submit on Gradescope.

Let's begin by setting up the tests and imports by running the cell below.

In [1]:
# Run this cell to set up the notebook, but please don't change it.
from datascience import *
import numpy as np

%matplotlib inline
import matplotlib.pyplot as plots
plots.style.use('fivethirtyeight')
np.set_printoptions(legacy='1.13')

import d8error

In the following analysis, we'll examine the effect that hormone replacement therapy has on the risk of coronary heart disease (CHD) for post-menopausal women using data from the Nurses' Health Study.

### The Nurses' Health Study
The Nurses' Health Study (NHS) is another very large observational study which has brought many insights into women's health. It was started in 1976 by Dr. Frank Speizer, with questionnaires mailed to 121,964 female registered nurses in the United States asking about their medical history, cholesterol and blood pressure, current medications, and so on (one of the benefits of studying nurses is their ability to give reliably accurate answers to these questions).

The study's initial focus was on investigating the long-term health effects of oral contraceptives, whose use had become much more widespread in the U.S. during the 1960s, but the focus soon expanded to investigating a wide variety of questions on women's health. The NHS is still ongoing, and is now tracking its third generation of nurses in the US.

**One of the most consequential early findings from the NHS was about hormone replacement therapy (HRT)**: supplementary estrogen and progesterone for post-menopausal women to relieve side effects of declining hormone levels due to menopause. The NHS found that HRT in post-menopausal women was negatively associated with heart attack risk. In other words, higher levels of HRT in post-menopausal women were associated with lower risk of heart attack. In a landmark 1985 paper in the *New England Journal of Medicine* (NEJM), Speizer and his coauthors wrote that **women on HRT are half as likely to suffer a heart attack over a certain time period.** [(Stampfer et al., 1985)](https://www.ncbi.nlm.nih.gov/pubmed/4047106) We'll define the term "relative risk" later in this section, and we'll also investigate the interpretation of these claims and their statistical basis.

**Question 1.1:** Based on the passage above, which of the following statements can you infer about the Nurses' Health Study? Assign `nhs_true_statements` to an array of integer(s) corresponding to the statement(s) you believe are correct.

1. Hormone replacement therapy is most commonly used by young women.
2. Since only nurses were included in the study, there's a chance that confounding factors influence our dataset.
3. The study found that estrogen and progesterone use had an association with CHD rates in post-menopausal women (CHD is a leading cause of heart attacks).
4. The study uses data that was self-reported by nurses for the analysis.

*Note:* If there’s a hash error, the answer is wrong.

<!--
BEGIN QUESTION
name: q1_1
-->

In [2]:
nhs_true_statements = ...
nhs_true_statements

In [None]:
grader.check("q1_1")

**The scientists running the NHS wanted to compare post-menopausal women who had taken HRT with post-menopausal women who had never taken HRT, excluding all women who were not post-menopausal or who had previously suffered a heart attack.** This study design complicates the analysis because it creates a variety of reasons why women might drop in and out of the relevant comparison groups. The researchers sent out surveys in 1976, 1978, and 1980, so they received information at several points in time, and participants might "change groups" between surveys.

If you're interested, read more about the study [here](https://pubmed.ncbi.nlm.nih.gov/4047106/).

Because women could (and did) drop into and out of the comparison groups in the middle of the study, it is difficult to make a table like we usually would, with one row per participant. In medical studies, individuals are typically weighted by the *amount of time* that they were enrolled in the study. A more convenient sampling unit is a **person-month at risk**, which is one month spent by a particular woman in one of the comparison groups, during which she might or might not suffer a heart attack. Here, "at risk" just means the woman is being tracked by the survey in either of the two comparison groups, so that if she had a heart attack it would be counted in our data set.

**Example**: The table below tracks the histories of two hypothetical post-menopausal women in a six-month longitudinal study, who both enter the study in January 1978:
1. Alice has never been on HRT. She has a heart attack in March and is excluded for the remainder of the study period. 
2. Beatrice begins taking HRT for the first time in April and stays healthy throughout the study period.

| Name     | Month    | HRT | Heart Attack   |                                             
|----------|----------|-----|----------------|
| Alice    | Jan 1978 |  0  | 0              |
| Alice    | Feb 1978 |  0  | 0              |
| Alice    | Mar 1978 |  0  | 1              |
| Beatrice | Jan 1978 |  0  | 0              | 
| Beatrice | Feb 1978 |  0  | 0              |
| Beatrice | Mar 1978 |  0  | 0              |
| Beatrice | Apr 1978 |  1  | 0              |
| Beatrice | May 1978 |  1  | 0              |
| Beatrice | Jun 1978 |  1  | 0              |



Incidence refers to the proportion of persons who develop a condition during a particular time period. Since we want to examine the risk of having a heart attack, we can define the **incidence rate of a heart attack** as the probability that a heart attack will happen to a given at-risk person in a given time period. The NHS calculated its effects in terms of the **relative risk**, which is simply the incidence rate for *person-months* in the HRT group (Group A, the treatment group) divided by the incidence rate in the no-HRT group (Group B, the control group).

$$\text{Relative Risk} = \frac{\text{Incidence Rate(Treatment Group)}}{\text{Incidence Rate(Control Group)}}$$
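As a small illustration of these definitions, the sketch below computes both incidence rates and the relative risk for the two hypothetical women in the example table above. It uses plain NumPy arrays rather than a `Table`, and the names `hrt`, `heart_attack`, and `incidence_rate` are made up for this sketch. With only nine person-months the result says nothing about HRT; it only shows how the pieces fit together.

```python
import numpy as np

# One entry per person-month, in the same row order as the example table.
hrt          = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1])
heart_attack = np.array([0, 0, 1, 0, 0, 0, 0, 0, 0])

def incidence_rate(events, in_group):
    """Proportion of a group's person-months in which a heart attack occurred."""
    return events[in_group].sum() / in_group.sum()

treatment_rate = incidence_rate(heart_attack, hrt == 1)  # 0 events in 3 person-months
control_rate = incidence_rate(heart_attack, hrt == 0)    # 1 event in 6 person-months
toy_relative_risk = treatment_rate / control_rate

print(treatment_rate, control_rate, toy_relative_risk)
```

Note that Beatrice contributes person-months to both groups: three to the no-HRT group and three to the HRT group.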


**Question 1.2:** Complete the following statements by setting each variable name to the string that correctly fills in the corresponding blank.

If the incidence rate of the treatment group is greater than the incidence rate of the control group, the relative risk will be \_\_`blank_1a`\_\_ one. This means that individuals in the treatment group are at \_\_`blank_1b`\_\_ risk of having a heart attack compared to those in the control group.

If the incidence rate of the treatment group is less than the incidence rate of the control group, the relative risk will be \_\_`blank_2a`\_\_ one. This means that individuals in the treatment group are at \_\_`blank_2b`\_\_ risk of having a heart attack compared to those in the control group.

If the incidence rate of the treatment group is equal to the incidence rate of the control group, the relative risk will be \_\_`blank_3a`\_\_ one. This means that individuals in the treatment group are at \_\_`blank_3b`\_\_ risk of having a heart attack compared to those in the control group.

`blank_1a`, `blank_2a`, `blank_3a` should be set to one of the following strings: "less than", "equal to", or "greater than"

`blank_1b`, `blank_2b`, `blank_3b` should be set to one of the following strings: "lower", "equal", or "higher" 

<!--
BEGIN QUESTION
name: q1_2
-->

In [4]:
blank_1a = ...
blank_1b = ...
blank_2a = ...
blank_2b = ...
blank_3a = ...
blank_3b = ...

In [None]:
grader.check("q1_2")

Most statistical methods that deal with this type of data assume that we can treat a table like the one above as though it is a sample of independent random draws from a much larger population of person-months at risk in each group. **We will take this assumption for granted throughout the rest of this section.**

Instead of *person-months* at risk, the NHS used *person-years* at risk. It reported 51,478 total person-years at risk in the no-HRT group with 60 heart attacks occurring in total, as well as 54,309 person-years at risk in the HRT group with 30 heart attacks occurring in total. The `NHS` table below has one row for each person-year at risk. The two columns are `HRT`, recording whether it came from the HRT group (1) or no-HRT group (0), and `Heart Attack`, recording whether the participant had a heart attack that year (1 for yes, 0 for no).
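Before working with the full table, it can help to see what the reported totals imply on their own. The following is a quick arithmetic sketch using only the four counts quoted above; the variable names are made up for illustration.

```python
# Counts reported in the text above.
no_hrt_person_years, no_hrt_heart_attacks = 51478, 60
hrt_person_years, hrt_heart_attacks = 54309, 30

# Incidence rate = heart attacks per person-year at risk.
no_hrt_rate = no_hrt_heart_attacks / no_hrt_person_years
hrt_rate = hrt_heart_attacks / hrt_person_years

reported_rr = hrt_rate / no_hrt_rate
print(no_hrt_rate, hrt_rate, reported_rr)
```

The ratio comes out a little under 0.5, because the HRT group had half as many heart attacks over a slightly larger number of person-years.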

In [7]:
NHS = Table.read_table('NHS.csv')
NHS.show(5)

Using the NHS data, we can now conduct a hypothesis test to investigate the relationship between HRT and risk of CHD. As a reminder, the **incidence rate** for a group is the proportion of that group's person-years at risk in which a heart attack occurred.

We'll use the following null and alternative hypotheses and test statistic:

**Null Hypothesis:** HRT *does not* affect the risk of CHD, and the true relative risk is equal to 1. Any deviation is due to random chance.

**Alternative Hypothesis:** HRT *decreases* the risk of CHD, and the true relative risk is less than 1.

**Test Statistic:** Relative risk of CHD between post-menopausal women receiving HRT and post-menopausal women not receiving HRT (the definition of relative risk is repeated here for your convenience):

$$\text{Relative Risk} = \frac{\text{Incidence Rate(Treatment Group)}}{\text{Incidence Rate(Control Group)}}$$

**Note:** Remember that we assume, under the null, that the two populations are derived from the same much larger population—under this assumption $\text{Incidence Rate(Treatment Group)} = \text{Incidence Rate(Control Group)}$. After simulation, we test this hypothesis by viewing the `relative_risk` for our simulated samples.

**Question 1.3:** Fill in the missing code below to write a function called `relative_risk` that takes in a table with the column labels `HRT` and `Heart Attack`, and computes the sample relative risk as an estimate of the population relative risk. Do *not* round your answer.

<!--
BEGIN QUESTION
name: q1_3
-->

In [8]:
def relative_risk(tbl):
    """Return the ratio of the incidence rates (events per person-year) for the two groups"""
    ...
    
relative_risk(NHS)

In [None]:
grader.check("q1_3")

**Question 1.4:** Fill in the function `one_bootstrap_rr` so that it **generates one bootstrap sample from the original NHS data and computes the relative risk**. Assign `bootstrap_rrs` to 15 estimates of the population relative risk.

*Note:* We are only doing 15 estimates because the code is slow! The cell may take a few seconds to run.
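For intuition about what a bootstrap resample does, here is a minimal sketch on a made-up data set. It uses plain NumPy arrays as a stand-in for a `Table`, and every name and number in it (`toy_hrt`, `toy_attack`, the event counts) is hypothetical, not taken from the NHS data.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only so the sketch is repeatable

# Hypothetical toy sample of 2,000 person-years (NOT the NHS data).
toy_hrt = np.repeat([0, 1], 1000)
toy_attack = np.concatenate([
    np.repeat([1, 0], [40, 960]),  # no-HRT group: 40 heart attacks
    np.repeat([1, 0], [20, 980]),  # HRT group: 20 heart attacks
])

def toy_relative_risk(hrt, attack):
    """Incidence rate in the HRT rows divided by the rate in the no-HRT rows."""
    return attack[hrt == 1].mean() / attack[hrt == 0].mean()

def one_toy_bootstrap_rr():
    # Draw row indices with replacement, keeping the sample size unchanged.
    rows = rng.integers(0, len(toy_hrt), size=len(toy_hrt))
    return toy_relative_risk(toy_hrt[rows], toy_attack[rows])

toy_rrs = np.array([one_toy_bootstrap_rr() for _ in range(200)])
print(toy_relative_risk(toy_hrt, toy_attack), toy_rrs.mean())
```

Each resample has the same number of rows as the original sample, so the spread of `toy_rrs` approximates how much the statistic would vary across samples of that size.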

<!--
BEGIN QUESTION
name: q1_4
-->

In [11]:
def one_bootstrap_rr():
    return ...

bootstrap_rrs = ...
for i in np.arange(...):
    new_bootstrap_rr = ...
    bootstrap_rrs = ...

In [None]:
grader.check("q1_4")

**Question 1.5:** The file `bootstrap_rrs.csv` contains a one-column table with 2001 saved bootstrapped relative risks. Use these bootstrapped values to compute a 95% confidence interval, storing the left endpoint as `ci_left` and the right endpoint as `ci_right`. 

Note that our method isn't exactly the same as the method employed by the study authors to get their confidence interval.
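The percentile idea behind such an interval can be sketched on synthetic numbers. The array `fake_stats` below is randomly generated and merely stands in for a collection of bootstrapped statistics. The sketch uses NumPy's `np.percentile`, which interpolates between values, so it may return slightly different endpoints than the `percentile` function from the `datascience` library.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical stand-in for an array of 2001 bootstrapped statistics.
fake_stats = rng.normal(loc=0.5, scale=0.1, size=2001)

# The middle 95% runs from the 2.5th percentile to the 97.5th percentile.
left, right = np.percentile(fake_stats, [2.5, 97.5])

# Check: the fraction of values inside the interval should be close to 0.95.
inside = np.mean((fake_stats >= left) & (fake_stats <= right))
print(left, right, inside)
```

Leaving 2.5% of the values out on each side is what makes the interval cover the middle 95%.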

<!--
BEGIN QUESTION
name: q1_5
-->

In [16]:
bootstrap_rrs_from_tbl = Table.read_table('bootstrap_rrs.csv').column(0)
ci_left = ...
ci_right = ...

# Please don't change this line.
print("Middle 95% of bootstrapped relative risks: [{:f}, {:f}]".format(ci_left, ci_right))

In [None]:
grader.check("q1_5")

The code below plots the confidence interval on top of the bootstrap histogram.

In [19]:
# Just run this cell
Table().with_column("Relative Risks", bootstrap_rrs_from_tbl).hist()
plots.plot([ci_left, ci_right], [.05,.05], color="gold");

**Question 1.6:** The following text is an excerpt from the abstract of the original 1985 paper. 
> As compared with the risk in women who had never used postmenopausal hormones, the age-adjusted relative risk of coronary disease in those who had ever used them was 0.5 (95 per cent confidence limits, 0.3 and 0.8; P = 0.007)... These data support the hypothesis that the postmenopausal use of estrogen reduces the risk of severe coronary heart disease. [(Stampfer et al., 1985)](https://www.ncbi.nlm.nih.gov/pubmed/4047106)

The authors give a 95% confidence interval of [0.3, 0.8] for the relative risk. Which of the following statements can be justified based on that confidence interval? Assign `ci_statements` to an array of integer(s) corresponding to the statement(s) you believe are correct.

1. There is a 95% chance the relative risk is between 0.3 and 0.8.
2. If we used a P-value cutoff of 5%, we would reject the null hypothesis that HRT does not affect the risk of CHD.
3. If we redo the procedure that generated the interval [0.3, 0.8] on a fresh sample of the same size, there is a 95% chance it will include the true relative risk.
4. There is between a 30% and 80% chance that any woman will suffer a heart attack during the study period.

<!--
BEGIN QUESTION
name: q1_6
-->

In [20]:
ci_statements = ...

In [None]:
grader.check("q1_6")

**Question 1.7:** What can you conclude from this test? Was hormone replacement therapy associated with an increased or decreased risk of heart attacks? Can we say that HRT caused a change in the risk of heart attacks? Explain your reasoning in 2-4 sentences. Discuss with your peers or ask a staff member.

*Hint*: Refer back to Question 1.2 for the definition and interpretations of relative risk.

<!--
BEGIN QUESTION
name: q1_7
-->

_Type your answer here, replacing this text._

Partly as a result of evidence from the NHS and other observational studies that drew similar conclusions, HRT drugs became a very popular preventive treatment for doctors to prescribe to post-menopausal women. Even though there were known or suspected risks to the treatment (such as increasing the risk of invasive breast cancer), it was thought that the reduction in heart disease risk was well worth it.

However, a later study, the [Heart and Estrogen-Progestin Replacement Study](https://jamanetwork.com/journals/jama/fullarticle/187879) (HERS), found that HRT did **not** have a significant impact on a woman's risk of CHD. These findings contradicted the results of the Nurses' Health Study, challenging the efficacy of a treatment that had become the standard of care for heart disease prevention. The HERS authors put forward a possible explanation for why the NHS might be biased:
> The observed association between estrogen therapy and reduced CHD risk might be attributable to selection bias if women who choose to take hormones are healthier and have a more favorable CHD profile than those who do not. Observational studies cannot resolve this uncertainty.

**Selection bias** occurs in observational studies when there is a systematic difference between participants that receive a treatment and participants that do not receive a treatment. When this type of bias is present, the observed treatment effect might be a result of an unmeasured confounding variable.
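A back-of-the-envelope calculation can make this mechanism concrete. Every number in the sketch below is hypothetical and chosen only for illustration. It assumes HRT has no effect at all, and that the two groups differ only in how many of their members were healthy to begin with.

```python
# All numbers are hypothetical, chosen only to illustrate the mechanism.
risk_healthy, risk_unhealthy = 0.001, 0.003  # yearly heart-attack risk by baseline health
share_healthy_hrt = 0.8     # fraction of HRT users who are healthy at baseline
share_healthy_no_hrt = 0.4  # fraction of non-users who are healthy at baseline

def group_rate(share_healthy):
    """Incidence rate of a group when the treatment itself has no effect."""
    return share_healthy * risk_healthy + (1 - share_healthy) * risk_unhealthy

observed_rr = group_rate(share_healthy_hrt) / group_rate(share_healthy_no_hrt)
print(observed_rr)
```

Under these made-up assumptions the observed relative risk differs from 1 even though the treatment does nothing, purely because of who ends up in each group.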

**Question 1.8**: If women who choose to take hormones are healthier to begin with than women who choose not to, why might that systematically bias the results of observational studies like the NHS? Would we expect observational studies to overestimate or underestimate the protective effect of HRT? Discuss with your peers or ask a staff member.

<!--
BEGIN QUESTION
name: q1_8
-->

_Type your answer here, replacing this text._

### Further reading

If you're interested in learning more, you can check out these articles:

* [Origin story of the Framingham Heart Study](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC1449227/)
* [NYT article on the Nurses' Health Study and the HERS study](https://www.nytimes.com/2003/04/22/science/hormone-studies-what-went-wrong.html)

You're done with Lab 9!

**Important submission information:** Be sure to run all the tests and verify that they all pass, then choose **Save** from the **File** menu, then **run the final cell** and click the link to download the zip file. Then, go to [Gradescope](https://www.gradescope.com/courses/397747) and submit the zip file to the corresponding assignment. The name of this assignment is "Lab 09 Autograder". **It is your responsibility to make sure your work is saved before running the last cell.**

---

To double-check your work, the cell below will rerun all of the autograder tests.

In [None]:
grader.check_all()

## Submission

Make sure you have run all cells in your notebook in order before running the cell below, so that all images/graphs appear in the output. The cell below will generate a zip file for you to submit. **Please save before exporting!**

In [None]:
# Save your notebook first, then run this cell to export your submission.
grader.export(pdf=False)