# Analysis I

In [1]:
import sys
sys.path.append('./cibin_folder')
from cibin import *
from onesided import *
from sterne import *

## Regeneron Data

In [2]:
# Regeneron data from 
# https://investor.regeneron.com/news-releases/news-release-details/phase-3-prevention-trial-showed-81-reduced-risk-symptomatic-sars
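n = 753          # subjects in the treatment group
m = 752          # subjects in the placebo group
N = n + m        # total number of subjects
n01 = 59         # placebo subjects with symptomatic infection
n11 = 11         # treated subjects with symptomatic infection
n00 = m - n01    # placebo subjects without symptomatic infection
n10 = n - n11    # treated subjects without symptomatic infection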
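
As a quick sanity check on these counts, the sketch below computes the difference in observed infection rates and the relative risk reduction. The variable names `tau_hat` and `relative_reduction` are illustrative and are not part of the `cibin` code; the relative reduction is included only to connect the counts to the 81% figure in the news release title.

```python
# Counts copied from the data cell above.
n, m = 753, 752      # treatment and placebo group sizes
n11, n01 = 11, 59    # symptomatic infections in treatment and placebo groups
N = n + m

# Difference in observed infection rates (treatment minus placebo).
tau_hat = n11 / n - n01 / m

# Relative risk reduction: 1 - (treatment rate / placebo rate).
relative_reduction = 1 - (n11 / n) / (n01 / m)

print(f"tau_hat = {tau_hat:.4f}")                        # about -0.0638
print(f"N * tau_hat = {N * tau_hat:.1f}")                # about -96.1
print(f"relative reduction = {relative_reduction:.1%}")  # about 81.4%
```

A negative value means fewer infections under treatment, which is the scale on which the confidence intervals below are reported.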

## Find CIs: Method 3

Find 95% lower (one-sided) and two-sided confidence intervals for the reduction in risk corresponding to the primary endpoint (data “Through day 29”), using method 3.

### One-sided

In [4]:
tau_lower_oneside(n11, n10, n01, n00, 0.05, 100)

{'tau_lower': -136.0,
 'tau_upper': 704.0,
 'N_accept': array([ 70., 398., 534., 503.])}

### Two-sided

In [3]:
tau_twosided_ci(n11, n10, n01, n00, 0.05, exact=False, reps=100)

([-149.0, -44.0],
 [array([ 69., 361., 405., 670.]), array([ 66., 198., 347., 894.])],
 [1097786, 100])

## Find CIs: Bonferroni approach

Also use the cruder conservative approach via simultaneous Bonferroni confidence bounds for $N_{\cdot 1}$ and $N_{1 \cdot}$ described in the notes on causal inference. (For the Bonferroni approach to two-sided intervals, use Sterne’s method for the underlying hypergeometric confidence intervals. Feel free to re-use your own code from the previous problem set.)

### One-sided

#### Lower

To construct a lower 1-sided $1 - \alpha$ confidence bound for $\tau$, we can find a lower 1-sided $1 - \alpha / 2$ confidence bound for $N_{+1}$, subtract an upper 1-sided $1 - \alpha / 2$ confidence bound for $N_{1+}$, and divide the result by $N$. ([Causal Inference Notes](https://ucb-stat-159-s21.github.io/site/Notes/causal-inference.html))

In [3]:
# alpha = 0.05
# 1- alpha/2 = 0.975
N_plusone = hypergeom_conf_interval(n, n11, N, 0.975, alternative="lower")
N_oneplus = hypergeom_conf_interval(m, n01, N, 0.975, alternative="upper")

# lower one-sided 1 - alpha confidence bound for tau
lower_one = (N_plusone[0] - N_oneplus[1]) / N

In [13]:
lower_one

-0.08571428571428572
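
The cell above relies on `hypergeom_conf_interval` from the imported modules. To illustrate what a one-sided hypergeometric confidence bound computes, here is a minimal, standard-library-only sketch that inverts one-sided hypergeometric tests by brute force. The helper names (`tail_ge`, `tail_le`, `lower_bound`, `upper_bound`, `bonferroni_lower_tau`) are hypothetical, and the acceptance convention (keep $G$ when the tail probability exceeds $\alpha$) is an assumption; the imported implementation may differ in detail.

```python
from math import comb


def tail_ge(x, n, G, N):
    """P(X >= x), where X is the number of 1s in a simple random
    sample of size n from a population of N items containing G 1s."""
    total = comb(N, n)
    return sum(comb(G, k) * comb(N - G, n - k) for k in range(x, n + 1)) / total


def tail_le(x, n, G, N):
    """P(X <= x) for the same hypergeometric variable."""
    total = comb(N, n)
    return sum(comb(G, k) * comb(N - G, n - k) for k in range(0, x + 1)) / total


def lower_bound(x, n, N, cl):
    """Smallest G not rejected by the one-sided test at level 1 - cl."""
    alpha = 1 - cl
    for G in range(x, N - (n - x) + 1):
        if tail_ge(x, n, G, N) > alpha:
            return G
    return N - (n - x)


def upper_bound(x, n, N, cl):
    """Largest G not rejected by the one-sided test at level 1 - cl."""
    alpha = 1 - cl
    for G in range(N - (n - x), x - 1, -1):
        if tail_le(x, n, G, N) > alpha:
            return G
    return x


def bonferroni_lower_tau(n11, n, n01, m, alpha=0.05):
    """Lower 1 - alpha bound for tau: each count gets level 1 - alpha / 2."""
    N = n + m
    cl = 1 - alpha / 2
    return (lower_bound(n11, n, N, cl) - upper_bound(n01, m, N, cl)) / N


# Toy example: 5 of 5 treated and 0 of 5 controls have outcome 1.
print(bonferroni_lower_tau(5, 5, 0, 5, alpha=0.05))
```

In the toy example the lower bound for the treatment count is 7 and the upper bound for the control count is 3, so the lower bound for $\tau$ is $(7 - 3) / 10 = 0.4$. The same function can be called with the trial counts, although the brute-force search is slower than a dedicated implementation.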

#### Upper

To construct an upper 1-sided $1 - \alpha$ confidence bound for $\tau$, we can find an upper 1-sided $1 - \alpha / 2$ confidence bound for $N_{+1}$, subtract a lower 1-sided $1 - \alpha / 2$ confidence bound for $N_{1+}$, and divide the result by $N$. ([Causal Inference Notes](https://ucb-stat-159-s21.github.io/site/Notes/causal-inference.html))

In [5]:
# alpha = 0.05
# 1- alpha/2 = 0.975
N_plusone = hypergeom_conf_interval(n, n11, N, 0.975, alternative="upper")
N_oneplus = hypergeom_conf_interval(m, n01, N, 0.975, alternative="lower")

# upper one-sided 1 - alpha confidence bound for tau
upper_one = (N_plusone[1] - N_oneplus[0]) / N

In [12]:
upper_one

-0.04318936877076412

### Two-sided

To construct a 2-sided confidence interval for $\tau$, we can find a 2-sided $1 - \alpha / 2$ confidence bound for $N_{+1}$ and a 2-sided $1 - \alpha / 2$ confidence bound for $N_{1+}$. The lower endpoint of the $1 - \alpha$ confidence interval for $\tau$ is the lower endpoint of the 2-sided interval for $N_{+1}$ minus the upper endpoint of the 2-sided interval for $N_{1+}$, divided by $N$. The upper endpoint of the $1 - \alpha$ confidence interval for $\tau$ is the upper endpoint of the 2-sided interval for $N_{+1}$ minus the lower endpoint of the 2-sided interval for $N_{1+}$, divided by $N$. ([Causal Inference Notes](https://ucb-stat-159-s21.github.io/site/Notes/causal-inference.html))

In [7]:
N_plusone = hypergeom_conf_interval(n, n11, N, 1-0.05, alternative="two-sided")
N_oneplus = hypergeom_conf_interval(m, n01, N, 1-0.05, alternative="two-sided")

# two-sided CI
lower_two = (N_plusone[0] - N_oneplus[1]) / N
upper_two = (N_plusone[1] - N_oneplus[0]) / N

In [11]:
lower_two*N, upper_two*N

(-129.0, -65.0)
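
The endpoints above are on the count scale ($N\tau$). To compare them with the one-sided bounds, which are on the $\tau$ scale, it can help to divide by $N$. The sketch below does this for both two-sided intervals; it assumes, as the multiplication by $N$ in the previous cell suggests, that the method 3 output is also reported in units of $N\tau$. The dictionary names are illustrative only.

```python
N = 753 + 752  # total number of subjects, as in the data cell

# Two-sided endpoints copied from the outputs above (count scale).
count_intervals = {
    "method 3": (-149.0, -44.0),
    "Bonferroni": (-129.0, -65.0),
}

# Rescale each endpoint to the tau scale by dividing by N.
tau_intervals = {
    name: (lo / N, hi / N) for name, (lo, hi) in count_intervals.items()
}

for name, (lo, hi) in tau_intervals.items():
    print(f"{name}: [{lo:.4f}, {hi:.4f}]")
# method 3: [-0.0990, -0.0292]
# Bonferroni: [-0.0857, -0.0432]
```

On the $\tau$ scale, the Bonferroni endpoints match the one-sided values `lower_one` and `upper_one` reported earlier.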

### Discuss the differences between the two sets of confidence intervals.

The Bonferroni method is said to be "conceptually simple, conservative, and only requires the ability to compute confidence intervals for $G$ for hypergeometric distributions" ([Causal Inference Notes](https://ucb-stat-159-s21.github.io/site/Notes/causal-inference.html)). Because it is conservative, it can produce intervals that are unnecessarily wide. Method 3 from Li and Ding looks at all of the potential outcome tables, so it requires more memory and time to run; the Bonferroni method is much faster. Still, both methods produced confidence intervals that lie entirely below zero: on the count scale, the two-sided results are $[-149, -44]$ for method 3 and $[-129, -65]$ for Bonferroni. Both therefore indicate that the Regeneron treatment has a statistically significant effect in preventing COVID-19 infection.

### Is it statistically legitimate to use one-sided confidence intervals? Why or why not?

It is statistically legitimate to use one-sided confidence intervals, provided the choice of side is made before looking at the data. The approach is more conservative, but it still yields a confidence interval at confidence level $1 - \alpha$.

### Are the two-sided confidence intervals preferable to the one-sided intervals? Why or why not?

The two-sided confidence intervals are preferable to the one-sided intervals. We care both about whether the treatment is effective and about whether it is ineffective. The one-sided lower bound shows the effectiveness of the treatment at best. The one-sided upper bound shows the effectiveness of the treatment at worst. We would prefer all of the information that the two-sided confidence intervals provide.