# Analysis Part I

In [2]:
import numpy as np
from cibin import tau_twosided_ci
from cibin import hypergeom_conf_interval

## One-sided confidence intervals

##### Upper one-sided bounds

In [3]:
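# Regeneron data
n = 753        # subjects assigned to treatment
m = 752        # subjects assigned to control
N = n + m      # total number of subjects
n01 = 59       # control subjects with outcome 1
n11 = 11       # treated subjects with outcome 1
n00 = m - n01  # control subjects with outcome 0
n10 = n - n11  # treated subjects with outcome 0
alpha = 0.05   # overall significance level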
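
As a quick consistency check on these counts, the four cells can be laid out as a 2x2 table and compared with the group sizes. This is a minimal sketch, assuming that `n11` and `n10` count treated subjects and `n01` and `n00` count control subjects (as implied by `n10 = n - n11` and `n00 = m - n01`). The names `table`, `rate_treated`, `rate_control` and `diff_in_rates` are illustrative and not part of `cibin`.

```python
n, m = 753, 752
n11, n01 = 11, 59
n10, n00 = n - n11, m - n01
N = n + m

# Rows: assigned group (treated, control); columns: outcome (1, 0).
# The row/column meaning is an assumption read off the definitions above.
table = [[n11, n10],
         [n01, n00]]

# Row sums must reproduce the group sizes, and all cells must sum to N.
row_sums = [sum(row) for row in table]
assert row_sums == [n, m]
assert sum(row_sums) == N

rate_treated = n11 / n   # observed proportion with outcome 1 among treated
rate_control = n01 / m   # observed proportion with outcome 1 among controls
diff_in_rates = rate_treated - rate_control

print(table)
print(rate_treated, rate_control, diff_in_rates)
```

The difference in observed proportions is only a descriptive summary of the sample; the confidence bounds in the following cells are computed separately.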

In [18]:
# retrieve upper one-sided bounds via simultaneous Bonferroni confidence bounds for N_+1 and N_1+ 
Ndot_1 = hypergeom_conf_interval(n11*N/n, n11, N, 1-alpha/2, alternative="upper")
N1_dot = hypergeom_conf_interval(n01*N/70, n01, N, 1-alpha/2, alternative="lower")
upp = (Ndot_1[1] - N1_dot[0])/N
lower = (Ndot_1[0] - N1_dot[1])/N
ci = [lower, upp]
ci

[-1.0, 0.6737541528239203]

##### Lower one-sided bounds

In [19]:
# retrieve lower one-sided bounds via simultaneous Bonferroni confidence bounds for N_+1 and N_1+
Ndot_1 = hypergeom_conf_interval(n11*N/n, n11, N, 1-alpha/2, alternative="lower")
N1_dot = hypergeom_conf_interval(n01*N/70, n01, N, 1-alpha/2, alternative="upper")
upp = (Ndot_1[1] - N1_dot[0])/N
lower = (Ndot_1[0] - N1_dot[1])/N
ci = [lower, upp]
ci

[0.23255813953488372, 1.0]

## Two-sided confidence intervals

##### Two-sided bounds with Sterne's method

In [20]:
# retrieve two-sided bounds via Sterne's method
Ndot_1 = hypergeom_conf_interval(round(n11*N/n), n11, N, 1-alpha, alternative="two-sided")
N1_dot = hypergeom_conf_interval(round(n01*N/70), n01, N, 1-alpha, alternative="two-sided")
lower = (Ndot_1[0] - N1_dot[1])/N
upp = (Ndot_1[1] - N1_dot[0])/N
ci=[lower,upp]
ci

[0.69375415, 0.2123581]

##### Two-sided bounds with Li and Ding's method 3

In [7]:
# retrieve two-sided bounds via method 3 in Li and Ding
ci = tau_twosided_ci(n11, n10, n01, n00, 0.05, exact=False, reps=1)[0]
ci

[-0.15813, 0.4279]

## Discussion

The two-sided intervals differ in the region they cover. The endpoints of the interval built from hypergeometric bounds with Sterne's method (0.212 and 0.694) are both positive, whereas the interval from Li and Ding's method 3 runs from -0.158 to 0.428: its lower bound is negative, and its upper bound is smaller than the Sterne-based upper bound. Because the method 3 interval contains zero, it does not rule out the absence of a treatment effect. If we rely on method 3, the treatment may not perform as well as the Sterne-based interval suggests, and press reports based on the more favorable interval may overestimate the effect.
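
The comparison can be made explicit with a small helper that reports each interval's endpoints, its width, and whether it contains zero. This is a sketch using the two-sided outputs printed above; `summarize` is a hypothetical name, and the endpoints are sorted first because the Sterne-based output lists its larger value first.

```python
def summarize(ci):
    """Return sorted endpoints, width, and whether the interval covers zero."""
    lo, hi = sorted(ci)
    return {
        "lower": lo,
        "upper": hi,
        "width": hi - lo,
        "contains_zero": lo <= 0 <= hi,
    }


# Two-sided outputs copied from the cells above.
sterne = summarize([0.69375415, 0.2123581])
method3 = summarize([-0.15813, 0.4279])

for name, summary in [("Sterne", sterne), ("method 3", method3)]:
    print(name, summary)
```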

## Legitimate

A one-sided confidence interval is legitimate when only one direction of the result is of interest, which depends on the type of data being analyzed. For a treatment effect, it is not reasonable to restrict attention to one tail of the distribution, because a one-sided interval, whether upper or lower, may fail to contain the outcome. It is also not reasonable for a 95% confidence interval to include extreme values such as 1 and -1, as the one-sided intervals above do (-1 in the first, 1 in the second). Under the null hypothesis, the average effect is zero.

## Preferable

A two-sided confidence interval is preferable, since a one-sided interval may admit extreme outcomes or exclude outcomes that a two-sided interval would accept. For a treatment effect, it is also reasonable to obtain 1-alpha coverage by leaving out both tails of the result, since we cannot guarantee that the treatment will always perform at its worst or at its best.