---
title: "Week 6 - Confidence Intervals"
date: 2024-05-09
date-format: full
author:
    - name:
          given: Pranav Kumar
          family: Mishra
      affiliations:
          - ref: rushsurg
          - ref: rushortho
      corresponding: true
      url: https://drpranavmishra.com
      email: pranav_k_mishra@rush.edu
      orcid: 0000-0001-5219-6269
      role: "Post Doctoral Research Fellow"

execute:
    enabled: false
    echo: true
    output: true
---


## Libraries

In [26]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from IPython.display import display, Markdown, Math, Latex

from scipy import stats
import math

import statsmodels.formula.api as smf
import statsmodels.api as sm
from statsmodels.stats.anova import AnovaRM
from statsmodels.regression.mixed_linear_model import MixedLMResults

import pingouin as pg

## Lecture

Standard error of the mean, where $\sigma$ is the standard deviation and $n$ is the sample size:

$$
SE = \frac{\sigma}{\sqrt{n}}
$$
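As a quick check of the standard error of the mean (standard deviation divided by the square root of the sample size), the sketch below computes it by hand and compares it with `scipy.stats.sem`. The sample values are made up for illustration and are not from the lecture. The sample standard deviation (`ddof=1`) is assumed as the estimate of $\sigma$.

```python
import math

import numpy as np
from scipy import stats

# Hypothetical sample, for illustration only (not lecture data).
sample = np.array([4.1, 5.3, 4.8, 5.9, 5.0, 4.6])

n = sample.size
s = sample.std(ddof=1)         # sample SD, used as an estimate of sigma
se_manual = s / math.sqrt(n)   # SE = s / sqrt(n)
se_scipy = stats.sem(sample)   # scipy.stats.sem uses ddof=1 by default

print(f"manual: {se_manual:.4f}, scipy: {se_scipy:.4f}")
```

Both values should agree, since `stats.sem` applies the same formula.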

Distance from kidney to spine in 10-year-old children

In [24]:
nondiseased = [18, 22, 23, 24, 25, 27, 27, 31, 34]
diseased = [10, 13, 14, 15, 22]

In [34]:
# Two sample (unpaired), equal variances, T-test
pg.ttest(nondiseased, diseased, paired=False, correction=False, alternative="two-sided", confidence=0.95)

| | T | dof | alternative | p-val | CI95% | cohen-d | BF10 | power |
|---|---|---|---|---|---|---|---|---|
| T-test | 4.163097 | 12 | two-sided | 0.001316 | [5.18, 16.55] | 2.322065 | 22.703 | 0.967966 |
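The confidence interval reported by `pg.ttest` can be reproduced by hand, which makes the role of the standard error explicit. This is a minimal sketch assuming the usual pooled-variance formulas for an unpaired, equal-variance t-test. The helper names `mean` and `sum_sq` are introduced here for illustration.

```python
import math

from scipy import stats

nondiseased = [18, 22, 23, 24, 25, 27, 27, 31, 34]
diseased = [10, 13, 14, 15, 22]


def mean(xs):
    return sum(xs) / len(xs)


def sum_sq(xs):
    """Sum of squared deviations from the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)


n1, n2 = len(nondiseased), len(diseased)
dof = n1 + n2 - 2

# Pooled variance and standard error of the difference in means
pooled_var = (sum_sq(nondiseased) + sum_sq(diseased)) / dof
se_diff = math.sqrt(pooled_var * (1 / n1 + 1 / n2))

diff = mean(nondiseased) - mean(diseased)
t_stat = diff / se_diff
t_crit = stats.t.ppf(0.975, dof)  # two-sided 95% critical value

ci_low = diff - t_crit * se_diff
ci_high = diff + t_crit * se_diff
print(f"t = {t_stat:.3f}, dof = {dof}, 95% CI = [{ci_low:.2f}, {ci_high:.2f}]")
```

The result should match the `T`, `dof`, and `CI95%` columns of the pingouin output.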


Paris II Coronary Incidence Data

In [71]:
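# PRA group
p_pra = 141/1563
n_pra = 1563

# PLBO (placebo) group
p_plbo = 185/1565
n_plbo = 1565

alpha = 0.05
t_alpha = 1.96  # z critical value for a 95% CI (normal approximation)

# Standard error of the difference between two independent proportions
s_ppra_pplbo = round(math.sqrt(((p_pra*(1-p_pra))/n_pra)+((p_plbo*(1-p_plbo))/n_plbo)), 4)
s_ppra_pplbo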

0.0109

In [72]:
#95% CI

ci95 = np.round([(p_pra - p_plbo - t_alpha*s_ppra_pplbo), (p_pra - p_plbo + t_alpha*s_ppra_pplbo)], decimals=4)

display(Markdown(f"""
The 95% Confidence Interval is: `{ci95}`

or

`{round((ci95[0] * 100), 4)}%` to `{round((ci95[1] * 100), 4)}%`
"""))


The 95% Confidence Interval is: `[-0.0494 -0.0066]`

or

`-4.94%` to `-0.66%`
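The hard-coded 1.96 is the two-sided 95% critical value of the standard normal distribution. The sketch below derives it from `alpha` with `scipy.stats.norm.ppf` and recomputes the interval without rounding the standard error first. The names `events_pra`, `events_plbo`, and `z_crit` are introduced here for illustration, and treating the counts as events is an assumption.

```python
import math

from scipy import stats

# Counts from the Paris II example
events_pra, n_pra = 141, 1563
events_plbo, n_plbo = 185, 1565

p_pra = events_pra / n_pra
p_plbo = events_plbo / n_plbo

alpha = 0.05
z_crit = stats.norm.ppf(1 - alpha / 2)  # approximately 1.96

# Standard error of the difference between two independent proportions
se_diff = math.sqrt(
    p_pra * (1 - p_pra) / n_pra + p_plbo * (1 - p_plbo) / n_plbo
)

diff = p_pra - p_plbo
ci_low = diff - z_crit * se_diff
ci_high = diff + z_crit * se_diff
print(f"95% CI: [{ci_low:.4f}, {ci_high:.4f}]")
```

To four decimals this agrees with the interval above. Because the interval excludes zero, the difference between the two groups is significant at the 5% level.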


In [62]:
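# With d left unspecified, power_ttest2n solves for the standardized effect size
# (Cohen's d) detectable at 80% power with these two group sizes
pg.power_ttest2n(n_pra, n_plbo, alpha=0.05, power=0.8)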

0.10021517218301977

### Smaller sample sizes - Binomial Distribution

If $np < 5$ or $n(1-p) < 5$, the normal approximation (and the t distribution) is unreliable, so we use the binomial distribution instead.

$p$: probability of success  
$1-p$: probability of failure  

$x$ = number of successes in a sample size $n$



$$
Probability\ (x\ successes\ out\ of\ n)\ =\ {n\choose x}p^{x}(1-p)^{n-x}
$$
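The rule of thumb and the binomial formula can be sketched in a few lines. The helper names `normal_approx_ok` and `binomial_pmf` are hypothetical, and the values of `n` and `p` are illustrative only. The hand-written formula is cross-checked against `scipy.stats.binom.pmf`.

```python
import math

from scipy import stats


def normal_approx_ok(n, p, threshold=5):
    """Rule of thumb: both n*p and n*(1-p) must be at least `threshold`."""
    return n * p >= threshold and n * (1 - p) >= threshold


def binomial_pmf(x, n, p):
    """Probability of exactly x successes in n independent trials."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.3  # illustrative values, not from the lecture

print("normal approximation ok:", normal_approx_ok(n, p))
for x in range(n + 1):
    manual = binomial_pmf(x, n, p)
    library = stats.binom.pmf(x, n, p)
    print(f"x={x:2d}  manual={manual:.4f}  scipy={library:.4f}")
```

Here $np = 3$, so the check fails and the exact binomial probabilities are the appropriate tool.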

#### Phase 2 Cancer Trial

In [5]:
# Phase 2 testing of a new chemotherapy

p = 0.20 # Probability of successful treatment
n = 14
x = 0 # No successes out of 14 patients

beta = 0.05
power = 1-beta # 95%

$$
Pr\ (No\ successes\ out\ of\ 14\ patients)\ =\ {14\choose 0}0.2^{0}(1-0.2)^{14-0}
$$

In [23]:
typeII = round((math.comb(n,x))*(pow(p, x))*(pow(1-p, (n-x))), 4)


display(Markdown(f"""
Type II error: `{typeII}`

Does the study have enough power? (Is Type II < Beta)

`{typeII}` < `{beta}` == `{typeII < beta}`

"""))


Type II error: `0.044`

Does the study have enough power? (Is Type II < Beta)

`0.044` < `0.05` == `True`
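The notes do not say how the sample size of 14 was chosen. One plausible reading, assumed here, is that it is the smallest $n$ for which the probability of seeing no successes, $(1-p)^n$, falls below $\beta$. The sketch solves for that $n$ directly. The names `n_min`, `prob_zero`, and `prob_zero_prev` are introduced for illustration.

```python
import math

p = 0.20     # probability of successful treatment
beta = 0.05  # acceptable Type II error

# Smallest n with (1 - p)**n <= beta, from n >= log(beta) / log(1 - p)
n_min = math.ceil(math.log(beta) / math.log(1 - p))

prob_zero = (1 - p) ** n_min             # P(no successes) with n_min patients
prob_zero_prev = (1 - p) ** (n_min - 1)  # same with one fewer patient

print(f"n_min = {n_min}")
print(f"P(0 successes | n={n_min}) = {prob_zero:.4f}")
print(f"P(0 successes | n={n_min - 1}) = {prob_zero_prev:.4f}")
```

Under this assumption, one fewer patient would leave the Type II error above $\beta$, which is consistent with the design using 14 patients.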



## Homework

7-2

7-3