# 1. Sample size estimation, event based (AKI rate) #
* https://pubmed.ncbi.nlm.nih.gov/21829786/   
* Assumptions
  * 4 arms (Placebo : Low dose : Mid dose : High dose) 1:1:1:1  
  * Equal n for each arm, equal variance

**Formula**  
>$$ n = \frac{(Z_\alpha + Z_\beta)^2 \, [p_1 (1-p_1) + p_2 (1-p_2)]}{(p_1 - p_2)^2} $$
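
As a rough worked check of the formula, plugging in the values used in the code cell below (control rate 0.25, treatment rate 0.10, a critical value of about 2.638 for a Bonferroni-adjusted two-sided alpha of 0.05/6, and about 0.842 for 80% power) gives approximately:

>$$ n \approx \frac{(2.638 + 0.842)^2 \, [0.25 \cdot 0.75 + 0.10 \cdot 0.90]}{(0.25 - 0.10)^2} = \frac{12.11 \times 0.2775}{0.0225} \approx 149.4 $$

which rounds up to 150 patients per arm before any dropout adjustment.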


In [7]:
import math
from scipy.stats import norm

# === Input Parameters ===
p1 = 0.25   # AKI rate in control group (e.g., 25%)
p2 = 0.1   # AKI rate in treatment group (e.g., 10%)
alpha = 0.05  # Significance level (2-sided)
power = 0.80  # Power (1 - beta)
dropout_rate = 0.10  # e.g., 10%
number_of_cohorts = 4 # multiarm study, placebo, high, mid, low

# === Z-scores ===
# Bonferroni correction: alpha / number of pairwise comparisons
# 4 arms give 6 comparisons: placebo vs each of the 3 dose arms, high vs low, high vs mid, mid vs low
alpha = alpha / 6
Z_alpha = norm.ppf(1 - alpha / 2)   # two-sided
print(f"Za is {Z_alpha:.2f}")
Z_beta = norm.ppf(power)
print(f"Zb is {Z_beta:.2f}")

# === Sample size per group (before dropout adjustment) ===
# Alternative (not used here): pooled-proportion method
# p_bar = (p1 + p2) / 2
# numerator = (Z_alpha * math.sqrt(2 * p_bar * (1 - p_bar)) + Z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2

numerator = ((Z_alpha + Z_beta) ** 2) * ((p1 * (1 - p1)) + (p2 * (1 - p2)))
denominator = (p1 - p2) ** 2
n_per_group = math.ceil(numerator / denominator)
n_per_group_adjusted = math.ceil(n_per_group / (1 - dropout_rate))
total_sample_size = n_per_group_adjusted * number_of_cohorts

# === Print Output ===
print(f"Required number of patients per group (before dropout): {n_per_group}")
print(f"Adjusted for {int(dropout_rate * 100)}% dropout: {n_per_group_adjusted}")
print(f"Total sample size for {number_of_cohorts} arms (1:1:1:1): {total_sample_size:.0f}")


Za is 2.64
Zb is 0.84
Required number of patients per group (before dropout): 150
Adjusted for 10% dropout: 167
Total sample size for 4 arms (1:1:1:1): 668
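
One way to sanity-check the per-arm figure is to invert the same normal approximation and compute the power reached at a given n. The sketch below is illustrative only: the helper name `achieved_power` is hypothetical (not part of the original notebook), and it assumes the same unpooled-variance z-test and Bonferroni-adjusted alpha used above.

```python
import math
from scipy.stats import norm

def achieved_power(n_per_group, p1, p2, alpha_two_sided):
    """Approximate power of a two-sided z-test for two proportions.

    Uses the unpooled variance p1(1-p1) + p2(1-p2), matching the sample
    size formula above. The negligible contribution of the opposite
    rejection tail is ignored.
    """
    z_alpha = norm.ppf(1 - alpha_two_sided / 2)
    se = math.sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)
    return norm.cdf(abs(p1 - p2) / se - z_alpha)

# Bonferroni-adjusted alpha for 6 pairwise comparisons (assumption carried over from above)
bonferroni_alpha = 0.05 / 6

for n in (149, 150):
    print(f"n = {n}: approximate power = {achieved_power(n, 0.25, 0.10, bonferroni_alpha):.3f}")
```

Under these assumptions, 150 per arm should be the smallest n whose approximate power reaches 80%, which agrees with the rounded-up result above.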


# 2. Sample size estimation, mean based (serum creatinine etc.) #  
* Assumptions: equal standard deviation in every arm (σ = 0.45 mg/dL)

**Formula**
> $$n = \frac{2 \, (Z_\alpha + Z_\beta)^2 \, \sigma^2}{(\mu_1 - \mu_2)^2}$$
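
As a rough worked check, plugging in the values used in the code cell below (standard deviation 0.45 mg/dL, mean difference 0.3 mg/dL, and the same critical values of about 2.638 and 0.842 as in Section 1) gives approximately:

> $$n \approx \frac{2 \times (2.638 + 0.842)^2 \times 0.45^2}{0.3^2} = \frac{2 \times 12.11 \times 0.2025}{0.09} \approx 54.5$$

which rounds up to 55 patients per arm before any dropout adjustment.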



In [8]:
# === Input Parameters ===
alpha = 0.05
power = 0.80
delta = 0.3     # mu1 - mu2: mean difference in serum creatinine change from baseline, mg/dL (KDIGO: increase of 0.3 mg/dL within 48 hours)
sigma = 0.45    # Standard deviation
dropout_rate = 0.1
number_of_cohorts = 4 # multiarm study, placebo, high, mid, low

# === Z-scores ===
# Bonferroni correction: alpha / number of pairwise comparisons
# 4 arms give 6 comparisons: placebo vs each of the 3 dose arms, high vs low, high vs mid, mid vs low
alpha = alpha / 6
Z_alpha = norm.ppf(1 - alpha / 2)  # two-sided
print(f"Za is {Z_alpha:.2f}")
Z_beta = norm.ppf(power)
print(f"Zb is {Z_beta:.2f}")

# === Sample size per group ===
n_per_group = (2 * ((Z_alpha + Z_beta) ** 2) * (sigma ** 2)) / (delta ** 2)
n_per_group_adjusted = math.ceil( n_per_group / (1 - dropout_rate))
total_sample_size = n_per_group_adjusted * number_of_cohorts

# === Print Output ===
print(f"Required number of patients per group (before dropout): {math.ceil(n_per_group)}")
print(f"Adjusted for {int(dropout_rate * 100)}% dropout: {n_per_group_adjusted}")
print(f"Total sample size for {number_of_cohorts} arms: {total_sample_size}")

Za is 2.64
Zb is 0.84
Required number of patients per group (before dropout): 55
Adjusted for 10% dropout: 61
Total sample size for 4 arms: 244
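
The Bonferroni divisor has a visible effect on the per-arm sample size. The sketch below wraps the mean-based formula in a hypothetical helper, `n_per_group_means` (not part of the original notebook), and evaluates it for 1, 3, and 6 comparisons. The 3-comparison case (each dose arm vs placebo only) is an illustrative assumption, not a design stated in this analysis.

```python
import math
from scipy.stats import norm

def n_per_group_means(delta, sigma, alpha=0.05, power=0.80, n_comparisons=1):
    """Per-arm n for a two-sided comparison of two means with equal SD.

    alpha is split evenly across n_comparisons (Bonferroni correction).
    The result is rounded up and does not include any dropout adjustment.
    """
    z_alpha = norm.ppf(1 - (alpha / n_comparisons) / 2)
    z_beta = norm.ppf(power)
    return math.ceil(2 * (z_alpha + z_beta) ** 2 * sigma ** 2 / delta ** 2)

# Same delta and sigma as in the cell above; only the number of comparisons varies
for k in (1, 3, 6):
    n = n_per_group_means(delta=0.3, sigma=0.45, n_comparisons=k)
    print(f"{k} comparison(s): {n} per arm before dropout")
```

With 6 comparisons this should reproduce the 55 per arm reported above; fewer comparisons give a smaller n, which is the trade-off to weigh when deciding which pairwise contrasts are actually of interest.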
