
# Clinical Study Designs and Statistical Power Analysis

This notebook introduces the basics of clinical study design, including sample size calculation and statistical power analysis. Links to resources are provided for practice.
        


## Clinical Study Design Concepts

### Types of Studies

1. **Observational Studies:**
   - Include cohort, cross-sectional, and case-control studies.
   - The investigator observes exposures and outcomes without assigning any intervention.

2. **Experimental Studies:**
   - Include randomized controlled trials (RCTs) and other interventional studies.
   - The investigator assigns interventions to study subjects.

### Key Terms

- **Exposure**: A factor that may influence an outcome, such as a treatment, behavior, or risk factor.
- **Outcome**: The result being studied, such as disease occurrence.
- **Confounders**: Variables associated with both the exposure and the outcome that can distort the observed relationship between them.
        

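To make the idea of confounding concrete, the sketch below uses made-up counts (not data from any real study) in which age group acts as a hypothetical confounder. Within each age stratum the exposed and unexposed groups have the same risk, yet the crude comparison suggests the exposure roughly doubles the risk, because older participants are both more likely to be exposed and more likely to develop the outcome.

```python
# Hypothetical counts: stratum -> group -> (cases, total)
strata = {
    "young": {"exposed": (5, 100), "unexposed": (20, 400)},
    "old": {"exposed": (80, 400), "unexposed": (20, 100)},
}

def risk_ratio(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    exposed_cases, exposed_total = exposed
    unexposed_cases, unexposed_total = unexposed
    return (exposed_cases / exposed_total) / (unexposed_cases / unexposed_total)

def pooled(group):
    """Sum cases and totals for one group across all strata."""
    cases = sum(s[group][0] for s in strata.values())
    total = sum(s[group][1] for s in strata.values())
    return cases, total

# Risk ratio within each age stratum
stratum_rr = {name: risk_ratio(g["exposed"], g["unexposed"]) for name, g in strata.items()}

# Crude risk ratio, ignoring age
crude_rr = risk_ratio(pooled("exposed"), pooled("unexposed"))

for name, rr in stratum_rr.items():
    print(f"Risk ratio ({name}): {rr:.2f}")
print(f"Crude risk ratio: {crude_rr:.3f}")
```

Stratifying by the suspected confounder, as done here, is one simple way to check whether a crude association survives once the confounder is held fixed.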

## Sample Size Calculation

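Calculate the required sample size for a study from the effect size, significance level, and desired power. The example uses `tt_solve_power`, which solves for a one-sample (or paired) t-test; a comparison of two independent groups would instead use `tt_ind_solve_power`, which returns the size of the first group.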
        

In [None]:

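import math

import statsmodels.stats.power as smp

# Parameters for sample size calculation
effect_size = 0.5  # Medium effect size (Cohen's d)
alpha = 0.05       # Significance level (two-sided)
power = 0.8        # Desired power (1 - beta)

# Solve for the sample size of a one-sample (or paired) t-test.
# nobs is left unset, so it is the quantity being solved for.
sample_size = smp.tt_solve_power(effect_size=effect_size, alpha=alpha, power=power, alternative='two-sided')

# Round up: rounding down would leave the study slightly underpowered
print(f"Required Sample Size: {math.ceil(sample_size)}")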
        

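As a rough cross-check on the solver, the sample size for a two-sided one-sample test can be approximated with normal quantiles: $n \approx \left((z_{1-\alpha/2} + z_{1-\beta}) / d\right)^2$, where $d$ is Cohen's d and $1-\beta$ is the target power. The sketch below uses only the standard library; the helper name `approx_sample_size` is illustrative and is not part of statsmodels.

```python
import math
from statistics import NormalDist

def approx_sample_size(effect_size, alpha=0.05, power=0.8):
    """Normal-approximation sample size for a two-sided one-sample test."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)           # quantile for the target power
    return ((z_alpha + z_beta) / effect_size) ** 2

n_approx = approx_sample_size(0.5)
print(f"Approximate sample size: {math.ceil(n_approx)}")
```

Because the approximation ignores the heavier tails of the t distribution, it tends to land slightly below the t-based answer, so it is best treated as a sanity check rather than a replacement.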

### Adjusting Parameters

Observe how a smaller effect size and a higher target power each increase the required sample size.
        

In [None]:

import math

# Smaller effect size, with the same alpha and power as before
small_effect_size = 0.2  # Small effect size (Cohen's d)
sample_size_small_effect = smp.tt_solve_power(effect_size=small_effect_size, alpha=alpha, power=power, alternative='two-sided')
print(f"Required Sample Size (Small Effect Size): {math.ceil(sample_size_small_effect)}")

# Higher power, with the original medium effect size (d = 0.5)
high_power = 0.95
sample_size_high_power = smp.tt_solve_power(effect_size=0.5, alpha=alpha, power=high_power, alternative='two-sided')
print(f"Required Sample Size (Higher Power): {math.ceil(sample_size_high_power)}")
        

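The same relationship can be read in the other direction: for a fixed sample size, how much power does the study have? The sketch below uses the normal approximation for a two-sided one-sample test, with only the standard library. `approx_power` is a hypothetical helper, and the sample sizes in the loop are arbitrary illustrations.

```python
from statistics import NormalDist

def approx_power(effect_size, n, alpha=0.05):
    """Normal-approximation power of a two-sided one-sample test."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5  # mean of the test statistic under the alternative
    # Probability that the statistic falls in either rejection tail
    upper_tail = 1 - std_normal.cdf(z_crit - shift)
    lower_tail = std_normal.cdf(-z_crit - shift)
    return upper_tail + lower_tail

for n in (10, 20, 34, 50, 100):
    print(f"n = {n:>3}: approximate power = {approx_power(0.5, n):.3f}")
```

With an effect size of zero the function returns the significance level, which is a quick way to confirm the two tails are handled correctly. For small samples the approximation slightly overstates power relative to an exact t-based calculation.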

## Prevalence and Incidence Calculations

Calculate prevalence (the proportion of a population with the condition at a given time) and incidence (new cases arising over a time period) from study data.
        

In [None]:

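# Example data
population_size = 100000
prevalent_cases = 5000  # Existing cases at the start of the period
new_cases = 500         # New cases arising over the time period

# Calculate prevalence
prevalence = (prevalent_cases / population_size) * 100  # as a percentage
print(f"Prevalence: {prevalence:.2f}%")

# Calculate cumulative incidence among people at risk. This assumes that
# people who already have the condition cannot become new cases, so they
# are removed from the denominator.
population_at_risk = population_size - prevalent_cases
incidence = (new_cases / population_at_risk) * 100000  # per 100,000 at risk
print(f"Incidence: {incidence:.2f} per 100,000 population at risk")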
        

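When follow-up time is known, incidence is often reported as a rate per unit of person-time rather than as a proportion. The sketch below assumes, purely for illustration, a one-year follow-up, that prevalent cases are not at risk, and that each new case contributes on average half of the follow-up period. None of these assumptions come from the example data.

```python
population_size = 100000
prevalent_cases = 5000
new_cases = 500
follow_up_years = 1.0  # assumed follow-up duration (hypothetical)

# People who already have the condition are not at risk of becoming new cases
population_at_risk = population_size - prevalent_cases

# Approximate person-time: new cases are assumed to stop contributing
# halfway through follow-up on average
person_years = (population_at_risk - new_cases / 2) * follow_up_years

incidence_rate = new_cases / person_years * 100000
print(f"Incidence rate: {incidence_rate:.1f} per 100,000 person-years")
```

With a short follow-up and a rare outcome, the rate and the cumulative incidence are numerically close; they diverge as follow-up lengthens or as a larger share of the at-risk population becomes cases.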

## Visualizing Study Data

Use histograms and scatter plots to visualize key aspects of study data.
        

In [None]:

import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd

# Example dataset
study_data = pd.DataFrame({
    "Age": [30, 35, 40, 45, 50, 55, 60, 65, 70],
    "Prevalence": [5, 6, 7, 8, 9, 10, 12, 15, 18]
})

# Histogram of prevalence
sns.histplot(study_data['Prevalence'], kde=True)
plt.title("Prevalence Distribution")
plt.xlabel("Prevalence (%)")
plt.ylabel("Frequency")
plt.show()

# Scatter plot of age vs prevalence
sns.scatterplot(x=study_data['Age'], y=study_data['Prevalence'])
plt.title("Age vs Prevalence")
plt.xlabel("Age")
plt.ylabel("Prevalence (%)")
plt.show()
        