
# Exemplar Project: Meta-Analysis of Survival Data in Clinical Research

This notebook works through a meta-analysis comparing Tyrosine Kinase Inhibitors (TKIs) with traditional chemotherapy in advanced Non-Small Cell Lung Cancer (NSCLC).

## Topics Covered
- DerSimonian and Laird random-effects method (inverse-variance weighting)
- Forest plots for treatment comparisons
- Publication bias assessment with funnel plots
- Mantel-Haenszel fixed-effect estimator for small meta-analyses

### Dataset Information
The dataset covers four Randomized Controlled Trials (RCTs) and provides:
- Events and total subjects in the experimental (TKI) and control (chemo) groups
- Risk ratios for Progression-Free Survival (PFS), derived from those counts

### Prerequisites
Install the required packages before running the code:
```bash
pip install PythonMeta matplotlib numpy pandas
```
        


## Meta-Analysis Dataset

The event data for the four RCTs is as follows:

| Study                  | Events (TKI) | Subjects (TKI) | Events (Chemo) | Subjects (Chemo) |
|------------------------|--------------|----------------|----------------|------------------|
| Shi Y et al (2017)     | 91           | 138            | 106            | 122              |
| Fukuoka M et al (2011) | 76           | 96             | 90             | 94               |
| Mok T et al (2009)     | 533          | 609            | 586            | 608              |
| Sequist L et al (2013) | 153          | 230            | 104            | 115              |

We will use this data for the meta-analysis.
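
As a quick check on the counts in the table, the sketch below computes the risk ratio and a Wald-type 95% confidence interval for a single trial. The helper name `risk_ratio_ci` and the `z=1.96` default are illustrative choices, not part of the original notebook; the interval is built on the log scale, which is the usual convention for ratio measures.

```python
import numpy as np

def risk_ratio_ci(events_trt, n_trt, events_ctl, n_ctl, z=1.96):
    """Return (rr, lower, upper) for one two-arm trial (hypothetical helper)."""
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    # Standard error of log(RR) from the 2x2 counts
    se = np.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    lower = np.exp(np.log(rr) - z * se)
    upper = np.exp(np.log(rr) + z * se)
    return rr, lower, upper

# Counts for Shi Y et al (2017), taken from the table above
rr, lower, upper = risk_ratio_ci(91, 138, 106, 122)
print(f"RR = {rr:.2f} (95% CI {lower:.2f} to {upper:.2f})")
```

A risk ratio below 1 here means fewer PFS events in the TKI group than in the chemotherapy group.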
        


## DerSimonian and Laird Method

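The DerSimonian and Laird method is a random-effects approach. Each study's log risk ratio is weighted by the inverse of its within-study variance plus an estimate of the between-study variance (tau squared), and the pooled log risk ratio is then exponentiated to give the pooled risk ratio.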
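
For reference, the usual textbook form of the estimator can be written as follows. The notation is introduced here for illustration and does not come from the notebook: $y_i$ is the log risk ratio of study $i$, $v_i$ its within-study variance, and $k$ the number of studies.

$$
\begin{aligned}
w_i &= \frac{1}{v_i}, \qquad \hat{\theta}_{F} = \frac{\sum_i w_i y_i}{\sum_i w_i} \\
Q &= \sum_i w_i \,(y_i - \hat{\theta}_{F})^2 \\
\hat{\tau}^2 &= \max\!\left(0,\; \frac{Q - (k-1)}{\sum_i w_i - \sum_i w_i^2 / \sum_i w_i}\right) \\
w_i^{*} &= \frac{1}{v_i + \hat{\tau}^2}, \qquad
\hat{\theta}_{DL} = \frac{\sum_i w_i^{*} y_i}{\sum_i w_i^{*}}, \qquad
\operatorname{Var}(\hat{\theta}_{DL}) = \frac{1}{\sum_i w_i^{*}}
\end{aligned}
$$

When $\hat{\tau}^2 = 0$ the random-effects weights reduce to the fixed-effect inverse-variance weights.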
        

In [None]:

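import numpy as np
import pandas as pd

# Define the dataset (event counts and group sizes per trial)
data = pd.DataFrame({
    "Study": ["Shi Y et al (2017)", "Fukuoka M et al (2011)", "Mok T et al (2009)", "Sequist L et al (2013)"],
    "Events_TKI": [91, 76, 533, 153],
    "Subjects_TKI": [138, 96, 609, 230],
    "Events_Chemo": [106, 90, 586, 104],
    "Subjects_Chemo": [122, 94, 608, 115]
})

# Risk ratio per study; pooling is done on the log scale
data["Risk_Ratio"] = (data["Events_TKI"] / data["Subjects_TKI"]) / (data["Events_Chemo"] / data["Subjects_Chemo"])
data["Log_RR"] = np.log(data["Risk_Ratio"])

# Variance of log(RR): 1/a - 1/n1 + 1/c - 1/n2
data["Variance"] = (1 / data["Events_TKI"]) - (1 / data["Subjects_TKI"]) + (1 / data["Events_Chemo"]) - (1 / data["Subjects_Chemo"])

# Fixed-effect inverse-variance weights and pooled log(RR)
data["Weight"] = 1 / data["Variance"]
fixed_log_rr = np.sum(data["Log_RR"] * data["Weight"]) / np.sum(data["Weight"])

# DerSimonian-Laird estimate of the between-study variance (tau^2)
q_stat = np.sum(data["Weight"] * (data["Log_RR"] - fixed_log_rr) ** 2)
c_const = np.sum(data["Weight"]) - np.sum(data["Weight"] ** 2) / np.sum(data["Weight"])
tau2 = max(0.0, (q_stat - (len(data) - 1)) / c_const)

# Random-effects weights and pooled estimate
data["RE_Weight"] = 1 / (data["Variance"] + tau2)
pooled_log_rr = np.sum(data["Log_RR"] * data["RE_Weight"]) / np.sum(data["RE_Weight"])
pooled_variance = 1 / np.sum(data["RE_Weight"])  # variance of the pooled log(RR)
pooled_rr = np.exp(pooled_log_rr)

print(f"Pooled Risk Ratio: {pooled_rr:.2f}")
print(f"Pooled Variance (log scale): {pooled_variance:.4f}")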
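
Whether a random-effects model is warranted depends on how much the studies disagree. The sketch below computes Cochran's Q and the I² statistic for the four trials. The function name `heterogeneity` is a hypothetical helper, and the counts are re-entered so the block runs on its own.

```python
import numpy as np

def heterogeneity(log_effects, variances):
    """Cochran's Q, its degrees of freedom, and I^2 in percent (hypothetical helper)."""
    y = np.asarray(log_effects, dtype=float)
    v = np.asarray(variances, dtype=float)
    w = 1.0 / v
    fixed = np.sum(w * y) / np.sum(w)        # fixed-effect pooled estimate
    q = float(np.sum(w * (y - fixed) ** 2))  # Cochran's Q
    df = len(y) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return q, df, i2

# Counts for the four trials (TKI events/subjects, chemo events/subjects)
a = np.array([91, 76, 533, 153])
n1 = np.array([138, 96, 609, 230])
c = np.array([106, 90, 586, 104])
n2 = np.array([122, 94, 608, 115])

log_rr = np.log((a / n1) / (c / n2))
var = 1 / a - 1 / n1 + 1 / c - 1 / n2        # variance of log(RR)

q, df, i2 = heterogeneity(log_rr, var)
print(f"Q = {q:.2f} on {df} df, I^2 = {i2:.0f}%")
```

A large I² suggests that much of the spread between studies reflects real differences rather than sampling error, which is the situation the random-effects weights are designed for.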
        


## Forest Plot

The forest plot visualizes the individual study risk ratios and the pooled estimate.
        

In [None]:

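import matplotlib.pyplot as plt

# 95% CI per study, built on the log scale (treats "Variance" as Var[log RR])
log_rr = np.log(data["Risk_Ratio"])
half_width = 1.96 * np.sqrt(data["Variance"])
lower = np.exp(log_rr - half_width)
upper = np.exp(log_rr + half_width)
xerr = np.array([data["Risk_Ratio"] - lower, upper - data["Risk_Ratio"]])  # asymmetric bars

# Forest plot
plt.errorbar(data["Risk_Ratio"], range(len(data)), xerr=xerr, fmt="o", capsize=3, label="Studies (95% CI)")
plt.axvline(pooled_rr, color="red", linestyle="--", label="Pooled Risk Ratio")
plt.axvline(1, color="grey", linewidth=0.8, label="No effect")
plt.xscale("log")  # ratios are symmetric on a log axis
plt.yticks(range(len(data)), data["Study"])
plt.xlabel("Risk Ratio (log scale)")
plt.title("Forest Plot")
plt.legend()
plt.gca().invert_yaxis()  # Reverse y-axis for readability
plt.show()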
        


## Funnel Plot

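The funnel plot evaluates potential publication bias by plotting precision against effect size. With only four studies, asymmetry is hard to judge, so the plot is best read as an illustration of the technique.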
        

In [None]:

# Calculate standard errors (of the log risk ratio)
data["StdError"] = np.sqrt(data["Variance"])

# Funnel plot
plt.scatter(data["Risk_Ratio"], 1 / data["StdError"], alpha=0.7)
plt.axvline(pooled_rr, color="red", linestyle="--", label="Pooled Risk Ratio")
plt.xscale("log")  # ratio measures are conventionally shown on a log axis
plt.xlabel("Risk Ratio (log scale)")
plt.ylabel("Precision (1 / StdError)")
plt.title("Funnel Plot")
plt.legend()
plt.show()
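
A funnel plot is often drawn with pseudo 95% limits around the pooled effect, showing where studies would be expected to fall if sampling error were the only source of variation. The sketch below computes those limits numerically. It assumes the funnel is centred on the fixed-effect (inverse-variance) pooled log risk ratio, and `funnel_limits` is a hypothetical helper name.

```python
import numpy as np

def funnel_limits(pooled_log_effect, std_errors, z=1.96):
    """Pseudo 95% limits on the risk-ratio scale at each standard error."""
    se = np.asarray(std_errors, dtype=float)
    return np.exp(pooled_log_effect - z * se), np.exp(pooled_log_effect + z * se)

# Counts for the four trials (TKI events/subjects, chemo events/subjects)
a = np.array([91, 76, 533, 153])
n1 = np.array([138, 96, 609, 230])
c = np.array([106, 90, 586, 104])
n2 = np.array([122, 94, 608, 115])

rr = (a / n1) / (c / n2)
se = np.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n2)  # standard error of log(RR)

# Centre the funnel on the fixed-effect (inverse-variance) pooled log(RR)
w = 1 / se**2
pooled_log = np.sum(w * np.log(rr)) / np.sum(w)

lower, upper = funnel_limits(pooled_log, se)
inside = (rr >= lower) & (rr <= upper)
for study_rr, lo, hi, ok in zip(rr, lower, upper, inside):
    print(f"RR {study_rr:.2f}  limits {lo:.2f} to {hi:.2f}  inside funnel: {ok}")
```

Points outside the limits can reflect between-study heterogeneity as much as publication bias, so the result should be interpreted cautiously.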
        


## Mantel-Haenszel Method

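The Mantel-Haenszel method is a fixed-effect estimator that pools the event counts directly instead of relying on per-study variance estimates. It is often preferred when studies are small or events are sparse, and it is a common choice when there are too few studies (e.g., <10) to estimate between-study variance reliably.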
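
For a risk ratio, the standard Mantel-Haenszel estimator can be written as below. The notation is introduced here for illustration: $a_i$ and $c_i$ are the events in the TKI and chemo groups of study $i$, $n_{1i}$ and $n_{2i}$ are the corresponding group sizes, and $N_i = n_{1i} + n_{2i}$.

$$
RR_{MH} = \frac{\sum_i a_i \, n_{2i} / N_i}{\sum_i c_i \, n_{1i} / N_i}
        = \frac{\sum_i w_i \, RR_i}{\sum_i w_i},
\qquad w_i = \frac{c_i \, n_{1i}}{N_i}
$$

The second form shows that the estimator is a weighted mean of the study risk ratios, with weights that depend only on the counts.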
        

In [None]:

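# Mantel-Haenszel weight for a risk ratio: control events * treated subjects / total subjects
total_subjects = data["Subjects_TKI"] + data["Subjects_Chemo"]
data["MH_Weight"] = data["Events_Chemo"] * data["Subjects_TKI"] / total_subjects

# Mantel-Haenszel pooled Risk Ratio (weighted mean of the study risk ratios)
mh_rr = np.sum(data["Risk_Ratio"] * data["MH_Weight"]) / np.sum(data["MH_Weight"])

# Greenland-Robins variance of log(mh_rr)
events_total = data["Events_TKI"] + data["Events_Chemo"]
p_term = (data["Subjects_TKI"] * data["Subjects_Chemo"] * events_total
          - data["Events_TKI"] * data["Events_Chemo"] * total_subjects) / total_subjects ** 2
mh_variance = np.sum(p_term) / (np.sum(data["Risk_Ratio"] * data["MH_Weight"]) * np.sum(data["MH_Weight"]))

print(f"Mantel-Haenszel Risk Ratio: {mh_rr:.2f}")
print(f"Mantel-Haenszel Variance (log scale): {mh_variance:.4f}")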
        