
# Survival Predictive Analysis and Meta-Analysis

This notebook walks through survival analysis and meta-analysis techniques: survival metrics, Kaplan-Meier curves, pooled hazard ratios, forest plots, funnel plots, and meta-regression.

## Dataset Information and Links

1. **Meta-analysis Example Data:**
   - Example datasets for meta-analysis are available in the `metafor` R package and can be downloaded as CSV files: [Metafor Data](https://www.metafor-project.org/doku.php/data/datasets).

2. **Survival Analysis Example Data:**
   - Kaplan-Meier and hazard ratio data can be synthesized for practice or sourced from clinical studies with survival metrics.

### Prerequisites

Install the required Python packages:
```bash
pip install PythonMeta statsmodels numpy pandas matplotlib seaborn
```
        


## Survival Metrics

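Survival metrics describe time-to-event clinical outcomes. Each is measured from a defined starting point (typically treatment initiation or randomization) to the first qualifying event; patients without an event at last follow-up are censored. Key metrics include:
- **Overall Survival (OS):** Time from treatment initiation to death from any cause.
- **Disease-Free Survival (DFS):** Time from treatment to disease recurrence or death, whichever occurs first.
- **Progression-Free Survival (PFS):** Time from treatment to disease progression or death, whichever occurs first.
- **Recurrence-Free Survival (RFS):** Time from treatment to recurrence or death, whichever occurs first.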
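
As a rough illustration of how these endpoints become analysis variables, the sketch below derives a duration and an event indicator for OS and PFS from per-patient dates. The table, its column names, and the dates are all hypothetical and are not taken from any dataset referenced in this notebook.

```python
import pandas as pd

# Hypothetical patient records; column names and dates are illustrative only
patients = pd.DataFrame({
    "treatment_start": pd.to_datetime(["2020-01-01", "2020-03-01", "2020-06-01"]),
    "progression_date": pd.to_datetime(["2020-07-01", None, None]),
    "death_date": pd.to_datetime(["2021-01-01", None, "2020-12-01"]),
    "last_follow_up": pd.to_datetime(["2021-01-01", "2022-03-01", "2020-12-01"]),
})

start = patients["treatment_start"]
prog = patients["progression_date"]
death = patients["death_date"]

# Overall survival: the event is death; otherwise censor at last follow-up
patients["os_event"] = death.notna().astype(int)
patients["os_days"] = (death.fillna(patients["last_follow_up"]) - start).dt.days

# Progression-free survival: the event is progression or death, whichever is first
first_event = prog.where(prog <= death, death).fillna(prog)
patients["pfs_event"] = first_event.notna().astype(int)
patients["pfs_days"] = (first_event.fillna(patients["last_follow_up"]) - start).dt.days

print(patients[["os_days", "os_event", "pfs_days", "pfs_event"]])
```

The duration and event columns are the two inputs that a Kaplan-Meier estimator or a Cox model typically expects.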
        


## Kaplan-Meier Curve

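The Kaplan-Meier estimator gives the probability of surviving beyond each observed event time while accounting for censored observations. The resulting curve is a step function that starts at 1 and drops at each event time. The cell below plots illustrative values rather than estimates computed from patient-level data.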
        

In [None]:

import numpy as np
import matplotlib.pyplot as plt

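# Illustrative survival probabilities (not estimated from patient-level data).
# A survival curve starts at time 0 with probability 1.0.
time_points = [0, 1, 2, 3, 4, 5]
survival_probs = [1.0, 0.9, 0.8, 0.7, 0.5, 0.3]

# Step plot: survival stays constant until the next event time
plt.step(time_points, survival_probs, where="post", label="Survival Probability")
plt.xlabel("Time (Years)")
plt.ylabel("Survival Probability")
plt.ylim(0, 1.05)
plt.title("Kaplan-Meier Curve")
plt.legend()
plt.show()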
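
The plot above uses hard-coded probabilities. As a minimal sketch of how such probabilities could be estimated, the block below implements the product-limit (Kaplan-Meier) formula with NumPy. The function name `kaplan_meier` and the follow-up data are assumptions made for illustration; a dedicated survival library would normally be used in practice.

```python
import numpy as np

def kaplan_meier(durations, events):
    """Return distinct event times and Kaplan-Meier survival estimates.

    durations: follow-up time for each subject
    events:    1 if the event was observed, 0 if the subject was censored
    """
    durations = np.asarray(durations, dtype=float)
    events = np.asarray(events, dtype=int)

    event_times = np.unique(durations[events == 1])
    survival = []
    s = 1.0
    for t in event_times:
        at_risk = np.sum(durations >= t)                  # still under observation at t
        n_events = np.sum((durations == t) & (events == 1))
        s *= 1.0 - n_events / at_risk                     # product-limit update
        survival.append(s)
    return event_times, np.array(survival)

# Hypothetical follow-up times in years; 0 in `events` marks a censored subject
durations = [1, 2, 2, 3, 4, 5, 5, 6]
events = [1, 1, 0, 1, 0, 1, 1, 0]

times, surv = kaplan_meier(durations, events)
for t, s in zip(times, surv):
    print(f"t={t:.0f}  S(t)={s:.3f}")
```

Censored subjects never trigger a drop in the curve, but they leave the risk set after their censoring time, which is what distinguishes this estimate from a simple proportion of survivors.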
        


## Meta-Analysis: DerSimonian and Laird Method

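The DerSimonian and Laird method is a random-effects extension of inverse-variance pooling: each study is weighted by the inverse of its within-study variance plus an estimated between-study variance. Hazard ratios (HRs) are pooled on the log scale and the result is exponentiated back. The cell below assumes that `Variance` holds the variance of each study's log(HR).
        

In [None]:

import pandas as pd

# Example data; "Variance" is assumed to be the variance of log(HR)
data = pd.DataFrame({
    "Study": ["Study1", "Study2", "Study3"],
    "HR": [0.8, 0.6, 1.2],
    "Variance": [0.02, 0.03, 0.01]
})

# Pool on the log scale
data["LogHR"] = np.log(data["HR"])

# Fixed-effect (inverse-variance) weights and pooled estimate
w = 1 / data["Variance"]
fixed_log_hr = np.sum(w * data["LogHR"]) / np.sum(w)

# DerSimonian-Laird estimate of the between-study variance (tau^2)
q = np.sum(w * (data["LogHR"] - fixed_log_hr) ** 2)
c = np.sum(w) - np.sum(w ** 2) / np.sum(w)
tau2 = max(0.0, (q - (len(data) - 1)) / c)

# Random-effects weights
data["Weight"] = 1 / (data["Variance"] + tau2)

# Pooled HR, back-transformed from the log scale
pooled_log_hr = np.sum(data["LogHR"] * data["Weight"]) / np.sum(data["Weight"])
pooled_hr = np.exp(pooled_log_hr)

# Variance of the pooled log(HR) and 95% confidence interval for the pooled HR
pooled_variance = 1 / np.sum(data["Weight"])
ci_low, ci_high = np.exp(pooled_log_hr + np.array([-1.96, 1.96]) * np.sqrt(pooled_variance))

print(f"Pooled HR: {pooled_hr:.2f} (95% CI {ci_low:.2f} to {ci_high:.2f})")
print(f"Variance of pooled log(HR): {pooled_variance:.4f}")
print(f"Between-study variance (tau^2): {tau2:.4f}")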
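
For reference, the DerSimonian and Laird estimator is commonly written as follows. Here $y_i = \log(\mathrm{HR}_i)$ and $v_i$ is its variance for study $i$ of $k$; these symbols are notation chosen for this summary and do not appear elsewhere in the notebook.

$$
\begin{aligned}
w_i &= \frac{1}{v_i}, \qquad \hat\theta_{FE} = \frac{\sum_i w_i y_i}{\sum_i w_i} \\
Q &= \sum_i w_i \left(y_i - \hat\theta_{FE}\right)^2 \\
\hat\tau^2 &= \max\left(0,\; \frac{Q - (k - 1)}{\sum_i w_i - \sum_i w_i^2 / \sum_i w_i}\right) \\
w_i^* &= \frac{1}{v_i + \hat\tau^2}, \qquad \hat\theta_{RE} = \frac{\sum_i w_i^* y_i}{\sum_i w_i^*}, \qquad \operatorname{Var}\left(\hat\theta_{RE}\right) = \frac{1}{\sum_i w_i^*}
\end{aligned}
$$

The pooled hazard ratio is $\exp(\hat\theta_{RE})$. When $\hat\tau^2 = 0$ the random-effects weights reduce to the fixed-effect weights and the two estimates coincide.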
        


## Forest Plot

Visualize study-level and pooled hazard ratios using a forest plot.
        

In [None]:

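# Forest plot
# 95% confidence limits are computed on the log scale and back-transformed
# (assumes "Variance" is the variance of log(HR))
se = np.sqrt(data["Variance"])
lower = np.exp(np.log(data["HR"]) - 1.96 * se)
upper = np.exp(np.log(data["HR"]) + 1.96 * se)
xerr = np.vstack([data["HR"] - lower, upper - data["HR"]])  # asymmetric error bars

plt.errorbar(data["HR"], range(len(data)), xerr=xerr, fmt="o", label="Individual Studies (95% CI)")
plt.axvline(pooled_hr, color="red", linestyle="--", label="Pooled HR")
plt.axvline(1.0, color="gray", linestyle=":", label="No Effect (HR = 1)")
plt.yticks(range(len(data)), data["Study"])
plt.xlabel("Hazard Ratio")
plt.title("Forest Plot")
plt.legend()
plt.gca().invert_yaxis()  # First study at the top
plt.show()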
        


## Funnel Plot

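Assess publication bias by plotting hazard ratios against their precision (the inverse of the standard error). Marked asymmetry around the pooled estimate can indicate publication bias, although with only a handful of studies the plot is hard to interpret.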
        

In [None]:

# Standard errors
data["StdError"] = np.sqrt(data["Variance"])

# Funnel plot
plt.scatter(data["HR"], 1 / data["StdError"], alpha=0.7)
plt.axvline(pooled_hr, color="red", linestyle="--", label="Pooled HR")
plt.xlabel("Hazard Ratio")
plt.ylabel("Precision (1 / StdError)")
plt.title("Funnel Plot")
plt.legend()
plt.show()
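
Funnel plots are often drawn with the standard error on an inverted y-axis and pseudo 95% limits around the pooled estimate, so that the expected spread at each level of precision is visible. The sketch below shows one way to do this. The helper `funnel_limits` is a hypothetical name, the variances are assumed to be those of log(HR), and a simple fixed-effect pooled estimate is used as the centre line.

```python
import numpy as np
import matplotlib.pyplot as plt

def funnel_limits(pooled_log_hr, se, z=1.96):
    """Pseudo 95% limits on the HR scale for given standard errors of log(HR)."""
    se = np.asarray(se, dtype=float)
    return np.exp(pooled_log_hr - z * se), np.exp(pooled_log_hr + z * se)

# Same illustrative studies as above; variances assumed to be of log(HR)
hr = np.array([0.8, 0.6, 1.2])
se = np.sqrt(np.array([0.02, 0.03, 0.01]))

# Fixed-effect pooled log(HR), used as the centre line of the funnel
w = 1 / se**2
pooled_log_hr = np.sum(w * np.log(hr)) / np.sum(w)

# Limits evaluated over a grid of standard errors
se_grid = np.linspace(0.001, se.max() * 1.2, 50)
lower, upper = funnel_limits(pooled_log_hr, se_grid)

plt.scatter(hr, se, alpha=0.7, label="Studies")
plt.plot(lower, se_grid, color="gray", linestyle=":")
plt.plot(upper, se_grid, color="gray", linestyle=":", label="Pseudo 95% limits")
plt.axvline(np.exp(pooled_log_hr), color="red", linestyle="--", label="Pooled HR")
plt.gca().invert_yaxis()  # most precise studies at the top
plt.xlabel("Hazard Ratio")
plt.ylabel("Standard Error of log(HR)")
plt.title("Funnel Plot with Pseudo 95% Limits")
plt.legend()
plt.show()
```

In the absence of bias and heterogeneity, roughly 95% of studies would be expected to fall inside the dotted limits.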
        


## Meta-Regression

Meta-regression examines whether study-level moderators (covariates) explain variation in effect sizes, using a regression in which each study is weighted by the inverse of its variance.
        

In [None]:

from statsmodels.api import WLS

# Example data for meta-regression
# (in practice the effect sizes would usually be log hazard ratios)
effect_sizes = [0.8, 0.6, 1.2, 0.7, 0.9]
variances = [0.02, 0.03, 0.01, 0.04, 0.02]
n_studies = len(effect_sizes)
weights = 1 / np.array(variances)  # inverse-variance weights
covariates = [10, 20, 30, 40, 50]  # one moderator value per study

# Meta-regression model: effect size ~ intercept + moderator
X = pd.DataFrame({"Intercept": 1, "Covariates": covariates})
y = pd.Series(effect_sizes)
model = WLS(y, X, weights=weights).fit()

# Print regression summary
print(model.summary())
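
To make the weighted regression explicit, the sketch below reproduces the coefficient estimates with the closed-form weighted least squares solution in NumPy. It also computes standard errors under the assumption that the within-study variances are known exactly, which is the usual convention in fixed-effect meta-regression. By default, statsmodels `WLS` instead rescales the covariance matrix by an estimated residual variance, so the standard errors reported in the summary above will generally differ. Variable names such as `se_known` are illustrative.

```python
import numpy as np

# Same illustrative inputs as the meta-regression cell above
effect_sizes = np.array([0.8, 0.6, 1.2, 0.7, 0.9])
variances = np.array([0.02, 0.03, 0.01, 0.04, 0.02])
covariates = np.array([10, 20, 30, 40, 50], dtype=float)

X = np.column_stack([np.ones_like(covariates), covariates])  # intercept + moderator
W = np.diag(1 / variances)                                   # inverse-variance weights

# Weighted least squares: beta = (X' W X)^-1 X' W y
xtwx_inv = np.linalg.inv(X.T @ W @ X)
beta = xtwx_inv @ X.T @ W @ effect_sizes

# Standard errors when the within-study variances are treated as known
se_known = np.sqrt(np.diag(xtwx_inv))

for name, b, s in zip(["Intercept", "Covariates"], beta, se_known):
    print(f"{name}: {b:.4f} (SE {s:.4f})")
```

The coefficients should match `model.params` from the cell above; only the standard errors depend on how the residual scale is handled.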
        