SuperSurv: A Unified Ecosystem for Machine Learning, Ensembles, and Interpretability in Survival Data 
SuperSurv provides a mathematically rigorous, unified machine learning ecosystem for right-censored time-to-event data.
At its core, it implements an advanced Ensemble Super Learner framework. By utilizing Inverse Probability of Censoring Weighting (IPCW), it automatically constructs optimal convex combinations of parametric, semi-parametric, and machine learning base algorithms to minimize cross-validated risk.
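The IPCW idea behind this risk minimization can be sketched with base R and the survival package: censoring weights come from a "reverse" Kaplan-Meier fit (treating censoring as the event), and each candidate learner is scored by a weighted squared-error (Brier-type) loss. This is a conceptual illustration of the methodology only, not SuperSurv's internal code:

```r
library(survival)

set.seed(1)
n <- 200
event_time <- rexp(n, 0.10)                 # true event times
cens_time  <- rexp(n, 0.05)                 # censoring times
obs    <- pmin(event_time, cens_time)       # observed follow-up
status <- as.integer(event_time <= cens_time)

# Reverse Kaplan-Meier: estimate the censoring survival function G(t)
G_fit  <- survfit(Surv(obs, 1 - status) ~ 1)
G_step <- stepfun(G_fit$time, c(1, G_fit$surv))

# IPCW weights at horizon t0: observed events before t0 get 1/G(T_i),
# subjects still at risk at t0 get 1/G(t0), censored-early subjects get 0
t0 <- 10
w  <- ifelse(obs <= t0 & status == 1, 1 / G_step(obs),
      ifelse(obs > t0,                1 / G_step(t0), 0))

# IPCW Brier-type risk at t0 for a candidate survival prediction S_hat(t0);
# the super learner minimizes a cross-validated version of this risk over
# convex combinations of the base learners' predictions
ipcw_brier <- function(S_hat) {
  at_risk <- as.numeric(obs > t0)           # still event-free at t0?
  mean(w * (at_risk - S_hat)^2)
}
```

In the full ensemble, `S_hat` would be a convex combination `alpha_1 * S_1 + ... + alpha_K * S_K` of the base learners' cross-validated predictions, with the weights `alpha` chosen to minimize this risk.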
Beyond the super learner, SuperSurv acts as a complete Explainable AI (XAI) pipeline, featuring automated hyperparameter tuning grids, rigorous time-dependent benchmarking, and seamless integration with XAI ecosystems such as Kernel SHAP. The SuperSurv framework builds on recent advances in machine learning-based survival analysis, including treatment-specific survival curve estimation (Westling et al., 2024) and a unified ensemble-modeling framework for survival prediction (Lyu et al., 2026).
Ready to dive in? The best way to understand the power of SuperSurv is to see it in action.
👉 Click here to get started with Tutorial 0: Installation & Setup!
To keep the core SuperSurv package lightweight and fast to install, heavy machine learning dependencies (such as the packages providing XGBoost, random survival forests, and elastic net models) are listed as Suggests rather than strict requirements.
This modular design means you only need to install the specific mathematical engines you actually plan to use! If you try to call a base learner that you haven't installed yet, SuperSurv will gently pause and remind you to install it.
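This follows the standard R Suggests pattern: each wrapper checks for its engine at call time and fails with an actionable message. An illustrative sketch of the pattern (not SuperSurv's exact wording):

```r
# Typical guard inside a base-learner wrapper (illustrative only)
if (!requireNamespace("xgboost", quietly = TRUE)) {
  stop("The XGBoost learner needs the 'xgboost' package.\n",
       "Install it with: install.packages(\"xgboost\")",
       call. = FALSE)
}
```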
SuperSurv standardizes the modeling API for 19 prediction algorithms and 6 high-dimensional screening algorithms.
- Machine Learning: Random Forest, XGBoost, Support Vector Machines, Gradient Boosting, BART, Ranger
- Penalized/High-Dimensional: Elastic Net, Ridge Regression
- Tree-Based: RPART, CoxBoost
- Parametric/Classical: Cox Proportional Hazards, Weibull, Exponential, Log-Logistic, Log-Normal, Generic Parametric
- Smoothing/Splines: Generalized Additive Models (GAM)
- Baselines: Kaplan-Meier
Screening algorithms automatically filter massive feature sets (for example, genomic data) before fitting complex models:
- Keep All Features
- Marginal Cox Screening
- Variance-based Screening
- Penalized Screening (Elastic Net)
- Random Forest Variable Hunting
Visit our official website for a complete suite of in-depth tutorials:
- 0. Installation & Setup: Get your R environment ready.
- 1. The SuperSurv Ensemble: Build and train your first meta-learner.
- 2. Model Performance: Evaluate Time-Dependent Brier Scores and Uno's C-index.
- 3. Selection vs. Ensemble: Compare evaluation approaches.
- 4. Screening Methods: Handle high-dimensional genomic data.
- 5. Hyperparameter Tuning: Automate algorithmic grid searches.
- 6. Random Forests: A deep dive into machine learning wrappers.
- 7. Parametric Models: Classical statistical approaches.
- 8. SHAP Interpretability: Demystify black-box models with global and local XAI.
- 9. Causal Inference (RMST): Evaluate treatment effects over time.
- 10. Parallel Processing: Scale up your computations for massive datasets.
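As a taste of the time-dependent metrics covered in Tutorial 2, Uno's IPCW-weighted C-index can already be computed with the survival package alone. This is a generic sketch using `survival::concordance` (whose `timewt = "n/G2"` option corresponds to Uno's reweighting), independent of SuperSurv's own wrappers:

```r
library(survival)

# Fit a simple Cox model on the built-in lung data
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

# Harrell's C (default time weights) vs. Uno's IPCW-reweighted C
harrell <- concordance(fit)
uno     <- concordance(fit, timewt = "n/G2")
c(harrell = harrell$concordance, uno = uno$concordance)
```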
You can install the development version of SuperSurv directly from GitHub using devtools:
# install.packages("devtools")
devtools::install_github("yuelyu21/SuperSurv")

If you use SuperSurv in your research, please cite:
Lyu, Y., Huang, X., Lin, S. H., & Li, Z. (2026).
SuperSurv: A Unified Framework for Machine Learning Ensembles in Survival Analysis.
bioRxiv.
https://doi.org/10.64898/2026.03.11.711010
Related methodological work:
Westling, T., Luedtke, A., Gilbert, P. B., & Carone, M. (2024).
Inference for treatment-specific survival curves using machine learning.
Journal of the American Statistical Association.
https://doi.org/10.1080/01621459.2023.2205060