Framework for in-silico reproduction of landmark oncology randomized controlled trials using machine learning survival models.

xavier-orcutt/TrialTranslator

TrialTranslator

Introduction

The aim of this project is to reproduce landmark oncology randomized controlled trials (RCTs) and to assess their generalizability to patients in routine clinical practice. A total of 11 RCTs were reproduced across 4 cancer types: advanced NSCLC, metastatic breast, metastatic prostate, and metastatic colorectal.

A machine-learning (ML) survival model was constructed for each of the 4 cancers. The ML models then stratified patients into 3 survival risk groups. Finally, landmark RCTs were simulated within each risk group using inverse probability of treatment-weighted survival analyses with the Kaplan-Meier method.
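
As a rough illustration of the two steps above, the sketch below assigns patients to three risk groups and computes an inverse probability of treatment-weighted Kaplan-Meier curve. It is not taken from the project notebooks. The function names (`risk_tertiles`, `iptw_weights`, `weighted_kaplan_meier`), the use of rank-based tertiles as cut points, and the assumption that propensity scores have already been estimated (for example with a logistic regression) are all illustrative choices.

```python
from collections import defaultdict


def risk_tertiles(scores):
    """Label each predicted risk score 'low', 'medium', or 'high'.

    Assumption: groups are rank-based thirds of the score distribution.
    """
    n = len(scores)
    order = sorted(range(n), key=lambda i: scores[i])
    labels = [None] * n
    for rank, i in enumerate(order):
        labels[i] = ("low", "medium", "high")[min(3 * rank // n, 2)]
    return labels


def iptw_weights(treated, propensity):
    """Weights of 1/p for treated patients and 1/(1 - p) for controls,
    where p is the (assumed given) probability of receiving treatment."""
    return [1.0 / p if t else 1.0 / (1.0 - p)
            for t, p in zip(treated, propensity)]


def weighted_kaplan_meier(times, events, weights):
    """Return [(time, survival)] at each distinct event time.

    Weighted counts replace the raw counts of the usual estimator:
    S(t) = product over event times of (1 - weighted deaths / weighted at risk).
    """
    deaths = defaultdict(float)   # weighted events at each time
    leaving = defaultdict(float)  # weighted events plus censorings at each time
    for t, e, w in zip(times, events, weights):
        leaving[t] += w
        if e:
            deaths[t] += w

    at_risk = sum(weights)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        if deaths[t] > 0:
            survival *= 1.0 - deaths[t] / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve
```

In a trial emulation of this kind, the weighted curve would be computed separately for the treatment and control arms within each risk group, and the arms compared on a summary such as median survival.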

Flatiron Health data was used to train and test the ML survival models and to reproduce the RCTs in-silico.

Notebooks

The project was coded in Python using Jupyter Notebooks. See requirement.txt for the packages and versions needed to run the notebooks.

Each cancer has its own set of 10 notebooks:

  1. data_wranging_tr: Data wrangling of the training set
  2. data_wranging_te: Data wrangling of the test set
  3. crude_model_build: Building ML survival models using a crude imputation strategy
  4. cox_model_build: Building a standard Cox model
  5. mice_gbm_build: Building gradient-boosted survival models with a multiple imputation strategy
  6. discrim_performance: Plotting the time-dependent AUCs for all ML models and the Cox model
  7. gbm_final_build: Building the final gradient-boosted survival model
  8. rtrials_wt3r_33c: Reproducing landmark clinical trials across 3 risk groups using relaxed inclusion criteria
  9. strials_wt3r_33c: Reproducing landmark clinical trials across 3 risk groups using strict inclusion criteria
  10. rtrials_wt3r_33c_dc: Reproducing landmark clinical trials across 3 risk groups using relaxed inclusion criteria, restricted to chemotherapy patients who received appropriate dosing
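
To make the time-dependent AUC in notebook 6 more concrete, here is a deliberately simplified sketch. It is not the project's code: the function name `naive_auc_at` is hypothetical, and the handling of censoring is a simplification. Patients censored before the evaluation time are simply dropped, whereas full estimators (such as `cumulative_dynamic_auc` in scikit-survival) reweight for censoring.

```python
def naive_auc_at(t, times, events, risk):
    """Cumulative/dynamic AUC at time t, ignoring censoring weights.

    Cases are patients with an observed event at or before t.
    Controls are patients still under follow-up after t.
    Patients censored at or before t are excluded (simplifying assumption).
    """
    cases = [r for T, e, r in zip(times, events, risk) if T <= t and e]
    controls = [r for T, r in zip(times, risk) if T > t]
    if not cases or not controls:
        raise ValueError("need at least one case and one control at time t")

    # Fraction of case-control pairs where the case has the higher
    # predicted risk; ties count as half.
    concordant = sum((rc > rk) + 0.5 * (rc == rk)
                     for rc in cases for rk in controls)
    return concordant / (len(cases) * len(controls))
```

Evaluating this function over a grid of time points gives a curve of discrimination over follow-up, which is the kind of plot used to compare survival models.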

Website

See here for the website version of this project.
