nanxstats/bcpm-msaenet

Brain Cancer Predictive Modeling

Project Status: Active – The project has reached a stable, usable state and is being actively developed. License: MIT

Analysis pipeline for the precisionFDA Brain Cancer Predictive Modeling and Biomarker Discovery challenge using msaenet.

It ranked 2nd in the challenge by predictive performance.

Team: Nan Xiao, Soner Koc, Kaushik Ghose from Seven Bridges.

Model

This solution features the following models:

  • Feature selection with the multi-step adaptive SCAD-net method (Xiao and Xu, 2015).
  • A relaxed version of the "Stability Selection" procedure (Meinshausen and Bühlmann, 2010) to aggregate the features selected by 100 perturbed models and keep only the consistently selected ones.
  • Gradient boosting decision tree (GBDT) models for predictive modeling with the selected genomic features and all four clinical features. The tree models include xgboost (Chen and Guestrin, 2016), lightgbm (Ke et al., 2017), catboost (Prokhorenkova et al., 2018), and a two-layer stacking tree model (Wolpert, 1992). After the challenge ended, we created the R package stackgbm to build such stacked models.
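
The stability selection step above can be illustrated with a minimal base R sketch. It is not the code from this repository: `select_features()` is a hypothetical stand-in for the multi-step adaptive SCAD-net selector, and the row subsampling scheme, the 80% subsample fraction, and the 0.6 frequency threshold are all assumptions chosen for illustration.

```r
# Hypothetical stand-in selector (NOT the SCAD-net method used in the pipeline):
# rank features by absolute marginal correlation with y and keep the top k.
select_features <- function(x, y, k = 5) {
  score <- abs(as.vector(cor(x, y)))
  order(score, decreasing = TRUE)[seq_len(k)]
}

# Relaxed stability selection: refit the selector on perturbed data and keep
# the features whose selection frequency reaches `threshold`.
# `frac` and `threshold` are illustrative assumptions, not values from run.R.
stability_select <- function(x, y, n_models = 100, frac = 0.8,
                             threshold = 0.6, seed = 42) {
  set.seed(seed)
  n <- nrow(x)
  counts <- integer(ncol(x))
  for (b in seq_len(n_models)) {
    # Perturb the data by subsampling rows without replacement.
    idx <- sample.int(n, size = floor(frac * n))
    sel <- select_features(x[idx, , drop = FALSE], y[idx])
    counts[sel] <- counts[sel] + 1L
  }
  freq <- counts / n_models
  list(frequency = freq, selected = which(freq >= threshold))
}

# Toy data: 200 samples, 50 features, only the first 3 carry signal.
set.seed(1)
x <- matrix(rnorm(200 * 50), nrow = 200)
y <- as.vector(x[, 1:3] %*% c(3, -3, 3) + rnorm(200))
res <- stability_select(x, y)
```

Here `res$selected` holds the indices of the features that were selected in at least 60% of the 100 perturbed fits, and `res$frequency` gives the per-feature selection rate. In the actual pipeline, the selector would be the msaenet model rather than this correlation filter.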

Pipeline

Dependencies

Most of the required R packages can be installed from CRAN. Two exceptions:

  • lightgbm: install from source. For macOS, it is advised to compile with a Homebrew gcc toolchain instead of the default LLVM toolchain.
  • catboost: install the latest compiled binary package from their GitHub releases.
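
Before running the pipeline, it can help to check which packages are still missing. The sketch below is one possible way to do that in base R; `missing_packages()` is a hypothetical helper, and the package vector lists only the packages named in this README, so the full set required by run.R may be longer.

```r
# Packages named in this README; the full list used by run.R may be longer.
pkgs <- c("msaenet", "xgboost", "lightgbm", "catboost")

# Hypothetical helper: return the entries of `pkgs` that are not installed.
# system.file() returns "" when a package cannot be found.
missing_packages <- function(pkgs) {
  found <- vapply(pkgs, function(p) nzchar(system.file(package = p)),
                  logical(1))
  pkgs[!found]
}

to_install <- missing_packages(pkgs)

# CRAN-hosted packages could then be installed with, for example:
# install.packages(intersect(to_install, c("msaenet", "xgboost")))
# lightgbm and catboost follow the separate instructions above.
```

The install call is left commented out so that the check itself has no side effects.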

Reproducibility

Open run.R and follow the steps. Note that some steps can take a few hours to run even though they are fully parallelized.
