stackgbm

Lifecycle: experimental

stackgbm offers a minimalist implementation of model stacking (Wolpert, 1992) for gradient boosted tree models built by xgboost (Chen and Guestrin, 2016), lightgbm (Ke et al., 2017), and catboost (Prokhorenkova et al., 2018).

Install

First, install the R package catboost, which is not available from CRAN as of December 2020, by following its official installation guide.

Then install stackgbm from GitHub:

remotes::install_github("nanxstats/stackgbm")

Design

stackgbm implements a classic two-layer stacking model: the first layer generates "features" from gradient boosted tree models, and the second layer is a logistic regression that uses these features as inputs. The code is derived from our 2nd place solution for a precisionFDA brain cancer machine learning challenge in 2020.

To keep the package easy to understand, modify, and extend, we built it with base R, without any special frameworks or dialects. We also expose only the most essential tunable parameters for the boosted tree models: learning rate, maximum tree depth, and number of iterations. A minimal sketch of the stacking idea is shown below.
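The sketch below illustrates the two-layer scheme with a single xgboost base learner: the first layer produces out-of-fold predictions, and the second layer fits a logistic regression on them. This is a minimal illustration of the stacking idea, not the package's internal implementation; it assumes xgboost is installed and uses its classic xgboost() interface, with lightgbm and catboost added analogously in practice.

library(xgboost)

set.seed(42)
n <- 1000
p <- 20
x <- matrix(rnorm(n * p), n, p)
y <- rbinom(n, 1, plogis(x[, 1] - x[, 2]))

# First layer: out-of-fold predictions from a boosted tree model,
# tuned by learning rate (eta), maximum depth, and number of iterations.
k <- 5
folds <- sample(rep(seq_len(k), length.out = n))
oof <- numeric(n)
for (i in seq_len(k)) {
  fit <- xgboost(
    data = x[folds != i, ], label = y[folds != i],
    objective = "binary:logistic",
    eta = 0.1, max_depth = 3, nrounds = 100, verbose = 0
  )
  oof[folds == i] <- predict(fit, x[folds == i, ])
}

# Second layer: logistic regression on the first-layer "features".
# With lightgbm and catboost included, each base learner contributes its own column.
meta <- glm(y ~ oof, family = binomial())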

License

stackgbm is free and open source software, licensed under GPL-3.