The state learner

This repository provides code that implements the state learner which is a super learner for right-censored data. We provide examples of how to use it as a stand-alone tool and in combination with targeted learning. We also provide code to reproduce the numerical experiments described in the article The state learner – a super learner for right-censored data (Munch & Gerds). Details are given in ./experiments/readme-experiments.org. The article describes a real data example, but as we cannot share the data from that study, the tutorial below uses an emulated data set with the same structure as the original data analysed in the article.

Setup and load data

To run the examples below, first run the following code to load the needed functions.

library(here)
library(targets)
tar_source(here("R-code/functions"))

We load and setup the emulated data as follows.

library(riskRegression)
canc_dt <- fread(here("emulated-data/emulated-data.txt"))
canc_dt[,stage:=factor(stage)]
canc_dt[,hormones:=factor(hormones)]

The data looks as follows.

canc_dt

          time status   logPSA  stage    ggtot       sDose hormones     A
         <num>  <int>    <num> <fctr>    <num>       <num>   <fctr> <num>
   1: 28.92331      1 2.043593   T3ab 9.851837 -0.76183395       No     0
   2: 14.20877      1 2.274948    T3c 6.881895 -0.35527003       No     0
   3: 15.89122      0 1.133611    T2a 7.794667 -1.59754020       No     0
   4: 61.61761      2 2.595044    T1c 8.879701 -0.81713882      Yes     1
   5: 26.01292      0 2.360323    T1c 5.804236  0.61412150      Yes     1
  ---                                                                    
1038: 44.26178      0 1.137501    T2b 5.942560 -0.06968029      Yes     1
1039: 13.93434      0 1.898109    T1c 6.785073  0.46413725       No     0
1040: 52.19202      1 3.521309   T3ab 5.539725 -1.02355475       No     0
1041: 10.06398      0 2.900889    T2b 5.906983  1.18218871      Yes     1
1042: 25.70154      0 1.888293    T3c 7.163754  1.25946357      Yes     1

Stand alone use of the state learner

We specify a list of learners. Each learner is specified through a list with a model entry (for now, cox, GLMnet, and rfsrc are implemented), and a x_form entry, which provides a formula for which and how the covariates enter the model. Additional arguments can be supplied to each learner by adding entries to the list (e.g., ntree for a random survival forest).

library(glmnet)
library(randomForestSRC)
learners <- list(
    cox = list(model = "cox", x_form = ~logPSA+stage+ggtot+sDose+hormones),
    cox_penalty = list(model = "GLMnet", x_form = ~logPSA+stage+ggtot+sDose+hormones),
    N_Aa = list(model = "cox", x_form = ~1),
    N_Aa_strat = list(model = "cox", x_form = ~strata(hormones)),
    rf = list(model = "rfsrc", x_form = ~logPSA+stage+ggtot+sDose+hormones, ntree = 50)
)

Below we use the same list of learners to learn the cumulative hazards for cause 1, cause 2, and censoring. The state learner returns a list consisting of

a data.table of triples of learners sorted according to their joint performance in predicting the state of the observed data in the interval 0 to time;
the top ranked triple of models fitted to the full data.

Here we use an interval of 36 month and show the 6 highest ranked triples.

set.seed(114)
sl = statelearner(learners = list(cause1 = learners,
				    cause2 = learners,
				    censor = learners),
		    data = canc_dt,
		    time = 36)
head(sl$cv_fit)

Key: <loss>
        cause1      cause2 censor     loss     b
        <fctr>      <fctr> <fctr>    <num> <int>
1:         cox cox_penalty    cox 8.608565     1
2:         cox        N_Aa    cox 8.608565     1
3: cox_penalty cox_penalty    cox 8.609008     1
4: cox_penalty        N_Aa    cox 8.609008     1
5:         cox         cox    cox 8.610104     1
6: cox_penalty         cox    cox 8.610470     1

Use for targeted estimation

We estimate effect of hormone therapy on tumor recurrence (cause=1) and death (cause=2). To do so, we need an estimate of the treatment mechanism, i.e., the probability of the presence of ulcers given other covariates.

canc_dt[,A := as.numeric(hormones)-1]
treat_fit <- GLMnet(A ~ logPSA+stage+ggtot+sDose, family = binomial, data = canc_dt)

We can now estimate the ATE using the fitted nuisance models.

ate_est <- os_abs_risk_ate(data = canc_dt, 
			     eval_times = seq(6,36,6),
			     fit_1 = sl$fitted_winners$cause1,
			     fit_2 = sl$fitted_winners$cause2,
			     fit_cens = sl$fitted_winners$censor,
			     fit_treat = treat_fit)

Below we show the estimated ATE for both tumor recurrence and death for 6-month intervals up to 3 years after baseline.

setorder(ate_est, cause, time)
ate_est[effect=="ATE" & est_type=="one-step"]

     cause  time effect est_type          est         see        lower      upper
    <char> <num> <char>   <char>        <num>       <num>        <num>      <num>
 1: cause1     6    ATE one-step  0.012155916 0.013268835 -0.013850999 0.03816283
 2: cause1    12    ATE one-step  0.010001492 0.017758405 -0.024804982 0.04480796
 3: cause1    18    ATE one-step -0.003063623 0.019611287 -0.041501745 0.03537450
 4: cause1    24    ATE one-step -0.005076999 0.023713768 -0.051555983 0.04140199
 5: cause1    30    ATE one-step  0.020519005 0.029185821 -0.036685203 0.07772321
 6: cause1    36    ATE one-step  0.034855532 0.032061411 -0.027984834 0.09769590
 7: cause2     6    ATE one-step  0.006344508 0.005310849 -0.004064755 0.01675377
 8: cause2    12    ATE one-step -0.004604911 0.009646595 -0.023512237 0.01430241
 9: cause2    18    ATE one-step -0.007466150 0.013269851 -0.033475058 0.01854276
10: cause2    24    ATE one-step -0.006363876 0.015592043 -0.036924281 0.02419653
11: cause2    30    ATE one-step  0.014983272 0.031678356 -0.047106306 0.07707285
12: cause2    36    ATE one-step  0.014682592 0.032697166 -0.049403852 0.07876904

Name		Name	Last commit message	Last commit date
Latest commit History 75 Commits
R-code		R-code
emulated-data		emulated-data
experiments		experiments
figures		figures
tex		tex
zelefsky-case-study		zelefsky-case-study
.gitignore		.gitignore
.here		.here
readme.org		readme.org

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

The state learner

Setup and load data

Stand alone use of the state learner

Use for targeted estimation

About

Releases

Packages

Contributors 2

Languages

amnudn/statelearner

Folders and files

Latest commit

History

Repository files navigation

The state learner

Setup and load data

Stand alone use of the state learner

Use for targeted estimation

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages