Threshold tuning needs adjustment for ensemble filters #2699

pat-s · 2019-12-06T09:41:26Z

library(survival)
library(mlr)
#> Loading required package: ParamHelpers

data(veteran)
set.seed(24601)
vet.task <- makeSurvTask(id = "VET", data = veteran, target = c("time", "status"))
vet.task <- createDummyFeatures(vet.task)

cox.lrn <- makeLearner(cl = "surv.coxph", id = "coxph", predict.type = "response")
fval = generateFilterValuesData(vet.task,
  method = list("E-mean", c("univariate.model.score", "randomForestSRC_importance")),
  more.args = list(
    "univariate.model.score" = list(perf.learner = cox.lrn),
    "randomForestSRC_importance" = list(ntree = 100)
  )
)
fval = fval$data
fval
#>                   name    type                     method         value
#>  1:              prior numeric randomForestSRC_importance -0.0034653967
#>  2:                trt numeric randomForestSRC_importance -0.0002310563
#>  3:           diagtime numeric randomForestSRC_importance  0.0007997677
#>  4:                age numeric randomForestSRC_importance  0.0020504755
#>  5:     celltype.large numeric randomForestSRC_importance  0.0084291392
#>  6:     celltype.adeno numeric randomForestSRC_importance  0.0088886429
#>  7:  celltype.squamous numeric randomForestSRC_importance  0.0111077710
#>  8: celltype.smallcell numeric randomForestSRC_importance  0.0137876876
#>  9:              karno numeric randomForestSRC_importance  0.1285040870
#> 10:              prior numeric     univariate.model.score  0.4085623679
#> 11:                trt numeric     univariate.model.score  0.4474747475
#> 12:                age numeric     univariate.model.score  0.4882100750
#> 13:     celltype.large numeric     univariate.model.score  0.5371655104
#> 14:           diagtime numeric     univariate.model.score  0.5669050051
#> 15:  celltype.squamous numeric     univariate.model.score  0.5669882101
#> 16:     celltype.adeno numeric     univariate.model.score  0.5731948566
#> 17: celltype.smallcell numeric     univariate.model.score  0.5972083749
#> 18:              karno numeric     univariate.model.score  0.6612903226
#> 19:                age numeric                     E-mean  6.5000000000
#> 20:     celltype.adeno numeric                     E-mean  7.0000000000
#> 21:     celltype.large numeric                     E-mean  2.0000000000
#> 22: celltype.smallcell numeric                     E-mean  7.0000000000
#> 23:  celltype.squamous numeric                     E-mean  7.0000000000
#> 24:           diagtime numeric                     E-mean  3.5000000000
#> 25:              karno numeric                     E-mean  1.0000000000
#> 26:              prior numeric                     E-mean  6.0000000000
#> 27:                trt numeric                     E-mean  5.0000000000
#>                   name    type                     method         value

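threshold = 0.5
# The comparison below runs over all 27 rows (both base filters plus E-mean),
# not only the 9 ensemble rows
nselect = sum(fval[["value"]] >= threshold, na.rm = TRUE)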
nselect
#> [1] 15
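
The count of 15 is larger than the nine features in the task because the comparison is applied to every row of `fval`: six `univariate.model.score` rows and all nine `E-mean` rows pass `>= 0.5`, while no `randomForestSRC_importance` row does. The following is a minimal base-R sketch of that arithmetic, using values copied (and rounded) from the printed output above. Restricting the count to the `E-mean` rows is an assumption about the intended behaviour; it is not taken from the fix in #2700.

```r
# Values copied from the printed output above, rounded to four decimals.
fval <- data.frame(
  method = rep(
    c("randomForestSRC_importance", "univariate.model.score", "E-mean"),
    each = 9
  ),
  value = c(
    -0.0035, -0.0002, 0.0008, 0.0021, 0.0084, 0.0089, 0.0111, 0.0138, 0.1285,
    0.4086, 0.4475, 0.4882, 0.5372, 0.5669, 0.5670, 0.5732, 0.5972, 0.6613,
    6.5, 7, 2, 7, 7, 3.5, 1, 6, 5
  )
)

threshold <- 0.5

# Count as in the report: every row of every method is compared.
n_all <- sum(fval$value >= threshold, na.rm = TRUE)

# Assumed intent: compare only the rows produced by the ensemble method.
ens <- fval[fval$method == "E-mean", ]
n_ens <- sum(ens$value >= threshold, na.rm = TRUE)

# Per-method breakdown of rows passing the threshold.
per_method <- tapply(fval$value >= threshold, fval$method, sum)

print(n_all)
print(n_ens)
print(per_method)
```

Even when restricted to the ensemble rows, a threshold of 0.5 keeps all nine features here, since the `E-mean` values lie on a different scale (1 to 7) than the scores of the base filters.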

pat-s added the type-bug label Dec 6, 2019

This was referenced Dec 6, 2019

Ensemble filters do not select the top ranked features #2685

Closed

Fix thresholding in filterFeatures() for ensemble filters #2700

Merged

pat-s closed this as completed in #2700 Dec 7, 2019
