Threshold tuning needs adjustment for ensemble filters #2699

Closed
pat-s opened this issue Dec 6, 2019 · 0 comments · Fixed by #2700

pat-s commented Dec 6, 2019

#2685 (comment)

library(survival)
library(mlr)
#> Loading required package: ParamHelpers

data(veteran)
set.seed(24601)
vet.task <- makeSurvTask(id = "VET", data = veteran, target = c("time", "status"))
vet.task <- createDummyFeatures(vet.task)

cox.lrn <- makeLearner(cl = "surv.coxph", id = "coxph", predict.type = "response")
fval = generateFilterValuesData(vet.task,
  method = list("E-mean", c("univariate.model.score", "randomForestSRC_importance")),
  more.args = list(
    "univariate.model.score" = list(perf.learner = cox.lrn),
    "randomForestSRC_importance" = list(ntree = 100)
  )
)
fval = fval$data
fval
#>                   name    type                     method         value
#>  1:              prior numeric randomForestSRC_importance -0.0034653967
#>  2:                trt numeric randomForestSRC_importance -0.0002310563
#>  3:           diagtime numeric randomForestSRC_importance  0.0007997677
#>  4:                age numeric randomForestSRC_importance  0.0020504755
#>  5:     celltype.large numeric randomForestSRC_importance  0.0084291392
#>  6:     celltype.adeno numeric randomForestSRC_importance  0.0088886429
#>  7:  celltype.squamous numeric randomForestSRC_importance  0.0111077710
#>  8: celltype.smallcell numeric randomForestSRC_importance  0.0137876876
#>  9:              karno numeric randomForestSRC_importance  0.1285040870
#> 10:              prior numeric     univariate.model.score  0.4085623679
#> 11:                trt numeric     univariate.model.score  0.4474747475
#> 12:                age numeric     univariate.model.score  0.4882100750
#> 13:     celltype.large numeric     univariate.model.score  0.5371655104
#> 14:           diagtime numeric     univariate.model.score  0.5669050051
#> 15:  celltype.squamous numeric     univariate.model.score  0.5669882101
#> 16:     celltype.adeno numeric     univariate.model.score  0.5731948566
#> 17: celltype.smallcell numeric     univariate.model.score  0.5972083749
#> 18:              karno numeric     univariate.model.score  0.6612903226
#> 19:                age numeric                     E-mean  6.5000000000
#> 20:     celltype.adeno numeric                     E-mean  7.0000000000
#> 21:     celltype.large numeric                     E-mean  2.0000000000
#> 22: celltype.smallcell numeric                     E-mean  7.0000000000
#> 23:  celltype.squamous numeric                     E-mean  7.0000000000
#> 24:           diagtime numeric                     E-mean  3.5000000000
#> 25:              karno numeric                     E-mean  1.0000000000
#> 26:              prior numeric                     E-mean  6.0000000000
#> 27:                trt numeric                     E-mean  5.0000000000
#>                   name    type                     method         value

threshold = 0.5
nselect = sum(fval[["value"]] >= threshold, na.rm = TRUE)
nselect
#> [1] 15
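
To make the count of 15 easier to trace, here is a minimal base-R sketch that rebuilds the `method` and `value` columns from the printed table above and breaks the count down per method. The names `nselect_all`, `per_method` and `nselect_ens` are hypothetical, and restricting the count to the `E-mean` rows is only an assumption about what an adjustment could look like; it is not taken from the fix in #2700.

```r
# Values copied from the printed `fval` table above (9 features per method).
fval <- data.frame(
  method = rep(
    c("randomForestSRC_importance", "univariate.model.score", "E-mean"),
    each = 9
  ),
  value = c(
    -0.0034653967, -0.0002310563, 0.0007997677, 0.0020504755, 0.0084291392,
    0.0088886429, 0.0111077710, 0.0137876876, 0.1285040870,
    0.4085623679, 0.4474747475, 0.4882100750, 0.5371655104, 0.5669050051,
    0.5669882101, 0.5731948566, 0.5972083749, 0.6612903226,
    6.5, 7.0, 2.0, 7.0, 7.0, 3.5, 1.0, 6.0, 5.0
  )
)
threshold <- 0.5

# Same count as in the snippet above: all three methods are mixed together.
nselect_all <- sum(fval$value >= threshold, na.rm = TRUE)

# Per-method breakdown of how many values pass the threshold.
per_method <- tapply(fval$value >= threshold, fval$method, sum)

# Hypothetical adjustment (assumption): count only the ensemble rows.
ens <- fval[fval$method == "E-mean", ]
nselect_ens <- sum(ens$value >= threshold, na.rm = TRUE)

print(nselect_all)
print(per_method)
print(nselect_ens)
```

With these values, the 15 is made up of 6 rows from `univariate.model.score` and 9 rows from `E-mean`, so it exceeds the 9 features in the task. Even when only the `E-mean` rows are counted, all 9 pass, because the ensemble values here appear to be on a different scale than the base filter scores, so a threshold of 0.5 may not be meaningful for them.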