Allow bagging wrapper to work with no features in feature selection #1814

Closed
larskotthoff opened this issue May 27, 2017 · 2 comments

Comments

@larskotthoff
Sponsor Member

larskotthoff commented May 27, 2017

Example from https://stackoverflow.com/questions/44208492/mlr-package-r-feature-selection-sequential-forward-search-error-must-have-at-l:

library(mlr)

d <- data.frame(a = rnorm(1000, mean = 1),
                b = rnorm(1000, mean = 2),
                c = rnorm(1000, mean = 3),
                target = as.factor(rbinom(1000, 1, prob = 0.5)))

t <- makeClassifTask(data = d,
                     target = 'target',
                     positive = '1')

logreg.lrn <- makeLearner('classif.logreg')
logreg_bagged.lrn <- makeBaggingWrapper(logreg.lrn)

cntrl.sfs <- makeFeatSelControlSequential(method = "sfs",
                                          alpha = 0.01,
                                          max.features = 10,
                                          maxit = 3)

logreg_bagged_featsel.lrn <- makeFeatSelWrapper(logreg_bagged.lrn,
                                                resampling = makeResampleDesc('CV',
                                                                              iters = 3),
                                                measures = mmce,
                                                control = cntrl.sfs)

train(logreg_bagged_featsel.lrn, t)

gives

Assertion on '.newdata' failed: Must have at least 1 cols, but has 0 cols.

The documentation should at least mention that this isn't supported.

@annette987

I have another example of this. Sequential forward selection with glmnet doesn't work because glmnet expects at least two columns in the matrix passed to it (see the code and error below), so you cannot start from an empty model. Would it be possible to modify the FeatSelWrapper so that it can start with one or more variables already in the model, i.e. variables that must be included?

library(survival)
library(mlr)

data(veteran)
task_id = "MAS"
surv.task <- makeSurvTask(id = task_id, data = veteran, target = c("time", "status"))
ridge.lrn <- makeLearner(cl="surv.cvglmnet", id = "ridge", predict.type="response", alpha = 0, nfolds=5)

inner = makeResampleDesc("CV", iters=5, stratify=TRUE)	# Inner resampling (feature selection)
outer = makeResampleDesc("CV", iters=5, stratify=TRUE)	# Outer resampling (performance estimation)

ctrl.sfs = makeFeatSelControlSequential(method="sfs", maxit=20L, max.features=20)
wrap.sfs.lrn = makeFeatSelWrapper(
  ridge.lrn,
  resampling = inner, 
  measures = list(cindex), 
  control = ctrl.sfs,
  show.info = FALSE
)

model_id = 'ridge.sfs.featsel'
learners = list(wrap.sfs.lrn)
bmr = benchmark(learners, surv.task, outer, list(cindex), show.info = TRUE, models=TRUE, keep.extract = TRUE)
Task: MAS, Learner: ridge.featsel
Resampling: cross-validation
Measures:             cindex    
Error in glmnet(x, y, weights = weights, offset = offset, lambda = lambda,  : 
  x should be a matrix with 2 or more columns

pat-s changed the title from "Bagging wrapper doesn't work with no features" to "Allow bagging wrapper to work with no features in feature selection" on Dec 28, 2019
@pat-s
Member

pat-s commented Dec 28, 2019

Both issues are very niche and I'll label this as "wontfix".

The second one is model-specific and even more niche.
If this is really needed, I am happy to review a PR that solves it.

Otherwise, please file a request in {mlr3} to add support for this.

pat-s closed this as completed on Dec 28, 2019