Sometimes we reduce variables to only 1 factor level - but not all learners can work with such variables #21
Daniel, doesn't this only concern PREDICTION? And if so, I think I already took care of this in mlr in a general way, so it should not happen anymore?
Yes, 2) does not really make sense. I think we already discussed this a while ago. I will test it again on Monday with Karin.
If you test it, whatever the result, please add a unit test!
Tested and unit test added. We found both test cases that work fine and test cases that fail. Look at e0bab5e.
Have a look at these examples. The first one succeeds, the second one produces an error. The only difference is the second numeric parameter, which is missing in the second function.

library(mlrMBO)
f1 = function(x)
  ifelse(x$disc1 == "a", 2 * x$num1 - 1, 1 - x$num2)
ps1 = makeParamSet(
makeNumericParam("num1", lower = -2, upper = 1),
makeNumericParam("num2", lower = -1, upper = 2),
makeDiscreteParam("disc1", values = c("a", "b"))
)
f2 = function(x)
  ifelse(x$disc1 == "a", 2 * x$num1 - 1, 1 - x$num1)
ps2 = makeParamSet(
makeDiscreteParam("disc1", values = c("a", "b")),
makeNumericParam("num1", lower = 0, upper = 1)
)
ctrl = makeMBOControl(iters = 2, init.design.points = 10, infill.opt.focussearch.points = 100)
lrn = makeLearner("regr.kknn")
mbo(f1, ps1, learner = lrn, control = ctrl)
mbo(f2, ps2, learner = lrn, control = ctrl)
What is the status here? At the moment both examples from Daniel's previous post work fine.
I'm not sure if I am missing something at the moment, but have a look at this example:

library(mlrMBO)
par.set = makeParamSet(
makeNumericVectorParam("x", len = 5, lower = 0, upper = 1),
makeDiscreteParam("z", values = 1:10)
)
f = function(x) sum(x$x) + as.numeric(x$z)
learner = makeBaggingWrapper(makeLearner("regr.lm"), 2L)
learner = setPredictType(learner, "se")
control = makeMBOControl(init.design.points = 5L, iters = 2L, save.on.disk.at = numeric(0L))
control = setMBOControlInfill(control, crit = "ei")
res = mbo(f, par.set, learner = learner, control = control)
OK, this one fails.
I also found some cases that fail or produce warnings:

library(mlrMBO)
fun = function(x) {
ifelse(x$disc1 == "a", (x$num1-0.3)^2*(x$num1+2)*(x$num1+4)*(x$num1+0.1), (x$num1+0.2)^2*(x$num1-1.1)^2)
}
ps = makeParamSet(
makeDiscreteParam("disc1", values = c("a", "b")),
makeNumericParam("num1", lower = 0, upper = 1)
)
learner1 = makeBaggingWrapper(makeLearner("regr.lm"), bw.iters = 10L)
learner1 = setPredictType(learner1, "se")
learner2 = makeBaggingWrapper(makeLearner("regr.blackboost"), bw.iters = 10L)
learner2 = setPredictType(learner2, "se")
learner3 = makeBaggingWrapper(makeLearner("regr.mob"), bw.iters = 10L)
learner3 = setPredictType(learner3, "se")
learner4 = makeBaggingWrapper(makeLearner("regr.crs"), bw.iters = 10L)
learner4 = setPredictType(learner4, "se")
controlMBO = makeMBOControl(init.design.points = 10,
  iters = 5, save.on.disk.at = numeric(0L))
controlMBO = setMBOControlInfill(controlMBO, crit = "lcb", opt = "focussearch")
set.seed(2274)
mbo(fun, ps, learner = learner1, control = controlMBO)
set.seed(2274)
mbo(fun, ps, learner = learner2, control = controlMBO)
set.seed(2274)
mbo(fun, ps, learner = learner3, control = controlMBO)
set.seed(2274)
mbo(fun, ps, learner = learner4, control = controlMBO)
@jakobbossek If so, we can close.
learner3 triggered something mob-specific and should be fixed now.
learner2 caused an error because of no ...
The warning we see with learner1 seems to be simply bad luck. In the bagging, the bootstrap sample happened to contain "disc1" only in rows where it is "b". We possibly want to stratify on the factors, but this might be hard... (a rough sketch of the idea follows below).
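A minimal sketch of what such a stratification could look like, just to illustrate the idea; the function stratBootstrap and its arguments are made up here and this is not how mlr's bagging wrapper currently works:

# Illustrative only: draw the bootstrap sample within each level of the factor,
# so every level is guaranteed to appear in the resampled data.
stratBootstrap = function(data, factor.col) {
  rows.by.level = split(seq_len(nrow(data)), data[[factor.col]])
  idx = unlist(lapply(rows.by.level, function(rows)
    rows[sample.int(length(rows), replace = TRUE)]))
  data[idx, , drop = FALSE]
}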
An easier option here would be to simply learn on the non-constant features only. But then we get problems in predict. We can handle this via a preproc wrapper: we store which features were constant in training (and removed) and remove them in prediction as well. Maybe this is best for now.
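A minimal sketch of that idea, assuming mlr's makePreprocWrapper interface; the trainfun/predictfun names and the base learner are just placeholders:

library(mlr)

# Drop features that are constant in the training data and remember which ones
# were dropped, so they can be removed at prediction time as well.
trainfun = function(data, target, args) {
  feats = setdiff(names(data), target)
  const = feats[vapply(data[feats], function(x) length(unique(x)) <= 1L, logical(1L))]
  list(data = data[, setdiff(names(data), const), drop = FALSE],
    control = list(const = const))
}

predictfun = function(data, target, args, control) {
  data[, setdiff(names(data), control$const), drop = FALSE]
}

lrn = makePreprocWrapper(makeLearner("regr.lm"),
  train = trainfun, predict = predictfun)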
We agreed that we want to check whether all factor levels are present in the initial design. If not, an error is thrown. This is an easy, fast fix.
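A hypothetical sketch of such a check, assuming ParamHelpers' representation of scalar discrete parameters ($pars, $id, $type, $values); checkFactorLevels is illustrative, not mlrMBO's actual implementation:

# Illustrative only: stop if some level of a discrete parameter never occurs
# in the initial design.
checkFactorLevels = function(design, par.set) {
  discrete = Filter(function(p) p$type == "discrete", par.set$pars)
  for (p in discrete) {
    missing = setdiff(names(p$values), unique(as.character(design[[p$id]])))
    if (length(missing) > 0L)
      stop(sprintf("Initial design is missing level(s) %s of parameter '%s'.",
        paste(missing, collapse = ", "), p$id))
  }
  invisible(TRUE)
}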
How does this solve the problem in cases where we use bagging / a bagging wrapper?
ping
I think somebody needs to summarize the status here. I do have a general solution for all of these problems, I think, using the new vtreat package. But we cannot do that now, and it should be done in mlr.
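For reference, a minimal sketch of how vtreat is typically used for this kind of variable treatment; the toy data frame d and the variable names are made up for illustration, and how this would be wired into mlr is left open here:

library(vtreat)

# Toy data: a discrete variable with a rare level that can easily vanish
# from a resample or a local design.
set.seed(1)
d = data.frame(
  disc1 = factor(sample(c("a", "b"), 20, replace = TRUE, prob = c(0.9, 0.1))),
  num1 = runif(20),
  y = rnorm(20)
)

# Design a treatment plan on the training data and apply it consistently to
# new data; rare or unseen levels are encoded robustly instead of breaking.
plan = designTreatmentsN(d, varlist = c("disc1", "num1"), outcomename = "y")
d.treated = prepare(plan, d)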
I just reran the examples: 1 and 3 fail, 2 runs, and 4 gives warnings. So it depends on the learner, and a general fix has to go into mlr; I'm not sure we can solve it directly in MBO.
Now all examples work and I struggle to reproduce the problem. This was maybe resolved during the code cleanup (I've added some ...). If someone has a working example, please post it.
I would suggest we close this until someone can produce a similar problem again. OK? @berndbischl @jakobbossek @mllg @jakob-r Otherwise @berndbischl can/should have a look.
Agreed. |
Example with kknn:
I can think of 3 possible solutions: