
Sometimes we reduce variables to only 1 factor level - but not all learners can work with such variables #21

Closed
danielhorn opened this issue Jan 30, 2014 · 22 comments

@danielhorn (Collaborator) commented Jan 30, 2014

Example with kknn:

library(mlrMBO)
fun = function(x)
  sin(x$num2) + ifelse(x$disc1 == "a", sin(x$num1), 0)
ps = makeParamSet(
  makeDiscreteParam("disc1", values = c("a", "b")),
  makeNumericParam("num1", lower = 0, upper = 1, 
                   requires = quote(disc1 == "a")),
  makeNumericParam("num2", lower = 0, upper = 1)
)

res = mbo(fun, ps,
  learner = makeBaggingWrapper(makeLearner("regr.kknn"), 10L, predict.type = "se"),
  control = makeMBOControl(init.design.points = 20, iters = 10, infill.crit = "ei"))
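
The failure can be reduced to base R: model contrasts cannot be built for a factor with only one observed level (the same error surfaces later in this thread with regr.lm). A minimal reduction, assuming the learner builds a model matrix internally:

# Minimal base-R reduction (assumed mechanism): contrasts cannot be built
# for a factor with a single observed level.
d = data.frame(y = rnorm(5), disc1 = factor(rep("a", 5)))
lm(y ~ disc1, data = d)
# Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]):
#   contrasts can be applied only to factors with 2 or more levels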

I can think of 3 possible solutions:

  1. Guarantee inside mlrMBO (in the focus search) that every variable has at least 2 factor levels.
  2. Inside mlrMBO, before learning the model, remove variables with only 1 level.
  3. Force the user to use a preproc wrapper for their learner, which removes variables with only 1 factor level.
@berndbischl
Copy link
Member

Daniel, doesn't this only concern PREDICTION?
So what you say in 2) makes no real sense ("learning")?
Because in focussearch we never train a model?

And if so, I think I already took care of this in mlr in a general way so it should not happen anymore?

@danielhorn (Collaborator, Author)

Yes, 2) does not make real sense. I think we already discussed this a while ago.

I will test it again on Monday with Karin.

@berndbischl (Member)

If you test it, whatever the result, please add a unit test!

@danielhorn (Collaborator, Author)

Tested and unit test added.

We found both test cases that work fine and ones that fail. See e0bab5e.

@danielhorn (Collaborator, Author) commented Mar 31, 2014

Have a look at these examples. The first one succeeds, the second one produces an error. The only real difference is the second numeric param, which only the first example has.

library(mlrMBO)

f1 = function(x)
  ifelse(x$disc1 == "a", 2 * x$num1 - 1, 1 - x$num2)
ps1 = makeParamSet(
  makeNumericParam("num1", lower = -2, upper = 1),
  makeNumericParam("num2", lower = -1, upper = 2),
  makeDiscreteParam("disc1", values = c("a", "b"))
)

f2 = function(x)
  ifelse(x$disc1 == "a", 2 * x$num1 - 1, 1 - x$num1)
ps2 = makeParamSet(
  makeDiscreteParam("disc1", values = c("a", "b")),
  makeNumericParam("num1", lower = 0, upper = 1)
)

ctrl = makeMBOControl(iters = 2, init.design.points = 10, infill.opt.focussearch.points = 100)
lrn = makeLearner("regr.kknn")
mbo(f1, ps1, learner = lrn, control = ctrl)
mbo(f2, ps2, learner = lrn, control = ctrl)

@jakobbossek (Contributor)

What is the status here? At the moment both examples of Daniel's previous post work fine.

@danielhorn (Collaborator, Author)

I'm not sure if I am missing something at the moment, but have a look at this example:

library(mlrMBO)
par.set = makeParamSet(
  makeNumericVectorParam("x", len = 5, lower = 0, upper = 1),
  makeDiscreteParam("z", values = 1:10)
)
f = function(x) sum(x$x) + as.numeric(x$z)
learner = makeBaggingWrapper(makeLearner("regr.lm"), 2L)
learner = setPredictType(learner, "se")
control = makeMBOControl(init.design.points = 5L, iters = 2L, save.on.disk.at = numeric(0L))
control = setMBOControlInfill(control, crit = "ei")
res = mbo(f, par.set, learner = learner, control = control)

@jakobbossek (Contributor)

Ok, this one fails.

@KarinSchork (Contributor) commented Aug 18, 2014

I also found some cases which fail or produce warnings:

library(mlrMBO)

fun = function(x) {
  ifelse(x$disc1 == "a",
    (x$num1 - 0.3)^2 * (x$num1 + 2) * (x$num1 + 4) * (x$num1 + 0.1),
    (x$num1 + 0.2)^2 * (x$num1 - 1.1)^2)
}
ps = makeParamSet(
  makeDiscreteParam("disc1", values = c("a", "b")),
  makeNumericParam("num1", lower = 0, upper = 1)
)

learner1 = makeBaggingWrapper(makeLearner("regr.lm"), bw.iters = 10L)
learner1 = setPredictType(learner1, "se")
learner2 = makeBaggingWrapper(makeLearner("regr.blackboost"), bw.iters = 10L)
learner2 = setPredictType(learner2, "se")
learner3 = makeBaggingWrapper(makeLearner("regr.mob"), bw.iters = 10L)
learner3 = setPredictType(learner3, "se")
learner4 = makeBaggingWrapper(makeLearner("regr.crs"), bw.iters = 10L)
learner4 = setPredictType(learner4, "se")


controlMBO = makeMBOControl(init.design.points = 10,
  iters = 5, save.on.disk.at = numeric(0L))
controlMBO = setMBOControlInfill(controlMBO, crit = "lcb", opt = "focussearch")

set.seed(2274)
mbo(fun, ps, learner = learner1, control = controlMBO)
set.seed(2274)
mbo(fun, ps, learner = learner2, control = controlMBO)
set.seed(2274)
mbo(fun, ps, learner = learner3, control = controlMBO)
set.seed(2274)
mbo(fun, ps, learner = learner4, control = controlMBO)

@berndbischl (Member)

@jakobbossek
Please check if this is all tested and works.

If so, we can close.

@mllg (Member) commented Dec 11, 2014

learner3 triggered something mob-specific and should be fixed now.

@jakobbossek (Contributor)

learner2 caused an error because of a missing propose.time argument in extras on model failure. This is fixed now.

@berndbischl (Member)

The warning we see with learner1 seems to be simply bad luck: in the bagging we happen to sample "disc1" only in rows where it is "b".

We possibly want to stratify on the factors, but this might be hard...
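
A rough sketch of the stratification idea (hypothetical helper, not mlr code): draw the bootstrap indices within each factor level, so no bag can lose a level entirely.

# Hypothetical stratified bootstrap: sample row indices within each level of
# the stratification column; every non-empty level stays in every bag.
stratifiedBag = function(data, strat.col) {
  groups = split(seq_len(nrow(data)), data[[strat.col]])
  idx = unlist(lapply(groups, function(i) sample(i, length(i), replace = TRUE)))
  data[idx, , drop = FALSE]
}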

@berndbischl (Member)

An easier option here would be to simply learn on the non-constant features. But then we get problems in predict. We can handle this via a preproc wrapper: we store which features were constant in training (and removed) and remove them in prediction as well. Maybe this is best for now.
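
A minimal sketch of that idea, using mlr's makePreprocWrapper (the wrapper name and details are hypothetical):

library(mlr)
# Hypothetical wrapper: drop factor columns with a single observed level in
# training, remember which were dropped, and drop the same columns in predict.
makeDropConstFactorWrapper = function(learner) {
  makePreprocWrapper(learner,
    train = function(data, target, args) {
      const = vapply(data, function(col)
        is.factor(col) && nlevels(droplevels(col)) < 2L, logical(1L))
      const[names(data) == target] = FALSE
      list(data = data[, !const, drop = FALSE],
        control = list(dropped = names(data)[const]))
    },
    predict = function(data, target, args, control) {
      data[, setdiff(names(data), control$dropped), drop = FALSE]
    })
}

# Usage sketch: wrap the base learner, so each bag drops its own constant columns.
# lrn = makeBaggingWrapper(makeDropConstFactorWrapper(makeLearner("regr.lm")), 10L)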

@berndbischl added this to the v0.1 milestone Feb 10, 2016
@jakob-r (Member) commented Feb 10, 2016

We agreed that we want to check the initial design for whether all factor levels are present. If not, an error is thrown. This is an easy, fast fix.
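
A sketch of that check (hypothetical helper; parameter-set internals as in ParamHelpers):

# Hypothetical check: stop if the initial design misses any level of a
# discrete parameter.
checkInitDesignLevels = function(design, par.set) {
  for (p in Filter(function(p) p$type == "discrete", par.set$pars)) {
    miss = setdiff(names(p$values), unique(as.character(design[[p$id]])))
    if (length(miss) > 0L)
      stop(sprintf("initial design misses level(s) %s of parameter '%s'",
        paste(miss, collapse = ", "), p$id))
  }
}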

@jakobbossek (Contributor)

How does this solve the problem in cases where we use bagging / a bagging wrapper?
Even if all factor levels are covered for all discrete parameters in the initial design, the training sets generated for bagging might be awkward and contain only a single factor level.
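
A quick back-of-the-envelope check (the numbers are an assumed scenario): with 10 design points of which only 2 have disc1 == "b", a bootstrap bag of size 10 misses "b" with probability (8/10)^10, so degenerate bags are quite likely across 10 bagging iterations.

p.miss = (8 / 10)^10   # ~0.107: one bag misses level "b"
1 - (1 - p.miss)^10    # ~0.68: at least one of 10 bags is degenerate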

@jakobbossek (Contributor)

ping

@berndbischl (Member)

I think somebody needs to summarize the status here.
So do we still have examples that fail? How important are they?

I do have a general solution for all of these problems, I think, using the new vtreat package. But we cannot do that now, and it should be done in mlr.
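
For reference, the vtreat idea in a nutshell (a hedged sketch, not the planned mlr integration): design a treatment plan on the training data and apply the same plan at prediction time, so constant or novel factor levels get encoded consistently instead of breaking contrasts.

library(vtreat)
# Sketch: a numeric-outcome treatment plan; factor variables are turned into
# stable numeric indicator/impact columns.
d = data.frame(disc1 = factor(c("a", "a", "b", "a")), num1 = runif(4), y = rnorm(4))
plan = designTreatmentsN(d, varlist = c("disc1", "num1"), outcomename = "y")
d.treated = prepare(plan, d)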

@ja-thomas (Contributor)

Just reran the examples:


f = function(x) {
  ifelse(x$disc1 == "a",
    (x$num1 - 0.3)^2 * (x$num1 + 2) * (x$num1 + 4) * (x$num1 + 0.1),
    (x$num1 + 0.2)^2 * (x$num1 - 1.1)^2)
}
ps = makeParamSet(
  makeDiscreteParam("disc1", values = c("a", "b")),
  makeNumericParam("num1", lower = 0, upper = 1)
)


fun = makeSingleObjectiveFunction(fn = f, par.set = ps, has.simple.signature = FALSE) 

learner1 = makeBaggingWrapper(makeLearner("regr.lm"), bw.iters = 10L)
learner1 = setPredictType(learner1, "se")
learner2 = makeBaggingWrapper(makeLearner("regr.blackboost"), bw.iters = 10L)
learner2 = setPredictType(learner2, "se")
learner3 = makeBaggingWrapper(makeLearner("regr.mob"), bw.iters = 10L)
learner3 = setPredictType(learner3, "se")
learner4 = makeBaggingWrapper(makeLearner("regr.crs"), bw.iters = 10L)
learner4 = setPredictType(learner4, "se")


controlMBO = makeMBOControl()
controlMBO = setMBOControlTermination(controlMBO, iters = 5)
controlMBO = setMBOControlInfill(controlMBO, crit = "cb", opt = "focussearch")

des = generateDesign(n = 10, ps)

set.seed(2274)
mbo(fun, des, learner = learner1, control = controlMBO) # fails
 > Error in `contrasts<-`(`*tmp*`, value = contr.funs[1 + isOF[nn]]) : 
  contrasts can be applied only to factors with 2 or more levels 

set.seed(2274)
mbo(fun, des, learner = learner2, control = controlMBO) # works
set.seed(2274)
mbo(fun, des, learner = learner3, control = controlMBO) # fails
>  Error in trainLearner.regr.mob(.learner = list(id = "regr.mob", type = "regr",  : 
  Failed to fit party::mob. Some coefficients are estimated as NA 

set.seed(2274)
mbo(fun, des, learner = learner4, control = controlMBO) # warnings
> There were 50 or more warnings (use warnings() to see the first 50)
> warnings()
> Warning messages:
> 1: In krscvNOMAD(xz = xz, y = y, degree.max = degree.max,  ... :
   optimal degree equals search maximum (3): rerun with larger degree.max

1 and 3 fail, 2 runs, 4 gives warnings.

So it depends on the learner, and a general fix has to be in mlr; I'm not sure we can solve it directly in mlrMBO.

@mllg (Member) commented Nov 22, 2016

Now all examples work and I struggle to reproduce the problem. This was maybe resolved during code cleanup (I added some drop = FALSE, replaced sapply with vapply, etc.).

If someone has an example that still fails, please post it.

@ja-thomas (Contributor)

I would suggest we close this until someone can produce a similar problem again.

OK? @berndbischl @jakobbossek @mllg @jakob-r

Otherwise @berndbischl can/should have a look.

@mllg (Member) commented Jan 2, 2017

Agreed.

@mllg closed this as completed Jan 2, 2017