Error when default argument is passed to par.vals in learner xgboost #1191

Closed
philippstats opened this issue Aug 24, 2016 · 17 comments

@philippstats

The default of argument missing in xgboost is set to 0 in par.set and NA_real_ in par.vals.

When I pass the default par.vals to par.vals:

      makeLearner("classif.xgboost", par.vals = list(missing = NA_real_))

this error occurs:

      Error in setHyperPars2.Learner(learner, insert(par.vals, args)) : NA is not feasible for parameter 'missing'!

It seems that makeNumericLearnerParam does not accept NA_real_ as a numeric value.

This also fails:

      makeLearner("classif.xgboost", par.vals = list(missing = NA))

@larskotthoff

missing needs to be a number and not NA. It's set to NA "as this is how mlr expects missing values to be encoded" (see the note).

Does that answer your question?

@berndbischl

this can now be solved, very fortunately, with @jakob-r's new special.vals option in PH (ParamHelpers).

thx for detecting this.

@berndbischl

> missing needs to be a number and not NA. It's set to NA "as this is how mlr expects missing values to be encoded" (see the note).
>
> Does that answer your question?

it does not really answer the question. we define a default value, but mlr then does not let you set that value explicitly. ergo: this is stupid, and @philippstats has detected a real problem.

the problem is that some params allow certain values that don't "fit" into the formal param type of PH. very often, but not always, these are also the default values. example:
a numeric param that also accepts values like NA, NULL, "auto", and so on.
PH now has an option special.vals for param definitions that lets you list these extra values and declares them feasible as well. we should have done this years ago. thx to @jakob-r we have it now.
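
A minimal base-R sketch of the idea, for illustration only: it is not the ParamHelpers implementation, and `make_numeric_param`, `is_feasible` and `special_vals` are hypothetical names. A typed feasibility check is extended with a whitelist of extra values that are accepted regardless of the formal type.

```r
# Hypothetical stand-in for a typed numeric parameter with optional special values.
make_numeric_param <- function(id, lower = -Inf, upper = Inf, special_vals = list()) {
  list(id = id, lower = lower, upper = upper, special_vals = special_vals)
}

is_feasible <- function(par, x) {
  # Special values are accepted regardless of the formal type.
  for (sv in par$special_vals) {
    if (identical(x, sv)) return(TRUE)
  }
  # Otherwise require a single non-missing number inside the bounds.
  is.numeric(x) && length(x) == 1L && !is.na(x) && x >= par$lower && x <= par$upper
}

strict  <- make_numeric_param("missing")
relaxed <- make_numeric_param("missing", special_vals = list(NA, NA_real_, NULL))

is_feasible(strict, NA_real_)   # rejected: NA is not a regular number
is_feasible(relaxed, NA_real_)  # accepted via the special values
is_feasible(relaxed, 0)         # ordinary numbers still pass the typed check
```

Because this sketch compares with `identical()`, the logical `NA` and the numeric `NA_real_` count as different values, so both have to be listed.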

@larskotthoff

Why would you ever want to set these special values?

@berndbischl

berndbischl commented Aug 24, 2016

> Why would you ever want to set these special values?

i don't get what you mean. because they are valid settings of a parameter? and you want to control that param from mlr? and not have mlr forbid you to set these values?
i tried to explain it above with some examples.

@larskotthoff

At least in this example it's not a valid value. if you set NA in a case where missing is used, it would cause an error from xgboost, wouldn't it? And if you set it when the parameter isn't used, there's no need to set it to that specific value. The default is just something internal as far as I can see.

Do you have an example where it would make sense to set a parameter to a value that's not allowed by its definition?

@berndbischl

here is another example:
let's say you have knn. it has a param "k". mlr defines this as integer, maybe with lower = 1, upper = Inf, or whatever.
now this knn package also has a heuristic to set k automatically. in this case, k = "auto" is what you need to pass. before this special.vals thing from @jakob-r this was not possible, so you could not select that option. the only way out was to make the Param "untyped".
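
A hedged sketch of how the knn case might look, assuming a ParamHelpers version that supports `special.vals`. The parameter definition is hypothetical and does not come from an actual mlr learner.

```r
library(ParamHelpers)

# Hypothetical definition of "k" for a knn learner; "auto" is an assumed special value.
k_par <- makeIntegerParam("k", lower = 1L, special.vals = list("auto"))

isFeasible(k_par, 5L)      # ordinary integer within the bounds
isFeasible(k_par, "auto")  # allowed only because it is listed in special.vals
isFeasible(k_par, 0L)      # below the lower bound and not a special value
```

The parameter stays typed, so bounds checking still applies to ordinary values, while the one non-integer setting is whitelisted explicitly.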

@berndbischl

> Do you have an example where it would make sense to set a parameter to a value that's not allowed by its definition?

i have just posted one (the knn). but for xgboost it is the SAME. NA is a VALID SETTING of the param "missing". you are somehow not getting the point here. xgboost has a param "missing". you can set it to numbers, like 0 or 99, or to NA_real_. but the latter mlr forbids, completely arbitrarily.

@larskotthoff

Ok, fair enough. The xgboost documentation doesn't say that NA is a valid value though.

@larskotthoff

The special values should be documented somewhere.

@berndbischl

> Ok, fair enough. The xgboost documentation doesn't say that NA is a valid value though.

well, in this specific example, the "missing" param, xgboost really sucks. they also changed this back and forth (i had to look into and fix this in general because somebody reported problems there).
at a certain point i also wasn't sure what exactly xgboost does here.

but they say one should basically select a certain numeric value that encodes "missingness", and in R NA_real_ is such a numeric value. so their docs are kinda ok in this regard now.

the real problem is something else! their default value is missing = NULL. this has 2 problems:
a) what the NULL means is undocumented in xgboost (and i don't know what it really means, even now)
b) it is a PERFECT example of such a "special val", because NULL is not something that is feasible for a NumericParam in PH.

@berndbischl

> The special values should be documented somewhere.

where? isn't this documented in PH? or do you mean in the "learner API" in the mlr tutorial appendix?

@larskotthoff

In the tutorial.

But in particular the special values that are allowed/make sense should be documented.

@philippstats

My use case was:
I tune some parameters for xgboost and obtain the best parameters. I also have fixed/default parameters.
I want to pass both (tuned and fixed) to makeLearner to create a learner. But this does not work, since the default parameter missing = NA is not accepted as valid.
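
A small sketch of that workflow in base R, with made-up values: the parameter names besides `missing` and all numbers are illustrative assumptions, not taken from an actual tuning run. `modifyList()` lets tuned values override fixed ones while keeping an `NA_real_` entry intact.

```r
# Fixed/default settings, including the special value NA_real_ (values are illustrative).
fixed <- list(missing = NA_real_, nrounds = 100L, eta = 0.3)
# Hypothetical tuning result.
tuned <- list(eta = 0.05, max_depth = 4L)

# Tuned values override fixed ones; entries only present in `fixed` are kept.
par.vals <- modifyList(fixed, tuned)
str(par.vals)

# With mlr installed and NA accepted for "missing", the list would then be passed on:
# lrn <- makeLearner("classif.xgboost", par.vals = par.vals)
```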

@berndbischl

> My use case was:
> I tune some parameters for xgboost and obtain the best parameters. I also have fixed/default parameters.
> I want to pass both (tuned and fixed) to makeLearner to create a learner. But this does not work, since the default parameter missing = NA is not accepted as valid.

thx. yup, i understand that.

@berndbischl

this is fixed now in PR #1225

it looks like this:

      makeNumericLearnerParam(id = "missing", default = NULL, tunable = FALSE, when = "both",
        special.vals = list(NA, NA_real_, NULL)),

@ja-thomas

Then we can close here?
