centered bols has trouble with newdata #70

Closed
carlganz opened this issue Feb 14, 2017 · 3 comments

@carlganz

library(mboost)

# example from docs
data("bodyfat", package = "TH.data")

mod1 <- mboost(DEXfat ~ btree(age) + bols(waistcirc, center=TRUE) + bbs(hipcirc),
              data = bodyfat)

mod2 <- mboost(DEXfat ~ btree(age) + bols(waistcirc, center=FALSE) + bbs(hipcirc),
               data = bodyfat)

predict(mod1, bodyfat)
# errors
predict(mod2, bodyfat)
# no errors
predict(mod1)
# no errors

I'm guessing that the levels of waistcirc change when centered so the levels in newdata don't match the model even though the data is the same as the data used to build the model.

@hofnerb hofnerb added the bug label Feb 15, 2017
@hofnerb hofnerb self-assigned this Feb 15, 2017
@hofnerb
Member

hofnerb commented Feb 15, 2017

The problem is different from what you think: center = TRUE doesn't work for bols (anymore). However, bols can take multiple variables and computes the least squares (or penalized least squares) solution for these variables:

bols(x1, x2) is essentially equivalent to a base-learner of the form lm(u ~ x1 + x2), where u is the negative gradient.
bols(x1, x2, intercept = FALSE) is essentially equivalent to a base-learner of the form lm(u ~ x1 + x2 - 1), where u is the negative gradient.

What happens here is that you accidentally specify another variable: center is passed to bols as an additional variable. In the first model it seems to be treated as an intercept (since as.numeric(TRUE) equals 1), while in the second model you add a constant variable equal to zero, which makes no sense at all.
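
To make the arithmetic concrete, here is a minimal base-R sketch of the two cases using plain lm. It does not call mboost and does not claim to reproduce its internals; the simulated data and the names x1, u, const_true and const_false are illustrative assumptions, with u standing in for the negative gradient.

```r
# Illustrative data only; u plays the role of the negative gradient.
set.seed(1)
n  <- 50
x1 <- rnorm(n)
u  <- 2 + 3 * x1 + rnorm(n, sd = 0.1)

# A logical constant coerced to numeric and repeated to length n
const_true  <- as.numeric(rep(TRUE, n))   # a column of ones
const_false <- as.numeric(rep(FALSE, n))  # a column of zeros

# Reference fit with the usual intercept
fit_ref <- lm(u ~ x1)

# A column of ones, with the explicit intercept removed, acts as the intercept
fit_true <- lm(u ~ x1 + const_true - 1)
same_fit <- all.equal(unname(coef(fit_ref)),
                      unname(coef(fit_true)[c("const_true", "x1")]))

# A column of zeros carries no information, so lm cannot estimate its coefficient
fit_false <- lm(u ~ x1 + const_false)

print(same_fit)
print(coef(fit_false))
```

In this sketch the constant-one column simply reproduces the intercept fit, while the constant-zero column gets an NA coefficient, which is the sense in which the second specification adds nothing useful.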

As you are the second person running into this problem, we need to make sure that an error is thrown in that case.

@hofnerb
Member

hofnerb commented Feb 15, 2017

@carlganz You stated that the example was taken from the docs. Does this include the (wrong) usage of center in bols or only the general model? I could not find any occurrence.

@carlganz
Author

It wasn't taken from the docs. Sorry if I made it seem that way. Thanks for the clarification.
