helper function #17

Closed
lauravana opened this issue Sep 15, 2015 · 7 comments

Comments

@lauravana commented Sep 15, 2015

I am currently using the ctm package from R-Forge, which also implements model boosting for ordinal outcomes (i.e., ordered factors). The predict.ctm() function throws an error, probably because of the following check:

function check_newdata() in mboost/R/helpers.R/
line: if (!all(sapply(newdata[nm], class) == sapply(mf, class)))

This will give an error when the outcome variable has class c("ordered", "factor"), i.e. a class vector of length two.
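
To illustrate why the comparison in that line can break, here is a minimal sketch in base R. The data frame `df` and its columns are hypothetical toy data, not taken from ctm or mboost; the sketch only assumes that one variable has a single class and another has two.

```r
# Toy data (hypothetical): one numeric column and one ordered factor
df <- data.frame(x = c(1.5, 2.5, 3.5))
df$y <- factor(c("low", "mid", "high"),
               levels = c("low", "mid", "high"), ordered = TRUE)

class(df$x)  # one class
class(df$y)  # two classes: "ordered" and "factor"

# The class vectors differ in length, so sapply() cannot simplify
# the result and returns a list instead of a character vector
cls <- sapply(df, class)
is.list(cls)

# Comparing two such lists with `==` raises an error; capture it
res <- tryCatch(cls == cls, error = function(e) e)
inherits(res, "error")
```

Because the check compares the data with itself here, the failure comes from the shape of the `sapply()` result rather than from any real class mismatch.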

@sbrockhaus (Member) commented Sep 16, 2015

I think I have the same problem with check_newdata() when using bl1 %X% bl2 with a variable that has more than one class and calling predict() with newdata. I think the error occurs whenever a base-learner contains more than one variable and at least one of those variables has more than one class. Consider the following minimal example (based on the mboost help):

library(mboost)
data("volcano", package = "datasets")
vol <- as.vector(volcano)
x1 <- 1:nrow(volcano)
x2 <- 1:ncol(volcano)
x <- expand.grid(x1, x2)
# generate a variable with more than one class
x$factorz <- I(gl(2, 2, length=nrow(x)))

modx <- mboost(vol ~ bbs(Var2, df = 3, knots = 10) %X%
                 bols(factorz, df = 3), data = x,
               control = boost_control(nu = 0.25))

# try to predict the data, gives error...
test <- predict(modx, newdata=x)

# ... as this comparison in check_newdata() is not possible, 
# even if the data is the same
sapply(x, class) == sapply(x, class)

@sbrockhaus (Member) commented Sep 16, 2015

This is an ugly hack, but it works for my example. I simply override the current version of check_newdata() with a function that does not check the class of the variables in newdata at all!
Just run this after loading mboost:

my_check_newdata <- function(newdata, blg, mf, to.data.frame = TRUE) {
  nm <- names(blg)
  if (!all(nm %in% names(newdata)))
    stop(sQuote("newdata"),
         " must contain all predictor variables,",
         " which were used to specify the model.")
  if (!class(newdata) %in% c("list", "data.frame"))
    stop(sQuote("newdata"), " must be either a data.frame or a list")
  if (any(duplicated(nm)))  ## removes duplicates
    nm <- unique(nm)
  #if (!all(sapply(newdata[nm], class) == sapply(mf, class)))
  #  warning("Some variables in ", sQuote("newdata"),
  #          " do not have the same class as in the original data set",
  #          call. = FALSE)
  ## subset data
  mf <- newdata[nm]
  if (is.list(mf) && to.data.frame)
    mf <- as.data.frame(mf)
  return(mf)
}

library(mboost)
## write my version of check_newdata() into the namespace of mboost
assignInNamespace("check_newdata", my_check_newdata, ns="mboost", envir=as.environment("package:mboost"))

# check whether it worked
mboost:::check_newdata

@lauravana (Author) commented Sep 16, 2015

Thanks! I did a similar hack by replacing

if (!all(sapply(newdata[nm], class) == sapply(mf, class)))
with
if (!all(sapply(1:length(mf), function(i) is(nd[nm][[i]], class(mf[[i]])) & is(mf[[i]], class(nd[nm][[i]])))))

@sbrockhaus (Member) commented Sep 16, 2015

Thank you very much, I think your hack is nicer than mine. I just had to replace nd with newdata to make it work, i.e. I replaced
if (!all(sapply(newdata[nm], class) == sapply(mf, class)))
with
if (!all(sapply(1:length(mf), function(i) is(newdata[nm][[i]], class(mf[[i]])) & is(mf[[i]], class(newdata[nm][[i]])))))
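
For comparison, a per-variable check can also be sketched with `identical()` on the class vectors. This is only an illustration under assumptions: `same_classes()` is a hypothetical helper, not part of mboost and not the fix that was committed, and it assumes both data sets are named lists or data frames containing the variables in `nm`.

```r
# Hypothetical helper (not part of mboost): TRUE if every variable in `nm`
# has exactly the same class vector in both data sets
same_classes <- function(newdata, mf, nm = names(mf)) {
  all(mapply(function(a, b) identical(class(a), class(b)),
             newdata[nm], mf[nm]))
}

# Toy data: an integer column and an ordered factor
old <- data.frame(x = 1:3)
old$f <- factor(c("a", "b", "c"), ordered = TRUE)

new_ok <- old                          # same classes as `old`
new_bad <- old
new_bad$f <- as.character(new_bad$f)   # class changed to "character"

same_classes(new_ok, old)
same_classes(new_bad, old)
```

Comparing variable by variable avoids the `==` on a list of class vectors, so variables with more than one class do not cause an error.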

hofnerb added a commit that referenced this issue Sep 16, 2015

@hofnerb (Member) commented Sep 16, 2015

Thanks for the bug report.

I've used a different (but similar) fix, as I had already modified check_newdata in pkg/mboostPatch/R/helpers.R. Please use

library("devtools")
install_github("hofnerb/mboost/pkg/mboostPatch")
library("mboost") 

and check your code with this package. Please let me know if the error persists.

@lauravana (Author) commented Sep 16, 2015

Thanks for the fix. Everything works fine for me!

@hofnerb closed this Sep 16, 2015

@sbrockhaus (Member) commented Sep 16, 2015

Thanks a lot, for me it works as well!
