bols for factors with unobserved levels breaks #47
Comments
|
Reading your issue, I wondered what happens within cvrisk() when a factor level is empty in one of the folds. I constructed the following example:
library(mboost)
set.seed(123)
z <- factor(sample(1:5, 100, replace = TRUE), levels = 1:6)
y <- rnorm(100)
m <- mboost(y ~ bols(z))
## Create resampling folds
myfolds <- cv(model.weights(m), "kfold")
## In the first fold, set the weights of all observations with factor level 1 to 0,
## so that this factor level is empty in this fold
myfolds[ z == 1 , 1] <- 0
## cvrisk does not work for first fold
cv1 <- cvrisk(m, folds = myfolds)
## fit the model of the first fold by hand
## works fine by dropping factor level
y_fold1 <- y[myfolds[ ,1] == 1]
z_fold1 <- z[myfolds[ ,1] == 1]
m_fold1 <- mboost(y_fold1 ~ bols(z_fold1))
## try to fit the same model using weights, breaks with error
m_fold1 <- mboost(y ~ bols(z), weights = myfolds[ , 1])
## Error in solve.default(XtX, crossprod(X, y), LINPACK = FALSE) :
##   system is computationally singular: reciprocal condition number = 2.43337e-18
Do you think this is a problem?
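A minimal base-R sketch of why the weighted fit can become singular. It assumes that the base-learner solves a weighted least-squares problem on an intercept plus treatment-coded dummy design, which is consistent with the solve(XtX, ...) call in the error above but is not taken from the mboost sources. The names w, X and XtWX are purely illustrative.

```r
set.seed(123)
z <- factor(sample(1:5, 100, replace = TRUE))

# Zero weight for every observation with level 1 (mimics the modified fold)
w <- as.numeric(z != "1")

# Assumed design: intercept + treatment-coded dummies for levels 2 to 5
X <- model.matrix(~ z)

# Weighted cross-product X' W X; each row i of X is scaled by w[i]
XtWX <- crossprod(X * w, X)

# Among the rows with non-zero weight, the intercept column equals the
# sum of the dummy columns, so the matrix is rank deficient
rank_XtWX <- qr(XtWX)$rank
rank_XtWX < ncol(X)
```

Fitting on the subset works because the empty level is dropped and the design is rebuilt without it, whereas the weighted fit keeps all columns of the full design.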
|
@davidruegamer does this change in the |
|
@sbrockhaus you mean the bootstrapped "confidence intervals" for which resampling is done on subject-level? I actually did the |
|
We modified Regarding confidence intervals:
Currently, the following code breaks:
### check confidence intervals for factors with very small level frequencies
z <- factor(c(sample(1:5, 100, replace = TRUE), 6), levels = 1:6)
y <- rnorm(101)
mod <- mboost(y ~ bols(z))
confint(mod)
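A rough sketch of why a level with a single observation is fragile under resampling. It assumes that the intervals come from an ordinary nonparametric bootstrap over the 101 observations and that the failure is triggered when level 6 is absent from a resample. Both points are assumptions, and B below is an illustrative number of replicates rather than a value read from the package.

```r
n <- 101   # number of observations; level 6 occurs exactly once
B <- 1000  # assumed number of bootstrap replicates (illustrative)

# Probability that one bootstrap resample omits the single level-6 observation
p_missing <- (1 - 1 / n)^n

# Probability that at least one of the B resamples has no level-6 observation
p_any_missing <- 1 - (1 - p_missing)^B

p_missing      # roughly 0.37
p_any_missing  # practically 1
```

Under these assumptions more than a third of the resamples contain an empty factor level, so the problem would show up in essentially every call.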
|
I moved this to a new issue as it touches a similar yet distinct problem. The original issue was solved with the update. |
Thus, perhaps we should use droplevels() within bols and issue a warning if any levels are dropped.
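A minimal sketch of what such a check could look like. The helper drop_unused_levels() is hypothetical and not part of mboost; it only illustrates the combination of droplevels() and a warning on a plain factor.

```r
# Hypothetical helper (not an mboost function): drop unobserved factor
# levels and warn about which ones were removed
drop_unused_levels <- function(x) {
  stopifnot(is.factor(x))
  unused <- setdiff(levels(x), unique(as.character(x)))
  if (length(unused) > 0) {
    warning("Dropping unobserved factor level(s): ",
            paste(unused, collapse = ", "))
    x <- droplevels(x)
  }
  x
}

# Levels 4 to 6 are declared but never observed
z <- factor(c(1, 2, 2, 3), levels = 1:6)
z_clean <- suppressWarnings(drop_unused_levels(z))
levels(z_clean)
```

Note that dropping levels at fit time changes the coefficient vector, so predictions for new data that contain a dropped level would need separate handling.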