L1 search and I() terms #404

Closed
fweber144 opened this issue Apr 5, 2023 · 0 comments · Fixed by #408
Labels
bug Bugs.

Comments

@fweber144
Collaborator

During an L1 search, I() terms may cause an error:

N <- 41L
K <- 5L
K_fac <- 4L
set.seed(457324)
dat <- data.frame(
  y = rnorm(N),
  xcat = gl(n = K, k = floor(N / K), length = N,
            labels = paste0("gr", seq_len(K))),
  xfac = sample(gl(n = K_fac, k = floor(N / K_fac), length = N,
                   labels = paste0("fgr", seq_len(K_fac)))),
  xlog = sample(rep_len(c(TRUE, FALSE), length.out = N))
)
levels(dat$xfac) <- c(levels(dat$xfac),
                      paste0("fgr", (K_fac + 1L):(K_fac + 2L)))
dat$xcat <- as.character(dat$xcat)

library(rstanarm)
rfit <- stan_glm(y ~ xcat + xfac + I(!xlog),
                 data = dat,
                 seed = 1140350788,
                 chains = 1, iter = 500,
                 refresh = 0)

library(projpred)
# debug(projpred:::search_L1)
# debug(projpred:::collapse_contrasts_solution_path)
cvvs <- cv_varsel(rfit,
                  ### The issue does not occur with forward search:
                  # method = "forward",
                  ###
                  nclusters = 1,
                  nclusters_pred = 1,
                  seed = 46782345)

This gives the following error:

Error in str2lang(x) : <text>:1:20: unexpected numeric constant
1: . ~ xfac + I(!xlog)TRUE
                       ^
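The parse failure itself can be reproduced without fitting a model. The following is a minimal sketch, assuming only that the string shown in the error message is what reaches str2lang(); the variable names are illustrative and not taken from projpred:

```r
# The string from the error message: "TRUE" is glued onto the I() term.
bad <- ". ~ xfac + I(!xlog)TRUE"
# The same string with the term collapsed to its original label.
good <- ". ~ xfac + I(!xlog)"

# str2lang() cannot parse `bad`; capture the error message instead of stopping.
res <- tryCatch(str2lang(bad), error = function(e) conditionMessage(e))
print(res)

# The collapsed version parses to a call without problems.
ok <- str2lang(good)
print(ok)
```

So the error is only a symptom: somewhere upstream, the model-matrix column name I(!xlog)TRUE was not mapped back to the term I(!xlog).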

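The cause seems to be that collapse_contrasts_solution_path() uses the model-matrix column names as regular expressions but escapes only + in them. Other regexp-special characters, such as the parentheses in I(!xlog)TRUE, are left unescaped, so the pattern does not match the column name literally: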

projpred/R/formula.R

Lines 757 to 783 in a6ee4f9

collapse_contrasts_solution_path <- function(formula, path, data) {
  tt <- terms(formula)
  terms_ <- attr(tt, "term.labels")
  for (term in terms_) {
    # TODO: In the following model.matrix() call, allow user-specified contrasts
    # to be passed to argument `contrasts.arg`. The `contrasts.arg` default
    # (`NULL`) uses `options("contrasts")` internally, but it might be more
    # convenient to let users specify contrasts directly. At that occasion,
    # contrasts should also be tested thoroughly (not done until now).
    x <- model.matrix(as.formula(paste("~ 1 +", term)), data = data)
    if (length(attr(x, "contrasts")) == 0) {
      next
    }
    x <- x[, colnames(x) != "(Intercept)", drop = FALSE]
    path <- Reduce(
      function(current, pattern) {
        pattern <- gsub("\\+", "\\\\+", pattern)
        list(current[[1]],
             gsub(pattern, current[[1]], current[[2]]))
      },
      x = colnames(x),
      init = list(term, path)
    )
    path <- unique(path[[length(path)]])
  }
  return(path)
}
This might be related to #183, perhaps also #182.
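
The mismatch can be illustrated in isolation. The sketch below assumes a solution path containing the column name I(!xlog)TRUE (the values of term, colname, and path are made up for illustration). It also shows literal matching via fixed = TRUE as one possible way around the problem; this is not necessarily what the fix in #408 does.

```r
# Illustrative inputs (assumed, not taken from an actual projpred run).
term <- "I(!xlog)"
colname <- "I(!xlog)TRUE"
path <- c("xfac", "I(!xlog)TRUE")

# Escaping as in the quoted code: only "+" is escaped.
pattern <- gsub("\\+", "\\\\+", colname)

# "(" and ")" are read as a regexp group, so the pattern matches the
# string "I!xlogTRUE" rather than the literal column name.
print(grepl(pattern, "I!xlogTRUE"))  # TRUE
unchanged <- gsub(pattern, term, path)
print(unchanged)                     # "xfac" "I(!xlog)TRUE"

# Treating the column name as a literal string collapses it as intended.
collapsed <- gsub(colname, term, path, fixed = TRUE)
print(collapsed)                     # "xfac" "I(!xlog)"
```

With fixed = TRUE no escaping is needed at all; escaping every regexp-special character in the pattern would be the alternative if regular-expression matching has to be kept.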

@fweber144 fweber144 added the bug Bugs. label Apr 5, 2023
fweber144 added a commit to fweber144/projpred that referenced this issue May 3, 2023
…hars()` for

escaping all possible regexp-special characters.
fweber144 added a commit to fweber144/projpred that referenced this issue May 3, 2023
fweber144 added a commit to fweber144/projpred that referenced this issue May 3, 2023
@fweber144 fweber144 mentioned this issue May 3, 2023
fweber144 added a commit that referenced this issue May 3, 2023