R implementation of COMBSS (Continuous Optimisation for Best Subset Selection) for generalised linear models.
COMBSS reformulates the NP-hard discrete subset selection problem as a
continuous optimisation over the hypercube [0, 1]^p, solved via a
Frank-Wolfe homotopy algorithm. The inner ridge problem is solved with
glmnet. Supports linear (Gaussian),
binary logistic, and multinomial logistic regression.
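To see why a continuous relaxation is attractive, here is a minimal base-R sketch of the discrete problem that COMBSS replaces. It is not the COMBSS algorithm: it is a brute-force search over every subset of a small simulated design. The names `x_small`, `y_small`, `rss_subset`, and `best_by_size` are illustrative and not part of the package.

```r
# Exhaustive best subset search on a tiny problem (base R only).
# This is the brute-force baseline, NOT the COMBSS method.
set.seed(1)
n <- 100; p <- 8
beta_true <- c(3, 2, 1.5, rep(0, p - 3))
x_small <- matrix(rnorm(n * p), n, p)
y_small <- as.numeric(x_small %*% beta_true + rnorm(n) * 0.5)

# Residual sum of squares of the least-squares fit (with intercept)
# restricted to the columns in `idx`
rss_subset <- function(idx) {
  fit <- lm.fit(cbind(1, x_small[, idx, drop = FALSE]), y_small)
  sum(fit$residuals^2)
}

# For each size k, evaluate all choose(p, k) subsets and keep the best one
best_by_size <- lapply(1:3, function(k) {
  cands <- combn(p, k)              # one candidate subset per column
  rss <- apply(cands, 2, rss_subset)
  cands[, which.min(rss)]
})
best_by_size

# Number of non-empty subsets an exhaustive search over all sizes must visit
n_models <- sum(choose(p, 1:p))     # equals 2^p - 1
```

The count `n_models` doubles with every added feature, so this approach stops being practical well before p reaches the sizes used in the examples below.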
From GitHub:
# install.packages("remotes")
remotes::install_github("benoit-liquet/combss")

library(combss)
set.seed(1)
n <- 200; p <- 30
beta <- c(3, 2, 1.5, 1, 0.5, rep(0, p - 5))
x <- matrix(rnorm(n * p), n, p)
y <- as.numeric(x %*% beta + rnorm(n) * 0.5)

fit <- combss(x, y, family = "gaussian", q = 10)
fit$subset_list # selected features for k = 1, ..., 10
Alternatively, use the summary function:
summary(fit)
COMBSS fit
family: gaussian
n, p: 200, 30
q: 10
lam_ridge: 0
(no validation data; subset_list only)
Subset path:
k= 1 features: 1
k= 2 features: 1,2
k= 3 features: 1,2,3
k= 4 features: 1,2,3,4
k= 5 features: 1,2,3,4,5
k= 6 features: 1,2,3,4,5,22
k= 7 features: 1,2,3,4,5,11,22
k= 8 features: 1,2,3,4,5,11,18,22
k= 9 features: 1,2,3,4,5,11,13,18,22
k=10 features: 1,2,3,4,5,11,13,18,19,22
family = "linear" is accepted as an alias for "gaussian".
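Once a subset size has been chosen, the selected indices can be used with ordinary modelling tools. The sketch below regenerates the simulated data from the example and refits least squares on the k = 5 subset, with the indices copied by hand from the printed path. In practice they would come from the fitted object; the exact accessor (for example `fit$subset_list[[5]]`) is an assumption about the structure of `subset_list`. The names `subset_k5`, `refit`, and `beta_hat` are illustrative.

```r
# Same simulated data as in the Gaussian example
set.seed(1)
n <- 200; p <- 30
beta <- c(3, 2, 1.5, 1, 0.5, rep(0, p - 5))
x <- matrix(rnorm(n * p), n, p)
y <- as.numeric(x %*% beta + rnorm(n) * 0.5)

# Indices copied from the k = 5 row of the printed subset path
# (assumption: with a fitted object these could be read from fit$subset_list)
subset_k5 <- c(1, 2, 3, 4, 5)

# Ordinary least-squares refit restricted to the selected columns
refit <- lm(y ~ x[, subset_k5])
beta_hat <- unname(coef(refit)[-1])   # drop the intercept
round(beta_hat, 2)

# Check whether the selected subset matches the true support
true_support <- which(beta != 0)
recovered <- setequal(subset_k5, true_support)
recovered
```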
ybin <- as.numeric(plogis(x %*% beta) > 0.5)
itr <- 1:140; iva <- 141:200
fit <- combss(x[itr, ], ybin[itr],
              x_val = x[iva, ], y_val = ybin[iva],
              family = "binomial", q = 15)

fit$subset # best subset by validation accuracy
[1] 1 2 3 4 5 6 8 13 21 22 26
fit$accuracy # validation accuracy at best k
[1] 0.9666667
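The idea of choosing a subset by validation accuracy can be sketched in base R as follows. This is an illustration of the general technique, not the package's internal code: it refits a logistic regression with `glm` on each candidate subset and scores it on held-out rows. The labels are drawn with `rbinom` (an assumption made here so that the classes are not perfectly separable), and `make_df`, `val_accuracy`, and `candidates` are hypothetical names.

```r
set.seed(1)
n <- 200; p <- 30
beta <- c(3, 2, 1.5, 1, 0.5, rep(0, p - 5))
x <- matrix(rnorm(n * p), n, p)
# Noisy labels: an assumption for this sketch, unlike the thresholded ybin above
ybin_noisy <- rbinom(n, 1, plogis(as.numeric(x %*% beta)))
itr <- 1:140; iva <- 141:200

# Data frame of the selected columns with stable, explicit column names
make_df <- function(rows, idx) {
  d <- as.data.frame(x[rows, idx, drop = FALSE])
  names(d) <- paste0("x", idx)
  d
}

# Fit on the training rows, then return accuracy on the validation rows
val_accuracy <- function(idx) {
  train <- make_df(itr, idx)
  train$y <- ybin_noisy[itr]
  model <- glm(y ~ ., data = train, family = binomial())
  prob <- predict(model, newdata = make_df(iva, idx), type = "response")
  mean(as.numeric(prob > 0.5) == ybin_noisy[iva])
}

# A few nested candidate subsets (hand-picked for illustration)
candidates <- list(1, 1:2, 1:3, 1:5)
acc <- sapply(candidates, val_accuracy)
best <- candidates[[which.max(acc)]]
```

Ties in `acc` are resolved by `which.max` in favour of the first, and here smallest, candidate.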
fit <- combss(x, ymulti, family = "multinomial", q = 20)

cv <- combss_cv(x, y, family = "gaussian", q = 10)
cv$best_lambda

- print(fit), summary(fit)
- coef(fit, k): selected feature indices for subset size k
- predict(fit, newx, x_train, y_train, k): refit on chosen subset and predict
- Moka, Liquet, Zhu & Muller (2024). COMBSS: best subset selection via continuous optimization. Statistics and Computing.
- Mathur, Liquet, Muller & Moka (2026). Parsimonious Subset Selection for Generalized Linear Models with Biomedical Applications. arXiv preprint.
- Sarat Moka
- Anant Mathur
- Benoit Liquet (maintainer)
- Python implementation: combss on PyPI (source)
GPL-3