Nonconvex Regularized Robust Regression via I-LAMM (Iterative Local Adaptive Majorize-Minimization) Algorithm
This package employs the I-LAMM algorithm to solve regularized Huber regression. The available penalty functions are the l1-norm, the smoothly clipped absolute deviation (SCAD) and the minimax concave penalty (MCP). The tuning parameter λ is chosen by cross-validation, and τ (for the Huber loss) is calibrated either by cross-validation or via a tuning-free principle. As a by-product, this package also produces regularized least squares estimators, including the Lasso, SCAD and MCP.
Assume that the observed data (Y, X) follow a linear model Y = X β + ε, where Y is an n-dimensional response vector, X is an n × d design matrix, β is a sparse vector and ε is an n-vector of noise variables whose distributions can be asymmetric and/or heavy-tailed. The package will compute the regularized Huber regression estimator.
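For reference, the standard form of the Huber loss and of a penalized Huber estimator can be written as below. This is a sketch in generic notation: the row vector x_i (the i-th row of X) and the penalty symbol p_λ are introduced here for illustration, and the exact normalization and intercept handling used inside the package are assumptions.

$$
\ell_\tau(x) =
\begin{cases}
x^2/2, & |x| \le \tau, \\
\tau |x| - \tau^2/2, & |x| > \tau,
\end{cases}
\qquad
\hat{\beta} \in \arg\min_{\beta \in \mathbb{R}^d}
\left\{ \frac{1}{n} \sum_{i=1}^{n} \ell_\tau\!\left(Y_i - x_i^\top \beta\right)
+ \sum_{j=1}^{d} p_\lambda\!\left(|\beta_j|\right) \right\}
$$

Here p_λ stands for the l1, SCAD or MCP penalty. As τ grows, the Huber loss approaches the squared loss, which is consistent with the regularized least squares estimators being available as a by-product.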
With this package, the simulation results in Section 5 of this paper can be reproduced.
We are wrapping up the package and will submit it to CRAN soon.
Install `ILAMM` from GitHub:

```r
install.packages("devtools")
library(devtools)
devtools::install_github("XiaoouPan/ILAMM")
library(ILAMM)
```
Help on the functions can be accessed by typing `?` followed by the function name at the R command prompt. For example, `?ncvxHuberReg` displays detailed documentation with the inputs, outputs and examples of the function `ncvxHuberReg`.
The package `ILAMM` is implemented in `Rcpp` and `RcppArmadillo`, so the following error messages might appear when you first install it (we will keep updating this list of common errors with feedback from users):

- Error: "...could not find build tools necessary to build ILAMM": On Windows you need Rtools; on macOS you need to install the Command Line Tools for Xcode. See this link for details.
- Error: "library not found for -lgfortran/-lquadmath": This means your gfortran binaries are out of date. This is a common environment-specific issue.
There are five functions, all of which are based on the I-LAMM algorithm.
- `ncvxReg`: Nonconvex regularized regression (Lasso, SCAD, MCP).
- `ncvxHuberReg`: Nonconvex regularized Huber regression (Huber-Lasso, Huber-SCAD, Huber-MCP).
- `cvNcvxReg`: K-fold cross-validation for nonconvex regularized regression.
- `cvNcvxHuberReg`: K-fold cross-validation for nonconvex regularized Huber regression.
- `tfNcvxHuberReg`: Tuning-free nonconvex regularized Huber regression.
Here we generate data from a sparse linear model Y = X β + ε, where β is sparse and ε consists of independent coordinates from a centered log-normal distribution, which is asymmetric and heavy-tailed.
```r
library(ILAMM)
n = 50
d = 100
set.seed(2018)
X = matrix(rnorm(n * d), n, d)
beta = c(rep(2, 3), rep(0, d - 3))
Y = X %*% beta + rlnorm(n, 0, 1.2) - exp(1.2^2 / 2)
```
First, we apply the Lasso to fit a linear model on (Y, X) as a benchmark. The cross-validated Lasso produces an overfitted model with many false positives.
```r
fitLasso = cvNcvxReg(X, Y, penalty = "Lasso")
betaLasso = fitLasso$beta
```
Next, we apply two non-convex regularized least squares methods, SCAD and MCP, to the data. Non-convex penalties reduce the bias introduced by the l1 penalty.
```r
fitSCAD = cvNcvxReg(X, Y, penalty = "SCAD")
betaSCAD = fitSCAD$beta
fitMCP = cvNcvxReg(X, Y, penalty = "MCP")
betaMCP = fitMCP$beta
```
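To illustrate why non-convex penalties reduce the bias of the l1 penalty, the sketch below evaluates the textbook SCAD (Fan and Li, 2001) and MCP (Zhang, 2010) penalties in base R. The function names and the default constants `a = 3.7` and `gamma = 3` are illustrative choices made here; the constants used internally by the package are not confirmed and may differ.

```r
# Textbook penalty functions, evaluated elementwise at |t|.
# These helpers are illustrative and are NOT part of the ILAMM API.

# l1 (Lasso) penalty: grows linearly without bound.
l1_penalty <- function(t, lambda) lambda * abs(t)

# SCAD penalty: linear near zero, quadratic transition, then constant.
# a = 3.7 is a commonly used default (assumption, not the package's value).
scad_penalty <- function(t, lambda, a = 3.7) {
  t <- abs(t)
  ifelse(t <= lambda,
         lambda * t,
         ifelse(t <= a * lambda,
                (2 * a * lambda * t - t^2 - lambda^2) / (2 * (a - 1)),
                lambda^2 * (a + 1) / 2))
}

# MCP penalty: concave quadratic up to gamma * lambda, then constant.
# gamma = 3 is an illustrative default (assumption).
mcp_penalty <- function(t, lambda, gamma = 3) {
  t <- abs(t)
  ifelse(t <= gamma * lambda,
         lambda * t - t^2 / (2 * gamma),
         gamma * lambda^2 / 2)
}

# Compare the three penalties on a small grid with lambda = 1.
t_grid <- c(0.5, 1, 2, 5, 10)
penalty_table <- cbind(t    = t_grid,
                       l1   = l1_penalty(t_grid, 1),
                       scad = scad_penalty(t_grid, 1),
                       mcp  = mcp_penalty(t_grid, 1))
print(penalty_table)
```

For large coefficients the SCAD and MCP penalties are flat, so large signals are not shrunk further, whereas the l1 penalty keeps growing with the coefficient size.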
We further apply Huber regression with non-convex penalties to fit (Y, X): Huber-SCAD and Huber-MCP. With heavy-tailed sampling, we can see evident advantages of Huber-SCAD and Huber-MCP over their least squares counterparts, SCAD and MCP.
```r
fitHuberSCAD = cvNcvxHuberReg(X, Y, penalty = "SCAD")
betaHuberSCAD = fitHuberSCAD$beta
fitHuberMCP = cvNcvxHuberReg(X, Y, penalty = "MCP")
betaHuberMCP = fitHuberMCP$beta
```
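The advantage under heavy-tailed noise comes from how the Huber loss treats large residuals. The sketch below is a minimal base-R illustration; `huber_loss` is a helper written here for this purpose and is not a function exported by the package.

```r
# Standard Huber loss: quadratic for |r| <= tau, linear beyond tau.
# Illustrative helper, NOT part of the ILAMM API.
huber_loss <- function(r, tau) {
  ifelse(abs(r) <= tau, r^2 / 2, tau * abs(r) - tau^2 / 2)
}

# Residuals ranging from small to very large (an outlier at 50).
r <- c(0.5, 1, 5, 50)

# Side-by-side comparison with the squared loss r^2 / 2.
loss_table <- cbind(residual = r,
                    squared  = r^2 / 2,
                    huber    = huber_loss(r, tau = 1))
print(loss_table)
```

The two losses agree for small residuals, but the squared loss of the outlier grows quadratically while its Huber loss grows only linearly, so a single extreme observation has much less influence on the fit.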
Finally, we demonstrate non-convex regularized Huber regression with τ calibrated via a tuning-free procedure. This function is computationally more efficient than `cvNcvxHuberReg`, because cross-validation is used only to choose the regularization parameter λ. More details of the tuning-free procedure can be found in Wang et al. (2018).
```r
fitHuberSCAD.tf = tfNcvxHuberReg(X, Y, penalty = "SCAD")
betaHuberSCAD.tf = fitHuberSCAD.tf$beta
fitHuberMCP.tf = tfNcvxHuberReg(X, Y, penalty = "MCP")
betaHuberMCP.tf = fitHuberMCP.tf$beta
```
We summarize the performance of the above methods with a table including true positive (TP), false positive (FP), true positive rate (TPR), false positive rate (FPR), l1 error and l2 error below. These results can easily be reproduced.
Method | TP | FP | TPR | FPR | l1 error | l2 error |
---|---|---|---|---|---|---|
Lasso | 3 | 17 | 1 | 0.175 | 5.014 | 1.356 |
SCAD | 3 | 3 | 1 | 0.031 | 1.219 | 0.741 |
MCP | 3 | 0 | 1 | 0 | 1.156 | 0.795 |
Huber-SCAD | 3 | 1 | 1 | 0.010 | 0.710 | 0.402 |
Huber-MCP | 3 | 0 | 1 | 0 | 0.611 | 0.354 |
TF-Huber-SCAD | 3 | 1 | 1 | 0.010 | 0.710 | 0.402 |
TF-Huber-MCP | 3 | 0 | 1 | 0 | 0.611 | 0.354 |
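The summary statistics in the table can be computed along the following lines. This is a sketch under assumptions: `selection_summary` is a hypothetical helper written here, it expects a coefficient vector of the same length as the true β, and if the `beta` returned by a fit also carries an intercept entry (check the function documentation), that entry should be dropped first. Whether the table above was produced in exactly this way is not confirmed.

```r
# Hypothetical helper (not part of ILAMM): support recovery and
# estimation error of an estimate relative to the true coefficients.
selection_summary <- function(beta_hat, beta_true) {
  stopifnot(length(beta_hat) == length(beta_true))
  selected <- beta_hat != 0    # variables picked by the method
  active   <- beta_true != 0   # truly nonzero coefficients
  TP <- sum(selected & active)
  FP <- sum(selected & !active)
  c(TP  = TP,
    FP  = FP,
    TPR = TP / sum(active),     # share of true signals recovered
    FPR = FP / sum(!active),    # share of null variables selected
    l1  = sum(abs(beta_hat - beta_true)),
    l2  = sqrt(sum((beta_hat - beta_true)^2)))
}

# Toy example with made-up numbers, only to show the calling pattern.
beta_true <- c(2, 2, 2, 0, 0, 0, 0)
beta_hat  <- c(1.5, 2, 2.5, 0, 0.5, 0, 0)
res <- selection_summary(beta_hat, beta_true)
print(round(res, 3))
```

With d = 100 and three nonzero coefficients, the FPR denominator is 97, which matches the table (for example, 17 / 97 ≈ 0.175 for the Lasso).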
To obtain more reliable results, users can run the above simulation repeatedly on larger datasets and average the summary statistics.
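One way to organize such a repeated simulation is sketched below. The helpers `gen_data` and `avg_l2_error` are hypothetical names introduced here, and the placeholder estimator `zero_fit` only exists so that the sketch runs without the package installed; in practice it would be replaced by a wrapper around one of the fitting functions.

```r
# Hypothetical helpers (not part of ILAMM) for a repeated simulation.

# Draw one dataset from the sparse linear model used in the example:
# s coefficients equal to 2, centered log-normal noise.
gen_data <- function(n = 50, d = 100, s = 3) {
  X <- matrix(rnorm(n * d), n, d)
  beta <- c(rep(2, s), rep(0, d - s))
  Y <- X %*% beta + rlnorm(n, 0, 1.2) - exp(1.2^2 / 2)
  list(X = X, Y = Y, beta = beta)
}

# Average the l2 error of an estimator over `reps` independent datasets.
# fit_fun(X, Y) must return a coefficient vector of length ncol(X).
avg_l2_error <- function(fit_fun, reps = 20, seed = 2018, ...) {
  set.seed(seed)
  errs <- vapply(seq_len(reps), function(i) {
    dat <- gen_data(...)
    beta_hat <- fit_fun(dat$X, dat$Y)
    sqrt(sum((beta_hat - dat$beta)^2))
  }, numeric(1))
  mean(errs)
}

# Placeholder estimator that always returns the zero vector.
zero_fit <- function(X, Y) rep(0, ncol(X))
baseline <- avg_l2_error(zero_fit, reps = 5)
print(baseline)
```

With the package loaded, `fit_fun` could be something like `function(X, Y) cvNcvxHuberReg(X, Y, penalty = "MCP")$beta`, possibly after dropping an intercept entry if the returned vector contains one (an assumption to check against the documentation).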
The function `cvNcvxHuberReg` is slower than the others because it carries out a two-dimensional grid search to choose both λ and τ via cross-validation.
GPL (>= 2)
C++11
Xiaoou Pan xip024@ucsd.edu, Qiang Sun qsun@utstat.toronto.edu, Wen-Xin Zhou wez243@ucsd.edu
Xiaoou Pan xip024@ucsd.edu
Eddelbuettel, D. and Francois, R. (2011). Rcpp: Seamless R and C++ integration. J. Stat. Softw. 40(8) 1-18. Paper
Eddelbuettel, D. and Sanderson, C. (2014). RcppArmadillo: Accelerating R with high-performance C++ linear algebra. Comput. Statist. Data Anal. 71 1054-1063. Paper
Fan, J. and Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. J. Amer. Statist. Assoc. 96 1348-1360. Paper
Fan, J., Li, Q. and Wang, Y. (2017). Estimation of high dimensional mean regression in the absence of symmetry and light tail assumptions. J. R. Stat. Soc. Ser. B. Stat. Methodol. 79 247-265. Paper
Fan, J., Liu, H., Sun, Q. and Zhang, T. (2018). I-LAMM for sparse learning: Simultaneous control of algorithmic complexity and statistical error. Ann. Statist. 46 814-841. Paper
Huber, P. J. (1964). Robust estimation of a location parameter. Ann. Math. Statist. 35 73-101. Paper
Pan, X., Sun, Q. and Zhou, W.-X. (2019). Iteratively reweighted l1-penalized robust regression. Preprint. Paper
Sanderson, C. and Curtin, R. (2016). Armadillo: A template-based C++ library for linear algebra. J. Open Source Softw. 1 26. Paper
Sun, Q., Zhou, W.-X. and Fan, J. (2019). Adaptive Huber regression. J. Amer. Statist. Assoc. 0 1-12. Paper
Tibshirani, R. (1996). Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B. Stat. Methodol. 58 267-288. Paper
Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2018). A new principle for tuning-free Huber regression. Preprint. Paper
Zhang, C.-H. (2010). Nearly unbiased variable selection under minimax concave penalty. Ann. Statist. 38 894-942. Paper