pyAutoARR

Auto Adaptive Robust Regression Python Package

Description

This Python package implements Alternating Gradient Descent, Alternating Gradient Descent with the Barzilai-Borwein method, and Alternating Gradient Descent with backtracking. It also includes Huber mean estimation, Huber covariance matrix estimation, Huber regression, and adaptive Huber regression from the R library FarmTest, written by Xiaoou Pan.

Installation

This Python package can be installed on Windows, Mac, and Linux.

Install pyAutoAdaptiveRobustRegression with pip:

pip install pyAutoAdaptiveRobustRegression

Operating System Requirements

For Windows:

There are no additional requirements for Windows: the armadillo and openblas libraries are already included.

For Mac:

brew install armadillo

For Linux:

apt install libarmadillo-dev libopenblas-dev

Common Error Messages

Some common error messages and their solutions are collected below; we will keep updating them based on user feedback:

  1. Error: 6): Symbol not found: ___addtf3 Referenced from: /usr/local/opt/gcc/lib/gcc/11/libquadmath.0.dylib Expected in: /usr/lib/libSystem.B.dylib in /usr/local/opt/gcc/lib/gcc/11/libquadmath.0.dylib

    Solution: Running brew config and brew doctor revealed that the problem was that gcc was not linked. Running sudo chown -R $(whoami) /usr/local/lib/gcc and then brew link gcc solved the problem.

Functions

One function comes from "Do we need to estimate the variance in robust mean estimation?":

  • autoarr_mean: Auto Adaptive Robust Regression Mean Estimation

Four functions come from "A new principle for tuning-free Huber regression":

  • tfhuber_mean: Tuning-Free Huber Mean Estimation
  • tfhuber_cov: Tuning-Free Huber Covariance Matrix Estimation
  • tfhuber_reg: Tuning-Free Huber Regression
  • cv_tfhuber_lasso: K-fold Cross-Validated Tuning-Free Huber-Lasso Regression
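The Huber-type estimators above share one idea: replace the squared loss with the Huber loss so that extreme observations are down-weighted. The sketch below is a minimal, illustrative NumPy version of a Huber mean computed by iterative reweighting; it is not the package's implementation, and its default choice of the robustification parameter tau (std × sqrt(n / log n)) is only a simple heuristic in the spirit of the adaptive approach, not the package's tuning rule.

```python
import numpy as np

def huber_mean_sketch(x, tau=None, max_iter=100, tol=1e-8):
    """Huber mean via iteratively reweighted averaging (illustrative only)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    if tau is None:
        # Simple heuristic for the robustification parameter (an assumption,
        # not the package's adaptive tuning procedure).
        tau = np.std(x) * np.sqrt(n / np.log(n))
    mu = np.median(x)  # robust starting point
    for _ in range(max_iter):
        r = np.abs(x - mu)
        # Huber weights: 1 inside [-tau, tau], tau/|r| outside
        w = np.where(r <= tau, 1.0, tau / np.maximum(r, tau))
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < tol:
            break
        mu = mu_new
    return mu
```

For well-behaved symmetric data every weight is 1 and the result coincides with the sample mean; for heavy-tailed data the reweighting pulls the estimate away from outliers.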

Examples

First, we present an example of mean estimation using the Huber and Alternating Gradient Descent methods. We generate data from a log-normal distribution, which is asymmetric and heavy-tailed.

# Import libraries
import numpy as np
import pyAutoAdaptiveRobustRegression as arr

# Generate log-normal data, centered by subtracting the population mean
# exp(1.5**2 / 2) of lognormal(0, 1.5)
n = 1000
X = np.random.lognormal(0, 1.5, n) - np.exp(1.5**2 / 2)

# Mean estimation with three robust estimators
huber_mean_result = arr.huber_mean(X)
agd_result = arr.agd(X)
agd_bb_result = arr.agd_bb(X)

Second, for each setting, we generate an independent sample of size n = 100 and compute four mean estimators: the sample mean, the Huber estimator, the Alternating Gradient Descent estimator, and the Alternating Gradient Descent with Barzilai-Borwein estimator. Figure 1 displays the α-quantile of the estimation error, with α ranging from 0.5 to 1, based on 2000 simulations.

The four mean estimators perform almost identically for the normal data. For the heavy-tailed skewed distributions, the deviation of the sample mean from the population mean grows rapidly with the confidence level, in striking contrast to the DA-Huber estimator, the Alternating Gradient Descent estimator, and the Alternating Gradient Descent with Barzilai-Borwein Method.
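This comparison can be reproduced in miniature. The sketch below is illustrative only: it uses a plain NumPy Huber mean as a stand-in for the package's estimators (not their actual implementations), fewer repetitions than the paper's 2000, and a simple heuristic choice of tau.

```python
import numpy as np

def huber_mean(x, tau):
    # One-dimensional Huber M-estimator via iterative reweighting (illustrative)
    mu = np.median(x)
    for _ in range(50):
        r = np.abs(x - mu)
        w = np.where(r <= tau, 1.0, tau / np.maximum(r, 1e-12))
        mu_new = np.sum(w * x) / np.sum(w)
        if abs(mu_new - mu) < 1e-9:
            break
        mu = mu_new
    return mu

rng = np.random.default_rng(0)
n, B = 100, 500                      # sample size and repetitions (paper uses 2000)
true_mean = np.exp(1.5**2 / 2)       # population mean of lognormal(0, 1.5)

err_mean, err_huber = [], []
for _ in range(B):
    x = rng.lognormal(0.0, 1.5, n)
    tau = np.std(x) * np.sqrt(n / np.log(n))  # heuristic tau (an assumption)
    err_mean.append(abs(x.mean() - true_mean))
    err_huber.append(abs(huber_mean(x, tau) - true_mean))

# Empirical quantiles of the estimation error, as in Figure 1
print(np.quantile(err_mean, 0.99), np.quantile(err_huber, 0.99))
```

Sweeping the quantile level from 0.5 to 1 and plotting both curves reproduces the shape of Figure 1 for the lognormal setting.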


Figure 1: Estimation error versus confidence level for the sample mean, the DA-Huber estimator, the Alternating Gradient Descent estimator, and the Alternating Gradient Descent with Barzilai-Borwein estimator, based on 2000 simulations.

Finally, in Figure 2, we examine the 99%-quantile of the estimation error versus a distribution parameter measuring the tail behavior and the skewness. That is, for normal data we let σ vary between 1 and 4; for skewed generalized t distributions, we increase the shape parameter q from 2.5 to 4; for the lognormal and Pareto distributions, the shape parameters σ and α vary from 0.25 to 2 and 1.5 to 3, respectively.

The DA-Huber estimator, the Alternating Gradient Descent estimator, and the Alternating Gradient Descent with Barzilai-Borwein estimator show substantially smaller deviations from the population mean as the distribution develops heavier tails and becomes more skewed.
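The parameter sweep behind Figure 2 follows the same pattern. The scaffold below (illustrative, not the original experiment code) varies the lognormal shape parameter σ and records the empirical 99%-quantile of |estimate − population mean| for the sample mean; any robust estimator with the same call signature can be plugged in alongside it for comparison.

```python
import numpy as np

rng = np.random.default_rng(1)

def error_quantile(estimator, sigma, n=100, B=300, q=0.99):
    # q-quantile of |estimate - population mean| for lognormal(0, sigma) data
    true_mean = np.exp(sigma**2 / 2)
    errs = [abs(estimator(rng.lognormal(0.0, sigma, n)) - true_mean)
            for _ in range(B)]
    return np.quantile(errs, q)

# Sweep the lognormal shape parameter sigma from 0.25 to 2, as in Figure 2.
# np.mean is the sample mean; a robust estimator (e.g. a Huber mean) can be
# passed in its place to trace the second curve.
for sigma in np.linspace(0.25, 2.0, 8):
    print(f"sigma={sigma:.2f}  sample-mean 99% error quantile: "
          f"{error_quantile(np.mean, sigma):.3f}")
```

The error quantile of the sample mean grows rapidly with σ, which is the effect Figure 2 visualizes.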


Figure 2: Empirical 99%-quantile of the estimation error versus a parameter measuring the tails and skewness for the sample mean, the DA-Huber estimator, the Alternating Gradient Descent estimator, and the Alternating Gradient Descent with Barzilai-Borwein estimator.

License

MIT

Author(s)

Yichi Zhang (yichi.zhang@worc.ox.ac.uk), Qiang Sun (qiang.sun@utoronto.ca)

References

Sun, Q. (2021). Do we need to estimate the variance in robust mean estimation?

Bose, K., Fan, J., Ke, Y., Pan, X. and Zhou, W.-X. (2020). FarmTest: An R package for factor-adjusted robust multiple testing. R Journal 12, 372-387.

Fan, J., Ke, Y., Sun, Q. and Zhou, W.-X. (2019). FarmTest: Factor-adjusted robust multiple testing with approximate false discovery control. J. Amer. Statist. Assoc. 114, 1880-1893.

Sun, Q., Zhou, W.-X. and Fan, J. (2020). Adaptive Huber regression. J. Amer. Statist. Assoc. 115, 254-265.

Wang, L., Zheng, C., Zhou, W. and Zhou, W.-X. (2020). A new principle for tuning-free Huber regression. Statistica Sinica, to appear.
