Robust estimation methods for the mean vector and covariance matrix from data (possibly containing NAs) under multivariate heavy-tailed distributions such as angular Gaussian, Cauchy, and Student's t. Additionally, a factor model structure can be specified for the covariance matrix.
# install stable version from CRAN
install.packages("fitHeavyTail")

# install development version from GitHub
devtools::install_github("dppalomar/fitHeavyTail")
To get help:
library(fitHeavyTail)
help(package = "fitHeavyTail")
?fit_mvt
fitHeavyTail in publications:
To illustrate the basic usage of the package fitHeavyTail, let's start by generating some multivariate data under a Student's t distribution with markedly heavy tails:
library(mvtnorm)  # package for multivariate t distribution
N <- 10   # number of variables
T <- 80   # number of observations
nu <- 4   # degrees of freedom for tail heaviness

set.seed(42)
mu <- rep(0, N)
U <- t(rmvnorm(n = round(0.3*N), sigma = 0.1*diag(N)))
Sigma_cov <- U %*% t(U) + diag(N)  # covariance matrix with factor model structure
Sigma_scatter <- (nu-2)/nu * Sigma_cov
X <- rmvt(n = T, delta = mu, sigma = Sigma_scatter, df = nu)  # generate data
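The rescaling of `Sigma_cov` into `Sigma_scatter` in the code above follows from the standard relation between the covariance matrix and the scatter matrix of a multivariate Student's t distribution, which holds only when the degrees of freedom exceed 2. The symbols below are chosen to mirror the variable names in the code:

```latex
\operatorname{Cov}(\mathbf{x}) = \boldsymbol{\Sigma}_{\text{cov}}
  = \frac{\nu}{\nu - 2}\,\boldsymbol{\Sigma}_{\text{scatter}}, \qquad \nu > 2,
\qquad\Longleftrightarrow\qquad
\boldsymbol{\Sigma}_{\text{scatter}} = \frac{\nu - 2}{\nu}\,\boldsymbol{\Sigma}_{\text{cov}} .
```

With `nu <- 4` as above, the scatter matrix is therefore half the covariance matrix.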
We can first estimate the mean vector and covariance matrix via the traditional sample estimates (i.e., sample mean and sample covariance matrix):
mu_sm <- colMeans(X)
Sigma_scm <- cov(X)
Then we can compute the robust estimates via the package fitHeavyTail:
library(fitHeavyTail)
fitted <- fit_mvt(X)
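To give a rough idea of why a heavy-tailed fit can differ from the sample estimates, here is a minimal base-R sketch of the classical EM-type iteration for a multivariate Student's t with the degrees of freedom held fixed. It is only an illustration under that simplifying assumption and should not be read as the package's implementation; `fit_t_sketch` and its arguments are hypothetical names introduced here.

```r
# Illustrative sketch only: EM-type fit of a multivariate Student's t
# with FIXED degrees of freedom nu (hypothetical helper, not part of fitHeavyTail).
fit_t_sketch <- function(X, nu = 4, max_iter = 100, tol = 1e-6) {
  X <- as.matrix(X)
  T_obs <- nrow(X)
  N <- ncol(X)
  mu <- colMeans(X)   # start from the sample mean
  Sigma <- cov(X)     # start from the sample covariance (used as initial scatter)
  for (iter in seq_len(max_iter)) {
    # squared Mahalanobis distance of each observation to the current fit
    d2 <- mahalanobis(X, center = mu, cov = Sigma)
    # weights: observations far from the center are down-weighted
    w <- (nu + N) / (nu + d2)
    # weighted mean and weighted scatter matrix
    mu_new <- colSums(w * X) / sum(w)
    Xc <- sweep(X, 2, mu_new)
    Sigma_new <- crossprod(sqrt(w) * Xc) / T_obs
    delta <- max(abs(mu_new - mu), abs(Sigma_new - Sigma))
    mu <- mu_new
    Sigma <- Sigma_new
    if (delta < tol) break
  }
  # covariance = nu / (nu - 2) * scatter, valid only for nu > 2
  list(mu = mu, scatter = Sigma, cov = nu / (nu - 2) * Sigma, iterations = iter)
}

# Small demo on synthetic t-distributed data generated with base R only
set.seed(1)
Z <- matrix(rnorm(200 * 3), 200, 3)
tau <- rchisq(200, df = 4) / 4
X_demo <- Z / sqrt(tau)   # each row is divided by its own sqrt(tau)
fit_demo <- fit_t_sketch(X_demo, nu = 4)
```

The key point is the weight `(nu + N) / (nu + d2)`: the sample mean and sample covariance give every observation the same weight, whereas this iteration shrinks the influence of extreme observations.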
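We can now compute the squared estimation errors and compare them. In this example the robust estimates roughly halve the error in the mean vector and reduce the error in the covariance matrix by about 30%: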
sum((mu_sm - mu)^2)
#> [1] 0.2857323
sum((fitted$mu - mu)^2)
#> [1] 0.1404855

sum((Sigma_scm - Sigma_cov)^2)
#> [1] 5.861138
sum((fitted$cov - Sigma_cov)^2)
#> [1] 4.107825
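The error measure used above is the sum of squared entry-wise differences, i.e. the squared Euclidean norm for vectors and the squared Frobenius norm for matrices. A small sketch with made-up 2 x 2 matrices shows the equivalence; `sq_error` is a hypothetical helper name, not a function of the package.

```r
# Hypothetical helper: sum of squared entry-wise differences
sq_error <- function(estimate, truth) sum((estimate - truth)^2)

# Toy matrices chosen only for illustration
A_true <- diag(2)
A_est <- matrix(c(1.1, 0.2, 0.2, 0.9), 2, 2)

err_direct <- sq_error(A_est, A_true)
err_frobenius <- norm(A_est - A_true, type = "F")^2  # same quantity via the Frobenius norm
```

Because this measure is not normalized, it grows with the number of variables; dividing by `sum(truth^2)` would give a relative error if results for different dimensions need to be compared.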
To get a visual idea of the robustness, we can plot the shapes of the covariance matrices (the true one and the estimated ones) projected onto two dimensions. Observe how the heavy-tailed estimate follows the true covariance matrix more closely than the sample covariance matrix does:
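One possible way to draw such shapes with base R is to trace the ellipse defined by a 2 x 2 covariance matrix. The sketch below assumes that "projected on two dimensions" means taking the leading 2 x 2 block of each matrix (for instance `Sigma_cov[1:2, 1:2]`); that reading, and the helper name `ellipse_points`, are assumptions made here rather than something stated in the package documentation.

```r
# Hypothetical helper: points on the ellipse
# {x : (x - center)' S^{-1} (x - center) = r^2} for a 2 x 2 matrix S
ellipse_points <- function(S, center = c(0, 0), r = 1, n = 200) {
  theta <- seq(0, 2 * pi, length.out = n)
  circle <- rbind(cos(theta), sin(theta))     # 2 x n points on the unit circle
  e <- eigen(S, symmetric = TRUE)
  A <- e$vectors %*% diag(sqrt(e$values))     # A %*% t(A) equals S
  t(center + r * A %*% circle)                # n x 2 matrix of ellipse points
}

# Toy 2 x 2 covariance matrix chosen only for illustration
S_toy <- matrix(c(2, 0.8, 0.8, 1), 2, 2)
pts <- ellipse_points(S_toy)

# Draw only in an interactive session
if (interactive()) {
  plot(pts, type = "l", asp = 1, xlab = "dimension 1", ylab = "dimension 2")
}
```

Further ellipses (for example for the sample and robust estimates) can be overlaid with `lines(ellipse_points(...), col = ...)` so that their shapes can be compared against the true one.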
For more detailed information, please check the vignette.
README file: GitHub-readme.