Skip to content
Switch branches/tags
Go to file

Latest commit


Git stats


Failed to load latest commit information.
Latest commit message
Commit time

Being Robust (in High Dimensions) Can Be Practical

A MATLAB implementation of Being Robust (in High Dimensions) Can Be Practical from ICML 2017.


This project requires installation of the following packages:

  • Fast SVD and PCA: Used for faster computation of SVD for some large matrices.
  • Novembre_etal_2008_misc: A repository containing data from Genes Mirror Geography in Europe. Used for our semi-synthetic experiments. Extract the following three files to robust-filter/genomicData:
    • POPRES_08_24_01.EuroThinFinal.LD_0.8.exLD.out0-PCA.eigs
    • POPRES_08_24_01.EuroThinFinal.LD_0.8.exLD.out0-PCA.eval
    • POPRESID_Color.txt


The following packages are optional, and are only used for comparison of our algorithm with alternative estimators:

Explanation of Files

This repository contains several algorithms -- we identify files relevant to our algorithm, and those which are only used for comparison with alternatives.

Our Filter algorithms' files

The following are files which are used in our algorithms' implementations, and can all be found in the filterCode subdirectory.

  • filterGaussianMean.m: Our algorithm for estimating the mean of a Gaussian
  • filterGaussianCov.m: Our algorithm for estimating the covariance of a Gaussian
    • findMaxPoly.m: Finds the structured degree-two polynomial which is maximized by the data
    • flatten.m and sharpen.m: Convert between matrix and vector representations
  • filterGaussianCovTuned.m: A version of our covariance estimation algorithm which is tuned to select hyperparameters
  • mahalanobis.m: Compute Mahalanobis rescaling of a matrix

Test files, for demonstrating performance of our estimators for mean and covariance. These can be found in the testCode subdirectory:

  • testGaussianMean.m: Tests our mean estimation algorithm
  • testGaussianCov.m: Tests our covariance estimation algorithm
  • testGenomicData.m: Tests our covariance estimation algorithm on semi-synthetic genome dataset

Other algorithms' files

The following are files which are used for comparison with other competing algorithms, and can be found in the comparisonCode subdirectory:

Algorithms for estimating the mean of a Gaussian:

  • ransacGaussianMean.m: A RANSAC-based method
  • geoMedianGaussianMean.m: Geometric median
  • pruneGaussianMean.m: Coordinate-wise median, followed by naive pruning of distant points

Algorithms for estimating the covariance of a Gaussian:

  • pruneGaussianCov.m: Naive pruning of distant points
  • ransacMVE.m: A RANSAC-based method
    • MVE.m: Approximating the MVE for a small dataset
  • ADPCP.m: Principal Component Pursuit by Alternating Directions, from Robust Principal Component Analysis?
    • specThresh.m and shrinkage.m: Singular value thresholding and shrinkage operators
    • norm_nuc.m: Compute the nuclear norm of a matrix

Comparison files, for evaluating and comparing the performance of estimators for mean and covariance:

  • testGeoMedian.m: Demonstrates that the geometric median incurs an O(sqrt(d)) loss in accuracy
  • compareMeanEstimators.m: Compares mean estimation algorithms
  • compareCovEstimators.m: Compares covariance estimation algorithms
  • compareGenomicData.m: Compares covariance estimation algorithms to semi-synthetic genome dataset


Figures in the paper can be approximately reproduced by running the following scripts, which are in the comparisonCode subdirectory:

  • Figure 1: compareMeanEstimators.m
  • Figure 2: compareCovEstimators.m
  • Figures 3 and 4: compareGenomicData.m


This repository is an implementation of our paper Being Robust (in High Dimensions) Can Be Practical from ICML 2017, authored by Ilias Diakonikolas, Gautam Kamath, Daniel M. Kane, Jerry Li, Ankur Moitra, and Alistair Stewart.

See also our original theory paper, Robust Estimators in High Dimensions without the Computational Intractability, which appeared in FOCS 2016.

If you use our code or paper, we ask that you please cite:

  author        = {Diakonikolas, Ilias and Kamath, Gautam and Kane, Daniel M. and Li, Jerry and Moitra, Ankur and Stewart, Alistair},
  title         = {Being Robust (in High Dimensions) Can Be Practical},
  booktitle     = {Proceedings of the 34th International Conference on Machine Learning},
  series        = {ICML '17},
  year          = {2017},
  pages         = {999--1008},
  publisher     = {JMLR, Inc.}


[ICML 2017] Robust estimation of mean and covariance in high dimensions



No releases published


No packages published