A MATLAB library for CUR matrix decomposition
algorithm_comparison_experiments
algorithms
aux_for_algorithm
aux_for_experiments
datasets
io
output
oversampling_experiments
plots
.gitignore
readme

CUR Matrix Decomposition

The CUR matrix decomposition approximates an arbitrary data matrix by selecting a subset of its columns and a subset of its rows to form a low-rank approximation. This method overcomes a fundamental drawback of standard PCA: the principal components and the loading vectors are dense, which loses the sparsity of the data and reduces interpretability. The CUR decomposition is a product of three matrices: C (tall and skinny) holds the selected columns and R (short and wide) holds the selected rows, so both preserve the sparsity of the data matrix, while the third (U) is a relatively small dense matrix. Thus the CUR approximation is cheaper to work with and store. The CUR decomposition can also be easier to interpret than standard PCA because the selected rows and columns can highlight particularly influential combinations of the data variables. This may not be possible with a dense approximation.
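
As an illustration of the three factors, here is a minimal MATLAB sketch. It is not taken from this package: the test matrix, the randomly chosen index sets colIdx and rowIdx, and the choice U = pinv(C) * A * pinv(R) (the Frobenius-optimal U for fixed C and R) are all assumptions made for the example.

```matlab
% Minimal CUR sketch; the index sets below are arbitrary illustrations.
rng(0);                           % fix the seed so the run is repeatable
A = sprandn(200, 100, 0.05);      % sparse 200-by-100 test matrix (assumed data)
colIdx = randperm(100, 20);       % 20 column indices (hypothetical choice)
rowIdx = randperm(200, 30);       % 30 row indices (hypothetical choice)

C = A(:, colIdx);                 % 200-by-20, actual columns of A, stays sparse
R = A(rowIdx, :);                 % 30-by-100, actual rows of A, stays sparse
U = pinv(full(C)) * A * pinv(full(R));   % 20-by-30, small and dense

% Relative Frobenius-norm error of the approximation A ~ C*U*R
relErr = norm(A - C * U * R, 'fro') / norm(A, 'fro');
fprintf('relative Frobenius error: %.3f\n', relErr);
```

Only C and R need to be stored at the scale of the data; U has as many rows as there are selected columns and as many columns as there are selected rows.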

This MATLAB package provides the following algorithms for CUR matrix decomposition:
1) Naive uniform sampling algorithm
2) Subspace sampling algorithm
3) Near-optimal column selection based algorithm
4) Deterministic unweighted column selection based algorithm
5) Randomized unweighted column selection based algorithm
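
To make the difference between the first two items concrete, the sketch below draws column indices in two ways: uniformly, and with probabilities proportional to rank-k leverage scores, which is the usual meaning of subspace sampling. It is an assumption-laden illustration, not the code in ./algorithms: the names k, c, lev, and levIdx are hypothetical, and the packaged routines may sample without replacement or rescale the chosen columns.

```matlab
rng(1);
A = randn(300, 50) * randn(50, 120);   % 300-by-120 test matrix (assumed data)
k = 10;                                % target rank (assumed parameter)
c = 30;                                % number of columns to draw (assumed)

% Uniform sampling: every column is equally likely, drawn without replacement.
uniformIdx = randperm(size(A, 2), c);

% Subspace sampling: probabilities proportional to rank-k leverage scores.
[~, ~, V] = svd(A, 'econ');
Vk = V(:, 1:k);                        % top-k right singular vectors, 120-by-k
lev = sum(Vk.^2, 2);                   % leverage scores; they sum to k
p = lev / k;                           % sampling probabilities; they sum to 1

cdf = cumsum(p);
cdf(end) = 1;                          % guard against rounding error
levIdx = zeros(1, c);
for t = 1:c
    levIdx(t) = find(rand <= cdf, 1);  % inverse-CDF draw, with replacement
end
```

The same scores computed from the left singular vectors give row-sampling probabilities.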

./algorithms
This folder contains the algorithms listed above.

./algorithm_comparison_experiments
This folder contains several scripts that compare the different algorithms on various datasets.
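
A comparison of this kind can be sketched as follows. This is a hypothetical script, not one of the files in the folder: the synthetic matrix, the helper curErr, the parameters k, c, r, and nTrials, and the "largest leverage scores" selection (used here only as a simple deterministic stand-in) are all assumptions. Errors are reported relative to the best rank-k approximation error.

```matlab
rng(2);
A = randn(200, 40) * diag(0.8 .^ (1:40)) * randn(40, 150);  % decaying spectrum
k = 5; c = 20; r = 20; nTrials = 10;   % assumed experiment parameters

[Uf, Sf, Vf] = svd(A, 'econ');
s = diag(Sf);
bestErr = sqrt(sum(s(k+1:end).^2));    % Frobenius error of the best rank-k approximation

% Frobenius error of the CUR approximation built from given index sets,
% using U = pinv(C) * A * pinv(R)
curErr = @(cols, rows) norm(A - A(:, cols) * ...
    (pinv(A(:, cols)) * A * pinv(A(rows, :))) * A(rows, :), 'fro');

% Uniform sampling, averaged over several random trials
errUniform = zeros(nTrials, 1);
for t = 1:nTrials
    errUniform(t) = curErr(randperm(150, c), randperm(200, r));
end

% Deterministic stand-in: keep the columns and rows with the largest
% rank-k leverage scores
[~, colOrder] = sort(sum(Vf(:, 1:k).^2, 2), 'descend');
[~, rowOrder] = sort(sum(Uf(:, 1:k).^2, 2), 'descend');
errTopLev = curErr(colOrder(1:c)', rowOrder(1:r)');

fprintf('uniform:      mean error / best rank-%d error = %.2f\n', k, mean(errUniform) / bestErr);
fprintf('top leverage: error / best rank-%d error      = %.2f\n', k, errTopLev / bestErr);
```

Because c and r exceed k, a ratio below 1 is possible and does not indicate a bug.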

Some code and datasets are from
http://users.cms.caltech.edu/~gittens/nystrombestiary/
A. Gittens and M. W. Mahoney. Revisiting the Nyström method for improved large-scale machine learning. arXiv preprint arXiv:1303.1849, 2013.

The code for the near-optimal column selection based algorithm is from
https://sites.google.com/site/zjuwss/
S. Wang and Z. Zhang. Improving CUR matrix decomposition and the Nyström approximation via adaptive sampling. Journal of Machine Learning Research, 14(1):2729–2769, 2013.