PTD-CCA: Sparse Tensor Canonical Correlation Analysis (STCCA) via penalised tensor decomposition

PTD-CCA is an unsupervised dimensionality reduction method for 2 or more views/data modalities. The algorithm is a straightforward extension of the penalised matrix decomposition CCA (PMD-CCA) proposed by Witten et al. (2009) to more than 2 views and maximises the "higher-order covariance" between the linear projections X_m*w_m where each X_m is data matrix and w_m a vector of coefficients. It reduces to PMD-CCA if there are just 2 views.

EXAMPLE

Set up some synthetic data

a = [ones(20,1); -ones(20,1); zeros(60,1)];
b = [zeros(60,1); -ones(20,1); ones(20,1)];
c = [ones(20,1); zeros(60,1); -ones(20,1)];
d = [-ones(10,1); ones(10,1); zeros(60,1); -ones(10,1); ones(10,1)];
Z = rand(100,4); Z = Z./sum(Z,2);
X1 = normrnd(Z(:,1)*a',0.1);
X2 = normrnd(Z(:,2)*b',0.1);
X3 = normrnd(Z(:,3)*c',0.1);
X4 = normrnd(Z(:,4)*d',0.1);
X = {X1;X2;X3;X4};

Run PTDCCA with "intermediate" sparsity and compare the model to the ground truth.

W = PTDCCA(X,0.5);
wtrue = [a,b,c,d];
figure
for m=1:4
    subplot(2,4,m);bar(wtrue(:,m));title(sprintf('True w_%d',m))
    xlabel('variable');ylabel('coefficient')
    subplot(2,4,4+m);bar(W{m});title(sprintf('Inferred w_%d',m))
end

Sparsity can also be set for each view separately:

c = [0.05,0.25,0.75,1];
W = PTDCCA(X,c);
for m=1:4
    subplot(1,4,m);bar(W{m});title(sprintf('c = %.2f',c(m)))
    xlabel('variable');ylabel('coefficient')
end

To calculate multiple canonical variable tuples, use the name-value input D

W = PTDCCA(X,0.5,'D',3);

If you've ran the examples above, you may have noticed the function takes some time to return. Most of the running time is in fact spent calculating the cross-covariance tensor which is used to initialise the algorithm. This can be avoided by using a random initialisation instead:

W = PTDCCA(X,0.5,'initType','random');

The covariance tensor also takes up a lot of memory, and if the dimensions of the data are high enough, might exceed the largest allowed array size. If this happens, PTDCCA defaults to the random initialisation.

References

Witten, Daniela M., Robert Tibshirani, and Trevor Hastie. "A penalized matrix decomposition, with applications to sparse principal components and canonical correlation analysis." Biostatistics 10.3 (2009): 515-534.

Name		Name	Last commit message	Last commit date
Latest commit History 4 Commits
PTDCCA.m		PTDCCA.m
PTDCCApath.m		PTDCCApath.m
PTDfromInit.m		PTDfromInit.m
README.md		README.md
crosscovten.m		crosscovten.m
lathauwerSVDs.m		lathauwerSVDs.m

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

PTD-CCA: Sparse Tensor Canonical Correlation Analysis (STCCA) via penalised tensor decomposition

EXAMPLE

References

About

Releases

Packages

Languages

htpusa/PTD-CCA

Folders and files

Latest commit

History

Repository files navigation

PTD-CCA: Sparse Tensor Canonical Correlation Analysis (STCCA) via penalised tensor decomposition

EXAMPLE

References

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages