SNeCT

Motivation: How do we integratively analyze large-scale multi-platform genomic data that are high-dimensional and sparse? Furthermore, how can we systematically incorporate prior knowledge, such as the association between genes, into the analysis?

Method: To solve this problem, we propose SNeCT, a Scalable Network Constrained Tucker decomposition method. SNeCT applies parallel stochastic gradient descent to the proposed parallelizable, network-constrained objective function. We apply SNeCT to a tensor constructed from PanCan12, a large-scale multi-platform, multi-cohort cancer dataset, constrained on a network built from the PathwayCommons database.
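To make the method description concrete, here is a minimal NumPy sketch of a 3-mode Tucker model and a single stochastic gradient step on one observed entry. It is not the code from this repository. The function names, the learning rate `lr`, the weight `lam`, and the specific penalty (pulling a gene's factor row toward the rows of its network neighbors) are assumptions chosen for illustration; the paper defines the actual objective.

```python
import numpy as np

def tucker_entry(G, A, B, C, i, j, k):
    """Predict entry (i, j, k) from core tensor G and factor rows A[i], B[j], C[k]."""
    return np.einsum("pqr,p,q,r->", G, A[i], B[j], C[k])

def sgd_step(G, A, B, C, entry, neighbors, lr=0.01, lam=0.1):
    """One in-place SGD update for a single observed entry (i, j, k, value).

    neighbors maps a gene index to the gene indices it is linked to.
    The penalty (lam / 2) * sum_n ||B[j] - B[n]||^2 is an assumed form of
    the network constraint, used here only as an illustration.
    Returns the squared-error loss measured before the update.
    """
    i, j, k, x = entry
    a, b, c = A[i].copy(), B[j].copy(), C[k].copy()
    err = tucker_entry(G, A, B, C, i, j, k) - x

    # Gradients of 0.5 * err^2 with respect to each factor row and the core.
    grad_a = err * np.einsum("pqr,q,r->p", G, b, c)
    grad_b = err * np.einsum("pqr,p,r->q", G, a, c)
    grad_c = err * np.einsum("pqr,p,q->r", G, a, b)
    grad_G = err * np.einsum("p,q,r->pqr", a, b, c)

    # Assumed network term: pull gene j toward its neighbors' factor rows.
    for n in neighbors.get(j, []):
        grad_b = grad_b + lam * (b - B[n])

    A[i] -= lr * grad_a
    B[j] -= lr * grad_b
    C[k] -= lr * grad_c
    G -= lr * grad_G
    return 0.5 * err ** 2
```

Because each update touches only one row per factor matrix plus the core, updates for entries that share no indices can in principle run in parallel, which is the property a parallel SGD scheme relies on.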

Results: The decomposed factor matrices are used to stratify cancers, to search for top-k similar patients, and to illustrate personalized interpretation. In the stratification test, the combined twelve-cohort data are clustered into thirteen subclasses. The thirteen subclasses correlate strongly with tissue of origin and show other interesting patterns, such as a clear separation of OV cancers into two groups and high clinical correlation within the subclusters formed in the BRCA and UCEC cohorts. In the top-k search, a new patient’s genomic profile is generated and searched against existing patients based on the factor matrices. The top-k patients are highly similar to the query on 23 clinical features, including the estrogen and progesterone receptor statuses of BRCA patients, for which average precision ranges from 0.72 to 0.86 and from 0.68 to 0.86, respectively. We also illustrate how the factor matrices can be used for interpretable, personalized analysis of each patient.
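The top-k search can be pictured as a nearest-neighbor lookup over the rows of the patient factor matrix. The sketch below assumes the query patient has already been mapped to a factor vector and uses cosine similarity as the ranking score; both the function name `top_k_similar` and the choice of cosine similarity are assumptions for illustration, not necessarily what the paper uses.

```python
import numpy as np

def top_k_similar(patient_factors, query, k=5):
    """Return (indices, scores) of the k rows most similar to query by cosine similarity."""
    P = np.asarray(patient_factors, dtype=float)
    q = np.asarray(query, dtype=float)
    eps = 1e-12  # guards against division by zero for all-zero vectors
    sims = (P @ q) / (np.linalg.norm(P, axis=1) * np.linalg.norm(q) + eps)
    order = np.argsort(-sims)[:k]  # highest similarity first
    return order, sims[order]

# Toy example: four existing patients with two latent factors each.
existing = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [-1.0, 0.0]])
indices, scores = top_k_similar(existing, np.array([1.0, 0.1]), k=2)
```

The clinical labels of the returned patients can then be compared with those of the query to score the retrieval, for example with average precision.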


Overview

Paper

SNeCT: Integrative cancer data analysis via large scale network constrained tensor decomposition
Dongjin Choi, Lee Sael
[PDF, Supplementary material, Slides]

Code

See the code directory.

Data

| Name | Structure | Size | Number of Entries | Download |
|---|---|---|---|---|
| PanCan12 | Patient - Gene - Platform | 4,555 × 14,351 × 5 | 183,211,020 | DOWN |
| Pathway | Gene - Gene | 14,351 × 14,351 | 665,429 | DOWN |
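As a quick check on the sizes above, the following sketch computes what fraction of all possible cells each dataset's listed entries cover. It assumes "Number of Entries" counts the stored entries of each dataset; the helper name `fill_ratio` is hypothetical.

```python
from math import prod

def fill_ratio(shape, n_entries):
    """Fraction of the cells in an array of the given shape covered by n_entries."""
    return n_entries / prod(shape)

# Shapes and entry counts as listed in the data table.
pancan12_ratio = fill_ratio((4555, 14351, 5), 183_211_020)
pathway_ratio = fill_ratio((14351, 14351), 665_429)

print(f"PanCan12: {pancan12_ratio:.4f}")
print(f"Pathway:  {pathway_ratio:.6f}")
```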

Experiments

Download code and data for experiments in the paper.

About

SNeCT: Scalable network constrained Tucker decomposition for integrative multi-platform data analysis
