gywns6287/gmcNet

gmcNet: gene module clustering Network in WGCNA

Model summary

To identify desired gene modules in WGCNA, we propose gmcNet. gmcNet is a GNN-based clustering algorithm that clusters genes according to both the co-expression topology (genes in the same module should be strongly connected) and the single-level expression (genes in the same module should have similar expression patterns). The key innovation of gmcNet is combining the single-expression of each gene with the co-expression of its neighbor genes.

Model Input

gmcNet requires four inputs to perform unsupervised clustering. All four are defined over the same set of genes, and the expression features are measured over the expression samples.

  1. Single-expression features of the genes.
  2. Whole TOM: topological overlap matrix, created using the topological overlap measure between genes.
  3. Positive TOM: topological overlap matrix, created only from gene pairs with a positive correlation coefficient.
  4. Negative TOM: topological overlap matrix, created only from gene pairs with a negative correlation coefficient.
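
The three topological overlap matrices can be sketched in NumPy as follows. This is a minimal illustration using the standard unsigned WGCNA topological overlap formula, not gmcNet's own code. The function names, the soft-threshold `powers`, and the way positive and negative correlations are separated before thresholding are all assumptions made for the example.

```python
import numpy as np

def topological_overlap(adjacency):
    """Unsigned WGCNA-style topological overlap of a symmetric adjacency matrix."""
    a = np.array(adjacency, dtype=float)
    np.fill_diagonal(a, 0.0)              # ignore self-connections
    shared = a @ a                        # l_ij = sum_u a_iu * a_uj
    k = a.sum(axis=1)                     # connectivity of each gene
    k_min = np.minimum.outer(k, k)        # min(k_i, k_j) for every pair
    tom = (shared + a) / (k_min + 1.0 - a)
    np.fill_diagonal(tom, 1.0)
    return tom

def three_toms(expr, powers=(6.0, 6.0, 6.0)):
    """Return (whole, positive, negative) TOMs for a genes x samples array.

    `powers` holds hypothetical soft-threshold exponents, one per network.
    """
    corr = np.corrcoef(expr)                                  # genes x genes
    whole = np.abs(corr) ** powers[0]
    positive = np.where(corr > 0, corr, 0.0) ** powers[1]     # keep positive pairs only
    negative = np.where(corr < 0, -corr, 0.0) ** powers[2]    # keep negative pairs only
    return tuple(topological_overlap(a) for a in (whole, positive, negative))
```

Each returned matrix is symmetric, has ones on the diagonal, and has entries between 0 and 1.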

Network structure

gmcNet includes a co-expression pattern recognizer (CEPR) and module classifier.

[Figure: gmcNet network structure]

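CEPR : Using message passing operations, CEPR generates an embedding feature for each gene that accounts for its single-expression and the two different co-expressions (positive and negative). The embedding dimension is set by 'CEPR_features' in the configuration.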
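
To make the idea of message passing over two co-expression networks concrete, here is a generic sketch in NumPy. It is not the actual CEPR layer: the row normalization, the ReLU activation, the concatenation of the two branches, and every name below are assumptions chosen for illustration.

```python
import numpy as np

def message_passing_step(features, tom, weight):
    """One generic step: TOM-weighted average of neighbor features, then a linear map and ReLU."""
    row_sums = tom.sum(axis=1, keepdims=True)
    norm = tom / np.where(row_sums == 0, 1.0, row_sums)   # row-normalize, guarding empty rows
    aggregated = norm @ features                          # each gene gathers its neighbors' features
    return np.maximum(aggregated @ weight, 0.0)           # ReLU activation

def embed(features, positive_tom, negative_tom, w_pos, w_neg):
    """Hypothetical embedding: one branch per co-expression network, concatenated."""
    pos = message_passing_step(features, positive_tom, w_pos)
    neg = message_passing_step(features, negative_tom, w_neg)
    return np.concatenate([pos, neg], axis=1)
```

With `features` of shape (genes, samples) and each weight of shape (samples, d), `embed` returns an array of shape (genes, 2 * d).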

Module classifier : Given the CEPR embedding features, the module classifier computes a module-assignment probability matrix using a multi-layer perceptron (MLP). The matrix has one row per gene and one column per module, so each row holds the module-assignment probabilities of one gene. In other words, a gene belongs to the module whose probability is the maximum value of that gene's row.
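
The final assignment rule (pick the module with the largest probability in each gene's row) can be illustrated with a few lines of NumPy. The softmax used to turn scores into probabilities and the example scores are assumptions for the sketch, not values produced by gmcNet.

```python
import numpy as np

def softmax(scores):
    """Row-wise softmax, shifted for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def assign_modules(probabilities):
    """Return, for each gene (row), the index of its most probable module."""
    return np.argmax(probabilities, axis=1)

# Hypothetical classifier scores for 4 genes and 3 modules.
scores = np.array([
    [2.0, 0.5, 0.1],
    [0.2, 0.3, 3.0],
    [1.0, 4.0, 0.0],
    [0.9, 0.1, 0.2],
])
probabilities = softmax(scores)
modules = assign_modules(probabilities)
print(modules)  # [0 2 1 0]
```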

Implementation

1. Preparing

Our models were implemented with TensorFlow 2.3 in Python 3.8.6.

1.1. Requirements

Requirements can be installed through the following command in your shell.

pip install -r [CODE PATH]/requirements.txt

1.2. Input Data

expr : Gene expression data. A text file with a header line, followed by one line per gene, each with one more column than the number of expression samples. The first column is the gene name and the others are expression values. An example of the file format is in the data folder as sample.txt.
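
A file in this layout could be parsed as sketched below. The sketch assumes whitespace-delimited columns and that each data line holds one gene name followed by its expression values. The function name `load_expr` and the sample contents are hypothetical, so check data/sample.txt for the delimiter actually used.

```python
import io
import numpy as np

def load_expr(handle):
    """Parse an expression table: header line, then gene name + values per line."""
    header = handle.readline().split()
    genes, rows = [], []
    for line in handle:
        fields = line.split()
        if not fields:                      # skip blank lines
            continue
        genes.append(fields[0])             # first column: gene name
        rows.append([float(v) for v in fields[1:]])
    return header, genes, np.array(rows)

# Hypothetical file contents with two genes and three samples.
text = "gene s1 s2 s3\ngeneA 1.0 2.0 3.0\ngeneB 0.5 0.1 0.9\n"
header, genes, values = load_expr(io.StringIO(text))
print(genes, values.shape)  # ['geneA', 'geneB'] (2, 3)
```

For a file on disk, pass an open file object instead, for example `with open("data/sample.txt") as handle: load_expr(handle)`.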

TOM (optional) : If you have already created TOMs with the R library WGCNA, you can use them for gmcNet. The three TOMs (whole, positive, negative) required by gmcNet must be located in one folder and named whole.txt, positive.txt, and negative.txt, respectively. Each TOM file must contain one row and one column per gene, so that the entry at a given row and column is the topological overlap measure between the corresponding pair of genes. Example files are in the out/TOMs folder.
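
Before passing a TOM folder to gmcNet, it can help to check that the three files exist and are square with one row and one column per gene. The sketch below is one way to do that. It assumes the files are plain whitespace-delimited numeric matrices without header or row names, which may not match the files in out/TOMs, and `load_toms` is a hypothetical helper, not part of gmcNet.

```python
from pathlib import Path
import numpy as np

TOM_FILES = ("whole.txt", "positive.txt", "negative.txt")

def load_toms(folder, n_genes):
    """Load the three TOM files from `folder` and check their shape."""
    toms = {}
    for name in TOM_FILES:
        path = Path(folder) / name
        if not path.is_file():
            raise FileNotFoundError(f"missing TOM file: {path}")
        matrix = np.loadtxt(path, ndmin=2)   # assumes numeric, whitespace-delimited
        if matrix.shape != (n_genes, n_genes):
            raise ValueError(
                f"{name}: expected shape {(n_genes, n_genes)}, got {matrix.shape}"
            )
        toms[path.stem] = matrix             # keys: 'whole', 'positive', 'negative'
    return toms
```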

1.3. Configuration

Before executing gmcNet, you should set the configuration in main.py.

'betas' : smoothing parameters for the (whole, positive, negative) networks
'save_TOM' : whether to save the TOMs in the output path
'save_embed' : whether to save the embedding features in the output path
'n_cluster' : number of clusters (k)
'epochs' : number of training epochs
'lr' : training learning rate
'mp_layers' : number of message passing layers
'CEPR_features' : CEPR embedding dimensions
'lambda' : balancing hyper-parameter
'Lo_thr' : orthogonal threshold
'tune_epoch' : number of initial tuning epochs, which prevent empty modules
'tune_lr' : learning rate for the initial tuning
'device' : GPU device to use. If you do not use a GPU, set this to False

2. Execution

2.1. Without TOM file

python main.py --expr [expr] --out [out]
  1. [expr] : expr file path.
  2. [out] : Path for saving the results.

2.2. With TOM file

python main.py --expr [expr] --TOM [TOM] --out [out]
  1. [expr] : expr file path.
  2. [TOM] : Path to the TOM folder containing the three different TOM files (whole.txt, positive.txt, negative.txt).
  3. [out] : Path for saving the results.

2.3. Example-Without TOM file

python main.py --expr data/sample.txt --out out 

2.4. Example-With TOM file

python main.py --expr data/sample.txt --TOM out/TOMs --out out 
