CoReg v1.0.1

License: GPL v2

Introduction

CoReg is a computational tool for identifying co-regulatory genes in large-scale networks. CoReg is provided as an R package and can be applied to any organism with genome-scale regulatory network data.

Installation

1. Install the CoReg package from GitHub

To install CoReg from GitHub, first install the 'devtools' package for R:

install.packages("devtools")

Load the devtools package:

library(devtools)

Install the CoReg package from GitHub:

install_github("LiLabAtVT/CoReg")

Quick start

This tutorial shows how to identify co-regulatory modules and perform rewiring simulation in the Arabidopsis transcription network using the CoReg package.

1. Load package and network

Load the CoReg package in R:

library(CoReg)

Load the Arabidopsis example network (use the networkFromFile() function if you want to load your own network for analysis):

data(athNet)

Below is an example of using networkFromFile() to load another network. The network file should be formatted as a two-column edge list: the first column holds the IDs of transcription factors and the second column holds the IDs of their target genes.

athNet<-networkFromFile("araNet.csv",",")

Alternatively, you can convert a two-column data frame into an igraph graph object using networkFromEdgeList(). The first column of the data frame should contain the IDs of transcription factors and the second column their corresponding target genes.

networkFromEdgeList(edgelist)
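As a rough illustration of the two-column format described above, the sketch below builds a small edge list in base R and writes it to a comma-separated file. The gene IDs and the file name exampleNet.csv are hypothetical, and writing the file without a header row is an assumption, since this README does not state whether networkFromFile() expects one.

```r
# Hypothetical gene IDs, used only for illustration.
edgelist <- data.frame(
  regulator = c("TF1", "TF1", "TF2", "TF2", "TF3"),
  target    = c("G1",  "G2",  "G2",  "G3",  "TF1"),
  stringsAsFactors = FALSE
)

# Write a comma-separated, two-column file. Omitting the header row is an
# assumption; the README does not say whether networkFromFile() expects one.
net_file <- file.path(tempdir(), "exampleNet.csv")
write.table(edgelist, net_file, sep = ",", quote = FALSE,
            row.names = FALSE, col.names = FALSE)

# Read the file back to confirm its shape before handing it to CoReg.
check <- read.csv(net_file, header = FALSE, stringsAsFactors = FALSE)
stopifnot(ncol(check) == 2, nrow(check) == nrow(edgelist))
```

The data frame edgelist has the shape that networkFromEdgeList() is described as accepting, and net_file has the shape described for networkFromFile().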

2. Identify co-regulatory modules

Identify co-regulatory modules in the Arabidopsis network:

athRes<-CoReg(athNet)

Show the module-finding result:

athRes$module

Show the similarity matrix:

athRes$similarity_matrix
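To give a feel for what a gene-by-gene similarity matrix can contain, here is a minimal base R sketch that computes the Jaccard index between the target sets of regulators in a toy network. It is not CoReg's implementation: the gene IDs are hypothetical, and using only outgoing edges (targets) is a simplifying assumption made for this example.

```r
# Toy edge list with hypothetical IDs: column 1 = regulator, column 2 = target.
el <- matrix(c("TF1", "G1",
               "TF1", "G2",
               "TF1", "G3",
               "TF2", "G2",
               "TF2", "G3",
               "TF3", "G4"),
             ncol = 2, byrow = TRUE)

# Target set of each regulator.
targets <- split(el[, 2], el[, 1])

# Jaccard index: size of the intersection divided by size of the union.
jaccard <- function(a, b) length(intersect(a, b)) / length(union(a, b))

# Pairwise similarity between all regulators.
regs <- names(targets)
sim <- sapply(regs, function(i) {
  sapply(regs, function(j) jaccard(targets[[i]], targets[[j]]))
})
sim
```

In this toy network TF1 and TF2 share two of three distinct targets, so their similarity is 2/3, while TF3 shares no targets with either and scores 0.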

3. Perform rewiring simulation analysis on Arabidopsis network

a. Rewiring recall score

The rewSim() function performs rewiring simulation on the Arabidopsis network and computes the rewiring recall score. CoReg is compared with three other module-finding methods: label propagation, edge betweenness and walktrap. In this example, 50 nodes are duplicated and the rewiring probabilities are 0.3 and 0.5.

simRes<-rewSim(athNet,nDup = 50, dDup = 10, c(0.3,0.5),c("lp","wt","eb"),2)

Show the simulation result:

print(simRes)
simRes$evalResult

Plot the rewiring recall score curves:

plot(simRes)
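The idea of duplicating a node and rewiring its edges can be sketched in base R as follows. This is an illustrative reading of the procedure, not the code behind rewSim(): the gene IDs, the way a rewired edge picks its new target, the module labels, and the recall definition (fraction of duplicates placed in the same module as their original) are all assumptions.

```r
set.seed(1)  # for a reproducible illustration

# Toy edge list with hypothetical IDs: column 1 = regulator, column 2 = target.
el <- cbind(from = c("TF1", "TF1", "TF1", "TF2", "TF2"),
            to   = c("G1",  "G2",  "G3",  "G2",  "G4"))
all_genes <- unique(as.vector(el))

# Duplicate one regulator: the copy starts with the same targets.
dup_edges <- el[el[, "from"] == "TF1", , drop = FALSE]
dup_edges[, "from"] <- "TF1_dup"

# Rewire each copied edge with probability p by redirecting it to a randomly
# chosen gene (which may occasionally be the original target again).
p <- 0.3
rewire <- runif(nrow(dup_edges)) < p
if (any(rewire)) {
  dup_edges[rewire, "to"] <- sample(all_genes, sum(rewire), replace = TRUE)
}
sim_el <- rbind(el, dup_edges)

# Assumed recall definition: fraction of duplicates that a module-finding
# method assigns to the same module as their original.
membership <- c(TF1 = 1, TF1_dup = 1, TF2 = 2)  # hypothetical module labels
pairs <- data.frame(orig = "TF1", dup = "TF1_dup", stringsAsFactors = FALSE)
recall <- mean(membership[pairs$orig] == membership[pairs$dup])
```

With a higher rewiring probability the duplicate keeps fewer of the original's targets, so recovering the pair in one module becomes harder.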

b. auROC analysis

The computeAuROC() function computes simulation-based AUC values and ROC curves. The network simulation process is the same as networkSim(). The AUC and ROC curve are computed for CoReg combined with each of three similarity indices:

auROCres <- computeAuROC(athNet,nDup=50,dDup=10,rewProb=0.5,simMethods=c("jaccard","geometric","invlogweighted","wt"))

Show the AUC values:

auROCres$AUC

Plot the ROC curves:

plot(auROCres)
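For readers unfamiliar with AUC, the sketch below computes it in base R from a vector of scores and true/false labels using the rank-sum (Mann-Whitney) identity. The helper auc() and the example data are hypothetical, and treating each score as a pairwise similarity whose label marks a true original/duplicate pair is an assumption about the setup, not something taken from computeAuROC().

```r
# AUC via the rank-sum identity: the probability that a randomly chosen
# positive scores higher than a randomly chosen negative.
auc <- function(scores, labels) {
  stopifnot(length(scores) == length(labels), is.logical(labels))
  r <- rank(scores)  # average ranks, so ties count as half
  n_pos <- sum(labels)
  n_neg <- sum(!labels)
  (sum(r[labels]) - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)
}

# Hypothetical similarity scores; TRUE marks a true positive pair.
scores <- c(0.9, 0.8, 0.35, 0.6, 0.2, 0.1)
labels <- c(TRUE, TRUE, TRUE, FALSE, FALSE, FALSE)
auc(scores, labels)
```

Here eight of the nine positive/negative pairs are ranked correctly, so the result is 8/9. A value of 0.5 corresponds to random ranking and 1 to perfect separation.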

4. Perform evaluation of different module-finding methods on simulated network

a. Compute Normalized Mutual Information (NMI) score

netSimAndEval() calls generateSimNet() to generate one or more simulated networks with a pre-specified modular structure, and then runs different module-finding algorithms to identify modules. The agreement between the pre-specified modules and the algorithm-identified modules is measured with the NMI score. The following example generates a simulated network with 5 pre-specified modules. Each module has 10 regulators and each regulator has 20 targets. The network contains 100 other nodes that have no outgoing edges (auxiliary nodes). The co-regulation probability for the simulated network is 0.5. After the network is constructed, the four clustering methods specified by testMethods are run to identify modules in the simulated network.

re<-netSimAndEval(10,5,20,100,0.5,testMethods=c("coregJac","lp","wt","eb"))

See the summary of the simulated network(s):

summary(re)

Plot the evaluation result:

plot(re)
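The NMI score compares two partitions of the same nodes. The base R sketch below uses one common normalization, 2 * I(A;B) / (H(A) + H(B)); whether CoReg uses exactly this normalization is an assumption, and the helper nmi() and the label vectors are hypothetical.

```r
# Normalized mutual information between two label vectors of equal length.
nmi <- function(a, b) {
  stopifnot(length(a) == length(b))
  joint <- table(a, b) / length(a)  # joint distribution of the two labelings
  pa <- rowSums(joint)              # marginal distribution of a
  pb <- colSums(joint)              # marginal distribution of b
  entropy <- function(p) {
    p <- p[p > 0]
    -sum(p * log(p))
  }
  nz <- joint > 0                   # skip empty cells to avoid log(0)
  mi <- sum(joint[nz] * log(joint[nz] / outer(pa, pb)[nz]))
  ha <- entropy(pa)
  hb <- entropy(pb)
  if (ha + hb == 0) return(1)       # both partitions are a single cluster
  2 * mi / (ha + hb)
}

truth <- c(1, 1, 1, 2, 2, 2)               # pre-specified modules
found <- c("x", "x", "x", "y", "y", "y")   # same grouping, different labels
nmi(truth, found)
```

Because the score depends only on how nodes are grouped, not on the label names, the two partitions above score 1. Partitions that are statistically independent of each other score 0.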

b. Generate a simulated network

CoReg can also generate a simulated network for other uses. The simulated network is returned as an edge list, represented by a two-column matrix in R. The example below uses the same parameters as the previous example to generate a simulated network:

re<-generateSimNet(10,5,20,100,0.5)

# This is the edge list
re$el

# This is the ground-truth module partition
re$modulePartition
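Once an edge list is available as a two-column matrix, it can be inspected with base R alone. The sketch below uses a small hypothetical matrix as a stand-in for re$el; the node IDs, and the assumption that the first column is the regulator and the second the target, are for illustration only.

```r
# Hypothetical stand-in for re$el: column 1 = regulator, column 2 = target.
el <- matrix(c("R1", "T1",
               "R1", "T2",
               "R2", "T2",
               "R2", "T3",
               "R3", "T1"),
             ncol = 2, byrow = TRUE)

# Number of targets per regulator and number of regulators per target.
out_degree <- table(el[, 1])
in_degree  <- table(el[, 2])

# Nodes that never appear as a regulator, i.e. have no outgoing edges.
no_out <- setdiff(unique(as.vector(el)), el[, 1])

out_degree
in_degree
no_out
```

The same calls can be applied to re$el to check, for example, how many targets each regulator received.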

Citation

Please cite the following paper if CoReg is used in your publication:
Song Q, Grene R, Heath LS, Li S: Identification of regulatory modules in genome scale transcription regulatory networks. BMC Syst Biol 2017, 11:140.
https://bmcsystbiol.biomedcentral.com/articles/10.1186/s12918-017-0493-2